The Art of Network Engineering

Ep 106 – Data Center Network Design

The Art of Network Engineering Episode 106


This week we talk to Gabe Pariacote. Gabe is a Principal Architect with Sirius Computer Solutions (a CDW company). He helps customers design and implement enterprise campus and data center networks. Gabe also has a background in the service provider space and is a 3x CCIE in Routing and Switching, Service Provider, and Data Center.

Find everything AONE right here: https://linktr.ee/artofneteng

00:00
This is the Art of Network Engineering podcast.

00:11
In this podcast, we'll explore tools, technologies, and talented people. We aim to bring you information that will expand your skill sets and toolbox and share the stories of fellow network engineers. Hey there, friends. Thanks for coming on back to the show. A while back, in episode 65, we painted a happy little Enterprise Campus Design canvas. I enjoyed myself so much in that one, we decided to take another page out of that book. Perhaps another chapter?

00:40
Maybe this is chapter two. Would you read this book? I know I would. Anyway, this time, let's see how creative we can get with a nice motif of an enterprise data center. What should that look like? Maybe some hot aisles over here, and we can't forget to invite the cold aisles. We'll put them over there. They should get a chance to join the party too, because you know, everybody's welcome here. Next, we'll add some servers to this rack. How about some storage right next door?

01:10
There, don't they look like friendly little neighbors, those two? We should probably add some switches up top so our server friends can talk to each other. They like to play a little game of telephone from time to time. Since we're talking about switches, are we gonna see an appearance of the spanning tree bear this time around? I guess you'll just have to stick around to find out. Let's get on with the show. Welcome to the art of network engineering.

01:39
Tim Ross back in the house. Welcome to the show. How you doing Tim? It's freaking hot under that wig. I'm so happy that intros are back. My life was incomplete without intros. They really, really were. It's like the highlight of my week. I don't know what that says about me, but. I got lazy. That's the problem. Oh, you've had a couple things going on, man. What's new, man? With me? Yeah.

02:09
Probably should have thought of this before we started the show. I don't know. We are professionals. Yeah. I have something fun cooking at work that, uh, I'm pretty excited about, but you know, in, in, in track it pacer lingo, you know, I can't really talk about it right now. It's it's super top secret. It's not rockets, but just below rockets. Yeah. Work is good. Home is good. I can no longer swim in the pool because the water's too cold. So

02:38
Well, fun's kind of over. Yeah. How does that, how does that work? Yeah. How do you empty it? I've never had an in-ground pool. Well, you don't want to empty it because it's a lot of water and then you have to fill it all back up. So somebody will come out and they drain it down below the skimmers and do the thing and blah, blah, blah. I don't know. It's, it's above my pay grade, but people come and do the thing. But in the spring you're coming out, right? We're going to have a, an a one barbecue pool party. I have to. That's what I heard. Yeah. All right. Cool.

03:09
So I'm good, man. Do you have anything? Any updates on your end, anything new? Just, it's getting to be the best time of the year, man. I know. I'm sorry that you got the pool done and you already got to be done with it, but it's the best time of year, man. The weather... I think you posted something on Twitter earlier today, sitting out by the fire pit. Just the best time of year. I like the fall. Now granted, this episode is probably gonna air after Christmas, but I mean, everybody gets the idea.

03:39
It'll drop in June. We're talking about the nice fall crisp air. Yeah, no, but I'm good, man. Glad we got to do this next design episode. We teased it in the intro earlier, I think last year. It's hard to believe that was episode 65 and we're past a hundred now. It doesn't seem like that should have been that long ago, but we ran an episode on kind of high level

04:06
enterprise campus design, talking about, you know, core, distribution, access, how different use cases can call for different designs as far as layer two and layer three from distribution to access, all that good stuff. We got into a little bit of wireless as well. And we decided we wanted to take that idea a step further and step into the enterprise data center. What does an enterprise data center look like? You know, why do we have them?

04:36
What are they going to look like years from now? That whole kind of thing. And because I could only really scratch the surface and not even act like I know what I'm talking about with most of it, we decided to invite Gabe Pariacote, a principal solutions architect

05:04
for CDW. What Gabe does is he assists customers in all facets of their enterprise designs and he doesn't just walk away after handing over a design document. He also gets his hands dirty, helps out wherever he can. And by the way, he just happens to be, he's wagging his finger at me. Oh, I wasn't supposed to say that he actually gets in there and helps out hands on, my bad.

05:32
We won't tell the leadership at CDW, but I do gotta mention, even though he may not want me to: Gabe is also a triple CCIE. Whoa, whoa, take it easy. How many, Andy? Hold on, Andy. You got kids? Don't give him shit, Andy. I like this guy. No, that's amazing. Anyway, let him talk before I start. Welcome to the show. Thanks for joining us. How are you?

06:02
Nothing, I'm doing great. Thank you, Tim. Thank you, Andy, for having me here tonight. It's a pleasure to be here with you guys tonight and have a little conversation about data centers, like you were saying, and yeah, sky's the limit tonight, guys. So thank you again. So before we get too deep into it, can you give us a quick, brief background on yourself? Where'd you get your start in networking, to where you are now?

06:29
Yeah, absolutely. Yeah, if you guys don't mind, I'll share really quick about, you know, personal life. So I'm originally from Venezuela, that's in South America. I started working there for a service provider, one of the big ones in the world, Telefonica, which is headquartered in Spain. I worked for them, you know, kind of building the network, the backbone, pretty much when MPLS started. That's probably one of my backgrounds, service provider.

06:59
And I worked for them for seven years, then the wife and myself decided to come to the States to do our masters. And then after that, we just kind of decided to stay. I worked for another kind of small service provider company here, based in Omaha, Nebraska, for five, six years. And then I started with, you know, Sirius, now CDW, as a consultant, as you were mentioning.

07:25
And I've been with them for almost five years by now, right. So that's pretty much a fast-forward through the career. And Andy, just getting back to you, answering that question: I don't even know how I did it, man. It was painful. And you're right, I almost stopped, because I started the training before the kids showed up. And I almost stopped, right. And I think when I had her in my arms,

07:50
I think she was the inspiration to continue, actually, right? I needed to finish and close that chapter, so. That's beautiful. Yeah, yeah, so. So how many CCIEs did you have when your daughter was born? Zero, wow. Let's put it that way, wow. All right. So we need to brag for you, because you earned three with wife and child, that is. Yeah.

08:15
Super, you must have a very... you know, we've spoken a lot in the past about support systems and how we couldn't do what we do in our career without, you know, support. I'm guessing your wife is pretty supportive. Absolutely, yeah. I wouldn't be here, yeah, holding all of that, if it weren't for her support. And not only her, also my kid; during these last seven years of her life, there have been a lot of sacrifices. So yeah, you're absolutely right on that, so. Awesome, man. So I know you do more

08:44
than data center design. What are some of the other things that you've gotten into and what is your, what's your kind of draw to enterprise data center networks and design? I think it comes a little bit with the background that I have, right? Building service provider networks, if you think about it, the product is networks. So you want to build something super resilient, super efficient.

09:12
because you are providing networks to customers, right? When you come to the consulting side, now you are supporting different customers in different verticals, right? So you have the finance sector, the insurance sector, the big enterprises, architecture firms, health systems, et cetera, et cetera, right? All the requirements across those different verticals are different, right? And, you know,

09:40
everybody wants to have everything running 24-7, right? They want to have a second data center for disaster recovery; they need to pass some audits. They're building against different frameworks that they need to satisfy in order to have their customers bring their data to them and be safe. So all of that information... I mean, kind of putting all the pieces together, hearing all those requirements, and basically translating that into solutions.

10:10
With the background that I have, I feel that somehow, you know, it kind of facilitates things to drive those discussions with all my different customers out there, pretty much. Data center, like you mentioned at the beginning, you mentioned where our data centers are going to be in the future. I don't even know where they're going to be a year from now, if I'm being honest with you, because they're evolving in a way that we have never imagined, right?

10:37
We have switches inside of switches inside of switches. We have servers inside of servers inside of servers, right? So servers, then we have virtual machines, then we have containers. We don't know what is next, right? We're probably going to be talking today about VXLAN, which is something that replaces, or will replace, spanning tree. And now we're all seeing that the public cloud providers are using the Geneve protocol, which I barely know a little bit about,

11:04
that I started doing some research on a couple of years ago, and technology continues to amaze us, right, how everything is evolving, right? So it is hard to say, Tim and Andy, how technology in general is going to look even a year from now, right? So what kind of a role do you have to get, or do you have to have... at what point in your career did you earn your eidetic memory? Because I gotta tell the audience, I've had

11:34
Countless meetings with Gabe. I know he has countless customers and huge network topologies. He's seen them all. He can recall stuff from meetings six months ago and he'll call out a specific VLAN number out of my network and I'll just stare at him dumbfounded. I'm like, I don't even remember that. And I'll look it up. Yeah, that's right.

12:00
So yeah, I knew it. I knew he was smarter than me. I don't know what day it is, Tim. No, no. No, come on, guys. I'm gonna walk out of here if you keep that up. So that's a good point, Tim. To be honest with you, I don't know, right? I think what I have back here helps, right? I kind of came prepared. I don't know if we're gonna use this today, not in the discussion, but actually doing the whiteboarding, walking through it, kind of, you know, scratching the surface and looking under the hood.

12:28
To me, while I'm doing that, it actually sticks in my brain. That's kind of my mentality. I think you learn from your mistakes, and I have made a lot of mistakes in my career. And all those mistakes have made me the person I am here today. If I hadn't made any mistakes, believe me, I wouldn't be here at all. Sometimes we have to take risks. So I think, just to give you a piece of advice: work on the whiteboarding sessions.

12:58
You know how I love, you know, face to face. This is interactive, right? We're seeing each other face to face, but sometimes it's not enough, right? So I think that's what works best for me. I will say this, I tell this to a lot of people who ask me, right? Hey, what should I do to prepare for my CCIE? Including one of the guys that is here on the call today. I'm not gonna say any names, but I'm just meeting Andy today, FYI.

13:24
It is whatever works for you, right? For me, it was watching videos, right? I don't like to read books, to be honest with you, right? That doesn't work for me, right? If somebody reads the book and basically, you know, gives the summary to me, I watch it, I listen to it, and it kind of sticks in my brain, right? Rather than trying to read it, right? My piece of advice to everybody out there listening to us tonight or tomorrow:

13:48
Just find whatever works for you, right? Everybody's different. So don't think that you have to follow what that guy did, because what worked for him is not necessarily what's going to work for me, kind of thing. I'd like to level set a little bit, because I know we're going to... so, data center is my bailiwick, but I am not strong in design, right? I've worked in organizations that were so big that I was kind of like the operator, right? I was handed designs from architects: hey, go build this and run it.

14:18
But what...

14:21
I know that Tim has an outline here, so I'm not exactly sure if we're gonna get to it. But for me, the data center design that I'm comfortable with is the old hierarchical model, right? The layer three core, the distribution layer, and then all your access up there, right? In my last job, we merged with a company, we bought another company, and their network... they were ahead of us in the times, and they were, you know, Clos, spine-leaf, and...

14:50
I was just spun around by it. I've since learned a good amount, but when you were coming up, I mean, you said Telefonica and service provider, building an MPLS backbone. I mean, that doesn't even apply, right? Because in this company I worked at, we also built an MPLS backbone, which again, I was just a lower level support guy. But when you think data center design, do you think in terms of the old hierarchical model versus the new...

15:18
you know, north-south traffic, east-west? Or does that not even come to mind? Like today, if you're building a data center, it's gonna be spine-leaf, right? Nobody's building the old way, I'm guessing. Don't say that. There's still use cases, you know, that we're talking about. That's probably one of the advantages of you guys talking to me tonight, like Tim mentioned, right? I kind of support different customers, right? And they have different requirements.

15:48
Your statement is somewhat true. I would say probably 80 or 90 percent of the customers out there are going in the spine-leaf architecture direction. So the question is why? You were mentioning, or you were asking there in that statement, okay, how does building service provider networks apply to building data center networks? So help me clarify that. So let me throw you something out there. So

16:15
when we build service provider networks, we're thinking about uptime: you know, if a specific link in the network, let's say a fiber between two cities, or, you know, a router goes down, and you have hundreds of customers connected there, you know, your SLAs... I'm pretty sure we all on this call have paid for MPLS layer 3 services or some kind of services, and they're pretty expensive for just a 10 meg or 100 megabit per second.

16:43
One of the reasons that they are so expensive is because the SLA has five, maybe six, nines, right? If you count it over a year, we're talking about just minutes of downtime in a year. Okay? So in order to accomplish all of that, every time that you're doing maintenance, you have fiber cuts, et cetera, et cetera, the network has to react super fast, so whatever is riding through that network doesn't fail.
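For reference, here's what those "nines" actually work out to in allowed downtime per year; a quick back-of-the-envelope sketch (five nines allows roughly five minutes a year, six nines about half a minute):

```python
# Allowed downtime per year for a given availability target.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000

for label, availability in [("three nines", 0.999), ("four nines", 0.9999),
                            ("five nines", 0.99999), ("six nines", 0.999999)]:
    downtime_s = SECONDS_PER_YEAR * (1 - availability)
    print(f"{label}: ~{downtime_s:,.0f} s/year (~{downtime_s / 60:.1f} min)")

# five nines -> ~315 s/year (about 5.3 minutes); six nines -> ~32 s/year
```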

17:10
So when we're building data centers, kind of the same concept applies, right? Tim was mentioning in the intro, right, that we have some devices called servers that sometimes wanna talk to each other. Why do they wanna talk to each other, right? Why don't they stay talking to themselves? So when they talk to each other, right, we sometimes have storage protocols, right, that ride through the network, right?

17:34
Not to focus specifically on Fibre Channel and storage area networks, but sometimes we will ride through this network to access our storage arrays. So what happens if, on our computer, we go there and we take out our hard drive? What happens? Our operating system is going to crash, even if we do it super fast. So when we're building this spine-leaf network, one of the reasons that people are evolving to that is to eliminate the Spanning Tree Protocol. Okay.

18:03
Why do we want to eliminate the Spanning Tree Protocol? The protocol itself doesn't have any problem, right? The problem that it has is when we have a link failure or when we have a device failure, okay? Spanning tree, there are several evolutions of it. The fastest that you can get on a reconvergence with an average size network in a data center is probably gonna be something between 20 and 30 seconds, so that link can start forwarding traffic. So...

18:31
Those 20, 30 seconds, if you think about it, there might be a lot of applications that can tolerate that latency and that delay. But if you're running a voice call, for example, that voice call is not gonna last more than seven seconds waiting for that. When you move to the spine-leaf architecture, now you get rid of spanning tree. Every single port is forwarding at any given time. And now you can have a link failure and you have protocols in place

18:58
that take that link immediately off the table, and all the traffic shuffles within milliseconds, right? Because the protocols that we use to guarantee that failover, they can react as fast as 50 milliseconds. And that's pretty much negligible to any application that you're running in today's data centers, right? So that's- So 20 seconds to 50 milliseconds, right? Is that the- Correct. Those are the numbers, right? That's a big jump, right? If you wanna write it down, it is a big jump, right? So that's one of the main drivers, right? I'm pretty sure.
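To make the "every single port is forwarding" point concrete, here's a minimal, hypothetical sketch of ECMP path selection in a routed fabric: a flow is pinned to one uplink by hashing its 5-tuple, and a failed uplink simply drops out of the hash set instead of triggering a topology-wide reconvergence (the names and the BFD multiplier below are assumptions for illustration; the 50 ms figure echoes the episode):

```python
import hashlib

# Illustrative ECMP sketch: all uplinks forward at once; a flow picks
# one by hashing its 5-tuple, modulo the number of live links.
uplinks = ["spine1", "spine2", "spine3", "spine4"]

def pick_uplink(flow_5tuple: str, links: list) -> str:
    digest = int(hashlib.md5(flow_5tuple.encode()).hexdigest(), 16)
    return links[digest % len(links)]

flow = "10.0.0.5,10.0.1.9,6,49152,443"   # src, dst, proto, sport, dport
print(pick_uplink(flow, uplinks))         # normal forwarding

uplinks.remove("spine2")                  # failure detected (e.g., by BFD)
print(pick_uplink(flow, uplinks))         # flow re-hashes onto the survivors

# Detection itself: BFD declares a neighbor down after roughly
# (tx interval x detect multiplier) with no hellos, e.g. 50 ms x 3 = 150 ms,
# versus the ~20,000-30,000 ms of a classic spanning-tree reconvergence.
```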

19:28
You know, if we take a step back, right, between spanning tree and spine-leaf, we got the introduction of Nexus, from Cisco specifically, right? And, you know, Aruba has another technology that they call VSX as well, right? Now, specific to Cisco, that's where they kind of introduced MLAG, which is multi-chassis link aggregation technology, and it basically allows you to, you know, run spanning tree

19:57
and have all the links in forwarding state. So that way, when you have a link failure or a device failure, that helps a little bit. But the extra overhead of the multi-chassis lag protocol wasn't enough, right? There was still overhead, split-brain situations where, all of a sudden, the two boxes were reporting as active, and that crashed the entire network. And that's why, I personally believe, the spine-leaf architecture

20:26
took off, because it still wasn't getting good enough, right? And it is still getting better, right? So it's more resilient for sure. The failover is way faster. And I think you had alluded earlier to... you said server within a server, and so you're starting to allude to, like, the microservices, right? Correct. They're all talking to each other, which is all east-west. Because again, I came up in a time and built networks where a client needed to hit an application, they came in, right?

20:54
They hit the application, they go back out. But now 80% of the traffic's east-west, right? It's microservices, services talking to each other, and spine-leaf is a much better design for that, if I'm not mistaken. Because everything's like two hops away, right? Like a three-stage Clos. That is correct. Yeah, that is correct. Well, we can get creative. Depending on the size of the data center, there are models where they have an additional layer, or a super spine.

21:22
The five-stage, right? Yeah. But the nature of the data center, you're correct. You mentioned something which is really important to highlight: that what we're hosting in the data center are applications, right? And applications, they talk with each other, right? You have an application front-end that is presented to the user, right? That's usually that north-south traffic that you were referring to. And then you have a lot of east-west traffic between the application server and your database.

21:50
Right, and that traffic is usually the heavy bulk; that is usually east-west and stays within the data center. So Gabe, I want to unpack a little bit more. Yes. How we're doing connections, and layer two, layer three, in the data center. I think it's easy for people, especially customers of Cisco, and I'm going to talk about this because I know Cisco... Gabe, I didn't warn you, but Andy works for Juniper.

22:20
Probably should have told you that ahead of time, but hey, we're all friends here. I think it's easy to equate going to a spine-leaf architecture with having to jump into doing things like overlays with VXLAN. Is that true? Or can you still leverage traditional networking, traditional layer two, in a spine-leaf architecture? Can you kind of walk us through that?

22:47
Yes, well, the spine-leaf architecture itself is just more about the hierarchy and, let's call it, the layer one topology, right? The Nexus... we were talking about Nexus, and Aruba VSX, and I'm pretty sure Juniper has something as well, I'm not too familiar with Juniper, where you can have a cluster, and it allows you to have that spine-leaf representation, right?

23:17
The catch, when we build a spine-leaf architecture from a layer two perspective, is that we're going to be limited when we try to scale, horizontally speaking. That switch, or that stack, that you put there at the spine layer has only so many ports where you can connect leaves, and at some point in time you're going to run out of ports and you're going to have to put in another block of spines.

23:45
Now, how do you connect at layer two those two blocks of spines without actually breaking the spine-leaf architecture, right? That's where the complexity comes in. You are limited when you're trying to grow, horizontally speaking, right? And we have done exercises with some big customers: okay, I put in a Nexus 7018, which is a super chassis, almost a full rack, where you can put 16 line cards.

24:12
Okay, every single line card can have up to 48 ports, and I can have, you know, do the math, how many switches can I have there? Well, now you're running into issues of scalability on the platform, right? Just because I can have that many ports, I probably cannot support that many MAC addresses. You run into TCAM issues, because everything comes down to the actual ASIC that is on all those line cards and the supervisors that those chassis run, right?
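Running the numbers on that chassis example (a sketch using the counts just mentioned):

```python
# Scale-up math for the chassis example above: 16 payload slots,
# up to 48 ports per line card.
line_cards = 16
ports_per_card = 48
print(line_cards * ports_per_card)   # 768 leaf-facing ports, in theory

# In practice, MAC-table and TCAM capacity on the ASICs runs out long
# before the port count does -- which is why fabrics scale out with more
# leaves (and eventually a super-spine) rather than ever-bigger boxes.
```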

24:40
When you move to layer 3, Tim, just to answer the other side of your question, right, now everything is routed, right? Basically, when we move into layer 3 and we build an overlay, like you were mentioning, we build VXLAN, it's basically a tunnel, like a GRE tunnel. We have all built GRE tunnels over the internet in some fashion. And we're actually using

25:04
a routing protocol that a lot of people use today thanks to MPLS services, which is BGP. So we use BGP to exchange information: hey, I have this MAC address; if you want to reach this MAC address, just build a VXLAN tunnel to me. Right? As simple as that. And we know... Yes, I just learned recently that BGP can advertise MAC addresses, and my brain exploded, because, you know,

25:31
I was a WAN guy, all my BGP was with a PE, and we were routing and learning routes. Somebody taught me about VXLAN and EVPN and control plane, like, wait a minute, BGP can carry MAC addresses? Anyway, sorry, keep going. It just blew my mind. No, no, no, no, no, no. I had no idea. BGP continues to amaze me. Hey, hey, guilty, guilty. I was in the same boat once, yeah. Yeah, I cannot believe it. How can a routing protocol exchange MAC addresses? Well, yeah.
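Since the encapsulation itself keeps coming up, here's a minimal sketch of the 8-byte VXLAN header from RFC 7348: the original Ethernet frame rides inside UDP (destination port 4789) with a 24-bit VNI identifying the segment. (Geneve, which comes up later in the episode, plays a similar role over UDP port 6081.)

```python
import struct

VXLAN_UDP_PORT = 4789   # IANA-assigned; Geneve uses UDP port 6081

def vxlan_header(vni: int) -> bytes:
    """8-byte VXLAN header per RFC 7348: flags word + 24-bit VNI."""
    flags = 0x08 << 24                            # the I bit: VNI is valid
    return struct.pack("!II", flags, vni << 8)    # VNI sits in bits 8..31

print(vxlan_header(10100).hex())   # 0800000000277400 -> VNI 0x002774 = 10100
# A 24-bit VNI gives 2**24 = 16,777,216 segments, versus 4,094 usable VLANs.
```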

25:59
And don't you have to extend layer two everywhere because of those microservices? Like, VXLAN does that, right? Everything can reach everything, and that's because of the magic of VXLAN, right? That's the beauty of this solution, per se, right? And by the way, it doesn't have to be a software-defined deployment to be automated. So you mentioned, Andy, that you have a little bit of experience and exposure to an MPLS provider: PEs and P routers, right?

26:28
And one of the beauties of that, on the Ps, is that you don't have to be touching every single router just to create this VRF, right? It just exists there automatically. The same happens with our fabrics in our data center, in the spine-leaf. When you provision a new pair of leaves... and again, I know we haven't gotten into, okay, what happens with my data center number two, right? Let's focus on our first data center. When I add a new pair of leaves, I just have to peer BGP

26:55
with my route reflector, and it's gonna be sending me all the MAC addresses of all the leaves that are participating there. As simple as that, right? I don't have to be creating a spanning tree, I don't have to be creating VLANs, I don't have to be creating SVIs. Everything is basically automatic, because we leverage BGP as the control plane, like you were mentioning: layer two VPN, EVPN. It's so cool. We're tunneling MAC addresses. We're tunneling layer two over layer three. It's bonkers. It's crazy. It's so cool.

27:24
Yeah, one of the craziest things is, you know, how do we deal with it when we don't know the MAC address, right? Because it is super easy when, okay, you are leaf number one, Andy, and Tim is leaf number two, and I'm leaf number three, right? And I know my MAC, Tim knows his MAC, and Andy knows his MAC. But let's say that we have Mike connecting to the network, and we don't know Mike's MAC address, right? So how does the process work on a traditional network, right? We all know how broadcast works.

27:53
We receive an ARP request on a switch, right? And if we don't know the MAC address, what that switch is going to do is grab that and send that packet with, you know, the destination MAC address full of Fs, right? FF, FF, FF. And it's gonna send that on all the ports that are participating in that VLAN, except the one that it came in on, right? That's the default behavior of networking. And that is going to be sent all across that layer two network, right?

28:20
How do we replicate that on a spine-leaf architecture when we have a layer three boundary, right? So we leverage, you know, another routing protocol, which is multicast routing, to basically emulate the behavior of a broadcast network, right? So we basically encapsulate that broadcast packet, right? It would be super beautiful to see a packet capture of that. It encapsulates that and puts that on a multicast IP address. And how does multicast work? One to many, right? Every single leaf

28:49
receives a copy of that packet, and essentially they decapsulate it and send it southbound to all the ports that are participating in that VLAN. And essentially, if Mike is connected to leaf 4, Mike is going to respond with an ARP reply saying, hey, I have that IP address, and my MAC address is ABC. At that particular time, leaf 4 sends the update on BGP, like you were mentioning: hey, MAC address ABC is on leaf number four. Now you,

29:18
Tim, and myself know that Mike's MAC address is on leaf number four, and if we want to send layer two traffic to it, we just build the VXLAN tunnel to Mike, right? So there's a lot of details involved there that, you know, we can go as deep into as you guys want. But does every leaf in that fabric have a MAC table of all the MACs connected to that fabric? Like you said, a traditional switch would have its MAC table, right?
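Pulling that sequence together, a toy simulation of the flood-and-learn behavior just described (names and structure are illustrative, not any vendor's implementation):

```python
# Toy flood-and-learn simulation for an unknown MAC in a VXLAN EVPN fabric.

BROADCAST = "ff:ff:ff:ff:ff:ff"

class Leaf:
    def __init__(self, name, vtep_ip):
        self.name, self.vtep_ip = name, vtep_ip
        self.mac_table = {}              # only MACs local to this leaf

evpn_table = {}                          # route-reflector view: MAC -> VTEP

def flood_arp_request(fabric, target_ip, vni, mcast_group):
    # The ingress leaf VXLAN-encapsulates the broadcast toward the VNI's
    # multicast group; every leaf joined to that group gets one copy.
    print(f"flooding ARP for {target_ip} to {mcast_group} (VNI {vni})")
    return list(fabric)                  # all leaves receive and decapsulate

def arp_reply(leaf, mac, port):
    # The leaf that hears the reply learns the MAC locally, then
    # advertises an EVPN type-2 (MAC/IP) route via the route reflector.
    leaf.mac_table[mac] = port
    evpn_table[mac] = leaf.vtep_ip

fabric = [Leaf(f"leaf{i}", f"10.0.0.{i}") for i in range(1, 5)]
flood_arp_request(fabric, "192.168.1.50", vni=10100, mcast_group="239.1.1.1")
arp_reply(fabric[3], "00:11:22:33:44:55", port="eth1/10")
print(evpn_table)   # remote leaves now tunnel unicast straight to leaf4
```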

29:49
Andy, I have to say, keep it going, because you are kind of reading my mind. I don't know what to say and what not to say, but I'm glad that you're kind of bringing all that up, because I can provide specifics on what I know you guys wanna hear, right? So the other beauty of that, right, when you compare with traditional spanning tree that Tim was mentioning, is that every single switch in the data center, when we're running spanning tree, has the full MAC address table of the entire network. Okay? When we're running a spine-leaf architecture,

30:19
every single leaf holds only the MAC addresses local to that switch. Let me say that one more time: that switch only holds the MAC addresses that are connected directly to that switch. So whatever NICs are in that rack, connected to the top-of-rack switch. Correct. Or most likely, yeah. Just local, right? Okay. Because all those MAC addresses are layer two information. And what happens from that leaf

30:47
upstream towards the spine is just BGP EVPN control plane updates. So those MAC addresses are translated, per se, into actual BGP routes. So we basically have that domain there, which, if you think about it from a scalability perspective, every single switch... you know, if you have a data center... so imagine, and this is something I don't know if you guys want to talk about or not, so, you know, Facebook, yeah,

31:16
that have those big data centers across the country. They run VXLAN EVPN on their fabrics. That's not a secret. And one of the reasons they do that is because, mind how many servers they have in order to support that many customers. They have hundreds of thousands of servers. And that's a lot of MAC addresses; there's no switch out there powerful enough that can actually hold all those MAC address tables on its ASIC. There's no ASIC powerful enough.

31:45
One of the reasons they moved to this particular architecture is because they can scale horizontally, because all those MAC addresses are locally significant on the switch, right? And everything upstream is just BGP updates. And we know how powerful BGP is, right? Because it's the routing protocol of the internet, right? I bet you didn't know that one, Andy, right?

32:06
You got me. I mean, there's got to be a downside, right? This all sounds super happy fun times, amazing. Like, you know, what could possibly go wrong? I mean, there's more resiliency, there's faster failover. Like I say it, and they keep coming. Well, right. Like, I mean, I guess there's no downside, because that's the architecture modern data centers are building. So yeah. Yeah. Is there any downside?

32:32
There are downsides, there are downsides. And one of the downsides, unfortunately... which might be, you know... Tim mentioned you guys did an episode on software-defined access networks, right? One of the downsides is that, you know, it's no secret out there that BGP is slow. Okay? So when you have reconvergence on BGP, if one particular node goes down, or you have what happens a lot in the data center, VM mobility, right?

33:02
You vMotion one VM from one data center to another one, or from one ESXi host to another one. So there's a process now: we have to grab that MAC address that used to be on leaf one, and now that VM was moved to an ESXi host that is connected to leaf four. And now there has to be a learning process there; that VM has to send traffic, and then leaf number four has to generate that BGP advertisement to the rest of the network, that now MAC address ABCD

33:31
is connected on leaf four and no longer on leaf one, right? BGP is slow doing that. I mean, it doesn't take seconds, right? But there's room for improvement there; it could be faster in that particular regard. So can't you just throw BFD at it? Like, that's what we used to do with our peering with the telco, to make BGP faster. Does that apply in a fabric, or no? It does apply on the fabric when we build the underlay

33:59
and we build the BGP peerings, right? But this particular instance is different, actually, because, remember, you mentioned it: how is it possible that BGP, which is used to do route advertisement, can actually advertise MAC addresses, right? So we're talking about the same thing here, right? We have a MAC address that is ABC, and it is being advertised out of this switch, right? Now the MAC address is no longer there, right? So now BGP has to do its thing, right?

34:29
This leaf has to send an advertisement to the route reflector and say, hey, do not send traffic to me anymore, because I no longer have this MAC address. Now we have to wait for the learning process of leaf number four. So leaf number four receives the MAC address on the southbound port that is connected to the server, and it knows, hey, now I have MAC address ABC. Now I'm gonna grab that MAC address and I'm gonna make that BGP advertisement upstream:

34:57
if somebody wants to get to ABC, they have to come to me, right? So this is the time it takes BGP to actually withdraw a route, right, from one switch, while that advertisement is now coming from another switch, right? BFD is gonna help you in that, if that switch crashes, it's gonna bring the BGP session immediately down, and we don't have to wait for the default timers of BGP, or longer times. But now we're up in the overlay, and you have to wait for those default timers, right?

35:27
Correct. With BGP, yeah. And they're pretty slow, right? No, they're fast. The advertisements are fast because we're tracking... we have a special attribute that is configured on BGP that makes it a little bit faster than what it takes on a layer 2 spanning tree traditional network, which is basically, okay, I'm just going to send a broadcast around all the switches, and eventually it's going to get to the access switch that has the MAC address, and it's going to reply to me.
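The "special attribute" here is most likely the EVPN MAC Mobility extended community from RFC 7432: each time a MAC shows up behind a new VTEP, the fresh type-2 advertisement carries a higher sequence number, so every leaf immediately prefers the newest location instead of waiting for anything to age out. A rough sketch:

```python
# Sketch of EVPN MAC mobility (RFC 7432): type-2 routes carry a sequence
# number, so the fabric converges on the newest location of a moved MAC.
evpn_routes = {}   # mac -> (vtep, sequence_number)

def advertise_mac(mac, vtep):
    _, seq = evpn_routes.get(mac, (None, -1))
    evpn_routes[mac] = (vtep, seq + 1)   # higher sequence wins everywhere

advertise_mac("00:ab:cd:ef:01:23", "leaf1-vtep")   # VM first seen on leaf 1
advertise_mac("00:ab:cd:ef:01:23", "leaf4-vtep")   # vMotion lands on leaf 4
print(evpn_routes)   # {'00:ab:cd:ef:01:23': ('leaf4-vtep', 1)}
```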

35:57
So, but that's one of the disadvantages, per se; room for improvement, right? This is my last question, then I'm gonna get out of Tim's way. Tim, I could just pepper this guy for hours. This is great, but keep going. Well, I don't wanna take over, but I'm just learning about this stuff now, so it happens to be really relevant. Last question: does the spine-leaf architecture reduce complexity at all? And I'm asking because, in the environments I've managed,

36:27
everything was a snowflake. Every business unit built a super important thing and went around all the standards because it had to go fast. And then you just have landmines everywhere that people stumble on in maintenance windows, right? So it seems, theoretically, at a high level, that this architecture could reduce complexity, because everything's just kind of a cookie cutter, a template, where we're just scaling laterally, and that seems like a simpler design. So that's probably good, right?

36:57
Yes. At the end of the day, trying to keep it short here, the short answer is yes. It makes it simpler, right? And I kind of challenge you to ask somebody... this was one of my first meetings when I started in this position, actually. And this customer was actually running

37:25
spanning tree between data centers over the WAN, right? That's stretching layer two, right? Is that what people say? Like, it's stretching layer two. And it's bad because now you're extending your blast radius, right? Of spanning tree. Yeah, exactly. Right. And, you know, there was some hesitation, like with everything new, right, to deploy some kind of fabric there, a spine-leaf, so we could actually eliminate spanning tree over the WAN, right? Which could be dangerous, right?

37:52
And I think they had had several issues actually running spanning tree over that metro area network. And, you know, there was some value there. There were lessons learned that sometimes we don't see immediately, right? Like I was telling you guys a couple of minutes ago, right, I've been hearing about Geneve being used in the public cloud a lot. And I honestly don't understand how the public cloud

38:18
is so eager to migrate or evolve from VXLAN to Geneve, because with VXLAN we can have up to 16.7 million segments. I cannot imagine them running out of slots, per se, for a data center, such that they are eager to evolve to the next big thing, right? But when we're talking about a small data center, or a small, medium, large data center, 16 million is going to last you, probably,

38:47
I'm going to say forever. But like I told you guys at the beginning, right, I don't even know where the technology is going to take us a year from now, right? The hyperscalers seem to live in a different reality, right? Just the scale is staggering. So, I've never heard of the Geneve protocol. Is that a replacement for VXLAN, and that's something hyperscalers are adopting? Okay. Yes. It is another overlay. VXLAN uses UDP. Geneve also uses

39:15
UDP, another header on the IP packet, I should say; it's another encapsulation, right? But again, it still relies, at least until today, on the control plane of BGP. There's a lot of discussion out there in the open that maybe something like... I'm going to mention what Cisco did on software-defined access, where they leverage LISP, because they believe that that routing update with IP mobility is faster on LISP than on BGP,

39:45
which, if we actually run the numbers... they probably have simulators to run hundreds of thousands, and they probably reached that conclusion. So.

39:56
I think what's really appealing about these overlay technologies, both in the data center and in campus networks, is really the pillars of flexibility, scalability, and stability. One of the things that we as network engineers are tasked with is keeping the network stable, keeping uptimes high. And one way to do that, as we've talked about multiple times here, is to remove,

40:26
or limit, spanning tree as much as possible: take layer 3 as far as you can go. And in traditional networking, that works fine. We can take layer 3 to the access layer on the campus. We can take layer 3 to the top-of-rack switch in the data center. And that works great until somebody wants to move something and keep their same IP address, right? So then we have to kind of bend over backwards in traditional networking to make that work, and then we're back to

40:53
spanning VLANs, stretching layer two, bringing this complexity back to where we don't want it. And now, with these overlays, I can have somebody in a data center, like Gabe was saying earlier, vMotion virtual machines, from a VMware perspective, from one host on one end of the data center to another host on the other end, spanning who knows how many racks and leaf switches away, and it can maintain

41:22
its layer two and layer three addressing, but we still have that underlying infrastructure of layer three everywhere. It just looks like we're stretching layer two, but we're keeping that network as stable as we can. I think that's what's really appealing about all this. Yeah. So I want to add something on that, Tim. You know, it's funny you mentioned that; there is a perception there.

41:45
And this is my personal opinion, after recollecting different conversations with different customers and different engineers and different mentalities. You mentioned something important: I want to move something from here to there. It could be from the access layer to the data center, from the data center to the access layer, from one data center to another data center. And I don't know if you guys had it on the agenda today to discuss a little bit of the public cloud, but let me ask you something. And I'll dare you to ask anybody out there:

42:15
find me the first server guy that asked the network team, hey, I want to vMotion this VM to the public cloud and keep the same IP. Why are they able to move it to the public cloud and change the IP, and why can they not do it when they stay on-prem? Magic. Right? So, to answer the question: you can extend VXLAN EVPN to the public cloud, okay?

42:44
So if you want, we can extend a local subnet that you have in your data center to the public cloud, and extend that VXLAN EVPN fabric all the way to the public cloud, so you can keep that same IP that you have on-prem. But that's not their first initiative. Their first initiative is always to go with a new IP when they go to the public cloud. The question is, why are they willing to do that on the public cloud and not when they're doing it on-prem? They want to keep the same IP

43:14
on-prem, but not in the cloud. So, different rules of engagement in the cloud, it sounds like. I didn't realize that. Yeah, but I mean, like I mentioned, you have the option to keep the same IP if you ask the right person. But why do you have an open mind when going to the public cloud, but not- That's a great question. -but you don't have it when you're staying on-prem? Exactly, right? We gotta get these application people on and yell at them, Tim. What's wrong with these server people? Yes. It's not the network.

43:45
So I'm glad you mentioned that, because, you know, everybody today out there... even the smallest companies that you can imagine. I have super, super small customers, they're building two data centers, and the question always comes: do I have to stretch layer two? I need it, is what I hear, right? I need it, I need it, because they have the dependency, right? And you're correct:

44:10
providing something, or deploying something, as a, you know, spine-leaf architecture running something like VXLAN EVPN, it actually gives you... it kind of emulates, right, like the word I was telling you guys before, that layer two segment. So that way they feel that the VLAN is extended, but to your point, it's not, because there are two different VXLAN segments locally connected, so. So Gabe, you brought up earlier about organizations having

44:38
multiple data centers. One of the scenarios I wanted to talk about was having two data centers, an active-standby, for DR purposes. So I have a disaster, my main data center goes down. When everything's all up and good and happy, I'm already replicating data from data center one to data center two.

45:05
I have compute and storage ready to go at the other site. I just need to spin stuff up when the disaster happens. So for that to work, in a traditional data center, we'd probably be stretching layer two between the sites. In an overlay world, we'd have VXLAN running across them. From a DR perspective is...

45:30
Is the network infrastructure really the only way to do it? Is there magic within DNS that we can do? What are some of our DR options between sites? Yeah, so I'm going to clarify something there and something you mentioned there, Tim, right? If we're talking truly about DR, right, you mentioned the keyword there, disaster recovery. A disaster happened and we have to do something to recover, okay?

45:54
Probably the business kind of signed off on that, and they understand the risk of it, and they understand that it's going to take, I don't know, I'm just going to say six hours, to have this secondary location up and running. We have... personally, I have run into a couple of scenarios where customers have two bubbles, where they have the same subnets, they have the same IP addresses, they have the same servers,

46:23
everything is running, and... again, I'm talking layer two traditional network, by the way, right? And they don't have layer two stretching between the two data centers, because they don't need to, right? They are two different islands, right? And they wanna make sure that that environment is up and running, so that whenever they have to flip that switch and be up and running out of this disaster location, they can do it. There has to be human intervention in order to execute that process, right?

46:52
That's the disaster recovery. Another approach that we can take between two data centers: let's talk about active-standby, or active-active. Okay, in this particular scenario, our secondary location, we're not treating it like a disaster recovery. No, we have expectations from the business side that either data center can be taking traffic at any given time, and nobody has to intervene to do something to make this happen, right?

47:21
How do we accomplish this? You mentioned one of the technologies that we use, which is DNS, right? Unfortunately, here we have to work really closely with our application developers, right, and all the applications that are running in the data center, because we find that the major hassle to make that happen is that applications are actually written using IP addresses, unfortunately. And if we're doing IP addresses and they're hard-coded, pointed at an IP and not a DNS entry,

47:50
we cannot leverage technologies like global DNS, where we can actually be checking whether an IP address that a host name resolves to is available or not, right? That's pretty much one of the technologies that we can leverage there. We can talk about firewalls as well, how we keep connections symmetric thanks to the routing. But that's pretty much just scratching the surface of a couple of the options you could use to tackle that problem you were talking about. Okay.
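As a concrete illustration of the global DNS idea (a hypothetical sketch, not any particular GSLB product): the DNS tier health-checks each data center's virtual IP and answers queries only with sites that are actually up, which is exactly why hard-coded IPs in the application break the whole scheme:

```python
import socket

# Hypothetical GSLB-style resolver: answer queries only with data-center
# VIPs that pass a live health check (here, a simple TCP connect).
SITE_VIPS = {"dc1": "203.0.113.10", "dc2": "198.51.100.10"}   # example IPs

def is_healthy(ip, port=443, timeout=2.0):
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False

def resolve(hostname):
    # Hand back only the sites that are up; clients follow DNS to a live DC.
    return [ip for ip in SITE_VIPS.values() if is_healthy(ip)]

print(resolve("app.example.com"))
# None of this helps if the application hard-codes an IP instead of the name.
```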

48:16
So, to kind of start closing this out, I want to end on a very loaded topic. No, come on, man, we're having fun. How can you do that to me, man? I gotta add drama. Yeah, add drama. So. Tell me there's gonna be an episode two, right? Gotta bring you back, that's right. So, in a world of

48:46
In a world of cloud technologies... so, in a world of cloud, in all the greatness, the hyperscale, all the good stuff we can spin up, spin down whenever we want: should we? We kind of talked about the nebulousness of not knowing where enterprise data centers are going. Should we even be building enterprise data centers anymore?

49:10
That's a great question, Tim. Unfortunately, the answer: it depends. Okay. I knew it, right? It was too good to be true, right? So, you know, one of the things that I learned over the last five years in this role that I currently hold, right... there's not a... one of the reasons I have a job today, or I have this job, is because

49:37
nobody out there does literally the same thing everywhere. Okay? Every single environment is different. Even... you know, out of the West region I was involved with one big hospital, and I have another big one in the Dakotas, and you probably know which one I'm talking about, Tim. They're pretty similar from the business side, but we're getting different requirements.

50:04
Literally, the solution that we're kind of proposing to these two customers is not the same. So that's why I said that the answer is it depends, right? So, going back specifically to your question about the data center: should we be building data centers on-prem? It always comes down... so we get conversations... well, honestly, me personally, I don't have that many conversations at the C level, right? I have kind of a higher layer of architects

50:31
that actually have those conversations at those higher levels, but I've participated in a couple of those. That idea usually comes from the C level, from the top, down the pyramid, right? And why does it come? Because the business is trying to move into a model that's more OPEX, less CAPEX, kind of thing. And at the same time: how much does it cost to maintain a data center, how much is the equipment going to cost me, right?

50:58
So we're talking about power, we're talking about cooling, we're talking about space, we're talking about servers, we're talking about switches, right? We're talking about all this infrastructure that is gonna cost us X amount of money, and that's gonna be CAPEX. At the same time, we have to have an operational budget in order to keep that up and running, which also includes skill sets in house, right?

51:22
We have to have people that are going to be able to support it, so that if it breaks, we can call them and they're going to be able to fix it, or at least work with vendors. So it is always a balance there, where people have to compromise, which is why we're seeing a lot of hybrid approaches, where we have customers that have decided to move their production workload into the public cloud and leave their DR or backup

51:52
with something that doesn't have to be supported 24-7 under the same scenarios as if you were running production. There's also, as well, the feeling where a lot of people say, well, I think Andy mentioned it at the beginning: the cloud is just somebody else's computer. It is still a data center somewhere out there, and it's gonna have issues. It could have issues as well. I have two customers that are running

52:20
an SD-WAN solution where they have their controllers hosted in the public cloud, and guess what? They were hit last year, and they lost the configuration. They recovered from a snapshot, but again, it could happen. Could it happen in your on-prem data center? Absolutely, it could happen as well. It kind of depends on how we architect the data center and how resilient it can be. So,

52:46
that's why my answer, Tim, unfortunately, and I apologize, is it depends, right? Because it depends on the direction the company wants to go. It depends what the CFO wants the CTO to do sometimes, right? And it's not always down to where the technology is kind of driving us to go as well, right?

53:10
Yeah... well, first off, it's a trick question. Because I think anytime that you ask a technology professional a question, if they come right out swinging with an answer, then you might have to take that with a grain of salt. I mean, usually the first answer is: it depends. You need more information, you know, to really give an answer there.

53:34
So we've covered a lot of ground tonight. Andy, you have asked some pretty awesome questions. Do you have anything else you wanna close this out with tonight? No, I really appreciate Gabe's time and expertise and insights and because I'm in the middle of learning a lot of these technologies, the new hotness, right? You really filled in a lot of gaps in my knowledge, Gabe. So it's a great conversation. Thank you so much.

54:00
Anytime, guys. I mean, it doesn't have to be only this; anything, whenever you guys name it, I'll be there. Gabe, do you have any sort of a social presence that you want people to find you on? I do, I can probably share something, Tim. I mean, I'm guilty, I'm not too active on social media. That probably has a little bit to do with what you were asking before, Andy, how do you have time for everything, right? So, unfortunately, I'm lazy on that.

54:29
If you want to know how to get three CCIEs, don't be on social media. Cat videos will not get you certifications. Yeah, so I'll say probably LinkedIn. It's probably gonna be the one that I track the most, you know, for stuff like this, for sure. Well, thank you, Gabe, for joining us. Andy, it's always a pleasure. Thank you to everyone listening. Check us out on Twitter, at Art of NetEng.

54:58
This has been another episode of the Art of Network Engineering. Thanks for joining us. Take care.

55:12
Hey y'all, this is Lexi. If you vibe with what you heard us talking about today, we'd love for you to subscribe to our podcast in your favorite podcatcher. Also, go ahead and hit that bell icon to make sure you're notified of all our future episodes right when they come out. If you wanna hear what we're talking about when we're not on the podcast, you can totally follow us on Twitter and Instagram at Art of NetEng. That's Art of N-E-T-E-N-G.

55:38
You can also find a bunch more info about us and the podcast at artofnetworkengineering.com. Thanks for listening.

