The Art of Network Engineering

Ep 121 - Juniper Apstra

June 07, 2023 A.J., Andy, Dan, Tim, and Lexie Episode 121

This episode is sponsored by Juniper Networks!

This episode was recorded in-person at the 2023 Juniper Enterprise Analyst and Influencer Summit at University of Texas, Dallas! A.J., Tim, Dan, and Lexie chat with Vinod, Senior Director of Product Management.

We talk about the many challenges facing Network Engineers and then we talk about how Juniper Apstra can address them!

Try Apstra for free in Juniper’s virtual CloudLab:
Learn more about Juniper Apstra:

Find everything AONE right here:



This is the Art of Network Engineering podcast.



In this podcast, we'll explore tools, technologies, and talented people. We aim to bring you information that will expand your skill sets and toolbox, and share the stories of fellow network engineers.



Welcome to the Art of Network Engineering. I am AJ Murray, and I am very excited to share with you that we are on site at the UT Dallas campus for the Juniper Networks Enterprise Analyst and Influencer Summit for 2023. We are in person. I can't say that we're live because we're not exactly broadcasting this, but we are in person, and I'm happy to be joined by Dan. Dan, how are you? Howdy, how's it going? Great to see you. I know. We haven't done this for like a year. Yeah. In person.



A year this month or last month. Yeah. Tim Bertino. I'm fantastic, AJ, but I do want to set the stage for the audio-only listeners. Dan is in a nice button-up shirt and a trucker cap. But I love it. He even asked me this morning, he's like, should I ditch the cap? I'm like, no, it's you. It's part of the persona. You gotta have it at this point. But I'm very excited to be here. I was thinking, I think this is...



the first time I've ever set foot in the state of Texas. And not only that, but we got to do a fantastic tour of the University of Texas at Dallas campus. It's a beautiful campus. We got to see a lot of really cool innovations here. So very excited to be here with my friends. Absolutely. And of course, Lexie. Lexie.



Absolutely stoked to be here. We've seen like the greatest commercial that I've ever seen in my life today. Yeah, the campus is beautiful. This is awesome. And it's great to see you. Yeah. We've been seeing like robots driving around and whatnot. So that's kind of cool. We're truly in the future right now. It certainly feels that way. Today our guest is Vinod. He is the Senior Director of Product Management at Juniper Networks. Thank you so much for joining us today. I should say, my



best Texas drawl, howdy. That was fantastic. That was great. We'll work on that one. You did pretty good. Dan's the expert in howdy. Yeah, right. So the context of the conversation today is what are the challenges facing network engineers? And there's so many. I'm not sure where to even begin. I think legacy CLI sticks out. That's how we're still configuring our networks today. As a deployment engineer, I'm still doing that a whole lot. And I wish we could.



Get over that hurdle. What are some of the other challenges you guys think? I have to start with troubleshooting. I still feel that in enterprise IT, troubleshooting problems is still a very manual and reactive process. There are some things, yes, we can catch with monitoring. But a lot of times, unfortunately, it takes a phone call by somebody having a bad day, having a problem, frustrated. They make that call to.



front line and, right, wrong, or indifferent, it almost feels like an interrogation to the person making the call, because they get 20 questions about what's wrong, what's the problem. And if that issue has to be escalated to another team, there's a good chance that second team, that escalation team, is going to be asking more questions, maybe even the same questions. And the monitoring isn't even proactive. It's like, oh, hey, by the way, this thing broke. We opened a ticket for you, but it's broken.



Yeah, so I do think that that's a huge challenge that I don't know that I have the answer for. Right, right. Well, I think I know someone that does. Well, let me answer your first part, which is the CLI part. So at Juniper we acquired a company about two and a half years ago called Apstra, and we've solved that CLI problem. So when you're actually going out and building and configuring a network,



So in Apstra, we have this concept of a blueprint. You actually build your network on a GUI using a blueprint. And we have some built-in blueprints. You could actually just pick one that already suits your particular deployment. And then when you use the blueprint, you can say, OK, here's my final version of the network. This is what it's going to look like. Then you go and buy the equipment. You bring it into your data center, install it, and stick it in the rack. And then you just apply the blueprint to it. So you're not configuring the network using any CLIs at all.



And if you have a network that doesn't fit one of those pre-populated designs, can you still build? We can help you with your specific blueprint if you want. And then you just go from there. And the other interesting thing about Apstra, by the way, and we don't talk about it a whole lot, is that it uses, and we invented this term, intent-based automation. What it basically means is that you design a network to operate in a particular fashion.



And what Apstra does is once a network is up and running, you actually designed it. and says, hey, this thing has changed. you know this thing called Time Machine, Now, does this have to be only Juniper equipment?



No, that's the best part is, who have no Juniper equipment, I'm not sure I've seen anything quite like that. Now, I don't think it's just about proactive monitoring. and some people use the term observability.



The question to ask is, does my network or does my tool, whatever I'm using in my network to understand the network's status and what's going on, give me intelligent information that I can go and act upon? And even if I build a tool behind it and say, I'm going to build an AI engine or a machine learning something, a gizmo, if the information that's coming is not



something that I can act upon, doesn't have the right intelligence. It's just like, for example, if you dump a whole bunch of telemetry data to me and say, you know, here's all the stuff, you know, 5,000 sensors we have. That doesn't tell me anything about how well the network is operating. So you need to first build the sensors. You need to have that telemetry data available. But then you need to build on top of it a data science model that says, what does the information tell me?



sensor data. What's it telling me about the state of the network? A port is flapping, or my traffic is not passing the way it should. And then the back end of that is using it to actually go and fix the network automatically. And we have some of that capability in our Mist products, for example, with Marvis.
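The step from raw telemetry to something an operator can act on can be pictured with a small sketch. This is only an illustration of the idea in the conversation, not how Apstra or Marvis works: the `LinkSample` record, the function names, and the threshold of three transitions are all assumptions made up for the example. It reduces a stream of link-state samples to a short list of ports that look like they are flapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkSample:
    """One hypothetical telemetry reading: was the port up at this time?"""
    timestamp: float  # seconds since the start of the observation window
    port: str
    is_up: bool


def count_transitions(samples):
    """Count up/down state changes per port, reading samples in time order."""
    last_state = {}
    transitions = {}
    for sample in sorted(samples, key=lambda s: s.timestamp):
        previous = last_state.get(sample.port)
        if previous is not None and previous != sample.is_up:
            transitions[sample.port] = transitions.get(sample.port, 0) + 1
        last_state[sample.port] = sample.is_up
    return transitions


def flapping_ports(samples, max_transitions=3):
    """Return ports whose state changed more than max_transitions times."""
    counts = count_transitions(samples)
    return sorted(port for port, n in counts.items() if n > max_transitions)


# Illustrative data: one port toggles on every sample, the other stays up.
samples = [LinkSample(float(t), "et-0/0/1", t % 2 == 0) for t in range(6)]
samples += [LinkSample(float(t), "et-0/0/2", True) for t in range(6)]
print(flapping_ports(samples))
```

The raw input here is twelve readings; the output is one port name. That reduction, from sensor data to a statement about the state of the network, is the part the telemetry alone does not give you.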



So the question is, does it make sense to use that similar technology across the rest of the network, the data center, if you wish? I think it does. And over time, you'll probably see Juniper investing in that as well. How does that... So I'm assuming that you need to have a baseline, right? At some point when you're first deploying, when you're first using Apstra, how do you get that baseline? Is it based off of the template that's used for deployment, something else?



The baseline is in the blueprint. So it actually tells you you're building a three-stage Clos fabric with EVPN and VXLAN. Here are all the features that you're going to implement in that network. And that forms part of the blueprint. So this is a fabric then? It is. I want to make sure we highlight that. Because automation's been all the rage, right? Like you've been saying, we want to get out of the CLI. We want to be hands off to the point where



people aren't having to find all the nerd knobs and make sure we're configuring everything to what we think our, like you said, intent is. But what I think we don't always think about is network design. If you have not the best network design, automation can just push that complexity to the end of the earth, right? So what I'm hearing you say is that



simplify the network design so we're building our design as intent within Apstra in the GUI? Yes, absolutely right. And actually, if you think about it, right, and most networks are not simple, they're always complex to some extent, right, you cannot possibly go and figure out



all the elements of that network design, if you're going and putting automation on a switch by switch basis, right? So I can use Ansible to go and provision one switch. Then I use Ansible, the same scripts, to go and provision the second switch. Now, if the second switch is a different layer in the network, I have to use newer scripts to go and do that. It doesn't make any sense to start thinking about your whole fabric, especially if you've got, let's say, 50 switches, to do it one switch at a time. And so what you really need is to look at the whole fabric from the point of view of what is the fabric supposed to do? What am I looking to accomplish?
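The contrast between per-switch scripts and fabric-level intent can be sketched in a few lines. This is a hedged illustration only: `FabricIntent`, `render_device_intents`, and the field names are hypothetical and are not Apstra's data model. The assumption is a simple spine-leaf fabric where every leaf connects to every spine and only leaves carry the overlay networks. The point is that the operator states the fabric once, and the per-device roles fall out of it instead of being scripted one switch at a time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FabricIntent:
    """Hypothetical fabric-wide description, stated once by the operator."""
    name: str
    spines: int
    leaves: int
    overlay_vnis: tuple  # overlay network identifiers the fabric must carry


def render_device_intents(fabric):
    """Derive one per-device description for every switch in the fabric."""
    spine_names = [f"{fabric.name}-spine{i}" for i in range(1, fabric.spines + 1)]
    devices = [
        {"hostname": name, "role": "spine", "uplinks": [], "vnis": []}
        for name in spine_names
    ]
    for i in range(1, fabric.leaves + 1):
        devices.append({
            "hostname": f"{fabric.name}-leaf{i}",
            "role": "leaf",
            # Assumption: every leaf connects to every spine.
            "uplinks": list(spine_names),
            # Assumption: only leaves terminate the overlay networks.
            "vnis": list(fabric.overlay_vnis),
        })
    return devices


intent = FabricIntent(name="dc1", spines=2, leaves=4, overlay_vnis=(10010, 10020))
for device in render_device_intents(intent):
    print(device["hostname"], device["role"], len(device["uplinks"]))
```

Changing the fabric from 4 leaves to 48 is a one-field change in the intent. With per-switch scripts, the same change means writing or re-running automation for each new device and each layer.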



How many servers am I going to connect? What other network devices am I going to connect? Am I going to connect storage to it or not? And then you say, what do I want the fabric to actually operate like? What kind of features does it need to have? How many overlay networks am I going to run on it, and so on and so forth? And once you've actually arrived at that conclusion, you say, OK, here. I go and design my blueprint.



and then use the blueprint to go and actually provision the network automatically. You don't have to touch anything. You can just go in Apstra and say, deploy the blueprint. And boom, all 50 switches get deployed exactly the way they're supposed to be deployed, with all the right features enabled, turned on, all those nerd knobs, without you having to go and touch them. And then it just starts operating. And the best part about Apstra is, so we talk about, the way we think about Apstra is day zero, which is when you're designing the blueprint.



Day one, you're going and implementing and actually bringing up the network. But the most interesting thing about Apstra is past that, which is day two plus, which is as you're operating the network over the next, whatever period of time, a year, two years, five years, you are looking at the known good state of the network, which is what is in the blueprint, and then comparing it to what's going on in the actual network. And what Apstra does is it keeps looking at that all the time, every time. And it tells you if something goes



helter-skelter, you know, something is not working the way it should, it actually informs you. Well, so on that, what about self-healing, where it actually fixes it for you? That's a good question, yes. So, yes. Because we're talking about AI here, right? Exactly, exactly. So, I don't think the self-healing part is there yet, right, but it's in our roadmap. Why is it in our roadmap? Because we've already seen that



self-healing is part of the story that Juniper brings with our Mist and Marvis engine. So we already have machine learning built into our whole campus and data center fabric, I mean campus fabric for our customers who are using wireless LAN and campus equipment. And so we already have the capability of using Marvis to go and automatically resolve issues for you. Right? That already exists, that tool exists today and it's only going to get better over time.



and none of our competitors have anything similar to it. So there's good reason why we should take that technology and port it over to other parts of our portfolio. But now, wait a minute. You're talking about a single tool that would manage your campus. To rule them all. And potentially, at some point in the future, your data center. You're not going to have a different tool for.



campus, a different tool for data center. It doesn't make sense. Over time, customers want to basically have the same console and the same tool, right, with the same capabilities. So the things that we need to do are you take the data center fabric. So the stuff that is already in Mist, for example, is a lot of rich telemetry information. It tells you what's going on with all the access points, with your end customers that are connected to those access points, with IoT devices that are connected to those access points. All that sensor and telemetry data goes into the cloud.



Mist Marvis, the AI engine, is doing machine learning on the back end, and all the data that it collects from millions and millions of different devices, and it's telling you, okay, if I see XYZ thing happening in the network, I know how to go and resolve it. Can you adjust that baseline, that known good? There's a way to do that. Absolutely. So one thing that you just kind of piqued was, if we're taking AI,



and you're collecting this data from all these millions of devices from other customers, right? Is there a way that you can kind of tell what their topologies look like, what their designs look like? And then maybe other companies could utilize like, hey, this works really well for this company and they're kind of set up similar to you. Maybe you should try these settings or this design or that kind of thing. Yeah, absolutely. I think, yes, you hit an important point in this conversation that we have with our customers is



what is known good. Actually, even before we get to AI and ML, we already have prescribed designs and architectures for our customers. So we publish, you know, what we call validated designs. And customers can use that validated design. Let's say you're building a simple five-stage Clos fabric. We have a booklet. It tells you that. And then we can put that automation completely and the blueprint completely into Apstra.



And so now you don't have to go in like, okay, read through 80 pages of documentation to know how to go and build a five-stage Clos fabric. You can actually go into Apstra and say, hey, give me a blueprint for a five-stage Clos fabric. We already have some pre-built templates. Pick one and then go and deploy it. I so badly want you all to come up with a Clippy



for configuration. You gave a talk earlier today on Apstra. And I quoted you as saying that Apstra makes it hard to make a mistake. So I'm inferring that that means if I go to make a configuration change in my fabric, that is being checked against that baseline? When I make the configuration change, if it doesn't match my intent, it's not going to push it. It'll raise an alert and



hopefully, and it's a two-stage process, which means that it's going to raise an alert before you actually push the change. And it's going to tell you, hey, hang on a second, this is what you want to do. And here the fabric is going to change in a meaningful manner and it's not going to operate the way your original blueprint wanted it to operate. So you have to explicitly make that change. And then once the change is made, if things are starting to break, it'll actually alert you as well. It'll keep telling you like, hey, hang on a second. It reminds me of like the old Nintendo Duck Hunt where the dog will come up and...



laugh at you. Yeah. Told you. Told you so. Yeah. So this is interesting. That's exactly right. So the point about Apstra is the focus of Apstra has always been how do we build a reliable network, right? If reliability is the core tenet that you're working off of, it means that you want to have a network that never breaks to start with, right? And actually, if you listen to Mike Bushong, who's my boss, he'll say that before you can get to speed, you need to have reliability. This is why



before you get to faster and really fast cars, you need to have seat belts, and you need to have airbags, and all of that stuff. And then now you can get into a McLaren and drive it 200 miles an hour. But if you did not have the seat belt or the airbag, it wouldn't have been a smart idea to drive at 200 miles an hour. Let's talk about this in terms of the future of networking. We're seeing this huge push for network engineers to be like.



DevOps basically now, right? and automate everything. A lot of us are resistant to it, myself included. as network engineers without maybe having to learn I know, I hear you.



And he said, I'm not a code monkey, don't ask me to write any piece of code, right? And you know him, right? I wasn't gonna dox him, but since you did, yes. Yes, I know him.



But it's an important point, which is that I think the network engineer should look at and visualize, okay, how are we going to operate the network? What can I do to improve the functionality of the network? How is it going to look three years down the line? What are the other technologies that I can bring to bear to make it work better? But you shouldn't be writing code. You shouldn't be writing Python or CLI or something else.



So that's the job of, so here's an interesting thing, right? Somebody once said that the vendors who build networking equipment are not the people who operate the network. So most of the times companies like Juniper and Cisco and Arista, we build products and we make an assumption that we know exactly how the operators are gonna use it, but we don't put ourselves in their shoes. I think with Apstra we're actually getting to the point where we're putting ourselves in their shoes.



And we are telling, we understand the world from your perspective. And we're saying, some of this low-level stuff doesn't make any sense for you to do. Why don't we just automate it and tell you exactly, give you this thing where you can go and deploy it without having to go and write any piece of code or anything. Just use our GUI, go deploy it, and it works right out of the box.



What's left is the stuff that we already talked about, the stuff about what we call proactive monitoring. But I think beyond proactive monitoring, you need to get to a state where you have observability, which means this intelligence that's coming from the tools telling you that I have looked at these three metrics coming from this device, and I can tell you that it's likely that you might have a failure in 30 days time, right?



I think that's observability or proactive monitoring, in fact. And that's the first part. And then the second part is being able to resolve that, or at least tell you that, okay, fine, you know, hey, this part is going to break, and I'm shipping you an RMA box 10 days in advance of it breaking. Right?



That will be the really good end stage. That's the gold standard. And then, yeah, yeah. Or even simple things like upgrades, right? People talk about like upgrade processes and how hard it is and we have a complex network. Don't give me star. I am that person, yeah. Yes. But you want to simplify it. So Apstra does things like simplify the upgrade process. It knows the known good state of the network. It knows what you're looking to go from and to. And it simplifies the whole process so that.



It brings up one network element at a time. It brings down one network element at a time. It upgrades the devices. That's all part of the whole experience of using something like this, which is that you don't have to write low-level code or use Python and stuff like that. So one point I want to add to that though is, I mean, at some point, people will probably want to use scripting, whether it be Ansible or like a Terraform or Python or something like that, to scale out that blueprint, right?



So is there going to be a point where it could be like, And it kind of gives you that script, right? So we actually demonstrated, and we have a Terraform provider for Apstra. So what that does is it allows you to scale those Apstra instances. So let's say you have a known good blueprint and you want to scale it to 10 data centers. We have that right now with Terraform.



So you can actually use that and go and deploy the same Apstra blueprint in however many other instances you want. Tim, I think you had something. I cut, or was that what you're gonna? You took it. Oh, I stole that from you. No, it was fantastic. I wanted to hit the opposite end of the spectrum, right? The opposite of the traditional network engineer, the groups that may have scripting and automation already, how do they, or can they, if they want to use Apstra as their...



method of pushing that intent. How do they integrate their existing tools and is that possible? So yeah, you nailed it, Dan. Yeah, we've been going after Terraform for some time. We understand that the future of the so-called scripting ecosystem is going to move to Terraform. And so we built this Terraform provider and we can actually demonstrate it. A lot of our listeners, I suspect, might be afraid that, you know, tools like this.



could potentially, or just in general, we're looking at the future of network engineering and we're seeing this fear that we're all gonna be replaced as network engineers, like automation is just gonna take over and no one's gonna hire network engineers anymore. Is that what Apstra is gonna do? No, I don't think so. In fact, I was with a customer in Vancouver recently, and there are network engineers there who've been working for 20 plus years.



they say that the consumption model, the way they want the network to operate, so they're already using Google Cloud, for example, right? So they know that they can create these availability zones, they can have the ability to essentially give customers their end user customers who are their internal customers, the ability to get whatever resources they need in real time, right? So customer goes and says, I need 15 VMs with so much network connectivity and so much storage, and they can spin it up in any instance in Google Cloud.



these deep private data centers. And Apstra can do that for you, right? to go and provision the rest of the resources that you need. Yeah.



I can't be sitting, I remember a story for Bank of America, long back, and we were selling servers, and Bank of America, the IT guys told me, yeah, your servers have been, they'll be sitting there in the warehouse for six months. And I said, why is that? Well, we have to go to the networking engineers. Then we have to open a ticket, and by the time they assign us IP addresses, this is before we can put the devices in the rack and connect them to a cable, is I need those IP addresses, and that takes six months time.



Well, that should not be the model, right? When you buy your servers, you don't want to be sitting in the warehouse. So I think the whole consumption model is changing. It's not like these engineers are seeing this as a threat. They're saying, I can up-level my game now. Yeah, I think this is a natural evolution. I mean, if you look at any product in IT, a long time ago, to install anything, it was this long process of giving inputs and moving through this step process. And now it's...



maybe ask a couple of questions, give a few inputs, and allowing us to free up some time I'll tell you another story, and one of our customers a few years later and in Hyderabad.



So they used this box, which basically allowed you to provision storage ports and network ports to servers on demand. And they created a thing, it's almost an IT-as-a-service model. I can't remember the exact term that they used for it. But they gave their developers access to a console or web GUI. And you could go into the GUI and say, I need to spin up 10 VMs, four running Linux, two running Windows, and four more running something else.



I need so much network and so much storage. And I need it for this period of time. So it's almost like what you got from Amazon and AWS, but this is going back to like 2008. And they clicked Submit and it went out and got provisioned. Now, when I met the IT guy who was running, by the way, they had a separate IT group to do some other function, right? And this guy was telling me a story. He says,



Oh, this colleague of mine, he came and he told me, and it takes him only like 24 hours. And he just went to this console and he hit submit who is the end user developer, got access to all his 10 VMs.



So that is the model. And this guy said, oh, now what do I do in my free time? Oh, I go out for lunch, you know, I go meet my family, I play golf, do everything else. So networking in the future is looking like, you know, less tediousness. That's what I'm hearing. Exactly. I'm happy about that. I think that's a big point to hit on though, is like, it kind of takes that support role and, you know, lifts a little bit off of that, off the engineer. And I don't know about you guys, but support is not my favorite thing. So, no.



I think it goes back to IT teams wanting to be more than just a cost center. They want to be able to provide value. And by providing value, they're getting out in front of big projects, they're assisting the business and understanding what the business is trying to do and find the right solution to that. That's really difficult to do when all you're doing is keeping the lights on. That's very true. Yeah, I can't say anything more than that



IT moving away from a cost center to something that actually provides value to the business. And you can show that as a differentiator. Hey, I'm able to give you the resources you need on time. Your developers don't have to wait for two days before they get access to the virtual machines or whatever the network resources. That's a huge differentiator. Nobody's sitting there twiddling their thumbs and waiting for IT. So I do want to talk about the phrase that everyone loves to hate, single pane of glass.



So for customers that are looking to adopt Apstra to deliver intent to configure the network, if they do so, what are some of the tools or consoles that they don't have to manage anymore because they're living in Apstra? Or you don't have to go into a Junos CLI, you don't have to write Python scripts or use Ansible. So that's the.



beauty of Apstra. By the way, I don't know if I've said this enough, but Apstra is multi-vendor. We have customers who bought all Dell and are using Apstra. So they don't have a single Juniper switch. Can I ask why you support other vendors with this platform? So Juniper believes that the world moving forward has more to do with making sure your data center operations is the focus area.



That means that we want to improve operations first. By the way, if you look at switches in general, there's not that much difference. If you look at Dell switches and some of the Arista switches and Juniper switches, we all use the same similar Broadcom silicon. So where's the differentiating factor? The differentiating factor comes with operations. Now, with Apstra, it gives us an insertion. So let's say you bought Dell switches today, and you're using Apstra. Two years from now or three years from now, you're going to renew your infrastructure. You're going to put out a bid.



And that gives us an opportunity to come in and say, why don't you also try using the Juniper switches, right? when it comes to these few deals The bigger picture for us is that we think that So Juniper, for probably a couple years now,



We want to, Juniper wants to put the end user first in making sure they have a good experience. And you have the same methodology with the experience-first data center. And you highlighted in your talk earlier today kind of three pillars: design and plan, config and deploy, and operate. Now back to that whole single pane of glass methodology, it kind of feels to me that Apstra really fits into all three of those facets, is that correct? That is absolutely right, yes. And so,



Apstra comes in at the earliest stages of building out your network. Some customer puts out an RFP, and we can talk about all the systems that we can sell to you to fit into that RFP. But a core part of that conversation has to be, OK, how are you planning to go and deploy your network? What tools are you going to go and use? Are you going to do your own in-house automation or not? And let me come and tell you what the value proposition is of buying Apstra. And you saw in my slides the ROI.



is 320%. This is from the time saved by these customers How much less work you need to do It is. but there was a lady on stage with me at the talk Rita.



data centers in nine different countries in Latin America. And they've already migrated over to Apstra in five of those and they still have four more to go. But the experience that they're describing now is leaps and bounds ahead of what they were doing three years ago, right? Now, if any of our listeners would like to find out more about Apstra, is there like a demo available? Oh yeah, good question. Nice lead up. Yes, absolutely.



We have free demos available on our website. We have something called Juniper Cloud Labs. You can actually go in there and actually build a fabric and test it out for yourself. So all of that is available and I'm guessing you'll put a link somewhere in your podcast. Yeah, that's amazing actually. I know a lot of people ask for demos these days whenever I say anything about, oh, I love this switch. They're like, where's the demo? So I don't think we've touched on this too much.



and I want to make sure it's clear for our listeners. Where does it live? but you can run it as a virtual machine. and looks at all the switches



So it really feels like the long-term vision and goal of Apstra is really to be that way of how network engineers and network operations teams interface with and build fabrics and maintain the network. Yeah, I think the term we use is control point, but that's exactly right. Apstra is a centralized control point. And by the way, if you have multiple Apstra instances running in multiple data centers, we're building something that.



consolidate the view across all of that as well. Interesting. Yeah. So that's in the future. But it is coming. Very cool. Something that always sort of gets tangled in my mind, because I don't live in the automation world too much these days. And let me know if this is just something that we don't need to talk about, go off tangent. But machine learning, AI, intent-based networking, those terms get thrown around a lot with each other.



And I'm just curious if we could clarify a little bit more where Apstra comes in on that, I don't know if it's a spectrum between intent-based networking and full-on AI and then machine learning in the middle. So today Apstra does not have any machine learning or AI built into it. It's a very simple tool that has a database.



and what it's looking at. The database contains all the primitives that tell you all the elements of the network and how they're supposed to be configured and so on. And it uses probes to look at the network itself, the state of the network. And it's comparing the state of the network to your database, your original intent. So even without AI and machine learning, even without that, it can do all of these things? Exactly. So there's no AI. It's a very simple database and a comparison tool that says, OK, what's the state of the network? And what does



the database say it should be? But it will alert you. And it will alert you saying there are some changes, or some port is not configured correctly, or things like that. Where machine learning and AI come in is a little bit at a higher layer. So we have a lot of sensors; Apstra can actually probe telemetry information from different switches and tell you, OK, the CPU consumption, the memory utilization, the port config, CRC errors, and so on and so forth. So all that telemetry data is coming into Apstra.



and it's going to get represented in Apstra. But that's not doing anything with respect to the intent. It's just telling you, OK, I'm seeing memory utilization go up on this particular system. Where machine learning comes in is where we can tie this data back into Marvis, which is our AI engine. So you've already seen that Mist and Marvis use machine learning and AI to actually do more with respect to, you know,



finding out where there are issues, and being able to even proactively resolve some of those issues without the operator going in and having to do anything. So I think the next obvious evolution for us is to say, can we take the data that's coming out of Apstra, and not just one Apstra instance, but multiple Apstra instances, and move that into a machine learning engine like Marvis, and then use Marvis to do the resolution of



the issues that can be resolved within Marvis. And by the way, the whole premise behind Marvis is that it's not just a fixed static system, it's continuously learning, right? As things are getting more and more complex, as networks are getting wider and larger, as there's more data coming in, Marvis is constantly learning and saying, okay, now I understand that if I have 1.8 million Mist APs, my network looks different, and here's how I can correlate all that information.



So as Marvis gets more intelligent, it's going to add more capabilities into the tool itself in being able to understand where things might happen, where there are issues that might happen, and how do I resolve those. And in some cases, the resolution may not be automatic. It may be that there's an operator intervention required, or a box fails and you have to RMA it or something like that. So Apstra basically provides us this intent-based



networking system with the high potential for this evolution into plugging into things like AI and machine learning to help us go into the future as network engineers. Awesome. So let me ask this. Does Apstra at this moment have any kind of tool set where we're talking about intent-based networking, right? What if you don't fully know your intent?



might not know exactly how all their systems work. And so we need to figure out like, what's the intent of this, right? Is there any kind of built-in tool set that handles that, that helps us to understand which applications are talking to which other applications to, you know, figure out what the actual intent of the traffic is? Yeah, I don't think we've done any integration with application tools just yet.



But there is a potential to do that, right? Which is to then say, okay, you know, by the way, there are third party tools that do application monitoring and stuff like that. Yeah, there are, yeah. There are lots of them. So I don't think we would go and reinvent something like that, but we would potentially, over time, figure out how to take some of that data and then...



go and be able to assess, okay, is the network doing the right thing, is it designed in the right way, has it got all the right capabilities to support the application traffic as you need it or as you want it to go? Yeah, because my question is like, you know, I assume the fabric can see the traffic, right? So it should understand what's trying to talk to what, right? But we don't have probes that go into the application layer itself. Now the other way to do it is to take the Apstra API, because we have an outbound API,



and actually put that API, use the data that's coming out of that API, into the application monitoring tools themselves so that somebody can get alerted from the app tool itself saying, hey, I have a port that's down, and therefore my application traffic is blocked or whatever.
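To make this part of the conversation concrete, here is a minimal Python sketch of the two ideas discussed: comparing an intent database against probed network state, and turning the resulting anomalies into alerts that an application monitoring tool could show. Every name and data shape here (`find_anomalies`, `app_alerts`, the port dictionaries, the app-to-port mapping) is hypothetical and chosen for illustration. It is not Apstra's schema or API.

```python
from typing import Dict, List, Tuple

# Hypothetical data shapes for illustration only; not Apstra's schema or API.
Port = Tuple[str, str]  # (switch name, port name)


def find_anomalies(intent: Dict[Port, dict],
                   observed: Dict[Port, dict]) -> Dict[Port, str]:
    """Compare intended port settings with probed state and return mismatches."""
    anomalies: Dict[Port, str] = {}
    for port, want in intent.items():
        have = observed.get(port)
        if have is None:
            anomalies[port] = "no state reported"
            continue
        diffs = [
            f"{key}: expected {value!r}, observed {have.get(key)!r}"
            for key, value in want.items()
            if have.get(key) != value
        ]
        if diffs:
            anomalies[port] = "; ".join(diffs)
    return anomalies


def app_alerts(anomalies: Dict[Port, str],
               app_ports: Dict[str, List[Port]]) -> List[str]:
    """Translate port anomalies into per-application alert messages."""
    alerts: List[str] = []
    for app, ports in app_ports.items():
        for port in ports:
            if port in anomalies:
                switch, name = port
                alerts.append(
                    f"{app}: {switch} {name} deviates from intent "
                    f"({anomalies[port]})"
                )
    return alerts


# Assumed example data: the intent says both ports are up; one is observed down.
intent = {
    ("leaf1", "eth1"): {"up": True, "vlan": 10},
    ("leaf1", "eth2"): {"up": True, "vlan": 20},
}
observed = {
    ("leaf1", "eth1"): {"up": True, "vlan": 10},
    ("leaf1", "eth2"): {"up": False, "vlan": 20},
}
# Assumed mapping of applications to the ports their traffic depends on.
app_ports = {"billing": [("leaf1", "eth2")], "web": [("leaf1", "eth1")]}

anomalies = find_anomalies(intent, observed)
for message in app_alerts(anomalies, app_ports):
    print(message)
```

In this sketch only the hypothetical "billing" application gets an alert, because its port is the one that drifted from intent. In practice the mapping from applications to ports would have to come from the application monitoring side, since, as noted above, there are no probes into the application layer itself.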



I think there's a lot of potential for value there if you have a way of aggregating metadata for multiple customers into an AI or an ML engine that can start correlating trends. Because you mentioned something earlier about finding ways of getting out in front of potential hardware failures, or maybe software bugs, so that if all of a sudden we see multiple different customers running into a certain thing, you can get proactive in



replacing an SFP or, hey, we think you're gonna have a power supply issue in X amount of time. There's a lot of value there. Even just rebooting a process, right? Let's say for example, there's a process in Junos that's hogging memory over time. Now some of these processes, if we run our tests internally inside our test engineering team, we may not see that problem occur because we run our tests for a brief period of time. But for some customer, there's a process that's misbehaving and six months later, you know,



slowly over time, it's starting to consume more and more memory. If we can see that early, we can tell the customer, hey, we have a problem with that process, why don't you just reboot that one process? And then we go to engineering and we say, hey, that process is not functioning the way it's supposed to function, why don't we go redesign it and fix it? So both of those are like proactive actions that we can start looking at way before anything actually catastrophic happens. One other thing I just kind of thought of is like, how this can...



potentially help like change management, right? If the device knows that there's an issue coming, it can go ahead and put in a ticket and be like, look, I know these are your business hours, I know these are usually maintenance hours. If there's multiple things going on, maybe it could even schedule it for you. You know, like, okay, at eight o'clock, I'm gonna fix it, reboot this process, right? That kind of thing.
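The slow memory leak and the "reboot it at eight o'clock" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample data, the 800 MB limit, the 8 PM daily maintenance window, and the function names are all made up here, and a plain least-squares line stands in for whatever trend analysis a real system would use.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (day of measurement, memory used in MB)


def growth_per_day(samples: List[Sample]) -> float:
    """Least-squares slope of memory usage (MB) against time (days)."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    den = sum((x - mean_x) ** 2 for x, _ in samples)
    return num / den


def days_until_limit(samples: List[Sample], limit_mb: float) -> Optional[float]:
    """Days until the trend line reaches limit_mb, or None if usage is not growing."""
    slope = growth_per_day(samples)
    if slope <= 0:
        return None
    latest_mb = samples[-1][1]
    return max(0.0, (limit_mb - latest_mb) / slope)


def next_window(now: datetime, window_hour: int = 20) -> datetime:
    """Start of the next daily maintenance window (8 PM is an assumed default)."""
    candidate = now.replace(hour=window_hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


# Assumed telemetry: a process whose memory creeps up over three months.
samples = [(0, 500.0), (30, 560.0), (60, 620.0), (90, 680.0)]
remaining = days_until_limit(samples, limit_mb=800.0)

# Assumed policy: if the limit is less than 90 days away, propose a restart.
if remaining is not None and remaining < 90:
    when = next_window(datetime(2023, 6, 7, 14, 0))
    print(f"Propose restarting the process at {when:%Y-%m-%d %H:%M}")
```

A real change-management integration would open a ticket instead of printing, and would check for conflicting maintenance before picking a slot. The sketch only shows the two decisions mentioned in the conversation: noticing the trend early and choosing a time outside business hours.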



I love that, Dan, because I hate doing all those things mainly. And then notifying customers as well. it's proactive for the OEM, the vendor, right? or go to your software developers or let's have the developers look at this particular problem



Absolutely right. but it's not just reliability on the software layer And so if we know early on that we're going to experience or something else, and because we collect data across we might actually be able to find patterns



Yeah, if this network is set up in a particular way, and this kind of traffic is sent through the network, we see this pattern happening, and can we do something proactive about it, whether it's fixing software or something in the hardware. Like an RMA or something like that. Exactly, exactly. But actually we used to do that, when I was in my startup that I mentioned, right? I'll give you an example. So one of our customers was Disney Interactive, okay? All the Disney, you know, the gaming stuff, ESPN, all of that stuff was running on this fabric that we were a core part of.



Okay, and Disney decided, now we told them there's a limit to the scale that you can support in one chassis or one box, right? We said roughly about 300 servers. Well, they went ahead and kept adding more and they went up to like 650 servers. Oh wow. Added to two boxes, which was fine. But then we started seeing issues. Now, we had a system which was proactive, which in those days proactive meant that I could take all the log files that are coming out of the box, ship it to my support engineers.



And they would actually be able to parse those support logs and tell you that there's something happening with one element in the chassis. And this actually happened. So we called up Disney's IT team, and we said, you have gone way too far. You're likely to see a problem in the network. So before you see that problem, we're shipping you two new boxes. We want you to split the network back into two. So 350 servers connected to one pair of boxes and 300 connected to the other pair of boxes.



So before the problem occurred, we could actually tell them, here's the resolution, go fix it. Now, the onus is on us as vendors to actually build those tools and give you the capability so that on the backend we can go in and say, hey, we don't have to have support engineers only troubleshooting tickets after things are broken, but we could be more proactive about telling you, hey, you're likely to face a problem, here's the resolution, and by the way, we are either sending you a patch, we're sending you a new box, or whatever it is.



But see, I think that's also a win for the vendor, right? Because what's my customer experience like when you guys are informing me that, hey, by the way, you're gonna need a new box and we've already shipped it to you kind of thing. And I see that as like, okay, that's less I have to even worry about, right? Yes, absolutely, absolutely less, yes. And so you're right, it's a win for us as a vendor. It's a win-win. Exactly.



That's something I've heard consistently throughout today: problems happen. It's just accepting that it's part of life. We're always going to see it. And I think that the standout piece is, how does the vendor handle those problems? And what's the customer experience like? Absolutely, yes. But one of the interesting things that we started about four months ago is something we call quality of life. So here's the thing. The focus for us is,



If you're a customer of ours, your interaction doesn't start from the time you install the equipment and run it. Your interaction starts from the time you're architecting, you're designing, you're getting a quote from us, you're purchasing, you're installing, you're running, you have an RMA or an issue, you open a ticket. Everything from end to end. One of the focuses of my boss, Mike Bushong, is how can we improve your quality of life at every stage of this process?



And we're working towards that in a lot of things that we're doing, the changes we're making in our products, in the way we deliver them, in what is included in the box, in the rack-mount kits, and you name it, all across the board. We're slowly making changes to the things that impact you as a customer, your quality of life. Right? And proactive monitoring, being able to troubleshoot tickets, proactively sending you replacement product if needed, all of that is part of quality of life. Yeah.



Well, Vinod, we appreciate you demystifying Apstra for us. We've been hearing a lot about Mist today, so we've been getting mystified all day. Yeah. We appreciate the breakdown of Apstra. Thank you. Yes, my pleasure to chat with you guys. And it's wonderful to meet all of you and get to know you.



Likewise, this has been a lot of fun here at this event. So we thank Juniper for their support of the Art of Network Engineering podcast. We thank UT Dallas for hosting the event and giving us a fantastic tour today. Really cool campus. Absolutely. Yeah. Yeah. Like you said earlier, the robots just like delivering food all around campus. I do feel like they probably should have explained that before we saw it, because it freaked me out. Yeah. Is that thing supposed to be there? Yeah. But what was funny is the CTO,



like, he goes, oh yeah, they're delivering, what was it, like DoorDash or something similar? Yeah. And then he said, oh, and they've got a home over there. They just go there and charge overnight. So, well, we've talked an awful lot about Apstra today. If you want to learn more about Apstra or get a free trial, make sure you check out the links in the podcast show notes and learn a whole lot more about Apstra. Vinod, thank you so much for joining us today. This has been



such a fun conversation, and we'll see you next time on another episode of the Art of Network Engineering. Hey everyone, this is AJ. If you like what you heard today, then make sure you subscribe to our podcast in your favorite podcatcher. Smash that bell icon to get notified of all of our future episodes. Also follow us on Twitter and Instagram. We are at art of net eng. That's art of NET ENG.



You can also find us on the web at, where we post all of our show notes. You can read blog articles from the co-hosts and guests, and also a lot more news and info from the networking world. Thanks for listening.

