Juniper Networks: Automating network design conversion with ArangoDB
ArangoDB helps Juniper Networks standardize their customers’ network designs to ensure high performance and scalability.
- Optimized network design discovery using ArangoDB’s schemaless nature and native JSON support
- ArangoDB used to re-create a model of an existing network and its important constructs
- Increased customer satisfaction and conversion from hand-crafted network designs to fully automated ones
The Scenario: How to use automation to make networks more reliable
Juniper Networks (NYSE: JNPR) is a global leader in secure, AI-driven networks, dedicated to dramatically simplifying network operations and driving superior experiences for end users. Founded in 1996 and headquartered in Sunnyvale, California, Juniper is one of the world’s largest networking companies, with over 10,000 employees operating in 50 countries, revenues of over $4 billion, and usage in all of the top 20 cloud providers.
One of the product lines for Juniper Networks is switches, which connect devices in a network and allow them to communicate with each other. Juniper switches deliver agile, reliable, and scalable networks with AI-powered automation and insights.
In recent years, many Juniper customers have been compelled to upgrade their switches, due to a number of factors. Core Internet speeds have increased, as have the speeds of edge devices such as 5G cell towers and newer WiFi access points. With the growth of ‘the cloud’, applications have not only become more complex but also push more data onto networks as usage has exploded, a trend further accelerated during the pandemic. Lastly, companies increasingly serve a customer base that’s spread across the globe and expects fast performance wherever they are.
To deal with these challenges, Juniper helped its customers migrate from traditional to modern network designs – specifically, from hierarchical three-tier networks (core, aggregation, access) to modern spine-and-leaf architectures called Clos networks, which are more scalable. David Gee, Product Manager Director at Juniper Networks, calls this a brownfield migration.
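To make the target design concrete, here is a minimal Python sketch of a two-tier spine-and-leaf fabric in which every leaf connects to every spine. The device names and helper functions are hypothetical and are not taken from Juniper's tooling.

```python
from itertools import product

def build_leaf_spine(spines, leaves):
    """Return the links of a two-tier fabric: every leaf connects to every spine."""
    return {(leaf, spine) for leaf, spine in product(leaves, spines)}

def leaf_to_leaf_paths(links, src, dst):
    """List the spines that provide a two-hop path between two leaves."""
    spines_src = {s for (l, s) in links if l == src}
    spines_dst = {s for (l, s) in links if l == dst}
    return sorted(spines_src & spines_dst)

# Hypothetical device names, for illustration only.
spines = ["spine1", "spine2"]
leaves = ["leaf1", "leaf2", "leaf3"]
links = build_leaf_spine(spines, leaves)

print(len(links))                                   # 6 links: 3 leaves x 2 spines
print(leaf_to_leaf_paths(links, "leaf1", "leaf3"))  # ['spine1', 'spine2']
```

Because each pair of leaves is reachable through any spine, adding a spine adds a path between every pair of leaves, which is one reason this design tends to scale more predictably than a three-tier hierarchy.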
The Requirements: Seamlessly migrate network designs
Due to the complexity of the various layers that make up a network, the brownfield migration process can be very costly – involving hundreds of hours of manual work. To migrate its customers’ network designs to an automation platform, the Juniper Cloud-Ready Data Team needed to build a ‘world view’ of how the various devices on the network are connected, including servers, networks, leaves, and spines. The team also needed to take additional technologies into consideration, such as link aggregation and encapsulation. The ultimate goal: get all the physical and virtual network information into a format that can be queried and built into an input data set for a modern Juniper network automation platform.
Due to the relationships between all the different touchpoints in a network, David knew he needed a graph database to help him solve this challenge. Specifically, he needed to:
- Extract topology-centric data out of Junos, the operating system that runs on Juniper devices
- Detect long-term configuration drift that can lead to network failures, especially during periods of high usage
- Prioritize ease of data storage and manipulation over a rigid data schema
- Easily write functions to encode and decode data
- Structure data by role
- Structure role relationships
- Have the data be serialization ready
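As one illustration of the drift-detection requirement above, the following Python sketch compares a baseline configuration snapshot with a current one. The flat key/value layout, the setting names, and the `detect_drift` helper are assumptions made for the example; the case study does not describe how Juniper's tool represents configuration.

```python
def detect_drift(baseline, current):
    """Report keys whose values changed, appeared, or vanished between two snapshots."""
    drift = {}
    for key in sorted(baseline.keys() | current.keys()):
        before = baseline.get(key)  # None means the key was absent
        after = current.get(key)
        if before != after:
            drift[key] = {"baseline": before, "current": after}
    return drift

# Hypothetical snapshots of one device's settings.
baseline = {"ae0.mtu": 9216, "ae0.lacp": "active", "vlan100.name": "servers"}
current = {"ae0.mtu": 1500, "ae0.lacp": "active", "vlan200.name": "guests"}

for key, change in detect_drift(baseline, current).items():
    print(key, change)
```

Storing each snapshot as a document makes this kind of comparison a matter of reading two records, rather than re-parsing device configuration each time.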
David first tried using Neo4j, but he quickly struggled with a number of challenges. He spent two weeks building an abstraction layer just to pull configuration data out of Junos and into a Neo4j database instance. On top of this, David had to run multiple rounds of Junos queries to detect edge devices, such as access points, to harmonize the different data types in Junos. All this led to inscrutable code that would be impossible to maintain, and too complex to extend to support new features. He went back to the whiteboard to find a better way.
Why ArangoDB: A schemaless graph database with document and key/value support
Based on his first attempt with Neo4j, David prioritized the need for a schemaless database that was simple to get data into. A schemaless database would be flexible enough to evolve over time to ensure that he got meaningful views of the data that would drive insights. As David says, “I don’t know what I don’t know. I don’t want to go back, have to rebuild a schema and fiddle around with it, and I don’t necessarily want to have to build data types on the edge of the graph.”
David also realized that the data he was working with had multiple shapes. While the topological network data could be organized as a graph, a JSON document view of the data was also important when looking at individual devices on the network and how they connect to other devices. These JSON documents could be enriched over time, as more information was discovered about a network. With this approach, the data began to naturally organize itself such that it became easier to gain insights around the network. He also wanted key/value support for publish and subscribe (pub/sub) functionality that could quickly determine whether part of the network had changed.
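The idea of one data set with both a document view and a graph view can be sketched in plain Python. The example below mimics ArangoDB's document conventions (a `_key` on each document, `_from` and `_to` on each edge) using in-memory dictionaries; the collection name `devices`, the attribute names, and the sample values are hypothetical, and no database connection is involved.

```python
import json

# Hypothetical vertex documents, addressed as "collection/_key".
devices = {
    "devices/leaf1": {"_key": "leaf1", "role": "leaf"},
    "devices/spine1": {"_key": "spine1", "role": "spine"},
}

# Hypothetical edge documents; edges carry _from and _to attributes.
links = [
    {"_from": "devices/leaf1", "_to": "devices/spine1", "local_if": "et-0/0/48"},
]

def enrich(doc_id, **fields):
    """Merge newly discovered attributes into a document; no schema change is needed."""
    devices[doc_id].update(fields)

def neighbors(doc_id):
    """Follow edges in either direction: the graph view of the same data."""
    out = [e["_to"] for e in links if e["_from"] == doc_id]
    inc = [e["_from"] for e in links if e["_to"] == doc_id]
    return sorted(out + inc)

# Later discovery adds detail to the existing document.
enrich("devices/leaf1", model="QFX5120", lag_members=["et-0/0/48", "et-0/0/49"])

print(json.dumps(devices["devices/leaf1"], indent=2))  # document view
print(neighbors("devices/leaf1"))                      # graph view
```

The document gains new fields without any migration step, while the edge list still answers connectivity questions about the same devices.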
“ArangoDB lets me enrich JSON documents over time as more network information is discovered.”
- David Gee, Juniper Networks
ArangoDB supported this flexible approach of using graph, document, or key/value data where it made sense, out of the box.
Additionally, David wanted solid support for Go and Python, given his years of Go development and the breadth of the Python community and ecosystem. He was happy to see ArangoDB provides a native Go driver, as well as several popular Python drivers developed by its active open source community.
As David puts it, “ArangoDB met these requirements absolutely wonderfully and I, without hesitation, threw Neo4j away and restarted the project with ArangoDB.”
The Implementation: A data pipeline powered by ArangoDB
David built a brownfield solution that gathers network configurations, puts them into ArangoDB, and then allows Juniper to collect data as a set of outputs for a downstream system. Essentially, David built a three-stage pipeline of data ingestion, data transformation, and data output.
Explains David, “We take a network that has a lot of contextual data, so we’re pulling information from config files, and we build a topological view. We put this view into ArangoDB, which allows us to run experiments to hypothesize over the data, run queries, and run tests to validate that our network configurations are correct.”
David continues, “Since downstream systems like a network controller usually accept JSON as an input, we can take JSON data straight out of ArangoDB through AQL queries or Foxx microservices, and send it to those systems with very little manipulation. That’s a huge win.”
Once data such as device names and interfaces is loaded incrementally from config files into ArangoDB, Juniper can ask questions and check for inconsistencies. The outcome: data sets that are suitable for other systems and require little modification as a payload.
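The three stages described in this section (ingestion, transformation, output) can be outlined as three small functions. This is a sketch under stated assumptions, not David's implementation: the record layout, the function names, and the `devices` collection prefix are invented for illustration, and the database step is replaced by in-memory data so the example runs on its own.

```python
import json

def ingest(raw_records):
    """Stage 1: normalise raw per-device records (assumed already extracted from config files)."""
    return [
        {"name": r["hostname"].lower(), "interfaces": r.get("interfaces", {})}
        for r in raw_records
    ]

def transform(devices):
    """Stage 2: build a topological view: vertices plus edges inferred from interface peers."""
    known = {d["name"] for d in devices}
    vertices = [{"_key": d["name"]} for d in devices]
    edges = [
        {"_from": f"devices/{d['name']}", "_to": f"devices/{peer}", "interface": ifname}
        for d in devices
        for ifname, peer in sorted(d["interfaces"].items())
        if peer in known  # skip peers that have not been discovered yet
    ]
    return {"vertices": vertices, "edges": edges}

def output(topology):
    """Stage 3: serialise the view as a JSON payload for a downstream system."""
    return json.dumps(topology, sort_keys=True)

# Hypothetical input records.
raw = [
    {"hostname": "LEAF1", "interfaces": {"et-0/0/48": "spine1", "et-0/0/0": "unknown-host"}},
    {"hostname": "spine1"},
]
payload = output(transform(ingest(raw)))
print(payload)
```

Keeping the stages separate means the data can be loaded incrementally, and a link to a peer that is not yet known simply appears on a later run once that peer has been ingested.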
The Results: Minimal complexity with maximum flexibility
Reduced complexity. Thanks to ArangoDB’s graph, document, and key/value capabilities, David was able to swap out Postgres and etcd in favor of ArangoDB, reducing three databases to one and cutting the operational and software development complexity of maintaining the codebase.
Increased flexibility. With ArangoDB’s native JSON support, data can easily be enriched over time without the constraints of fixed data schemas that can sap developer time. This saves weeks of time rebuilding schemas and re-importing data to accommodate new data formats.
Streamlined input and output. Since edge and downstream device configurations are typically in JSON, ArangoDB can easily pull in configuration data from the former, and send data to the latter.
Improved data integrity. Data inside of ArangoDB can use JSON schema validation to ensure that it’s correct, reducing the chance of bugs creeping into the system.
Enhanced code maintainability. AQL is highly readable and developer-friendly. This, coupled with the inherent readability of JSON, makes code easier for new team members to understand and extend, in order to support new functionality.
Powerful decoupling. Once the data is in ArangoDB, any other developer or network engineer can query the data without needing to be a database expert. The logic around the database can concern itself with collection and conversion instead of the business of managing graph data structures, which is ArangoDB’s domain. At Juniper, ArangoDB has become a Rosetta Stone of sorts for network information.
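To illustrate the schema validation mentioned under "Improved data integrity", the sketch below builds a collection-level schema in the shape ArangoDB accepts (a `rule` holding a JSON Schema, a `level`, and a `message`). The field names and allowed roles are assumptions for the example, and the `violates` helper is only a tiny local stand-in covering `required` and `enum`; the actual validation is performed by the database.

```python
import json

# JSON Schema describing a device document (fields are illustrative).
device_rule = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "role": {"type": "string", "enum": ["leaf", "spine", "server"]},
    },
    "required": ["name", "role"],
}

# Shape of a collection-level schema property: rule, level, message.
collection_schema = {
    "rule": device_rule,
    "level": "moderate",
    "message": "Device documents need a name and a valid role.",
}

def violates(doc, rule):
    """Local stand-in for the server-side check: covers only 'required' and 'enum'."""
    missing = [f for f in rule["required"] if f not in doc]
    bad_enum = [
        f for f, spec in rule["properties"].items()
        if "enum" in spec and f in doc and doc[f] not in spec["enum"]
    ]
    return missing + bad_enum

print(json.dumps(collection_schema, indent=2))
print(violates({"name": "leaf1", "role": "leaf"}, device_rule))    # []
print(violates({"name": "fw1", "role": "firewall"}, device_rule))  # ['role']
```

A schema like this constrains only the fields it names, so documents can still be enriched with additional attributes over time.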
“For graph, my default ‘go to’ is now ArangoDB.”
- David Gee, Juniper Networks
Below is the architecture of what David built for brownfield conversions. This is a three-stage pipeline tool, which reads network configuration and makes opinionated guesses, incrementally building the required information. ArangoDB stores the state of the ‘stateless’ workflow engine, session information, collections on PIN and edge collections for interconnections and general relationships.
For more details, you can watch David Gee’s presentation from ArangoDB Summit 2022: