Cloud Networking:
We’re Only Halfway There

By David Naylor, Principal Engineer

As enterprises adopt public cloud to augment their IT infrastructure (Gartner predicts that by 2026, public cloud spending will exceed 45% of all enterprise IT spending), networking teams face a steep learning curve. Unlike compute and storage, whose cloud offerings largely mirror their on-prem counterparts, cloud networking shifts the mental model from distinct devices that you configure individually to one cohesive, managed SDN fabric. Organizations often go to the cloud for compute and end up surprised by the networking.

In theory, the new model simplifies the jobs of network engineers by (1) introducing new, higher-level configuration abstractions; (2) consolidating management to one interface; (3) reducing costs with consumption-based billing (as opposed to provisioning for peak load); and (4) unifying network monitoring behind a single pane of glass.

Unfortunately, this promise is only half realized. In reality, there are still several challenges preventing a smooth transition to cloud networking. For instance:

  • Abstractions: Networking teams need to learn a new set of primitives for configuring cloud networking (VPCs, peering, transit gateways, etc.), and prior networking knowledge is not fully applicable. Worse, the primitives, names, and limitations differ arbitrarily from one cloud provider to the next, extending the learning curve. And if the first-party primitives don’t offer a required feature, the only alternative is to fall back to heavyweight or awkward third-party solutions.

  • Management: Cloud provider portals are unnecessarily time-consuming to use (many logical tasks involve taking actions in multiple, unrelated parts of the portal) and change without warning, invalidating documented procedures. This is a particular problem for playbooks that are rarely updated but must stay current, such as disaster recovery.

  • Cost Efficiency: In general, cloud-native scaling does not come for free; application teams must put in effort to take advantage of it. Even once they do, networking costs remain hard to predict or understand because they vary by provider, region, and direction of traffic, and may be spread across multiple resources (like NICs, gateways, and connections).

  • Monitoring: Organizations that use multiple clouds, or a hybrid environment, need to consult one dashboard per cloud/site. Furthermore, not all clouds expose all the metrics you want (e.g., traffic statistics for peering connections) and some need to be purchased (e.g., flow logs).
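
To make the cost point concrete, the following Python sketch totals data-transfer charges that are spread across several resources. It is only an illustration: the provider names, regions, resource names, and per-GB rates are placeholders invented for this example, not real pricing, and actual rates would have to come from each provider's price list.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical per-GB rates keyed by (provider, region, direction).
# These numbers are placeholders for illustration, not real cloud prices.
RATES_PER_GB = {
    ("cloud-a", "us-east", "egress"): 0.09,
    ("cloud-a", "us-east", "ingress"): 0.00,
    ("cloud-b", "eu-west", "egress"): 0.08,
    ("cloud-b", "eu-west", "ingress"): 0.00,
}


@dataclass(frozen=True)
class TrafficRecord:
    resource: str    # e.g. a NIC, gateway, or connection (hypothetical names)
    provider: str
    region: str
    direction: str   # "ingress" or "egress"
    gigabytes: float


def cost_by_resource(records):
    """Return the total transfer cost per resource, using RATES_PER_GB."""
    totals = defaultdict(float)
    for r in records:
        rate = RATES_PER_GB[(r.provider, r.region, r.direction)]
        totals[r.resource] += rate * r.gigabytes
    return dict(totals)


records = [
    TrafficRecord("gateway-1", "cloud-a", "us-east", "egress", 100.0),
    TrafficRecord("gateway-1", "cloud-a", "us-east", "ingress", 500.0),
    TrafficRecord("nic-7", "cloud-b", "eu-west", "egress", 50.0),
]
costs = cost_by_resource(records)
```

Even in this toy version, the bill for a single gateway depends on three lookup keys per traffic record, which is why the same workload can cost noticeably different amounts after a move between regions or providers.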

Driven by these shortcomings, new third-party cloud networking tools are emerging. These tools aim to simplify and unify the cloud networking experience. When seeking out solutions, consider these five factors:

  • Unified Management and Transparency: A valuable cloud networking tool unifies different clouds with common, high-level abstractions that make it possible to focus on overall intent, rather than low-level, cloud-specific details. At the same time, nothing should be hidden. There shouldn’t be any inscrutable magic: it should be clear what the tool is doing and why, and the tool should give users control of low-level details if and when needed.

  • Inter- and Intra-Cloud Coverage: Some tools establish a cloud network backbone that connects VPCs and on-prem sites, and then stop. But there’s more to the network than just the backbone (e.g., subnets, routing tables, and network security groups). A complete cloud networking solution will help manage the backbone connecting the sites as well as the networking resources inside them.

  • Declarative Model: A declarative model for configuration makes it possible to take advantage of modern DevOps tooling. This infrastructure-as-code approach lends itself to modern best practices like code review, automated testing, and revision tracking. These are commonplace in the world of software engineering, but not yet standard for network engineering. A declarative spec also makes it possible to detect and resolve drift; out-of-band changes are inevitable, so it's important to be able to identify the changes, then incorporate or revert them.

  • Cost Transparency: In order to achieve cost benefits, all fees must be clearly understood. Being able to decipher the costs, both before and after the bill comes, is critical. A big part of this is understanding where bandwidth charges are coming from. Ideally, a network management tool will also help guide network design toward the least expensive option given specific needs and usage.

  • Tailored Security: Organizations should be able to use first-party tools, third-party tools, or both. First-party tools, while quite powerful, can be tedious and error-prone to configure. A solution that delivers high-level security abstractions and then takes care of the low-level configuration makes this process faster and reduces errors. In some cases, organizations will need to combine first- and third-party security.
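
The drift detection described under Declarative Model can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular tool works: both the declared spec and the observed cloud state are assumed to be available as plain dictionaries mapping a resource name to its settings, and the resource names and fields (`subnet-app`, `cidr`, `route_table`) are hypothetical.

```python
def diff_state(desired, actual):
    """Compare a declared spec against observed state.

    Both arguments map a resource name to a dict of settings.
    Returns the resources that are missing, unexpected, or changed.
    """
    drift = {"missing": [], "unexpected": [], "changed": {}}
    for name, want in desired.items():
        if name not in actual:
            drift["missing"].append(name)
            continue
        have = actual[name]
        # Record each setting as a (desired, actual) pair when they differ.
        fields = {
            key: (want.get(key), have.get(key))
            for key in sorted(set(want) | set(have))
            if want.get(key) != have.get(key)
        }
        if fields:
            drift["changed"][name] = fields
    drift["missing"].sort()
    drift["unexpected"] = sorted(set(actual) - set(desired))
    return drift


# Hypothetical declared spec, as it might be stored in version control.
desired = {
    "subnet-app": {"cidr": "10.0.1.0/24", "route_table": "rt-main"},
    "subnet-db": {"cidr": "10.0.2.0/24", "route_table": "rt-main"},
}

# Hypothetical observed state after some out-of-band changes.
actual = {
    "subnet-app": {"cidr": "10.0.1.0/24", "route_table": "rt-custom"},
    "subnet-tmp": {"cidr": "10.0.9.0/24", "route_table": "rt-main"},
}

drift = diff_state(desired, actual)
```

Each entry in the result maps to one of the two responses named above: revert it by reapplying the declared spec, or incorporate it by updating the spec through the usual code review.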

After evaluating the five primary factors, think about the impact of these secondary aspects as well:

  • Solution Delivery. Do you need to deploy and monitor it on company-owned infrastructure, or is it a SaaS solution?

  • Underlying Transport. What will carry the packets from place to place? Is vendor lock-in an issue, or can you bring your own transport method? 

  • Degree of Control. How much control will you have over how the network is architected? Is it one-size-fits-all or does it allow for customizations?

  • Total Cost of Ownership. Who operates which pieces? Are self-hosted compute and bandwidth costs added to what you’re already paying for the solution?

With the cloud era here, extending networks into the cloud is inevitable. If done right, the shift offers great new capabilities and a chance to modernize network operations. By taking a unified and declarative approach, organizations can bridge the gaps and realize the full potential of cloud networking: an easier-to-manage environment, more powerful features, and lower costs.