Look to Google to solve looming data-center speed challenges

Analysis
May 26, 2022 | 6 mins
Data Center, Networking

Google’s Aquila project is establishing a model for high-performance meshing that can handle the most demanding data-center workloads.

When you think of data-center networking, you almost certainly think of Ethernet switches. These gadgets have been the foundation of the data-center network for decades, and there are still more Ethernet switches sold into data-center applications than any other technology. Network planners, though, are starting to see changes in applications, and those changes suggest that it's time to think a bit harder about data-center network options. Your data center is changing, and so should its network.

With the advent of the cloud and cloud-centric development, two disruptors were introduced into our happy and comfortable picture of Ethernet switching in the data center. The first was virtualization, the notion that there wasn’t a 1:1 relationship between a computer and an application, but rather a pool of computers shared the application hosting. The second was componentization, which said that if you wrote applications to be divided into logical pieces, you could run those pieces in parallel, scale them on demand, and replace them seamlessly if they broke. The impact of these on traffic, and so on data-center switching, was huge.

Traditional monolithic applications create vertical traffic—flows between users and the data center. A couple of decades ago, things like service buses and inter-application coupling created horizontal traffic, traffic that flows between applications inside the data center. Componentization and virtualization create mesh traffic, where messages flow in a complex web among a whole series of components. Since traditional data-center switches create a hierarchy, this mesh traffic stresses the traditional model and promises to break it.
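A rough bit of arithmetic shows why. With N components that can all talk to one another, the number of potential flows grows roughly with the square of N, while vertical traffic grows only with the number of users. This little sketch (my own illustration, with made-up component counts) makes the point:

```python
# Back-of-the-envelope flow counts (illustrative numbers, not measurements):
# vertical traffic scales with users, mesh traffic with pairs of components.

def vertical_flows(users: int) -> int:
    # one user-to-data-center flow per user
    return users

def mesh_flows(components: int) -> int:
    # in the worst case, every component can talk to every other component
    return components * (components - 1)

for n in (10, 50, 200):
    print(f"{n:>4} users/components: vertical flows ~{vertical_flows(n):>4}, "
          f"potential mesh flows ~{mesh_flows(n):>6}")
```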

Adding computers to a hierarchical switch network, or to the more modern leaf-and-spine network, is a matter of adding switches and layers as needed. Since this provides any-to-any connectivity, you might wonder what the problem is, and the answer is a combination of latency, blocking, and determinism:

  • Latency is the accumulated delay associated with moving from the source to the destination port, which obviously gets larger as the number of switches you need to transit increases (the sketch after this list puts rough numbers on this).
  • Blocking is the risk of not having the necessary capacity to support a connection because of trunk/switch congestion.
  • Determinism is a measure of the predictability and consistency of performance.
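To put rough numbers on the latency point, here's a minimal sketch; the per-hop delays are assumptions I'm using for illustration, not measurements of any real switch:

```python
# Illustrative latency accumulation across switch tiers. The per-hop numbers
# are assumed for illustration, not taken from any vendor's data sheet.

PER_SWITCH_DELAY_US = 1.0   # assumed forwarding + queuing delay per switch
LINK_DELAY_US = 0.5         # assumed serialization/propagation delay per link

def path_latency_us(switch_hops: int) -> float:
    # a path that transits N switches crosses N + 1 links
    return switch_hops * PER_SWITCH_DELAY_US + (switch_hops + 1) * LINK_DELAY_US

for hops, label in [(1, "same leaf switch"),
                    (3, "leaf-spine-leaf"),
                    (5, "deeper multi-tier hierarchy")]:
    print(f"{label:<28} {hops} switches -> ~{path_latency_us(hops):.1f} us one way")
```

The more tiers a connection has to climb, the more of those per-hop delays pile up, and the more chances there are to hit a congested trunk along the way.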

OK, we need to move to a new model, but what model?  It turns out that there are two new missions for data-center networks to consider: high-performance computing (HPC) and hyperscale data centers.

In HPC, the computers and components of applications perform incredibly complex computing functions, like modeling the impact of a monarch butterfly migration on global climate. This requires a bunch of systems that run in parallel and are very tightly coupled, with very fast interconnections. That means fast and very deterministic connections, something more like a computer bus or backplane than a network interface. Early solutions to this included InfiniBand and Fibre Channel, both of which are still used today. Intel introduced Omni-Path as a new-generation HPC technology and later spun the business out as Cornelis Networks.

In the mesh model, what we really need to support is a bunch of little, low-burden components used by millions of simultaneous users. This is what we now call hyperscale computing. Here, different users run different components in different orders, and there's constant message exchange among those components. Mesh traffic flows evolved out of that horizontal traffic we talked about earlier, traffic that caused network vendors to build their own fabric switches. Based on Ethernet connectivity, fabric switches were easily introduced into data centers that previously relied on those switch hierarchies, and they worked fine before we started using microservices and big resource pools. A single fabric switch works great for horizontal traffic, but it supports only a limited number of connections, and unless you go to fiber paths, there's a limit to how far you can run the Ethernet connections. Imagine a data center with servers piled up like a New York skyline to keep them close to your fabric.

Of course, the public cloud providers, hosting providers, and large enterprises started building data centers with more and more racks of servers. They really needed something between an HPC switch, an Ethernet fabric, and a traditional multi-switch hierarchy, something that was really good at mesh traffic. Enter Google Aquila.

Aquila is a hybrid in many dimensions. It's capable of supporting HPC applications and capable of creating a hyperscale data-center network. A data center is divided into dozens of cliques, each of which has up to a couple thousand network ports. Within each clique, Aquila uses a super-fast cell-based protocol to interconnect pods of servers in a full mesh, so performance within a clique is very high and latency is very low. Because packets passed within a clique are broken into cells, higher-priority traffic can pass lower-priority packets at any cell boundary, which reduces latency and improves determinism. SDN switching is used between cliques, which means that the whole data center can be traffic engineered.
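To see why cells help, consider a toy model (my simplification, not Aquila's actual scheduler): if an urgent message arrives while a big low-priority packet is on the wire, packet-granularity forwarding makes it wait for the whole packet, while cell-granularity forwarding makes it wait at most one cell:

```python
# Toy model of cell-level preemption (a simplification, not Aquila's design):
# compare the worst-case wait of a high-priority message behind a bulky
# low-priority transfer at packet granularity versus cell granularity.

LINK_GBPS = 100              # assumed link speed

def transmit_time_us(size_bytes: int) -> float:
    return size_bytes * 8 / (LINK_GBPS * 1e3)   # bits / (bits per microsecond)

LOW_PRIORITY_PACKET = 9000   # bytes; e.g., a jumbo frame already being sent
CELL_SIZE = 256              # bytes; an assumed cell size

# Packet granularity: the urgent message waits for the entire packet.
wait_packet = transmit_time_us(LOW_PRIORITY_PACKET)

# Cell granularity: it can cut in at the next cell boundary, so the worst-case
# wait is one cell's transmission time.
wait_cell = transmit_time_us(CELL_SIZE)

print(f"worst-case wait, packet granularity: {wait_packet:.2f} us")
print(f"worst-case wait, cell granularity:   {wait_cell:.2f} us")
```

The absolute numbers aren't the point; the point is that the worst case shrinks from "one full packet" to "one small cell," which is what tightens determinism.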

Don’t run to the Google Store to buy an Aquila, though.  It’s a project and not a product, so it should be seen as an indication of the future direction of large-scale data-center resource pools. I’m guessing, but I think products based on the Aquila approach are likely to be available in two to three years, which is how far out data-center network planners should be looking today. Despite the delay in Aquila gratification, though, there is an important lesson you can learn from it today and apply to stave off a bit longer the issues Aquila will eventually solve.

Aquila frames a resource pool as a collection of sub-pools that are very efficient at carrying horizontal traffic within themselves. Using a tool like Kubernetes, it's fairly easy to keep components that interact heavily together in a clique: Kubernetes offers things like "affinities" that let you pull components toward a specific set of servers and "taints" that let you push things away. Since Google was the developer of Kubernetes, it's hard not to see Aquila's architecture as a way to structure Kubernetes resource pools in hyperscale data centers.
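As a sketch of what that looks like in practice, here's how you might express the "pull" and "push" using the official Kubernetes Python client. The node label (topology.example.com/clique), the taint key, and the app name are hypothetical names I'm using for illustration:

```python
# Sketch: co-locate chatty components on one "clique" of nodes. The label and
# taint names (topology.example.com/clique, clique=checkout) are hypothetical.
from kubernetes import client

CLIQUE_LABEL = "topology.example.com/clique"   # hypothetical node label

pod_spec = client.V1PodSpec(
    containers=[client.V1Container(name="checkout", image="example/checkout:1.0")],
    # The "pull": schedule this pod alongside other pods labeled app=checkout,
    # treating the clique label as the topology domain.
    affinity=client.V1Affinity(
        pod_affinity=client.V1PodAffinity(
            required_during_scheduling_ignored_during_execution=[
                client.V1PodAffinityTerm(
                    label_selector=client.V1LabelSelector(
                        match_labels={"app": "checkout"}),
                    topology_key=CLIQUE_LABEL,
                )
            ]
        )
    ),
    # The "push": the clique's nodes carry a clique=checkout:NoSchedule taint
    # that keeps unrelated workloads out; this workload tolerates it.
    tolerations=[client.V1Toleration(
        key="clique", operator="Equal", value="checkout", effect="NoSchedule")],
)
```

In a real deployment this spec would sit inside a Deployment template, and the nodes in each clique would carry the matching label and taint.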

Now the “Aquila hack”.  You could do something similar in your data center using Ethernet switches and/or fabric switches. Create your own cliques by connecting groups of servers to a common switch/fabric, which means that there will be lower latency and more determinism for connections within the clique. Then use Kubernetes features (or the features of other container orchestration or DevOps tools) to guide your components to your own cliques. You can spill over to an adjacent clique if you run out of capacity, of course, so you still retain a large and efficient resource pool.
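If you want that spill-over behavior rather than a hard pin, the same idea can be expressed as a preference instead of a requirement; again a sketch with the same hypothetical label names:

```python
# Sketch: "prefer my clique, spill over if it's full" -- a weighted, preferred
# affinity rather than a required one. Label names are hypothetical.
from kubernetes import client

CLIQUE_LABEL = "topology.example.com/clique"   # hypothetical node label

preferred_affinity = client.V1Affinity(
    pod_affinity=client.V1PodAffinity(
        preferred_during_scheduling_ignored_during_execution=[
            client.V1WeightedPodAffinityTerm(
                weight=100,   # a strong preference, but not a hard requirement
                pod_affinity_term=client.V1PodAffinityTerm(
                    label_selector=client.V1LabelSelector(
                        match_labels={"app": "checkout"}),
                    topology_key=CLIQUE_LABEL,
                ),
            )
        ]
    )
)
# If the preferred clique is out of capacity, the scheduler places the pod in
# an adjacent clique instead of leaving it pending.
```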

Kubernetes, which as I said was developed by Google, recognizes the need to keep some components of an application close to one another to optimize performance. Aquila offers a data-center network architecture that can support that same capability, and while you can approach its effectiveness using standard switching, it would be smart to think about evolving to the new model if you rely on containerized, microservice-based applications in your data center.  Maybe Google sees something now that you won’t see until later, and then it may be too late.