How Is Generative AI Changing Data Center Requirements?

Generative AI is driving up power density requirements, but the need for low-latency connectivity also remains

Johan Arts

What is a data center, and how do we use it? More specifically, what are the different types of data centers, and what different purposes do they serve for companies that use them?

These are seemingly simple questions, but it can be surprisingly difficult to come up with answers that feel satisfying. We sought to answer them in a blog post last year. While the analysis in that post is still valid, our understanding of data centers, and of digital infrastructure in general, keeps evolving along with the world around us and the new workloads we have to support.

For instance, just within the past year or so, we’ve seen increasingly powerful large language models (LLMs) enabling new generative AI use cases that would have seemed like science fiction not long ago. In response, many enterprises have scrambled to put an AI strategy in place to make sure they’re using this powerful new technology to its full potential. However, with so much focus on what they can do with AI, many companies have been slow to think about how they should do it, and specifically, how it changes their data center requirements.

With AI, traditional data center segmentation is no longer fit for purpose

For years now, we’ve been distinguishing between two broad segments in the data center industry. On one side, there are general-purpose colocation data centers that carry workloads enterprises no longer want to keep on premises. On the other, there are high-value, highly interconnected sites that are typically located in densely populated areas. These network-dense interconnection sites form the heart of ecosystems, such as financial trading and gaming, that depend heavily on low-latency connectivity. Does this segmentation still serve us when deciding where to colocate AI workloads?

When businesses consider colocation data centers because they’re “lifting and shifting” existing application workloads out of their on-premises data centers, the decision often centers on cost and efficiency. Because their primary concern is to achieve the lowest cost per compute cycle, they may be prepared to trade away network density in favor of locations that offer the lowest real estate and power costs.

We’ve always held that such a single-minded focus on cost is counterproductive for digital businesses. There’s value in being able to perform certain workloads in certain locations; in many cases, this means deploying infrastructure close to network-dense locations in major population centers. Data centers that offer dense interconnections to partners and end users may cost a bit more up front, but the business value they can offer more than makes up for that.

Treating data centers as a simple commodity can be particularly harmful these days, due in large part to the growing importance of AI. If you want to do AI well, where you position your infrastructure matters. The AI model life cycle relies on different workloads with different infrastructure requirements. This means that AI infrastructure should be distributed, and that fact may force us to reevaluate the way we look at different areas of the data center market.

Understanding data center segmentation in the age of AI

Instead of the traditional two-segment approach based only on network density, let’s consider how AI requirements point toward a more sophisticated segmentation strategy. What makes AI different for the data center is the extreme power density that comes with the new generation of GPU chipsets. If we apply power density as a second segmentation dimension alongside latency, we end up with a simple 2×2 matrix: the vertical axis runs from low power density to high power density, while the horizontal axis runs from high latency to low latency.

Based on this matrix, let’s say that there are four basic types of data centers in the world today. (Of course, this is a bit of an oversimplification, but it’s helpful for our purposes.)

Undifferentiated data centers

A large proportion of the world’s data centers can best be categorized as undifferentiated. These data centers are often the product of the infrastructure investment strategies of the past. Instead of building data centers in network-dense locations, enterprises often chose to build them where much of their workforce resided (such as corporate campuses). Similarly, service providers converted office buildings or warehouses into data centers, turning real estate that was never purpose-built for IT into IT real estate.

While these data centers can offer fit-for-purpose capabilities for a given set of workloads, what happens if the power density requirements of the new workloads increase dramatically? How easy is it to upgrade sites for more cooling and power or adopt new cooling technologies such as liquid cooling? Enterprises that rely on these undifferentiated data centers in their AI strategies will likely struggle to execute those strategies effectively.

Hyperscale data centers

When you need very high density but are less concerned about low-latency interconnection, hyperscale data centers are the right choice. These have traditionally been the domain of major cloud and as-a-service providers. Instead of building their own facilities and deploying new high-density equipment to support their AI strategies, enterprises can acquire capacity inside one of these hyperscale data centers on a pay-as-you-go basis.

From an AI perspective, hyperscale data centers are traditionally associated with LLM training workloads, which are typically very dense and compute-intensive, but less sensitive to latency. However, it would be wrong to say that all model training workloads should go exclusively into hyperscale data centers. As we’ll see later on, there should always be nuance involved when it comes to selecting the right location for your AI workloads.

Edge data centers

As the name suggests, edge data centers are deployed at the digital edge: locations that are in close proximity to high concentrations of end users, applications and devices. This proximity is important because there are many applications and workloads that require consistently low latency.

In practice, the power density requirements of edge data centers have not grown as quickly as those of other segments. The workloads deployed in edge data centers are often network-heavy and require less power density than compute-heavy workloads.

When it comes to AI, there is a role for edge data centers. Certain AI inference workloads have very strict latency requirements. Think of certain gaming use cases, or digital twins deployed in support of a virtual maintenance assistant. In these cases, enterprises may choose to deploy AI inference in their edge data centers. In other cases, if the latency tolerance allows it, enterprises may instead aggregate their AI inference in their core interconnection hubs, which lets them manage these models at scale.

As businesses start to roll out their AI strategies, they’re realizing the importance of keeping the distance between the data source and the processing location short. Without proximity between these two locations, latency inevitably adds delay. Insights hiding in the data sets become outdated, and the accuracy of AI models suffers as a result.
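To put rough numbers on why distance matters, here’s a minimal sketch (purely illustrative, not from the original post) that estimates one-way propagation delay over fiber. It assumes a signal speed of roughly 200,000 km/s in glass, and the example distances are made up; real-world latency adds routing, queuing and processing overhead on top.

```python
# Rough one-way fiber propagation delay for a few hypothetical distances.
# Assumption: ~200,000 km/s signal speed in fiber; real latency is higher
# once routing, queuing and processing overhead are added.

FIBER_SPEED_KM_PER_MS = 200.0  # ~200,000 km/s expressed as km per millisecond

def propagation_delay_ms(distance_km: float) -> float:
    """Approximate one-way propagation delay in milliseconds."""
    return distance_km / FIBER_SPEED_KM_PER_MS

for label, km in [("same metro", 50), ("neighboring region", 500), ("another continent", 6000)]:
    print(f"{label:>18}: ~{propagation_delay_ms(km):.2f} ms one way")
```

Even in this idealized model, placing inference a continent away from the data source adds tens of milliseconds per round trip before any processing happens, which is why proximity matters for latency-sensitive AI workloads.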

Core data centers

Core data centers represent the foundation of modern digital infrastructure. They are typically found in locations where network density and proximity provide the best opportunity for technology consumers and technology providers to interconnect and maximize business value for both parties. Starting from their interconnected core data centers, enterprises can build out their globally distributed digital infrastructure to enable a complete edge-to-cloud approach. As a result, they can streamline connectivity, maximize agility and prepare themselves to capitalize on emerging technologies such as AI.

When it comes to AI, core data centers may not be the most likely location for training large language models. That is more likely to happen in hyperscale sites, where high power density is delivered at a competitive cost per compute cycle. When it comes to AI inference, core data centers are a prime location due to their proximity to data sources and the low-latency access they provide to users, devices and applications.

As I hinted earlier, not all training workloads are large enough to end up in hyperscale facilities. And many inference workloads may not be latency-sensitive enough to end up in edge locations. We expect sophisticated buyers to make trade-offs for their training requirements between hyperscale and core locations. Similarly, they’ll make trade-offs between edge and core locations for their inference workloads.
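For readers who want that trade-off made explicit, here’s a minimal sketch of how a placement decision could be modeled on the two dimensions of the matrix above. The thresholds, workload names and numbers are hypothetical assumptions chosen for illustration, not guidance from this post; real decisions also weigh cost, data gravity, ecosystem access and compliance.

```python
# Toy placement heuristic over the 2x2 matrix: power density vs. latency sensitivity.
# The 20 kW/rack and 10 ms thresholds are arbitrary, illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    rack_density_kw: float       # power density the workload needs per rack
    latency_tolerance_ms: float  # how much end-to-end latency it can tolerate

def place(w: Workload, density_kw: float = 20.0, latency_ms: float = 10.0) -> str:
    high_density = w.rack_density_kw >= density_kw
    low_latency = w.latency_tolerance_ms <= latency_ms
    if high_density and low_latency:
        return "core data center"
    if high_density:
        return "hyperscale data center"
    if low_latency:
        return "edge data center"
    return "undifferentiated data center"

for w in [
    Workload("large LLM training run", 80, 500),
    Workload("aggregated inference hub", 40, 8),
    Workload("real-time gaming inference", 10, 5),
    Workload("lifted-and-shifted back-office app", 5, 200),
]:
    print(f"{w.name}: {place(w)}")
```

In this simplified model, dense but latency-tolerant training lands in hyperscale sites, dense and latency-sensitive inference lands in core sites, and lighter workloads fall to the edge or to undifferentiated facilities, mirroring the segmentation described above.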

Beyond the simple training/inference binary, there are various reasons that core data centers should be an essential part of any AI infrastructure strategy. With many businesses looking to move their AI data sets among distributed locations quickly, having the right network infrastructure has never been more important. Core data centers offer easy access to dense ecosystems of network service providers, which means they can provide an ideal foundation for businesses pursuing a network modernization initiative.

Core data centers can also help businesses establish a cloud-adjacent data architecture to enable their AI workloads. Many of these businesses are looking to use public cloud services to provide scalability, flexibility and reliability for their AI workloads. However, if they aren’t careful, using the public cloud for AI could lead to issues such as high costs, security vulnerabilities and loss of control over their data.

A cloud-adjacent data architecture lets you move data over low-latency cloud on-ramps, enabling you to take advantage of public cloud services on demand without the risks and drawbacks of going all in on the public cloud.

See how a modern data center services provider can help future-proof your business

AI is just one example of how the world of technology is always changing, and how we often have to change our understanding of data centers and digital infrastructure to keep up. As you face infrastructure challenges around AI or any other emerging technology, working with a leading data center and digital infrastructure services provider can help.

In the IDC MarketScape: Worldwide Datacenter Services 2023 Vendor Assessment, Equinix was named a Leader, representing strength in both Capabilities and Strategies.[1] The IDC MarketScape assessment notes the important role that companies like Equinix can play in enabling the future of AI:

“The rapidly growing adoption of GenAI will spark a sustained demand for colocation as natural incubators for high-density HPC computing. Datacenter providers have the capacity and technology to meet the needs for the tremendous computing capacity and voracious power demand that will be driven by AI workloads.”

The report also notes how Equinix services extend far beyond traditional colocation:

“The company has one of the most diverse portfolios of any datacenter company, which meets the demands of enterprises, hyperscalers, and networking providers. Equinix provides foundational infrastructure — datacenters, interconnection, and digital services interconnected to a dense and complex ecosystem, including the largest number of direct public cloud access on-ramps in nearly 50 metros.”

Read the IDC MarketScape today to learn more about Equinix’s place in the evolving world of digital infrastructure.

 

[1] IDC, IDC MarketScape: Worldwide Datacenter Services 2023 Vendor Assessment, Doc # US49435022e, October 2023.

Johan Arts, Senior Vice President, Sales - EMEA