How To Converse in Cloud

4 Data Motion Patterns Enabled by Cloud Adjacency

With the right infrastructure in place, enterprises can unlock the fundamental data movements that underpin hybrid multicloud success.

Ian Botbyl
Glenn Dekhayser

In the early days of cloud, many organizations moved all their applications and data to one public cloud provider. They saw this as the quickest way to take advantage of cloud agility and scalability. As time passed and data volumes increased, it became clear that getting locked into a particular provider for all their needs was not ideal. Enterprises needed a cloud-agnostic approach to unlock the best combination of cost-efficiency, flexibility and best-of-breed services. To achieve that goal, they began to pursue hybrid multicloud strategies.

At Equinix, we believe that cloud-adjacent digital infrastructure is the new on-premises component that enables a modern approach to hybrid multicloud. At the heart of any effective cloud-adjacent infrastructure will be an authoritative data core.


What is an authoritative data core, and why do enterprises need one?

For years, successful data-driven enterprises have aggregated and analyzed data sets to extract insights and unlock business value. Until recently, these enterprises primarily used large, monolithic applications that ran in only a few places. For this reason, the data silos commonly found in legacy data architectures were generally not an issue.

As businesses became more globally distributed, data sets became larger, and latency requirements became more stringent, enterprises began to see that the old way was untenable. They needed a new approach that allowed for distributed data aggregation in many different locations around the globe, while also maintaining copies of those data sets in a core location.

This location is what we call the “authoritative data core”, and it forms the basis for everything an enterprise might do with hybrid multicloud. It will include many different forms of aggregated data, such as data lakes, systems of record, and secondary copies from backups. The authoritative data core is not a monolithic data storage silo, but rather a series of interconnected cloud-adjacent regions that are strategically located based on business requirements.

Upstream and downstream layers enable effective data movement across cloud and edge

Wrapped around the authoritative data core are an upstream layer and a downstream layer. Each of these interconnectivity layers is responsible for moving copies of data from the authoritative core to wherever they need to go. In the case of the downstream layer, this means interconnectivity to metro edge locations, and from there to far edge locations around the globe.

The upstream layer establishes interconnectivity between the authoritative data core and the various cloud providers that enable a multicloud strategy. In our experience helping customers build and operate cloud-adjacent infrastructure, we’ve identified four common data motion patterns that take place in the upstream layer:

  1. Utilize in place
  2. Project and delete
  3. Caching
  4. Backup and restore

The diagram below shows what the authoritative core may look like when deployed at Equinix. It also shows the upstream layer that connects to cloud providers and the downstream layer that connects to the digital edge.

Four data motion patterns for hybrid multicloud use cases

The four data motion patterns make up hybrid multicloud at its most fundamental level. If you aren’t using one of these four data motions, then you’re likely allowing data gravity to grow in places that you might regret later.

In addition to telling us how organizations do hybrid multicloud, these patterns speak to why enterprises do hybrid multicloud. Each of these patterns is about being able to move data to take advantage of cloud services on demand, without that data getting locked into a specific cloud provider in the process.

1. Utilize in place

When people talk about cloud-adjacent data, this is the data motion they’re typically describing. It involves leaving data in the authoritative core, and bringing compute and other services from the cloud to the data.

This data motion pattern is helpful for many use cases because it enables the low latency required by many modern applications. However, there may be certain applications for which adjacency alone isn’t enough to achieve the required latency or throughput. In these cases, enterprises may apply one of the other data motions described below, or take advantage of local compute in the authoritative core.

With utilize in place, enterprises can take advantage of data center extension to position their workloads in the right places, close to the cloud compute required to create business outcomes. They can do this by replacing their legacy private data centers with a centralized storage environment that provides secure, cost-effective data access for workloads running across edge, cloud, and on-premises locations.

2. Project and delete

Project and delete works in four steps: project a copy of the data into the cloud, perform the needed compute operations there, update the copy as new data enters the authoritative data core, and delete the copy once it’s no longer needed. This lets enterprises take full advantage of cloud capabilities without incurring high data egress fees.

Since the data never has to move back from the cloud to the core, there’s no financial disincentive keeping enterprises from leaving a particular cloud provider or using that data in other locations. This ensures the flexibility to work with different best-of-breed cloud providers for different use cases.

One use case where project and delete can be particularly beneficial is AI model training and inferencing. AI model development is highly iterative: to stay within acceptable levels of accuracy, models need to be retrained frequently using large data sets. As such, enterprises can benefit from moving those data sets to the cloud without having to worry about the high cost of getting them back to the core.
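The project, update, and delete steps described above can be sketched in a few lines of Python. This is a minimal illustration, not an Equinix or cloud provider API: the `Core` and `CloudProjection` classes, the per-object version counter, and the byte counters are all assumptions made for the example, and the sketch ignores any results the cloud job might send back.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Hypothetical authoritative data core: key -> (version, payload)."""
    objects: dict = field(default_factory=dict)

    def put(self, key: str, payload: bytes) -> None:
        # Every write bumps the object's version so copies can detect changes.
        version = self.objects.get(key, (0, b""))[0] + 1
        self.objects[key] = (version, payload)


@dataclass
class CloudProjection:
    """Hypothetical temporary copy of core data inside a cloud provider."""
    objects: dict = field(default_factory=dict)
    bytes_into_cloud: int = 0    # core -> cloud transfers
    bytes_out_of_cloud: int = 0  # cloud -> core transfers (billable egress)

    def sync_from(self, core: Core) -> int:
        """Copy new or changed objects from the core; return bytes sent."""
        sent = 0
        for key, (version, payload) in core.objects.items():
            if self.objects.get(key, (0, b""))[0] < version:
                self.objects[key] = (version, payload)
                sent += len(payload)
        self.bytes_into_cloud += sent
        return sent

    def delete(self) -> None:
        """Drop the copy. Nothing is sent back: the core stays authoritative."""
        self.objects.clear()


core = Core()
core.put("train/a", b"x" * 100)
core.put("train/b", b"y" * 50)

copy = CloudProjection()
first = copy.sync_from(core)    # initial projection of the full data set
core.put("train/c", b"z" * 25)  # new data lands in the core
second = copy.sync_from(core)   # only the new object is sent
copy.delete()                   # copy removed once the job is finished
```

In this run the copy receives data twice and is then discarded, while `bytes_out_of_cloud` stays at zero because the core never needs anything back from the cloud.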

3. Caching

Caching is essentially the middle ground between the first two patterns described above. It involves creating a fully mountable representation of a data set in an alternate location (in this case, the cloud). Unlike project and delete, caching is intended to create a durable real-time copy of the data, rather than a one-time copy for a particular processing task.

One potential drawback of caching is that changes to the data at the authoritative core will invalidate the cached data copy. To address this, enterprises must perform cache refreshes or read-through operations.

The durable nature of cached data sets makes them ideal to support mission-critical stateful applications. These applications can connect to databases containing reference or master data with low latency. By running adjacent to the cloud, these applications can process data up to 20x faster than they could on legacy data architectures.
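The invalidation and read-through behavior described above can be illustrated with a small Python sketch. It assumes, hypothetically, that the core exposes a cheap version check for each key, so the cache can tell when its copy is stale without fetching the full value. `CoreStore` and `ReadThroughCache` are illustrative names, not part of any real product.

```python
class CoreStore:
    """Hypothetical authoritative core holding the source-of-truth values."""

    def __init__(self):
        self._data = {}  # key -> (version, value)
        self.reads = 0   # counts full reads served by the core

    def write(self, key, value):
        version = self._data.get(key, (0, None))[0] + 1
        self._data[key] = (version, value)

    def version(self, key):
        # Assumed to be a cheap metadata lookup, not a full data read.
        return self._data[key][0]

    def read(self, key):
        self.reads += 1
        return self._data[key]


class ReadThroughCache:
    """Cloud-side cache that reads through to the core when its copy is stale."""

    def __init__(self, core):
        self.core = core
        self._cache = {}  # key -> (version, value)

    def get(self, key):
        cached = self._cache.get(key)
        if cached is not None and cached[0] == self.core.version(key):
            return cached[1]  # cache hit: the core has not changed
        # Miss or stale copy: read through to the core and refresh the cache.
        version, value = self.core.read(key)
        self._cache[key] = (version, value)
        return value


core = CoreStore()
core.write("sku-1", "v1")
cache = ReadThroughCache(core)

a = cache.get("sku-1")     # miss: read through to the core
b = cache.get("sku-1")     # hit: served from the cache
core.write("sku-1", "v2")  # a change at the core invalidates the cached copy
c = cache.get("sku-1")     # stale: read through again
```

Only the first and third lookups reach the core, so `core.reads` ends at 2. A scheduled cache refresh would be an alternative to the per-read version check used here.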

4. Backup and restore

When you have a production data set that was originally created in a public cloud, the backup and restore data motion allows you to create a copy of that data in your authoritative core. Once the backup is created, it can be used for data protection purposes—to restore the original data set—or replicated for use across different clouds and services.

Enterprises can take advantage of backup and restore to create virtual cloud-adjacent disaster recovery sites. These are quicker to set up and more cost-effective than traditional physical disaster recovery sites, and they can provide backup and data transfer speeds that are up to 10x faster. Combined with deduplication technology, which avoids re-sending data blocks that the core already holds, public cloud egress costs can be made largely irrelevant.
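To show why deduplication shrinks the amount of data that has to leave the cloud, here is a minimal Python sketch of content-addressed chunking. The fixed 4-byte chunk size, the SHA-256 digests, and the in-memory `chunk_store` dictionary are assumptions chosen for readability. Real backup products use their own chunking schemes, and this sketch ignores the overhead of exchanging digests.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def backup(data: bytes, chunk_store: dict):
    """Back up data into chunk_store.

    Returns (recipe, transferred): the ordered list of chunk digests needed
    to rebuild the data, and the number of bytes that had to be sent because
    the store did not already hold them.
    """
    recipe, transferred = [], 0
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in chunk_store:
            chunk_store[digest] = chunk  # only unseen chunks are transferred
            transferred += len(chunk)
        recipe.append(digest)
    return recipe, transferred


def restore(recipe, chunk_store) -> bytes:
    """Rebuild the original data from its recipe of chunk digests."""
    return b"".join(chunk_store[digest] for digest in recipe)


store = {}  # hypothetical chunk store living in the authoritative core
monday = b"AAAABBBBCCCC"
tuesday = b"AAAABBBBDDDD"  # only the last chunk changed since Monday

recipe1, sent1 = backup(monday, store)   # first backup sends every chunk
recipe2, sent2 = backup(tuesday, store)  # second backup sends only the new chunk
restored = restore(recipe2, store)
```

The first backup transfers all 12 bytes, while the second transfers only the 4 bytes that changed, and both data sets remain fully restorable from the store.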

Build the cloud-adjacent infrastructure that unlocks the four data motion patterns

Given that these four data motions lie at the heart of all hybrid multicloud deployments, it’s critical to have the underlying cloud-adjacent infrastructure in place to support them.

This means the authoritative data core itself, the interconnection capabilities that make up the upstream and downstream layers, global cloud on-ramp availability, and the distributed infrastructure that supports edge use cases. As the world’s digital infrastructure company™, Equinix is the only partner capable of supporting each of these components with the most multicloud-capable locations across the globe.

To learn more about how Equinix enables a complete edge-to-cloud approach, including cloud-adjacent data motion patterns for hybrid multicloud success, read the Equinix Leaders’ Guide to Digital Infrastructure.

 

Ian Botbyl, Senior Manager, Product Marketing
Glenn Dekhayser, Principal Solutions Architect