In 2017, The Economist posited that “the world’s most valuable resource is no longer oil, but data.”[i] Since then, the idea has been rephrased many times as “data is the new oil.” The digitization of global business, combined with the proliferation of internet of things (IoT) devices connected at the edge, is vastly increasing the amount of data generated. According to IDC, the amount of data worldwide will jump to 175 zettabytes by 2025, which is more than 10x the amount of data generated in 2015.[ii]
The IoT is introducing new value chains and disrupting every industry. In healthcare, wearable devices that can push real-time data about a user’s physical condition are opening up new use cases for wellness coaching and remote diagnosis and monitoring. In transportation, every car manufacturer is aiming to bring connected vehicles and services to life through software-based intelligence. Whether manufacturers develop that intelligence in-house or partner with external providers, managing massive data volumes is a common challenge. For instance, a fleet of 50 autonomous test vehicles, which is a fairly standard size, may generate up to 20 TB of data per car daily and potentially up to 100 TB/day with more advanced sensor sets.
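To put those per-car figures in fleet terms, here is a rough back-of-the-envelope calculation. It is only a sketch: it reads the 100 TB/day figure as per car (an assumption, since the text does not say so explicitly) and uses decimal units (1 PB = 1,000 TB).

```python
# Back-of-the-envelope fleet data volume, using the figures quoted above.
FLEET_SIZE = 50            # autonomous test vehicles
TB_PER_CAR_TYPICAL = 20    # TB per car per day
TB_PER_CAR_ADVANCED = 100  # TB per day, ASSUMED to be per car as well


def fleet_tb_per_day(cars: int, tb_per_car: float) -> float:
    """Total data generated by the fleet in one day, in terabytes."""
    return cars * tb_per_car


typical = fleet_tb_per_day(FLEET_SIZE, TB_PER_CAR_TYPICAL)
advanced = fleet_tb_per_day(FLEET_SIZE, TB_PER_CAR_ADVANCED)

# Decimal units: 1 PB = 1,000 TB
print(f"Typical sensor set:  {typical:,.0f} TB/day ({typical / 1000:g} PB/day)")
print(f"Advanced sensor set: {advanced:,.0f} TB/day ({advanced / 1000:g} PB/day)")
```

Under these assumptions, even the lower figure puts a single test fleet at roughly a petabyte per day.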
Data value depends on timeliness, usability and compliance
In a big data world, especially for real-time applications at the edge, a data point has an initial value, but the longer it takes to receive and process that data point, the less valuable it becomes. You need to be able to analyze data as fast as possible in order to act on it. That is difficult to do if you are backhauling massive data sets to a central location for processing.
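A quick estimate shows why backhauling is a bottleneck. The sketch below computes how long it would take to move one car's 20 TB daily output over a dedicated link. The link speeds are illustrative assumptions, not figures from any specific network, and the calculation ignores protocol overhead unless you lower the hypothetical `efficiency` parameter.

```python
def transfer_hours(terabytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move `terabytes` over a link running at `link_gbps`.

    Uses decimal units (1 TB = 10**12 bytes, 1 Gbps = 10**9 bits/s).
    `efficiency` is the fraction of the nominal rate actually achieved.
    """
    bits = terabytes * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


# Illustrative link speeds (assumptions), applied to one car's 20 TB/day.
for gbps in (1, 10, 100):
    print(f"{gbps:>3} Gbps: {transfer_hours(20, gbps):.1f} h to move 20 TB")
```

At 1 Gbps the transfer takes longer than the day in which the data was produced, so the backlog would keep growing; processing closer to the source avoids moving most of that data at all.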
At the same time, data is only valuable if it’s used. For example, traditional original equipment manufacturers (OEMs) in the automotive sector are becoming much more advanced businesses that provide data-based services. According to McKinsey, the overall global revenue pool from car data-enabled services might add up to USD 450 to 750 billion by 2030.[iii] This opens new lines of revenue, and nobody wants to get left behind.
Finally, data sovereignty plays an important role in determining data value. Compliance with regional regulations like GDPR in Europe, HIPAA in the U.S. and other country-specific regulations is an essential consideration when designing platforms to handle massive data volumes.
How to manage all that data with Kubernetes
One way to overcome these data challenges is to bring analytics to the data, which saves data transfer costs and reduces latency and processing time. A distributed IT architecture like this also enables you to identify specific types of data that are valuable to combine with other internal or external data for deeper insights. For instance, particular edge situations, such as new construction or a sudden reaction by the car because of an accident, may cause an autonomous car to trigger a flag. In that case, you need to collect the data from every sensor and camera associated with that event and cross-analyze it with external data to resolve the trigger and improve the software. It’s important to make sure that events are filtered locally and only pushed up for further analysis by core analytics engines when necessary.
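The local filtering step might look something like the following minimal sketch. The `SensorEvent` type, the event kinds and the payload sizes are all hypothetical and chosen only for illustration; a real pipeline would define its own triggers.

```python
from dataclasses import dataclass


@dataclass
class SensorEvent:
    vehicle_id: str
    kind: str          # e.g. "routine", "hard_brake", "unmapped_construction"
    payload_mb: float  # size of the sensor/camera data attached to the event


# Hypothetical event kinds worth escalating to the core analytics engines.
FLAGGED_KINDS = {"hard_brake", "unmapped_construction"}


def split_events(events):
    """Partition events into (forward_to_core, keep_local)."""
    forward, local = [], []
    for event in events:
        (forward if event.kind in FLAGGED_KINDS else local).append(event)
    return forward, local


# Made-up sample data for illustration only.
events = [
    SensorEvent("car-01", "routine", 120.0),
    SensorEvent("car-01", "hard_brake", 850.0),
    SensorEvent("car-02", "routine", 95.0),
    SensorEvent("car-02", "unmapped_construction", 1400.0),
]

forward, local = split_events(events)
forwarded_mb = sum(e.payload_mb for e in forward)
total_mb = sum(e.payload_mb for e in events)
print(f"Forwarding {len(forward)} of {len(events)} events "
      f"({forwarded_mb:.0f} of {total_mb:.0f} MB) to core analytics")
```

Only the flagged events, together with their attached sensor data, leave the regional hub; routine events stay local.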
Breaking platforms down into smaller blocks and placing those blocks in regional hubs, closer to the edge where the data is generated, can help achieve this goal. In doing so, you avoid unnecessary data transit costs, can scale out in-region, comply with local regulations and deploy data exchange hubs for monetizing your data.
Those blocks need to be well-defined, standard, portable and sized to make deployment in new locations easier and faster. Once you have defined what these blocks look like and determined where they will be hosted, orchestration technologies like Kubernetes can help automate data management. By deploying regional hubs running Kubernetes clusters, you can just focus on developing your containerized applications to run on Kubernetes.
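As a rough illustration of a "standard, portable block," the sketch below generates the same Kubernetes Deployment manifest for several regional hubs, changing only the region. The application name, region names and container image are hypothetical placeholders; the manifest fields follow the standard `apps/v1` Deployment schema, and the output is plain JSON, which Kubernetes accepts alongside YAML.

```python
import json


def hub_deployment(region: str, image: str, replicas: int = 2) -> dict:
    """Build a Kubernetes Deployment manifest for one regional hub."""
    labels = {"app": "edge-analytics", "region": region}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"edge-analytics-{region}", "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{"name": "analytics", "image": image}]
                },
            },
        },
    }


REGIONS = ["frankfurt", "singapore", "ashburn"]    # hypothetical hub locations
IMAGE = "registry.example.com/edge-analytics:1.0"  # hypothetical image

# One manifest per hub; only the region-specific fields differ.
manifests = {region: hub_deployment(region, IMAGE) for region in REGIONS}
print(json.dumps(manifests["frankfurt"], indent=2))
```

Because each hub receives an identical block apart from its region label, opening a new location amounts to adding one entry to the list.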
Optimize edge analytics on Platform Equinix
But how can you deploy Kubernetes clusters at the edge and connect them to the cloud for in-depth analyses, while also linking to the growing data-enabled services ecosystem? With a vendor-neutral footprint of 210+ International Business Exchange™ (IBX®) data centers interconnected physically and virtually across 55 global metros on 5 continents, Platform Equinix® enables you to distribute and scale data processing and analytics wherever you need to be. Platform Equinix is also home to the world’s largest ecosystem of networks (1,800+) and cloud and IT services (2,900+), with high-speed, low-latency connectivity to the clouds through Equinix Fabric™ (ECX Fabric™). With ECX Fabric, you can easily establish secure, high-speed software-defined connectivity to other locations, partners or 9,700+ businesses within minutes via a self-service portal.
The diagram below illustrates the high-level architecture of the components Platform Equinix provides for building a remote hub. It may change in specific use cases but the core is the same. You deploy your compute and storage and Equinix provides the connectivity to every cloud vendor and any of the data-enabled services you may need. These platform hubs can be deployed in the locations you need worldwide. From there you can receive, store, process and ultimately monetize your data starting at the edge.
Download the Analytics Blueprint to learn how to improve insights by placing processing capabilities at the edge.
[i] The Economist, The world’s most valuable resource is no longer oil, but data, May 6th 2017 edition.
[ii] IDC, Data Age 2025, sponsored by Seagate, Doc ID #US44413318, Nov 2018; MicroStrategy, How Much Data by 2025?, Jan 2020.
[iii] McKinsey, Accelerating the car data monetization journey, Mar 2018.