Just because we can, does it mean we should?
The debates around ‘Big Data’ continue to intensify as the world struggles to keep pace with the “digital firehose” flooding data into our business and personal lives.
And the flood shows no signs of slowing down. According to EMC, the world’s information is doubling every two years and, by 2020, we will collectively be generating 50 times as much information as we did in 2011.
Big benefits vs legitimate concerns
The benefits are potentially tremendous. Take Google, for example, which was built with collaboration, openness and scalability in mind from the outset. Back in 2008, it combined location-based metadata with search queries to create Google Flu Trends – an app that estimates current flu activity around the world in near real time.
Last month (January 2015), the Canadian province of Manitoba announced it is using Google Flu Trends to help track the spread of one of the worst flu outbreaks in the country and provide early warning to healthcare providers about when and where an outbreak might occur.
That ability to link together numerous sources to derive datasets that were not previously possible is precisely what is so exciting about big data. Other sectors, such as law enforcement, are using big data-based applications – such as PredPol, which stands for Predictive Policing – to reduce crime.
In the American city of Reading, Pennsylvania, the technology has contributed to a 23 per cent reduction in burglaries, despite a double-digit drop in the number of officers on the force.
However, there are also potential pitfalls when big data gets too personal. For example, last month (January 2015), newspapers revealed that drug and insurance companies will soon be able to buy patient information from the UK’s National Health Service. That includes information on mental health conditions and diseases such as cancer, as well as smoking and drinking habits.
Advocates maintain that sharing such data will make medical advances easier and save lives. But, privacy experts warn there will be no way for the public to know who has their medical data, how it will be used, and whether it might expose them to discrimination by insurers or in the workplace.
Taking on big data
Whilst some companies seem to be making headway when it comes to big data, there are still many others facing significant challenges around storing, managing and extracting value from it.
One option for enterprises is to integrate high-performance computing and legacy systems with existing infrastructure-as-a-service (IaaS) providers such as Amazon Web Services (AWS) or Microsoft Azure. Companies colocating within Equinix’s data centres can cross-connect to these providers via the Equinix Cloud Exchange.
Moving large datasets around requires massive amounts of bandwidth, and analyzing on-site datasets calls for low-latency connections. Furthermore, performance fluctuations on the Internet can lead to application problems and outages. And many datasets contain sensitive data that should not be transmitted over public networks.
Direct, private access to public cloud providers through Equinix completely bypasses the public Internet, which mitigates reliability and privacy concerns.
The IT industry has long recognized the critical importance of following standards. Indeed, Equinix has set more than a few standards and industry best practices of our own when it comes to creating a global network of world-class data centres and providing customers in virtually every industry with services they can rely on.
Accordingly, another big challenge posed by the big data explosion is that, when it comes to ethics, the standards still haven’t been defined. That has raised concerns, with experts warning companies that they must consider the ramifications of how they use data. In other words, just because something can be done doesn’t necessarily mean it should be.
US technology companies and advertisers have been seeking access to the data generated by sensors in ‘connected cars’, although carmakers have so far respected their customers’ privacy by declining requests to share that data with those businesses.
But whether you are talking about cars or any other IoT information issue, in many places legislation is still catching up with developments. That means companies will have to shoulder the ethical burden themselves. The following points offer a good place to start.
1. Ensure there are suitable controls to monitor storage of, and access to, confidential data:
- Apply encryption where necessary
- Store in a manner that safeguards anonymity where required
- Take steps to ensure that only the required data fields are stored
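The three controls above can be sketched in code. The following is a minimal, illustrative Python example – the field whitelist, key and record shape are hypothetical assumptions, not a description of any real system mentioned in this post. It shows field minimization and keyed pseudonymization; encryption at rest would normally be handled by the storage layer and is noted only in a comment.

```python
import hmac
import hashlib

# Hypothetical whitelist of fields the use case actually needs (third bullet).
REQUIRED_FIELDS = {"condition", "year_of_birth", "region"}

# Illustrative secret for pseudonymization; in practice this would live in a
# key-management service and be rotated, never hard-coded.
PSEUDONYM_KEY = b"rotate-me-regularly"

def minimise(record: dict) -> dict:
    """Store only the required data fields, dropping everything else."""
    return {k: v for k, v in record.items() if k in REQUIRED_FIELDS}

def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier with a keyed hash so records can still be
    linked for analysis without revealing who the person is (second bullet)."""
    return hmac.new(PSEUDONYM_KEY, identifier.encode(), hashlib.sha256).hexdigest()

raw = {
    "patient_id": "ID-1234567",
    "name": "Jane Doe",        # never needed for trend analysis – dropped
    "condition": "influenza",
    "year_of_birth": 1980,
    "region": "Manitoba",
}

safe = minimise(raw)
safe["pid"] = pseudonymise(raw["patient_id"])
# Encryption at rest (first bullet) would be applied before writing `safe`
# to disk, e.g. with a vetted library – omitted here to stay stdlib-only.
print(sorted(safe))  # → ['condition', 'pid', 'region', 'year_of_birth']
```

The point of the keyed hash (rather than a plain SHA-256 of the identifier) is that, without the key, an attacker cannot re-derive pseudonyms by hashing a list of known identifiers.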
2. Make time upfront to discuss privacy and ethics. Don’t just tack them onto the end of an already packed agenda:
- Nurture a culture of accountability in your organisation
3. Where the legislation is still trailing technological capabilities, use your company’s code of ethics to assess whether the use case is appropriate:
- Test this against your own moral judgement as a final check.
- For example, ask yourself: would our customers find what we are doing invaluable, or unnerving?
4. Communicate with your customers and explain your policies using human language, early on and often:
- This transparency will pay dividends in terms of confidence and goodwill
In short, until we have a commonly accepted set of standards, our primary concern should not be “Can we do this?” but “Should we do this?”