Data Sharing: who do you trust?

Yesterday I posted my full approval for folks like Apple and Google to know a lot of data about me, specifically from the devices I usually carry around with me. I do so knowing that the extent of data sharing is open and transparent, and that I get notified (at least by Google) if any application on my Android handset seeks more data from me or changes its data sharing policy in any way. With that, I have full confidence that I can opt out if I ever feel the level of intrusion exceeds my comfort level with the data use; I’m generally very happy if it improves the level of service delivered to me without downsides.

I’ve only really baulked at one such update, which was a request by LinkedIn to be able to mine the call records of who I contacted, and who I received calls from, on my mobile phone. I felt this was a violation of the use I put their application to, so elected to remove the application from my Nexus 5 instead.

After I posted my note, I had a reply on Facebook from Bruce Stidston, that read:

You’re right, IMHO, up to a point when you say “what’s not to like?”. For me, the bit that’s not to like is scope creep. The NHS, for example, accumulates data on each patient, and that’s (potentially) cool when it’s used to improve patient outcomes by sharing within the NHS. The problem is that as we move into maturity in IT and data collection technologies, we’re not even in infancy when it comes to concepts of privacy. So when some bright spark reckons it’s cool to dish out “aggregated and individually unidentifiable” data to Big Pharma to shore up NHS finances, I need to be right there on the ball to say yay or nay — and that’s in the best-case situation. The real-case situation is they’ll do it anyway and seek forgiveness afterwards. That’s what’s not to like.

I think of this generalised problem as “the tragedy of the techno-morons”. Smart people did amazing things to make impossible things happen — think just for a moment of the layers of wonderful intricacy that make GPS work, which all of us now depend on — and then some Tim Nice-But-Dim (like my MP) who have only just worked out how a bicycle works are entrusted with the powers to sign off huge snowballs of potentially invasive applications for those technologies. I never forget that the guys at BT who decided that deep-packet inspection of private IP datastreams was fine for advertising purposes, have yet to be hauled before the courts.

I think Bruce is 100% correct. It was with some horror that I saw plans to share my NHS data with commercial organisations, data which was claimed in the headlines to be anonymised but which appeared to contain my date of birth and postcode. The missing cluestick is that a UK postcode covers circa 10 households on average, and I’m pretty sure I’m the only one in my postcode of my age and gender, and that’s even before my day and month of birth get served up. This is a textbook example of history about to repeat itself, given the people looking at this process are obviously unaware of what happened when AOL released ‘anonymised data’ a few years ago. You only have to Google “AOL data leak” and you’ll probably find top of the list is this Wikipedia article.

The sad fact is that anonymising a data set relies on making it impossible to triangulate between disparate data sources and so trace the records provided back to specific named individuals. The proposals drove a bus straight through this without apparent due care and attention. The side effect is that a commercial entity could then actively discriminate against me for the purposes of insurance (which should be a random level tax across a policy holding population) or undermine my human rights to privacy, freedom of expression or freedom of movement without unwanted side effects.
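To make the triangulation risk concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the field names, the made-up postcodes, the invented people and diagnoses, and the use of birth year rather than a full date of birth. It shows two things: how small the smallest group sharing the same postcode, birth year and gender can be in a release with names removed, and how a second list that does carry names can be joined against it.

```python
from collections import Counter

# Illustrative assumption: these field names and every record below are invented.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "gender")

# A hypothetical "anonymised" release: names removed, quasi-identifiers kept.
released = [
    {"postcode": "ZZ1 1AA", "birth_year": 1958, "gender": "M", "diagnosis": "asthma"},
    {"postcode": "ZZ1 1AA", "birth_year": 1984, "gender": "F", "diagnosis": "diabetes"},
    {"postcode": "ZZ1 1AA", "birth_year": 1984, "gender": "F", "diagnosis": "migraine"},
    {"postcode": "ZZ2 2BB", "birth_year": 1971, "gender": "F", "diagnosis": "hypertension"},
]

# A hypothetical second source that does carry names (any public or purchased list).
named_register = [
    {"name": "A. Example", "postcode": "ZZ1 1AA", "birth_year": 1958, "gender": "M"},
    {"name": "B. Sample", "postcode": "ZZ1 1AA", "birth_year": 1984, "gender": "F"},
]


def key(record, fields=QUASI_IDENTIFIERS):
    """Combine the quasi-identifier values of one record into a join key."""
    return tuple(record[f] for f in fields)


def smallest_group(records):
    """Size of the smallest group of records sharing the same quasi-identifiers."""
    return min(Counter(key(r) for r in records).values())


def reidentify(records, register):
    """Return name -> diagnosis for people who match exactly one released record."""
    by_key = {}
    for r in records:
        by_key.setdefault(key(r), []).append(r)
    hits = {}
    for person in register:
        matches = by_key.get(key(person), [])
        if len(matches) == 1:  # an unambiguous match is a re-identification
            hits[person["name"]] = matches[0]["diagnosis"]
    return hits


print(smallest_group(released))
print(reidentify(released, named_register))
```

In this toy data the smallest group has a single member, so one named person is matched to exactly one diagnosis, while the two records that share a key stay ambiguous. Coarsening the fields (for example, truncating the postcode or banding the birth year) is one way to push that smallest group size up, at the cost of less detailed data.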

The meme of “Crisis in the NHS” is not an appropriate one in my view, in that the UK health service is well funded and very efficient compared to the health systems in virtually every major economy. It appears to be being subverted in support of introducing American-style structural changes, where costs per head of population are around double ours, coverage is not universal, and the system is stuffed with inefficiencies we should have no wish to copy here. With that in mind, and having seen the consultation about data sharing delayed, it came as rather a shock to see this list of data sharing activity that had already taken place without consultation:

Ministers have gone against the findings of their own information governance review and allowed patient-identifiable data from GP records to be used in the NHS outside of the ‘safe havens’ recommended by the Caldicott report for six months.

Health secretary Jeremy Hunt has approved plans for NHS England to waive common confidentiality laws for six months under a legal exemption called section 251, allowing patient identifiable data to be passed to commissioners and support units.

This is despite the safe havens for potentially identifiable patient data recommended by the Government’s own Caldicott2 report published earlier this year not being in operation.

The extent of this sharing is documented here. At the time I first looked at the document of already approved data releases, it ran to 40 pages of A4. It’s currently 459 releases over 48 pages (latest up-to-date here). I fear Bruce’s “Tim Nice-But-Dim” goes by the name of Jeremy Hunt and the damage has been in full flow, despite previous assurances, for some time now. This is an appalling travesty and an apparent violation of the whole basis of the UK Data Protection Acts. The Minister should be thoroughly ashamed and, if justice were to be served, should be up in front of the European Court for a fundamental violation of Article 8 of the European Convention on Human Rights (the right to privacy).

It’s with an equal level of concern that I see Ministers of the UK Government also suggesting that tax records should be released in a publicly accessible form by HMRC.

I’m all for data to be shared for Medical Research purposes (as suggested by Larry Page), or in support of Government initiatives to undertake projects for the common good of the UK population. My wife Jane already has all her genome stored at 23andMe, as we have full confidence in their data sharing policy and our ability to reverse out if we feel at all uncomfortable in the future. In doing so here in the UK, the folks releasing data should be fully cognizant of the need to ensure the privacy of individuals that may otherwise be subjected to personal or commercial discrimination as a result of provision of data, either directly or from being complicit in allowing triangulation from other sources to the same end result.

Those who don’t learn from history are, as always, destined to repeat it. We should by now know better than that, and have politicians that know likewise.

“OK Google. Where did I park my car?”

There appears to be a bit of controversy, with some commentators learning exactly what “Frequent Locations” are, as stored by every iPhone handset. What happens is that the number of visits to common locations is recorded, from which, based on time spans and days of the week, Apple can deduce your “normal” working location and the address at which you sleep most nights. This is currently stored only in your iPhone handset and apparently not yet used; it is designed to enable services, at some point in the future, to advise you of traffic conditions to and from work.
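As a rough illustration of how that kind of deduction could work, here is a minimal Python sketch. It is not Apple’s method, which I haven’t seen; the hourly sampling, the time windows (22:00 to 06:00 for “asleep”, 09:00 to 17:00 on weekdays for “at work”) and the place labels are all assumptions made up for the example.

```python
from collections import Counter
from datetime import datetime, timedelta


def infer_home_and_work(visits):
    """Guess home and work from (place, datetime) samples.

    Assumed heuristic: home is the place seen most often overnight,
    work is the place seen most often during weekday office hours.
    """
    night = Counter()
    workday = Counter()
    for place, when in visits:
        if when.hour >= 22 or when.hour < 6:
            night[place] += 1
        elif when.weekday() < 5 and 9 <= when.hour < 17:
            workday[place] += 1
    home = night.most_common(1)[0][0] if night else None
    work = workday.most_common(1)[0][0] if workday else None
    return home, work


def sample_week(start):
    """Build one hypothetical week of hourly samples for a made-up person."""
    visits = []
    for offset in range(7 * 24):
        when = start + timedelta(hours=offset)
        if when.hour >= 19 or when.hour < 8:
            place = "place_A"  # evenings and nights
        elif when.weekday() < 5 and 9 <= when.hour < 17:
            place = "place_B"  # weekday daytime
        else:
            place = "place_C"  # commuting, weekends out and about
        visits.append((place, when))
    return visits


week = sample_week(datetime(2014, 3, 3))
print(infer_home_and_work(week))
```

For this invented week the sketch picks place_A as home and place_B as work. The point is how little is needed: a handful of counters over timestamps is enough, with no names or addresses supplied by the user.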

The gut reaction is “Whey! They can see exactly where I’m going all the time!”. Well, yes, your handset can; GPS co-ordinates are usually good for an approximate location to within a metre or two, you have a compass in there that indicates which way you’re facing, and various accelerometers that can work out the device’s orientation in 3 dimensions. The only downside is that the full mix tends to be heavy on battery power, and hence is currently used fairly sparingly by applications on the phone.

Some privacy concerns then started to arise. However, I thought it was fairly common knowledge that mobile phone operators (certainly in the USA) could deduce the locations of spectators as being inside a sports stadium, and tell the stadium owners the basic demographics of people present, and the locations from which they travelled to the event. This sort of capability will extend to low-power Bluetooth beacons positioned in retail outlets, which, armed with a compatible application (and your permission to share your data), will give retailers analysis gold. Full coverage, 365 days a year, to a level that doesn’t need Paco Underhill class analysis (Paco is the author of the seminal book “Why We Buy: The Science of Shopping”, itself based on years of analysis of customer behaviour in and around retail establishments).

I think I’m fairly cool with it all. Google Android handsets can already sense internally whether you are walking, cycling, on a bus or driving in a car. The whole premise of Google Now is to do searches or to provide service to you before you have to explicitly ask for it. I got quite used to my Nexus phone routinely volunteering commute traffic conditions before I got in my car, or warning me to leave earlier to hit an appointment in time given current driving (or bus service) conditions on the route I usually took. I was also very impressed when I walked past a bus stop in Reading and Google Now flashed up the ETA and destination of the next bus, and a summary of the timetable for buses leaving from that stop.

Google have just released another card on Google Now that automatically notes where you parked your car, and navigates you back to it if you feel the need for it to do so later on.

All of this is done with your explicit permission, and one of the nice things on Android is that if the software vendor’s data policies change in any way, it will not allow through the update to enable that functionality without explicitly asking you for permission first. That is why I knocked LinkedIn off my Nexus 5 when they said an update would enable them to collect my phone call data of who I was calling and receiving calls from. I thought that was unnecessary for the service I receive (and pay for) from them.

The location services I’m sharing with a small number of vendors are already returning great benefit to me. If that continues, and service providers are only intrusive enough to help deliver a useful service to me, then I’m happy to share that data. If you don’t want to play, that’s also your call. What’s not to like?