There is No Such Thing as ‘Anonymised’ Data
The ID and secure document industry goes to great lengths to protect the personally identifiable information (PII) that we gather, as we are keenly aware of the injurious consequences of ID data breaches. But there are other ways to build an identity, as an article published by the digital rights organisation the Electronic Frontier Foundation (EFF) demonstrates.
Paige Collings’ article starts from the premise that each credit card purchase, personal medical diagnosis, border crossing, touch point with government and preference about music and books is recorded and then used to predict what we like and dislike, and – ultimately – who we are.
This often happens without our knowledge or consent. The personal information that corporations collect from our online behaviour is sold for profit, which incentivises online actors to collect as much data about us as possible.
In an attempt to justify this, corporations often claim to de-identify the data. This supposedly removes all PII (such as a person’s name) from the data point (such as the fact that an unnamed person bought a particular medicine at a particular time and place). Personal data can also be aggregated, whereby data about multiple people are combined with the intention of removing PII and thereby protecting user privacy.
Other companies say the personal data is ‘anonymised,’ implying a one-way street where it can never be disaggregated and re-identified. But this is not possible – anonymous data rarely stays that way.
Anonymisation
According to the article, personal data can be considered on a spectrum of identifiability.
At the top is data that can directly identify people, such as a name or a national identity number, which can be referred to as ‘direct identifiers’.
Next is information indirectly linked to individuals, like personal phone numbers and email addresses, which some call ‘indirect identifiers’.
After this comes data connected to multiple people, such as a favourite restaurant or movie.
The other end of this spectrum is information that cannot be linked to any specific person – such as aggregated census data, and data that is not directly related to individuals at all, like weather reports.
In practice, any attempt at de-identification requires removal not only of the directly identifying information, but also of information that can identify you when considered in combination with other information known about you. Collings gives this example:
First, think about the number of people that share your specific postal code.
Next, think about how many of those people also share your birthday.
Now, think about how many people share your exact birthday, postal code and gender.
According to one study, these three characteristics are enough to uniquely identify 87% of the US population.
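The effect of combining attributes can be sketched in a few lines of Python. This is only an illustration: the records below are invented, the field names (postal_code, birth_date, gender) and the helper unique_fraction are hypothetical, and the sample is far too small to reproduce the 87% figure. It simply shows how the share of uniquely identifiable records rises as more attributes are combined, even though no names are present.

```python
from collections import Counter

# Invented sample records with names already removed ("de-identified").
# Field names and values are assumptions made purely for illustration.
records = [
    {"postal_code": "02139", "birth_date": "1984-03-12", "gender": "F"},
    {"postal_code": "02139", "birth_date": "1984-03-12", "gender": "M"},
    {"postal_code": "02139", "birth_date": "1990-07-01", "gender": "F"},
    {"postal_code": "02139", "birth_date": "1990-07-01", "gender": "F"},
    {"postal_code": "90210", "birth_date": "1975-11-30", "gender": "M"},
    {"postal_code": "90210", "birth_date": "1962-05-08", "gender": "M"},
]


def unique_fraction(rows, fields):
    """Return the share of rows whose combination of `fields` occurs exactly once."""
    if not rows:
        return 0.0
    # Count how many rows share each combination of the chosen attributes.
    combos = Counter(tuple(row[f] for f in fields) for row in rows)
    # A row is uniquely identifiable if nobody else shares its combination.
    unique_rows = sum(
        1 for row in rows if combos[tuple(row[f] for f in fields)] == 1
    )
    return unique_rows / len(rows)


for fields in (
    ["postal_code"],
    ["postal_code", "birth_date"],
    ["postal_code", "birth_date", "gender"],
):
    print(fields, round(unique_fraction(records, fields), 2))
```

In this toy sample no record is unique on postal code alone, a third are unique once birth date is added, and two thirds are unique once gender is included as well. Anyone holding a second dataset that pairs those same three attributes with names could re-identify the unique records.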
The article concludes that we cannot trust corporations to self-regulate. The financial benefit and business usefulness of our personal data often outweigh our privacy and anonymity. However, as a matter of public policy, it is critical that user privacy is not sacrificed in favour of profit.