A techie’s rough guide to GDPR
[This was originally written for my upcoming book Future Ethics, but might be too boring to make the final draft. I must stress this post does not constitute legal advice; anyone who takes my word over that of a properly qualified lawyer deserves what they get. I recommend reading this post alongside the UK’s ICO guidance and/or articles from specialists such as Heather Burns.]
A large global change in data protection law is about to hit the tech industry, thanks to the EU’s General Data Protection Regulation (GDPR). GDPR affects any company, wherever it is in the world, that handles data about people in the EU. It applies from 25 May 2018, and as such covers people in the UK, since it precedes Brexit. It’s no surprise the EU has chosen to tighten the data protection belt: Europe has long opposed the tech industry’s expansionist tendencies, particularly through antitrust suits, and the EU is perhaps the only regulator with the inclination and power to challenge Silicon Valley in the coming years.
Technologists seeking to comply with GDPR should get cosy with their legal teams, rather than take advice from this entirely unqualified author. However, it’s worth knowing about the GDPR’s provisions, since they address many important data ethics issues and have considerable implications for tech companies.
GDPR defines personal data as anything that can be used to directly or indirectly identify an individual, including name, photo, email, bank details, social network posts, DNA, IP addresses, cookies, and location data. Pseudonymised data may also count, if it’s only weakly de-identified and still traceable to an individual. Under GDPR, personal data can only be collected and processed for ‘specified, explicit, and legitimate purposes’. The relevant EU Working Party is clear on this limitation:
‘A purpose that is vague or general, such as for instance ‘Improving users’ experience’, ‘marketing purposes’, or ‘future research’ will – without further detail – usually not meet the criteria of being ‘specific’.’ —Article 29 Working Party, Opinion 03/2013 on purpose limitation, 2 April 2013.
So, no more harvesting data for unplanned analytics, future experimentation, or unspecified research. Teams must have specific uses for specific data.
The regulations also raise the bar on consent. User consent is required unless you can claim another lawful basis for handling personal data. One such basis is ‘legitimate interests’, but this isn’t the catch-all saviour it may appear. To take this route you need to demonstrate your interests aren’t outweighed by others’ – it’s likely this only applies where there’s minimal privacy impact and no one could reasonably object to their data being handled in this way.
Where requested, consent must be freely given, specific, informed, and unambiguous – and indicated by a clear affirmative action. These few words form a death sentence for data dark patterns. Pre-ticked and opt-out boxes are explicitly banned: “Silence, pre-ticked boxes or inactivity should not therefore constitute consent” (Recital 32, GDPR). ‘No’ must become your data default. Requests for consent can’t be buried in Terms and Conditions – they must be separated and use clear, plain language. Requests must be granular, asking for separate consent for separate types of processing. Blanket consent is not allowed. Consent must be easy to withdraw; indeed ‘it must be as easy to withdraw consent as it is to give it’. No more retention scams that allow online signups but demand users phone a call centre to delete their accounts. Finally, parental consent is required to process children’s data – the age at which this applies is down to individual EU countries, but can’t be lower than thirteen.
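To make those rules concrete, here’s a minimal sketch of how a consent store might behave. The ConsentLedger class, its method names, and the purpose strings are all invented for illustration; this is one possible shape, not a compliance recipe. It shows three of the points above: ‘no’ as the default, one entry per purpose rather than blanket consent, and withdrawal that is no harder than granting.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict


@dataclass
class ConsentLedger:
    """Per-user consent, one entry per processing purpose (hypothetical model)."""

    user_id: str
    # purpose -> time of the user's affirmative action; a missing key means "no".
    granted: Dict[str, datetime] = field(default_factory=dict)

    def grant(self, purpose: str, affirmative_action: bool) -> None:
        # Only a clear user action counts; silence or a pre-ticked box does not.
        if not affirmative_action:
            raise ValueError("consent needs a clear affirmative action")
        self.granted[purpose] = datetime.now(timezone.utc)

    def withdraw(self, purpose: str) -> None:
        # Withdrawal is a single call, mirroring grant().
        self.granted.pop(purpose, None)

    def allows(self, purpose: str) -> bool:
        # Default is "no": a purpose that was never granted is never allowed.
        return purpose in self.granted


ledger = ConsentLedger("user-123")
ledger.grant("email_newsletter", affirmative_action=True)
print(ledger.allows("email_newsletter"))    # True
print(ledger.allows("ad_personalisation"))  # False: never asked, so never assumed
ledger.withdraw("email_newsletter")
print(ledger.allows("email_newsletter"))    # False
```

Keying the ledger by purpose is the design choice that matters here: it makes blanket consent impossible to record by accident.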
GDPR also defines some riskier data as sensitive: data on race, ethnic origin, politics, religion, trade union membership, genetics, biometrics used for ID, health, sex life, and sexual orientation. Processing this ‘special category data’ needs an extra justification on top of the usual lawful basis, and for most tech products that will mean explicit consent. As often happens with new legislation, it’s not yet clear exactly what ‘explicit’ means here and how it differs from standard consent, but technologists should nevertheless tread carefully.
GDPR grants individuals eight data rights. Three of them are more complex and ethically interesting than the rest, and deserve closer attention. First, GDPR provides a right to data portability. Not only can users request their data, but it must be provided in a structured, machine-readable format like CSV, so users can use it for other purposes. However, this isn’t an own-your-data nirvana – it only covers data the user has directly provided, and excludes data bought from third parties or new data derived through, for example, segmentation and profiling. Businesses that choose the ‘legitimate interests’ justification for data processing (see above) are also exempt. However, this new right still threatens some comfortable walled-garden models. For example, social networks, exercise-tracking apps, and photo services will have to allow users to export their posts, rides, and photos in a common format. Smart competitors will build upload tools that recognise these formats; GDPR might therefore help to bridge the strategic moats of incumbents.
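As a rough illustration of what a portability export could look like, here’s a small sketch. It assumes each stored record carries a ‘source’ field saying where it came from; that field, its values, and the column names are my own invention, not anything GDPR prescribes. The function keeps only the rows the user supplied directly and writes them out as CSV.

```python
import csv
import io
from typing import Dict, List

# Hypothetical provenance tag; derived and third-party rows use other values.
USER_PROVIDED = "user_provided"


def export_portable_csv(records: List[Dict[str, str]]) -> str:
    """Return a CSV containing only the rows the user supplied directly."""
    rows = [r for r in records if r.get("source") == USER_PROVIDED]
    buffer = io.StringIO()
    # extrasaction="ignore" drops the internal "source" field from the export.
    writer = csv.DictWriter(
        buffer, fieldnames=["kind", "created", "value"], extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


records = [
    {"kind": "ride", "created": "2018-05-01", "value": "42km", "source": "user_provided"},
    {"kind": "segment", "created": "2018-05-02", "value": "weekend-cyclist", "source": "derived"},
    {"kind": "income_band", "created": "2018-05-03", "value": "B", "source": "third_party"},
]
print(export_portable_csv(records))
```

Only the ride makes it into the file; the profiling segment and the bought-in data stay out, which matches the limits described above. Recording provenance when data is first stored makes this filter trivial later.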
Users also have a right to erasure, sometimes known as the right to be forgotten. This has already become a cause célèbre for data rights, and legal cases are swirling around the topic. It’s important to note there is no absolute right to be forgotten under GDPR; the right only applies in specific circumstances, and requests can be refused on grounds of freedom of expression, legal necessity, and public interest. The ethics of forgetting are fascinating and way beyond this post; I’ll save that for the book. But in The Ethics of Invention, Sheila Jasanoff argues convincingly that this right helps to constitute what it means to be ‘a moving, changing, traceable, and opinionated data subject’. It’s a particularly important right for children, although it also has the potential to be abused by those trying to hide wrongs. And a right to be forgotten may come into direct conflict with emerging technologies: good luck handling a right to erasure request if you’ve already committed the data to an irrevocable blockchain.
The eighth individual right relates to automated decision making and profiling. This has sometimes been misrepresented as a right to explanation; i.e. that companies must explain on demand the calculations of any algorithm that takes decisions about people. This right doesn’t exist within GDPR, although it may need to exist in future. (Again, the ethical angles of explainable algorithms are complex and need to be covered separately.) GDPR’s automated decision right requires companies to tell individuals about the processing (what data is used, why it’s used, what effects it might have), to allow people to challenge automated decisions and request human intervention, and to carry out regular checks that systems are working properly. This last phrase is promising: regular auditing of decision-making systems will hopefully mean algorithmic bias will be exposed and eliminated sooner.
GDPR makes a special case of fully automated decisions that have ‘legal or similarly significant effects’, giving examples such as algorithms that affect legal rights, financial credit, or employment. These can only be undertaken where contractually necessary, authorised by law, or when the user gives explicit consent. In these high-risk cases, individuals have a right to know about the logic involved in the decision-making process. It seems likely that an outline of how the algorithm works might suffice, rather than providing the data relating to this specific decision. Companies must conduct an impact assessment to examine potential risks, and take steps to prevent errors and bias.
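One way a team might encode these rules is as a gate in front of the decision engine. The sketch below is an assumption-laden illustration: the Basis enum, the Decision fields, and the route function are all hypothetical names, and a real system would need legal input on what counts as a ‘significant effect’. It sends a decision to a human whenever the person asks for one, or whenever a significant decision lacks one of the three permitted grounds.

```python
from dataclasses import dataclass
from enum import Enum


class Basis(Enum):
    """Grounds on which a fully automated significant decision may proceed."""

    NONE = "none"
    CONTRACT_NECESSITY = "contract_necessity"
    AUTHORISED_BY_LAW = "authorised_by_law"
    EXPLICIT_CONSENT = "explicit_consent"


@dataclass
class Decision:
    subject_id: str
    significant_effect: bool  # legal or similarly significant, e.g. credit or hiring
    basis: Basis
    human_review_requested: bool = False


def route(decision: Decision) -> str:
    """Return 'human_review' or 'automated' for a pending decision."""
    # People can always challenge the outcome and ask for human intervention.
    if decision.human_review_requested:
        return "human_review"
    # A significant decision without a permitted ground must not be fully automated.
    if decision.significant_effect and decision.basis is Basis.NONE:
        return "human_review"
    return "automated"


print(route(Decision("applicant-1", True, Basis.NONE)))              # human_review
print(route(Decision("applicant-2", True, Basis.EXPLICIT_CONSENT)))  # automated
print(route(Decision("visitor-3", False, Basis.NONE)))               # automated
```

Logging every call to a gate like this would also give you a record to draw on for the regular checks and impact assessments.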
GDPR’s other highlights include an obligation for teams to practise data protection by design and tight stipulations to notify authorities about data breaches. And, for the first time, it’s all backed up by meaty penalties: up to €20 million or 4% of global annual turnover, whichever is higher, for the most severe violations.
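For a sense of scale, the top tier of fines is capped at €20 million or 4% of global annual turnover, whichever is higher. The arithmetic is simple enough to sketch; the function name and the example turnovers below are mine, and the result is the ceiling, not a prediction of what a regulator would actually levy.

```python
def max_fine_eur(global_annual_turnover_eur: int) -> int:
    """Upper limit of a top-tier fine, in whole euros."""
    four_percent = global_annual_turnover_eur * 4 // 100  # integer maths, no rounding drift
    return max(20_000_000, four_percent)


print(max_fine_eur(100_000_000))    # 20000000: 4% is only 4m, so the 20m floor applies
print(max_fine_eur(1_000_000_000))  # 40000000: 4% of 1bn exceeds 20m
```

The crossover sits at €500 million of turnover: below that the €20 million figure is the cap, and above it the 4% figure takes over.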
Complying with GDPR will require tricky changes to algorithmic design, product management and design processes, user interfaces, user-facing policies, and data recording standards. You’ll have to spend time designing consent-gathering components, unless you can claim a justification that obviates consent. You may end up with less rich customer insights than you had before. Some KPIs may slump. But for companies that have direct customer relationships, it’s all manageable, and on the upside you not only reduce your compliance risk but benefit from the increased trust your customers will show in you and the online world in general.
However, a small number of companies should be very worried. GDPR will expose the tracking that is now commonplace on the web, and it’s fair to expect widespread revolt. Without a direct customer relationship, third-party ad brokers and networks must rely on publishers to gather consent; but no publisher will willingly destroy their user experience with dozens of popups for their ad partners (remember: consent must be granular!). Even if a publisher did volunteer for this self-mutilation, expect users to universally refuse permission for their data to be used for tracking. It’s highly doubtful the ‘legitimate interests’ excuse will work for ad networks either; the balance-of-interests test is unlikely to go their way. The black box will be forced open, and people will find it’s full of snakes. Dr. Johnny Ryan of PageFair puts it bluntly: GDPR will ‘[rip] the digital ecosystem apart’. Expect panicked consolidation in adtech as networks realise they can’t simply sit in the middle; they must somehow own the customer relationship to control consent. Adtech firms may even try to acquire publishers for this reason. And it’s likely some will die. No flowers.