While the GDPR framework is robust in many respects, it struggles to provide adequate protection against the emerging risks associated with inferred data (sometimes called derived data, profiling data, or inferential data). Inferred data pose potentially significant risks in terms of privacy and/or discrimination, yet they appear to receive the least protection of any type of personal data covered by GDPR. Defined as assumptions or predictions about future behaviour, inferred data cannot be verified at the time of decision-making. Consequently, data subjects are often unable to predict, understand or refute these inferences, even though their privacy rights, identity and reputation are affected.
Reaching dangerous conclusions
Numerous applications that draw potentially troubling inferences have emerged. Facebook is reported to be able to infer protected attributes such as sexual orientation and race, as well as political opinions and the likelihood of a data subject attempting suicide. Facebook data has also been used by third parties to decide on loan eligibility, to infer political leanings, to predict views on social issues such as abortion, and to determine susceptibility to depression. Google has attempted to predict flu outbreaks, other diseases and medical outcomes. Microsoft can predict Parkinson’s and Alzheimer’s from search engine interactions. Target can predict pregnancy from purchase history, users’ satisfaction can be determined by mouse tracking, and China draws on inferences to compute social credit scores.
What protections does GDPR offer for inferred data?
The European Data Protection Board (EDPB) notes that both verifiable and unverifiable inferences are classified as personal data (for instance, the outcome of a medical assessment regarding a user’s health, or a risk management profile). However, it is unclear whether the reasoning and processes that led to the inference are similarly classified. If inferences are deemed to be personal data, should the data protection rights enshrined in GDPR apply equally to them?
The data subject’s right to be informed, right to rectification, right to object to processing, and right to data portability are significantly reduced when data is not ‘provided by the data subject’. For example, the EDPB notes in its guidelines on the right to data portability that “though such data may be part of a profile kept by a data controller and are inferred or derived from the analysis of data provided by the data subject, these data will typically not be considered as ‘provided by the data subject’ and thus will not be within scope of this new right”.
The data subject can, however, still exercise their “right to obtain from the controller confirmation as to whether or not personal data concerning the data subject are being processed, and, where that is the case, access to the personal data”. The data subject also has the right to information about “the existence of automated decision-making, including profiling” (Article 22(1) and (4)) and to “meaningful information about the logic involved, as well as the significance and consequences of such processing” (Article 15). However, the data subject must actively make such an access request, and if the organisation does not provide the data, how will the data subject know that derived or inferred data is missing from the response?
A data subject can also object to direct marketing based on profiling and/or have it stopped. However, there is no obligation on the controller to inform the data subject that any profiling is taking place, “unless it produces legal or significant effects on the data subject”.
No answer just yet…
Addressing the challenges and tensions of inferred and derived data will require further case law on how “personal data” is to be interpreted under GDPR. Future case law on the meaning of “legal effects… or similarly significantly affects”, in the context of profiling, would also be helpful. It also seems reasonable to suggest that, where possible, data subjects should be informed at the point of collection that the organisation derives data from what it collects, and for what purposes. If data subjects do not know that an organisation uses their data to infer new data, they cannot fully exercise their rights, since they will not know that such data exists.
In the meantime, it seems reasonable to suggest that inferred data is ‘fair’ where it has been clearly disclosed to the data subject, is benevolent in its intentions, and offers the data subject positive, enhanced value.