Data minimization vs. bias prevention

The EU AI Act and the GDPR intersect in a few places that will be fascinating to follow, none more so than Art. 10 of the AI Act on data governance, and specifically its paragraph 5. This provision allows providers of high-risk AI systems to process special categories of personal data, under strict conditions and only to the extent strictly necessary, in order to detect and correct bias. In practice, this means that such systems can be trained and checked against diverse datasets that include, for example, people of various backgrounds. The purpose is to avoid discriminatory practices stemming from the use of AI systems trained on lopsided datasets.

What I find particularly interesting here is the "clash" between two (at least in my eyes) equally important rights - the right to (data) privacy and the right to equal treatment. Both are strongly connected to human dignity, so I believe that in some cases courts and lawyers will have a difficult time deciding fairly which of these rights prevails.

For example, I wouldn't want my health data to be used for training purposes by some random service provider B. At the same time, suppose I have a very rare chronic condition A, and my hospital uses provider B's AI system to establish emergency access protocols, so that the system helps decide who gets treated first. The fact that my health data was not part of the training may then become life-threatening in some circumstances - I could be denied priority access because my condition is not represented in the dataset.

On the other hand, if provider B's servers are breached and the data about my condition is leaked, it may fall into the hands of actors who target people with the same condition, out of prejudice or for whatever other reason, which could also seriously endanger my well-being.

Therefore, enterprises that fall under Art. 10 (meaning providers of high-risk AI systems) would need to (1) establish very strict security and cybersecurity measures for sensitive personal data, (2) whenever possible, not use personal data at all, or anonymize/pseudonymize the data, so that in my example above the condition itself is present in the dataset but my name is not attached to it, (3) build other measures into their systems that allow for some degree of human control and decision-making power in critical situations, and (4) not transfer personal data, or give access to it, to anyone outside their organization.
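To make measure (2) a bit more concrete, here is a minimal sketch of what pseudonymizing a single health record could look like before it enters a training dataset. It is only an illustration under my own assumptions: the field names (patient_id, name, address, condition), the key and the pseudonymize helper are all hypothetical, and keyed hashing with HMAC-SHA256 is just one possible technique, not something the AI Act prescribes.

```python
import hashlib
import hmac

# Hypothetical secret key. In practice it would be generated randomly and
# stored separately from the training data (for example in a key vault).
SECRET_KEY = b"replace-with-a-randomly-generated-key"

# Hypothetical field names, chosen only for this illustration.
DIRECT_IDENTIFIERS = {"name", "address"}


def pseudonymize(record: dict, key: bytes = SECRET_KEY) -> dict:
    """Return a copy of the record without direct identifiers, with the
    patient ID replaced by a keyed hash (HMAC-SHA256)."""
    token = hmac.new(
        key, record["patient_id"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Keep everything except the direct identifiers and the original ID.
    cleaned = {
        field: value
        for field, value in record.items()
        if field not in DIRECT_IDENTIFIERS and field != "patient_id"
    }
    cleaned["pseudonym"] = token
    return cleaned


record = {
    "patient_id": "P-1001",
    "name": "Jane Doe",
    "address": "Example Street 1",
    "condition": "condition A",
}
training_row = pseudonymize(record)
```

The resulting training_row still contains the condition, but neither the name nor the original patient ID. Keep in mind that pseudonymized data remains personal data under the GDPR, and that a very rare condition can make a person identifiable even without a name, so a step like this complements the other measures rather than replacing them.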

Data governance should, like AI governance as a whole, involve various stakeholders from the respective organization, because in such complex situations having different points of view is essential. Technical, business, compliance and legal people need to be involved in all important decisions regarding the AI system throughout its lifecycle, otherwise the risks for the company could be enormous.

Fair and trustworthy AI systems, which do not force us to choose the lesser of two "evils", will only be possible through thoughtful, responsible, ethical and consistent governance practices. Otherwise, the risk for both people and companies would be too great.
