III. Privacy and Data Governance

Privacy and data protection must be guaranteed at all stages of the AI system’s life cycle. Digital records of human behaviour may allow AI systems to infer not only individuals’ preferences, age and gender but also their sexual orientation, religious or political views. To allow individuals to trust the data processing, it must be ensured that they have full control over their own data, and that data concerning them will not be used to harm or discriminate against them. In addition to safeguarding privacy and personal data, requirements must be fulfilled to ensure high quality AI systems. The quality of the data sets used is paramount to the performance of AI systems. When data is gathered, it may reflect socially constructed biases, or contain inaccuracies, errors and mistakes. This needs to be addressed prior to training an AI system with any given data set. In addition, the integrity of the data must be ensured. Processes and data sets used must be tested and documented at each step such as planning, training, testing and deployment. This should also apply to AI systems that were not developed in house but acquired elsewhere. Finally, the access to data must be adequately governed and controlled.
Principle: Key requirements for trustworthy AI, Apr 8, 2019

Published by European Commission

Related Principles

5. Privacy and Data Governance

AI systems should have proper mechanisms in place to ensure data privacy and protection and maintain and protect the quality and integrity of data throughout their entire lifecycle. Data protocols need to be set up to govern who can access data and when data can be accessed. Data privacy and protection should be respected and upheld during the design, development, and deployment of AI systems. The way data is collected, stored, generated, and deleted throughout the AI system lifecycle must comply with applicable data protection laws, data governance legislation, and ethical principles. Some data protection and privacy laws in ASEAN include Malaysia’s Personal Data Protection Act 2010, the Philippines’ Data Privacy Act of 2012, Singapore’s Personal Data Protection Act 2012, Thailand’s Personal Data Protection Act 2019, Indonesia’s Personal Data Protection Law 2022, and Vietnam’s Personal Data Protection Decree 2023. Organisations should be transparent about their data collection practices, including the types of data collected, how it is used, and who has access to it. Organisations should ensure that necessary consent is obtained from individuals before collecting, using, or disclosing personal data for AI development and deployment, or otherwise have an appropriate legal basis to collect, use or disclose personal data without consent. Unnecessary or irrelevant data should not be gathered to prevent potential misuse. Data protection and governance frameworks should be set up and adhered to by developers and deployers of AI systems. These frameworks should also be periodically reviewed and updated in accordance with applicable privacy and data protection laws. For example, data protection impact assessments (DPIA) help organisations determine how data processing systems, procedures, or technologies affect individuals’ privacy and eliminate risks that might violate compliance.
However, it is important to note that DPIAs are much narrower in scope than an overall impact assessment for use of AI systems and are not sufficient as an AI risk assessment. Other components will need to be considered for a full assessment of risks associated with AI systems. Developers and deployers of AI systems should also incorporate a privacy by design principle when developing and deploying AI systems. Privacy by design is an approach that embeds privacy in every stage of the system development lifecycle. Data privacy is essential in gaining the public’s trust in technological advances. Another consideration is investing in privacy-enhancing technologies to preserve privacy while allowing personal data to be used for innovation. Privacy-enhancing technologies include, but are not limited to, differential privacy, where small changes are made to raw data to securely de-identify inputs without having a significant impact on the results of the AI system, and zero-knowledge proofs (ZKPs), which hide the underlying data and answer simple questions about whether something is true or false without revealing additional information.
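As an illustration of the differential privacy idea mentioned above, the following is a minimal sketch and not part of the ASEAN guide. It assumes one common variant, the Laplace mechanism, in which calibrated random noise is added to an aggregate count rather than to each raw record. The names laplace_noise and private_count, and the epsilon parameter (the privacy budget), are chosen here for illustration only.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample zero-mean Laplace noise with the given scale.

    The difference of two independent exponential samples with
    mean `scale` follows a Laplace(0, scale) distribution.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that match `predicate`.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon. A smaller
    epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    people = [{"age": age} for age in (25, 41, 67, 70, 18)]
    rng = random.Random()
    print(private_count(people, lambda p: p["age"] >= 65, epsilon=0.5, rng=rng))
```

The released value stays close to the true count on average, while any single individual's presence or absence has only a bounded effect on what an observer can learn. Choosing epsilon is a policy decision as much as a technical one.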

Published by ASEAN in ASEAN Guide on AI Governance and Ethics, 2024

2. Data Governance

The quality of the data sets used is paramount for the performance of the trained machine learning solutions. Even if the data is handled in a privacy-preserving way, there are requirements that have to be fulfilled in order to have high-quality AI. The datasets gathered inevitably contain biases, and one has to be able to prune these away before engaging in training. This may also be done in the training itself by requiring a symmetric behaviour over known issues in the training set. In addition, it must be ensured that the division of the data into training, validation and test sets is carefully conducted in order to achieve a realistic picture of the performance of the AI system. It must particularly be ensured that anonymisation of the data is done in a way that still enables this division, so that certain data – for instance, images of the same person – do not end up in both the training and test sets, as this would disqualify the latter. The integrity of the data gathering has to be ensured. Feeding malicious data into the system may change the behaviour of the AI solutions. This is especially important for self-learning systems. It is therefore advisable to always keep a record of the data that is fed to the AI systems. When data is gathered from human behaviour, it may contain misjudgement, errors and mistakes. In large enough data sets these will be diluted since correct actions usually outnumber the errors, yet a trace thereof remains in the data. To trust the data gathering process, it must be ensured that such data will not be used against the individuals who provided the data. Instead, the findings of bias should be used to look forward and lead to better processes and instructions – improving our decision making and strengthening our institutions.
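The point about keeping data from the same person out of both the training and test sets can be sketched as a group-aware split. This is an illustrative sketch, not a method prescribed by the guidelines. It assumes each sample carries a pseudonymous person identifier that survives anonymisation, and the function name split_by_group and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def split_by_group(samples, group_of, test_fraction=0.2, seed=0):
    """Split samples so that no group appears in both train and test.

    `group_of` maps a sample to its group key (for example a
    pseudonymous person ID). Whole groups are assigned to one side.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")

    # Collect the samples belonging to each group.
    groups = defaultdict(list)
    for sample in samples:
        groups[group_of(sample)].append(sample)

    # Shuffle the group keys reproducibly, then pick the test groups.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n_test = max(1, round(len(keys) * test_fraction))
    test_keys = set(keys[:n_test])

    train = [s for k in keys if k not in test_keys for s in groups[k]]
    test = [s for k in keys if k in test_keys for s in groups[k]]
    return train, test


if __name__ == "__main__":
    # Hypothetical (person_id, image_name) pairs, for illustration only.
    images = [("p1", "a.png"), ("p1", "b.png"), ("p2", "c.png"),
              ("p3", "d.png"), ("p4", "e.png"), ("p5", "f.png")]
    train, test = split_by_group(images, lambda s: s[0], test_fraction=0.4)
    print(len(train), len(test))
```

Because the fraction applies to groups rather than to individual samples, the resulting test set may be somewhat larger or smaller than the requested share when groups differ in size.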

Published by The European Commission’s High-Level Expert Group on Artificial Intelligence in Draft Ethics Guidelines for Trustworthy AI, Dec 18, 2018

Responsible Deployment

Principle: The capacity of an AI agent to act autonomously, and to adapt its behavior over time without human direction, calls for significant safety checks before deployment, and ongoing monitoring. Recommendations: Humans must be in control: Any autonomous system must allow for a human to interrupt an activity or shut down the system (an “off switch”). There may also be a need to incorporate human checks on new decision-making strategies in AI system design, especially where the risk to human life and safety is great. Make safety a priority: Any deployment of an autonomous system should be extensively tested beforehand to ensure the AI agent’s safe interaction with its environment (digital or physical) and that it functions as intended. Autonomous systems should be monitored while in operation, and updated or corrected as needed. Privacy is key: AI systems must be data responsible. They should use only what they need and delete it when it is no longer needed (“data minimization”). They should encrypt data in transit and at rest, and restrict access to authorized persons (“access control”). AI systems should only collect, use, share and store data in accordance with privacy and personal data laws and best practices. Think before you act: Careful thought should be given to the instructions and data provided to AI systems. AI systems should not be trained with data that is biased, inaccurate, incomplete or misleading. If they are connected, they must be secured: AI systems that are connected to the Internet should be secured not only for their protection, but also to protect the Internet from malfunctioning or malware-infected AI systems that could become the next generation of botnets. High standards of device, system and network security should be applied. Responsible disclosure: Security researchers acting in good faith should be able to responsibly test the security of AI systems without fear of prosecution or other legal action.
At the same time, researchers and others who discover security vulnerabilities or other design flaws should responsibly disclose their findings to those who are in the best position to fix the problem.

Published by Internet Society, "Artificial Intelligence and Machine Learning: Policy Paper" in Guiding Principles and Recommendations, Apr 18, 2017

1 Protect autonomy

Adoption of AI can lead to situations in which decision making could be or is in fact transferred to machines. The principle of autonomy requires that any extension of machine autonomy not undermine human autonomy. In the context of health care, this means that humans should remain in full control of health care systems and medical decisions. AI systems should be designed demonstrably and systematically to conform to the principles and human rights with which they cohere; more specifically, they should be designed to assist humans, whether they be medical providers or patients, in making informed decisions. Human oversight may depend on the risks associated with an AI system but should always be meaningful and should thus include effective, transparent monitoring of human values and moral considerations. In practice, this could include deciding whether to use an AI system for a particular health care decision, to vary the level of human discretion and decision making and to develop AI technologies that can rank decisions when appropriate (as opposed to a single decision). These practices can ensure a clinician can override decisions made by AI systems and that machine autonomy can be restricted and made “intrinsically reversible”. Respect for autonomy also entails the related duties to protect privacy and confidentiality and to ensure informed, valid consent by adopting appropriate legal frameworks for data protection. These should be fully supported and enforced by governments and respected by companies and their system designers, programmers, database creators and others. AI technologies should not be used for experimentation or manipulation of humans in a health care system without valid informed consent. The use of machine learning algorithms in diagnosis, prognosis and treatment plans should be incorporated into the process for informed and valid consent.
Essential services should not be circumscribed or denied if an individual withholds consent, and additional incentives or inducements should not be offered by either a government or private parties to individuals who do provide consent. Data protection laws are one means of safeguarding individual rights and place obligations on data controllers and data processors. Such laws are necessary to protect privacy and the confidentiality of patient data and to establish patients’ control over their data. Construed broadly, data protection laws should also make it easy for people to access their own health data and to move or share those data as they like. Because machine learning requires large amounts of data – big data – these laws are increasingly important.

Published by World Health Organization (WHO) in Key ethical principles for use of artificial intelligence for health, Jun 28, 2021

3 Ensure transparency, explainability and intelligibility

AI should be intelligible or understandable to developers, users and regulators. Two broad approaches to ensuring intelligibility are improving the transparency and explainability of AI technology. Transparency requires that sufficient information (described below) be published or documented before the design and deployment of an AI technology. Such information should facilitate meaningful public consultation and debate on how the AI technology is designed and how it should be used. Such information should continue to be published and documented regularly and in a timely manner after an AI technology is approved for use. Transparency will improve system quality and protect patient and public health safety. For instance, system evaluators require transparency in order to identify errors, and government regulators rely on transparency to conduct proper, effective oversight. It must be possible to audit an AI technology, including if something goes wrong. Transparency should include accurate information about the assumptions and limitations of the technology, operating protocols, the properties of the data (including methods of data collection, processing and labelling) and development of the algorithmic model. AI technologies should be explainable to the extent possible and according to the capacity of those to whom the explanation is directed. Data protection laws already create specific obligations of explainability for automated decision making. Those who might request or require an explanation should be well informed, and the educational information must be tailored to each population, including, for example, marginalized populations. Many AI technologies are complex, and the complexity might frustrate both the explainer and the person receiving the explanation. There is a possible trade-off between full explainability of an algorithm (at the cost of accuracy) and improved accuracy (at the cost of explainability).
All algorithms should be tested rigorously in the settings in which the technology will be used in order to ensure that they meet standards of safety and efficacy. The examination and validation should include the assumptions, operational protocols, data properties and output decisions of the AI technology. Tests and evaluations should be regular, transparent and of sufficient breadth to cover differences in the performance of the algorithm according to race, ethnicity, gender, age and other relevant human characteristics. There should be robust, independent oversight of such tests and evaluation to ensure that they are conducted safely and effectively. Health care institutions, health systems and public health agencies should regularly publish information about how decisions have been made for adoption of an AI technology and how the technology will be evaluated periodically, its uses, its known limitations and the role of decision making, which can facilitate external auditing and oversight.
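One simple way to examine differences in performance across human characteristics, as called for above, is to report a metric separately for each subgroup. The sketch below is an assumption-laden illustration rather than a WHO-specified procedure. It uses plain accuracy as the metric, and the names accuracy_by_group and largest_gap are hypothetical.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return the fraction of correct predictions for each group label."""
    if not (len(y_true) == len(y_pred) == len(groups)):
        raise ValueError("all inputs must have the same length")

    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == prediction)
    return {group: correct[group] / total[group] for group in total}


def largest_gap(per_group):
    """Return the difference between the best and worst group scores."""
    scores = list(per_group.values())
    return max(scores) - min(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical labels and predictions, for illustration only.
    truth = [1, 0, 1, 1]
    predicted = [1, 0, 0, 1]
    age_band = ["under_65", "under_65", "65_plus", "65_plus"]
    scores = accuracy_by_group(truth, predicted, age_band)
    print(scores, largest_gap(scores))
```

A large gap between groups is a signal for further investigation, not a verdict on its own. In a clinical setting other metrics, such as sensitivity and specificity per group, would usually matter more than overall accuracy.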

Published by World Health Organization (WHO) in Key ethical principles for use of artificial intelligence for health, Jun 28, 2021