III. Privacy and Data Governance

Privacy and data protection must be guaranteed at all stages of the AI system’s life cycle. Digital records of human behaviour may allow AI systems to infer not only individuals’ preferences, age and gender but also their sexual orientation, religious or political views. To allow individuals to trust the data processing, it must be ensured that they have full control over their own data, and that data concerning them will not be used to harm or discriminate against them. In addition to safeguarding privacy and personal data, requirements must be fulfilled to ensure high quality AI systems. The quality of the data sets used is paramount to the performance of AI systems. When data is gathered, it may reflect socially constructed biases, or contain inaccuracies, errors and mistakes. This needs to be addressed prior to training an AI system with any given data set. In addition, the integrity of the data must be ensured. Processes and data sets used must be tested and documented at each step such as planning, training, testing and deployment. This should also apply to AI systems that were not developed in house but acquired elsewhere. Finally, the access to data must be adequately governed and controlled.
Principle: Key requirements for trustworthy AI, Apr 8, 2019

Published by European Commission

Related Principles

· (3) Privacy

In society premised on AI, it is possible to estimate each person’s political position, economic situation, hobbies preferences, etc. with high accuracy from data on the data subject’s personal behavior. This means, when utilizing AI, that more careful treatment of personal data is necessary than simply utilizing personal information. To ensure that people are not suffered disadvantages from unexpected sharing or utilization of personal data through the internet for instance, each stakeholder must handle personal data based on the following principles. Companies or government should not infringe individual person’s freedom, dignity and equality in utilization of personal data with AI technologies. AI that uses personal data should have a mechanism that ensures accuracy and legitimacy and enable the person herself himself to be substantially involved in the management of her his privacy data. As a result, when using the AI, people can provide personal data without concerns and effectively benefit from the data they provide. Personal data must be properly protected according to its importance and sensitivity. Personal data varies from those unjust use of which would be likely to greatly affect rights and benefits of individuals (Typically thought and creed, medical history, criminal record, etc.) to those that are semi public in social life. Taking this into consideration, we have to pay enough attention to the balance between the use and protection of personal data based on the common understanding of society and the cultural background.

Published by Cabinet Office, Government of Japan in Social Principles of Human-centric AI (Draft), Dec 27, 2018

· 2. Data Governance

The quality of the data sets used is paramount for the performance of the trained machine learning solutions. Even if the data is handled in a privacy preserving way, there are requirements that have to be fulfilled in order to have high quality AI. The datasets gathered inevitably contain biases, and one has to be able to prune these away before engaging in training. This may also be done in the training itself by requiring a symmetric behaviour over known issues in the training set. In addition, it must be ensured that the proper division of the data which is being set into training, as well as validation and testing of those sets, is carefully conducted in order to achieve a realistic picture of the performance of the AI system. It must particularly be ensured that anonymisation of the data is done in a way that enables the division of the data into sets to make sure that a certain data – for instance, images from same persons – do not end up into both the training and test sets, as this would disqualify the latter. The integrity of the data gathering has to be ensured. Feeding malicious data into the system may change the behaviour of the AI solutions. This is especially important for self learning systems. It is therefore advisable to always keep record of the data that is fed to the AI systems. When data is gathered from human behaviour, it may contain misjudgement, errors and mistakes. In large enough data sets these will be diluted since correct actions usually overrun the errors, yet a trace of thereof remains in the data. To trust the data gathering process, it must be ensured that such data will not be used against the individuals who provided the data. Instead, the findings of bias should be used to look forward and lead to better processes and instructions – improving our decisions making and strengthening our institutions.

Published by The European Commission’s High-Level Expert Group on Artificial Intelligence in Draft Ethics Guidelines for Trustworthy AI, Dec 18, 2018

· 7. Respect for Privacy

Privacy and data protection must be guaranteed at all stages of the life cycle of the AI system. This includes all data provided by the user, but also all information generated about the user over the course of his or her interactions with the AI system (e.g. outputs that the AI system generated for specific users, how users responded to particular recommendations, etc.). Digital records of human behaviour can reveal highly sensitive data, not only in terms of preferences, but also regarding sexual orientation, age, gender, religious and political views. The person in control of such information could use this to his her advantage. Organisations must be mindful of how data is used and might impact users, and ensure full compliance with the GDPR as well as other applicable regulation dealing with privacy and data protection.

Published by The European Commission’s High-Level Expert Group on Artificial Intelligence in Draft Ethics Guidelines for Trustworthy AI, Dec 18, 2018

· 8. Robustness

Trustworthy AI requires that algorithms are secure, reliable as well as robust enough to deal with errors or inconsistencies during the design, development, execution, deployment and use phase of the AI system, and to adequately cope with erroneous outcomes. Reliability & Reproducibility. Trustworthiness requires that the accuracy of results can be confirmed and reproduced by independent evaluation. However, the complexity, non determinism and opacity of many AI systems, together with sensitivity to training model building conditions, can make it difficult to reproduce results. Currently there is an increased awareness within the AI research community that reproducibility is a critical requirement in the field. Reproducibility is essential to guarantee that results are consistent across different situations, computational frameworks and input data. The lack of reproducibility can lead to unintended discrimination in AI decisions. Accuracy. Accuracy pertains to an AI’s confidence and ability to correctly classify information into the correct categories, or its ability to make correct predictions, recommendations, or decisions based on data or models. An explicit and well formed development and evaluation process can support, mitigate and correct unintended risks. Resilience to Attack. AI systems, like all software systems, can include vulnerabilities that can allow them to be exploited by adversaries. Hacking is an important case of intentional harm, by which the system will purposefully follow a different course of action than its original purpose. If an AI system is attacked, the data as well as system behaviour can be changed, leading the system to make different decisions, or causing the system to shut down altogether. Systems and or data can also become corrupted, by malicious intention or by exposure to unexpected situations. Poor governance, by which it becomes possible to intentionally or unintentionally tamper with the data, or grant access to the algorithms to unauthorised entities, can also result in discrimination, erroneous decisions, or even physical harm. Fall back plan. A secure AI has safeguards that enable a fall back plan in case of problems with the AI system. In some cases this can mean that the AI system switches from statistical to rule based procedure, in other cases it means that the system asks for a human operator before continuing the action.

Published by The European Commission’s High-Level Expert Group on Artificial Intelligence in Draft Ethics Guidelines for Trustworthy AI, Dec 18, 2018

Responsible Deployment

Principle: The capacity of an AI agent to act autonomously, and to adapt its behavior over time without human direction, calls for significant safety checks before deployment, and ongoing monitoring. Recommendations: Humans must be in control: Any autonomous system must allow for a human to interrupt an activity or shutdown the system (an “off switch”). There may also be a need to incorporate human checks on new decision making strategies in AI system design, especially where the risk to human life and safety is great. Make safety a priority: Any deployment of an autonomous system should be extensively tested beforehand to ensure the AI agent’s safe interaction with its environment (digital or physical) and that it functions as intended. Autonomous systems should be monitored while in operation, and updated or corrected as needed. Privacy is key: AI systems must be data responsible. They should use only what they need and delete it when it is no longer needed (“data minimization”). They should encrypt data in transit and at rest, and restrict access to authorized persons (“access control”). AI systems should only collect, use, share and store data in accordance with privacy and personal data laws and best practices. Think before you act: Careful thought should be given to the instructions and data provided to AI systems. AI systems should not be trained with data that is biased, inaccurate, incomplete or misleading. If they are connected, they must be secured: AI systems that are connected to the Internet should be secured not only for their protection, but also to protect the Internet from malfunctioning or malware infected AI systems that could become the next generation of botnets. High standards of device, system and network security should be applied. Responsible disclosure: Security researchers acting in good faith should be able to responsibly test the security of AI systems without fear of prosecution or other legal action. At the same time, researchers and others who discover security vulnerabilities or other design flaws should responsibly disclose their findings to those who are in the best position to fix the problem.

Published by Internet Society, "Artificial Intelligence and Machine Learning: Policy Paper" in Guiding Principles and Recommendations, Apr 18, 2017