The traceability of AI systems should be ensured; it is important to log and document both the decisions made by the systems, as well as the entire process (including a description of data gathering and labelling, and a description of the algorithm used) that yielded the decisions. Linked to this, explainability of the algorithmic decision making process, adapted to the persons involved, should be provided to the extent possible. Ongoing research to develop explainability mechanisms should be pursued. In addition, explanations of the degree to which an AI system influences and shapes the organisational decision making process, design choices of the system, as well as the rationale for deploying it, should be available (hence ensuring not just data and system transparency, but also business model transparency).
Finally, it is important to adequately communicate the AI system’s capabilities and limitations to the different stakeholders involved in a manner appropriate to the use case at hand. Moreover, AI systems should be identifiable as such, ensuring that users know they are interacting with an AI system and which persons are responsible for it.
Published by: The European Commission’s High-Level Expert Group on Artificial Intelligence in Draft Ethics Guidelines for Trustworthy AI
Trustworthy AI requires that algorithms are secure, reliable as well as robust enough to deal with errors or inconsistencies during the design, development, execution, deployment and use phase of the AI system, and to adequately cope with erroneous outcomes.
Reliability & Reproducibility. Trustworthiness requires that the accuracy of results can be confirmed and reproduced by independent evaluation. However, the complexity, non determinism and opacity of many AI systems, together with sensitivity to training model building conditions, can make it difficult to reproduce results. Currently there is an increased awareness within the AI research community that reproducibility is a critical requirement in the field. Reproducibility is essential to guarantee that results are consistent across different situations, computational frameworks and input data. The lack of reproducibility can lead to unintended discrimination in AI decisions.
Accuracy. Accuracy pertains to an AI’s confidence and ability to correctly classify information into the correct categories, or its ability to make correct predictions, recommendations, or decisions based on data or models. An explicit and well formed development and evaluation process can support, mitigate and correct unintended risks.
Resilience to Attack. AI systems, like all software systems, can include vulnerabilities that can allow them to be exploited by adversaries. Hacking is an important case of intentional harm, by which the system will purposefully follow a different course of action than its original purpose. If an AI system is attacked, the data as well as system behaviour can be changed, leading the system to make different decisions, or causing the system to shut down altogether. Systems and or data can also become corrupted, by malicious intention or by exposure to unexpected situations. Poor governance, by which it becomes possible to intentionally or unintentionally tamper with the data, or grant access to the algorithms to unauthorised entities, can also result in discrimination, erroneous decisions, or even physical harm.
Fall back plan. A secure AI has safeguards that enable a fall back plan in case of problems with the AI system. In some cases this can mean that the AI system switches from statistical to rule based procedure, in other cases it means that the system asks for a human operator before continuing the action.
3. Principle of controllability
Published by: Ministry of Internal Affairs and Communications (MIC), the Government of Japan in AI R&D Principles
Developers should pay attention to the controllability of AI systems.
In order to assess the risks related to the controllability of AI systems, it is encouraged that developers make efforts to conduct verification and validation in advance. One of the conceivable methods of risk assessment is to conduct experiments in a closed space such as in a laboratory or a sandbox in which security is ensured, at a stage before the practical application in society.
In addition, in order to ensure the controllability of AI systems, it is encouraged that developers pay attention to whether the supervision (such as monitoring or warnings) and countermeasures (such as system shutdown, cut off from networks, or repairs) by humans or other trustworthy AI systems are effective, to the extent possible in light of the characteristics of the technologies to be adopted.
Verification and validation are methods for evaluating and controlling risks in advance. Generally, the former is used for confirming formal consistency, while the latter is used for confirming substantial validity. (See, e.g., The Future of Life Institute (FLI), Research Priorities for Robust and Beneficial Artificial Intelligence (2015)).
Examples of what to see in the risk assessment are risks of reward hacking in which AI systems formally achieve the goals assigned but substantially do not meet the developer's intents, and risks that AI systems work in ways that the developers have not intended due to the changes of their outputs and programs in the process of the utilization with their learning, etc. For reward hacking, see, e.g., Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman & Dan Mané, Concrete Problems in AI Safety, arXiv: 1606.06565 [cs.AI] (2016).
We will make AI systems transparent
1. Developers should build systems whose failures can be traced and diagnosed
2. People should be told when significant decisions about them are being made by AI
3. Within the limits of privacy and the preservation of intellectual property, those who deploy AI systems should be transparent about the data and algorithms they use
AI systems will be safe, secure and controllable by humans
1. Safety and security of the people, be they operators, end users or other parties, will be of paramount concern in the design of any AI system
2. AI systems should be verifiably secure and controllable throughout their operational lifetime, to the extent permitted by technology
3. The continued security and privacy of users should be considered when decommissioning AI systems
4. AI systems that may directly impact people’s lives in a significant way should receive commensurate care in their designs, and;
5. Such systems should be able to be overridden or their decisions reversed by designated people