1

We face near term risks from malicious actors misusing frontier AI systems, with current safety filters integrated by developers easily bypassed. Frontier AI systems produce compelling misinformation and may soon be capable enough to help terrorists develop weapons of mass destruction. Moreover, there is a serious risk that future AI systems may escape human control altogether. Even aligned AI systems could destabilize or disempower existing institutions. Taken together, we believe AI may pose an existential risk to humanity in the coming decades.
Principle: IDAIS-Oxford, Oct 31, 2023

Published by IDAIS (International Dialogues on AI Safety)

Related Principles

3. Security and Safety

AI systems should be safe and sufficiently secure against malicious attacks. Safety refers to ensuring the safety of developers, deployers, and users of AI systems by conducting impact or risk assessments and ensuring that known risks have been identified and mitigated. A risk prevention approach should be adopted, and precautions should be put in place so that humans can intervene to prevent harm, or the system can safely disengage itself in the event an AI system makes unsafe decisions autonomous vehicles that cause injury to pedestrians are an illustration of this. Ensuring that AI systems are safe is essential to fostering public trust in AI. Safety of the public and the users of AI systems should be of utmost priority in the decision making process of AI systems and risks should be assessed and mitigated to the best extent possible. Before deploying AI systems, deployers should conduct risk assessments and relevant testing or certification and implement the appropriate level of human intervention to prevent harm when unsafe decisions take place. The risks, limitations, and safeguards of the use of AI should be made known to the user. For example, in AI enabled autonomous vehicles, developers and deployers should put in place mechanisms for the human driver to easily resume manual driving whenever they wish. Security refers to ensuring the cybersecurity of AI systems, which includes mechanisms against malicious attacks specific to AI such as data poisoning, model inversion, the tampering of datasets, byzantine attacks in federated learning5, as well as other attacks designed to reverse engineer personal data used to train the AI. Deployers of AI systems should work with developers to put in place technical security measures like robust authentication mechanisms and encryption. Just like any other software, deployers should also implement safeguards to protect AI systems against cyberattacks, data security attacks, and other digital security risks. These may include ensuring regular software updates to AI systems and proper access management for critical or sensitive systems. Deployers should also develop incident response plans to safeguard AI systems from the above attacks. It is also important for deployers to make a minimum list of security testing (e.g. vulnerability assessment and penetration testing) and other applicable security testing tools. Some other important considerations also include: a. Business continuity plan b. Disaster recovery plan c. Zero day attacks d. IoT devices

Published by ASEAN in ASEAN Guide on AI Governance and Ethics, 2024

4. Human centricity

AI systems should respect human centred values and pursue benefits for human society, including human beings’ well being, nutrition, happiness, etc. It is key to ensure that people benefit from AI design, development, and deployment while being protected from potential harms. AI systems should be used to promote human well being and ensure benefit for all. Especially in instances where AI systems are used to make decisions about humans or aid them, it is imperative that these systems are designed with human benefit in mind and do not take advantage of vulnerable individuals. Human centricity should be incorporated throughout the AI system lifecycle, starting from the design to development and deployment. Actions must be taken to understand the way users interact with the AI system, how it is perceived, and if there are any negative outcomes arising from its outputs. One example of how deployers can do this is to test the AI system with a small group of internal users from varied backgrounds and demographics and incorporate their feedback in the AI system. AI systems should not be used for malicious purposes or to sway or deceive users into making decisions that are not beneficial to them or society. In this regard, developers and deployers (if developing or designing inhouse) should also ensure that dark patterns are avoided. Dark patterns refer to the use of certain design techniques to manipulate users and trick them into making decisions that they would otherwise not have made. An example of a dark pattern is employing the use of default options that do not consider the end user’s interests, such as for data sharing and tracking of the user’s other online activities. As an extension of human centricity as a principle, it is also important to ensure that the adoption of AI systems and their deployment at scale do not unduly disrupt labour and job prospects without proper assessment. Deployers are encouraged to take up impact assessments to ensure a systematic and stakeholder based review and consider how jobs can be redesigned to incorporate use of AI. Personal Data Protection Commission of Singapore’s (PDPC) Guide on Job Redesign in the Age of AI6 provides useful guidance to assist organisations in considering the impact of AI on its employees, and how work tasks can be redesigned to help employees embrace AI and move towards higher value tasks.

Published by ASEAN in ASEAN Guide on AI Governance and Ethics, 2024

· 2. The Principle of Non maleficence: “Do no Harm”

AI systems should not harm human beings. By design, AI systems should protect the dignity, integrity, liberty, privacy, safety, and security of human beings in society and at work. AI systems should not threaten the democratic process, freedom of expression, freedoms of identify, or the possibility to refuse AI services. At the very least, AI systems should not be designed in a way that enhances existing harms or creates new harms for individuals. Harms can be physical, psychological, financial or social. AI specific harms may stem from the treatment of data on individuals (i.e. how it is collected, stored, used, etc.). To avoid harm, data collected and used for training of AI algorithms must be done in a way that avoids discrimination, manipulation, or negative profiling. Of equal importance, AI systems should be developed and implemented in a way that protects societies from ideological polarization and algorithmic determinism. Vulnerable demographics (e.g. children, minorities, disabled persons, elderly persons, or immigrants) should receive greater attention to the prevention of harm, given their unique status in society. Inclusion and diversity are key ingredients for the prevention of harm to ensure suitability of these systems across cultures, genders, ages, life choices, etc. Therefore not only should AI be designed with the impact on various vulnerable demographics in mind but the above mentioned demographics should have a place in the design process (rather through testing, validating, or other). Avoiding harm may also be viewed in terms of harm to the environment and animals, thus the development of environmentally friendly AI may be considered part of the principle of avoiding harm. The Earth’s resources can be valued in and of themselves or as a resource for humans to consume. In either case it is necessary to ensure that the research, development, and use of AI are done with an eye towards environmental awareness.

Published by The European Commission’s High-Level Expert Group on Artificial Intelligence in Draft Ethics Guidelines for Trustworthy AI, Dec 18, 2018

· Consensus Statement on AI Safety as a Global Public Good

Rapid advances in artificial intelligence (AI) systems’ capabilities are pushing humanity closer to a world where AI meets and surpasses human intelligence. Experts agree these AI systems are likely to be developed in the coming decades, with many of them believing they will arrive imminently. Loss of human control or malicious use of these AI systems could lead to catastrophic outcomes for all of humanity. Unfortunately, we have not yet developed the necessary science to control and safeguard the use of such advanced intelligence. The global nature of these risks from AI makes it necessary to recognize AI safety as a global public good, and work towards global governance of these risks. Collectively, we must prepare to avert the attendant catastrophic risks that could arrive at any time. Promising initial steps by the international community show cooperation on AI safety and governance is achievable despite geopolitical tensions. States and AI developers around the world committed to foundational principles to foster responsible development of AI and minimize risks at two intergovernmental summits. Thanks to these summits, states established AI Safety Institutes or similar institutions to advance testing, research and standards setting. These efforts are laudable and must continue. States must sufficiently resource AI Safety Institutes, continue to convene summits and support other global governance efforts. However, states must go further than they do today. As an initial step, states should develop authorities to detect and respond to AI incidents and catastrophic risks within their jurisdictions. These domestic authorities should coordinate to develop a global contingency plan to respond to severe AI incidents and catastrophic risks. In the longer term, states should develop an international governance regime to prevent the development of models that could pose global catastrophic risks. Deep and foundational research needs to be conducted to guarantee the safety of advanced AI systems. This work must begin swiftly to ensure they are developed and validated prior to the advent of advanced AIs. To enable this, we call on states to carve out AI safety as a cooperative area of academic and technical activity, distinct from broader geostrategic competition on development of AI capabilities. The international community should consider setting up three clear processes to prepare for a world where advanced AI systems pose catastrophic risks:

Published by IDAIS (International Dialogues on AI Safety) in IDAIS-Venice, Sept 5, 2024

· 2. RESPONSIBILITY MUST BE FULLY ACKNOWLEDGED WHEN CREATING AND USING AI

2.1. Risk based approach. The degree of attention paid to ethical AI issues and the nature of the relevant actions of AI Actors should be proportional to the assessment of the level of risk posed by specific AI technologies and systems for the interests of individuals and society. Risk level assessment shall take into account both known and possible risks, whereby the probability level of threats, as well as their possible scale in the short and long term shall be considered. Making decisions in the field of AI use that significantly affect society and the state should be accompanied by a scientifically verified, interdisciplinary forecast of socio economic consequences and risks and examination of possible changes in the paradigm of value and cultural development of the society. Development and use of an AI systems risk assessment methodology are encouraged in pursuance of this Code. 2.2. Responsible attitude. AI Actors should responsibly treat: • issues related to the influence of AI systems on society and citizens at every stage of the AI systems’ life cycle, i.a. on privacy, ethical, safe and responsible use of personal data; • the nature, degree and extent of damage that may result from the use of AI technologies and systems; • the selection and use of hardware and software utilized in different life cycles of AI systems. At the same time, the responsibility of AI Actors should correspond with the nature, degree and extent of damage that may occur as a result of the use of AI technologies and systems. The role in the life cycle of the AI system, as well as the degree of possible and real influence of a particular AI Actor on causing damage and its extent, should also be taken into account. 2.3. Precautions. When the activities of AI Actors can lead to morally unacceptable consequences for individuals and society, which can be reasonably predicted by the relevant AI Actor, the latter, should take measures to prohibit or limit the occurrence of such consequences. AI Actors shall use the provisions of this Code, including the mechanisms specified in Section 2, to assess the moral unacceptability of such consequences and discuss possible preventive measures. 2.4. No harm. AI Actors should not allow the use of AI technologies for the purpose of causing harm to human life and or health, the property of citizens and legal entities and the environment. Any use, including the design, development, testing, integration or operation of an AI system capable of purposefully causing harm to the environment, human life and or health, the property of citizens and legal entities, is prohibited. 2.5. Identification of AI in communication with a human. AI Actors are encouraged to ensure that users are duly informed of their interactions with AI systems when it affects human rights and critical areas of people’s lives and to ensure that such interaction can be terminated at the request of the user. 2.6. Data security. AI Actors must comply with the national legislation in the field of personal data and secrets protected by law when using AI systems; ensure the security and protection of personal data processed by AI systems or by AI Actors in order to develop and improve the AI systems; develop and integrate innovative methods to counter unauthorized access to personal data by third parties and use high quality and representative datasets obtained without breaking the law from reliable sources. 2.7. Information security. AI Actors should ensure the maximum possible protection from unauthorized interference of third parties in the operation of AI systems; integrate adequate information security technologies, i.a. use internal mechanisms designed to protect the AI system from unauthorized interventions and inform users and developers about such interventions; as well as promote the informing of users about the rules of information security during the use of AI systems. 2.8. Voluntary certification and Code compliance. AI Actors may implement voluntary certification systems to assess the compliance of developed AI technologies with the standards established by the national legislation and this Code. AI Actors may create voluntary certification and labeling systems for AI systems to indicate that these systems have passed voluntary certification procedures and confirm quality standards. 2.9. Control of the recursive self improvement of AI systems. AI Actors are encouraged to cooperate in identifying and verifying information about ways and forms of design of so called universal ("general") AI systems and prevention of possible threats they carry. The issues concerning the use of "general" AI technologies should be under the control of the state.

Published by AI Alliance Russia in AI Ethics Code (revised version), Oct 21, 2022 (unconfirmed)