19) Capability Caution

There being no consensus, we should avoid strong assumptions regarding upper limits on future AI capabilities.
Principle: Asilomar AI Principles, Jan 3-8, 2017

Published by Future of Life Institute (FLI), Beneficial AI 2017

Related Principles

· (4) Security

Positive utilization of AI means that many social systems will be automated, and the safety of the systems will be improved. On the other hand, within the scope of today's technologies, it is impossible for AI to respond appropriately to rare events or deliberate attacks. Therefore, there is a new security risk for the use of AI. Society should always be aware of the balance of benefits and risks, and should work to improve social safety and sustainability as a whole. Society must promote broad and deep research and development in AI (from immediate measures to deep understanding), such as the proper evaluation of risks in the utilization of AI and research to reduce risks. Society must also pay attention to risk management, including cybersecurity awareness. Society should always pay attention to sustainability in the use of AI. Society should not, in particular, be uniquely dependent on a single AI or a few specified AIs.

Published by Cabinet Office, Government of Japan in Social Principles of Human-centric AI, Dec 27, 2018

· Measurement and Evaluation

We should develop comprehensive methods and techniques to operationalize these red lines prior to there being a meaningful risk of them being crossed. To ensure red line testing regimes keep pace with rapid AI development, we should invest in red teaming and automating model evaluation with appropriate human oversight. The onus should be on developers to convincingly demonstrate that red lines will not be crossed, such as through rigorous empirical evaluations, quantitative guarantees, or mathematical proofs.

Published by IDAIS (International Dialogues on AI Safety) in IDAIS-Beijing, May 10, 2024
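
The statement above calls for automating model evaluation with human oversight. As one possible illustration only, the minimal Python sketch below shows an automated gate that scores a model against red-line thresholds and escalates borderline cases to human reviewers; the red-line names, score functions, thresholds, and escalation steps are hypothetical assumptions for illustration and are not defined in the IDAIS statement.

```python
# Hypothetical sketch of an automated red-line evaluation gate with human
# oversight. Red-line names, thresholds, and actions are illustrative
# assumptions, not taken from the IDAIS statement.
from dataclasses import dataclass
from typing import Callable

@dataclass
class RedLine:
    name: str                             # e.g. a capability of concern
    score_fn: Callable[[object], float]   # automated evaluation of a model
    early_warning: float                  # escalate to human review above this
    hard_limit: float                     # treat as a crossed red line above this

def evaluate(model, red_lines: list[RedLine]) -> list[str]:
    """Run automated evaluations and return the required action per red line."""
    actions = []
    for rl in red_lines:
        score = rl.score_fn(model)
        if score >= rl.hard_limit:
            actions.append(f"BLOCK: {rl.name} red line crossed (score={score:.2f})")
        elif score >= rl.early_warning:
            actions.append(
                f"ESCALATE: {rl.name} near threshold (score={score:.2f}); "
                "require human review before further training or deployment"
            )
        else:
            actions.append(f"PASS: {rl.name} (score={score:.2f})")
    return actions
```

In such a setup, the automated scoring keeps pace with frequent model updates, while the ESCALATE path preserves the human oversight the statement asks for before any high-stakes decision.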

· Safety Assurance Framework

Frontier AI developers must demonstrate to domestic authorities that the systems they develop or deploy will not cross red lines such as those defined in the IDAIS-Beijing consensus statement.

To implement this, we need to build further scientific consensus on risks and red lines. Additionally, we should set early warning thresholds: levels of model capabilities indicating that a model may cross or come close to crossing a red line. This approach builds on and harmonizes the existing patchwork of voluntary commitments such as responsible scaling policies. Models whose capabilities fall below early warning thresholds require only limited testing and evaluation, while more rigorous assurance mechanisms are needed for advanced AI systems exceeding these early warning thresholds.

Although testing can alert us to risks, it only gives us a coarse-grained understanding of a model. This is insufficient to provide safety guarantees for advanced AI systems. Developers should submit a high-confidence safety case, i.e., a quantitative analysis that would convince the scientific community that their system design is safe, as is common practice in other safety-critical engineering disciplines. Additionally, safety cases for sufficiently advanced systems should discuss organizational processes, including incentives and accountability structures, to favor safety.

Pre-deployment testing, evaluation and assurance are not sufficient. Advanced AI systems may increasingly engage in complex multi-agent interactions with other AI systems and users. This interaction may lead to emergent risks that are difficult to predict. Post-deployment monitoring is a critical part of an overall assurance framework, and could include continuous automated assessment of model behavior, centralized AI incident tracking databases, and reporting of the integration of AI in critical systems. Further assurance should be provided by automated run-time checks, such as by verifying that the assumptions of a safety case continue to hold and safely shutting down a model if operated in an out-of-scope environment.

States have a key role to play in ensuring safety assurance happens. States should mandate that developers conduct regular testing for concerning capabilities, with transparency provided through independent pre-deployment audits by third parties granted sufficient access to developers’ staff, systems and records necessary to verify the developer’s claims. Additionally, for models exceeding early warning thresholds, states could require that independent experts approve a developer’s safety case prior to further training or deployment. Moreover, states can help institute ethical norms for AI engineering, for example by stipulating that engineers have an individual duty to protect the public interest similar to those held by medical or legal professionals. Finally, states will also need to build governance processes to ensure adequate post-deployment monitoring.

While there may be variations in Safety Assurance Frameworks required nationally, states should collaborate to achieve mutual recognition and commensurability of frameworks.

Published by IDAIS (International Dialogues on AI Safety) in IDAIS-Venice, Sept 5, 2024
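
The framework above mentions automated run-time checks that verify a safety case's assumptions still hold and safely shut a model down if it is operated out of scope. The Python sketch below is a minimal, hypothetical illustration of that idea; the specific assumptions tracked (allowed domains, context size, tool use) and the in-memory incident log are stand-ins chosen for this example, not mechanisms specified in the statement.

```python
# Hypothetical sketch of a run-time guard: re-verify safety-case assumptions
# on each request and shut the model down on out-of-scope operation. The
# tracked assumptions and the simple incident log are illustrative only.
from dataclasses import dataclass

@dataclass
class SafetyCaseAssumptions:
    allowed_domains: frozenset[str]   # deployment contexts covered by the safety case
    max_context_tokens: int           # largest input size that was evaluated
    tools_enabled: bool               # whether tool use was assessed

class RuntimeGuard:
    def __init__(self, assumptions: SafetyCaseAssumptions, incident_log: list):
        self.assumptions = assumptions
        self.incident_log = incident_log  # stand-in for a centralized incident database
        self.shut_down = False

    def check(self, domain: str, context_tokens: int, uses_tools: bool) -> bool:
        """Return True if the request is in scope; otherwise log an incident and shut down."""
        in_scope = (
            domain in self.assumptions.allowed_domains
            and context_tokens <= self.assumptions.max_context_tokens
            and (self.assumptions.tools_enabled or not uses_tools)
        )
        if not in_scope:
            self.incident_log.append(
                {"event": "out_of_scope", "domain": domain, "tokens": context_tokens}
            )
            self.shut_down = True  # stop serving until the safety case is re-assessed
        return in_scope
```

Routing every request through such a guard keeps the deployed system within the envelope its safety case actually argued about, and the logged incidents are the kind of signal a centralized incident tracking database could aggregate.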

3. Technical Leadership

To be effective at addressing AGI’s impact on society, OpenAI must be on the cutting edge of AI capabilities — policy and safety advocacy alone would be insufficient. We believe that AI will have broad societal impact before AGI, and we’ll strive to lead in those areas that are directly aligned with our mission and expertise.

Published by OpenAI in OpenAI Charter, Apr 9, 2018

3. Human-centric AI

AI should be at the service of society and generate tangible benefits for people. AI systems should always stay under human control and be driven by value based considerations. Telefónica is conscious of the fact that the implementation of AI in our products and services should in no way lead to a negative impact on human rights or the achievement of the UN’s Sustainable Development Goals. We are concerned about the potential use of AI for the creation or spreading of fake news, technology addiction, and the potential reinforcement of societal bias in algorithms in general. We commit to working towards avoiding these tendencies to the extent it is within our realm of control.

Published by Telefónica in AI Principles of Telefónica, Oct 30, 2018