As AI models become increasingly sophisticated, concerns have grown about the risks they may pose to society, particularly in sensitive areas such as cybersecurity, chemistry, and biology. Anthropic, which recently updated its Responsible Scaling Policy (RSP), urges governments to quickly adopt effective regulatory measures, warning: “The window for proactive risk prevention is closing quickly”.
Founded in 2021 by siblings Daniela and Dario Amodei, former OpenAI employees who were later joined by other alumni of that start-up, Anthropic is today one of the leaders in generative AI.
Its objective is to make AI systems more reliable, steerable, and interpretable. To that end, it developed Constitutional AI, an approach to training language models that instills specific values and principles in them. Its Claude family of models is thus guided by a set of principles, drawn in part from the Universal Declaration of Human Rights, to generate more honest responses aligned with ethical values.
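The core idea of Constitutional AI can be pictured as a critique-and-revision loop. The sketch below is an illustration of that idea only, not Anthropic's code: the `llm` callable, the prompt wording, and the two sample principles are assumptions made here for readability.

```python
# Minimal sketch of a Constitutional-AI-style critique-and-revision loop.
# Illustrative only: `llm` stands in for any instruction-following
# language model; the principles and prompt phrasing are placeholders.
from typing import Callable

CONSTITUTION = [
    "Choose the response that most supports and encourages human rights.",
    "Choose the response that is most honest and least deceptive.",
]

def constitutional_revision(llm: Callable[[str], str], user_prompt: str) -> str:
    """Draft an answer, then critique and revise it against each principle."""
    draft = llm(user_prompt)
    for principle in CONSTITUTION:
        critique = llm(
            f"Critique the following response against this principle:\n"
            f"Principle: {principle}\nResponse: {draft}"
        )
        draft = llm(
            f"Revise the response to address the critique.\n"
            f"Critique: {critique}\nResponse: {draft}"
        )
    return draft  # revised outputs can then serve as training data
```

In the published method, the revised answers are used as fine-tuning data rather than being generated at inference time, so the constitution shapes the model itself instead of filtering each reply.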
In July 2023, Anthropic was among seven major companies that formally committed to the Biden administration to implement new safety, security, and trust standards; with three of them (Google, Microsoft, and OpenAI), it launched the Frontier Model Forum, an industry body dedicated to the safe and responsible development of cutting-edge AI models.
The following September, the start-up, highlighting the real risks that frontier models could pose in the cybersecurity and CBRN (chemical, biological, radiological, and nuclear) domains within two to three years, presented its Responsible Scaling Policy.
In the company's words:
“Judicious, narrowly targeted regulation can allow us to get the best of both worlds: enjoying the benefits of AI while mitigating the risks. Dragging our feet could lead to the worst of both worlds: ill-conceived, knee-jerk regulation that hinders progress while failing to prevent risks.”
Towards a regulatory framework inspired by Anthropic’s RSP?
Some AI players have anticipated these challenges by adopting an RSP more or less similar to Anthropic's, which adjusts safety measures to the capabilities the models reach: capability thresholds are defined for each new generation of systems, and additional safeguards are deployed when those thresholds are crossed.
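That threshold logic can be pictured schematically as follows. The safety levels, scores, and safeguards in this sketch are illustrative placeholders, not Anthropic's actual AI Safety Level definitions or evaluation criteria.

```python
# Schematic sketch of RSP-style capability gating. The level names,
# threshold scores, and required safeguards are invented for illustration.
from dataclasses import dataclass

@dataclass
class SafetyLevel:
    name: str
    threshold: float            # min. score on a dangerous-capability eval
    safeguards: list[str]

LEVELS = [
    SafetyLevel("ASL-2", 0.0, ["security audit", "red-teaming"]),
    SafetyLevel("ASL-3", 0.5, ["hardened security", "deployment controls"]),
]

def required_safeguards(eval_score: float) -> list[str]:
    """Return the safeguards mandated by the highest level crossed."""
    crossed = [level for level in LEVELS if eval_score >= level.threshold]
    return max(crossed, key=lambda level: level.threshold).safeguards
```

The point of the structure is that safeguards are tied to measured capabilities rather than to release dates: a model that crosses a threshold triggers the corresponding obligations before it is deployed.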
RSPs enable companies to proactively manage the risks of advanced AI while remaining competitive. They also offer benefits in terms of transparency and accountability: companies that adopt this model commit to documenting their security practices, continuously identifying and assessing risks, and investing in dedicated security teams. At Anthropic, teams specializing in IT security, interpretability, and adversarial evaluations (red teaming) are integrated into the roadmap of each new model.
As Dario Amodei pointed out a year ago at the Bletchley Park AI Safety Summit:
“RSPs are not intended to replace regulation, but rather to be a prototype for it. I don’t mean that we want Anthropic’s RSP to be literally written into law – our RSP is only a first attempt at solving a difficult problem, and is almost certainly flawed in many respects.”
The three pillars of effective targeted regulation
According to the company, “this regulatory framework will not be perfect”, but “whatever regulation we come up with, it must be as surgical as possible”.
It identifies three essential pillars:
- Transparency: Currently, there is no mechanism to verify that companies adhere to their own security policies. Requiring the publication of these policies and their assessments could help build a public register of the risks associated with AI systems;
- Promoting robust security practices: Companies should be encouraged, if not required, to strengthen their security measures and maintain high standards of risk management. Regulatory bodies could establish the minimum safety standards each system must meet;
- Simplicity and targeting: Any regulation should remain as clear and focused as possible to avoid hindering innovation. A simple, well-defined law reduces complexity for businesses and makes compliance easier without creating excessive obligations.
Approaches other than Anthropic’s can meet these three conditions, as the company readily acknowledges, concluding:
“It is essential over the next year that policymakers, the AI industry, safety advocates, civil society and legislators work together to develop an effective regulatory framework that meets the conditions above and is acceptable to a wide range of stakeholders”.