On Monday, OpenAI unveiled what it hopes will be a more scientific approach to assessing the catastrophic risks that advanced AI tools may pose.
While there has been plenty of hand-wringing about potentially life-threatening risks, there has been relatively little discussion of how to actually prevent such harm from happening.
“We really need to have science,” said Aleksander Madry, OpenAI's head of preparedness.
The 27-page preparedness framework proposes a matrix approach: documenting the level of risk a frontier model poses across multiple categories, including the likelihood that bad actors could use the model to develop malware, spread information about harmful biological or nuclear weapons, or carry out social-engineering attacks.
OpenAI will score each of these risks as low, medium, high, or critical, both before and after mitigations are applied in each risk category.
Only models scoring medium or below after mitigation will be eligible for deployment, and the company will halt development on any model whose risks cannot be reduced below critical.
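The deployment gate described above can be sketched as a small decision rule. This is purely illustrative and not OpenAI's actual implementation; the function names, category names, and data layout are assumptions for the sake of the example:

```python
# Illustrative sketch of the framework's deployment gate, as described
# publicly: a model ships only if every post-mitigation risk category
# scores "medium" or below, and development halts if any risk stays
# at "critical". All identifiers here are hypothetical.

RISK_LEVELS = ["low", "medium", "high", "critical"]  # ordered by severity

def can_deploy(post_mitigation_scores):
    """True only if every risk category scores medium or below."""
    threshold = RISK_LEVELS.index("medium")
    return all(RISK_LEVELS.index(s) <= threshold
               for s in post_mitigation_scores.values())

def must_halt_development(post_mitigation_scores):
    """True if any risk cannot be reduced below critical."""
    return any(s == "critical" for s in post_mitigation_scores.values())

# Example: one category remains high after mitigation.
scores = {"cybersecurity": "medium", "biological": "low", "persuasion": "high"}
print(can_deploy(scores))             # False: "persuasion" is above medium
print(must_halt_development(scores))  # False: nothing is critical
```

Ordering the levels in a list makes the severity comparison explicit, so the same rule extends cleanly if more granular levels were ever added.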
While OpenAI's CEO will be responsible for day-to-day decisions, the board will be kept informed of risk findings and may overrule the CEO's decisions.