Help design the ultimate test to challenge artificial intelligence

Increasingly, artificial intelligence (AI) systems are beating humans on various tests, whether academic exams or batteries of tests designed specifically to assess AI performance. The arrival of OpenAI’s new o1 language model, which is capable of more complex reasoning, is further disrupting these tests.

To address this problem, the start-up Scale AI has partnered with the Center for AI Safety to create an ultimate test, dubbed “Humanity’s Last Exam”. The project aims to determine whether AI has reached the level of a human expert.

Questions submitted by the public

The test will consist of 1,000 specialized questions across different fields that are difficult for non-experts and whose answers cannot easily be found online. To create these questions, the organizers are appealing to the public. Anyone, preferably with five years of experience in a technical field or a PhD, is invited to submit questions that current AI cannot answer correctly. The correct answer must be accepted by other experts in the field and must not be subjective, and the question must not be a trick question. Some of the questions will be kept secret so that it is possible to detect whether an AI is simply memorizing the answers to public questions.

To submit a question, please use this online form before November 1. The authors of the top 50 questions will each receive a $5,000 prize, and the authors of the next 500 questions will each receive $500.
