Even some of the best AI can’t beat this new benchmark

January 23, 2025

The nonprofit Center for AI Safety (CAIS) and Scale AI, a company that provides a number of data labeling and AI development services, have released a challenging new benchmark for frontier AI systems.

The benchmark, called Humanity’s Last Exam, consists of thousands of crowdsourced questions on subjects such as mathematics, the humanities, and the natural sciences. To make the evaluation harder, the questions come in multiple formats, including formats that incorporate diagrams and images.

In a preliminary study, not a single publicly available flagship AI system managed to score better than 10% on Humanity’s Last Exam.

CAIS and Scale AI say they plan to open up the benchmark to the research community so that researchers can “dig deeper into the variations” and evaluate new AI models.
