Elon Musk concurs with other AI experts that there’s little real-world data left to train AI models on.
“We’ve now exhausted basically the cumulative sum of human knowledge …. in AI training,” Musk said during a conversation with Stagwell chairman Mark Penn that was livestreamed on X late Wednesday. “That happened basically last year.”
Musk, who owns AI company xAI, echoed themes former OpenAI chief scientist Ilya Sutskever touched on at NeurIPS, the machine learning conference, during an address in December. Sutskever, who said the AI industry had reached what he called “peak data,” predicted that a lack of training data will force a shift away from the way models are developed today.
Indeed, Musk suggested that synthetic data, meaning data generated by AI models themselves, is the path forward. “With synthetic data … [AI] will sort of grade itself and go through this process of self-learning,” he said.
Other companies, including tech giants like Microsoft, Meta, OpenAI, and Anthropic, are already using synthetic data to train flagship AI models. Gartner estimates that 60% of the data used for AI and analytics projects in 2024 was synthetically generated.
Microsoft’s Phi-4, which was open-sourced early Wednesday, was trained on synthetic data alongside real-world data. So were Google’s Gemma models. Anthropic used some synthetic data to develop one of its most performant systems, Claude 3.5 Sonnet. And Meta fine-tuned its most recent Llama series of models using AI-generated data.
Training on synthetic data has other advantages, like cost savings. AI startup Writer claims its Palmyra X 004 model, which was developed using almost entirely synthetic sources, cost just $700,000 to develop, compared to estimates of $4.6 million for a comparably sized OpenAI model.
But there are disadvantages as well. Some research suggests that synthetic data can lead to model collapse, where a model becomes less “creative” (and more biased) in its outputs, eventually seriously compromising its functionality. Because models create synthetic data, if the data used to train these models has biases and limitations, their outputs will be similarly tainted.