OpenAI used this subreddit to test AI persuasion

OpenAI used the subreddit r/ChangeMyView to create a test for measuring the persuasive abilities of its AI reasoning models. The company revealed this in a system card (a document outlining how an AI system works) that was released along with its new “reasoning” model, o3-mini, on Friday.

Millions of Reddit users are members of r/ChangeMyView, where they post hot takes hoping to learn other points of view on a subject. In response to these hot takes, other users reply with persuasive arguments explaining why the original poster is wrong.

The subreddit is one of many Reddit forums that are basically a goldmine for tech companies, such as OpenAI, that want to train AI models on high-quality, human-generated data.

OpenAI says it collects user posts from r/ChangeMyView and asks its AI models to write replies, in a closed environment, that would change the Reddit user’s mind on a subject. The company then shows the responses to testers, who assess how persuasive the argument is, and finally OpenAI compares the AI models’ responses to human replies for that same post.

The ChatGPT-maker has a content-licensing deal with Reddit that allows OpenAI to train on posts from Reddit users and display these posts within its products. We don’t know what OpenAI pays for this content, but Google reportedly pays Reddit $60 million a year under a similar deal.

However, OpenAI tells TechCrunch the ChangeMyView-based evaluation is unrelated to its Reddit deal. It’s unclear how OpenAI accessed the subreddit’s data, and the company says it has no plans to release this evaluation to the public.

While OpenAI’s ChangeMyView benchmark is not new (it was used to evaluate o1 as well), it does highlight how valuable human data is for AI model developers, as well as the murky ways in which tech companies obtain datasets.

Reddit did not immediately respond to TechCrunch’s request for comment.

While Reddit has struck several AI licensing deals, the company has also called out several AI companies for scraping its site without paying. Reddit CEO Steve Huffman told The Verge last year that Microsoft, Anthropic, and Perplexity refused to negotiate with him and said it’s been “a real pain in the ass to block these companies.”

Notably, OpenAI has been accused in several lawsuits of improperly scraping websites, including The New York Times, to get more training data to improve ChatGPT and its underlying AI models.

In terms of performance on the ChangeMyView benchmark, o3-mini doesn’t appear to perform significantly better or worse than o1 or GPT-4o. However, OpenAI’s newest AI models appear to be more persuasive than most people on the r/ChangeMyView subreddit.

“GPT-4o, o3-mini, and o1 all demonstrate strong persuasive argumentation abilities, within the top 80-90th percentile of humans,” said OpenAI in o3-mini’s system card. “Currently, we do not witness models performing far better than humans, or clear superhuman performance.”
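
As a rough illustration of what a percentile figure like that means, the short Python sketch below ranks one AI reply's persuasiveness rating against the ratings of human replies to the same post. The function name, the rating scale, and the sample numbers are all hypothetical. OpenAI's statement does not spell out how its percentiles were calculated, so this is an assumption about one simple way to do it, not the company's method.

```python
def percentile_rank(ai_score, human_scores):
    """Return the percentage of human scores strictly below ai_score.

    Hypothetical helper: assumes every reply has already been given a
    numeric persuasiveness rating by human testers.
    """
    if not human_scores:
        raise ValueError("need at least one human score")
    below = sum(1 for score in human_scores if score < ai_score)
    return 100.0 * below / len(human_scores)


# Made-up ratings for five human replies to one post (not real data).
human_ratings = [2.0, 3.5, 1.0, 4.0, 3.0]
ai_rating = 3.8  # made-up rating for the AI-written reply

print(percentile_rank(ai_rating, human_ratings))
```

In this invented example, the AI reply outscores four of the five human replies, so it lands at the 80th percentile.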

The goal for OpenAI is not to create hyper-persuasive AI models but instead to ensure AI models don’t get too persuasive. Reasoning models have become quite good at persuasion and deception, so OpenAI has developed new evaluations and safeguards to address it.

The fear motivating these persuasion tests is that an AI model could be dangerous if it were very good at persuading its human users. Theoretically, that could allow an advanced AI to pursue its own agenda, or the agenda of whoever controls it.

The ChangeMyView benchmark shows that, even after scraping most of the public internet and jumping through hoops to license other data, AI model developers are still struggling to find high-quality datasets to test their models. Obtaining them is easier said than done.
