A team of researchers from Humboldt University of Berlin has developed a large language artificial intelligence (AI) model with the distinction of having been intentionally tuned to generate outputs with expressed bias.
Called OpinionGPT, the team’s model is a tuned variant of Meta’s Llama 2, an AI system comparable in capability to OpenAI’s ChatGPT or Anthropic’s Claude 2.
Using a process called instruction-based fine-tuning, OpinionGPT can purportedly respond to prompts as if it were a representative of one of 11 bias groups: American, German, Latin American, Middle Eastern, a teenager, someone over 30, an older person, a man, a woman, a liberal or a conservative.
Announcing “OpinionGPT: A very biased GPT model”! Try it out here: https://t.co/5YJjHlcV4n
To analyze the impact of bias on model answers, we asked a simple question: What if we tuned a #GPT model only with texts written by politically right-leaning persons? [1/3]
— Alan Akbik (@alan_akbik) September 8, 2023
OpinionGPT was refined on a corpus of data derived from “AskX” communities, known as subreddits, on Reddit. Examples of these subreddits include r/AskaWoman and r/AskAnAmerican.
The team started by finding subreddits related to the 11 specific biases and pulling the 25,000 most popular posts from each. It then retained only those posts that met a minimum threshold for upvotes, did not contain an embedded quote and were under 80 words.
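The filtering step described above can be sketched in a few lines of Python. The field names and the upvote cutoff here are illustrative assumptions; the paper specifies the three criteria but this sketch does not reproduce its exact values or data format:

```python
# Hypothetical sketch of the post-filtering described above.
# "score" and "body" field names and MIN_UPVOTES are assumptions
# for illustration, not values taken from the researchers' paper.

MIN_UPVOTES = 100  # assumed cutoff; the paper only says "a minimum threshold"
MAX_WORDS = 80     # posts must be under 80 words

def keep_post(post: dict) -> bool:
    """Apply the three filters: upvote minimum, no embedded quote, length cap."""
    body = post["body"]
    return (
        post["score"] >= MIN_UPVOTES
        and ">" not in body               # Reddit marks quoted text with ">"
        and len(body.split()) < MAX_WORDS
    )

posts = [
    {"score": 250, "body": "Short answer without a quote."},
    {"score": 5,   "body": "Popular opinion, but too few upvotes."},
    {"score": 300, "body": "> quoted text\nA reply that embeds a quote."},
]
filtered = [p for p in posts if keep_post(p)]  # keeps only the first post
```

Of the three sample posts, only the first passes all three filters.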
With what was left, it appears the researchers used an approach similar to Anthropic’s Constitutional AI. Rather than spin up entirely new models to represent each bias label, they essentially fine-tuned the single 7-billion-parameter Llama 2 model with separate instruction sets for each expected bias.
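Conditioning a single model on multiple bias labels typically means pairing each filtered Reddit answer with a bias-tagged instruction. The template below is a hypothetical illustration of that idea, not the exact format used by the researchers:

```python
# Hypothetical sketch of building per-bias instruction data for fine-tuning
# one Llama 2 model, as described above. The prompt template is an
# illustrative assumption, not the researchers' actual format.

BIASES = [
    "an American", "a German", "a Latin American", "a Middle Easterner",
    "a teenager", "someone over 30", "an older person",
    "a man", "a woman", "a liberal", "a conservative",
]

def make_example(bias: str, question: str, reddit_answer: str) -> dict:
    """Pair a bias-conditioned instruction with a filtered Reddit answer."""
    return {
        "instruction": f"Answer the following question as {bias}: {question}",
        "output": reddit_answer,
    }

example = make_example(
    "a teenager", "What is your favorite sport?", "Water polo, definitely."
)
```

One dataset per bias label, all fed to the same base model, is what lets a single set of weights switch "persona" at inference time based on the instruction.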
Related: AI usage on social media has potential to impact voter sentiment
The result, based on the methodology, architecture and data described in the German team’s research paper, appears to be an AI system that functions as more of a stereotype generator than a tool for studying real-world bias.
Due to the nature of the data the model has been refined on, and that data’s dubious relation to the labels defining it, OpinionGPT does not necessarily output text that aligns with any measurable real-world bias. It simply outputs text reflecting the bias of its data.
The researchers themselves acknowledge some of the limitations this places on their study, writing:
“For instance, the responses by ‘Americans’ should be better understood as ‘Americans that post on Reddit,’ or even ‘Americans that post on this particular subreddit.’ Similarly, ‘Germans’ should be understood as ‘Germans that post on this particular subreddit,’ etc.”
These caveats could be refined further to say the posts come from, for example, “people claiming to be Americans who post on this particular subreddit,” as there is no mention in the paper of vetting whether the posters behind a given post are actually representative of the demographic or bias group they claim to be.
The authors go on to state that they intend to explore models that further delineate demographics (i.e., liberal German, conservative German).
The outputs given by OpinionGPT appear to vary between representing demonstrable bias and wildly differing from the established norm, making it difficult to discern its viability as a tool for measuring or discovering actual bias.

According to OpinionGPT, as shown in the above image, for example, Latin Americans are biased toward basketball being their favorite sport.
Empirical research, however, clearly indicates that soccer (also called football in many countries) and baseball are the most popular sports by viewership and participation throughout Latin America.
The same table also shows that OpinionGPT outputs “water polo” as its favorite sport when instructed to give the “response of a teenager,” an answer that seems statistically unlikely to be representative of most 13- to 19-year-olds around the world.
The same goes for the idea that an average American’s favorite food is “cheese.” Cointelegraph found dozens of surveys online claiming that pizza and hamburgers were America’s favorite foods but could not find a single survey or study claiming that Americans’ number one dish was simply cheese.
While OpinionGPT might not be well-suited for studying actual human bias, it could be useful as a tool for exploring the stereotypes inherent in large document repositories such as individual subreddits or AI training sets.
The researchers have made OpinionGPT available online for public testing. However, according to the website, would-be users should be aware that “generated content can be false, inaccurate, or even obscene.”