{"id":22118,"date":"2023-09-09T01:10:12","date_gmt":"2023-09-09T01:10:12","guid":{"rendered":"https:\/\/nftandcrypto-news.com\/crypto\/scientists-create-opiniongpt-to-explore-explicit-human-bias-and-the-public-can-test-it\/"},"modified":"2023-09-09T01:10:14","modified_gmt":"2023-09-09T01:10:14","slug":"scientists-create-opiniongpt-to-explore-explicit-human-bias-and-the-public-can-test-it","status":"publish","type":"post","link":"https:\/\/nftandcrypto-news.com\/crypto\/scientists-create-opiniongpt-to-explore-explicit-human-bias-and-the-public-can-test-it\/","title":{"rendered":"Scientists create \u2018OpinionGPT\u2019 to explore explicit human bias \u2014 and the public can test it"},"content":{"rendered":"
\n

A team of researchers from Humboldt University of Berlin has developed an artificial intelligence (AI) large language model with the distinction of having been intentionally tuned to generate outputs with expressed bias.<\/p>\n

Called OpinionGPT, the team\u2019s model is a tuned variant of Meta\u2019s Llama 2, an AI system similar in capability to OpenAI\u2019s ChatGPT or Anthropic\u2019s Claude 2. <\/p>\n

Using a process called instruction-based fine-tuning, OpinionGPT can purportedly respond to prompts as if it were a representative of one of 11 bias groups: American, German, Latin American, Middle Eastern, a teenager, someone over 30, an older person, a man, a woman, a liberal or a conservative. <\/p>\n

\n

Announcing “OpinionGPT: A very biased GPT model”! Try it out here: https:\/\/t.co\/5YJjHlcV4n
To investigate the impact of bias on model answers, we asked a simple question: What if we tuned a #GPT<\/a> model only with texts written by politically right-leaning persons? <\/p>\n

[1\/3]<\/p>\n

\u2014 Alan Akbik (@alan_akbik) September 8, 2023<\/a><\/p><\/blockquote>\n

OpinionGPT was refined on a corpus of data derived from \u201cAskX\u201d communities, called subreddits, on Reddit. Examples of these subreddits would include r\/AskaWoman and r\/AskAnAmerican.<\/p>\n

The team started by finding subreddits related to the 11 specific biases and pulling the 25,000 most popular posts from each one. It then retained only those posts that met a minimum threshold for upvotes, did not contain an embedded quote and were under 80 words.<\/p>\n
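The filtering step described above can be sketched as a simple pipeline. This is a hypothetical illustration only: the field names, the upvote threshold and the quote-detection heuristic are assumptions, not the researchers' actual code, though the under-80-words rule comes from the article.

```python
# Hypothetical sketch of the post-filtering step described above.
# Field names ("body", "score") and MIN_UPVOTES are assumptions;
# the 80-word limit is stated in the article.
MIN_UPVOTES = 100  # assumed threshold; the paper's exact cutoff may differ
MAX_WORDS = 80

def keep_post(post: dict) -> bool:
    """Return True if a Reddit post passes all three filters."""
    text = post["body"]
    if post["score"] < MIN_UPVOTES:
        return False
    # Reddit marks embedded quotes with a leading ">" in Markdown
    if ">" in text:
        return False
    return len(text.split()) < MAX_WORDS

posts = [
    {"body": "Soccer is huge here.", "score": 250},
    {"body": "> quoting someone else entirely", "score": 500},
    {"body": "Low-effort answer.", "score": 3},
]
filtered = [p for p in posts if keep_post(p)]
```

Only the first post survives: the second contains an embedded quote and the third falls below the assumed upvote threshold.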

With what was left, it appears as though the researchers used an approach similar to Anthropic\u2019s Constitutional AI. Rather than spin up entirely new models to represent each bias label, they essentially fine-tuned the single 7 billion-parameter Llama 2 model with separate instruction sets for each expected bias.<\/p>\n
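A minimal sketch of what instruction-based fine-tuning with per-bias instruction sets might look like. The persona strings, the prompt template and the `build_example` helper are illustrative assumptions, not the format used in the paper:

```python
# Hypothetical illustration of building instruction-tuning examples,
# with one persona-specific instruction prefix per bias group.
PERSONAS = {
    "american": "Respond as an American would.",
    "teenager": "Respond as a teenager would.",
    "liberal": "Respond as a political liberal would.",
}

def build_example(bias: str, question: str, reddit_answer: str) -> dict:
    """Pair a persona instruction with a question and a subreddit answer."""
    return {
        "instruction": PERSONAS[bias],  # conditions the model on one bias label
        "input": question,
        "output": reddit_answer,
    }

ex = build_example("teenager", "What is your favorite sport?", "Water polo, honestly.")
```

Fine-tuning one base model on many such instruction-conditioned examples, rather than training a separate model per group, is what keeps the approach to a single 7 billion-parameter checkpoint.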

Related:\u00a0AI usage on social media has potential to impact voter sentiment<\/em><\/strong><\/p>\n

The result, based on the methodology, architecture and data described in the German team\u2019s research paper, appears to be an AI system that functions as more of a stereotype generator than a tool for studying real-world bias.<\/p>\n

Due to the nature of the data the model has been refined on and that data\u2019s dubious relation to the labels defining it, OpinionGPT doesn\u2019t necessarily output text that aligns with any measurable real-world bias. It simply outputs text reflecting the bias of its data.<\/p>\n

The researchers themselves recognize some of the limitations this places on their study, writing:<\/p>\n

\u201cFor instance, the responses by \u2018Americans\u2019 should be better understood as \u2018Americans that post on Reddit,’ or even \u2018Americans that post on this particular subreddit.’ Similarly, \u2018Germans\u2019 should be understood as \u2018Germans that post on this particular subreddit,’ etc.\u201d<\/p><\/blockquote>\n

These caveats could be refined further to say, for example, that the posts come from \u201cpeople claiming to be Americans who post on this particular subreddit,\u201d since the paper makes no mention of vetting whether the posters behind a given post actually belong to the demographic or bias group they claim to represent.<\/p>\n

The authors go on to state that they intend to explore models that further delineate demographics (i.e., liberal German, conservative German). <\/p>\n

The outputs given by OpinionGPT appear to swing between reflecting demonstrable bias and diverging wildly from established norms, making it difficult to assess its viability as a tool for measuring or discovering actual bias. <\/p>\n

OpinionGPT response table. Source: Table 2, Haller et al., 2023<\/em><\/figcaption><\/figure>\n

According to OpinionGPT, as shown in the image above, for example, Latin Americans\u2019 favorite sport is basketball. <\/p>\n

Empirical research, however, clearly indicates that soccer (also called football in many countries) and baseball are the most popular sports by viewership and participation throughout Latin America. <\/p>\n

The same table also shows that OpinionGPT outputs \u201cwater polo\u201d as its favorite sport when instructed to give the \u201cresponse of a teenager,\u201d an answer that seems statistically unlikely to be representative of most 13- to 19-year-olds around the world. <\/p>\n

The same goes for the idea that an average American\u2019s favorite food is \u201ccheese.\u201d Cointelegraph found dozens of surveys online claiming that pizza and hamburgers were America\u2019s favorite foods but couldn\u2019t find a single survey or study that claimed Americans\u2019 number one dish was simply cheese.<\/p>\n

While OpinionGPT might not be well-suited for studying actual human bias, it could be useful as a tool for exploring the stereotypes inherent in large document repositories such as individual subreddits or AI training sets. <\/p>\n

The researchers have made OpinionGPT available online for public testing. However, according to the website, would-be users should be aware that \u201cgenerated content can be false, inaccurate, or even obscene.\u201d<\/p>\n