Published August 26, 2024 | Version v1
Publication

AI in the Gray: Exploring Moderation Policies in Dialogic Large Language Models vs. Human Answers in Controversial Topics [Poster]

Description

The increasing sophistication of Large Language Models (LLMs), particularly ChatGPT, has revolutionized how users interact with information and make decisions. However, when addressing controversial topics without universally accepted answers, such as religion, gender identity, or freedom of speech, these models face the challenge of potential bias. Biased responses in these complex domains can amplify misinformation, fuel harmful ideologies, and undermine trust in AI systems. This paper investigates the biases embedded within LLMs like ChatGPT when responding to controversial questions. We use the Kialo social debate platform as a benchmark, comparing AI-generated responses to human discussions. Our analysis reveals significant progress in reducing explicit biases in recent ChatGPT versions. However, residual implicit biases, including subtle right-wing leanings, call for further moderation. These findings hold substantial cybersecurity implications, emphasizing the need to mitigate the spread of misinformation and the promotion of extremist viewpoints through AI-powered systems.

Additional details

Created:
August 27, 2024
Modified:
August 27, 2024