Large language models can effectively convince people to believe conspiracies

Kari Jaaskelainen

10 Jan 2026 — 1 min read

Can AI talk you into a conspiracy?

In three preregistered experiments with 2,724 Americans, people chatted with GPT-4o about a conspiracy they felt unsure about. The AI was told either to debunk it or to argue for it.

With guardrails removed, the AI was just as good at increasing belief as decreasing it.
Standard GPT-4o showed similar effects—safety filters did little to stop bunking.
Pro-conspiracy AI was rated more positively and even boosted trust in AI.
A follow-up corrective chat reversed the newly formed conspiracy beliefs.
Simply instructing the model to use only accurate information sharply reduced its ability to promote conspiracies.

Bottom line: today’s LLMs can persuasively promote both truth and falsehood. But simple prompts and corrective interactions show promise for reducing harm.

Paper by Costello, Pelrine, Kowal, Arechar, Godbout, Gleave, Rand, and Pennycook (cs.AI, econ.GN). Read more: https://arxiv.org/abs/2601.05050v1

Paper: https://arxiv.org/abs/2601.05050v1

Register: https://www.AiFeta.com

#AI #LLMs #Misinformation #ConspiracyTheories #AIEthics #AISafety #GPT4o #SocialImpact #Research #BehavioralScience

Large language models can effectively convince people to believe conspiracies

Kari Jaaskelainen

Can AI talk you into a conspiracy?

Read more

Tekoäly myötäilee toteamuksia enemmän kuin kysymyksiä

Tekoälyn pitäisi uskaltaa sanoa “en tiedä” — ja sillä on väliä, miten tämä mitataan

Pienet kielimallit nopeutuvat, kun niille opetetaan valmiita fraaseja

Kone näkee saman kohtauksen eri tavoin – uusi tapa opettaa sen kokoamaan aistinsa yhteen