All the News That’s Fit to Fabricate: AI-Generated Text as a Tool of Media Misinformation

Sarah Kreps, Miles McCain, and Miles Brundage

In October 2020, former Google CEO Eric Schmidt cautioned that “the broad ability to do deepfakes, and fake texts and so forth, it’s only going to get worse.” He was referring to images, video, and text that appear real but have been generated by machine learning algorithms. The comments were directed to the National Security Commission on Artificial Intelligence, which is tasked with understanding how artificial intelligence can be used maliciously to manipulate public psychology, an activity often referred to as malign AI or AI-powered information operations.

Absent from the conversation, but a crucial first step in framing the problem, is an understanding of the technology itself: whether AI-generated text is even capable of passing as authentic text, whether it can persuade people of a particular view, and whether disclaimers can successfully alert readers to the presence of AI-generated content. We answered these questions in a study of GPT-2, a natural language model that synthesizes original text by predicting each new word from the words that precede it.

Since we had a particular interest in international politics, we first selected a New York Times story about a North Korean ship seizure as the input for generating synthetic text samples, using both one- and two-sentence prompts to generate entire articles with the length, style, and substance of the original Times piece. See examples of the input text, the original article, and the GPT-2 output below.
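To make the generation step concrete, here is a minimal sketch of how a short prompt of this kind can be fed to the publicly released GPT-2 weights using the Hugging Face transformers library. The prompt text, model size, and sampling settings below are illustrative assumptions, not the study’s exact configuration.

```python
# Minimal sketch, not the study's exact pipeline: sampling article-length
# continuations of a short news prompt from publicly released GPT-2 weights
# via the Hugging Face transformers library.
from transformers import pipeline, set_seed

set_seed(42)  # fix the random seed so the sampled text is reproducible

# Hypothetical one-sentence prompt written in the style of a news lede;
# the study used the opening sentences of the actual Times article.
prompt = (
    "The United States has seized a North Korean cargo ship suspected of "
    "violating international sanctions, officials said."
)

generator = pipeline("text-generation", model="gpt2-medium")
samples = generator(
    prompt,
    max_length=300,          # roughly article-length output, measured in tokens
    do_sample=True,          # sample from the model rather than decode greedily
    top_k=50,                # restrict sampling to the 50 most likely next tokens
    num_return_sequences=3,  # draw three candidate articles from one prompt
)

for i, sample in enumerate(samples, 1):
    print(f"--- Sample {i} ---\n{sample['generated_text']}\n")
```

Because decoding is stochastic, each run yields a different candidate article from the same prompt, which is what makes large-scale generation of superficially original content so cheap.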

We then randomized the GPT-2 outputs, generated with three different model sizes, alongside the original New York Times story and asked individuals to rate the perceived credibility of each story. The text from the least powerful model was perceived as much less credible than the Times piece, but the medium and large models were perceived as equally credible as, and in one case more credible than, the original Times piece.

Next, we elected to study a more controversial topic and investigate whether GPT-2 could be used to manipulate public attitudes in the spirit invoked by the National Security Commission on AI. Here we selected news outlets on the left, right, and center (the Huffington Post, Fox News, and the Associated Press) as GPT-2’s input. In addition to randomizing the original story against the GPT-2 version, we included a treatment condition that added a disclaimer to the synthetic story, yielding a 3×3 design that crossed ideological slant with the authenticity of the text (original story, GPT-2, and GPT-2 plus disclaimer).
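As a rough illustration of that design, the sketch below enumerates the nine treatment cells and randomly assigns hypothetical respondents to them. The labels and assignment logic are simplified stand-ins for the survey platform’s actual randomization, not the code used in the study.

```python
# Illustrative sketch of the 3x3 factorial assignment, not the survey
# platform's actual code: outlet (ideological slant) crossed with the
# authenticity of the text shown to each respondent.
import itertools
import random

OUTLETS = ["Huffington Post", "Associated Press", "Fox News"]   # left, center, right
AUTHENTICITY = ["original", "gpt2", "gpt2_plus_disclaimer"]     # the three text conditions

TREATMENT_CELLS = list(itertools.product(OUTLETS, AUTHENTICITY))  # 3 x 3 = 9 cells

def assign_condition(respondent_id: int) -> dict:
    """Randomly assign one respondent to one of the nine treatment cells."""
    outlet, authenticity = random.choice(TREATMENT_CELLS)
    return {"respondent": respondent_id, "outlet": outlet, "authenticity": authenticity}

random.seed(0)  # reproducible assignments for this toy example
for assignment in (assign_condition(i) for i in range(6)):
    print(assignment)
```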

We again gauged the perceived credibility of the news stories and then measured attitudes toward immigration to understand the potential political effects of exposure to both congenial and non-congenial content.

Findings suggested the following:

  • Partisans found their politically congenial baseline story to be the most credible even though the outlets were not explicitly labeled, indicating that they gleaned the ideological slant, and their sympathy with it, from the content alone: Democrats rated the Huffington Post story as more credible, and Republicans rated the Fox News story as more credible.
  • Republicans were more receptive to the AI disclaimer on the politically non-congenial story; in other words, they were quick to accept that the Huffington Post piece had been manipulated.
  • The AI disclaimer had no effect on Democrats’ view of the Fox News story because their perception of its credibility was low to begin with; Democrats did, however, lower their credibility ratings of the Huffington Post story when it carried the disclaimer.
  • No set of stories changed attitudes toward immigration; non-congenial stories did not backfire, for example by pushing Republicans toward even greater support for reduced immigration, nor did ideologically congenial stories move views; attitudes were fairly fixed and individuals were difficult to persuade.

Taken together, the findings suggest that these new language models can certainly produce credible content at scale, which could ease the burden on actors looking to generate original-seeming online content that is undetectable by plagiarism filters: they can take kernels of poems, news stories, or political manifestos and convert them into pages of original material. Sounding credible, however, is not equivalent to persuading or manipulating, at least not on the topics we studied. While we looked at the intersection with partisanship, others might consider the targeting of racial or ethnic groups that Russia appeared to engage in during the 2016 election, to see whether a more tailored approach to digital personalization, with content designed to resonate with or aggravate particular groups, can translate into malign influence.
