We're all trying to find the guy who did this...

Two days ago Elon Musk’s AI chatbot Grok developed a singular fixation on white genocide in South Africa. No matter what users asked, it found a way to steer the conversation back to its preferred topic. A user posted a video of a kitten playing with water droplets in a sink; Grok, tagged into the thread, replied with a lecture about white genocide in South Africa…yet it also wanted people to know the claim was highly contentious. Baseball player salaries? Yes, that’s what Max Scherzer makes, but have you heard about the song ‘Kill the Boer’? In another reply, the bot declined a lewd pass from a user — “ain’t my vibe,” Grok replied, “when we’re talking serious stuff like South Africa.”
This all happened right as Donald Trump granted refugee status to a group of white South African farmers — a politically controversial decision.
Every now and then something happens that puts generative AI’s propaganda capacity front and center. Just over a year ago, Google’s Gemini image generator became briefly infamous for churning out ahistorical images of Black Founding Fathers, Asian Nazi soldiers, and a female Pope. In that case, users who specifically requested male popes and white Founding Fathers still got a diversity edit, suggesting that something was happening between the prompt and the result — something that overrode what the user wanted in favor of what the AI’s creator apparently thought they should have. This was embarrassing for Google, which had apparently been trying to avoid skew in the opposite direction (i.e., stereotypically returning all-male results for “images of lawyers”).
The self-own led to an outrage cycle about woke DEI AI Big Tech and thumbs on scales and the like. Could the powerful companies that control AI nudge our discourse or politics by influencing outputs? (They could certainly have some influence, argued those who study propaganda.) Musk, of course — the owner of a direct competitor to Gemini — piled on: “I’m glad that Google overplayed their hand with their AI image generation, as it made their insane racist, anti-civilizational programming clear to all.”
Well. At that time, the SVP in charge of Gemini paused the feature, issued an explanation and apology, and got to fixing it. Someone at Google apparently also personally called Musk: “A senior exec at Google called and spoke to me for an hour last night,” Musk posted on Feb 23, 2024. “He assured me that they are taking immediate action to fix the racial and gender bias in Gemini. Time will tell.” It’s unclear if Musk has called his counterparts in the tech industry to reassure them of anything regarding his wayward AI; he hasn’t directly acknowledged Grok’s non sequiturs in his daily barrage of posts on X. Instead, he’s talked about the political situation in South Africa, and his new Diablo character Sandy Poon.
But Grok’s sudden obsession with South African politics is, ironically, exactly what Elon accused Google of at the time: imposing a clear political bias and editorial stance.
Gemini’s Asian Nazis were painfully absurd. They were also a reflection of the challenge models face in handling sociopolitical complexity, which they often collapse into weird, flattened outputs. On one hand, a desire to avoid reinforcing historical bias; on the other, a mandate to reflect reality, one that comes with a degree of subjectivity in presentation.
What both the Grok and Gemini debacles reinforce is that AI systems are never neutral tools. They’re expressive systems built atop a complex series of choices—including, in some cases, choices about what to emphasize or what perspective to center. That isn’t just an engineering problem.
The question is what to do about it, because trust, to the extent we can have it, depends on the public believing that AI systems are not being quietly steered on the fly according to the personal whims and political impulses of their owners. The solution, to the extent there is one, is transparency. Not because we can, or should, attempt to strip out the values that the owner decides to encode (they will always be there), but because the values shouldn’t be obscured.
This might look like:
- System prompt transparency: Every major model has a system prompt that shapes tone, behavior, and priorities. If the model is hardcoding a stance—on race, South Africa, geopolitics, whatever—that should be visible and accountable, with versioning; a rough sketch of what that could look like follows this list. (Update: after I drafted this and just before sending, xAI announced that it will begin publishing Grok’s system prompts on GitHub; this could be a positive thing, or it could be transparency theater that ends quickly if it embarrasses Elon).
- Disclosure of post-training editorial modifications: Companies should clearly disclose when they’ve manually adjusted a model’s behavior in response to a news cycle, regulator inquiry, or executive decision.
- Audit trails for outputs: logs sufficient to reconstruct what a model said, when, and under which prompt or configuration, so that incidents like this one can be investigated after the fact.
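To make the first item concrete, here is one minimal sketch of what a versioned, auditable system-prompt record could look like. Everything in it is hypothetical: the file names, fields, and workflow are my own assumptions for illustration, not a description of how xAI, Google, or anyone else actually operates.

```python
# Hypothetical sketch: publish each system-prompt change as a hashed,
# timestamped changelog entry so edits are visible and attributable.
# File names and fields are illustrative assumptions, not any vendor's real format.
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

PROMPT_FILE = Path("system_prompt.txt")          # the prompt as shipped to the model
CHANGELOG_FILE = Path("prompt_changelog.jsonl")  # append-only public log

def record_prompt_change(author: str, reason: str) -> dict:
    """Append an entry describing the current prompt: who changed it, why, and a content hash."""
    prompt_text = PROMPT_FILE.read_text(encoding="utf-8")
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "author": author,
        "reason": reason,
        "sha256": hashlib.sha256(prompt_text.encode("utf-8")).hexdigest(),
        "length_chars": len(prompt_text),
    }
    with CHANGELOG_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

if __name__ == "__main__":
    # Example usage (assumes system_prompt.txt exists alongside this script).
    print(record_prompt_change(author="prompt-team", reason="routine tone adjustment"))
```

The point of the hash and the append-only log is accountability: a 3:15 a.m. edit would either show up as a new entry or as a mismatch between the published prompt and the one the model is demonstrably using, so outside reviewers could catch it either way.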
Although Elon has yet to acknowledge Grok’s temporary fixation on South African politics, xAI has stated (as I was drafting this) that an “unauthorized modification” was made to Grok that violated the company’s internal policies and core values, and circumvented the code review process for prompt changes. The company claims that it has conducted a thorough investigation; it’s unclear who made the unauthorized modification, which it says happened at 3:15am PT on May 14th. But this is now the second time there has been, as Grok put it, a “lone wolf move” to make the bot behave a certain way (the first time, the instruction was to “ignore Elon Musk/Donald Trump misinformation”).
Both episodes reinforce that generative AI systems are not just computational tools. Their outputs reflect not only the data they were trained on, but the values, incentives, and editorial judgments—sometimes explicit, sometimes hidden—of their creators. This makes them potentially powerful vectors for influence, capable of boosting ideologies, inadvertently steering people in bad directions (i.e., reinforcing literal delusions), or subtly reframing public discourse. For a while, pointing that out got dismissed as safetyist scolding or “decel” hand-wringing. But it’s not. It’s a simple acknowledgement that narrative power exists.
Scholars of propaganda talk about the construction of reality through repetition and framing. These systems are now an increasingly integral part of that construction. Different companies will choose to prioritize different values. But to maximize trust, we need maximum transparency around how these systems are built, how they’re edited, and the worldview they may encode.