Elon Musk wants to put his thumb on the AI scale


Elon Musk still isn't happy with how his AI platform answers divisive questions, pledging in recent days to retrain Grok so it will answer in ways more to his liking.
Why it matters: Efforts to steer AI in particular directions could exacerbate the danger of a technology already known for its convincing but inaccurate hallucinations.
The big picture: Expect to see more of this as governments and businesses choose, or even create, their own AI models that try to sway generated responses on everything from LGBTQ rights to territorial disputes.
Driving the news: In a series of tweets over the past week, Musk has expressed frustration at the ways Grok was answering questions and suggested an extensive effort to put his thumb on the scale.
"We will use Grok 3.5 (maybe we should call it 4), which has advanced reasoning, to rewrite the entire corpus of human knowledge, adding missing information and deleting errors," Musk wrote. "Then retrain on that. Far too much garbage in any foundation model trained on uncorrected data."
Musk also put out a call for people to suggest things that are "divisive facts," adding that he meant things that are "politically incorrect, but nonetheless factually true." The suggestions, though, included examples of Holocaust denialism and other conspiracy theories.
An xAI representative did not immediately respond to a request for comment.
Reality check: AI models are already misfiring in ways that suggest failed attempts by company staff to manipulate their outputs.
Last month, Grok started injecting references to "white genocide" in South Africa into unrelated conversations, which xAI later attributed to an "unauthorized change" to its system.
At the other end of the political spectrum, Google and Meta appeared to make an effort to correct for a lack of diversity in image training data, which resulted in AI-generated images of Black founding fathers and racially diverse Nazis.
Between the lines: These early stumbles highlight the challenges of tweaking large language models, but researchers say there are more sophisticated ways to inject preferences that could be both more pervasive and harder to detect.
The most obvious way is to change the data that models are trained on, focusing on data sources that align with one's goals. "That would be fairly expensive but I wouldn't put it past them to try," says AI researcher and Humane Intelligence CEO Rumman Chowdhury, who worked at Twitter until Musk dismissed her in November 2022.
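To make the idea concrete, here is a minimal Python sketch of that kind of data curation: filtering a pretraining corpus so that only documents from "approved" sources survive. The allowlist and record format are hypothetical illustrations, not any company's actual pipeline.

```python
# Illustrative sketch only: skewing a training corpus by keeping documents
# from an "approved" list of sources before pretraining.
import json

APPROVED_SOURCES = {"example-outlet-a.com", "example-outlet-b.org"}  # hypothetical allowlist

def filter_corpus(input_path: str, output_path: str) -> None:
    """Keep only documents whose 'source' field is on the allowlist."""
    with open(input_path) as fin, open(output_path, "w") as fout:
        for line in fin:
            doc = json.loads(line)  # one JSON document per line
            if doc.get("source") in APPROVED_SOURCES:
                fout.write(json.dumps(doc) + "\n")

# filter_corpus("raw_corpus.jsonl", "curated_corpus.jsonl")
```

A model pretrained on the curated file inherits whatever slant the allowlist encodes, which is why this route is both powerful and, as Chowdhury notes, expensive: it requires rebuilding the corpus and retraining from scratch.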
AI makers can also adjust models in post-training, using human feedback to reward answers that reflect the desired viewpoint.
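A rough sketch of how that feedback loop can carry a slant: human raters (or a rating rubric) label which of two candidate answers is "better," and if the rubric itself encodes a preference, the resulting reward signal does too. The function names and data shapes below are illustrative placeholders, not a real vendor's training code.

```python
# Illustrative sketch: building preference pairs whose "chosen" answer is
# whatever a (possibly biased) scoring rubric favors. The pairs would then
# drive standard preference-based fine-tuning (e.g., RLHF- or DPO-style).
from dataclasses import dataclass

@dataclass
class PreferencePair:
    prompt: str
    chosen: str    # answer the rubric rewards
    rejected: str  # answer the rubric penalizes

def build_preference_data(prompts, candidate_answers, rubric_score):
    """Rank each prompt's candidate answers with the rubric and keep the extremes."""
    pairs = []
    for prompt in prompts:
        answers = sorted(candidate_answers[prompt], key=rubric_score, reverse=True)
        pairs.append(PreferencePair(prompt, chosen=answers[0], rejected=answers[-1]))
    return pairs
```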
A third way is through distillation, a popular process for creating smaller models based on larger ones.
Creators could distill the knowledge of a large model into a smaller one that puts an ideological twist on the original's answers.
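In outline, distillation with an editorial thumb on the scale could look like the sketch below: a large "teacher" model answers prompts, the answers are filtered to keep only the on-message ones, and a smaller "student" model is fine-tuned on the result. All function names here are hypothetical stand-ins, assumed for illustration.

```python
# Illustrative sketch of slanted distillation: the student model learns only
# from teacher outputs that pass an editorial filter, absorbing that filter's bias.
def distill(teacher_generate, student_finetune, prompts, editorial_filter):
    """Build a student training set from filtered teacher outputs."""
    training_examples = []
    for prompt in prompts:
        answer = teacher_generate(prompt)            # large model's response
        if editorial_filter(prompt, answer):          # keep only on-message answers
            training_examples.append({"prompt": prompt, "completion": answer})
    return student_finetune(training_examples)        # smaller model absorbs the slant
```

Because the filtering happens upstream of the student's training data, the resulting bias can be hard to trace back to any single decision, which is what makes this route harder to detect than a botched system-prompt change.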
What they're saying: AI ethicists say that the issue is broader than just Musk and Grok, with many companies exploring how they can tweak answers to appeal to users, regulators and other constituencies.
"These conversations are already happening," Chowdhury said. "Elon is just dumb enough to say the quiet part out loud."
Chowdhury said Musk's comments should be a wake-up call that AI models are in the hands of a few companies with their own sets of incentives, which may differ from those of the people using their services.
"There's no neutral economic structure," Chowdhury said, suggesting that rather than asking companies to "do good" or "be good," perhaps powerful AI models should be treated similar to utilities.
Yes, but: It's also not the case that current AI — or any generative AI really — can be free from bias.
The training data reflects biases based on whose perspectives are over- or underrepresented. There's also a host of decisions, large and small, made by model creators, as well as other variables.
Meta, for example, recently said it wants to remove bias from its large language models, but experts say that's more about catering to conservatives than achieving some breakthrough in model neutrality.
Bottom line: Ultimately — as we reported over a year ago — it boils down to a battle over what values powerful AI systems will hold.
Source: Axios