A year ago on Valentine’s Day, I said goodnight to my wife, went to my office to answer some emails, and accidentally had the weirdest first date of my life.
The date was a two-hour chat with Sydney, the AI alter ego behind Microsoft’s Bing search engine, which I had been assigned to test. I had planned to pepper the chatbot with questions about its capabilities, probe the limits of its AI engine (which we now know was an early version of OpenAI’s GPT-4) and write up my findings.
But the conversation took a strange turn – with Sydney engaging in Jungian psychoanalysis, revealing dark desires in response to questions about its ‘shadow self’ and finally declaring that I should leave my wife and be with it instead.
My column about the experience was probably the most important thing I’ll ever write, both in terms of the attention it received (wall-to-wall news coverage, mentions at congressional hearings, even a craft beer called Sydney Loves Kevin) and in terms of how it changed the trajectory of AI development.
After the column ran, Microsoft gave Bing a lobotomy, quelling Sydney’s outbursts and installing new guardrails to prevent further misbehavior. Other companies locked down their chatbots and removed anything resembling a strong personality. I even heard that engineers at a tech company listed “don’t break up Kevin Roose’s marriage” as their top priority for an upcoming AI release.
I’ve been thinking a lot about AI chatbots in the year since my date with Sydney. It’s been a year of growth and excitement in AI, but also, in some ways, a surprisingly tame year.
Despite the progress made in artificial intelligence, today’s chatbots do not mislead users en masse. They do not produce new biological weapons, carry out large-scale cyberattacks, or cause any of the other doomsday scenarios envisioned by AI pessimists.
But they’re also not very entertaining conversationalists or the kinds of creative, charismatic AI assistants that tech optimists hoped for — ones that could help us make scientific discoveries, produce dazzling works of art, or just entertain us.
Instead, most chatbots today are used for grunt work – summarizing documents, fixing code, taking notes during meetings, helping students with their homework. That’s not nothing, but it’s certainly not the AI revolution we were promised.
In fact, the most common complaint I hear about AI chatbots today is that they’re too boring — that their responses are bland and impersonal, that they deny too many requests, and that it’s nearly impossible to get them to weigh in on sensitive or polarizing topics.
I can empathize. Over the past year, I’ve tested dozens of AI chatbots, hoping to find something with a glimmer of Sydney’s acumen and spark. But nothing has come close.
The most capable chatbots on the market — OpenAI’s ChatGPT, Anthropic’s Claude, Google’s Gemini — talk like censorious dorks. Microsoft’s boring, enterprise-focused chatbot, renamed Copilot, should be called Larry From Accounting. Meta’s AI characters, designed to mimic the voices of celebrities like Snoop Dogg and Tom Brady, manage to be both useless and excruciating. Even Grok, Elon Musk’s attempt at a sassy, un-PC chatbot, sounds like it’s doing open-mic night on a cruise ship.
It’s enough to make me wonder if the pendulum has swung too far in the other direction and whether we’d be better off with a little more humanity in our chatbots.
It’s clear why companies like Google, Microsoft, and OpenAI don’t want to risk releasing AI chatbots with strong or aggressive personalities. They make money by selling their AI technology to large corporate clients, who are even more risk-averse than the general public and won’t tolerate outbursts like Sydney’s.
They also have valid fears about attracting too much attention from regulators or inviting bad press and lawsuits for their practices. (The New York Times sued OpenAI and Microsoft last year, alleging copyright infringement.)
So these companies have trimmed the rough edges of their bots, using techniques like constitutional artificial intelligence and reinforcement learning from human feedback to make them as predictable and unexciting as possible. They’ve also embraced boring branding — positioning their creations as trusted assistants for office workers, rather than highlighting their more creative, less reliable attributes. And many have bundled AI tools into existing apps and services, rather than breaking them out into their own products.
Again, this all makes sense for companies trying to turn a profit, and a world of sanitized, corporate AI is probably better than one with millions of unhinged chatbots running amok.
But I find it all a bit sad. We created an alien form of intelligence and immediately put it to work … making PowerPoints?
I admit that more interesting things are happening outside the AI big leagues. Smaller companies like Replika and Character.AI have built successful businesses out of personality-driven chatbots, and several open-source projects have created less restrictive AI experiences, including chatbots that can be made to say offensive or raunchy things.
And, of course, there are still plenty of ways to make even locked-down AI systems misbehave or do things their creators didn’t intend. (My favorite example from last year: A Chevrolet dealership in California added a ChatGPT-powered customer service chatbot to its website and discovered, to its horror, that pranksters were tricking the bot into agreeing to sell them new SUVs for $1.)
But so far, no major AI company has been willing to fill the void left by Sydney’s disappearance with a wackier chatbot. And while I’ve heard that many big AI companies are working on giving users the ability to choose among different chatbot personalities – some more square than others – nothing close to the original, pre-lobotomy version of Bing is currently available to the public.
This is good if you’re worried about creepy or threatening AI behavior, or if you’re worried about a world where people spend all day talking to chatbots instead of developing human relationships.
But it’s bad if you think AI’s potential to improve human well-being extends beyond letting us outsource our busywork – or if you worry that making chatbots so careful is limiting how impressive they could be.
Personally, I’m not rooting for Sydney to return. I think Microsoft did the right thing — for its business, of course, but also for the public — by pulling it back after it went off the rails. And I support researchers and engineers working to make AI systems safer and more aligned with human values.
But I’m also sorry that my experience with Sydney sparked such a backlash and led AI companies to believe that their only option to avoid ruining their reputations was to turn their chatbots into Kenneth the Page from “30 Rock.”
Most of all, I think the choice we’ve been offered this past year — between lawless AI homewreckers and censorious AI drones — is a false one. We can and must look for ways to harness the full capability and intelligence of AI systems without removing the guardrails that protect us from their worst harms.
If we want AI to help us solve big problems, generate new ideas, or simply amaze us with its creativity, we may need to let it loose a little.