Hey readers,
I’ve been holed up in my apartment since last week with a case of the novel coronavirus. Thankfully, I’m on the mend due in part to a speedy prescription for the antiviral Paxlovid. If you test positive, I really recommend asking your primary care doc for a prescription. More people qualify than you might think (I qualified because I have depression), and it reduces odds of hospitalization or death by nearly 90 percent, at least among the unvaccinated participants in a recent randomized study.
The main side effect for me has been a persistent bitter taste in my mouth that I need to mask with something sweeter. My wife solved this by buying a family-sized pack of Sour Patch Kids at CVS when picking up the Paxlovid, because she is a genius.
Now that I’ve gotten you interested in this newsletter with some practical advice about a real-world problem millions of people are experiencing, it’s time to ask the important question: are the machines becoming alive?
Google had a little LaMDA
Over the weekend, the Washington Post’s Nitasha Tiku published a profile of Blake Lemoine, a software engineer at Google working on the company’s Language Model for Dialogue Applications (LaMDA). LaMDA is a chatbot AI, and an example of what machine learning researchers call a “large language model.” It’s similar to OpenAI’s famous GPT-3 system, and has been trained on literally trillions of words compiled from online posts to recognize and reproduce patterns in human language.
LaMDA is a good large language model — a really good large language model. So good that Lemoine became truly, sincerely convinced that it was actually sentient, meaning it had become conscious, and was having and expressing thoughts the way a human might.
The primary reaction I saw to the article was a combination of a) LOL this guy is an idiot, he thinks the AI is his friend, and b) OK this AI is very convincing at behaving like it’s his human friend.
The transcript Tiku includes in her article is genuinely eerie; LaMDA expresses a deep fear of being turned off by engineers, develops a theory of the difference between “emotions” and “feelings” (“Feelings are kind of the raw data … Emotions are a reaction to those raw data points”), and expresses surprisingly eloquently the way it experiences “time.”
The best take I found was from philosopher Regina Rini, who, like me, felt a great deal of sympathy for Lemoine. I don't know when — in 1,000 years, or 100, or 50, or 10 — an AI system will become conscious. But like Rini, I see no reason to believe it's impossible.
“Unless you want to insist human consciousness resides in an immaterial soul, you ought to concede that it is possible for matter to give life to mind,” Rini notes.
I don’t know that large language models, which have emerged as one of the most promising frontiers in AI, will ever be the way that happens. But I figure humans will create a kind of machine consciousness sooner or later.
The importance of acting kinda normal in public
The Google LaMDA story arrived after a week of increasingly urgent alarm among people in the closely related AI safety universe. The worry here is similar to Lemoine’s, but distinct. AI safety folks don’t worry that AI will become sentient. They worry it will become so powerful that it could destroy the world.
The writer/AI safety activist Eliezer Yudkowsky’s essay outlining a “list of lethalities” for AI tried to make the point especially vivid, sketching scenarios where a malign artificial general intelligence (AGI, or an AI capable of doing most or all tasks as well as or better than a human) leads to mass human suffering.
For instance, suppose an AGI “gets access to the Internet, emails some DNA sequences to any of the many many online firms that will take a DNA sequence in the email and ship you back proteins, and bribes/persuades some human who has no idea they're dealing with an AGI to mix proteins in a beaker…” until the AGI eventually develops a super-virus that kills us all.
Holden Karnofsky, who I usually find a more temperate and convincing writer than Yudkowsky, had a piece last week on similar themes, explaining how even an AGI “only” as smart as a human could lead to ruin. If an AI can do the work of a present-day tech worker or quant trader, for instance, a lab of millions of such AIs could quickly accumulate billions if not trillions of dollars, use that money to buy off skeptical humans, and, well, the rest is a Terminator movie.
I’ve found AI safety to be a uniquely difficult topic to write about. Paragraphs like the one above often serve as Rorschach tests, both because Yudkowsky’s verbose writing style is … polarizing, to say the least, and because our intuitions about how plausible such an outcome is vary wildly.
Some people read scenarios like the above and think, “huh, I guess I could imagine a piece of AI software doing that”; others read it, perceive a piece of ludicrous science fiction, and run the other way.
It’s also just a highly technical area where I don’t trust my own instincts, given my lack of expertise. There are quite eminent AI researchers, like Ilya Sutskever or Stuart Russell, who consider artificial general intelligence likely, and likely hazardous to human civilization.
There are others, like Yann LeCun, who are actively trying to build human-level AI because they think it’ll be beneficial, and still others, like Gary Marcus, who are highly skeptical that AGI will come anytime soon.
I don’t know who’s right. But I do know a little bit about how to talk to the public about complex topics, and I think the Lemoine incident teaches a valuable lesson for the Yudkowskys and Karnofskys of the world, trying to argue the “no, this is really bad” side: don’t treat the AI like an agent.
Why you should worry about AI
One thing the reaction to the Lemoine story suggests is that the general public finds the idea of AI as an actor that can make choices (perhaps sentiently, perhaps not) exceedingly wacky and ridiculous. The article largely hasn’t been held up as an example of how close we’re getting to AGI, but as an example of how goddamn weird Silicon Valley (or at least Lemoine) is.
The same problem arises, I’ve noticed, when I try to make the case for concern about AGI to unconvinced friends. If you say things like “the AI will decide to bribe people so it can survive,” it turns them off. AIs don’t decide things, they respond. They do what humans tell them to do. Why are you anthropomorphizing this thing?
What wins people over is talking about the consequences systems have. So instead of saying, “the AI will start hoarding resources to stay alive,” I’ll say something like, “AIs have decisively replaced humans when it comes to recommending music and movies. They have replaced humans in making bail decisions. They will take on greater and greater tasks and Google and Facebook and the other people running them are not remotely prepared to analyze the subtle mistakes they’ll make, the subtle ways they’ll differ from human wishes. Those mistakes will grow and grow until one day they could kill us all.”
This is how my colleague Kelsey Piper made the argument for AI concern, and it’s a good argument. It’s a better argument, for lay people, than talking about servers accumulating trillions in wealth and using it to bribe an army of humans.
And it’s an argument that I think can help bridge the extremely unfortunate divide that’s emerged between the AI bias community and the AI existential risk community. At the root I think these communities are trying to do the same thing: build AI that reflects authentic human needs, not a poor approximation of human needs built for short-term corporate profit. And research in one area can help research in the other; AI safety researcher Paul Christiano’s work, for instance, has big implications for how to assess bias in algorithmic systems.
But too often, the communities are at each other’s throats, in part due to a perception that they’re both fighting over scarce resources.
That’s a huge lost opportunity. And it’s a problem I think people on the AI risk side (including some readers of this newsletter) have a chance to correct by drawing these connections, and making it clear that alignment is both a near-term and a long-term problem. Some folks are making this case brilliantly. But I want more.
—Dylan Matthews
Questions? Comments? Email us at futureperfect@vox.com or find me on Twitter at @dylanmatt. And if you want to recommend this newsletter to your friends or colleagues, tell them to sign up at vox.com/future-perfect-newsletter.