OpenAI is well aware that DALL-E 2 generates results exhibiting gender and racial bias. In fact, I’ve taken the examples above from the company’s own “Risks and Limitations” document, which you’ll find if you scroll to the bottom of the main DALL-E 2 webpage.
OpenAI researchers made some attempts to resolve the bias and fairness problems. But they couldn’t root the problems out entirely, because each potential fix came with its own trade-offs.
For example, the researchers wanted to filter out sexual content from the training data, because that content could lead to disproportionate harm to women. But they found that when they applied that filter, DALL-E 2 generated fewer images of women overall. That’s no good, because it leads to another kind of harm to women: erasure.
OpenAI is far from the only artificial intelligence company dealing with bias problems and trade-offs. It’s a challenge for the entire AI community.
“Bias is a huge industry-wide problem that no one has a great, foolproof answer to,” Miles Brundage, the head of policy research at OpenAI, told me. “So a lot of the work right now is just being transparent and upfront with users about the remaining limitations.”
Why release a biased AI model?
In February, before DALL-E 2 was released, OpenAI invited 23 external researchers to “red team” it — engineering-speak for trying to find as many flaws and vulnerabilities in it as they could, so the system could be improved. One of the main suggestions the red team made was to limit the initial release to only trusted users.
To its credit, OpenAI adopted this suggestion. For now, only about 400 people (a mix of OpenAI’s employees and board members, plus hand-picked academics and creatives) get to use DALL-E 2, and only for non-commercial purposes.
That’s a change from how OpenAI chose to deploy GPT-3, a text generator hailed for its potential to enhance our creativity. Given a phrase or two written by a human, it can add on more phrases that sound uncannily human-like. But it’s shown bias against certain groups, like Muslims, whom it disproportionately associates with violence and terrorism. OpenAI knew about the bias problems but released the model anyway to a limited group of vetted developers and companies, who could use GPT-3 for commercial purposes.
Last year, I asked Sandhini Agarwal, a researcher on OpenAI’s policy team, whether it makes sense that GPT-3 was being probed for bias by scholars even as it was released to some commercial actors. She said that going forward, “That’s a good thing for us to think about. You’re right that, so far, our strategy has been to have it happen in parallel. And maybe that should change for future models.”
The fact that the deployment approach has changed for DALL-E 2 seems like a positive step. Yet, as DALL-E 2’s “Risks and Limitations” document notes, “even if the Preview itself is not directly harmful, its demonstration of the potential of this technology could motivate various actors to increase their investment in related technologies and tactics.”
And you’ve got to wonder: Is that acceleration a good thing, at this stage? Do we really want to be building and launching these models now, knowing that doing so can spur others to release their own versions even faster?
Some experts argue that since we know these models have problems we don’t yet know how to solve, we should give AI ethics research time to catch up to the advances and address some of those problems before building and releasing new tech.
The problem of competition
Helen Ngo, an affiliated researcher with the Stanford Institute for Human-Centered AI, says one thing we desperately need is standard metrics for bias. A bit of work has been done on measuring, say, how likely certain attributes are to be associated with certain groups. “But it’s super understudied,” Ngo said. “We haven’t really put together industry standards or norms yet on how to go about measuring these issues” — never mind solving them.
OpenAI’s Brundage told me that letting a limited group of users play around with an AI model allows researchers to learn more about the issues that would crop up in the real world. “There’s a lot you can’t predict, so it’s valuable to get in contact with reality,” he said.
That’s true enough, but since we already know about many of the problems that repeatedly arise in AI, it’s not clear that this is a strong enough justification for launching the model now, even in a limited way.
Brundage also noted another motivation at OpenAI: competition. “Some of the researchers internally were excited to get this out in the world because they were seeing that others were catching up,” he said.
That spirit of competition is a natural impulse for anyone involved in creating transformative tech. It’s also to be expected in any organization that aims to make a profit.
In Silicon Valley, being first out of the gate is rewarded, and those who finish second are rarely remembered. But developing AI safely requires aligning it with our values, and it’s easy to see how that competitive pressure misaligns the incentives for producing AI that truly benefits all of humanity.
The AI community has achieved amazing things, but so far, it hasn’t figured out how to change that incentive structure.
—Sigal Samuel
Questions? Comments? Email us at futureperfect@vox.com or find me on Twitter at @sigalsamuel. And if you want to recommend this newsletter to your friends or colleagues, tell them to sign up at vox.com/future-perfect-newsletter.