The Oppenheimer of AI
Michal Kosinski knows exactly how he sounds when he talks about his research. And how it sounds is not great.
A psychologist at Stanford University, Kosinski is a specialist in psychometrics — a field that attempts to measure the facets of the human mind. For more than a decade, his work has freaked people right the hell out. In study after study, Kosinski has made scarily plausible claims that machine-learning algorithms of artificial intelligence can discern deeply private things about us — our intelligence, our sexual preferences, our political beliefs — using little more than our Facebook likes and photographs of our faces.
"I'm not even interested in faces," Kosinski insists. "There was nothing in my career that indicated I'd be spending a few years looking at people's appearances."
What Kosinski cares about are data and psychology. And what are photographs if not pixelated data? "Psychological theory kind of shouts in your face that there should be links between facial appearance and intimate traits," he says. That's why he believes that you can judge our inner lives by our outward characteristics.
It's a belief with disturbing implications. Science has been trying to divine truths about personality and behavior from various tests and images for centuries. As far back as the 1700s, physiognomists measured facial features in a search for ineffable qualities like nobility and immorality. Phrenologists used calipers to measure bumps on people's heads, hoping to diagnose mental incapacities or moral deficiencies. Eugenicists used photographs and IQ tests to determine which people were "inferior," and sterilized those who didn't measure up — which usually turned out to be anyone who wasn't white and rich. The methods differed, but the underlying theory remained the same: that measurements could somehow gauge the mind, and a person's value to society.
To be clear, none of these "sciences" worked. In fact, every time someone claimed they'd found a way to measure people's inner traits based on their exterior features, it quickly turned into a tool to discriminate against people based on their race or gender. That's because findings involving individuals almost always get applied to entire populations. It's a short leap from saying "some people are smarter than others" to "some races are smarter than others." A test can be useful to assess which calculus class your daughter has an aptitude for. But it's malicious and wrong to use those test results to assert that there aren't many female software engineers because girls don't like math. Yet today, intelligence testing and facial recognition continue to be used, and abused, in everything from marketing and job hiring to college admissions and law enforcement.
Kosinski is aware of the long, dark history of his chosen field. Like his skull-measuring forebears, he believes that his research is right — that AI, combined with facial recognition, can lay bare our personalities and preferences more accurately than humans can. And to him, that accuracy is what makes his findings so dangerous. In pursuit of this ability, he fears, its creators will violate people's privacy and use it to manipulate public opinion and persecute minority groups. His work, he says, isn't meant to be used as a tool of oppression, like the pseudoscience of the past. It's meant as a warning about the future. In a sense, he's the Oppenheimer of AI, warning us all about the destructive potential of an artificial-intelligence bomb — while he's building it.
"Very soon," he says, "we may find ourselves in a position where these models have properties and capacities that are way ahead of what humans could dream of. And we will not even notice."
When we meet, Kosinski does not brandish any calipers to assess my brow and determine my tendency toward indolence, as the phrenologists of the 19th century did. Instead, dressed in a California-casual flowered shirt and white leather loafers — no socks — he leads me to a sunny Stanford courtyard for coffee. We're surrounded by a happy and diverse crowd of business-school students. Here, on a perfect California day, he lays out the case for what he fears will be the secret algorithmic domination of our world.
Before he worked with photographs, Kosinski was interested in Facebook. When he was a doctoral student at Cambridge in the mid-2000s, the few social scientists who took the emerging online world seriously regarded it as an uncanny valley, a place where people essentially donned fake personalities. How they behaved online didn't reflect their psychology or behavior in the real world.
Kosinski disagreed. "I felt that I'm still myself while using those products and services, and that my friends and people I knew were like this as well," he says. Even people pretending to be dwarf paladins or sex dragons still had the same anxieties, biases, and prejudices they carried around IRL.
Much to the dismay of his thesis advisor, this became the foundation of Kosinski's approach. "That was the first aim, to show that continuity," he says. "And that led me to the second aim, which was: If we are all still ourselves online, that means we can use data collected online — Big Data — to understand humans better." To test his hypothesis, Kosinski and a grad student named David Stillwell at the Cambridge Psychometrics Centre created a Facebook app called myPersonality — an old-school magazine-style quiz that tested for personality traits like "openness" or "introversion" while also hoovering up people's Facebook likes. Then they built a computer model that mapped those likes to specific personality traits for nearly 60,000 people.
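To make the idea concrete: what follows is a minimal, hedged sketch of that kind of likes-to-traits model, not the actual myPersonality pipeline. It assumes a sparse user-by-page matrix and a quiz-derived label, both invented here, and uses an off-the-shelf recipe of dimensionality reduction plus logistic regression.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical data: one row per user, one column per Facebook page,
# a 1 wherever that user liked that page. Real like matrices are far
# larger and, crucially, not random.
rng = np.random.default_rng(0)
n_users, n_pages = 5_000, 2_000
likes = csr_matrix((rng.random((n_users, n_pages)) < 0.02).astype(float))
is_extrovert = rng.integers(0, 2, n_users)  # stand-in for a quiz-derived trait

# Compress the sparse like matrix into a small number of latent dimensions,
# then fit a plain logistic regression from those dimensions to the trait.
components = TruncatedSVD(n_components=100, random_state=0).fit_transform(likes)
X_train, X_test, y_train, y_test = train_test_split(
    components, is_extrovert, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```

With random likes, the held-out accuracy sits near chance; on real data, the lift comes from the fact that people who share a trait tend to cluster on the same pages.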
Published in the Proceedings of the National Academy of Sciences in 2013, the results seemed astonishing. Facebook likes alone could predict someone's religion and politics with better than 80% accuracy. The model could tell whether a man was gay with 88% accuracy. Sometimes the algorithm didn't seem to have any particularly magical powers — liking the musical "Wicked," for example, was a leading predictor of male homosexuality. But other connections were baffling. Among the best predictors of high intelligence, for instance, were liking "thunderstorms" and "curly fries."
How did the machine draw such seemingly accurate conclusions from such arbitrary data? "Who knows why?" Stillwell, now the director of the Psychometrics Centre, tells me. "Who cares why? If it's a group of 10,000 individuals, the mistakes cancel out, and it's good enough for a population." Stillwell and Kosinski, in other words, aren't particularly interested in whether their models say anything about actual causation, about an explanation behind the connections. Correlation is enough. Their method enabled a machine to predict human behaviors and preferences. They don't have to know — or even care — how.
It didn't take long for such models to be weaponized. Another researcher at the Psychometrics Centre, Aleksandr Kogan, took similar ideas to a political-campaign consultancy called Cambridge Analytica, which sold its services to the 2016 campaign of Donald Trump and to Brexit advocates in the UK. Did the efforts to manipulate social-media feeds and change voting behaviors actually influence those votes? No one knows for sure. But a year later, Stillwell and Kosinski used myPersonality data to create psychologically customized ads, which markedly influenced what 3.5 million people bought online compared with people who were shown untargeted ads. The research was at the forefront of what is today commonplace: using social-media algorithms to sell us stuff based on our every point and click.
Around the same time Kosinski was demonstrating that his research could manipulate online shoppers, a bunch of companies were starting to sell facial-recognition systems. At the time, the systems weren't even good at what they claimed to do: distinguishing among individuals for identification purposes. But Kosinski wondered whether software could use the data embedded in huge numbers of photographs, the same way it had with Facebook likes, to discern things like emotions and personality traits.
Most scientists consider that idea a form of modern physiognomy — a pseudoscience based on the mistaken assumption that our faces reveal something about our minds. Sure, we can tell a lot about someone by looking at them. At a glance we can guess, with a fair degree of accuracy, things like age, gender, and race. Based on simple odds, we can intuit that an older white man is more likely to be politically conservative than a younger Latina woman; an unshaven guy in a filthy hoodie and demolished sneakers probably has less ready cash than a woman in a Chanel suit. But discerning stuff like extroversion, or intelligence, or trustworthiness? Come on.
But once again, Kosinski believed that a machine, relying on Big Data, could divine our souls from our faces in a way that humans can't. People judge you based on your face, he says, and treat you differently based on those judgments. That, in turn, changes your psychology. If people constantly reward you with jobs and invitations to parties because they consider you attractive, that will alter your character over time. Your face affects how people treat you, and how people treat you affects who you are. All he needed was an algorithm to read the clues written on our faces — to separate the curly fries from the Broadway musicals.
Kosinski and a colleague scraped a dating site for photographs of 36,360 men and 38,593 women, equally divided between gay and straight (as indicated by their "looking for" fields). Then he used a facial-recognition algorithm called VGG-Face, trained on 2.6 million images, to compare his test subjects based on 500 variables. Presenting the model with photographs in pairs — one gay person and one straight person — he asked it to pick which one was gay.
Presented with at least five photographs of a person, Kosinski's model picked the gay person out of a pair with 91% accuracy. Humans, by contrast, were right only 61% of the time.
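The pair-level test itself is simple to express. Below is a hedged sketch of that evaluation protocol, assuming the 500-number face descriptors have already been extracted by a pretrained network and using synthetic stand-ins for the data; the details of the study's actual classifier may differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical inputs: one 500-dimensional descriptor per photo, extracted
# beforehand with a pretrained face network, plus a 0/1 label taken from the
# profile. All numbers below are synthetic stand-ins.
rng = np.random.default_rng(0)
train_X = rng.normal(size=(10_000, 500))
train_y = rng.integers(0, 2, 10_000)

clf = LogisticRegression(max_iter=2000).fit(train_X, train_y)

def pairwise_accuracy(clf, descriptors_a, descriptors_b):
    """Each pair holds one class-1 and one class-0 face (class 1 in column a);
    the classifier 'wins' a pair when it scores the class-1 face higher."""
    scores_a = clf.predict_proba(descriptors_a)[:, 1]
    scores_b = clf.predict_proba(descriptors_b)[:, 1]
    return float(np.mean(scores_a > scores_b))

pairs_a = rng.normal(size=(1_000, 500))  # stand-ins for the class-1 faces
pairs_b = rng.normal(size=(1_000, 500))  # stand-ins for the class-0 faces
print("pairwise accuracy:", pairwise_accuracy(clf, pairs_a, pairs_b))
```

Scoring pairs rather than single faces makes 50% the chance level by construction, since every pair contains exactly one person from each group.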
The paper gestures at an explanation — hormonal exposure in the womb something something. But once again, Kosinski isn't really interested in why the model works. To him, what's important is that a computer trained on thousands of images can draw accurate conclusions about something like sexual preference by combining multiple invisible details about a person.
Others disagreed. Researchers who study faces and emotions criticized both his math and his conclusions. The Guardian took Kosinski to task for giving a talk about his work in famously homophobic Russia. The Economist called his research "bad news" for anyone with secrets. The Human Rights Campaign and GLAAD issued a statement decrying the study, warning that it could be used by brutal regimes to persecute gay people. "Stanford should distance itself from such junk science," the HRC said, "rather than lending its name and credibility to research that is dangerously flawed and leaves the world — and this case, millions of people's lives — worse and less safe than before."
Kosinski felt blindsided. "People said, 'Stanford professor developed facial-recognition algorithms to build a gaydar.' But I don't even actually care about facial appearance per se. I care about privacy, and the algorithmic power to do stuff that we humans cannot do." He wasn't trying to build a scanner for right-wingers to take to school-board meetings, he says. He wanted policymakers to take action, and gay people to prepare themselves for the world to come.
"We did not create a privacy-invading tool, but rather showed that basic and widely used methods pose serious privacy threats," Kosinski and his coauthor wrote in their paper. "We hope that our findings will inform the public and policymakers, and inspire them to design technologies and write policies that reduce the risks faced by homosexual communities across the world."
Kosinski kept at it. This time, he scraped more than a million photographs of people from Facebook and a dating site, along with the political affiliations they listed in their profiles. Using VGGFace2 — open source, available to anyone who wants to try such a thing — he converted those faces to thousands of data points and averaged together the data for liberals and conservatives. Then he showed a new algorithm hundreds of thousands of pairs of images from the dating site and asked it to separate the MAGA lovers from the Bernie bros. The machine got it right 72% of the time. In pairs matched for age, gender, and race — knocking out the easy cues — accuracy fell, but only by a little.
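Taken at face value, that description amounts to a nearest-centroid comparison: average the descriptors for each group, then, for any pair, call conservative whichever face lies closer to the conservative average. The sketch below illustrates only that idea, with synthetic descriptors built to be separable; the paper's actual classifier may be more elaborate.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 2048  # descriptors from networks like VGGFace2 run to a few thousand numbers

# Hypothetical training descriptors from labeled profile photos. The small
# opposite offsets make the two synthetic groups separable on purpose.
liberal_faces = rng.normal(loc=-0.05, scale=1.0, size=(2_000, dim))
conservative_faces = rng.normal(loc=0.05, scale=1.0, size=(2_000, dim))

# "Average together the data" for each group: one mean descriptor per side.
liberal_mean = liberal_faces.mean(axis=0)
conservative_mean = conservative_faces.mean(axis=0)

def pick_conservative(face_a, face_b):
    """Given a pair, return 'a' or 'b' for whichever descriptor sits closer
    to the conservative average than to the liberal one."""
    margin_a = np.linalg.norm(face_a - conservative_mean) - np.linalg.norm(face_a - liberal_mean)
    margin_b = np.linalg.norm(face_b - conservative_mean) - np.linalg.norm(face_b - liberal_mean)
    return "a" if margin_a < margin_b else "b"

# Toy pairs in which face "a" is always the conservative one.
pairs = [(rng.normal(0.05, 1.0, dim), rng.normal(-0.05, 1.0, dim)) for _ in range(500)]
accuracy = np.mean([pick_conservative(a, b) == "a" for a, b in pairs])
print("pairwise accuracy on synthetic pairs:", accuracy)
```

Matching pairs on age, gender, and race, as the study did, is about removing cues a human could use just as easily; this sketch has no notion of that, which is part of why it is only a sketch.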
This might seem like a big scary deal. AI can tell if we have political wrongthink! It can tricorder our sexuality! But most people who study faces and personality think Kosinski is flat-out wrong. "I absolutely do not dispute the fact that you can design an algorithm that can guess much better than chance whether a person is gay or straight," says Alexander Todorov, a psychologist at the University of Chicago. "But that's because all of the images are posted by the users themselves, so there are lots of confounds." Kosinski's model, in other words, isn't picking up microscopically subtle cues from the photos. It's just picking up on the way gay people present themselves on dating sites — which, not surprisingly, is often very different from the way straight people present themselves to potential partners. Control for that in the photos, and the algorithmic gaydar's accuracy ends up little better than chance.
Kosinski has tried to respond to these critiques. In his most recent study on political affiliation, he took his own photos of test subjects, rather than scraping the internet for self-posted photos. That enabled him to control for more variables — cutting out backdrops, keeping hairstyles the same, making sure people looked directly at the camera with a neutral expression. Then, using this new set of photos, he once again asked the algorithm to separate the conservatives from the liberals.
This time, the machine did fractionally worse than humans at accurately predicting someone's political affiliation. And therein lies the problem. It's not just that Kosinski's central finding — that AI can read humans better than humans can — is very possibly wrong. It's that we'll tend to believe it anyway. Computation, the math that a machine has instead of a mind, seems objective and infallible — even if the computer is just operationalizing our own biases.
That faulty belief isn't just at the heart of science's misguided and terrifying attempts to measure human beings over the past three centuries. It's at the heart of the science itself. The way scientists know whether to believe they've found data that confirms a hypothesis is through statistics. And the pioneers of modern statistics — Francis Galton, Ronald Fisher, and Karl Pearson — were among the most egregious eugenicists and physiognomists of the late 19th and early 20th centuries. They believed that Black people were savages, that Jews were a gutter race, that only the "right" kind of people should be allowed to have babies. As the mathematician Aubrey Clayton has argued, they literally invented statistical analysis to give their virulent racial prejudice a veneer of objectivity.
The methods and techniques they pioneered are with us today. They're behind IQ testing and college-admissions exams, the ceaseless racial profiling by police, the systems being used to screen job candidates for things like "soft skills" and "growth mindset." It's no coincidence that Hitler took his cues from the eugenicists, whose victories included an infamous 1927 ruling by the US Supreme Court that upheld the forced sterilization of women deemed by science to be "imbeciles." Imagine what a second Trump administration would do with AI-driven facial recognition at a border crossing, or anywhere really, with the goal of identifying "enemies of the state." Such tools, in fact, are already built into ammunition vending machines (possibly one of the most dystopian phrases I have ever typed). They're also being incorporated into many of the technologies deployed on America's southern border, built by startups founded and funded by the same people supporting the Trump campaign. You think racism is systemic now? Just wait until the system is literally programmed with it.
The various technologies we've taken to calling "artificial intelligence" are basically just statistical engines that have been trained on our biases. Kosinski thinks AI's ability to make the kind of personality judgments he studies will only get better. "Ultimately, we're developing a model that produces outputs like a human mind," he tells me. And once the machine has thoroughly studied and mastered our all-too-human prejudices, he believes, it will then be able to see into our minds and use whatever it finds there to call the shots.
In Kosinski's nightmare, this won't be Skynet bombing us into oblivion. The sophisticated AI of tomorrow will know us so well that it won't need force — it will simply ensure our compliance by giving us exactly what we want. "Think about having a model that has read all the books on the planet, knows you intimately, knows how to talk to you, and is rewarded not only by you but by billions of other people for engaging interactions," he says. "It will become a master manipulator — a master entertainment system." That is the future Kosinski fears — even as he continues to tinker with the very models that prove it will come to pass.
Adam Rogers is a senior correspondent at Business Insider.