The built-in biases and limitations of facial recognition and the issues it raises:
Facial recognition technology is improving by leaps and bounds. Some commercial software can now tell the gender of a person in a photograph.
When the person in the photo is a white man, the software is right 99 percent of the time.
But the darker the skin, the more errors arise: up to nearly 35 percent for images of darker-skinned women, according to a new study that breaks fresh ground by measuring how the technology works on people of different races and genders.
These disparate results, calculated by Joy Buolamwini, a researcher at the M.I.T. Media Lab, show how some of the biases in the real world can seep into artificial intelligence, the computer systems that inform facial recognition.
Color Matters in Computer Vision
Facial recognition algorithms made by Microsoft, IBM and Face++ were more likely to misidentify the gender of black women than white men.
Gender was misidentified in up to 1 percent of lighter-skinned males in a set of 385 photos.
Gender was misidentified in up to 7 percent of lighter-skinned females in a set of 296 photos.
Gender was misidentified in up to 12 percent of darker-skinned males in a set of 318 photos.
Gender was misidentified in 35 percent of darker-skinned females in a set of 271 photos.
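The four figures above can be folded into one photo-weighted average to show how a single headline number hides the gap between groups. The sketch below is illustrative arithmetic only. It treats each "up to" rate as if one system had produced all four, which the study does not claim, and the names `groups`, `overall_error_rate` and `worst_to_best_ratio` are invented for this example.

```python
# Worst-case error rates and photo counts quoted in the figures above.
# Assumption: all four rates are treated as if they came from one system.
groups = {
    "lighter-skinned males": (0.01, 385),
    "lighter-skinned females": (0.07, 296),
    "darker-skinned males": (0.12, 318),
    "darker-skinned females": (0.35, 271),
}


def overall_error_rate(groups):
    """Photo-weighted average error rate across all groups."""
    total_photos = sum(count for _, count in groups.values())
    total_errors = sum(rate * count for rate, count in groups.values())
    return total_errors / total_photos


def worst_to_best_ratio(groups):
    """How many times higher the worst group's error rate is than the best."""
    rates = [rate for rate, _ in groups.values()]
    return max(rates) / min(rates)


if __name__ == "__main__":
    print(f"photos: {sum(count for _, count in groups.values())}")
    print(f"overall error rate: {overall_error_rate(groups):.1%}")
    print(f"worst/best ratio: {worst_to_best_ratio(groups):.0f}x")
```

Under these assumptions the blended error rate comes out at about 12 percent, a figure that describes none of the four groups well: the individual rates range from 1 percent to 35 percent.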
In modern artificial intelligence, data rules. A.I. software is only as smart as the data used to train it. If there are many more white men than black women in the training data, the software will be worse at identifying the black women.
One widely used facial-recognition data set was estimated to be more than 75 percent male and more than 80 percent white, according to another research study.
The new study also raises broader questions of fairness and accountability in artificial intelligence at a time when investment in and adoption of the technology is racing ahead.
Today, facial recognition software is being deployed by companies in various ways, including to help target product pitches based on social media profile pictures. But companies are also experimenting with face identification and other A.I. technology as an ingredient in automated decisions with higher stakes like hiring and lending.
Researchers at the Georgetown Law School estimated that 117 million American adults are in face recognition networks used by law enforcement — and that African Americans were most likely to be singled out, because they were disproportionately represented in mug-shot databases.
Facial recognition technology is lightly regulated so far.
“This is the right time to be addressing how these A.I. systems work and where they fail — to make them socially accountable,” said Suresh Venkatasubramanian, a professor of computer science at the University of Utah.
Until now, there was anecdotal evidence of computer vision miscues, occasionally in ways that suggested discrimination. In 2015, for example, Google had to apologize after its image-recognition photo app initially labeled African Americans as “gorillas.”
Sorelle Friedler, a computer scientist at Haverford College and a reviewing editor on Ms. Buolamwini’s research paper, said experts had long suspected that facial recognition software performed differently on different populations.
“But this is the first work I’m aware of that shows that empirically,” Ms. Friedler said.
Ms. Buolamwini, a young African-American computer scientist, experienced the bias of facial recognition firsthand. When she was an undergraduate at the Georgia Institute of Technology, programs would work well on her white friends, she said, but not recognize her face at all. She figured it was a flaw that would surely be fixed before long.
But a few years later, after joining the M.I.T. Media Lab, she ran into the missing-face problem again. Only when she put on a white mask did the software recognize hers as a face.
By then, face recognition software was increasingly moving out of the lab and into the mainstream.
“O.K., this is serious,” she recalled deciding then. “Time to do something.”
So she turned her attention to fighting the bias built into digital technology. Now 28 and a doctoral student, after studying as a Rhodes scholar and a Fulbright fellow, she is an advocate in the new field of “algorithmic accountability,” which seeks to make automated decisions more transparent, explainable and fair.
Her short TED Talk on coded bias has been viewed more than 940,000 times, and she founded the Algorithmic Justice League, a project to raise awareness of the issue.
In her newly published paper, which will be presented at a conference this month, Ms. Buolamwini studied the performance of three leading face recognition systems (by Microsoft, IBM and Megvii of China) by measuring how well they could guess the gender of people with different skin tones. These companies were selected because they offered gender classification features in their facial analysis software and because their code was publicly available for testing.
She found them all wanting.
To test the commercial systems, Ms. Buolamwini built a data set of 1,270 faces, using faces of lawmakers from countries with a high percentage of women in office. The sources included three African nations with predominantly dark-skinned populations, and three Nordic countries with mainly light-skinned residents.
The African and Nordic faces were scored according to a six-point labeling system used by dermatologists to classify skin types. The medical classifications were determined to be more objective and precise than race.
Then, each company’s software was tested on the curated data, crafted for gender balance and a range of skin tones. The results varied somewhat. Microsoft’s error rate for darker-skinned women was 21 percent, while IBM’s and Megvii’s rates were nearly 35 percent. They all had error rates below 1 percent for light-skinned males.
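The procedure described above (label each face with a skin type and a gender, run the system under test, then report errors per subgroup rather than overall) can be sketched roughly as follows. This is a minimal illustration, not the study's code. The `Face` record, the `subgroup` and `error_rates_by_subgroup` functions, the choice to count types 4 to 6 as "darker", and the toy records are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Face:
    skin_type: int         # 1..6 on the six-point dermatological scale
    true_gender: str       # label assigned when the benchmark was built
    predicted_gender: str  # output of the system under test


def subgroup(face, darker_from=4):
    """Bucket a face by skin tone and gender (the cutoff is an assumption)."""
    tone = "darker" if face.skin_type >= darker_from else "lighter"
    return (tone, face.true_gender)


def error_rates_by_subgroup(faces):
    """Return {subgroup: fraction of faces whose gender was misclassified}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for face in faces:
        key = subgroup(face)
        totals[key] += 1
        if face.predicted_gender != face.true_gender:
            errors[key] += 1
    return {key: errors[key] / totals[key] for key in totals}


# Hypothetical toy records, not data from the study.
sample = [
    Face(1, "male", "male"),
    Face(2, "female", "female"),
    Face(5, "male", "male"),
    Face(6, "female", "male"),
    Face(6, "female", "female"),
]

if __name__ == "__main__":
    for key, rate in sorted(error_rates_by_subgroup(sample).items()):
        print(key, f"{rate:.0%}")
```

Grouping on the pair (skin tone, gender) rather than on either attribute alone is what lets the result show a gap affecting darker-skinned women specifically.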
Ms. Buolamwini shared the research results with each of the companies. IBM said in a statement to her that the company had steadily improved its facial analysis software and was “deeply committed” to “unbiased” and “transparent” services. This month, the company said, it will roll out an improved service with a nearly 10-fold increase in accuracy on darker-skinned women.
Microsoft said that it had “already taken steps to improve the accuracy of our facial recognition technology” and that it was investing in research “to recognize, understand and remove bias.”
Megvii, whose Face++ software is widely used for identification in online payment and ride-sharing services in China, did not reply to several requests for comment, Ms. Buolamwini said.
Ms. Buolamwini is releasing her data set for others to use and build upon. She describes her research as “a starting point, very much a first step” toward solutions.
Ms. Buolamwini is taking further steps in the technical community and beyond. She is working with the Institute of Electrical and Electronics Engineers, a large professional organization in computing, to set up a group to create standards for accountability and transparency in facial analysis software.
She meets regularly with other academics, public policy groups and philanthropies that are concerned about the impact of artificial intelligence. Darren Walker, president of the Ford Foundation, said that the new technology could be a “platform for opportunity,” but that it would not happen if it replicated and amplified bias and discrimination of the past.
“There is a battle going on for fairness, inclusion and justice in the digital world,” Mr. Walker said.
Part of the challenge, scientists say, is that there is so little diversity within the A.I. community.
“We’d have a lot more introspection and accountability in the field of A.I. if we had more people like Joy,” said Cathy O’Neil, a data scientist and author of “Weapons of Math Destruction.”
Technology, Ms. Buolamwini said, should be more attuned to the people who use it and the people it’s used on.
“You can’t have ethical A.I. that’s not inclusive,” she said. “And whoever is creating the technology is setting the standards.”
Agree. Those creating the algorithms and related technology need to be both more diverse and more mindful of the assumptions baked into their analysis and work:
The question of what to do about biases and inequalities in the technology industry is not a new one. The number of women working in science, technology, engineering and mathematics (STEM) fields has always been disproportionately lower than the number of men. What may be more perplexing is, why is it getting worse?
It’s 2017, and yet a review by the American Association of University Women (AAUW) of more than 380 studies from academic journals, corporations and government sources found a major employment gap for women in computing and engineering.
North America, as home to leading centres of innovation and technology, is one of the worst offenders. A report from the Equal Employment Opportunity Commission (EEOC) found “the high-tech industry employed far fewer African-Americans, Hispanics, and women, relative to Caucasians, Asian-Americans, and men.”
However, as an executive working on the front line of technology, focusing specifically on artificial intelligence (AI), I’m one of many hoping to turn the tables.
This issue isn’t confined to new product innovation. It’s also apparent in other aspects of the technology ecosystem – including venture capital. As The Globe highlighted, Ontario-based MaRS Data Catalyst published research on women’s participation in venture capital and found that “only 12.5 per cent of investment roles at VC firms were held by women. It could find just eight women who were partners in those firms, compared with 93 male partners.”
The Canadian government, for its part, is trying to address this issue head on and at all levels. Two years ago, Prime Minister Justin Trudeau campaigned on, and then fulfilled, the promise of having a cabinet with an equal ratio of women to men – a first in Canada’s history. When asked about the outcome from this decision at the recent Fortune Most Powerful Women Summit, he said, “It has led to a better level of decision-making than we could ever have imagined.”
Despite this push, disparities remain apparent in developed countries like Canada: “women earn 11 per cent less than men in comparable positions within a year of completing a PhD in a science, technology, engineering or mathematics [field], according to an analysis of 1,200 U.S. grads.”
AI is the creation of intelligent machines that think and learn like humans. Every time Google predicts your search, when you use Alexa or Siri, or your iPhone predicts your next word in a text message – that’s AI in action.
Many in the industry, myself included, strongly believe that AI should reflect the diversity of its users, and are working to minimize biases found in AI solutions. This should drive more impartial human interactions with technology (and with each other) to combat things like bias in the workplace.
The democratization of technology we are experiencing with AI is great. It’s helping to reduce time-to-market, it’s deepening the talent pool, and it’s helping businesses of all sizes cost-effectively gain access to the most modern technology. The challenge is that a few large organizations are currently developing the AI fundamentals that all businesses can use. Considering this, we must take a step back and ensure the work happening is ethical.
AI is like a great big mirror. It reflects what it sees. And currently, the groups designing AI are not as diverse as we need them to be. While AI has the potential to bring services to everyone that are currently only available to some, we need to make sure we’re moving ahead in a way that reflects our purpose – to achieve diversity and equality. AI can be greatly influenced by human-designed choices, so we must be aware of the humans behind the technology curating it.
At a point when AI is poised to revolutionize our lives, the tech community has a responsibility to develop AI that is accountable and fit for purpose. For this reason, Sage created Five Core Principles to developing AI for business.
At the end of the day, AI’s biggest problem is a social one – not a technology one. But through diversity in its creation, AI will enable better-informed conversations between businesses and their customers.
If we can train humans to treat software better, hopefully, this will drive humans to treat humans better.
DeepMind is launching a team at the University of Alberta, in Edmonton, partly for proximity to the broader AI research community in Canada.
A number of leading AI researchers in Silicon Valley hail from Canada, where they plugged away at deep learning, a complex automated process of data analysis, during a period when that technology — now popular at major tech companies — was considered by the larger computer science community to be a dead end.
“Our hope is that this collaboration will help turbocharge Edmonton’s growth as a technology and research hub,” wrote Hassabis, “attracting even more world-class AI researchers to the region and helping to keep them there too.”
2. The Canadian government is friendlier to AI research than the U.S.
Political realities also make Canada a particularly attractive place for Google to expand its AI efforts.
This is in contrast to the U.S., where President Donald Trump’s 2018 budget request includes drastic cuts to medical and scientific research, including an 11 percent or $776 million cut to the National Science Foundation.
Another contrast to the U.S. is in immigration policies. Canada doesn’t have an equivalent of the U.S. travel ban, which restricts travel for immigrants and refugees from Iran, Libya, Somalia, Sudan, Syria and Yemen. In the U.S., the ban makes it more difficult for tech and academic talent to enter the country.
Something interesting: One of the three researchers leading the team, Dr. Patrick M. Pilarski, is part of the university’s Department of Medicine. Google won’t comment on whether Pilarski’s medical background will play a role in his machine learning work for DeepMind, but Google is working on ways to integrate AI for health care as part of its cloud offering.
As some would have it, robots are poised to take over the world in about 3 … 2 … 1 …
But one machine-learning expert — who is, after all, in a position to know — thinks that’s not the biggest issue facing artificial intelligence. In fact, it’s not an issue at all.
“I am personally not worried about an AI apocalypse, as I consider that a completely made-up fear,” Jeff Dean, a senior fellow at Google, wrote during a Reddit AMA on Aug. 11. “I am concerned about the lack of diversity in the AI research community and in computer science more generally.” (Emphasis his.)
Ding, ding, ding. The issue that the tech industry is trying to maneuver its way around, for better or worse, is the same issue that can stunt the progress of “humanistic thinking” in the development of artificial intelligence, according to Dean.
For the optimists in the audience, Google Brain wants to improve lives, Dean wrote. And how can you improve lives without people with diverse perspectives and backgrounds helping to build and develop the technology you hope will bring about positive change? (Answer: You can’t.)
“One of the things I really like about our Brain Residency program is that the residents bring a wide range of backgrounds, areas of expertise (e.g. we have physicists, mathematicians, biologists, neuroscientists, electrical engineers, as well as computer scientists), and other kinds of diversity to our research efforts,” Dean wrote.
“In my experience, whenever you bring people together with different kinds of expertise, different perspectives, etc., you end up achieving things that none of you could do individually, because no one person has the entire skills and perspective necessary.”
But this hand-wringing is a distraction from the very real problems with artificial intelligence today, which may already be exacerbating inequality in the workplace, at home and in our legal and judicial systems. Sexism, racism and other forms of discrimination are being built into the machine-learning algorithms that underlie the technology behind many “intelligent” systems that shape how we are categorized and advertised to.
Take a small example from last year: Users discovered that Google’s photo app, which applies automatic labels to pictures in digital photo albums, was classifying images of black people as gorillas. Google apologized; it was unintentional.
But similar errors have emerged in Nikon’s camera software, which misread images of Asian people as blinking, and in Hewlett-Packard’s web camera software, which had difficulty recognizing people with dark skin tones.
This is fundamentally a data problem. Algorithms learn by being fed certain images, often chosen by engineers, and the system builds a model of the world based on those images. If a system is trained on photos of people who are overwhelmingly white, it will have a harder time recognizing nonwhite faces.
A very serious example was revealed in an investigation published last month by ProPublica. It found that widely used software that assessed the risk of recidivism in criminals was twice as likely to mistakenly flag black defendants as being at a higher risk of committing future crimes. It was also twice as likely to incorrectly flag white defendants as low risk.
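The two findings above correspond to two different error rates: wrongly flagging someone as high risk is a false positive, and wrongly flagging someone as low risk is a false negative. The sketch below shows how each can be computed per group. The function names and the toy counts are hypothetical, chosen only to reproduce the "twice as likely" pattern. They are not ProPublica's data or method.

```python
def make_records(fp, tn, fn, tp):
    """Build a list of (flagged_high_risk, reoffended) pairs from four counts."""
    return (
        [(True, False)] * fp      # flagged high risk, did not reoffend
        + [(False, False)] * tn   # flagged low risk, did not reoffend
        + [(False, True)] * fn    # flagged low risk, did reoffend
        + [(True, True)] * tp     # flagged high risk, did reoffend
    )


def error_rates(records):
    """Return (false_positive_rate, false_negative_rate) for a list of records."""
    false_pos = sum(1 for flagged, reoffended in records if flagged and not reoffended)
    false_neg = sum(1 for flagged, reoffended in records if not flagged and reoffended)
    did_not_reoffend = sum(1 for _, reoffended in records if not reoffended)
    did_reoffend = sum(1 for _, reoffended in records if reoffended)
    return false_pos / did_not_reoffend, false_neg / did_reoffend


def accuracy(records):
    """Fraction of records where the flag matched the outcome."""
    return sum(1 for flagged, reoffended in records if flagged == reoffended) / len(records)


# Hypothetical counts for two groups of 20 defendants each.
group_a = make_records(fp=4, tn=6, fn=2, tp=8)
group_b = make_records(fp=2, tn=8, fn=4, tp=6)

if __name__ == "__main__":
    for name, records in (("group A", group_a), ("group B", group_b)):
        fpr, fnr = error_rates(records)
        print(f"{name}: FPR={fpr:.0%} FNR={fnr:.0%} accuracy={accuracy(records):.0%}")
```

In this toy data both groups are scored with the same overall accuracy, yet group A is wrongly flagged as high risk twice as often and group B is wrongly flagged as low risk twice as often. A single accuracy figure would not reveal either gap.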
The reason those predictions are so skewed is still unknown, because the company responsible for these algorithms keeps its formulas secret — it’s proprietary information. Judges do rely on machine-driven risk assessments in different ways — some may even discount them entirely — but there is little they can do to understand the logic behind them.
Police departments across the United States are also deploying data-driven risk-assessment tools in “predictive policing” crime prevention efforts. In many cities, including New York, Los Angeles, Chicago and Miami, software analyses of large sets of historical crime data are used to forecast where crime hot spots are most likely to emerge; the police are then directed to those areas.
At the very least, this software risks perpetuating an already vicious cycle, in which the police increase their presence in the same places they are already policing (or overpolicing), thus ensuring that more arrests come from those areas. In the United States, this could result in more surveillance in traditionally poorer, nonwhite neighborhoods, while wealthy, whiter neighborhoods are scrutinized even less. Predictive programs are only as good as the data they are trained on, and that data has a complex history.
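The cycle described above can be illustrated with a deliberately simple toy model. It assumes two areas with identical underlying crime, a rule that sends patrols to whichever area has the most recorded incidents, and the premise that incidents are only recorded where the police are present. None of this is taken from any real department's software. The function name, the area names and the numbers are all hypothetical.

```python
def simulate_patrols(recorded, true_rate, days):
    """Toy feedback loop: each day, patrol the area with the most records.

    recorded  -- dict of area -> incidents recorded so far (the historical data)
    true_rate -- dict of area -> incidents that actually occur there per day
    Only incidents in the patrolled area are observed and added to the records.
    """
    recorded = dict(recorded)  # work on a copy so the input is left unchanged
    for _ in range(days):
        hot_spot = max(recorded, key=recorded.get)
        recorded[hot_spot] += true_rate[hot_spot]
    return recorded


# Hypothetical starting point: a small gap in the historical records,
# but exactly the same amount of actual crime in both areas.
history = {"north": 12, "south": 10}
true_rate = {"north": 5, "south": 5}

result = simulate_patrols(history, true_rate, days=30)

if __name__ == "__main__":
    print("before:", history)
    print("after: ", result)
```

Because the area with slightly more historical records is patrolled every day, its count keeps growing while the other area's count never changes, even though the underlying rates are equal. Real systems are more complex, but the sketch shows how a small initial skew in the data can sustain itself.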
Histories of discrimination can live on in digital platforms, and if they go unquestioned, they become part of the logic of everyday algorithmic systems. Another scandal emerged recently when it was revealed that Amazon’s same-day delivery service was unavailable for ZIP codes in predominantly black neighborhoods. The areas overlooked were remarkably similar to those affected by mortgage redlining in the mid-20th century. Amazon promised to redress the gaps, but it reminds us how systemic inequality can haunt machine intelligence.
And then there’s gender discrimination. Last July, computer scientists at Carnegie Mellon University found that women were less likely than men to be shown ads on Google for highly paid jobs. The complexity of how search engines show ads to internet users makes it hard to say why this happened — whether the advertisers preferred showing the ads to men, or the outcome was an unintended consequence of the algorithms involved.
Regardless, algorithmic flaws aren’t easily discoverable: How would a woman know to apply for a job she never saw advertised? How might a black community learn that it was being overpoliced by software?
We need to be vigilant about how we design and train these machine-learning systems, or we will see ingrained forms of bias built into the artificial intelligence of the future.
Like all technologies before it, artificial intelligence will reflect the values of its creators. So inclusivity matters — from who designs it to who sits on the company boards and which ethical perspectives are included. Otherwise, we risk constructing machine intelligence that mirrors a narrow and privileged vision of society, with its old, familiar biases and stereotypes.
If we look at how systems can be discriminatory now, we will be much better placed to design fairer artificial intelligence. But that requires far more accountability from the tech community. Governments and public institutions can do their part as well: As they invest in predictive technologies, they need to commit to fairness and due process.
While machine-learning technology can offer unexpected insights and new forms of convenience, we must address the current implications for communities that have less power, for those who aren’t dominant in elite Silicon Valley circles.
Currently the loudest voices debating the potential dangers of superintelligence are affluent white men, and, perhaps for them, the biggest threat is the rise of an artificially intelligent apex predator.
But for those who already face marginalization or bias, the threats are here.