Discrimination in Online Ad Delivery
By Dr. Latanya Sweeney via Data Privacy Lab
A Google search for a person’s name, such as “Trevon Jones”, may yield a personalized ad for public records about Trevon. The ad copy may be neutral, such as “Looking for Trevon Jones? …”, or suggestive of an arrest record, such as “Trevon Jones, Arrested? …”. This paper investigates the delivery of these kinds of ads by Google AdSense, using a sample of racially associated names, and finds statistically significant discrimination in ad delivery based on searches of 2184 racially associated personal names across two websites.
First names, previously identified by others as being assigned at birth to more black or white babies, are found predictive of race (88 percent black, 96 percent white). Names assigned primarily to black babies, such as DeShawn, Darnell and Jermaine, generated ads suggestive of an arrest in 81 to 86 percent of name searches on one website and 92 to 95 percent on the other, while names assigned at birth primarily to whites, such as Geoffrey, Jill and Emma, generated more neutral copy: the word “arrest” appeared in 23 to 29 percent of name searches on one site and 0 to 60 percent on the other. On the more ad-trafficked website, a black-identifying name was 25 percent more likely to get an ad suggestive of an arrest record. A few names did not follow these patterns: Dustin, a name predominantly given to white babies, generated an ad suggestive of arrest 81 and 100 percent of the time on the two sites.
All ads return results for actual individuals, and ads appear regardless of whether the name has an arrest record in the company’s database. Notwithstanding these findings, the company maintains that Google received the same ad text for groups of last names (not first names), raising questions as to whether Google’s advertising technology exposes racial bias in society and how ad and search technology can develop to assure racial fairness.
Sweeney L. Discrimination in Online Ad Delivery. Data Privacy Lab White Paper 1071-1. Harvard University. Cambridge. January 2013. Available at SSRN: http://ssrn.com/abstract=2208240, and arXiv: http://arxiv.org/abs/1301.6822
Keywords: online advertising, public records, racial discrimination, data privacy, information retrieval, computers and society, search engine marketing
Frequently Asked Questions
1. Isn’t the arrest rate of blacks higher anyway?
The ads appear regardless of whether the company sponsoring the ad has a criminal record for the name. The appearance of the ads is not related to any arrest statistics or the like.
2. What is racism?
From the paper: “Racial discrimination results when a person or group of people is treated differently based on their racial origins. Power is a necessary precondition, for it depends on the ability to give or withhold benefits, facilities, services, opportunities etc., from someone who should be entitled to them, and are denied on the basis of race. Institutional or structural racism is a system of procedures/patterns whose effect is to foster discriminatory outcomes or give preferences to members of one group over another.”
Notice that racism can result even when it is not intentional.
The EEOC provides a test for a charge of discrimination in employment cases. To make a determination, the EEOC uses an “adverse impact test,” which measures whether practices, intentional or not, have a disproportionate effect on one group. If the ratio of the rates at which a practice affects the two groups is less than 80%, the employer may be held responsible for discrimination. These ads are not necessarily used for employment, and the computation here is for reference only: the ratios for the appearance of the ads at the two websites were 77% and 40%, both below the 80% threshold and thus showing adverse impact.
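The four-fifths computation described above can be sketched in a few lines. The function names and the example rates below are illustrative assumptions, not figures taken from the study:

```python
# Sketch of the EEOC "four-fifths" (80%) adverse impact test.
# The example rates are hypothetical placeholders, not data from the paper.

def adverse_impact_ratio(rate_a: float, rate_b: float) -> float:
    """Ratio of the lower rate to the higher rate for the two groups."""
    return min(rate_a, rate_b) / max(rate_a, rate_b)

def shows_adverse_impact(rate_a: float, rate_b: float,
                         threshold: float = 0.80) -> bool:
    """True if the ratio falls below the 80% (four-fifths) threshold."""
    return adverse_impact_ratio(rate_a, rate_b) < threshold

# Example: if neutral ads appear for 60% of searches on names in one group
# but 78% of searches on names in another, the ratio is about 0.77 (< 0.80),
# indicating adverse impact under the four-fifths rule.
print(round(adverse_impact_ratio(0.60, 0.78), 2))   # 0.77
print(shows_adverse_impact(0.60, 0.78))             # True
```

Note the test compares rates, not raw counts, so the two groups need not be the same size.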
3. Who is to blame?
The current study documents and observes that there is discrimination in the delivery of the ads. We do not yet know why the discriminatory effect occurs, and we have work underway to help us better understand what may be happening. Possible explanations include: Did the company provide ad texts suggestive of arrest disproportionately for black-identifying names? Did the company provide roughly the same ad texts evenly across racially associated names, but society clicked ads suggestive of arrest more often for black-identifying names? Google uses cloud-caching strategies to deliver ads quickly; might these strategies bias ad delivery toward ad texts previously loaded in the cloud cache? Is there a combinatorial effect?
4. What is a black-sounding name?
The study uses first names to predict race. First names that have the highest ratio of frequency in one racial group to frequency in the other racial group can be racially identifying. The first names used in the study came from earlier research which computed comparative frequencies from birth records. The study compared the online images of people having these first names to the race predicted and found these first names predictive of race (88% black, 96% white). The paper provides a complete breakdown by first name.
As examples, searches for people having the first names DeShawn, Darnell and Jermaine generated ads suggestive of an arrest in 81 to 86 percent of name searches on one website and 92 to 95 percent on the other, while names assigned at birth primarily to whites, such as Geoffrey, Jill and Emma, generated more neutral copy: the word “arrest” appeared in 23 to 29 percent of name searches on one site and 0 to 60 percent on the other. A few names did not follow these patterns: Dustin, a name predominantly given to white babies, generated an ad suggestive of arrest 81 and 100 percent of the time on the two sites.
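The frequency-ratio idea described above can be sketched as follows. The names, counts, and the cutoff of 10 are invented for illustration; the study itself used published birth-record frequencies:

```python
# Sketch: flagging a first name as racially identifying by the ratio of its
# frequency in one group's birth records to its frequency in the other's.
# All numbers here are hypothetical, not the study's data.

def frequency_ratio(freq_group_a: float, freq_group_b: float) -> float:
    """How many times more common a name is in group A than in group B."""
    return freq_group_a / freq_group_b if freq_group_b else float("inf")

# Hypothetical per-10,000-births frequencies for two made-up names.
names = {
    "NameX": (42.0, 0.5),   # far more frequent in group A
    "NameY": (1.1, 0.9),    # roughly even across groups
}

for name, (fa, fb) in names.items():
    r = frequency_ratio(fa, fb)
    # An arbitrary illustrative cutoff: a very lopsided ratio marks the
    # name as racially identifying; a ratio near 1 marks it as ambiguous.
    label = "identifying" if (r > 10 or r < 0.1) else "ambiguous"
    print(name, round(r, 1), label)
```

A name like “NameX”, eighty-some times more frequent in one group, would be flagged as identifying, while “NameY”, appearing at nearly equal rates, would not.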
5. How can I see these ads?
The best way to view ad delivery right now, or to see what ads appear with your own name, is to go to a site that serves “Ads by Google”. Ads appear much more often on these other sites than on google.com, a finding reported in the paper. Try entering a name in the search bar at reuters.com, one of the websites used in the study.
Ads are based on the first and last names of real people. Ads suggestive of arrest may appear even if there is no criminal record for the person in the company’s database.
As of January 6, 2013, arrest ads continue to appear.
6. What is the harm?
Whenever someone queries your name in a search engine, one of these ads may appear. Perhaps you are in competition for an award, an appointment, a promotion, or a new job; maybe you are in a position of trust, such as a professor, a physician, a banker, a judge, a manager, or a volunteer; or perhaps you are completing a rental application, selling goods, applying for a loan, joining a social club, making new friends, dating, or engaged in any one of the hundreds of circumstances for which an online searcher seeks to learn more about you. Appearing alongside your list of accomplishments is an advertisement implying you may have a criminal record, whether you actually have one or not. Worse, the ads may not appear for your competitors.
7. Why did you do this work?
One day a colleague of mine entered my office and needed to locate one of my old papers. He entered my name into a search engine, and up popped an ad: “Latanya Sweeney. Arrested?”. I was shocked. I have never been arrested, and after clicking the link and paying the requisite fee, I found the company had no arrest record for anyone with my name either. We then entered his name, Adam Tanner, a white male name, and an ad for the same company appeared, except it had no mention of arrest or criminal record; it just said they had information about him. So we entered more and more names, and eventually my colleague jumped to the conclusion that ads suggestive of arrest were appearing more often for black-sounding names than white-sounding names. I did not believe him, but nothing I tried refuted his hypothesis. Eventually, I said I have to look at this scientifically. Step one: what is a black-sounding name? And so on. Throughout the work, I did not believe the pattern would be present, until there was no mistaking it. There was statistically significant discrimination in ad delivery based on searches of 2184 racially associated personal names across two websites. On the more ad-trafficked website, a black-identifying name was 25% more likely to get an ad suggestive of an arrest record. There was less than a 0.1% probability that these data could be explained by chance.
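A significance claim like “less than 0.1% probability of chance” typically comes from a test such as chi-square on a 2x2 table of outcomes. A minimal sketch, with invented counts rather than the study’s actual data:

```python
# Sketch of a chi-square test of independence on a 2x2 table:
# rows = name group, columns = ad type ("arrest" vs neutral).
# The counts below are hypothetical, not taken from the paper.

def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic for the 2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts: arrest-ad vs neutral-ad tallies for two name groups.
stat = chi_square_2x2(488, 73, 255, 191)

# For 1 degree of freedom, the critical value for p < 0.001 is 10.828;
# a statistic above it means less than a 0.1% probability under chance.
print(stat > 10.828)   # True for these illustrative counts
```

In practice a library routine such as `scipy.stats.chi2_contingency` would be used, which also returns the exact p-value.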
8. Where can I learn more about your work?
More papers will appear on this topic as the work unfolds. Adam Tanner and I are writing a book about this and other personal data experiments. You can check out the latest with me on my website, latanyasweeney.org.