Algorithms and False Positives

September 13, 2017
Posted by Jay Livingston

Can face-recognition software tell if you’re gay?

Here’s the headline from The Guardian a week ago.


Yilun Wang and Michal Kosinski at Stanford’s School of Business have written an article showing that artificial intelligence – machines that can learn from their experiences – can develop algorithms to distinguish the gay from the straight. Kosinski goes farther. According to Business Insider,
He predicts that self-learning algorithms with human characteristics will also be able to identify:
  • a person’s political beliefs
  • whether they have high IQs
  • whether they are predisposed to criminal behaviour
When I read that last line, something clicked. I remembered that a while ago I had blogged about an Israeli company, Faception, that claimed its face recognition software could pick out the faces of terrorists, professional poker players, and other types. It all reminded me of Cesare Lombroso, the Italian criminologist. Nearly 150 years ago, Lombroso claimed that criminals could be distinguished by the shape of their skulls, ears, noses, chins, etc. (That blog post, complete with pictures from Lombroso’s book, is here.) So I was not surprised to learn that Kosinski had worked with Faception.

For a thorough (3000 word) critique of the Wang-Kosinski paper, see Greggor Mattson’s post at Scatterplot. The part I want to emphasize here is the problem of False Positives.

Wang-Kosinski tested their algorithm by showing a series of paired pictures from a dating site. In each pair, one person was gay, the other straight. The task was to guess which was which. The machine’s accuracy was roughly 80% – much better than guessing randomly and better than the guesses made by actual humans, who got about 60% right. (These are the numbers for photos of men only. The machine and humans were not as good at spotting lesbians. In my hypothetical example that follows, assume that all the photos are of men.)

But does that mean that the face-recognition algorithm can spot the gay person? The trouble with Wang-Kosinski’s gaydar test was that it created a world where half the population was gay. For each trial, people or machine saw one gay person and one straight.

Let’s suppose that the machine had an accuracy rate of 90%. Let’s also present the machine with a 50-50 world. Looking at the 50 gays, the machine will guess correctly on 45. These are “True Positives.” It identified them as gay, and they were gay. But it will also classify 5 of the gay people as not-gay. These are the False Negatives.

It will have the same ratio of true and false for the not-gay population. It will correctly identify 45 of the not-gays (True Negatives), but it will guess incorrectly that 5 of these straight people are gay (False Positive).


It looks pretty good. But how well will this work in the real world, where the gay-straight ratio is nowhere near 50-50? Just what that ratio is depends on definitions. But to make the math easier, I’m going to use 5% as my estimate. In a sample of 1000, only 50 will be gay. The other 950 will be straight.

Again, let’s give the machine an accuracy rate of 90%. For the 50 gays, it will again have 45 True Positives and 5 False Negatives. But what about the 950 not-gays? It will be correct 90% of the time and identify 855 of them as not-gay (True Negatives). But it will also guess incorrectly that 10% are gay. That’s 95 False Positives.


The number of False Positives is more than double the number of True Positives. The overall accuracy may be 90%, but when it comes to picking out gays, the machine is wrong far more often than it’s right.
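The arithmetic for both hypothetical worlds can be checked with a short Python sketch. The function name confusion_counts and its parameters are hypothetical, made up for this illustration. The sketch assumes, as the example above does, that the machine is equally accurate on gay and straight faces. It is not the Wang-Kosinski algorithm, only the bookkeeping.

```python
def confusion_counts(population, base_rate, accuracy):
    """Return (TP, FN, TN, FP) for a hypothetical classifier.

    Assumes the same accuracy on the gay and the not-gay group,
    as in the worked example in the text.
    """
    positives = round(population * base_rate)   # people who are gay
    negatives = population - positives          # people who are not
    tp = round(positives * accuracy)            # gay, guessed gay
    fn = positives - tp                         # gay, guessed not-gay
    tn = round(negatives * accuracy)            # not-gay, guessed not-gay
    fp = negatives - tn                         # not-gay, guessed gay
    return tp, fn, tn, fp


worlds = [("50-50 world", 100, 0.50), ("5% world", 1000, 0.05)]
for label, population, base_rate in worlds:
    tp, fn, tn, fp = confusion_counts(population, base_rate, 0.90)
    right_when_guessing_gay = tp / (tp + fp)
    print(f"{label}: TP={tp} FN={fn} TN={tn} FP={fp} "
          f"correct 'gay' guesses={right_when_guessing_gay:.0%}")
```

In the 50-50 world, 90% of the machine's "gay" guesses are right. In the 5% world, with the same 90% accuracy, only about a third of them are (45 out of 140).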

The rarer the thing that you’re trying to predict, the greater the ratio of False Positives to True Positives. And those False Positives can have bad consequences. In medicine, a false positive diagnosis can lead to unnecessary treatment that is physically and psychologically damaging. As for politics and policy, think of the consequences if the government goes full Lombroso and uses algorithms for predicting “predisposition to criminal behavior.”
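The claim about rarity can be sketched in a few lines of Python. The function fp_to_tp_ratio is a hypothetical helper, and it rests on the same simplifying assumption as the example above: one accuracy figure that applies equally to both groups. Under that assumption the ratio of False Positives to True Positives is (1 - base rate) x (1 - accuracy) divided by (base rate x accuracy).

```python
def fp_to_tp_ratio(base_rate, accuracy):
    """False Positives per True Positive, assuming equal accuracy on both groups."""
    false_positive_share = (1 - base_rate) * (1 - accuracy)
    true_positive_share = base_rate * accuracy
    return false_positive_share / true_positive_share


# Hold accuracy at 90% and make the trait rarer and rarer.
for base_rate in (0.50, 0.20, 0.05, 0.01):
    ratio = fp_to_tp_ratio(base_rate, 0.90)
    print(f"base rate {base_rate:.0%}: {ratio:.2f} False Positives per True Positive")
```

At a 5% base rate the ratio is 95/45, a little over 2, which matches the example. At 1% it climbs to 11 False Positives for every True Positive, even though the machine is still "90% accurate."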

Smartphones and Teen Existential Angst

September 12, 2017
Posted by Jay Livingston

I’ve been wondering about America’s youth, mostly because of the Atlantic article by Jean Twenge: “Have Smartphones Destroyed a Generation?” (Previous posts are here and here.)
As the title of the article suggests, we’ve got trouble.

Around 2012, I noticed abrupt shifts in teen behaviors and emotional states. The gentle slopes of the line graphs became steep mountains and sheer cliffs . . . At first I presumed these might be blips, but the trends persisted, across several years and a series of national surveys. The changes weren’t just in degree, but in kind.

Twenge shows how kids differ from those of just a few years ago in how they spend their time – less dating, driving, and hanging out with peers, and more time on their phones, tablets, and computers. These changes in behavior, Twenge claims, have psychological consequences.

The biggest difference between the Millennials and their predecessors was in how they viewed the world. . . .
There is compelling evidence that the devices we’ve placed in young people’s hands are having profound effects on their lives—and making them seriously unhappy.


I went to the archives of Monitoring the Future, the only source of systematic data that Twenge mentions. It surveys kids in 8th, 10th, and 12th grades. I looked only at the data on 12th graders. One of the MTF questions asks kids whether they agree with the statement, “It feels good to be alive.” The choices are Agree, Mostly Agree, Neither, Mostly Disagree, Disagree. So few kids chose either of the Disagree categories (4-6%) that I combined them with Neither.


In the most recent year, these depressive categories accounted for only 18% of 12th graders. All the others agreed – 51% gave unqualified agreement, another 20% “mostly” agreed. More important for Twenge’s argument, the graph lines do not fall off a cliff in 2012 or in any other year. There’s a slow decline 2012-2015, but the numbers in the most recent year are very similar to what they were before smartphones and social media.

Monitoring the Future also asks a question that would seem to tap depression, or at least existential despair*: “Life often seems meaningless.” The levels of agreement are the same ones as for “Good to be alive,” but the distribution of answers is more even.


Again, the sunnier choices carry the day. Those who “Disagree” categorically outnumber all others, followed by those who disagree but with some reservations. And again, the MTF data shows no dramatic changes.

So, “Have Smartphones Destroyed a Generation?” As I said in a previous post on this topic,
Whenever the title of a book or article is phrased as a question, two things are almost certain:
  • The author thinks that the answer to the question is “Yes.”
  • The more accurate answer is “No.”
When it comes to finding life meaningful or worth living, teens today are no different from those teens twenty years ago who were sans iPhones, sans Facebook, sans Instagram, sans cyber-everything.
----------------------------
* “In the view of the existentialist, the individual's starting point is characterized by what has been called ‘the existential attitude’, or a sense of disorientation, confusion, or dread in the face of an apparently meaningless or absurd world.” (Existentialism, Wikipedia, https://en.wikipedia.org/wiki/Existentialism)

America’s Not-So-Lost Youth

September 10, 2017
Posted by Jay Livingston

It seems that we never tire of experts like Prof. Harold Hill, the con artist in “The Music Man,” warning us about the temptations that threaten to lead our children astray. That musical was set in Iowa a century ago, and when Prof. Hill told the good people of River City, “Ya got trouble, my friends,” the culprit was a pool table. I’m old enough to remember when the menace was comic books. Today it’s social media. All those kids spending so much time on Facebook, Instagram, and iPhones – surely that can’t be good.

Last month, The Atlantic ran an article in full Music Man mode – “Have Smartphones Destroyed a Generation?” by Jean Twenge.


I blogged my skepticism (here). Twenge’s previous alarmist reports – The Narcissism Epidemic, for example – had not held up well against the evidence. But I had not been able to deal with the data sets from Monitoring the Future (MTF) that Twenge used for evidence about the destruction supposedly being wrought by iPhones. I didn’t know it at the time, but Alexandra Samuel had already done some of the work. (Her article is at JSTOR – here.)

Twenge acknowledges that kids today cause far less trouble than did their counterparts of earlier generations. Juvenile crime is way down. The same goes for pregnancy, drugs, and abortion. But, says Twenge, the kids are not all right. They are desperately unhappy. Or as the Atlantic sub-head puts it, they are “on the brink of a mental-health crisis.” Ya got trouble, my friends.

According to Twenge, the crucial year is 2012. “Around 2012, I noticed abrupt shifts in teen behaviors and emotional states.” It turns out that the MTF survey of kids does not have many mental-health items – nothing about anxiety or depression. It does ask about happiness. Here is a graph from Alexandra Samuel’s article. The survey asks kids how happy they are generally – Very Happy, Pretty Happy, or Not Too Happy.


The biggest winner by far is Pretty Happy, chosen by 60-65%, a proportion that has not changed much since the first years of the survey. Since 2012, the percent reporting that they are Very Happy has decreased by perhaps 4 percentage points. Not Too Happy has increased by 2-3 percentage points. This hardly seems like the leading edge of a mental-health crisis.

As for the insidious effects of Facebook, Instagram and the rest, Samuel has a graph comparing kids who spend more time with social media (> 10 hours a week) and those who spend less. This too doesn’t do much to support Twenge’s claim that iPhones and the like are making kids “seriously unhappy.”


I don’t doubt that social media and smartphones have changed the way kids live their lives. Twenge presents evidence that kids are spending less time hanging out with peers, that they feel less pressure to drive a car, and that dating and sex are on the decline. I’d like to check the MTF data, but assuming Twenge’s report is accurate, are these trends a sign of a pending crisis in mental health? I seem to remember Harold Hill types warning about the dangers of peer groups, cars (remember those warnings about hot rodders?), and of course sex. The Twenge types of a few decades ago were warning that kids were spending too much time with peers, unsupervised by adults. “Peer pressure” was always the source of bad behavior, never good. And adults fretted that this pressure was forcing kids to “grow up too fast” (cars, sex). So if social media has made it easier for kids to escape these peer groups, become less invested in cars, and have less premarital sex, maybe these trends are not harbingers of a coming crisis in the mental health of America’s youth.

Look What You Made Me Do

September 4, 2017
Posted by Jay Livingston

The Fundamental Attribution Error occurs when we attribute too much cause to the individual while ignoring the power of the situation. But there is a second attribution error – perhaps not as fundamental, but still important.

The central idea in attribution theory is this: when people* explain why another person did something, they attribute the behavior to causes within the person – their personality or other traits. The person behaved bravely because he is brave or dishonestly because he is sneaky, or affably because she is outgoing, and so on. But when people explain their own behavior, they cite external factors – specific or vague aspects of the situation. They rarely say or think: I did it because I’m brave, outgoing, sneaky, etc. Instead they think they did what most people in the same situation would do. It’s all about the situation, not about me. When we make the fundamental attribution error, we leap too quickly from the behavior we observe to conclusions about the person’s character.

The second type of attribution error can occur when we think about our own behavior and attribute too much power to external forces while ignoring or denying our own ability to exercise free will. For example, my syllabus says explicitly that I base grades on the total points from tests and papers. Attendance matters only for point totals at the borderline between letter grades. There is no attendance requirement. But when I ask a student, “Why did you come to class?” the answer is often, “I had to.” Given a few seconds to reflect, the student might come up with an answer more consistent with the facts. Still, that first and more-or-less automatic answer reveals the basic assumption we make about why we’ve done something: I had to.

Two worst-date stories I heard recently on a podcast (“Unorthodox”) reminded me of this second attribution error. I’ve added edited transcripts, but you should really listen to the audio clips to get a better sense of the story and the reactions of the podcast interviewers.




It was really terrible . . .  And after it was done, I definitely did not want to go out again. And I was getting out of the car, and I said something like, “Hey, thanks. Have a great night,” sort of mumbled that, and he thought I’d said something like, “I had a great night.” So he goes, “Me too. Would you like to go out for breakfast tomorrow?” And I died inside, and somehow that was taken for a yes. So I had to go out with him again.

The guy got the wrong impression, but rather than correct him – not in the immediate situation and not afterwards by sending a text – she chose to endure a second date the next morning. (Of course, she didn’t see it as a choice. In her view, she had to.)

Here’s worst-date #2.



                                           
I went out with a guy, and he took me to a fancy restaurant. And he was dressed sort of like a hillbilly. And he wouldn’t speak, and there was a lot of awkward silences. And I asked him, “Why are there so many awkward silences?” And he goes like, “I feel comfortable with silence. I think we should feel comfortable with silence.” And then he proceeded not to talk for the rest of the date as a test to our relationship.

And then he took me to the Marriott Marquis where there’s this rotating lounge on the top floor. But what he failed to mention was that he’s extremely phobic of heights. So when we went into the glass elevator, he started having a panic attack. And when we got out on the 42nd floor, I was coaching him, telling him to breathe. He’s asking me where we’re going in our relationship. . . .

But the kicker is that we had to walk down forty-two flights of stairs.

I wonder if gender makes a difference in these dating fiascos where the man and woman have very different perceptions of what the relationship is – that is, what the roles are and therefore who is supposed to do what. Women may think, “I don’t like the role you make me play,” but play it they do. Would a man behave differently? Would he say, “Look, I’ve gotten you through your anxiety attack, and I’m really sorry you suffered like that. But this is not going to be a relationship, and I’m certainly not going to walk down forty-two frickin’ flights of stairs. If you can join me in the elevator, we can leave together. If not, I’ll just say good-bye now.”

I don’t know of any systematic evidence on gender as it relates to dealing with bad dates. I guess I’ll have to pay more attention to Todd and Jayde’s “Blown Off” segment.

-----------------
* Cultures may vary on this tendency. Most of the evidence comes from the US and perhaps other Western countries, and there is some evidence that Asians may be more likely to consider situational factors when thinking about the causes of other people’s behavior.