
Meanness and Means

April 2, 2010
Posted by Jay Livingston

On March 27, the Times ran an op-ed by David Elkind, “Playtime is Over,” about the causes of bullying:

it seems clear that there is a link among the rise of television and computer games, the decline in peer-to-peer socialization and the increase of bullying in our schools.
I was skeptical. Had there really been an increase in bullying? Elkind offered no evidence. He cited numbers for recent years (school absences attributable to bullying), but he had no comparable data for the pre-computer or pre-TV eras. Maybe he was giving a persuasive explanation for something that didn’t exist.

I sent the Times a letter expressing my doubts. They didn’t publish it. Elkind is, after all, a distinguished psychologist, author of many books on child development. As if to prove the point, three days later, the big bullying story broke. An Irish girl in South Hadley, Massachusetts, committed suicide after having been bullied by several other girls in her high school. The nastiness had included Facebook postings and text messages.

I guess Elkind was right, and I was wrong. Bullying has really exploded out of control in the electronic age.

But today the op-ed page features “The Myth of Mean Girls,” by Mike Males and Meda Chesney-Lind. They look at all the available systematic evidence on nastiness by teenagers – crime data (arrests and victimizations), surveys on school safety, the Monitoring the Future survey, and the CDC’s Youth Risk Behavior Surveillance. They all show the same trend:
This mythical wave of girls’ violence and meanness is, in the end, contradicted by reams of evidence from almost every available and reliable source.
Worse, say the authors, the myth has had unfortunate consequences:

. . . more punitive treatment of girls, including arrests and incarceration for lesser offenses like minor assaults that were treated informally in the past, as well as alarmist calls for restrictions on their Internet use.*
This is not to say that bullying is O.K. and nothing to worry about. Mean girls exist. It’s just that the current generation has fewer of them than did their parents’ generation. Should we focus on the mean or on the average? On average, the kids are not just all right; they’re nicer. Funny that nobody is offering explanations of how the Internet and cell phones might have contributed to this decline in meanness.

*For a recent example, see my post about criminal charges brought against young teenage girls for “sexting,” even though the pictures showed no naughty bits.


UPDATE: At Salon.com, Sady Doyle argues that Lind and Males looked at the wrong data.

Unfortunately, cruelty between girls can't really be measured with the hard crime statistics on which Males and Lind's argument relies. . . . Bullying between teenage girls expresses itself as physical fighting less often than it does as relational aggression, a soft and social warfare often conducted between girls who seem to be friends. You can't measure rumors, passive-aggressive remarks, alienation and shaming with statistics.
She has a point. While most of the evidence Males and Lind cite is not “hard crime statistics,” it does focus on overt violence. But Doyle is wrong that you can’t measure “relational aggression.” If something exists, you can measure it. The problem is that your measure might not be valid enough to be of use.

If Doyle is right, if nonphysical bullying hasn’t been measured, that doesn’t mean that Males and Lind are wrong and that bullying has in fact increased. It means that we just don’t know. We do know that physical violence has decreased. So here are the possibilities.

  1. Physical and nonphysical aggression are inversely related. Girls have substituted nonphysical aggression for physical aggression – social bullying has increased.
  2. Less serious forms of aggression usually track with more serious forms (nationwide, the change in assault rates runs parallel to the change in murder rates). So we can use rates of physical aggression as a proxy for rates of bullying – social bullying has decreased.
  3. Physical and nonphysical aggression are completely unrelated, caused by different factors and found in different places – the change in social bullying is anybody’s guess.

How Much is Three Percent?

March 11, 2010
Posted by Jay Livingston

The Freakonomics blog today assures us that emergency room overutilization is a “myth.” All that talk about the uninsured doing what George W. Bush suggested and using the emergency rooms as primary care, that’s just baseless scare tactics. Citing a Slate article, they give the data:
E.R. care represents less than 3 percent of healthcare spending, only 12 percent of E.R. visits are non-urgent, and the majority of E.R. patients are insured U.S. citizens, not uninsured, illegal immigrants.
That “majority” might be 99.9% or it might be 50.1%. It turns out that the uninsured account for about 20% of E.R. visits.

My trouble is that I never know if those percents are a lot or a little. Take that 3% of spending. I’m not an economist, and although I haven't done the math, I figure that 3% of $2.3 trillion might still be a significant chunk of change. So just to make sure that 3% was in fact a pittance, a part of the “emergency room myth,” I looked for other Freakonomics articles with a similar number.

  • foreclosure rates began a steady rise from 1.7 percent in 2005 to 2.8 percent in 2007. [Three percent of healthcare spending is a little; 2.8% of mortgages is a lot.]
  • I was surprised at how high the fees were. . . . Even on big-ticket items like airline tickets, the credit-card company collects nearly 3 percent. [Three percent of healthcare spending is a little; 3% of an airline ticket is a lot.]
  • The homeownership rate in the U.S. increased by 3 percentage points over the past decade — a clear break from the two previous decades of stagnation. [Three percent of healthcare spending is a little; 3% of homeownership is a lot.]

You get the idea. Maybe whether 3% is a lot or a little depends on its political use. I don’t follow the Freakonomics bloggers’ political views closely, but I’m guessing that they don’t like Hugo Chavez down in Venezuela.
opposition voters [those who opposed Chavez] experienced a 5 percent drop in earnings and a 1.5 percent drop in employment rates after their names were released. The authors also conclude that the retaliatory measures may have cost Venezuela up to 3 percent of G.D.P. due to misallocation of workers across jobs.
Chavez “may have” cost his country a whopping 3% of GDP, i.e., $9.4 billion (or possibly less – note that “up to”). E.R. visits cost the US only a negligible 3% of healthcare spending. And the uninsured are only one-fifth of that, a mere $14 billion.
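Here is that arithmetic in a few lines of Python (a back-of-the-envelope sketch using only the figures quoted in this post; like the post, it treats the uninsured share of E.R. visits as a share of E.R. spending):

```python
# The two "3 percents," in dollars, as quoted in this post.
us_health_spending = 2.3e12               # ~$2.3 trillion in US healthcare spending
er_spending = 0.03 * us_health_spending
print(f"3% of healthcare spending: ${er_spending / 1e9:.0f} billion")   # ~$69 billion

# The uninsured account for ~20% of E.R. visits; applying that share to
# E.R. spending gives the post's "$14 billion."
print(f"The uninsured slice: ${0.20 * er_spending / 1e9:.0f} billion")  # ~$14 billion

# Venezuela: "up to 3 percent of G.D.P.," quoted as $9.4 billion, which
# implies a GDP of roughly $313 billion.
print(f"Implied Venezuelan GDP: ${9.4e9 / 0.03 / 1e9:.0f} billion")
```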

Whether 3% is a lot or a little seems to depend on your politics and what the issue is.

Unions too are bad, at least for business.
a successful unionization vote significantly decreases the market value of the company even absent changes in organizational performance. Lee and Mas run a policy simulation and conclude that, “ … a policy-induced doubling of unionization would lead to a 4.3 percent decrease in the equity value of all firms at risk of unionization.”
For a paltry increase of 100% in the number of workers getting the benefits of unionization, companies would suffer an overwhelming 4.3% decrease in equity.

Now about those 20 people in front of you in line at the emergency room. Only four of them (20%) are there because they don’t have insurance. They are part of what Freakonomics calls a “rosier picture.” I wonder if Freakonomics maybe has one or two posts where 20% is a pretty big amount, something to worry about, instead of being the equivalent of a bunch of roses in the hospital.

Cooking the Books - A Second Look

February 19, 2010
Posted by Jay Livingston

Do the police undercount crime?

The graph I cribbed from Rick Rosenfeld in yesterday’s post showed a remarkable similarity between victimization surveys and official crime statistics. In 2000, for example, the rate of reported burglaries according to the NCVS was nearly identical to the UCR rate. Both were about 4.4 per 1,000.

Yet in the recent Eterno-Silverman study, police commanders, responding anonymously, said that crime statistics were suppressed. And Josh in his comment yesterday refers to Peter Moskos’s “let me count the ways” description of how the police keep crimes off the books. (See Moskos’s own take on the study at his website.)

The problem is that the graph I presented was somewhat misleading. The NCVS and UCR rates of burglary do not measure exactly the same thing. It’s not exactly oranges and apples; more like oranges and tangerines.

1. The NCVS data are for the New York metro area, so we have to use similar UCR data even though the rap about fudging the stats is only about the NYPD. No way to get around that problem.

2. More crucially, the NCVS counts only residential burglaries; the UCR number includes both commercial and residential burglaries. Nationwide, about 2/3 of all UCR burglaries are residential. Using that figure for the New York area, we get a UCR rate for residential burglaries of only 3.0 per 1,000 population, about one-third less than we would expect from the estimate of the number of residential burglaries that victims say they reported. Here’s an amended graph. I’ve added a line for residential burglaries that uses the simple 2/3 formula.

(Click on the graph for a larger view.)

The rate of residential burglaries that victims say that they report is usually one-and-a-half to two times greater than the rate of residential burglaries officially “known to the police.” For the year 2000, the NCVS rate of 4.4 per 1,000 population works out to 40,000 reported residential burglaries. If 2/3 of burglaries are residential, only 27,500 of those made it onto the police books.
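Spelled out in a few lines (a sketch using only the post’s numbers; the ~9.1 million population figure is implied by the statement that 4.4 per 1,000 works out to 40,000 burglaries):

```python
# NCVS vs. UCR residential burglaries, New York metro area, 2000.
population = 40_000 / (4.4 / 1_000)   # ~9.1 million, implied by the post

ncvs_reported = 40_000                # residential burglaries victims say they reported
ucr_residential_rate = 3.0            # UCR rate per 1,000 after the 2/3 residential adjustment
ucr_residential = ucr_residential_rate * population / 1_000

print(f"On the police books: {ucr_residential:,.0f}")          # ~27,300; the post rounds to 27,500
print(f"Discrepancy: {ncvs_reported - ucr_residential:,.0f}")  # ~12,700; the post rounds to 12,500
```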

Does that mean that the police canned 12,500 reported burglaries? Probably not. There may be other explanations for some of the discrepancy. But the data do provide some support for those who are skeptical of the precision of the police numbers.

Cooking the Crime Books?

February 18, 2010
Posted by Jay Livingston

“Crimes known to the police” is the official count of Crime in the United States – the annual report published by the FBI, which compiles data from local police departments. It’s also known as the Uniform Crime Reports (UCR).

Many years ago, a friend of mine found that his car had been broken into and wanted to report the crime to the police. He went to the local precinct, and when the desk sergeant finally acknowledged him, he said, “Someone broke into my car and stole my stuff.”

“So what do you want me to do?” said the sergeant.

That was one larceny that never became “known to the police,” at least not on the books of the 20th precinct.

The problem of uncounted crime has been around a long time. In the late 1940s, New York’s burglary rate grew by 1300% in a single year, a huge increase but entirely attributable to changes in bookkeeping. Word had gone out that burglaries should no longer be routinely assigned to “Detective Can.”

In the 1980s, Chicago’s robbery rate rose after the FBI threatened the city that it wouldn’t include their data because the numbers were so suspect. Atlanta kept its numbers artificially low prior to the Olympics. This week, the Dallas police chief is under attack for the way his department reports crimes.

Now two criminologists, John Eterno and Eli Silverman, are claiming that New York’s crime data have been fudged consistently for the last 15 years, and they point to CompStat as the culprit (NY Times article here.) CompStat is the system that William Bratton brought to New York when he became police commissioner in 1994. It required commanders to report every week on statistics and patterns of crime in their areas.

Eterno and Silverman gave anonymous surveys to retired precinct commanders. Under pressure to appear effective in the war on crime, precinct commanders might stretch the facts. The value of a theft might be creatively investigated to keep the total under the $1,000 threshold between a misdemeanor and the felony known as “grand larceny.” Felonies look worse on your statistical report.

A purse snatch might get recorded as a theft instead of a robbery because robberies fall into the broader category of “violent” crimes. Or victims, like my friend in the old days, might be persuaded not to bother reporting the crime.

In an op-ed in the Times yesterday, Bratton vigorously defended the NYPD numbers. He provided no data, but he could have.

Since 1973, the US has had an alternate count of crime, the National Crime Victimization Survey. Most of the data are for the US, but Rick Rosenfeld and Janet Lauritsen were able to get three-year averages for New York City, and they have looked at the data for burglary.

(Click on the graph for a larger view.)


The graph shows the rates (in percent) of
  • people who told the NCVS they had been victims of a burglary
  • people who say they reported the burglary to the police
  • the official rate of burglaries “known to the police”
The numbers are not precisely comparable (the NCVS rate may be based on households rather than population, and the UCR rate includes commercial burglaries as well as residential). But the data in the graph do not support the idea that CompStat increased the fudging of burglary statistics. If it had, then starting in 1994, we should see a widening gap between the NCVS line and the UCR line, with the UCR line moving downward much more. But if anything, it’s the Victimization line that descends more steeply.

In the decade following CompStat, both sources of data show a 68% decrease in burglary. So if commanders were cooking the books, they weren't including burglary in the recipe.

What Was the Question?

February 5, 2010
Posted by Jay Livingston

Survey questions may seem straightforward, but especially if the poll is a one-off, with questions that haven’t been used in other polls, you can’t always be sure how the respondents interpret them.

The Kos/Research 2000 poll of Republicans has been getting some notice, and no wonder. At first glance, it seems to show that one of our two major political parties is home to quite a few people who are not fully in touch with reality, especially when Obama is in view.

Do you believe Barack Obama is a racist who hates White people?
Yes 31
No 36
Not Sure 33


Do you believe Barack Obama wants the terrorists to win?
Yes 24
No 43
Not Sure 33


Should Barack Obama be impeached, or not?
Yes 39
No 32
Not Sure 29


I’m not sure what the results mean. Self-identified Republicans are about 25% of the electorate.* If one-third of them hold views that are “ludicrous” (Kos’s term), that’s still only 8% of the voters.

But what about non-ludicrous Republicans? How would a mainstream conservative respond if Research 2000 phoned? To find out, I put some of the questions to a Republican I know – non-ludicrous (he reads the Wall Street Journal, he doesn’t watch Glenn Beck).

Do you believe Sarah Palin is more qualified to be President than Barack Obama? (In the survey, 53% said, “yes.”)

Such a loaded question! I think she's nuts and he's sane – but in principle, she's right and he's wrong about most issues.


Do you believe Barack Obama wants the terrorists to win?

They don't WANT terrorists to win – no – but they don't care as much about the battle as most Americans do.

He might have said Yes to the interviewer just because he thought a Yes was more in line with the spirit of the question than with its actual wording. Or he would have refused to answer (and possibly have been put in the “Not sure” category?)

So the questions are more ambiguous than they seem, even on close reading.

Should public school students be taught that the book of Genesis in the Bible explains how God created the world?
Seventy-seven per cent of the sample said, “Yes.” And Kos, who commissioned the poll in connection with his book – to be called American Taliban – will see that result as rabid pro-creationism and anti-science. But re-read the actual question. Here’s what my sane Republican had to say:

This one's easy:
Absolutely yes. “public school students should be taught” a lot of important facts about our culture and civilization – that the Greeks invaded Ilium and destroyed Troy, that Confucius was the inspiration for a great religion, that Thomas A. Edison invented the electric light bulb, that Darwin in his Origin of the Species explained how animals change according to the process of natural selection, and “that the book of Genesis in the Bible explains how God created the world.” Why the hell not teach that fact? Who could say no to that?

Who indeed? Not me.

-----------------------
* The poll may have oversampled the fringe (see Emily Swanson at Pollster), but those folks at the fringe are more likely to be active at the local level, so it’s possible they’ll swing some weight at the national level too. Their preferred candidate is, of course, Sarah Palin. So while political scientists think the poll may be exaggerating the far right (see Joshua Tucker’s excellent critique at The Monkey Cage), the Palinistas are hailing the poll as spot on.

Correlation and Cause - Feeding and Breeding

January 25, 2010
Posted by Jay Livingston

Andre Bauer’s idea that poor people are like stray animals is what will get most of the attention, as I suppose it should. Bauer* is running for governor of the enlightened state of South Carolina, where Appalachian Trail hiker Mark Sanford is still in that office.** Bauer is Lt. Gov., and here’s what he said à propos programs for free and reduced-price lunches in the public schools.
My grandmother was not a highly educated woman, but she told me as a small child to quit feeding stray animals. You know why? Because they breed. You're facilitating the problem if you give an animal or a person ample food supply. They will reproduce, especially ones that don't think too much further than that. And so what you've got to do is you've got to curtail that type of behavior. They don't know any better,
Bauer stands by his analogy and says he was quoted out of context. Right.

Obviously, Bauer did not take Sociology of Poverty. Of less importance politically is that he also skipped the methods course. Apparently, he has some data – a bar graph – but he mistakes correlation for cause.
I can show you a bar graph where free and reduced lunch has the worst test scores in the state of South Carolina. You show me the school that has the highest free and reduced lunch, and I'll show you the worst test scores, folks. It's there, period.
I suppose that it is somehow possible that providing food for impoverished kids makes them dumb. Maybe electing people to office in the Palmetto State has a similar effect.

*Lt. Gov. Andre Bauer is not to be confused with Andre Braugher, the excellent actor who played Detective Pembleton on “Homicide” (the forerunner to “The Wire”) and is currently in “Men of a Certain Age.” Pictures below. You figure out which Andre is which.



** What’s up with The Palmetto State and its public servants? Lt. Gov. Bauer is incautious not just in his campaign speeches. He also tends to get stopped for speeding, and he once crash-landed a small plane. (CSM article here.) Then, besides Sanford and Bauer, there’s the former chair of the SC Board of Education, who home schooled her kids, believes that “intelligent design” and “abstinence only” should be taught in the schools, and resigned only when it was revealed that she also publishes online porn (oops, I mean erotic fiction.) The story and links to her very NSFW prose are here. I guess she just wanted to put the palm back in palmetto.

Sexting and Percentaging - The Wrong Way

December 23, 2009
Posted by Jay Livingston

The Pew survey on sexting – it came out over a week ago. I don’t know how I missed it. Must be the holiday blahs. And where was the media hysteria? Most news outlets ignored it, probably because the results weren’t all that alarming.

For the entire sample of eight hundred 12-17 year olds, the estimated proportion who sent “sexually suggestive nude or nearly nude images of themselves” was 4%. Given the margin of error, that means that the actual percentage, as Dan Ryan at Sociology of Information notes, is somewhere between 0% and 8%.
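Presumably that zero-to-eight range comes from applying the poll’s overall margin of error – roughly ±4 points for a sample of this size – to the 4% estimate. A quick sketch of the arithmetic, assuming a simple random sample (the survey’s reported margin would also reflect design effects):

```python
import math

n = 800   # teens in the Pew sample

# The conventional poll margin of error, computed at the worst case p = 0.5:
moe = 1.96 * math.sqrt(0.5 * 0.5 / n)
print(f"Overall margin of error: ±{moe:.1%}")   # ±3.5 points, closer to ±4 with design effects

# 4% plus or minus roughly 4 points spans 0% to 8%. An interval computed
# at the 4% estimate itself would be much tighter, about ±1.4 points:
p_hat = 0.04
print(f"Interval at 4%: ±{1.96 * math.sqrt(p_hat * (1 - p_hat) / n):.1%}")
```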

Of course, we’re not going to see a headline like “Sexting Teens May be 0%.” Not when you can goose up the numbers to 30%. Here’s the headline that ran in The Washington Post:
Sexting hasn't reached most young teens, poll finds;
30% of 17-year-olds report getting nude photos on their cells
That subhead manages to get the highest percentage by
  • using only the oldest subgroup in the sample
  • measuring receiving rather than sending

Dan has some other methodological criticisms, including this one. First the Pew summary paragraph:
One parental intervention that may relate to a lower likelihood of sending of sexually suggestive images was parental restriction of text messaging. Teens who sent sexually suggestive nude or nearly nude images were less likely to have parents who reported limiting the number of texts or other messages the teen could send. Just 9% of teens who sent sexy images by text had parents who restricted the number of texts or other messages they could send; 28% of teens who didn’t send these texts had parents who limited their child’s texting.
I spent the last two weeks of the semester trying to get students to percentage tables correctly. “Percentage on the independent variable,” I repeated and repeated. And now Amanda Lenhart at the Pew Foundation undermines all my good work. As Dan says,
It is unlikely that the authors are thinking that sexting causes parental restrictions – the sense is just the opposite – and so the percentaging should be within the categories of parental behavior and comparison across these.
Dan even does the math (reconstructed in the sketch below) and finds:
  • Children of restrictive parents who ever sent a sext: 1.4% (3 of 218)
  • Children of non-restrictive parents who ever sent a sext: 5% (29 of 572)
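Dan’s counts contain the whole story – the non-senders fall out by subtraction (218 − 3 and 572 − 29) – and they reproduce Pew’s percentages too. A short sketch of the two ways of running the same table:

```python
# Counts from Dan Ryan's critique: 3 of 218 teens with restrictive parents
# sent a sext; 29 of 572 teens with non-restrictive parents did.
table = {
    "restrictive":     {"sext": 3,  "no_sext": 218 - 3},
    "non-restrictive": {"sext": 29, "no_sext": 572 - 29},
}

# The right way: percentage on the independent variable (parental
# restriction) and compare across its categories.
for parents, row in table.items():
    n = row["sext"] + row["no_sext"]
    print(f"{parents} parents: {row['sext'] / n:.1%} sent a sext")
# -> restrictive 1.4%, non-restrictive 5.1%

# Pew's way: percentage within the dependent variable (senders vs. non-senders).
for outcome in ("sext", "no_sext"):
    total = sum(row[outcome] for row in table.values())
    print(f"{outcome}: {table['restrictive'][outcome] / total:.1%} had restrictive parents")
# -> 9.4% of senders, 28.4% of non-senders: Pew's "9%" and "28%"
```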
Read Dan’s entire critique. Or for the truly absurd and probably counter-effectual, see the anti-sexting videos featuring (I am not making this up) James Lipton.

March Madness

March 18, 2009
Posted by Jay Livingston

“When Losing Leads to Winning.” That’s the title of the paper by Jonah Berger and Devin Pope. In the New York Times recently, they put it like this:

Surprisingly, the data show that trailing by a little can actually be a good thing.
Take games in which one team is ahead by a point at the half. . . . The team trailing by a point actually wins more often and, relative to expectation, being slightly behind increases a team’s chance of winning by 5 percent to 7 percent.
They had data on over 6500 NCAA games in four seasons. Here’s the key graph.

(Click on the chart for a larger view.)

The surprise they refer to is in the red circle I drew. The dot one point to the left of the tie-game point is higher than the dot one point to the right. Teams behind by one point at the half won 51.3% of the games; teams leading by a point won only 48.7%.*

Justin Wolfers** at Freakonomics reprints the graph and adds that Berger and Pope are “two of the brightest young behavioral economists around.”

I’m not a bright behavioral economist, I’m not young, I’m not a methodologist or a statistician, and truth be told, I’m not much of an NCAA fan. But here’s what I see. First of all, the right half of the graph is just the mirror image of the left. If teams down by one win 51.3%, teams ahead by one have to lose 51.3%, and similarly for every other dot on the chart.

Second, this is not the only discontinuity in the graph. I’ve put yellow squares around the others.


Teams down by 7 points at the half have a slightly higher win percentage than do teams down by 6. By the same graph-reading logic, it’s better to be down by 4 points than by only 3. And the percentage difference for these points is greater than the one-point/tie-game difference.

Then, what about that statement that being down by one point at the half “increases a team’s chance of winning by 5 percent to 7 percent”? Remember, those teams won 51.3% of the games. How did 1.3 percentage points above 50-50 become a 5-7% increase? You have to read the fine print: “relative to expectation.” That expectation is based on a straight-line equation presumably derived from the ten data points (all the score differentials from 10 points to one – no sense in including games tied at the half). That model predicts that teams down by one at the half will win only 46% of the time. Instead, they won 51.3%.
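In other words (a minimal sketch of the two baselines, using only the numbers above):

```python
# Two descriptions of the same 51.3% win rate for teams down one at the half.
observed = 0.513
coin_flip = 0.500        # the naive baseline
model_expected = 0.46    # the paper's straight-line expectation

print(f"Above 50-50: {100 * (observed - coin_flip):.1f} points")           # 1.3
print(f"Above the model: {100 * (observed - model_expected):.1f} points")  # 5.3
# The headline "5 percent to 7 percent" rests on the second comparison, not the first.
```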

Berger and Pope’s explanation of their finding is basically the Avis factor. The teams that are behind try harder. Maybe so, but that doesn’t explain the other discontinuities in the graph. Using this logic, we would conclude that teams behind by seven try harder than teams behind by six. But teams behind by five don’t try harder than teams behind by four. And so on. Why do only some point deficits produce the Avis effect?


* Their results are significant at the .05 level. With 6500 games in the sample, I’d bet that any difference will turn out to be statistically significant, though the authors don’t say how many of those 6500 games had 1-point halftime differences.

**Wolfers himself is the author of another economics journal article on basketball, a study purporting to reveal unwitting racism among NBA referees. In that article as well, I thought there might be less there than meets the economist’s eye.

Muslims and Methodology

May 26, 2007
Posted by Jay Livingston

The Pew Research Center this week published a poll that asked Muslims in the US and other countries their views on several political issues. News stories here focused on the US results, but whether those results were cause for relief or alarm depends on who was reading the poll.

The mainstream press (aka “the mainstream liberal press”), ran headlines like these:
In many ways, US Muslims are in mainstream America (Christian Science Monitor)
Muslim Americans in line with US values (Financial Times)
Survey: U.S. Muslims Assimilated, Opposed to Extremism (The Washington Post)

The right-wing press picked up on one number in the poll:
TIME BOMBS IN OUR MIDST - 26% OF YOUNG U.S. MUSLIMS BACK KILLINGS (The New York Post)

Or as the Washington Times put it in an op-ed piece by Diana West, “According to Pew's data, one-quarter of younger American Muslims approve of the presence of skin-ripping, skull-crushing, organ-piercing violence in civilian life as a religious imperative —‘in defense of Islam.’”

For some on the right, 26% was a lowball estimate. Here’s Rush Limbaugh:
“Two percent of them say it can often be justified, 13% say sometimes, and 11% say rarely.” So let’s add it up, 26 and 2 is 28, so 31% think to one degree or another, suicide bombing is justified. If you add to that the 5% that don't know or refuse to answer, it's even worse. So almost a third of young American Muslims who support in one way or another homicide bombings according to the Pew poll.
(If Limbaugh had taken a basic statistics course, he could have inflated his estimate even more. There were only about 300 young Muslims in the sample, so the margin of error means that the true proportion might have been several percentage points higher.)
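For what it’s worth, here is that arithmetic (a sketch assuming a simple random sample of 300; design effects in the actual survey would widen the interval further):

```python
import math

# A 95% confidence interval around the 26% estimate with ~300 respondents.
p_hat, n = 0.26, 300
moe = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"26% ± {moe:.1%}: roughly {p_hat - moe:.0%} to {p_hat + moe:.0%}")
# -> about 21% to 31%: several percentage points higher indeed.
```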

When a result can be open to such different interpretations, maybe there’s something wrong with the survey methodology. Let’s take a look at the actual question:

Some people think that suicide bombing and other forms of violence against civilian targets are justified in order to defend Islam from its enemies. Other people believe that, no matter what the reason, this kind of violence is never justified. Do you personally feel that this kind of violence is often justified to defend Islam, sometimes justified, rarely justified, or never justified?
The conclusion of Limbaugh and others is that unless you say that killing is never ever justified, you’re a menace to society.

But what would happen if we substituted “Christianity” for “Islam” and polled Christians in the US? How many Christians would absolutely reject all violence in defense of Christianity? And how many might say that violence, even violence against civilians, is sometimes justified to defend Christianity from its enemies? I wouldn’t be surprised if the number were higher than 26%.

Consider the war in Iraq, which Limbaugh and the others support. Hundreds of thousands of civilians have been killed, several thousand of them as “collateral damage” in US operations. The “shock and awe” bombing in the original invasion certainly included “skin-ripping, skull-crushing, organ-piercing violence” upon civilians. But at the time, a majority of Americans supported it, and Bush’s position still remains that we went to war to defend our freedom against its enemies.

The survey has many more interesting findings, and it’s especially useful for comparing US Muslims with those of other countries. But of course those matters were not newsworthy.

Asking About Housework

October 20, 2006
Posted by Jay Livingston
The working mother — how does she find the time? Did being a worker mean that she would spend less time being a mom? A new study by Suzanne Bianchi finds that, contrary to expectations some years back, work did not reduce the time mothers spent with their kids. In fact, parents — both moms and dads — are spending more time with their kids than parents did in previous generations. What’s been cut back is housework. (NPR did a longish (16-minute) report on the study — an interview with one of the authors, calls from listeners — which you can find here.)
There’s much to be said and blogged about Bianchi’s findings, but I want to make one small methodological observation, something I’ve mentioned to students. Some questions have a built-in “social desirability” bias. Suppose you want to know about reading habits. It’s socially desirable to have read more books (at least I hope it still is), so if you ask “How many books do you read a year?” or “How many books did you read last year?” you’re likely to get something higher than the actual number. Instead, you ask, “Did you read a book last week?” A person who never reads a book might be reluctant to say that he hadn’t read a single book last year. But there’s no social stigma attached to not having read a book last week.
The same thing goes for housework and parenting. Ask me how many hours I spend on housework and childcare each week, and even though, as a good friend of social research, I’d try to be accurate, I’d probably be accurate on the good side. So as the Times reports, “Using a standard set of questions, professional interviewers asked parents to chronicle all their activities on the day before the interview.” (The study notes that we dads are doing more of both than fathers of only a few years ago.)
(More later. Right now, I have to put the wash in the dryer, start making dinner, and help my son with his homework.)

Negative Results

September 20, 2006

Posted by Jay Livingston
A man gets thrown into a jail cell with a long-term occupant and then begins a series of attempts to escape, each by some different method. He fails every time, getting captured and thrown back in the cell. The older prisoner looks at him silently after each failure. Finally, after six or seven attempts, the man loses his patience with the old prisoner and says “Well, couldn’t you help me a little?” “Oh,” says the old guy, “I’ve tried all the ways you thought of—they don’t work.” “Well why the hell didn’t you tell me?!” shouts the man. “Who reports negative results?” says the old prisoner.
Thanks to sociologist and blogger Kieran Healy (http://www.kieranhealy.org/blog/).
I hadn’t heard the joke before, but I’ve certainly heard of the bias towards positive results. Back when I was in graduate school, one of my professors, an experimental social psychologist, proposed that journals evaluate papers solely on the basis of research design. Researchers would submit all the preliminary stuff including the design but not the results. Then, if they accepted the article, it would be published regardless of whether the results showed the intended effects.
Healy used the joke in connection with an article on biases in political science journals. Articles which just make the p < .05 level are much more likely to be published than are those that just miss it. I’m not sure if political science and other disciplines (like sociology) that rely on survey data could use the same strategy of deciding on publication before they see the data. That strategy may be more applicable to experiments than to surveys.
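The pattern Healy describes is easy to simulate. A toy sketch (the four-to-one publication preference is an arbitrary illustrative number, not anything estimated from the article he cites):

```python
import random

random.seed(0)

# 10,000 studies arrive with p-values spread evenly around the .05 line.
submitted = [random.uniform(0.02, 0.08) for _ in range(10_000)]

# Journals publish every significant result but only a quarter of the rest.
published = [p for p in submitted if p < 0.05 or random.random() < 0.25]

just_below = sum(0.04 <= p < 0.05 for p in published)
just_above = sum(0.05 <= p < 0.06 for p in published)
print(f"Published with p just below .05: {just_below}")   # ~1,650
print(f"Published with p just above .05: {just_above}")   # ~420
# Smooth input, sharp cliff at .05 in print -- the bias the journals showed.
```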
I find it interesting, even ironic, that my social psych professor who proposed this soon became very well known for his own experimental study, whose results were widely discussed even outside academia. But statisticians reviewing the data claimed that he had used the wrong statistical analyses in order to make his results look significant. His idea might be right, the critics said – in fact, they probably hoped it was right – but the numbers in the study didn’t prove it. The professor and others claimed that the numbers did support the idea and defended their analyses. Clearly, it was a case that needed replication studies, lots of them. I’m not sure what attempts to replicate the study have been made, nor do I know what the results have been. But I am fairly certain that researchers who attempted to replicate and got negative results had a harder time getting published than did those who got positive results.
This professor also had our class do a replication of one of his experiments. It didn’t work. In fact, the strongest correlation was with a variable that by design was randomized. There were two versions of some test, A and B, distributed randomly to seats in a room. We wanted to see if the two versions of the test produced different results. People came in, sat down, and did the test. But it turned out that the strongest correlation was between sex and test version. That is, the A version wound up being taken mostly by girls, the B version by boys, just by accident of where they chose to sit. No other difference between the two versions was nearly so strong. It made me a bit skeptical about the whole enterprise.
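That classroom fluke is easy to simulate too. A toy version, with the class size, sex ratio, and seating pattern invented for illustration: if the sexes cluster by seat while the test versions are dealt to seats at random, sizable sex-by-version splits turn up by chance alone.

```python
import random

random.seed(1)

def girls_with_version_a():
    # 15 girls sit in the left half of the room, 15 boys in the right;
    # 15 A's and 15 B's are dealt to the 30 seats at random.
    versions = ["A"] * 15 + ["B"] * 15
    random.shuffle(versions)
    return sum(v == "A" for v in versions[:15])   # first 15 seats = the girls

# An even split would put 7 or 8 A's among the girls. How often does
# chance alone produce an 11-4 split or worse?
trials = [girls_with_version_a() for _ in range(10_000)]
lopsided = sum(abs(g - 7.5) >= 3.5 for g in trials) / len(trials)
print(f"Splits of 11-4 or worse: {lopsided:.1%}")   # a few percent of classes
```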