Methodology in the News

April 20, 2012
Posted by Jay Livingston

1. “Survey Research Can Save Your Life,” says Joshua Tucker at the Monkey Cage. He links to this NBC news story about a woman who went into diabetic shock while on the phone with a student pollster working for Marist.  He sensed something was wrong and told his supervisor.  She spoke to the woman and then called 911.  (The news story does not identify the student working the phone survey, only the supervisor.  Nor does it say whether the woman approved or disapproved of Mayor Bloomberg.)

2.  The New York Times this week reported on a RAND study that found no relation between obesity and “food deserts.”  The study used a large national sample; it’s undoubtedly comprehensive.  The problem is that if you are using a national sample of schools or supermarkets or stores or whatever,  two units that fall into the same category on your coding sheet might look vastly different if you went there and looked at them from close range. 

Peter Moskos at Cop in the Hood took a closer look at the RAND study reported in the Times. RAND relied on a pre-existing classification of businesses. The prefix code 445 indicates a grocery store. Peter, an ethnographer at heart, has his doubts:

New York is filled with bodega “grocery stores” (probably coded 445120) that don't sell groceries. You think this matters? It does. And the study even acknowledges as much, before simply plowing on like it doesn't. A cigarette and lottery seller behind bullet-proof glass is not a purveyor of fine foodstuffs, and if your data doesn't make that distinction, you need to do more than list it as a “limitation.” You need to stop and start over.

3.  NPR’s “Morning Edition” had a story (here) on death penalty research, specifically on the question of deterrence.  A National Research Council panel headed by Daniel Nagin of Carnegie Mellon University reviewed all the studies and concluded that they were inconclusive, mostly for methodological reasons.  For example, most deterrence studies looked at the death penalty in isolation rather than comparing it with other specified punishments. 

Another methodological problem not mentioned in the brief NPR story is that the number of executions may be too small to provide meaningful findings.  For that we’d need a much larger number of cases.  So this is one time when, at least if you are pro-life, an inadequate sample size isn’t all bad.

4 comments:

mike3550 said...

Jay -- I think that there are a couple of errors in your write-up of the RAND study. First, the RAND study examined kids in California, not nationwide. The story talks about two studies and it was the second, by Helen Lee of Public Policy Institute that did the nationwide analysis.

Second, Moskos is incorrect about how An and Sturm coded their data. The NAICS system is designed to provide increasing levels of detail with each group of digits added. The first two digits represent the broad industry (44-45 are retail establishments); the next digit indicates subordinate industries (445 are food and beverage retail establishments); the next digit is the trade industry (4451 are grocery stores); the next one to two digits represent specializations (445110 are grocery stores and supermarkets). Beyond these codes, which are required for all data on industry collected by the federal government, the data proprietor An and Sturm used also provides more detailed classifications and sales data, which they use to classify stores into three categories: small stores, grocery stores, and large supermarkets. The relevant section of their paper reads: "and small food stores (annual sales <$1 million); midsize grocery stores (annual sales $1–$5 million); and large supermarkets (annual sales >$5 million) are identified as NAICS codes 44511001-3." Therefore, Moskos' claim that they conflated Whole Foods with shatterproof glass bodegas is just flat wrong. It is also worth noting that these classifications fall very closely in line with the study of Baltimore food stores that Moskos praises.
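For readers who have not worked with these codes, here is a rough sketch of the prefix logic just described. The labels come from the paragraph above; the function names are made up for illustration, and the mapping of the suffixes 01, 02, 03 to small, midsize, and large is an assumption based only on the order in which the quoted sentence lists them.

```python
# Labels are taken from the comment above. The 8-digit codes are the data
# vendor's extension, and which suffix means which sales tier is an ASSUMPTION
# inferred from the order of the quoted sentence.
NAICS_LABELS = {
    "44": "retail trade",
    "45": "retail trade",
    "445": "food and beverage retail",
    "4451": "grocery stores",
    "445110": "grocery stores and supermarkets",
    "44511001": "small food store (annual sales < $1 million)",
    "44511002": "midsize grocery store (annual sales $1-$5 million)",
    "44511003": "large supermarket (annual sales > $5 million)",
}


def describe(code):
    """Return (prefix, label) pairs for every known prefix, broad to specific."""
    return [
        (code[:n], NAICS_LABELS[code[:n]])
        for n in range(2, len(code) + 1)
        if code[:n] in NAICS_LABELS
    ]


def same_category(code_a, code_b, digits):
    """True if two codes are indistinguishable at the given number of digits."""
    return code_a[:digits] == code_b[:digits]


if __name__ == "__main__":
    for prefix, label in describe("44511003"):
        print(prefix, "->", label)
    # At three digits the two stores look alike; at eight they do not.
    print(same_category("44511001", "44511003", 3))
    print(same_category("44511001", "44511003", 8))
```

The point of the sketch is only that whether two stores land in the same category depends on how many digits the analysis keeps.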

I find it disturbing that Moskos, and especially those commenting on his blog, seem to impugn the research because it comes out of RAND. I see this as no different from right-wingers denying scientific evidence because it comes out of "liberal universities." One cannot expect that the truth will always be inconvenient for the other guy.

Jay Livingston said...

Hi Mike. Long time, no see. Obviously, I had not read the RAND report, so I appreciate the additional information. (How do you happen to know about this stuff?) I agree with you that we shouldn't dismiss a study purely on an ad hominem basis. If it comes from a source whose politics you don't like, that just means you have to look closely at the data. Still, I'm not sure how much weight to attribute to individual food preferences and how much to food availability. And I'm a bit skeptical about Sturm's notion that "Within a couple of miles of almost any urban neighborhood, you can get basically any type of food" (quoted in the NYT article). The only city-dwellers I know who travel "a couple of miles" to do grocery shopping are middle class (or upper middle class) people with cars who drive to a big box discount store to buy big boxes of stuff.

PCM said...

I impugn quantitative studies that make the same kinds of fatal errors again and again. I do not and have never impugned studies because they come out of RAND (besides, some of my best friends work for RAND). RAND conducts lots of good studies; this isn't one of them.

You're right that 445 includes different subcategories. But the problem is still found in the more detailed coding you mention. So indeed, Whole Foods would not be classified the same as a bullet-proof bodega, and you're right to call me out on this semantic sleight-of-hand.

I used Whole Foods as shorthand in my blog post because there is no well recognized national chain of small upscale grocers. Whole Foods *is* classified the same as, say, the Stop Shop N Save grocery store ("Stop, Shop, and Steal" in cop lingo, and also no longer in existence). Since (I assume) they both have annual sales above $5 million, the NAICS says they're the same. The study then assumes people have the same food choices as long as they live near either of them. Absurd. Whole Foods and Stop Shop N Save are qualitatively different retail experiences, to put it mildly.

More to the point, a bullet-proof bodega in the ghetto is classified the same as a small upscale grocery store in a "nice" neighborhood... even though the former does not actually sell food! It is classified as a small grocery store because at one time, many decades ago, it probably actually was. The Hopkins Baltimore study supports this point very well.

The RAND study conflates extremely disparate retail outlets because the NAICS classification doesn't look at what a store actually sells (as long as the "food" store doesn't have gas pumps). This is a fatal flaw in the data that An and Sturm fail to confront (acknowledging this matter as a "limitation" is not sufficient).

Flawed data leads to gross statistical errors, particularly when the validity errors in the independent variables (NAICS classification) are not random and are highly correlated with the dependent variable (an area being called a food desert).
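A toy illustration of the worry about non-random coding errors, with numbers invented purely for the example (they come from no study, and the neighborhoods are hypothetical). Here the coding sheet counts a bodega as food access, and the bodega-only neighborhoods happen to be the ones with high obesity rates. The measured gap between "access" and "no access" areas shrinks even though the true gap has not changed.

```python
# MADE-UP data: each tuple is (has real grocery access, coded as having
# access on the classification sheet, obesity rate in percent).
NEIGHBORHOODS = [
    (True, True, 20),    # real supermarket nearby
    (True, True, 22),    # real supermarket nearby
    (False, True, 34),   # bodega only, but coded as a grocery store
    (False, True, 36),   # bodega only, but coded as a grocery store
    (False, False, 35),  # no food outlet of any kind
    (False, False, 33),  # no food outlet of any kind
]


def mean(values):
    """Plain arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def obesity_gap(rows, column):
    """Mean obesity without access minus mean obesity with access.

    column 0 uses true access; column 1 uses the coded classification.
    """
    with_access = [row[2] for row in rows if row[column]]
    without_access = [row[2] for row in rows if not row[column]]
    return mean(without_access) - mean(with_access)


if __name__ == "__main__":
    print("gap using true access: ", obesity_gap(NEIGHBORHOODS, 0))
    print("gap using coded access:", obesity_gap(NEIGHBORHOODS, 1))
```

In this made-up example the gap measured with the coded variable is less than half the gap measured with true access. Whether anything like this happened in the actual study is exactly what is in dispute.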

This error kills me because it's so damn common in quantitative analysis, particularly from researchers who don't see the benefit of actually leaving their well upholstered offices and taking a look around.

NAICS classifies dozens of food outlets in the Eastern District. This simply is not true. And anybody who talked to people trying to buy fresh ingredients for dinner could tell you this.

PCM said...

See, for instance, this grocery store.