yellow brick road to stats heaven

~ a loose collection of statistical and quantitative research material for fun and enrichment ~

by roland b. stark

critique: occasional commentary on research methods and analyses


"how to (and how not to) assess the effect of images in warning labels"

june 29, 2018

"Ours is the first study to evaluate the effectiveness of sugary drink warning labels," touts Grant Donnelly, a lead author of a joint study by the Harvard Business School and Harvard University Behavioral Insights Group. Kudos for their smart approach to testing the effect of images as part of those warning labels (objective measures showed that images indeed brought about the desired reduction in purchases).

But shame on the researchers for ignoring or missing decades of psychological and behavioral-economics research on the best ways of investigating cause and effect. For the study also incorporated a naive direct question asking participants "how seeing a graphic warning label would influence their drink purchases." An abundant literature, from Nisbett and Wilson (1977) to my own recent article, shows that it would be foolish to trust in such subjective interpretations of the factors behind each person's decision-making process. After acquiring such good, objective information, why would Donnelly et al. water it down with subjective findings that are sure to introduce bias?

UPDATE: the original study materials made available by the authors at Open Science Framework tell a different story than the summary in the Harvard Gazette quoted above. The survey did not ask respondents "how seeing a graphic warning label would influence their drink purchases." Instead, the survey asked for reactions to the images and then asked about intention to buy a soft drink -- each topic much more amenable to unbiased reporting by a participant than the causal assessment would be. The responses would then be linked "in the back end" by the researchers to investigate any causal link. A good design.

[image: Pooh Bear]

"of poohsticks and p-values: hypothesis testing in the hundred acre wood"

march 13, 2018

Just discovered Eric D. Nordmoe's fun and informative creation from 2004. "A walk through Milne's Enchanted Forest leads to an unexpected encounter with hypothesis testing." This enjoyable little article is instructive for those new to statistics and full of pleasing connections for the initiated.

gun control: the right research evidence makes policy decisions easy

march 12, 2018

Suppose a nationally-scaled, 30-year, multiple-author, peer-reviewed, non-partisan, public-health-oriented study concluded the following: "Where guns are more widely available, no more of the burglars and intruders are getting shot, but more of the gun-owners' family and friends are."

This is the central finding of The Relationship Between Gun Ownership and Stranger and Nonstranger Firearm Homicide Rates in the United States, 1981–2010. The authors explain, "Our models consistently failed to uncover a robust, statistically significant relationship between gun ownership and stranger firearm homicide rates (Tables 3 and 4). All models, however, showed a positive and significant association between gun ownership and nonstranger firearm homicide rates." They add: "for each 1 percentage point increase in the gun ownership proxy, [stranger firearm homicide rates stayed the same, whereas] nonstranger firearm homicide rates increased by 1.4%. [Similarly,] a 1 standard deviation increase in gun ownership [13.8%] was associated with a 21.1% increase in the nonstranger firearm homicide rate."
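Those two reported effect sizes are mutually consistent, which is a small credibility check worth running yourself: compounding a 1.4% increase per percentage point over a one-standard-deviation (13.8-point) increase reproduces the 21.1% figure. A quick sketch (the numbers are the paper's; the compounding assumption is mine):

```python
# Reported: +1.4% nonstranger firearm homicide rate per 1-point increase
# in the gun-ownership proxy; 1 SD of the proxy = 13.8 points.
per_point = 0.014
sd_points = 13.8

# Compounding the per-point effect over one SD, as a log-linear model would
effect = (1 + per_point) ** sd_points - 1
print(f"one-SD effect: {effect * 100:.2f}%")  # close to the reported 21.1%

# A simple linear multiple (13.8 * 1.4% = 19.3%) would NOT match,
# which tells us the authors' model is multiplicative, not additive.
```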

The research is very sound.

  • Siegel, Negussie, Vanture, Pleskunas, Ross, and King paid close attention to the validity of the indicators they used, and they made intelligent use of a proxy when a direct measurement was not available. For their main predictor, "the annual prevalence of household firearm ownership in a given state," they substituted the percentage of suicides committed using a firearm, and they clearly explained why this would be effective.

  • The authors took great care to isolate the relationship of greatest interest by controlling for nuisance variables.

  • They conducted sensitivity analysis: where a judgment call might result in the choice of one analytic approach or another, they analyzed their data in multiple ways to see how much the results changed. One example of this was their treatment of missing data.
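To make the sensitivity-analysis idea concrete, here is a minimal sketch on entirely made-up data (not the authors'): the same regression run under two defensible treatments of missing data. If the judgment call changes the answer, the finding is fragile; if it doesn't, the finding is robust.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500

# Hypothetical data: x predicts y with true slope 2.0; 20% of y is missing
x = rng.normal(0, 1, n)
y = 2.0 * x + rng.normal(0, 1, n)
y[rng.random(n) < 0.2] = np.nan

def slope(x, y):
    return np.polyfit(x, y, 1)[0]

# Judgment call 1: complete-case analysis (drop rows with missing y)
keep = ~np.isnan(y)
est_drop = slope(x[keep], y[keep])

# Judgment call 2: mean imputation (fill missing y with the observed mean)
y_imp = np.where(np.isnan(y), np.nanmean(y), y)
est_imp = slope(x, y_imp)

print(f"complete-case slope: {est_drop:.2f}")
print(f"mean-imputed slope:  {est_imp:.2f}")
# Here the two approaches disagree (mean imputation pulls the slope toward
# zero), so a careful author would report both and explain the difference.
```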

Can you refute their findings?

[image: Women's March, Jan. 2017]

a brilliant way to investigate the effects of public protest using a natural experiment

sep. 9, 2017

Read Dan Kopf's excellent Quartz summary or the full article by Andreas Madestam, Daniel Shoag, Stan Veuger, and David Yanagizawa-Drott from Harvard and Stockholm Universities. Want to know to what degree political demonstrations affect election results? Track the rain. The rain? It actually makes a beautiful example of what's termed an instrumental variable. Whether it rains at protest locations can scarcely have anything directly to do with ultimate election results, but it unquestionably relates to turnout for each demonstration. If turnout relates to election results, then the rain should, statistically (if not causally), relate to them as well. "If the absence of rain means bigger protests, and bigger protests actually make a difference, then local political outcomes ought to depend on whether or not it rained [on protest days]...As it turns out, protest size really does matter."
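For readers who want to see the mechanics, here is a toy simulation of the instrumental-variable logic (all numbers invented, not the study's). Rain shifts turnout but can only affect the outcome through turnout, so two-stage least squares recovers the causal effect even when an unobserved confounder biases the naive regression:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical setup: rain is the instrument, local enthusiasm the confounder
rain = rng.binomial(1, 0.3, n)              # 1 = rain on protest day
confound = rng.normal(0, 1, n)              # unobserved local enthusiasm
turnout = 100 - 30 * rain + 20 * confound + rng.normal(0, 5, n)
# True causal effect of turnout on the outcome: 0.1 per attendee (made up)
vote = 0.1 * turnout + 5 * confound + rng.normal(0, 2, n)

# Naive OLS slope is biased: enthusiasm drives both turnout and votes
naive = np.polyfit(turnout, vote, 1)[0]

# Two-stage least squares by hand.
# Stage 1: predict turnout from the instrument (rain)
stage1 = np.polyfit(rain, turnout, 1)
turnout_hat = np.polyval(stage1, rain)
# Stage 2: regress the outcome on predicted turnout only
iv_est = np.polyfit(turnout_hat, vote, 1)[0]

print(f"naive OLS slope: {naive:.3f}")   # biased well above the true 0.1
print(f"IV (2SLS) slope: {iv_est:.3f}")  # near the true 0.1
```

The instrument works precisely because it is "boring": rain has no path to election results except through the variable we care about.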

how not to attribute causality from statistical results

sep. 9, 2017

From a major outlet for health care research findings, Fierce Health Care. I've reproduced a key passage and commented inline in brackets.

"Employment status is the top socioeconomic factor affecting 30-day [US hospital] readmissions for heart failure, heart attacks or pneumonia, according to a new study from Truven Health Analytics.

[Such a conclusion is on very shaky ground, as you'll see.]

As readmission penalties reach record highs, analyzing causes is more important than ever.


Researchers, led by David Foster, Ph.D., collected 2011 and 2012 data from the Centers for Medicare & Medicaid Services and used a statistical test called the Variance Inflation Factor (VIF) for correlations among the nine factors in the Community Need Index (CNI): elderly poverty, single parent poverty, child poverty, uninsurance, minority, no high school, renting, unemployment and limited English.

[In truth, the VIF tells not what is the most important factor, but only to what extent the different factors, or independent variables, overlap with one another, potentially confounding the results. In this case, trying to isolate one indicator of socioeconomic status (SES) while controlling for eight others will surely distort the connection between any of these indicators and the outcome. These SES indicators are too much "part and parcel of" one another, too inseparable, to allow for valid use of control in this way. It's a mistake to ask "How much does SES (version 1) relate to readmission if we statistically remove SES (versions 2-9) from the relationship?" Much like saying, "How addicted am I to desserts if you discount my intake of cookies, pie, and ice cream?" Or there's Monty Python's "Apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, the fresh-water system, and public health, what have the Romans ever done for us?"]
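A small simulation makes the point (hypothetical indicators, not the CNI data): when several indicators are noisy copies of one underlying construct, every one of them gets a large VIF. That flags overlap among predictors; it says nothing about which factor matters most for the outcome.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Hypothetical SES indicators, all driven by one underlying "disadvantage"
# factor, so they overlap heavily -- like the CNI's nine measures
ses = rng.normal(0, 1, n)
poverty      = ses + rng.normal(0, 0.3, n)
no_hs        = ses + rng.normal(0, 0.3, n)
unemployment = ses + rng.normal(0, 0.3, n)

X = np.column_stack([poverty, no_hs, unemployment])

def vif(X, j):
    """VIF for column j: 1 / (1 - R^2) from regressing X[:, j] on the rest."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - resid.var() / y.var()
    return 1 / (1 - r2)

for j, name in enumerate(["poverty", "no_hs", "unemployment"]):
    # Each VIF comes out well above the common rule-of-thumb threshold of 5
    print(f"VIF({name}) = {vif(X, j):.1f}")
```

Note that the VIFs are high for all three indicators at once, and the calculation never touches the outcome variable -- so it cannot possibly rank the predictors by importance.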

Their analysis found unemployment and lack of high school education were the only statistically significant factors in connection with readmissions, carrying a risk of 18.1 percent and 5.3 percent, respectively, according to the study."

[As explained above, these are not valid conclusions to be drawn. But even if the numbers were somehow accurate, what could such statements mean? That readmission risk becomes on average 5.3% for non-high-school graduates? Can't be -- way too low. That it's 5.3 points higher than it would be otherwise? Can't be -- too high. 5.3% higher in relative terms? Maybe, but that would hardly merit calling high school education an important factor. So what's left?]

[image: Captain Obvious]

readmission rates: 58% of variance explained!?

nov. 18, 2015

"Fifty-eight percent of national variation in hospital readmission rates was explained by the county in which the hospital was located," announce Jeph Herrin et al. in Community Factors and Hospital Readmission Rates, published in 2014 in Health Services Research. Sound odd to you? After all, for most readmission studies the percent explained is in single digits. Being able to account for 4 or 5% of the variation translates to an ability to assess individual risk that can meaningfully aid in clinical decisions. Even Harlan Krumholz and his team of 17 researchers and statisticians, the ones whose predictive models form the basis for the national readmission penalty system imposed by Medicare, have usually only explained 3-8%. And those models have taken into account about 50 input variables.

It turns out that Herrin et al. took their data on 4,073 hospitals and broke it down by 2,254 counties. That's more than one county for every two hospitals, and many counties contained only a single hospital.

Now, suppose the authors had divided the 4,073 into, say, 4 groups defined by region, and found that the 4 groups had sizeable differences in readmission rate. That would have been a meaningful way to summarize the data. Even if they had formed somewhat more groups -- say, one for each of the 50 states -- that might still have been meaningful, though the data would have been spread pretty thin for some states. But to "explain" differences using 2,254 groups? It's not a far cry from simply listing the readmission rates of all 4,073 hospitals and claiming victoriously to have "explained" 100% of the variance in the hospital-to-hospital rate. Sounds like a feat for Captain Obvious.
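The inflation is easy to demonstrate: even purely random group labels mechanically "explain" roughly (k − 1)/(n − 1) of the variance, and with 2,254 groups over 4,073 hospitals that's already about 55% -- no geography required. A sketch on simulated data (not Herrin et al.'s):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4073  # hospitals

# Random "readmission rates" with NO real geographic signal at all
rates = rng.normal(0.18, 0.03, n)

def r2_from_grouping(rates, n_groups, rng):
    """Share of variance 'explained' by a purely random grouping."""
    groups = rng.integers(0, n_groups, len(rates))
    grand = rates.mean()
    # Between-group sum of squares over total sum of squares
    ss_between = sum(
        (groups == g).sum() * (rates[groups == g].mean() - grand) ** 2
        for g in np.unique(groups)
    )
    ss_total = ((rates - grand) ** 2).sum()
    return ss_between / ss_total

results = {k: r2_from_grouping(rates, k, rng) for k in (4, 50, 2254)}
for k, r2 in results.items():
    print(f"{k:5d} random groups: R^2 = {r2:.2f}")
# Roughly (k - 1)/(n - 1): near zero for 4 or 50 groups,
# over half the variance for 2,254 groups -- by arithmetic alone
```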

One reason why this matters a great deal is that, to the extent that some geographic factor is considered responsible for this outcome, hospital performance will no longer be considered responsible. So if county in fact explained 58% of the variance, then hospital performance, it might be argued, couldn't account for more than 42%. This is the incorrect conclusion that was reported in unqualified fashion by news outlets such as Becker's Hospital Review.

The article by Herrin and colleagues makes contributions in other ways, of course, but the chief findings are very misleading. Watch for dialogue, in Health Services Research or elsewhere, on how to interpret the results. The upshot should be quite a bit more nuanced and moderated than what we've seen above. And if you're interested in the role of socioeconomic factors in hospital readmission, you'll find information at ReInforced Care, Inc.