The GiveWell Blog

Beware just-published studies

A recent study on health care in Ghana has been making the rounds – first on Megan McArdle’s blog and then on Marginal Revolution and Overcoming Bias. McArdle says the study shows “little relationship between consumption and health outcomes”; the other two title their posts “The marginal value of health care in Ghana: is it zero?” and “Free Docs Not Help Poor Kids.” In other words, the blogosphere take is that this is a “scary study” showing that making primary care free doesn’t work (or perhaps even that primary care doesn’t work).

But wait a minute. Here’s what the study found:

  • It followed 2,592 Ghanaian children (age 6-59 months). Half were randomly selected to receive free medical care, via enrollment in a prepayment plan. The medical care included diagnosis, antimalarials and other drugs, but not deworming.
  • Children with free treatment got medical care for 12% more of their episodes (2.8 vs. 2.5 episodes per year per person).
  • Health outcomes were assessed after 6 months:
    • Moderate anemia (the main measure) afflicted 36 of the children who got free care, vs. 37 of the children who didn’t.
    • Severe anemia afflicted 2 of the children who got free care, vs. 3 of the children who didn’t.
    • There were 5 deaths among children who got free care, vs. 4 among children who didn’t.
    • Parasite prevalence and nutrition status were also measured but not considered to be good measures of the program’s effects (since it did not include deworming or nutrition-centered care).

Would you conclude from this that the free medical care was “ineffective”? I wouldn’t – I’d conclude that the study ended up with far fewer cases of its target outcome than planned, and therefore very low statistical “power,” because the children it studied were much healthier than expected. The researchers designed the study around a predicted anemia prevalence of 10%; the actual prevalence was just under 3%. Severe anemia and death were even rarer, making any comparison of those numbers (2 vs. 3 and 5 vs. 4) pretty meaningless. So in the end, we’re comparing a control group with 37 cases of moderate anemia against a treatment group with 36, after a 6-month program – and one that didn’t even address all possible causes of anemia (again, there was no deworming, and it doesn’t appear that there was iron supplementation – the only relevant treatment was antimalarials).
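
To get a feel for how little these case counts pin down, here’s a back-of-the-envelope sketch (my calculation, not the study’s own analysis) of the uncertainty around the headline “36 vs. 37” comparison. The arm sizes are an assumption – roughly half of the 2,592 enrolled children in each group:

    import math

    # Rough 95% confidence interval for the risk ratio implied by
    # 36 vs. 37 cases of moderate anemia. Arm sizes of ~1,296 each
    # are an assumption (half of the 2,592 children enrolled).
    cases_free, n_free = 36, 1296   # free-care arm (assumed size)
    cases_ctrl, n_ctrl = 37, 1296   # control arm (assumed size)

    rr = (cases_free / n_free) / (cases_ctrl / n_ctrl)

    # Standard large-sample confidence interval for a risk ratio,
    # computed on the log scale and exponentiated back.
    se = math.sqrt(1/cases_free - 1/n_free + 1/cases_ctrl - 1/n_ctrl)
    lo = math.exp(math.log(rr) - 1.96 * se)
    hi = math.exp(math.log(rr) + 1.96 * se)

    print(f"risk ratio {rr:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
    # -> risk ratio 0.97, 95% CI [0.62, 1.53]

On these assumptions, the data are consistent with anything from a roughly 40% reduction in moderate anemia to a roughly 50% increase – which is what “low power” means in practice.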

Bottom line, free medical care didn’t appear to lead to improvement, but there also didn’t appear to be much room for improvement in this particular group. A similar critique appears in the journal (which also points out that we don’t even know how much of the anemia can be attributed to malaria, as opposed to parasites or other factors).

Some possible explanations for the relatively low levels of anemia include:

  • The presence of observers led everyone to make more use of primary care (the “Hawthorne effect,” a possibility raised by a Marginal Revolution commenter).
  • Less healthy people (and/or people who used primary care less) were less likely to stay enrolled in the study (7-8% dropped out), so that the people who stayed in had better health.
  • For some other reason (selection of villages?), the researchers studied an unexpectedly healthy group – perhaps one that already uses primary care when it’s most important to do so, such that the “extra” visits paid for by the intervention were lower-stakes ones, or simply weren’t numerous enough (again, only a 12% difference) to affect major health outcomes among the small number of afflicted children.

All of these seem like real possibilities to me, and the numerical results found don’t seem to strongly suggest much of anything because of the low power (as the critique observes).

I saw a similar dynamic play out a month ago: Marginal Revolution linked a new study claiming that vaccination progress has been overstated, but a Center for Global Development scholar raised serious methodological concerns about the study. I haven’t examined this debate enough to have a strong opinion on it, and overestimation seems like a real concern; but we want to see how the discussion and reaction play out before jumping to conclusions from the new study.

We’re all for healthy skepticism of aid programs, and we like reading new studies. But in drawing conclusions, we try to stick to studies that are a little older and have had some chance to be reviewed and discussed (and we generally look for responses and conflicting reactions). Doing so still leaves plenty of opportunities to be skeptical, as with the thoroughly discussed New York City Voucher Experiment and other ineffective social programs.

Why we prefer the carrot to the stick

A couple of the commenters on a previous post object to our idea of “rewarding failure” and prefer to focus on “putting the bad charities out of business.”

In theory, I’d like to see a world where all charities are evaluated meaningfully, and only the effective ones survive. But the world we’re in is just too far from that. The overwhelming majority of charities simply perform no meaningful evaluation one way or the other – their effects are a big question mark.

It’s not in our power to sway enough donors – at once – to starve the charities that don’t measure impact. (And even if it were, there are simply too many of these for starving them all to be desirable.) But it is in our power to reward the few that do measure impact, thus encouraging more of it and creating more organizations that can eventually outcompete the unproven ones.

Of course failure isn’t valuable by itself, and shouldn’t be rewarded. But showing that a program doesn’t work is expensive to do, valuable in and of itself, and worth rewarding. As Paul Brest says, “the real problem is that, unless they are doing direct services, most nonprofits don’t know whether they are succeeding or failing.”

Evaluation is valuable whether or not it turns out to have positive results. Yet currently, only positive results are rewarded – honest evaluation is riskier than it should be. This is the problem that the “failure grant” idea is aimed at.

Uncharitable

Dan Pallotta sent me a copy of Uncharitable about a month ago, and I’ve been late in taking a look at it.

I highly recommend it for people interested in general discussions of the nonprofit sector.

The discussion I’ve seen of the book so far (Nicholas Kristof and Sean Stannard-Stockton) has focused on how much we should be bothered when people make money off of charity. Personally, I feel that I’ve yet to see a good argument that we should care how much money do-gooders make – as opposed to how much good they do (and how much it costs).

The chapter closest to my heart, though, is the one called “Stop Asking This Question.” Mr. Pallotta slams donors who focus on “how much of my dollar goes straight to the programs,” devoting even more ink to the matter than we have. We need more people pounding on this point.

The book’s basic theme, as I understand it, is this: what matters in philanthropy is the good that gets done, not anything else. Attacking programs that are effective (at helping people, at raising money, etc.) because they don’t conform to some abstract idea of yours about how nonprofits should run/think/feel is simply hurtful to the people philanthropy seeks to help (not to mention arrogant). The book often illustrates this point with nonprofit/for-profit analogies that some would find inappropriate … but putting all analogies aside, it seems to me that this basic point shouldn’t be controversial.

A proposal to reward failure

Here’s a grant idea we’d probably pursue if we had the funds to do so. I’d be interested in what other grantmakers think of it. I believe enormous good could be done by offering grants to charities that can prove their programs don’t work.

Proving a program’s ineffectiveness is difficult and expensive – just as much as proving its effectiveness. Few charities have done either. As a result, the world knows extremely little about what programs do and don’t work, and thus about how to change people’s lives for the better.

You should consider this a serious problem unless you think that nearly all charitable programs are effective. (If you think that, consider this list of duds.)

Of course, one solution is to reward charities that are effective and can demonstrate it. That’s the approach we generally take. But there are problems with the incentives this approach (by itself) gives to unproven charities.

We’d like an unproven charity to examine its programs rigorously, and get our recommendation if the results turn out positive. But what if the programs it’s running turn out not to work – i.e., not to change lives? Then the charity will have spent money and time to weaken its own case. Something of a scary proposition – and a reason to be less than evenhanded in conducting evaluations.

But what if a foundation said the following? “If you can really prove that your program isn’t working – not just that it’s underfunded or has room for improvement, but that it fails to change lives when carried out correctly – that’s valuable. That improves our collective knowledge of what works, and demonstrates a true commitment to your mission, not just your activities.

“For a charity that can prove its programs aren’t working, we’ll provide you with the funding to redesign what you’re doing. If you have the right mission and the wrong tactics – and the guts to admit it – we’ll help you change those tactics, so you can accomplish the goal (saving lives, promoting equality of opportunity, etc.) that you really care about.”

If there were enough funding along these lines, carefully examining effectiveness – with no preconceptions or manipulation, just an honest desire for better knowledge – would be win-win for a charity. As it should be.

Guest post: Proven programs are the exception, not the rule

In 2017, the organization 80,000 Hours conducted a project to update the findings in this post. Their work is here.

This is a guest post from David Anderson, Assistant Director at the Coalition for Evidence-Based Policy, the group responsible for the Evidence-Based Programs website. Mr. Anderson’s responsibilities include reviewing studies of effectiveness and looking for proven programs. He has worked at the Coalition since 2004. The views expressed below are solely those of the writer.

The holiday season often heightens our desire for “feel good” stories. Rigorous research can fulfill this desire by identifying, in a scientifically reliable way, a few social programs and services that are capable of producing important, positive effects on people’s lives.

For example, the Nurse Family Partnership – a nurse home visitation program for women who are poor, mostly single, and pregnant with their first child (recommended by GiveWell here) – has been shown in three rigorous randomized studies to produce long-term, 40-70% reductions in child abuse and neglect, as well as important improvements in cognitive and academic outcomes for the most at-risk children served by the program. The very existence of research-proven programs like the Nurse Family Partnership suggests that a concerted effort to build the number of these programs through rigorous research, and to spur their widespread use, could fundamentally improve the lives of millions of people.

However, what’s not often recognized is how rare – and therefore how valuable – such examples of proven effectiveness are. Their scarcity stems from two main factors: 1) the vast majority of social programs and services have not yet been rigorously evaluated, and 2) of those that have been rigorously evaluated, most (perhaps 75% or more), including those backed by expert opinion and less-rigorous studies, turn out to produce small or no effects, and, in some cases, negative effects. We have identified these factors through our reviews of the hundreds of studies that federal agencies, Congress, and philanthropic organizations have asked us to examine.

The following are just a few illustrative examples of this general pattern of many good ideas turning out not to work when tested in a rigorous way:

  • 21st Century Community Learning Centers – a rigorous randomized study of after-school programs at 26 elementary schools, funded by the U.S. Department of Education, found that, on average, they had no effect on students’ academic achievement and had negative effects on their behavior (i.e., increased rates of school suspensions and other disciplinary problems compared to control group students). (See note 1 below)
  • Many leading educational software products – a rigorous randomized study of 16 leading educational software products for teaching reading and math – including many award-winning products – found, on average, no difference in reading or math achievement between students who used them in their classrooms and those who were taught through usual methods. (See note 2 below)
  • Even Start – a rigorous randomized evaluation of 18 Even Start family literacy programs funded by the U.S. Department of Education found that, on average, children and adults served by the programs scored no higher on reading tests than their control group counterparts. (See note 3 below)
  • New Chance Demonstration – a rigorous randomized evaluation of 16 schools and organizations funded by the U.S. Department of Labor to provide comprehensive case management to teenage mothers, designed to improve their employment outcomes, found that, on average, these programs had no effect on their employment or earnings. (See note 4 below)

Importantly, these findings do not mean that all after-school programs, educational software, family literacy programs, and case management services are ineffective – just that many widely used approaches in these areas don’t work, and that additional research is needed to identify those that do. While such findings are disheartening, they illustrate the importance of targeting government funding, as well as individual donations, toward the relatively few programs and practices that have been shown in rigorous evaluations to be highly effective. Doing so will increase the likelihood that public and private dollars truly help improve people’s lives in important ways – and that’s something to feel good about this holiday season.

References:

  1. James-Burdumy et al. “When Schools Stay Open Late: The National Evaluation of the 21st Century Community Learning Centers Program Final Report.” U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance. April 2005.
  2. Dynarski et al. “Effectiveness of Reading and Mathematics Software Products: Findings From the First Student Cohort: Report to Congress.” U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance. March 2007.
  3. St. Pierre et al. “Third National Even Start Evaluation: Program Impacts and Implications for Improvement.” U.S. Department of Education, Planning and Evaluation Service. 2003.
  4. Quint et al. “New Chance: Final Report on a Comprehensive Program for Young Mothers in Poverty and Their Children.” MDRC. 1997.

Before you donate

Nathaniel Whittemore’s Social Entrepreneurship Blog asks bloggers for “one thing you need to know before you donate to charity this holiday season.” My answer: you need to know that your favorite social program might just not work.

This isn’t a warning against fraud or inefficiency (though both of those are important too). It’s a warning against programs that just don’t change people’s lives in the way we hope – even if they seem to make perfect sense, and even if they’re carried out perfectly.

The first $17,000 I ever donated (personally) was to programs that I now believe don’t work. During my years in the finance industry, I gave to the best organizations I could find for improving education (junior high and high school) in NYC.

I considered education my favorite cause. I assumed that equality of schooling was the key to equality of opportunity. I didn’t have the time or the energy to question this assumption. I now believe this assumption is badly wrong, for reasons that are outlined here.

I wish I could take that money back: de-fund the “small schools” and extracurricular activities I supported (both programs I now know to have very questionable, if not negative, track records) and instead fund programs for early childhood (where I believe inequality of opportunity really begins) or international aid (where the inequality is far more drastic).

I wish that money had gone to organizations that I really believe are changing lives in a significant and lasting way, but it didn’t. Please don’t make my mistake.

There are great-sounding programs out there, based on one theory or another of what the roots of poverty and opportunity are. When put under the microscope, many of these great-sounding programs just don’t work, most likely because they don’t take the right approach to the complicated problems they’re trying to solve. Many more of these programs are unexamined and unproven.

Charities that will put your money into proven ways of helping people are the exception, not the rule. They’re not necessarily easy to find. They’re not necessarily the same ones that knock on your door and get your friends excited. This season, with or without GiveWell’s help, I hope you find one.