The GiveWell Blog

Quick notes on our progress

A few updates for people interested in the nuts and bolts of GiveWell’s progress (some of these have been included in our email updates, but not yet flagged on our blog):

  • We updated our research agenda this week – see the new agenda here.
  • The William and Flora Hewlett Foundation awarded us $100,000 for general operating support (the grant was made in December).
  • The membership of our Board has changed, as two members have left and two have joined within the last few months – see our updated list.
  • We’ve uploaded audio and materials for past Board meetings, through November 2008 – view them all here.
  • Final versions of our IRS Form 990 and audited financial statement for 2007 are available on our website here.
  • A full history of our business plans and changes of direction – including the most recent in November of 2008 – is now available here.
  • We now offer the GiveWell Advance Donation – implemented through a donor-advised fund – as a way for donors to give (and get their tax deduction) now, while deciding which of our recommended charities should get their funds after our next round of research.
  • In addition to our research email group, we’ve created a “general GiveWell project” email group for people who wish to discuss general GiveWell-related issues. Subscribers will receive our periodic email updates as well as alerts when we add substantial new content to our website or make substantial changes to our plans.

Beware just-published studies

A recent study on health care in Ghana has been making the rounds – first on Megan McArdle’s blog and then on Marginal Revolution and Overcoming Bias. McArdle says the study shows “little relationship between consumption and health outcomes”; the other two title their posts “The marginal value of health care in Ghana: is it zero?” and “Free Docs Not Help Poor Kids.” In other words, the blogosphere take is that this is a “scary study” showing that making primary care free doesn’t work (or even perhaps that primary care doesn’t work).

But wait a minute. Here’s what the study found:

  • It followed 2,592 Ghanaian children (age 6-59 months). Half were randomly selected to receive free medical care, via enrollment in a prepayment plan. The medical care included diagnosis, antimalarials and other drugs, but not deworming.
  • Children with free treatment got medical care for 12% more of their episodes (2.8 vs. 2.5 episodes per year per person).
  • Health outcomes were assessed after 6 months:
    • Moderate anemia (the main measure) afflicted 36 of the children who got free care, vs. 37 of the children who didn’t.
    • Severe anemia afflicted 2 of the children who got free care, vs. 3 of the children who didn’t.
    • There were 5 deaths among children who got free care, vs. 4 among children who didn’t.
    • Parasite prevalence and nutrition status were also measured but not considered to be good measures of the program’s effects (since it did not include deworming or nutrition-centered care).

Would you conclude from this that the free medical care was “ineffective”? I wouldn’t – I’d conclude that the study ended up with very few cases of illness, and therefore low statistical “power,” because the children it studied were much healthier than expected. The researchers predicted an anemia prevalence of 10%, but the actual prevalence was just under 3%. Severe anemia and death were even rarer, making any comparison of those numbers (2 vs. 3 and 5 vs. 4) pretty meaningless. So in the end, we’re looking at a control group of 37 kids with moderate anemia and looking for a significant difference in the other group, from a 6-month program – and one that didn’t even address all possible causes of anemia (again, there was no deworming, and it doesn’t appear that there was iron supplementation – the only relevant treatment was antimalarials).
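To make the “low power” point concrete, here’s a rough back-of-the-envelope sketch in Python – my own illustration, not anything from the study or its authors. It approximates the power of a two-arm comparison of proportions using Cohen’s arcsine effect size and a normal approximation; the 30% relative reduction is a purely hypothetical effect size, chosen to show how power collapses when prevalence falls from the planned 10% to the observed ~3%.

```python
from math import asin, sqrt
from statistics import NormalDist

def power_two_proportions(p_control, p_treat, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided, two-sample test of proportions,
    using Cohen's arcsine effect size h and a normal approximation.
    (Illustrative helper, not the study's actual analysis.)"""
    h = 2 * (asin(sqrt(p_control)) - asin(sqrt(p_treat)))  # Cohen's h
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(h) * sqrt(n_per_arm / 2) - z_crit)

n = 1296  # ~half of the 2,592 children in each arm

# Hypothetical effect: a 30% relative reduction in moderate anemia.
# At the planned 10% prevalence, power is reasonable (roughly 0.8)...
print(power_two_proportions(0.10, 0.07, n))

# ...but at the observed ~2.85% prevalence it collapses (roughly 0.3),
# i.e., the study would usually miss even a substantial real effect.
print(power_two_proportions(0.0285, 0.02, n))
```

With only ~36-37 cases of moderate anemia per arm (and single-digit counts for severe anemia and death), even a fairly large real effect would be hard to distinguish from chance.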

Bottom line: free medical care didn’t appear to lead to improvement, but there also didn’t appear to be much room for improvement in this particular group. A similar critique appears in the journal itself (and points out that we don’t even know how much of the anemia can be attributed to malaria vs. parasites or other factors).

Some possible explanations for the relatively low levels of anemia include:

  • The presence of observers led everyone to make more use of primary care (the “Hawthorne effect,” a possibility raised by a Marginal Revolution commenter).
  • Less healthy people (and/or people who used primary care less) were less likely to stay enrolled in the study (7-8% dropped out), so that the people who stayed in had better health.
  • Or, for some other reason (selection of villages?), the researchers studied an unusually healthy group – perhaps one that already used primary care when it mattered most, such that the “extra” visits paid for by the intervention were lower-stakes ones, or simply weren’t enough (again, only a 12% difference) to affect major health outcomes among the small number of afflicted children.

All of these seem like real possibilities to me, and the numerical results found don’t seem to strongly suggest much of anything because of the low power (as the critique observes).

I saw a similar dynamic play out a month ago: Marginal Revolution linked a new study claiming that vaccination progress has been overstated, but a Center for Global Development scholar raised serious methodological concerns about the study. I haven’t examined this debate enough to have a strong opinion on it, and overestimation seems like a real concern; but we want to see how the discussion and reaction play out before jumping to conclusions from the new study.

We’re all for healthy skepticism of aid programs, and we like reading new studies. But in drawing conclusions, we try to stick to studies that are a little older and have had some chance to be reviewed and discussed (and we generally look for responses and conflicting reactions). Doing so still leaves plenty of opportunities to be skeptical, as with the thoroughly discussed New York City Voucher Experiment and other ineffective social programs.

Why we prefer the carrot to the stick

A couple of the commenters on a previous post object to our idea of “rewarding failure” and prefer to focus on “putting the bad charities out of business.”

In theory, I’d like to see a world where all charities are evaluated meaningfully, and only the effective ones survive. But the world we’re in is just too far from that. The overwhelming majority of charities simply perform no meaningful evaluation one way or the other – their effects are a big question mark.

It’s not in our power to sway enough donors – at once – to starve the charities that don’t measure impact. (And even if it were, there are simply too many of these for starving them all to be desirable.) But it is in our power to reward the few that do measure impact, thus encouraging more of it and creating more organizations that can eventually outcompete the unproven ones.

Of course, failure isn’t valuable by itself, and shouldn’t be rewarded. But showing that a program doesn’t work is expensive to do and valuable in and of itself, and should be rewarded. As Paul Brest says, “the real problem is that, unless they are doing direct services, most nonprofits don’t know whether they are succeeding or failing.”

Evaluation is valuable whether or not it turns out to have positive results. Yet currently, only positive results are rewarded – honest evaluation is riskier than it should be. This is the problem that the “failure grant” idea is aimed at.

Uncharitable

Dan Pallotta sent me a copy of Uncharitable about a month ago, and I’ve been late in taking a look at it.

I highly recommend it for people interested in general discussions of the nonprofit sector.

The discussion I’ve seen of the book so far (Nicholas Kristof and Sean Stannard-Stockton) has focused on how much we should be bothered when people make money off of charity. Personally, I feel that I’ve yet to see a good argument that we should care how much money do-gooders make – as opposed to how much good they do (and how much it costs).

The chapter closest to my heart, though, is the one called “Stop Asking This Question.” Mr. Pallotta slams donors who focus on “how much of my dollar goes straight to the programs,” devoting even more ink to the matter than we have. We need more people pounding on this point.

The book’s basic theme, as I understand it, is this: what matters in philanthropy is the good that gets done, not anything else. Attacking programs that are effective (at helping people, at raising money, etc.) because they don’t conform to some abstract idea of yours about how nonprofits should run/think/feel is simply hurtful to the people philanthropy seeks to help (not to mention arrogant). The book often illustrates this point with nonprofit/for-profit analogies that some would find inappropriate … but putting all analogies aside, it seems to me that this basic point shouldn’t be controversial.

A proposal to reward failure

Here’s a grant idea we’d probably pursue if we had the funds to do so. I’d be interested in what other grantmakers think of it. I believe enormous good could be done by offering grants to charities that can prove their programs don’t work.

Proving a program’s ineffectiveness is difficult and expensive – just as much as proving its effectiveness. Few charities have done either. As a result, the world knows extremely little about what programs do and don’t work, and thus about how to change people’s lives for the better.

You should consider this a serious problem unless you think that nearly all charitable programs are effective. (If you think this, you should consider this list of duds.)

Of course, one solution is to reward charities that are effective and can demonstrate it. That’s the approach we generally take. But there are problems with the incentives this approach (by itself) gives to unproven charities.

We’d like an unproven charity to examine its programs rigorously, and get our recommendation if the results turn out positive. But what if the programs it’s running turn out not to work (i.e., not to change lives)? Then the charity will have spent money and time to weaken its own case. Something of a scary proposition – and a reason to be less than evenhanded in conducting evaluations.

But what if a foundation said the following? “If you can really prove that your program isn’t working – not just that it’s underfunded or has room for improvement, but that it fails to change lives when carried out correctly – that’s valuable. That improves our collective knowledge of what works, and demonstrates a true commitment to your mission, not just your activities.

“For a charity that can prove its programs aren’t working, we’ll provide you with the funding to redesign what you’re doing. If you have the right mission and the wrong tactics – and the guts to admit it – we’ll help you change those tactics, so you can accomplish the goal (saving lives, promoting equality of opportunity, etc.) that you really care about.”

If there were enough funding along these lines, carefully examining effectiveness – with no preconceptions or manipulation, just an honest desire for better knowledge – would be win-win for a charity. As it should be.

Guest post: Proven programs are the exception, not the rule

In 2017, the organization 80,000 Hours conducted a project to update the findings in this post. Their work is here.

This is a guest post from David Anderson, Assistant Director at the Coalition for Evidence-Based Policy, the group responsible for the Evidence-Based Programs website. Mr. Anderson’s responsibilities include reviewing studies of effectiveness and looking for proven programs. He’s worked at the Coalition since 2004. The views expressed below are solely those of the writer.

The holiday season often heightens our desire for “feel-good” stories. Rigorous research can fulfill this desire by identifying a few social programs and services that are capable of producing important, positive effects on people’s lives.

For example, the Nurse Family Partnership – a nurse home visitation program for women who are poor, mostly single, and pregnant with their first child (recommended by GiveWell here) – has been shown in three rigorous randomized studies to produce long-term, 40-70% reductions in child abuse and neglect, as well as important improvements in cognitive and academic outcomes for the most at-risk children served by the program. The very existence of research-proven programs like the Nurse Family Partnership suggests that a concerted effort to build the number of these programs through rigorous research, and to spur their widespread use, could fundamentally improve the lives of millions of people.

However, what’s not often recognized is how rare – and therefore how valuable – such examples of proven effectiveness are. Their scarcity stems from two main factors: 1) the vast majority of social programs and services have not yet been rigorously evaluated, and 2) of those that have been rigorously evaluated, most (perhaps 75% or more), including those backed by expert opinion and less-rigorous studies, turn out to produce small or no effects – and, in some cases, negative effects. We’ve identified these patterns through our reviews of the hundreds of studies that federal agencies, Congress, and philanthropic organizations have asked us to look at.

The following are just a few illustrative examples of this general pattern of many good ideas turning out not to work when tested in a rigorous way:

  • 21st Century Community Learning Centers – a rigorous randomized study of after-school programs at 26 elementary schools, funded by the U.S. Department of Education, found that, on average, they had no effect on students’ academic achievement and had negative effects on their behavior (i.e., increased rates of school suspensions and other disciplinary problems compared to control group students). (See note 1 below)
  • Many leading educational software products – a rigorous randomized study of 16 leading educational software products for teaching reading and math (including many award-winning products) found, on average, no difference in reading or math achievement between students who used them in their classrooms and those who were taught through usual methods. (See note 2 below)
  • Even Start – a rigorous randomized evaluation of 18 Even Start family literacy programs funded by the U.S. Department of Education found that, on average, children and adults served by the programs scored no higher on reading tests than their control group counterparts. (See note 3 below)
  • New Chance Demonstration – a rigorous randomized evaluation of 16 schools and organizations funded by the U.S. Department of Labor to provide comprehensive case management to teenage mothers, designed to improve their employment outcomes, found that, on average, these programs had no effect on the mothers’ employment or earnings. (See note 4 below)

Importantly, these findings do not mean that all after-school programs, educational software, family literacy programs, and case management services are ineffective – just that many widely used approaches in these areas don’t work, and additional research is needed to identify those that do. While such findings are disheartening, they illustrate the importance of targeting government funding, as well as individual donations, on the relatively few programs and practices that have been shown in rigorous evaluations to be highly effective. Doing so will increase the likelihood that public and private dollars are truly going to help improve people’s lives in important ways – and that’s something to feel good about this holiday season.

References:

  1. James-Burdumy et al. “When Schools Stay Open Late: The National Evaluation of the 21st Century Community Learning Centers Program Final Report.” U.S. Department of Education/Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance. April 2005.
  2. Dynarski et al. “Effectiveness of Reading and Mathematics Software Products: Findings From the First Student Cohort: Report to Congress.” U.S. Department of Education/Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance. March 2007.
  3. St. Pierre et al. “Third National Even Start Evaluation: Program Impacts and Implications for Improvement.” U.S. Department of Education, Planning and Evaluation Service. 2003.
  4. Quint et al. “New Chance: Final Report on a Comprehensive Program for Young Mothers in Poverty and Their Children.” MDRC. 1997.