The GiveWell Blog

Qualitative evidence vs. stories

Our reviews tend to discount stories of individuals in favor of quantitative evidence about measurable outcomes. There is a reason for this, and it’s not that we only value quantitative evidence – it’s that (in our experience) qualitative evidence is almost never provided in a systematic and transparent way.

If a charity selected 100 of its clients in a reasonable and transparent way, asked them all the same set of open-ended questions, and published their unedited answers in a single booklet, I would find this booklet to be extremely valuable information about their impact. The problem is that from what we’ve seen, what charities call “qualitative evidence” almost never takes this form – instead, charities share a small number of stories without being clear about how these stories were selected, which implies to me that charities select the best and most favorable stories from among the many stories they could be telling. (Examples: Heifer International, Grameen Foundation, nearly any major charity’s annual report.)

A semi-exception is the Interplast Blog, which, while selective rather than systematic in what it includes, has such a constant flow of stories that I feel it has assisted my understanding of Interplast’s activities. (Our review of Interplast is here.)

I don’t see many blogs like this one, and I can’t think of a particularly good reason why that should be the case. A charity that was clear, systematic and transparent before-the-fact about which videos, pictures and stories it intended to capture (or that simply posted so many of them as to partly alleviate concerns about selection) would likely be providing meaningful evidence. If I could (virtually) look at five random clients and see their lives following the same pattern as the carefully selected “success stories” I hear, I’d be quite impressed.

But this sort of evidence seems to be even rarer than quantitative studies, which are at least clear about how data was collected and selected.

Philanthropy Action points to more evidence on education interventions

Board member Tim Ogden writes,

Mathematica Policy Research has conducted a multi-year randomized controlled trial of sixteen educational software programs (covering both reading and math) aimed at elementary and middle school students. The products selected were generally those that had at least some evidence of positive impact … the educational software didn’t make much difference.

The second-year study included 3,280 students in 77 schools across 23 districts (page xvi – details on sample sizes on pages 4 and 9) in first, fourth and sixth grade (page 70), and randomly assigned classrooms (page 65) to incorporate or not incorporate one of ten software programs (see page 70). Effects on test scores (details of tests on page xviii) had not been statistically significant for any grade in year 1 (pages xviii-xx); second-year effects were not significantly different for first- and fourth-graders, and were mixed (better in one case; worse in another) for sixth-graders (page xx).

The results are consistent with a fairly substantial body of evidence that developed-world education is an area in which it is extremely difficult to get significant results. (This includes research discussed in recent blog posts here and here, as well as more examples of failed programs discussed on GiveWell.net.)

Note that the second-year study was released a couple of months ago, though we learned of it via Mr. Ogden’s recent blog post. Also note that we haven’t thoroughly examined it, as it does not point to a new promising approach, but rather adds more evidence to a theme we’ve noted many times.

Mr. Ogden also discusses research on education in the developing world, about which we’ll have more to say later.

The most important problem may not be the best charitable cause

I recently ran across a charity called Project AK-47 that declares:

Over 100,000 kids are carrying machine guns in the armies of Southeast Asia. Instead of walking to school, they march to war. Instead of playing, they train to kill. If we don’t intervene, most of these children will be soldiers for at least 7 more years…assuming they survive.

We have been rescuing as many of these child soldiers as possible. But right now, without more help, we have to turn many child soldiers away. Your $7 can make the difference between life and death for a child soldier.

A kid or a killer…you decide.

It’s a powerful emotional appeal, and if I could make the purchase they advertise, I would (many times over). There’s just one problem: after carefully examining the entire website, I cannot determine what this organization does.

It mentions paying for “7 days of food,” “7 days of quality education,” “play clothes to replace a child’s army uniform,” and “supplies for a child’s initial urgent medical care and hygiene” … but what is the plan to prevent them from becoming soldiers? Is this nonprofit hiring mercenaries to conduct armed rescues? Coming into peaceful communities and hoping that its help will discourage children from turning to the military? Or something else? And whatever it is, is it doable and does it work? I couldn’t find the answer.

It’s an extreme example of a style of argument common to nonprofits: point to a problem so large and severe (and the world has many such problems) that donors immediately focus on that problem – feeling compelled to give to the organization working on addressing it – without giving equal attention to the proposed solution, how much it costs, and how likely it is to work. Another example is the massive support for organizations such as the Save Darfur movement, despite serious questions about what exactly Save Darfur is trying to do (questions that I doubt most of its supporters have looked into).

Many of the donors we hear from are passionately committed to fighting global warming because it’s the “most pressing problem,” or to a particular disease because it affected them personally – even while freely admitting that they know nothing about the most promising potential solutions. I ask these donors to consider the experience related by William Easterly:

I am among the many who have tried hard to find the answer to the question of what the end of poverty requires of foreign aid. I realized only belatedly that I was asking the question backward … the right way around [is]: What can foreign aid do for poor people? (White Man’s Burden pg 11)

As a single human being, your powers are limited. As a donor, you’re even more limited – you’re not giving your talent or your creativity, just your money. This creates a fundamentally different challenge from identifying the problem you care most about, and can lead to a completely different answer.

In my case: I would rather close the achievement gap than fight developing-world disease, but my giving goes to the latter because it’s a problem that I can do much more to address.

The truth is that you may not be able to do anything to help address the root causes of poverty or cure cancer or solve the global energy crisis.* But you probably can save a life, and insisting on giving to the “biggest problem” could be passing up that chance.


*I haven’t looked into the latter two, and it’s possible that they are more tractable. If you know something about their tractability, I encourage you to share it.

Volunteer tutoring program

Via Joanne Jacobs: a large randomized controlled trial found statistically significant effects of a volunteer tutoring program on reading skills.

The effect size (.1-.16 standard deviations on 3 measures; insignificant on one other – see pg 13 of the full study) is in the same ballpark as the effect observed in a recent study of vouchers in D.C. (which we discussed here) – yet it came from a 24-week intervention, as opposed to an effect measured after 3 years of switching schools. (Though which one “costs” more is debatable, since the voucher program simply reallocated public funds whereas this one required time and expense outside the standard school system.)

Note that this program reached much younger children (grades 1-3 – page 5) and focused on those with the worst performance – an approach that seems sensible based on how early the achievement gap appears. It also focused exclusively on reading, an approach that appeals to me intuitively because – speaking purely from intuition – reading seems like a more universally important skill than other skills taught in school.

Though the effect size isn’t huge, it’s an encouraging result.

Positive but underwhelming voucher study

The third-year evaluation of a federally funded school voucher program in D.C. has recently been released (H/T Joanne Jacobs).

We’ve written before that past voucher studies have shown extremely underwhelming (if any) effects, and at first glance this report would seem to be a change in the pattern: “The evaluation found that the OSP improved reading, but not math, achievement overall and for 5 of 10 subgroups of students examined.” But on slightly closer examination, I’m not sure how much there is to be excited about here. A few observations (page numbers refer to the full study, available here):

  • The study found a statistically significant impact on reading performance after year 3 – but no impact on math performance, and no impact on either after years 1 or 2 (xvii).
  • The impact appears largely to have been confined to students who were less disadvantaged to begin with (see page 36). Students coming from “schools in need of improvement” (i.e., the weakest schools) saw no statistically significant improvement.
  • Even setting these caveats aside, the impact was small, estimated at about .15 standard deviations after 3 years for students who used (not just received) the scholarship. For context, a .15 standard-deviation improvement would move a student initially scoring at the 25th percentile to roughly the 30th percentile (a back-of-the-envelope version of this calculation appears just after this list).
  • It strikes me as odd that the estimated effect of using vouchers was so close to the estimated effect of receiving vouchers (.15 vs. .13 standard deviations – see page 36), even though only 41% of recipients consistently used the scholarships and 25% did not use them at all (see page xxiii). The study does not explicitly address the performance of the students who received scholarships but did not use them – if a similar effect showed up there, I’d worry that randomization wasn’t carried out as intended.
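
To make that percentile figure concrete, here is a minimal sketch of the arithmetic, assuming (purely as a simplification – the study does not say this) that test scores are normally distributed; the numbers are illustrative only:

    # Back-of-the-envelope: translate a standardized effect size into
    # percentile terms, assuming normally distributed test scores (a
    # simplifying assumption, not a claim from the study). Requires scipy.
    from scipy.stats import norm

    effect_size = 0.15        # estimated impact in standard deviations (page 36)
    start_percentile = 0.25   # hypothetical student starting at the 25th percentile

    # Convert the starting percentile to a z-score, add the effect, convert back.
    start_z = norm.ppf(start_percentile)
    new_percentile = norm.cdf(start_z + effect_size)

    print(f"{start_percentile:.0%} -> {new_percentile:.0%}")  # roughly 25% -> 30%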

The study is more encouraging than others I’ve seen about the effects of vouchers, but the picture it gives is still very far from the idea that vouchers (alone) can make a significant dent in the achievement gap.