The GiveWell Blog

Qualitative evidence vs. stories

Our reviews have a tendency to discount stories of individuals, in favor of quantitative evidence about measurable outcomes. There is a reason for this, and it’s not that we only value quantitative evidence – it’s that (in our experience) qualitative evidence is almost never provided in a systematic and transparent way.

If a charity selected 100 of its clients in a reasonable and transparent way, asked them all the same set of open-ended questions, and published their unedited answers in a single booklet, I would find this booklet to be extremely valuable information about their impact. The problem is that from what we’ve seen, what charities call “qualitative evidence” almost never takes this form – instead, charities share a small number of stories without being clear about how these stories were selected, which implies to me that charities select the best and most favorable stories from among the many stories they could be telling. (Examples: Heifer International, Grameen Foundation, nearly any major charity’s annual report.)

A semi-exception is the Interplast Blog, which, while selective rather than systematic in what it includes, has such a constant flow of stories that I feel it has assisted my understanding of Interplast’s activities. (Our review of Interplast is here.)

I don’t see many blogs like this one, and I can’t think of a particularly good reason why that should be the case. A charity that was clear, systematic and transparent before-the-fact about which videos, pictures and stories it intended to capture (or that simply posted so many of them as to partly alleviate concerns about selection) would likely be providing meaningful evidence. If I could (virtually) look at five random clients and see their lives following the same pattern as the carefully selected “success stories” I hear, I’d be quite impressed.

But this sort of evidence seems to be even more rare than quantitative studies, which are at least clear about how data was collected and selected.


  • Tony Pipa on April 26, 2009 at 11:39 pm said:

    To my mind, narrative is a much better metaphor for capturing the intrinsic value of nonprofits than the quant metaphors of the business world that so many (including GiveWell) seem obsessively determined to apply to nonprofits. However, as you point out, narrative as used by almost all nonprofits now is basically marketing, because there is subjective bias in which stories are told and what of that story is presented. Love your suggestion to combine 100 unedited interviews, and would like to see a foundation take it and fund a pilot of a group of grantees to carry it out, to test its value. I’d also love to see much more discussion and more suggestions of how to make narrative meaningful to measurement/learning by those who are developing models to measure effectiveness.

  • Sean Stannard-Stockton on April 27, 2009 at 9:08 am said:

    It seems to me that your post highlights the need for charity evaluation to be done in a way that the analyst seeks out info rather than requesting it from the charity. Bill Somerville, the author of Grassroots Philanthropy is a big believer in qualitative evaluation. But he also feels that someone evaluating a charity should spend the majority of their time out in the field. He also evaluates “grassroots” (small) nonprofits.

    It may be that qualitative analysis is the best approach in many cases, but it requires that the analyst have the ability to seek out the experiences to evaluate. This of course would be plausible if you were evaluating small, NY-based nonprofits. But until our sector grows significantly, we simply won’t have the scale to do it for large nonprofits.

    In the classic book One Up On Wall Street, Peter Lynch argued that individual investors could do better than wall street investors because the individuals could spend their time in the field talking to friends and family and understanding how popular products and services actually were.

    So here’s a challenge for you. Why don’t you pick one small, NY-based nonprofit to evaluate and attempt to do it as qualitatively as possible?

  • Holden on April 27, 2009 at 10:13 am said:

    Sean – our target audience is donors who don’t have the time or resources for their own in-depth investigations. Thus, we need to focus on finding information that is generally useful and evaluating nonprofits that have significant capacity to take individual donations.

    What you suggest would be an interesting experience. It would also be a large investment of time in something that wouldn’t necessarily turn up a charity recommendation or any generally applicable information. We might end up with a great charity or a terrible one; in either case, focusing so heavily on one would likely skew our perspective more than it would inform it.

    Something we are thinking of doing is conducting up-close investigations of charities that our existing process has identified as the top ones, to see how the picture on the ground compares with the picture given by the formal reports.

    In any case, the message of this post isn’t that quantitative information is inferior to qualitative information, or that reports are inferior to up-close observation. Both have strengths and drawbacks. The aim of this post was to distinguish between useful qualitative information and the carefully selected “stories” that most charities put at the heart of their case.

  • Ingvild Bjornvold on April 30, 2009 at 9:01 am said:

    From an anthropologist: There is a time and a place for qualitative data. I don’t blame a nonprofit for using stories to illustrate their successes in their marketing materials – why shouldn’t they? And I don’t think it would make much sense to do open-ended interviews with all the clients of a nonprofit (unless it’s very small) – imagine the nightmare of making sense of the data!

    That said, I think qualitative and quantitative data should supplement each other more frequently than they do. It would be a very good idea for GiveWell to check out its top choices on the ground. Just because an RCT has found a program effective in the past doesn’t mean it is still implemented with fidelity to the effective model…

    Qualitative interviews make most sense with a relatively small number of key people in the exploratory phases of an evaluation – say in the case of a process evaluation where the goal is to understand the range of issues volunteers are faced with. But the next step, if the purpose is to understand what’s going on on a larger scale, should be to create structured interviews to determine how widespread the issues that emerged are.

    Qualitative exploration also makes sense if quantitative data is mysterious in some way – it’s a great way to begin to find out what may lie behind the numbers.

    In short: No numbers without stories; no stories without numbers.

  • I would have to agree with Ingvild’s comment, “no numbers without stories; no stories without numbers.”

    If we’re constantly being fed solely numbers, charts, data and figures, we sometimes lose our focus. I think that the stories keep a real human aspect, and I think that they shouldn’t be dismissed.


  • Holden on April 30, 2009 at 7:34 pm said:

    Ingvild and Nick, thanks for the comments.

    Generally, I’m not worried about losing sight of the stories. They’re front and center on every charity’s website; trying to figure out how representative they are is what drives our analysis. If we ever reached the point where people were giving purely based on numbers, I’d agree that this was a problem, but I don’t think that’s a substantial risk for any donor right now.

    We do hope to check out our top choices on the ground and will be discussing this further later on.

    Ingvild, to clarify, I didn’t advocate interviewing all clients, I advocated interviewing a large and representative sample. I believe that reading through (or even skimming or spot-checking) unedited content from such interviews would help my understanding.

  • Shari on May 1, 2009 at 6:11 pm said:

    Great points, all. I think there are two different aspects to this issue that have not been distinguished yet. First, there is the need for semi-scientifically gathered and analyzed qualitative information to supplement financial and quantitative data. This would be used by foundations and other super-full time philanthropists and interested parties. Second, there is a need for a large amount of qualitative data, period. This would be used by part-timers, those that simply check out major (and very valuable) websites like Guidestar or Charity Navigator or the nonprofit’s own website before deciding to donate/get involved.

    The first situation requires time, money, and plenty of forethought by those who would gather and use that data. The second situation, however, does not. I reference “Yelp,” a consumer review site for any sort of business.

    The key to valuable qualitative information, ironically, is quantity. The more testimonies you have from people who know about a nonprofit, the more you know about it. In aggregate, you get a pretty good picture of what’s going on. This is where the ubiquitous, viral power of the internet really comes into play. (I don’t want to be obnoxious, but I feel I should mention that our website, GreatNonprofits, is currently serving as an aggregator of user-generated reviews about nonprofits)

    In the absence of a set, universally adopted standard of excellence for the nonprofit sector, the only thing left (and I don’t think it’s such a bad deal) is to turn to those we serve and those that know about our work to make the judgment.

  • David Bonbright on May 3, 2009 at 9:22 am said:

    Thanks to everyone for contributing to this good discussion. Like GiveWell and GreatNonprofits, my organization Keystone Accountability is dedicated to improving the quality of evidence available to understand the difference organizations make when they set out to improve the world in some way. Like GreatNonprofits, with which we are pleased to collaborate, we specialize in cultivating feedback from those who are meant to benefit. From this vantage point, I can briefly share some insights from our work.

    First, as this conversation shows, the debate has moved beyond qualitative versus quantitative. Those with a deep interest in these themes may want to look into the current debates in the field of impact evaluation. One articulation of the bleeding edge there is summarized in a 2-page call for action at

    Second, intensive field observation by third-party analysts (as valuable as it is) just won’t scale in a way that will allow us to meet the system-level challenge to get all organizations to put better information into the public domain. We need open review systems like the one that GreatNonprofits operates as this will increase the volume of independent opinions about performance. But we need more than this as well.

    Third, and to my mind most importantly, we need feedback data to be collected and aggregated in a way that creates scientifically valid comparisons across organizations. The Center for Effective Philanthropy’s (CEP) Grantee Perception Report(tm) has done this at the level of grantee feedback to foundations. Keystone is now working with CEP, a number of leading foundations, human services organizations, and charity rating websites to do this at the level of the primary constituents of social change — those meant to benefit. We need to create a new information infrastructure for comparative feedback in order to do this at scale, perhaps something along the lines of what the customer satisfaction industry has done in the commercial context.

    Fourth and finally, there is a very simple thing that we can all do in order to make a giant step forward in this space. We can all adopt and insist upon the feedback principle of public reporting by organizations seeking to create social and environmental benefits. The feedback principle asks organizations to report their results in whatever way they choose, but in addition they should report what those meant to benefit have to say about those alleged results. This feedback would provide a kind of validation to the purported results, and would need to be undertaken in a methodologically valid way. The method for this kind of constituency validation of results is not the hard part here. The challenge is to gain a sufficient appreciation of the necessity of this kind of system upgrade to our theory of public reporting.

  • Valerie Threlfall on May 13, 2009 at 6:39 pm said:

    As David provided us with an entrée to this discussion, I wanted to highlight a recent effort by the Center for Effective Philanthropy (CEP), in conjunction with the Bill & Melinda Gates Foundation, to do exactly what some are asking for above: piloting a foundation-led effort to collect qualitative and quantitative data in a systematic way from those whom foundations and nonprofits are ultimately trying to serve.

    Earlier this year, CEP and the Gates Foundation launched YouthTruth in 20 Gates-funded high schools around the country. As part of this effort, CEP surveyed more than 5,300 high school students to hear from them what was working well and not working well within their schools. Students provided quantitative feedback about various aspects of their school experience — such as their relationships with teachers, the rigor of their coursework, and the factors that make it hard for them to succeed in school — as well as qualitative open-ended feedback about how their school could improve. Their feedback was systematically analyzed, presented on a comparative basis across schools participating in the pilot, and shared with school leaders, grantees, and the Foundation.

    The project is highlighted in the May 7 issue of the Chronicle of Philanthropy. See “Talking Back to Bill Gates: Do His Grants Matter?” at

    We believe YouthTruth, as a pilot Beneficiary Perception Report (BPR) tool, has the potential to serve as a model for how to rigorously collect feedback from those on the ground experiencing program interventions and share that data back with those funding programs. Of course, as David and Shari highlight above, scaling is the real issue as the field needs to continue to develop the “will” and the tools to enable more systematic and ongoing feedback loops.

Comments are closed.