The GiveWell Blog

My donation for 2009 (guest post from Dario Amodei)

This is a guest post from Dario Amodei about how he decided what charity to support for his most recent donation. Dario and GiveWell staff had several in-depth conversations as he worked through his decision, so we invited him to share his thought process here. Note that GiveWell has made minor editing suggestions for this post (though Dario determined the final content).

Before I get into the details of my donation decision, I’d like to first share a bit about myself: I’m a graduate student in physics at Princeton, and am interested, very broadly, in what I can do to make the world a better place. I feel that giving away a significant portion of my income is an important part of that, and since 2006 I’ve been donating to organizations that try to improve life in the developing world. I’ve always tried my best to make my donations as effective as possible, but on my own I was never able to give this task as much attention as it deserved. I happened upon GiveWell in 2008 through a link from an economics blog, and to date it’s been the single most useful resource I’ve found in deciding where to donate. Last year I gave $10,000 through GiveWell’s pledge fund, and ultimately decided to allocate all of this money to Village Reach. Holden and Elie have asked me to share the thought process I went through in making my decision, in the hopes that it might be of use to other donors facing a similar choice.

My focus has always been on developing-world health interventions, because I believe these interventions address some of the world’s most urgent needs in a highly tangible way. Six of GiveWell’s 12 recommended charities operate in this area, including some health charities I’ve donated to in the past. Reading GiveWell’s reports on these charities, it quickly became clear to me that the “three-star” organizations — Village Reach (VR) and Stop TB — really do stand out above the others. Though I respect and am impressed by the two-star organizations, they all seem to have sizable holes in their case for efficacy: for instance, PIH seems to (completely?) lack data on medical outcomes, and the Global Fund seems to have problems with how to use additional funds (William Easterly also seems to have a strongly negative assessment of it in this diavlog).

Thus, I decided to focus on VR (which aims to improve operational logistics for child vaccinations) and Stop TB (which provides governments with funds for tuberculosis treatment). Choosing between these very compelling charities proved difficult, but I don’t regret the considerable effort I put into my choice — as I tried to constantly remind myself, this choice should involve every bit as much effort as buying a $10,000 item for myself. I considered three relevant factors —

  1. Cost-effectiveness
  2. Execution
  3. “Incentive effects” (explained more below)

Cost-effectiveness

GiveWell makes explicit cost-effectiveness estimates (based in part on those of the Disease Control Priorities report) for both organizations: ~$545 per infant death averted for Village Reach, and ~$150-750 per death averted for Stop TB. These are roughly comparable, but don’t take into account the fact that Stop TB mainly treats adults, while VR mainly treats infants and children. I feel that adults are capable of deeper and more meaningful experiences than are infants, and also deeper connections with other people, so an adult death seems worse to me than an infant death (though both are of course bad). Trying to quantify exactly how much worse is very subjective and can also seem calculating (“how many babies would you kill to save an adult?”), but on a practical level one is forced to make difficult decisions with limited funds, and in my case I’d say that I think an adult death is perhaps 2 or 3 times worse than an infant’s death. Thus, adjusted for my personal values, I’d say that Stop TB is ~2-3 times more cost-effective than VR, though I understand that others may validly disagree with this subjective assessment.
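The adjustment above is simple arithmetic, and can be sketched as follows. The dollar figures are the rough estimates quoted in this post; the 2-3x “moral weight” for an adult death is my own subjective assumption, not a GiveWell figure:

```python
# Sketch of the value-adjusted cost-effectiveness comparison.
# All figures are rough estimates from the post; the adult_weight
# multiplier is a personal, subjective assumption.

vr_cost_per_death = 545.0            # ~$545 per infant death averted (Village Reach)
stoptb_cost_range = (150.0, 750.0)   # ~$150-750 per adult death averted (Stop TB)
adult_weight = 2.5                   # an adult death weighted ~2-3x an infant death

# Convert both to "cost per weighted death averted" so they are comparable.
vr_weighted = vr_cost_per_death / 1.0
stoptb_weighted = tuple(c / adult_weight for c in stoptb_cost_range)

print(f"Village Reach: ${vr_weighted:.0f} per weighted death averted")
print(f"Stop TB: ${stoptb_weighted[0]:.0f}-${stoptb_weighted[1]:.0f} per weighted death averted")
```

On these assumptions Stop TB comes out roughly 2-3 times more cost-effective, which is exactly the conclusion stated above; changing the weight changes the conclusion, which is why I flag it as a personal value judgment.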

Execution

The second factor, execution, is the one I find most important. By execution I mean all the factors that are assumed to go right in an ideal cost-effectiveness calculation, but could go wrong in practice. I take Murphy’s Law very seriously, and think it’s best to view complex undertakings as going wrong by default, while requiring extremely careful management to go right. This problem is especially severe in charity, where recipients have no direct way of telling donors whether an intervention is working. The situation is worse yet in the developing world, where projects cannot count on the reliable infrastructure and basic social trust we take for granted in the developed world. Given all these problems, what I look for in a charity is a simple and short chain of execution in which relatively few things can go wrong, together with rigorous efforts to close whatever loopholes do exist. As far as I can tell, VR fits these criteria better than any other charity I’ve encountered. Vaccines unquestionably save lives if correctly administered, so it’s generally enough to show that functional vaccines are being correctly delivered and administered. Roughly, the major questions I want answered about a vaccination program are:

(a) are the vaccines actually delivered to health clinics?
(b) do the vaccines remain effective during transport and storage?
(c) once in storage, are the vaccines actually administered, and safely so?
(d) does the program have a clear plan for spending additional money, so that donations actually translate to more vaccines?
(e) are vaccination rates measured to check that the whole chain is working?

I won’t go through the details, which are in GiveWell’s report, but VR makes a systematic effort to address each question. Deliveries are tracked by phone in real time (addressing (a)), VR takes an active role in providing power for refrigerators to keep vaccines cold (b), sterilization equipment is provided and stock-outs are tracked (which at least suggests successful administration, (c)), VR has a clear plan (d) for how to use additional funds, and changes in vaccination rates are measured with controls (e). These steps aren’t perfect – for example, there is apparently no systematic reporting confirming the actual correct administration of vaccines, so step (c) has some room for error – but overall the chain of execution is tighter than any I’ve seen, and the potential holes seem small enough to be manageable.

By contrast, in Stop TB’s case, such a chain (if I could even write it down) would be much longer — Stop TB hands drugs over to governments (involving several layers of administration, differing from country to country) which then must perform all the logistical details VR must perform, plus diagnostics, recurring treatments, and in some cases second-line treatment. There is also the possibility of TB evolving resistance if treatments are not correctly administered. Stop TB’s random inspections, cure rate data, and external auditing seem suggestive of positive results, but my inability to examine in detail a process that I know is quite complex ultimately leaves me very suspicious about efficacy. This isn’t just a matter of Stop TB being a large organization; rather, the problem is that I can’t see the full process of treatment setup and administration, whether applied to one person or a million. Lacking that clear and full view of Stop TB, I have to conclude that VR is the winner on execution.

Incentive effects

Given only VR’s superiority on execution and Stop TB’s superiority on cost-effectiveness, I would be about equally inclined to support either, with perhaps a small edge to VR because execution is so critical. However, it’s important to look at the incentive effects of my donation — the money I give out is not just a one-shot intervention, but also a vote on what I want the philanthropic sector to look like in the future. Along these lines, I see three additional advantages to VR, which make it the clear winner in my mind:

  1. VR’s small size means that funds given to it through GiveWell could greatly change its funding situation (GiveWell seems to have been responsible for a sizable fraction of VR’s total donations last year). What happens to Village Reach could make a notable impression on other charities, which badly need to hear that focusing on efficacy can pay off.
  2. In my view, incentivizing careful execution is a higher priority right now than incentivizing cost-effectiveness. Cost-effectiveness would be important if there were many good charitable opportunities and not enough money to fund them all. Instead, the current situation seems to be that a lot of programs are probably a waste of money. It thus makes sense, from an incentive point of view, to reward charities that focus maximally on execution — such as VR.
  3. Logistics and efficiency are extremely important, but don’t make for good headlines. VR should be getting a lot more money than it is, and I want to tell the philanthropic sector that charities can succeed without being flashy.

In addition to all the arguments listed above, there were a number of other factors which I thought about (some of which were raised in GiveWell’s reports and posts) but ultimately had a hard time getting a handle on and so did not give much weight to. I considered too many factors to list them all, but here are a few examples:

  • By lowering child mortality, could VR have different effects on population growth than Stop TB? If so, is population growth beneficial or harmful?
  • A vaccination or treatment doesn’t only save one person; it also impedes the spread of the disease. Could TB treatment and child vaccinations differ in how much they do this?
  • Stop TB treats people who live in less isolated areas and thus have more opportunity to interact with others and indirectly improve their lives. How important is this?
  • VR’s logistics ideas could be applied to many health interventions. If VR’s model spreads and proves effective on a wider scale, how large would the overall benefits be?

Any one of these effects could theoretically be important enough to outweigh all my arguments for VR, so this list serves as a reminder that there can never be any guarantees of efficacy, let alone optimality. Uncertainty, however, is simply part of life, and all I can do is go with my best guess, so I decided to give to VR.

I hope (though I cannot be sure) that my donation will save the lives of 20 children (which is what the cost-effectiveness numbers work out to). That’s a truly staggering benefit, and honestly it came at very little cost to myself: I don’t much miss the new car I didn’t buy, and I’ll gladly make the same sacrifice next year in order to donate again. What did feel very emotionally taxing was reading (and in most cases, agreeing with) all the negative analysis of charities at GiveWell and elsewhere. I found it difficult to evaluate everything in a critical fashion while still holding on to the compassion and optimism that originally inspired me to donate. It’s tough to find the right balance between caring and hard-nosed realism, but it is possible, and it is, as far as I know, the only way to truly change the world.

Are great charities made or born?

Among the groups in our “meta-philanthropy” space, one of the big questions is how to create more “high-impact” (also called “blue-chip”) charities: the rare groups that can reliably, demonstrably translate donations into improved lives.

The rough consensus seems to be that we need to fund and support “high-performance organizations”: groups that have “some, but not all” of the qualities of blue-chip charities. The idea is that charities start with a few good qualities and slowly grow into blue-chippers.

(See, for example, parts one and two of Tactical Philanthropy’s discussion from last summer, which argues that “the best thing to do is not study how to [achieve impact] and then set out on that exact path with the exact tools needed according to your theory, but instead to build the most robust [organization] possible.” Also see the most recent draft of the Social Risk Assessment Protocol, which effectively gives twice the weight to a charity’s evaluation and adjustment practices that it gives to the charity’s choice of programs.)

I’ve been rethinking this idea. When I look at the three charities that I consider to be most “blue-chip” (our three highest-ranked charities), I don’t see a path of a “strong organization that eventually figured out what to do and whether it worked.” Rather, I see organizations that stayed as small as possible – or didn’t exist – until they had strong evidence of impact for their basic approach. They built their choice of programming into their DNA, as much as they could, from day one.

The reigning consensus seems to treat “evidence of impact” as a late (or at least potentially late) step in the development of a nonprofit, but in fact it has been the first step for the strongest nonprofits I know of.

This makes intuitive sense to me as well.

  • Finding “approaches that work” is fundamentally a research challenge, and probably requires a completely different skill set from running an organization well.
  • Once an organization is “up and running,” it may become a very poor environment for a good impact evaluation. To me a good impact evaluation is one that has a real chance of demonstrating failure, and the stakes may simply be too high for an organization that has already built up significant funds, donors, clients, stories, staff, habits, etc.

The truth is that if an organization wants to become “high-impact,” there are already proven approaches for it to choose from; if it wants to investigate an approach that isn’t yet proven, it can (like VillageReach) stay at minimal size and essentially act as a “research project.” For an organization that has chosen to do neither of these things, and has already “scaled up” its program and built up a staff/organization (no matter how well run) … it may be too late.

GiveWell focuses on finding blue-chip charities, not on creating them. But for those looking to do the latter, I submit that it may be less effective to start with “high-performance organizations” than to start from scratch.

Thoughts on “Moonshine or the kids?”

Nicholas Kristof’s recent column argues that

if the poorest families spent as much money educating their children as they do on wine, cigarettes and prostitutes, their children’s prospects would be transformed. Much suffering is caused not only by low incomes, but also by shortsighted private spending decisions by heads of households.

This argument has provoked a lot of strong reactions, for and against. I find it helpful to separate out three questions the article raises:

  1. Is selfish/bad spending by the poor ever a problem?
  2. Is selfish/bad spending by the poor a major/widespread/leading problem?
  3. Is selfish/bad spending by the poor a promising target for aid? (As we’ve written before, we think there is a vital – and often underrecognized – distinction between “biggest problem” and “best target for aid.”)

With this separation in mind, I think it becomes fairly clear that Mr. Kristof has told an interesting story that points to an interesting hypothesis (perhaps worthy of investigation), but that he’s a long way from making a case for action.

I think the distinction between “interesting story/hypothesis” and “good case for action” is also chronically underrecognized in the world of giving.

1. Is selfish/bad spending by the poor ever a problem?

I’m confident that it is. I didn’t personally observe the two “irresponsible fathers” Mr. Kristof talks about, and I don’t know whether he represented them fairly, but I doubt anyone would try to argue that there are no irresponsible fathers among the over 2 billion people living on under the equivalent of $2/day.

People in our own society aren’t immune to selfishness and shortsightedness; I don’t see why we would expect the poor to be.

2. Is selfish/bad spending by the poor a major/widespread/leading problem?

Most of Mr. Kristof’s column is anecdotal, focusing on two men in one village. His only reference to more systematically collected, possibly representative data is to “The Economic Lives of the Poor” (Banerjee and Duflo 2006) (PDF).

We are familiar with this work and have published our own summary of its implications. I don’t feel that Mr. Kristof gives a fair picture of it. It’s one thing to find specific fathers who could be sending their kids to school if they drank less, but when looking at broader data a lot more care is needed. Much of what looks at first glance like “irresponsible spending” could in fact be benign and reasonable. (For example, when looking at the fact that the poor spend money on “festivals,” I have the same reaction as Aid Watch, which asks, “Is it really such a big surprise that the poor also want recreation? That the poor have a life? Including some of the same vices that the rich have?”)

I think the details of the flaws of Mr. Kristof’s interpretation are well covered by this Wronging Rights post.

None of this disproves his claim. It seems like we simply don’t know the size of the problem he’s describing (and, importantly, where and when it’s a problem).

3. Is selfish/bad spending by the poor a promising target for aid?

Even if selfish/bad spending were shown to be a dominant and widespread problem, there would still be the question of whether aid can reasonably expect to do anything about it, and how our prospects compare to our prospects of (for example) vaccinating more children.

Perhaps because he is aware of this, Mr. Kristof doesn’t suggest heavy-handed interventions. In fact, he suggests expanding microsavings and “giv[ing] women more control over purse strings and more legal title to assets” – both goals that have much to recommend them regardless of whether selfish/bad spending is a problem at all.

Bottom line

I credit Mr. Kristof for saying what many may be afraid to. It is possible that many of the poor’s problems come down to selfish/bad spending.

However, the information we have – or at least the information Mr. Kristof presents – isn’t enough to make the argument any more than a hypothesis. And the last thing we need is another instance of pouring resources into a hypothesis as if it’s already been proven.

For those who find his column compelling, the appropriate response is more investigation, not action.

Please take 3 minutes to help us set priorities

We’re collecting information about people’s favorite causes/charities and giving habits to help us set our research priorities and provide the best service possible.

The link below goes to a survey that should take you about 3 minutes. Whether you’re a major supporter or you’ve never used GiveWell’s research before, please fill it out. We appreciate it.

Take our survey

How the American Cancer Society and Susan G. Komen for the Cure spend their money

This year, we’ve been looking into the cause of disease research. We’re trying to find outstanding organizations for donors interested in giving to support research efforts to develop cures, or new treatments, for cancer and other diseases.

We figured that a logical place to start would be with two big-name organizations: the American Cancer Society and Susan G. Komen for the Cure. The first question we asked was “What do they do?”, and the first thing we found surprised us: funding research into cures or new treatments is a relatively small part of their activities.

American Cancer Society

The American Cancer Society’s 2008 IRS Form 990 demonstrates that ACS is not primarily a research organization.

The following chart shows a breakdown of the American Cancer Society’s Program expenses in 2008; explanatory notes follow. (Note these figures include the payments to affiliates — which themselves account for about 1/3 of the American Cancer Society’s expenses — according to the breakdown in Statement 7 of the 990.)

The 990 offers these notes to explain the different categories (quoted from Part III of the 2008 990):

  • Patient support: Programs to assist cancer patients and their families and ease the burden of cancer for them.
  • Prevention: Programs that provide the public and health professionals with information and education to prevent cancer occurrences or to reduce the risk of developing cancer. (My emphasis)
  • Detection/treatment: Programs directed at finding cancer before clinically apparent & that provide information about cancer treatments for cure, recurrence, symptom management, and pain control.
  • Research: Financial support provided to academic institutions and scientists to seek new knowledge about the causes, prevention, and cure of cancer, and to conduct epidemiological and behavioral studies.

Susan G. Komen for the Cure

Susan G. Komen’s 2008 audited financials paint a similar picture: it is not primarily a research (or research-funding) organization.

The chart below shows Komen’s 2008 program expenses, which include the central organization’s activities as well as those of the affiliates.

Komen’s 2008 Form 990, Schedule O offers additional explanation about the “public health education” category. The following is taken from the beginning of the section discussing its public health education program:

Komen has formed advisory councils to address the breast health and breast cancer needs of people from different cultures and backgrounds… We have developed a variety of educational materials for specific audiences in English and most are also available in Spanish… Examples of our education materials include – Breast Self-Awareness (BSA) cards in 12 languages for 14 specific audiences – General breast health and breast cancer brochures and fact sheets – Booklets with support information for survivors and co-survivors – Outreach resources including breast self-awareness information in CD-ROM, DVD or VHS formats.

Bottom line

We haven’t yet established anything about whether the American Cancer Society or Susan G. Komen is effective (or ineffective) at accomplishing its mission.

But we, at least, have been surprised by this fairly basic information.

  • Both organizations seem to spend relatively small portions of their funds on researching new treatments or cures.
  • Both organizations spend significant portions of their funds on “raising awareness” type activities.

I personally feel more negative about the two charities as a result of this basic check, because (a) I see research into cures/treatments as having a potentially huge upside in humanitarian terms, while public education and provision of existing treatments (in the developed world) don’t seem nearly so promising to me; and (b) the “education” type activities are a red flag to me that research, specifically, may not have room for more funding.

Others may disagree, and I may change my mind after getting more information. But I wonder how many of the donors to these organizations have considered the variety of different activities that “fighting cancer” can mean, and considered whether they’d rather give to an organization that’s focused on a particular one of them.

Futility of standardized metrics: An example

We often hear calls to “standardize metrics” so that nonprofits’ outcomes can be compared directly to each other. As an example of why we find this idea unpromising, I’d like to review some of our work on the first cause we ever investigated: employment assistance in NYC.

We received 19 applications from employment assistance programs in NYC. 7 applicants were able to provide relatively long-term data on job placements and retentions. We initially hoped to compare these outcomes to each other and get some sense of which organization was delivering the most “jobs created per dollar.” It didn’t work out (to put it mildly).

Breakdown #1: youth vs. adults

The HOPE Program, Highbridge Community Life Center, St. Nick’s and CCCS serve unemployed/underemployed adults; Covenant House, Year Up and The Way to Work (formerly Vocational Foundation) explicitly focus on “disconnected youth.” It was immediately clear to us that we would have to subdivide the organizations, and could not directly compare something like Year Up to something like The HOPE Program, since the challenges and opportunities are so different for a youth seeking a “first job” vs. a struggling adult.

Breakdowns beyond the mission statements

The three “youth” organizations listed above may appear similar, if you’re going only off of their basic mission statements.

  • Covenant House: “Through our job training programs, homeless teens can gain skills in a specific vocation and also learn what they need to know about job hunting and the professional world. We also give them interview clothes. Job training programs include courses in the culinary arts, desktop publishing, healthcare, public safety, computer skills, woodworking, and more.”
  • Year Up: “Year Up’s mission is to close the Opportunity Divide by providing urban young adults with the skills, experience, and support that will empower them to reach their potential through professional careers and higher education. We achieve this mission through a high support, high expectation model that combines marketable job skills, stipends, internships, college credit, a behavior management system and several levels of support to place these young adults on a viable path to economic self-sufficiency.”
  • Way to Work (formerly Vocational Foundation): “At The Way to Work we are committed to empowering young New Yorkers ages 17-24 with the tools needed to achieve their highest potential. Formerly known as the Vocational Foundation, Inc. or VFI, we have created lasting impact through our comprehensive, individualized approach to career training, GED preparation, professional and personal counseling, job placement and retention services.”

But when we got into the details of how the different organizations recruit and select clients, it became clear that these three organizations cannot at all be compared in an apples-to-apples way.

Year Up, for example, not only requires a high-school degree (or GED) of all applicants, but conducts a competitive application process and – from the data we looked at – accepted fewer than 1/3 of applicants. (Details.) By contrast, Covenant House asserts that over half its clients have dropped out of school by tenth grade. Year Up places a substantially higher portion of its clients in substantially better-paying jobs than Covenant House, but given the differences in whom they serve, shouldn’t this be expected regardless of the impact of the actual employment assistance programs?

What about Way to Work (formerly Vocational Foundation)? It’s clear that the organization is far less selective than Year Up, as only 23% of its clients have a high school degree or GED, and there does not appear to be a “competitive” process (i.e., willing applicants being turned away). However, there does appear to be a good deal of “self-selection” going on – 2/3 of clients drop out early in the program. (Details.) We have no directly comparable data with which to weigh this organization’s clients against Covenant House’s: we have 75% in “public housing” vs. 53% “homeless” (staying in Covenant House shelter) and 77% with no high school degree (or GED) vs. 50%+ having dropped out of high school by 10th grade. Covenant House has lower placement rates, and we would guess that it is serving the more challenged population, but we can neither verify nor quantify the extent to which it is.

The importance of differences in target populations

At one point we had hoped to – and attempted to – use Census data to figure out how important the differences in target populations were. This proved futile as anything but a super-rough contextualization: the Census data can only be narrowed down in certain ways, and we only had certain information about target populations, and there was no way to really make them match up. However, for this post I pulled together some data on 1999 wage earnings (focusing on the percentage of relevant people who earned more than $20k in 1999) to give a sense of how much small differences can matter.

  • Among 18-24 year olds no longer in school, 27% of those with a high school degree (only) made $20k+; only 17% of those without a high school degree did. People with higher degrees made far more.
  • Among 18-24 year olds no longer in school with a high school degree, earnings varied substantially by neighborhood. 59% of those in the same area as the Covenant House office (Chelsea/midtown) made $20k+, while 27% of those in the same area as the Year Up and Way to Work offices (Financial District) made $20k+.

Details here (XLS)
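To make concrete how much these baseline differences can matter, here is a rough sketch applying one identical, entirely hypothetical program effect to the two neighborhood baselines above (the 10-percentage-point “lift” is an invented number, not data from any of these programs):

```python
# Illustrative only: shows how baseline differences can swamp program effects.
# Baselines are the 1999 Census figures quoted above (share of 18-24 year olds
# no longer in school, with a high school degree, earning $20k+).
# The "lift" of 10 percentage points is a made-up, hypothetical program effect.

baselines = {
    "Chelsea/midtown (Covenant House area)": 0.59,
    "Financial District (Year Up / Way to Work area)": 0.27,
}
hypothetical_lift = 0.10  # assumed identical for both programs

for area, base in baselines.items():
    outcome = base + hypothetical_lift
    print(f"{area}: {base:.0%} baseline -> {outcome:.0%} with program")

# A naive side-by-side comparison of the raw outcomes (~69% vs ~37%) would
# suggest one program is far "better," even though by construction the true
# program effect is identical.
```

This is the core of why raw placement rates can’t be compared directly: the spread between the baselines (32 points here) can dwarf any plausible program effect.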

Apples to oranges

Looking over our overview table for finalists in this cause, it becomes clear how many differences make it impossible to compare their outcomes directly. Their programs target different populations, have different requirements (often varying significantly even within a single charity, which may offer different programs), and train them in different skills. The adult programs have even clearer differences than the youth programs. If St. Nick’s places 2/3 of its clients in Environmental Remediation Technician jobs while Highbridge places just under half in Nurse Aide jobs … what does this tell us about how the two compare?

I wouldn’t be ready to ascribe meaning to a direct comparison of job placements between two charities unless the two charities were working in the same region, with the same requirements, similar actual client populations (in terms of age, educational attainment, etc.) and essentially the same program (since self-selection effects would be different for different programs). Even then, something as simple as a difference in advertising techniques could cause differences in the target populations, differences that could swamp any effects of the programs.

Outcomes vs. impact

What if we could know, for each charity, not just how many clients they placed in jobs, but how many they placed in jobs who wouldn’t have gotten such jobs without the charity’s assistance?

If we had this information, I’d be more ready to compare it across charities. But this information is impact – what Tactical Philanthropy has called the “holy grail” of philanthropy. It’s extraordinarily rare for a charity even to attempt to collect reliable evidence of impact (more).

Our current approach is to seek out the few charities that can give at least somewhat compelling evidence of impact, and recommend them, with the quantification of outcomes and cost-effectiveness as a secondary consideration.

It is simply not feasible to compare charities across large sectors (something like “Employment assistance for disconnected youth in New York City”) in an apples-to-apples way. Even if the charities collected all the information we would like, the fundamentals of their programs and target populations would have to reach an unrealistic degree of similarity.