The GiveWell Blog

Announcing GiveWell Labs

[Added August 27, 2014: GiveWell Labs is now known as the Open Philanthropy Project.]

The research we’ve been doing for the last couple of years has been constrained in a couple of key ways:

  • We’ve pre-declared areas of focus (based on our guesses as to where the most promising charities would be found), and disqualified charities for recommendations on the basis of their being “out of scope” (though we’ve been gradually broadening our scope).
  • We’ve needed to decide which organizations to recommend without being able to say in advance how much money would go to them as a result. This has led to challenges with the question of “room for more funding.” We’ve had to find charities that could essentially use any amount of funding (large or small) productively, and this has drastically narrowed our options.

We’re now launching a new initiative within GiveWell that will not be subject to either of these constraints. We plan to invest about 25% of our research time in what we’re calling GiveWell Labs: an arm of our research process that will be open to any giving opportunity, no matter what form and what sector.

Through GiveWell Labs, we will try to identify outstanding giving opportunities (whether they’re organizations or specific projects), publish rankings of these giving opportunities (separate from the top charities list we maintain using our existing research process) and try to raise money for these opportunities. Donors have pre-committed a minimum of $1 million to the GiveWell Labs initiative, meaning that we will have at least $1 million to commit to our choice of projects even if we are able to raise nothing else. (We expect to raise more if and when we find great giving opportunities; the $1 million has been committed based on donors’ trust in our ability to find such opportunities.)

Our existing, more systematic work of finding outstanding international aid charities continues. Over the coming year, we expect to spend about 75% of our research time on that work and 25% on GiveWell Labs. Note that our “standard” process continues to gradually evolve and broaden its scope, and hopefully will come to incorporate insights gained through the work on GiveWell Labs. The distinction between the two may even dissolve over time. But at this time, GiveWell Labs is the arm of our process that is open to any giving opportunity, no matter what form and what sector.

In future blog posts, we’ll be giving a lot more information about this project, including:

  • More on why we’re moving in this direction at this time, and why we think a less-constrained, exploratory arm of our research process will help us find better giving opportunities.
  • Our planned process for finding great giving opportunities through GiveWell Labs, and what you can expect from us in terms of transparency.
  • The main qualities we’re looking for in a funding opportunity (when unconstrained by the form or sector of the opportunity), and why we’re looking for them.
  • The areas we think are most likely to yield great giving opportunities, and why.

In the meantime, if you know of any giving opportunities that are (a) not already funded or likely to be funded by others and (b) outstanding opportunities to have a large positive impact, please let us know.

Somalia famine: Update

Over the past month, we’ve worked to understand the situation in Somalia and make a recommendation to donors about where they should give. At this point we’re wrapping up our work with the following conclusions:

  • We wouldn’t recommend giving to support Somalia specifically over supporting everyday aid. While the needs are extreme, we aren’t convinced that individual donors can effectively cause more aid to be delivered via their donations.
  • For those who do want to give, we suggest The International Committee of the Red Cross (ICRC), the World Food Programme (WFP) or Doctors Without Borders (MSF), but with serious reservations about each of these.
  • There is a severe lack of transparency on the part of charities and funders, particularly the US government, that has hindered our ability to understand the situation and make a strong recommendation.

A brief note before getting into the details: most of this work was done by our summer intern, Josh Rosenberg. We couldn’t have learned as much as we did about this crisis without his help. Thanks, Josh.

Details follow.

Giving to Somalia relative to giving to everyday international aid

We believe strongly that the needs inside Somalia are great and that, were it possible to send food or medical supplies such that it would reach people in the region, that would accomplish a great deal of good. We recently spoke with an American journalist in the region, and he said:

I went to a hospital in Somalia recently, and we saw kids in very bad shape. There are just no resources there. They don’t have medicine, IV bags, solution. There are dozens, if not hundreds of people arriving each day that need hospitalization. It’s the same situation with camps inside Somalia.

Nevertheless, we don’t have confidence that it is in fact possible for donors to help get more food and supplies to those who need it. There are numerous reports of World Food Programme food aid being stolen by al-Shabaab, a group classified as a terrorist organization that governs much of the famine zone, and many Western NGOs have been explicitly banned from the region by al-Shabaab or have chosen to leave due to security concerns.

Even for the groups operating in the famine zone, it’s not clear to us that additional funds donated will lead to additional services provided, because donations are fungible. Our guess is that given the dire circumstances, aid organizations are likely to spend whatever resources they can, unrestricted or otherwise, to reach those in need, and individuals’ providing additional funds to the few organizations active in the famine zone may make little difference to this specific relief effort.

That said, we have still had limited contact with organizations operating inside famine zones, and we would be interested in hearing from any if they feel they would reach additional people with additional funding. Were we to gain confidence that an organization could do this, we could plausibly view the donation opportunity as on par with our highest rated organizations.

Assuming one wants to give to Somalia, which organization will be most effective?

We struggled to obtain substantive, credible information to help us answer this question well. In our previous Somalia update, we listed the questions we sought to answer; in most cases, the information we received from charities was too vague to answer them. For example, we often received budget proposals such as “$5 million for water and sanitation projects” with no detail regarding (a) the type of projects (e.g., digging wells vs trucking water vs purifying water) to be implemented, (b) the cost of each project component, or (c) the location where the projects would be implemented.

Given the lack of substantive, credible information, the three factors we focused most on are:

  1. Where is the organization operating? While we believe that there are great needs throughout the region–inside the famine zone, in the rest of Somalia, and in refugee camps and mainland areas of Ethiopia and Kenya–the greatest needs are in the famine zone. In our conversation with a journalist in the region, we asked “How do the conditions in refugee camps in Kenya and Ethiopia compare to conditions inside Somalia?” and he responded:

    My impression is that the refugee camps are pretty well taken care of right now. Even though they’re burgeoning with people, they’re doing OK. There’ve been some disease outbreaks in Ethiopia. They could use more help but there’s already a huge infrastructure there. In Dadaab there’s a huge compound for western aid workers. There’s a bar and restaurant. In Somalia, the people are near death and have no access to resources… I went to Dadaab, and I saw the same thing and saw starving kids and poor families, but there were people driving CARE cars and wearing MSF badges or Save the Children hats, so there are NGOs in the camps, but there’s no help inside Somalia.

    That is, while the people who reach the refugee camps need assistance, they are being served and we don’t have enough information to say that there is room for more funding in the camps.

  2. How transparent is the organization about its activities and spending? Regarding Somalia, but also other disasters, we’ve found limited information about the impacts or results of charities’ programs. We’ve therefore focused on organizations’ transparency and openness to being held accountable for their activities as a proxy for the organizations that are likely to be most effective.
  3. What other information do we have about the organization that would inform our conclusion about where to give now? To the extent we’ve considered the organization in other contexts, we’ve incorporated any additional information into our views here.

Having completed our analysis, three organizations stood out.

  • The International Committee of the Red Cross (ICRC). The ICRC is appealing for funds solely for use in the famine zone, and our understanding from them and from the journalist we spoke to is that they are active in famine areas. ICRC gave us a uniquely detailed plan for scaling up and using funds. Unfortunately, we aren’t cleared to share this plan publicly, but it was a more comprehensive and detailed plan than we received from other charities. However, the plan did not allow us to easily connect what ICRC plans to do with how it would spend money. Also, the journalist we spoke with told us:

    They are working in South Somalia in the al-Shabaab areas where no one else is. But, I’ve been told by some people that they screwed it up for other aid groups because they paid al-Shabaab a tax/bribe to work in those areas, and then al-Shabaab demanded it from other groups. Because al-Shabaab is designated as a terrorist organization by the US government, aid groups had to leave because it wasn’t legal for them to pay money to al-Shabaab. So, while ICRC is doing good work, there’s some resistance to them from other NGOs.

    We have not verified this claim or questioned the ICRC about it.

  • The World Food Programme (WFP). WFP is the only organization we spoke with that makes its detailed reports publicly available on its website. The reports include detailed budgets as well as quantities of food to be delivered and targeted locations. In addition, WFP is one of the largest entities (if not the largest entity) operating in the region, and they have been criticized in the media for mistakes they’ve made. Other things equal, we feel donors are well served to support the groups that will ultimately be seen as “responsible for” the response, because they are most likely to be held accountable by donors and the public. Note that al-Shabaab has denied WFP access to areas it controls. The criticisms that have been made give cause for concern as well, particularly reports of World Food Programme food aid being stolen by al-Shabaab.
  • Doctors without Borders (MSF). We’ve spoken several times with MSF, but have received limited information from them. MSF is operating in the famine zone. We maintain our generally good feelings about the organization, but this is based largely on MSF’s transparency about their activities and needs for donations in past disasters. We are disappointed in MSF’s lack of transparency in this case.

The journalist we spoke with also mentioned some other organizations. We don’t endorse these but include his comments here for those interested:

I interviewed one guy and came across one NGO I thought was doing decent work and one of the few that had American people on the ground. It’s the American Refugee Committee. They’re pretty small and maybe their smallness has helped them be more nimble. He has gone to Mogadishu, and come out and gone back in. A few people have asked me whom to give to and I said that I had seen the American Refugee Committee. That was one of the few western organizations working on the ground. I also saw IRC, the International Rescue Committee. I have some friends in the aid business, and they’ve told me that IRC is there. They’re trying to provide help at camps and hospitals….There is a local NGO in Somalia that is doing good work. It’s called the Hawa Abdi Foundation. There’s a Somali woman doctor who set up a clinic in a camp, and she’s helping a lot of people. She was named one of Glamour Magazine’s top ten women of the year. If you steer any donors to local groups, it’s a good one. She has a track record of doing good work and reaching people. World Vision has done some good work inside Somalia. They were run out but are now starting again. In North Kenya, they’ve done good famine prevention work and set up agricultural projects that are helping people in these drought areas become farmers and less reliant just on cattle. I’ve looked closely at the World Vision and they’re pretty brave. They’re working in areas no one wants to go to.

Our struggles and the lack of transparency

One of the most disappointing aspects of our Somalia research has been the opacity of charities and the US government. We are particularly disheartened by USAID’s consistent position that it cannot meaningfully help us or, by extension, individual donors.

Over the past month, we have spoken with several representatives at USAID, all of whom have told us the same thing:

  • We cannot comment on or off the record about specific aid organizations.
  • We cannot offer any advice about which organizations are likely most effective or have the greatest need for funds.
  • We cannot share any of the information we’ve received from organizations about what they’re doing or how they’ll spend money.

While there may be specific cases of documents that must be kept private for safety or privacy reasons, we feel that most information can and should be shared. We’ve seen USAID documents in some cases because charities have voluntarily sent them to us, and we haven’t seen anything in these documents that would clearly cause harm if shared more widely. Generally, when charities have asked us to keep information confidential (which we’ve honored), we’ve seen little that seems likely to cause harm or danger if shared publicly; confidentiality concerns have seemed to have more to do with charities’ not wanting to be judged in certain ways.

USAID is a government agency funded by the public. USAID has significant, detailed information about NGOs’ activities around the world, and sharing this information publicly would provide significant help to donors aiming to give more effectively. USAID has told us that the information they receive from charities is private and confidential and cannot be shared. This conclusion does not seem valid or just to us.

Donors who care about impact should continue to pressure the charities they support and the US government, which provides funding to them, to be more open with their information.

Working for GiveWell

If you’re interested in working or volunteering for GiveWell, now would be a good time to let us know. We’ve been a 7-person team for the last couple of months, but since two of the hires were temporary, we’re soon going to be back down to 5. We were happy with our productivity when we had 7 people; we have the funds, the management capacity and the desire to get back up to that size if the right people come along.

About the role

We’re looking primarily for Research Analysts – people who will provide support to the goal of finding the best giving opportunities. Research Analyst duties mostly consist of:

  • Reviewing independent research on the best ways to help people and on other issues relevant to giving
  • Reviewing particularly promising charities – including speaking with their representatives and asking critical questions, reviewing and evaluating documents they send, and writing up their answers to critical questions, strengths, and weaknesses
  • Taking part in discussions of which giving opportunities are most promising and of general GiveWell strategy
  • Miscellaneous duties depending on individual preferences, including networking, outreach, writing (e.g., for the blog), and original analysis on research questions
  • We encourage analysts to push their abilities to the limit and take on as much responsibility as they can. An analyst can grow into a major role at GiveWell.

A few practical details on the role:

  • We are located in New York City and currently work in the Tribeca/Chinatown area. Hours are flexible and some telecommuting is allowed, though overall expectations for productivity are high.
  • The general environment is one of intense discussion and debate. We change course and rethink things frequently, and analysts are encouraged to challenge, question and critique their managers.

What we’re looking for

We believe the most important qualities for a Research Analyst are:

  • Passion for finding great giving opportunities. There’s little precedent for the kind of work we do, and we can’t train people to the point where little of their own judgment is required. As a result, analysts end up making a lot of judgment calls and it’s important that these judgment calls be oriented toward finding great giving opportunities.

    In the past, we’ve found that the best employees are the ones who come to us looking to volunteer or work for us, demonstrating pre-existing interest in and passion for the project. That’s why we’re starting this search via our own blog – we think the most promising candidates are likely to be among our readers.

  • Critical thinking/analysis skills. Analysts need not have existing proficiency with data analysis (though it’s a plus), but they do need to be able to approach claims about charities’ impact with skepticism and good critical questions – whether those claims are made by charities, scholars, or Elie and me.
  • Attention to detail. Analysts need to do careful, reliable work whose conclusions we can trust.

How to apply

Email us with a resume and a note on why you’d like to work for GiveWell. We will most likely write back asking you to enter our volunteer process; we generally ask people to volunteer for us before being hired, so that we can get a strong sense of their fit with the organization.

We also appreciate referrals to people who might be a good fit.

Why we can’t take expected value estimates literally (even when they’re unbiased)

While some people feel that GiveWell puts too much emphasis on the measurable and quantifiable, there are others who go further than we do in quantification, and justify their giving (or other) decisions based on fully explicit expected-value formulas. The latter group tends to critique us – or at least disagree with us – based on our preference for strong evidence over high apparent “expected value,” and based on the heavy role of non-formalized intuition in our decisionmaking. This post is directed at the latter group.

We believe that people in this group are often making a fundamental mistake, one that we have long had intuitive objections to but have recently developed a more formal (though still fairly rough) critique of. The mistake (we believe) is estimating the “expected value” of a donation (or other action) based solely on a fully explicit, quantified formula, many of whose inputs are guesses or very rough estimates. We believe that any estimate along these lines needs to be adjusted using a “Bayesian prior”; that this adjustment can rarely be made (reasonably) using an explicit, formal calculation; and that most attempts to do the latter, even when they seem to be making very conservative downward adjustments to the expected value of an opportunity, are not making nearly large enough downward adjustments to be consistent with the proper Bayesian approach.

This view of ours illustrates why – while we seek to ground our recommendations in relevant facts, calculations and quantifications to the extent possible – every recommendation we make incorporates many different forms of evidence and involves a strong dose of intuition. And we generally prefer to give where we have strong evidence that donations can do a lot of good rather than where we have weak evidence that donations can do far more good – a preference that I believe is inconsistent with the approach of giving based on explicit expected-value formulas (at least those that (a) have significant room for error (b) do not incorporate Bayesian adjustments, which are very rare in these analyses and very difficult to do both formally and reasonably).

The rest of this post will:

  • Lay out the “explicit expected value formula” approach to giving, which we oppose, and give examples.
  • Give the intuitive objections we’ve long had to this approach, i.e., ways in which it seems intuitively problematic.
  • Give a clean example of how a Bayesian adjustment can be done, and can be an improvement on the “explicit expected value formula” approach.
  • Present a versatile formula for making and illustrating Bayesian adjustments that can be applied to charity cost-effectiveness estimates.
  • Show how a Bayesian adjustment avoids the Pascal’s Mugging problem that those who rely on explicit expected value calculations seem prone to.
  • Discuss how one can properly apply Bayesian adjustments in other cases, where less information is available.
  • Conclude with the following takeaways:
    • Any approach to decision-making that relies only on rough estimates of expected value – and does not incorporate preferences for better-grounded estimates over shakier estimates – is flawed.
    • When aiming to maximize expected positive impact, it is not advisable to make giving decisions based fully on explicit formulas. Proper Bayesian adjustments are important and are usually overly difficult to formalize.
    • The above point is a general defense of resisting arguments that both (a) seem intuitively problematic (b) have thin evidential support and/or room for significant error.

The approach we oppose: “explicit expected-value” (EEV) decisionmaking
We term the approach this post argues against the “explicit expected-value” (EEV) approach to decisionmaking. It generally involves an argument of the form:

    I estimate that each dollar spent on Program P has a value of V [in terms of lives saved, disability-adjusted life-years, social return on investment, or some other metric]. Granted, my estimate is extremely rough and unreliable, and involves geometrically combining multiple unreliable figures – but it’s unbiased, i.e., it seems as likely to be too pessimistic as it is to be too optimistic. Therefore, my estimate V represents the per-dollar expected value of Program P.
    I don’t know how good Charity C is at implementing Program P, but even if it wastes 75% of its money or has a 75% chance of failure, its per-dollar expected value is still 25%*V, which is still excellent.

Examples of the EEV approach to decisionmaking:

  • In a 2010 exchange, Will Crouch of Giving What We Can argued:

    DtW [Deworm the World] spends about 74% on technical assistance and scaling up deworming programs within Kenya and India … Let’s assume (very implausibly) that all other money (spent on advocacy etc) is wasted, and assess the charity solely on that 74%. It still would do very well (taking DCP2: $3.4/DALY * (1/0.74) = $4.6/DALY – slightly better than their most optimistic estimate for DOTS (for TB), and far better than their estimates for insecticide treated nets, condom distribution, etc). So, though finding out more about their advocacy work is obviously a great thing to do, the advocacy questions don’t need to be answered in order to make a recommendation: it seems that DtW [is] worth recommending on the basis of their control programs alone.

  • The Back of the Envelope Guide to Philanthropy lists rough calculations for the value of different charitable interventions. These calculations imply (among other things) that donating for political advocacy for higher foreign aid is between 8x and 22x as good an investment as donating to VillageReach, and the presentation and implication are that this calculation ought to be considered decisive.
  • We’ve encountered numerous people who argue that charities working on reducing the risk of sudden human extinction must be the best ones to support, since the value of saving the human race is so high that “any imaginable probability of success” would lead to a higher expected value for these charities than for others.
  • “Pascal’s Mugging” is often seen as the reductio ad absurdum of this sort of reasoning. The idea is that if a person demands $10 in exchange for refraining from an extremely harmful action (one that negatively affects N people for some huge N), then expected-value calculations demand that one give in to the person’s demands: no matter how unlikely the claim, there is some N big enough that the “expected value” of refusing to give the $10 is hugely negative.

The crucial characteristic of the EEV approach is that it does not incorporate a systematic preference for better-grounded estimates over rougher estimates. It ranks charities/actions based simply on their estimated value, ignoring differences in the reliability and robustness of the estimates.
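The EEV arithmetic in the examples above is easy to reproduce. The sketch below (in Python) reruns the Deworm the World adjustment from the quoted exchange, and then shows why, under EEV, some N always exists that makes paying the mugger the “rational” choice; the probability and per-person harm figures in the second part are purely hypothetical.

```python
# 1. The Deworm the World calculation quoted from Will Crouch: assume only
#    the 74% spent on control programs counts, so the DCP2 cost per DALY
#    is scaled up by 1/0.74.
dcp2_cost_per_daly = 3.4          # DCP2 estimate, $/DALY
fraction_counted = 0.74
adjusted = dcp2_cost_per_daly / fraction_counted
print(f"${adjusted:.1f}/DALY")    # $4.6/DALY, matching the quote

# 2. Pascal's Mugging under EEV: paying costs $10; refusing has expected
#    cost p * N * harm_per_person. The numbers below are made up.
p = 1e-12                         # assumed probability the threat is real
harm_per_person = 1.0             # assumed $-equivalent harm per person
n_needed = 10 / (p * harm_per_person)
print(f"EEV says pay whenever N > {n_needed:.0e}")
```

However small p is, a large enough N flips the expected-value comparison, which is exactly the feature that makes pure EEV exploitable.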

Informal objections to EEV decisionmaking
There are many ways in which the sort of reasoning laid out above seems (to us) to fail a common sense test.

  • There seems to be nothing in EEV that penalizes relative ignorance or relatively poorly grounded estimates, or rewards investigation and the forming of particularly well grounded estimates. If I can literally save a child I see drowning by ruining a $1000 suit, but in the same moment I make a wild guess that this $1000 could save 2 lives if put toward medical research, EEV seems to indicate that I should opt for the latter.
  • Because of this, a world in which people acted based on EEV would seem to be problematic in various ways.
    • In such a world, it seems that nearly all altruists would put nearly all of their resources toward helping people they knew little about, rather than helping themselves, their families and their communities. I believe that the world would be worse off if people behaved in this way, or at least if they took it to an extreme. (There are always more people you know little about than people you know well, and EEV estimates of how much good you can do for people you don’t know seem likely to have higher variance than EEV estimates of how much good you can do for people you do know. Therefore, it seems likely that the highest-EEV action directed at people you don’t know will have higher EEV than the highest-EEV action directed at people you do know.)
    • In such a world, when people decided that a particular endeavor/action had outstandingly high EEV, there would (too often) be no justification for costly skeptical inquiry of this endeavor/action. For example, say that people were trying to manipulate the weather; that someone hypothesized that they had no power for such manipulation; and that the EEV of trying to manipulate the weather was much higher than the EEV of other things that could be done with the same resources. It would be difficult to justify a costly investigation of the “trying to manipulate the weather is a waste of time” hypothesis in this framework. Yet it seems that when people are valuing one action far above others, based on thin information, this is the time when skeptical inquiry is needed most. And more generally, it seems that challenging and investigating our most firmly held, “high-estimated-probability” beliefs – even when doing so has been costly – has been quite beneficial to society.
  • Related: giving based on EEV seems to create bad incentives. EEV doesn’t seem to allow rewarding charities for transparency or penalizing them for opacity: it simply recommends giving to the charity with the highest estimated expected value, regardless of how well-grounded the estimate is. Therefore, in a world in which most donors used EEV to give, charities would have every incentive to announce that they were focusing on the highest expected-value programs, without disclosing any details of their operations that might show they were achieving less value than theoretical estimates said they ought to be.
  • If you are basing your actions on EEV analysis, it seems that you’re very open to being exploited by Pascal’s Mugging: a tiny probability of a huge-value expected outcome can come to dominate your decisionmaking in ways that seem to violate common sense. (We discuss this further below.)
  • If I’m deciding between eating at a new restaurant with 3 Yelp reviews averaging 5 stars and eating at an older restaurant with 200 Yelp reviews averaging 4.75 stars, EEV seems to imply (using Yelp rating as a stand-in for “expected value of the experience”) that I should opt for the former. As discussed in the next section, I think this is the purest demonstration of the problem with EEV and the need for Bayesian adjustments.

In the remainder of this post, I present what I believe is the right formal framework for my objections to EEV. However, I have more confidence in my intuitions – which are related to the above observations – than in the framework itself. I believe I have formalized my thoughts correctly, but if the remainder of this post turned out to be flawed, I would likely remain in objection to EEV until and unless one could address my less formal misgivings.

Simple example of a Bayesian approach vs. an EEV approach
It seems fairly clear that a restaurant with 200 Yelp reviews, averaging 4.75 stars, ought to outrank a restaurant with 3 Yelp reviews, averaging 5 stars. Yet this ranking can’t be justified in an EEV-style framework, in which options are ranked by their estimated average/expected value. How, in fact, does Yelp handle this situation?
Unfortunately, the answer appears to be undisclosed in Yelp’s case, but we can get a hint from a similar site: BeerAdvocate, a site that ranks beers using submitted reviews. It states:

    Lists are generated using a Bayesian estimate that pulls data from millions of user reviews (not hand-picked) and normalizes scores based on the number of reviews for each beer. The general statistical formula is:
    weighted rank (WR) = (v ÷ (v+m)) × R + (m ÷ (v+m)) × C
    where:
    R = review average for the beer
    v = number of reviews for the beer
    m = minimum reviews required to be considered (currently 10)
    C = the mean across the list (currently 3.66)

In other words, BeerAdvocate does the equivalent of giving each beer a set number (currently 10) of “average” reviews (i.e., reviews with a score of 3.66, which is the average for all beers on the site). Thus, a beer with zero reviews is assumed to be exactly as good as the average beer on the site; a beer with one review will still be assumed to be close to average, no matter what rating the one review gives; as the number of reviews grows, the beer’s rating is able to deviate more from the average.
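BeerAdvocate’s formula is simple enough to implement directly. The sketch below uses the m and C values quoted above (10 and 3.66); the specific review counts and averages are made up for illustration.

```python
def weighted_rank(R, v, m=10, C=3.66):
    """BeerAdvocate-style Bayesian estimate: blend an item's own review
    average R (based on v reviews) toward the site-wide mean C, as if
    every item started out with m 'average' reviews."""
    return (v / (v + m)) * R + (m / (v + m)) * C

# One five-star review barely moves a beer from the prior:
print(round(weighted_rank(R=5.0, v=1), 2))     # 3.78

# A hundred five-star reviews pull it close to (but never all the
# way to) five stars:
print(round(weighted_rank(R=5.0, v=100), 2))   # 4.88

# The same logic applied to the Yelp example: 200 reviews at 4.75
# now outrank 3 reviews at 5.00.
print(round(weighted_rank(R=4.75, v=200), 2))  # 4.7
print(round(weighted_rank(R=5.0, v=3), 2))     # 3.97
```

Note how the last two lines reverse the EEV ranking: the heavily reviewed 4.75-star restaurant wins decisively over the lightly reviewed 5-star one.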

To illustrate this, the following chart shows how BeerAdvocate’s formula would rate a beer that has 0-100 five-star reviews. As the number of five-star reviews grows, the formula’s “confidence” in the five-star rating grows, and the beer’s overall rating gets further from “average” and closer to (though never fully reaching) 5 stars.

I find BeerAdvocate’s approach to be quite reasonable and I find the chart above to accord quite well with intuition: a beer with a small handful of five-star reviews should be considered pretty close to average, while a beer with a hundred five-star reviews should be considered to be nearly a five-star beer.

    However, there are a couple of complications that make it difficult to apply this approach broadly.

    • BeerAdvocate is making a substantial judgment call regarding what “prior” to use, i.e., how strongly to assume each beer is average until proven otherwise. It currently sets the m in its formula equal to 10, which is like giving each beer a starting point of ten average-level reviews; it gives no formal justification for why it has set m to 10 instead of 1 or 100. It is unclear what such a justification would look like. In fact, I believe that BeerAdvocate used to use a stronger “prior” (i.e., it used to set m to a higher value), which meant that beers needed larger numbers of reviews to make the top-rated list. When BeerAdvocate changed its prior, its rankings changed dramatically, as lesser-known, higher-rated beers overtook the mainstream beers that had previously dominated the list.
    • In BeerAdvocate’s case, the basic approach to setting a Bayesian prior seems pretty straightforward: the “prior” rating for a given beer is equal to the average rating for all beers on the site, which is known. By contrast, if we’re looking at the estimate of how much good a charity does, it isn’t clear what “average” one can use for a prior; it isn’t even clear what the appropriate reference class is. Should our prior value for the good-accomplished-per-dollar of a deworming charity be equal to the good-accomplished-per-dollar of the average deworming charity, or of the average health charity, or the average charity, or the average altruistic expenditure, or some weighted average of these? Of course, we don’t actually have any of these figures. For this reason, it’s hard to formally justify one’s prior, and differences in priors can cause major disagreements and confusions when they aren’t recognized for what they are. But this doesn’t mean the choice of prior should be ignored or that one should leave the prior out of expected-value calculations (as we believe EEV advocates do).
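    The first complication above can be made concrete with a hypothetical pair of beers (the names and numbers here are invented): a lesser-known beer with few but excellent reviews, and a mainstream beer with many good reviews, swap places in the ranking when the prior is strengthened.

```python
def weighted_rank(R, v, m, C=3.66):
    # Shrink the review average R toward the site mean C; m sets the prior's strength.
    return (v / (v + m)) * R + (m / (v + m)) * C

# Hypothetical beers: a niche beer with 50 reviews averaging 4.6,
# and a mainstream beer with 2000 reviews averaging 4.2.
niche = dict(R=4.6, v=50)
mainstream = dict(R=4.2, v=2000)

# Weak prior (m=10): the niche beer's high average wins out.
weak = (weighted_rank(**niche, m=10), weighted_rank(**mainstream, m=10))      # ≈ (4.44, 4.20)

# Strong prior (m=500): the niche beer is pulled toward the mean and loses.
strong = (weighted_rank(**niche, m=500), weighted_rank(**mainstream, m=500))  # ≈ (3.75, 4.09)
```

    Nothing in the formula itself says whether m = 10 or m = 500 is right; the ranking flip comes entirely from that judgment call.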

    Applying Bayesian adjustments to cost-effectiveness estimates for donations, actions, etc.
    As discussed above, we believe that both Giving What We Can and Back of the Envelope Guide to Philanthropy use forms of EEV analysis in arguing for their charity recommendations. However, when it comes to analyzing the cost-effectiveness estimates they invoke, the BeerAdvocate formula doesn’t seem applicable: there is no “number of reviews” figure that can be used to determine the relative weights of the prior and the estimate.

    Instead, we propose a model in which there is a normally (or log-normally) distributed “estimate error” around the cost-effectiveness estimate (with a mean of “no error,” i.e., 0 for normally distributed error and 1 for log-normally distributed error), and in which the prior distribution for cost-effectiveness is normally (or log-normally) distributed as well. (I won’t discuss log-normal distributions in this post, but the analysis I give can be extended by applying it to the log of the variables in question.) The more confident one feels in one’s pre-existing view of how cost-effective a donation or action should be, the smaller the variance of the “prior”; the more confident one feels in the cost-effectiveness estimate itself, the smaller the variance of the “estimate error.”

    Following up on our 2010 exchange with Giving What We Can, we asked Dario Amodei to write up the implications of the above model and the form of the proper Bayesian adjustment. You can see his analysis here. The bottom line is that when one applies Bayes’s rule to obtain a distribution for cost-effectiveness based on (a) a normally distributed prior distribution and (b) a normally distributed “estimate error,” one obtains a distribution with

    • Mean equal to the average of the two means weighted by their inverse variances
    • Variance equal to the harmonic sum of the two variances
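    The two bullet points above can be written as a short sketch in Python (my own rendering of the standard normal-normal combination, not code from the linked analysis):

```python
def posterior(prior_mean, prior_var, est_mean, est_var):
    """Combine a normal prior with a normally distributed estimate.
    Posterior mean: the two means averaged, weighted by inverse variances.
    Posterior variance: the harmonic sum of the two variances."""
    w_prior, w_est = 1 / prior_var, 1 / est_var
    mean = (w_prior * prior_mean + w_est * est_mean) / (w_prior + w_est)
    var = 1 / (w_prior + w_est)
    return mean, var

# Prior: mean 0, standard deviation 1. Estimate: mean 10, with varying error.
precise = posterior(0, 1, 10, est_var=0.1 ** 2)  # tiny error: conclusion ≈ 10
rough = posterior(0, 1, 10, est_var=10 ** 2)     # huge error: conclusion ≈ 0.1
```

    With a precise estimate the conclusion sits almost on top of the estimate; with a rough one it barely moves from the prior.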

    The following charts show what this formula implies in a variety of simple hypotheticals. In all of these, the prior distribution has mean = 0 and standard deviation = 1, and the estimate has mean = 10, but the “estimate error” varies, with important effects: an estimate with little enough estimate error can almost be taken literally, while an estimate with large enough estimate error ought to be almost ignored.

    In each of these charts, the black line represents a probability density function for one’s “prior,” the red line for an estimate (with the variance coming from “estimate error”), and the blue line for the final probability distribution, taking both the prior and the estimate into account. Taller, narrower distributions represent cases where probability is concentrated around the midpoint; shorter, wider distributions represent cases where the possibilities/probabilities are more spread out among many values. First, the case where the cost-effectiveness estimate has the same confidence interval around it as the prior:

    If one has a relatively reliable estimate (i.e., one with a narrow confidence interval / small variance of “estimate error”), then the Bayesian-adjusted conclusion ends up very close to the estimate. When we estimate quantities using highly precise and well-understood methods, we can use them (almost) literally.

    On the flip side, when the estimate is relatively unreliable (wide confidence interval / large variance of “estimate error”), it has little effect on the final expectation of cost-effectiveness (or whatever is being estimated). And at the point where the one-standard-deviation bands include zero cost-effectiveness (i.e., where there’s a pretty strong probability that the whole cost-effectiveness estimate is worthless), the estimate ends up having practically no effect on one’s final view.

    The details of how to apply this sort of analysis to cost-effectiveness estimates for charitable interventions are outside the scope of this post, which focuses on our belief in the importance of the concept of Bayesian adjustments. The big-picture takeaway is that just having the midpoint of a cost-effectiveness estimate is not worth very much in itself; it is important to understand the sources of estimate error, and the degree of estimate error relative to the degree of variation in estimated cost-effectiveness for different interventions.

    Pascal’s Mugging
    Pascal’s Mugging refers to a case where a claim of extravagant impact is made for a particular action, with little to no evidence:

    Now suppose someone comes to me and says, “Give me five dollars, or I’ll use my magic powers … to [harm an unimaginably huge number of] people.”

    Non-Bayesian approaches to evaluating these proposals often take the following form: “Even if we assume that this analysis is 99.99% likely to be wrong, the expected value is still high – and are you willing to bet that this analysis is wrong at 99.99% odds?”

    However, this is a case where “estimate error” is probably accounting for the lion’s share of variance in estimated expected value, and therefore I believe that a proper Bayesian adjustment would correctly assign little value where there is little basis for the estimate, no matter how high the midpoint of the estimate.

    Say that you’ve come to believe – based on life experience – in a “prior distribution” for the value of your actions, with a mean of zero and a standard deviation of 1. (The units you use to value your actions are irrelevant to the point I’m making; in this case, the units are simply standard deviations based on your prior distribution for the value of your actions.) Now say that someone estimates that action A (e.g., giving in to the mugger’s demands) has an expected value of X (same units) – but that the estimate itself is so rough that the right expected value could easily be 0 or 2X. More specifically, say that the error in the expected-value estimate has a standard deviation of X.

    An EEV approach to this situation might say, “Even if there’s a 99.99% chance that the estimate is completely wrong and that the value of Action A is 0, there’s still a 0.01% probability that Action A has a value of X. Thus, overall Action A has an expected value of at least 0.0001X; the greater X is, the greater this value is, and if X is great enough, then you should take Action A unless you’re willing to bet at enormous odds that the framework is wrong.”

    However, the same formula discussed above indicates that Action A actually has an expected value – after the Bayesian adjustment – of X/(X^2+1), or just under 1/X. In this framework, the greater X is, the lower the expected value of Action A. This syncs well with my intuitions: if someone threatened to harm one person unless you gave them $10, this ought to carry more weight (because it is more plausible in the face of the “prior” of life experience) than if they threatened to harm 100 people, which in turn ought to carry more weight than if they threatened to harm 3^^^3 people (I’m using 3^^^3 here as a representation of an unimaginably huge number).
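    The X/(X^2+1) figure falls out of the assumptions just stated – a prior of N(0, 1) combined with an estimate whose mean is X and whose estimate-error standard deviation is also X. A quick sketch (the specific claimed values are hypothetical):

```python
def adjusted_value(X):
    """Posterior mean for an action claimed to be worth X (in units of
    prior standard deviations), when the estimate error has standard
    deviation X: combining a N(0, 1) prior with a N(X, X**2) estimate
    by inverse-variance weighting gives X / (X**2 + 1)."""
    return X / (X ** 2 + 1)

# The greater the claimed value, the lower the post-adjustment value:
modest_claim = adjusted_value(2)    # 0.4
grand_claim = adjusted_value(100)   # ≈ 0.01
```

    Note that the post-adjustment value is monotonically decreasing in X (for X > 1): the mugger makes his offer less attractive by making it more extravagant.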

    The point at which a threat or proposal starts to be called “Pascal’s Mugging” can be thought of as the point at which the claimed value of Action A is wildly outside the prior set by life experience (which may cause the feeling that common sense is being violated). If someone claims that giving him/her $10 will accomplish 3^^^3 times as much as a 1-standard-deviation life action from the appropriate reference class, then the actual post-adjustment expected value of Action A will be just under (1/3^^^3) (in standard deviation terms) – only trivially higher than the value of an average action, and likely lower than other actions one could take with the same resources. This is true without applying any particular probability that the person’s framework is wrong – it is simply a function of the fact that their estimate has such enormous possible error. An ungrounded estimate making an extravagant claim ought to be more or less discarded in the face of the “prior distribution” of life experience.

    Generalizing the Bayesian approach
    In the above cases, I’ve given quantifications of (a) the appropriate prior for cost-effectiveness; (b) the strength/confidence of a given cost-effectiveness estimate. One needs to quantify both (a) and (b) – not just quantify estimated cost-effectiveness – in order to formally make the needed Bayesian adjustment to the initial estimate.

    But when it comes to giving, and many other decisions, reasonable quantification of these things usually isn’t possible. To have a prior, you need a reference class, and reference classes are debatable.

    It’s my view that my brain instinctively processes huge amounts of information, coming from many different reference classes, and arrives at a prior; if I attempt to formalize my prior, counting only what I can name and justify, I lose a great deal of accuracy relative to going with my gut. Of course there is a problem here: going with one’s gut can be an excuse for going with what one wants to believe, and a lot of what enters into my gut belief could be irrelevant to proper Bayesian analysis. Formulas do have one appeal: they can be checked by outsiders for fairness and consistency.

    But when the formulas are too rough, I think the loss of accuracy outweighs the gains to transparency. Rather than using a formula that is checkable but omits a huge amount of information, I’d prefer to state my intuition – without pretense that it is anything but an intuition – and hope that the ensuing discussion provides the needed check on my intuitions.

    I can’t, therefore, usefully say what I think the appropriate prior estimate of charity cost-effectiveness is. I can, however, describe a couple of approaches to Bayesian adjustments that I oppose, and can describe a few heuristics that I use to determine whether I’m making an appropriate Bayesian adjustment.

    Approaches to Bayesian adjustment that I oppose

    I have seen some argue along the lines of “I have a very weak (or uninformative) prior, which means I can more or less take rough estimates literally.” I think this is a mistake. We do have a lot of information by which to judge what to expect from an action (including a donation), and failure to use all the information we have is a failure to make the appropriate Bayesian adjustment. Even just a sense for the values of the small set of actions you’ve taken in your life, and observed the consequences of, gives you something to work with as far as an “outside view” and a starting probability distribution for the value of your actions; this distribution probably ought to have high variance, but when dealing with a rough estimate that has very high variance of its own, it may still be quite a meaningful prior.

    I have seen some using the EEV framework who can tell that their estimates seem too optimistic, so they make various “downward adjustments,” multiplying their EEV by apparently ad hoc figures (1%, 10%, 20%). What isn’t clear is whether the size of the adjustment they’re making has the correct relationship to (a) the weakness of the estimate itself, (b) the strength of the prior, and (c) the distance of the estimate from the prior. An example of how this approach can go astray can be seen in the “Pascal’s Mugging” analysis above: assigning one’s framework a 99.99% chance of being totally wrong may seem to be amply conservative, but in fact the proper Bayesian adjustment is much larger and leads to a completely different conclusion.

    Heuristics I use to address whether I’m making an appropriate prior-based adjustment

    • The more action is asked of me, the more evidence I require. Anytime I’m asked to take a significant action (giving a significant amount of money, time, effort, etc.), this action has to have higher expected value than the action I would otherwise take. My intuitive feel for the distribution of “how much my actions accomplish” serves as a prior – an adjustment to the value that the asker claims for my action.
    • I pay attention to how much of the variation I see between estimates is likely to be driven by true variation vs. estimate error. As shown above, when an estimate is rough enough so that error might account for the bulk of the observed variation, a proper Bayesian approach can involve a massive discount to the estimate.
    • I put much more weight on conclusions that seem to be supported by multiple different lines of analysis, as unrelated to one another as possible. If one starts with a high-error estimate of expected value, and then starts finding more estimates with the same midpoint, the variance of the aggregate estimate error declines; the less correlated the estimates are, the greater the decline in the variance of the error, and thus the lower the Bayesian adjustment to the final estimate. This is a formal way of observing that “diversified” reasons for believing something lead to more “robust” beliefs, i.e., beliefs that are less likely to fall apart with new information and can be used with less skepticism.
    • I am hesitant to embrace arguments that seem to have anti-common-sense implications (unless the evidence behind these arguments is strong) and I think my prior may often be the reason for this. As seen above, a too-weak prior can lead to many seemingly absurd beliefs and consequences, such as falling prey to “Pascal’s Mugging” and removing the incentive for investigation of strong claims. Strengthening the prior fixes these problems (while over-strengthening the prior results in simply ignoring new evidence). In general, I believe that when a particular kind of reasoning seems to me to have anti-common-sense implications, this may indicate that its implications are well outside my prior.
    • My prior for charity is generally skeptical, as outlined in this post. Giving well seems conceptually quite difficult to me, and it’s been my experience over time that the more we dig on a cost-effectiveness estimate, the more unwarranted optimism we uncover. Also, having an optimistic prior would mean giving to opaque charities, and that seems to violate common sense. Thus, we look for charities with quite strong evidence of effectiveness, and tend to prefer very strong charities with reasonably high estimated cost-effectiveness to weaker charities with very high estimated cost-effectiveness.

    Conclusion

    • I feel that any giving approach that relies only on estimated expected value – and does not incorporate preferences for better-grounded estimates over shakier estimates – is flawed.
    • Thus, when aiming to maximize expected positive impact, it is not advisable to make giving decisions based fully on explicit formulas. Proper Bayesian adjustments are important, and they are usually too difficult to formalize.

 

Donating to the Somalia famine: A brief update

Since our initial post on the Somalia famine, we’ve continued our research in the hope of providing a stronger recommendation to donors, but we do not yet have enough information to do so. At this point, we maintain our provisional recommendation of Doctors Without Borders (MSF).

Over the past 3 weeks, we’ve contacted many aid and UN-based organizations. We’ve spoken with representatives from Action Against Hunger, CARE, Doctors without Borders (MSF), International Committee of the Red Cross, International Medical Corps, Oxfam, Save the Children, the World Food Programme, and UNICEF.

In our conversations with organizations, we’ve tried to answer the following questions:

  • What, specifically, are your activities in response to the emergency?
  • How do your expenses break down across these activities?
  • In what regions are you working? Are you primarily in the famine zone? In refugee camps? Other locations in the region?
  • Are you appealing for additional funding? If so, how much are you seeking? If you don’t raise all that you are appealing for, would you allocate unrestricted funding to your response?
  • How, specifically, would you spend additional funding?

We’ve also contacted funders such as the UN’s Central Emergency Response Fund (CERF), the Disasters Emergency Committee in the UK and USAID, but they have not been able to give us information about organizations or the situation on the ground that would inform our views of specific aid organizations.

We are waiting on information from several of the charities we’ve contacted to answer the questions above, and once we receive this information, we’ll be in a better position to make a stronger charity recommendation to donors.

For the time being, we maintain our provisional recommendation of Doctors Without Borders (MSF), which is publicly appealing for funds for the crisis in Somalia and its consequences.

Guest post from Vipul Naik

This is a guest post from Vipul Naik about how he decided what charity to support for his most recent donation. We requested this post along the lines of earlier posts by Eric Friedman, Jason Fehr, Ian Turner, and Dario Amodei. Note that this post was written before we published our most recent update on VillageReach.

Early giving: small amounts, based on whims

In September 2007, I joined the University of Chicago for graduate study in mathematics. For the first time, I was drawing a regular stipend that significantly exceeded my financial needs. I could now consider donating parts of my "own" money. Initially, I neither had a strong sense of what I should donate to, nor a burning desire to donate large parts of my savings, though I did have a vague feeling that donating money for worthwhile causes was a nice thing.

My initial "donations" made around December 2007 weren't really donations — they were more gratitude payments to non-profits and organizations that I think have made the world a better place — such as a $100 donation to the Wikimedia Foundation (the non-profit behind Wikipedia). I didn't consider myself a philanthropist trying to achieve specific large-scale change through my giving. Also, my savings weren't very high, and I hadn't mentally adjusted to the concept of making large donations.

Sponsor a kid!

I liked the idea of donating to organizations that serve poor people. However, I wasn't aware of any organization that I considered reliable, and finding one wasn't a priority. In June 2008, I was in downtown Chicago running some errands when I came across street fundraisers advertising for Children International (GiveWell review here), a Kansas-based international NGO that serves children across many developing countries through a one-on-one child sponsorship model. The idea appealed to me (my parents had participated in child sponsorship programs in India). I investigated Children International's website, and three weeks later (July 2008), I decided to sponsor a child for $22/month. A month later, I upped the number of children to two, for $44/month. I continued increasing the number of sponsored children until, around August 2009, the number had increased to 15 kids for $330/month.

Some neat — and life-changing — logic

I read a chapter in Steven Landsburg's book More Sex Is Safer Sex (an expansion of this Slate article) where Landsburg asserted that one should donate to only one charity rather than split one's donations across multiple charities. Landsburg argued that the size of a donation is usually too small to affect the relative merits of different charitable causes — and hence if you chose to give your first $1000 to Charity A rather than Charity B, the same reasoning should continue to apply to your next $1000. "Small" charities are somewhat different: if a donation is large enough relative to the charity's activities, the donation itself can alter the relative merits of different charities. However, for much of impersonal charitable giving to large causes/organizations, Landsburg's reasoning (and the accompanying mathematics) seemed valid, and I was convinced. (GiveWell has a similar philosophy — see this blog post on triage).

Landsburg's "one charity argument," on the surface, was more reason to keep donating to Children International and simply adjust the quantity donated rather than donating extra money to other charities. Or so I thought. But I gradually realized that the argument isn't merely about donating to one charity; rather, it is about donating to the best charity. I had no reason to suspect that Children International was bad, but I had no basis to conclude that they were the best (or anywhere near it). Why did I continue donating to them?

Children International's sponsorship model (as opposed to simply making one-off grants/donations) made it psychologically hard for me to stop donating to them. At the time, I had no idea of candidates for substantially better charities. In hindsight, I should have stopped donating to Children International much earlier, even before I'd found a good charity.

Cutting the sponsorship cord

In late December 2009, I discovered a Bloggingheads diavlog (conversation) between William Easterly and Peter Singer. I'd already read Easterly's books The White Man's Burden and The Elusive Quest For Growth, and I also followed the Aid Watch blog to which he was a primary contributor. I was thus aware of Easterly's work and views on the shortcomings of official aid and development assistance. Peter Singer, a Princeton bioethicist and advocate of greater giving to meet the needs of the world's poorest, was new to me. In the diavlog, Singer mentioned GiveWell, and I followed the link to their website. GiveWell's research and philosophy impressed me. GiveWell did not recommend Children International, but recommended a handful of organizations based on extensive analysis. I wasn't sold on GiveWell's recommendations, but I now had some serious candidates that seemed substantially better than Children International.

I asked Children International to end my sponsorship in February 2010. I decided not to use a regular monthly donation model any more (with its implicit feeling of lock-in) but rather to make periodic donation decisions, with due diligence done each time. I wasn't sure what the period should be: a longer period means each donation is large enough to justify a more thorough investigation, but it also carries risks. Shorter periods between donations and smaller donation amounts reduce the risk of making a large donation to an organization that shuts down, or closes its room-for-more-funding gap, shortly after I donate.

Discovering VillageReach

I continued to follow GiveWell as well as other blogs on philanthropy, aid, poverty, and development. I was reasonably convinced that health systems in low-income countries were low-hanging fruit for donor money. The approach of GiveWell's top charity VillageReach (GiveWell review here) impressed me. I made donations of $1250 in March 2010 and $2000 in June 2010 to VillageReach through GiveWell's website.

Around this time, I started feeling that the one charity argument had exceptions. In some cases, I thought, making a donation tied to specific single projects can actually get those single projects done. Around August 2010, I got in touch with a researcher and talked about partially funding some research related to low-cost private education in the developing world. We had extensive correspondence and phone conversations and in September 2010, I made a donation covering part of the costs of a new research project, with the understanding that any cost overruns would be covered by him. The project was successful (albeit with cost overruns) though the research report is not yet published, so I cannot share details right now. I think this was a case where my willingness to come forward with initial money helped accelerate a project that may otherwise either not have happened or happened a year later.

However, such opportunities are rare and inherently risky. In October 2010, I returned to considering VillageReach for my next donation. I talked over the phone with Holden of GiveWell. I shared some concerns:

  • Did GiveWell have a sufficient incentive to critically re-evaluate their own top-rated charities in light of new data?

  • Why was there very little information or news coverage about VillageReach beyond their own website and GiveWell's evaluation of them?

  • Why hadn't any major donor or foundation agreed to cover VillageReach's funding gap?

Holden addressed my questions, and, shortly thereafter, GiveWell elaborated further in the blog posts Health system strengthening + sustainability + accountability and After "Extraordinary and Unorthodox" comes the Valley of Death.

In December 2010, I made a donation of $5100 to VillageReach, my largest to the organization, bringing my total donations to VillageReach to date to $8350. After donating, I talked over the phone with VillageReach employee John Beale about VillageReach's activities, to help me in future donation decisions.

A new year

I planned to make my next donation around April 2011. GiveWell published an update on VillageReach in March 2011. The good news: GiveWell found no reason, based on VillageReach's latest activities, to modify its analysis of VillageReach's cost-effectiveness. However, the evidence at this stage wasn't sufficiently clear to conclude definitively that VillageReach's current programs would be as successful as (or more successful than) the pilot programs on which GiveWell had based its analysis.

GiveWell's recommendation was responsible for about $1.1 million of the roughly $2 million that VillageReach raised in 2010. VillageReach had originally projected a need for slightly under $6 million for their Mozambique project, which was to continue until 2014. They seemed to be on track to meet their funding needs. I was now unsure of the value of my marginal donation. I would still have reason to donate to VillageReach if either:

  1. They could deliver demonstrably greater benefits by rolling out their program much more quickly, and they could do so by getting funding more quickly.

  2. GiveWell could identify other top charities so that, once VillageReach's funding gap was closed, other donors could donate instead to these other top charities.

I talked again with VillageReach's John Beale in March 2011, and although I continued to be convinced of VillageReach's effectiveness, I was unconvinced about (1). The key hope was now (2) — could GiveWell identify more top charities soon? GiveWell had already identified finding top charities as their top priority for 2011 (see here and here). However, by the end of April 2011, I wasn't convinced that they'd be successful. Thus, I decided to hold off on my donation.

Independently, I started investigating other forms of philanthropy (such as those covered at the Breakthrough Philanthropy conference). I find some of them promising but don't yet feel confident enough to make a large donation to any of those organizations. In the meantime, I continue to check GiveWell's updates on VillageReach and on their search for new top charities.