The GiveWell Blog

GiveWell and Good Ventures

Last year, we met Cari Tuna and Dustin Moskovitz of Good Ventures, a new foundation that plans eventually to give away substantial amounts (Dustin and Cari aim to give away the majority of their net worth within their lifetimes; Dustin is the co-founder of Facebook and, more recently, Asana). We immediately established that Good Ventures and GiveWell share some core values that relatively few others seem to share:

  • Both Good Ventures and GiveWell are aiming to do as much good as possible, from a global-humanitarian perspective.
  • Both are willing to consider any group and any cause in order to accomplish this goal.
  • Both are highly interested in increasing the level of transparency, accountability, and critical discussion and reflection within the world of giving.

Over time, GiveWell and Good Ventures have worked increasingly closely together. In April of last year, Cari joined our Board of Directors; in December of last year, Cari announced substantial grants to our top-rated charities from Good Ventures. In the meantime, Cari was exploring the rest of the world of philanthropy, speaking with a large number of major philanthropists, nonprofit representatives, philanthropic advisors, etc. After a year of exploration, Cari stated to us that while many of the people she had spoken to had been helpful, GiveWell seemed to be most in alignment with the values of Good Ventures and had given the most helpful support in pursuing these values, and that GiveWell’s research appears to her to be at least as high-quality as any foundation research she’s seen. Now, GiveWell and Good Ventures plan to “act as a single team” as we source and vet funding opportunities in areas in which our interests overlap.

This is a partnership, not a merger; we remain separate legal entities. Cari is President of Good Ventures, while Elie and I are Co-Executive Directors of GiveWell; our authorities differ accordingly. If Good Ventures is interested in an area or activity that we aren’t interested in, it will use its resources to pursue this area or activity; likewise, if we are interested in an area or activity that Good Ventures isn’t interested in, we will use GiveWell’s resources to pursue this area or activity.

However, “acting as a single team” does mean that

  • There are substantial areas of overlap between our interests – investigations and activities that rank high on both of our priority lists. The agenda we laid out recently is a close match to current points of intersection.
  • Within these areas, we maintain a common priority list and divide up labor so that we don’t duplicate any work. Division of labor is done by consensus, and if there are unresolvable disagreements each organization makes its own choices about its own resources (this has not happened so far).
  • Within these areas, funding requests and ideas will go through a common process. I.e., if someone brings an idea or request to Cari and we have agreed that it fits within an area that is being primarily managed by GiveWell, she will refer the request or idea to GiveWell rather than evaluating it herself.
  • When given confidential materials that are “for our eyes only,” we will attempt to share these with each other (though of course this will require permission from those providing the materials).
  • We are currently experimenting with close coordination on screening and training new hires. We look for similar qualities in new hires, so people who are interested in a job with one organization or the other may be interviewed by both simultaneously.
  • Overall, the above items require close coordination. For this and other reasons, the GiveWell team is currently planning to move to the Bay Area (more on this in a future post).

It seems to me that this is a relatively unusual arrangement. Formally, each organization has full authority over its own resources and none over the other’s, and this fact underlies all procedures for resolving disagreements if and when we cannot reach consensus. In practice, however, such cases have recently been rare, and it has often felt as though we’re a single team with a single agenda.

Why does this situation seem unusual? One possibility is that it isn’t a good idea and that the problems with it will become apparent in the future; this possibility is why we have been clear about procedures for resolving disagreements. But there is another possible explanation. In my view, nonprofit work is naturally suited to this sort of “teamwork without a single authority” arrangement, in a way that for-profit work is not. Both GiveWell and Good Ventures are mission-driven: there are no financial returns to divide up, just a vision for the world on which we are closely aligned.

I believe that nonprofits sometimes mimic for-profits in ways that don’t make sense given their missions. They raise money beyond what they need for their core work. They keep information confidential rather than publishing it as a public good. And they exaggerate successes and downplay shortcomings, while being more honest would help the rest of the world learn and thus ultimately promote their mission (if not their organization). If I’m right, the relative unusualness of “teamwork without mergers” could be another way in which nonprofits are missing opportunities to be effective that aren’t available for for-profits. I think it’s possible that the sort of collaboration GiveWell and Good Ventures have today will be far more common in the future.

Objections and concerns about our new direction

GiveWell has recently been taking on activities that may seem to represent a pretty substantial change of direction, especially for those who think of us as a “charity evaluator focused on saving the most lives per dollar spent.”

  • Within global health and nutrition, we’re considering restricted funding for specific projects, not just recommendations of particular charities.
  • We’re also exploring other causes that are extremely different from global health and may be far less amenable to measurement and “cost per life saved” type calculations, such as meta-research.

When discussing these activities, we’ve lately been encountering a couple of different objections and concerns; this post discusses the objections and our responses. In a nutshell:

  • Some are concerned that we’ll lose our objectivity if we get involved in providing restricted funding: we’ll be tempted to rank the groups following our plans ahead of the groups following their own plans, and we’ll thus lose the quality of being a disinterested third-party evaluator. We believe we can draw a meaningful line between “charities we recommend for unrestricted funding” and “plans we have designed,” leaving individual donors to decide whether they’d rather take our recommendation unconditionally or only follow our advice in the areas where we’re disinterested; we also believe that being open to providing restricted funding is necessary and important, and justifies the resources we’ll be investing. More
  • Some are concerned that by going into new causes, we’ll be spreading ourselves too thin. Understanding global health is already an ambitious and difficult goal; it’s been suggested that we should “stick to our knitting.” We feel that sticking to global health, when we see other causes as potentially more promising, would be out of line with our fundamental mission and value-added as an organization that seeks to help people do as much good as possible. More
  • Some are concerned specifically about new causes that don’t lend themselves to measurement and cost-effectiveness calculations (such as meta-research). It may be difficult to remain systematic and transparent about how we make decisions in these more speculative areas. We recognize this concern, but feel that we can remain systematic and transparent even where measurement is difficult or impossible; furthermore, we feel that we must find a way to do this if we are to have a strong case that philanthropy as a whole (not just sub-sectors of it) should be more systematic and transparent. More

Despite the concerns and risks above, we feel that the benefits of our new direction outweigh them. A major input into this view is the feeling that sticking to our old process would be extremely unlikely to result in finding more outstanding giving opportunities within a reasonable period of time; this is something we will be writing more about.

That said, we do recognize the concerns and risks, and we are interested in others’ thoughts on them.

The risk of losing our objectivity

To date, all of GiveWell’s recommendations have involved unrestricted support to existing organizations. Because of this, we can be pointed to as a “neutral third party” that recommends organizations based exclusively on impact-related criteria. But we’re now contemplating doing what a lot of major funders do and helping to set the agenda for a funded organization, through the mechanism of restricted funding. If we did this, we might have difficulty being neutral between (a) projects that we help design and (b) charities that are simply asking for unrestricted funds, not contracting with us. In fact, we might be tempted to eschew (b) entirely and focus exclusively on designing – rather than finding – giving opportunities.

One important principle here is that we will draw a clear line between organizations we recommend for unrestricted funding and projects designed by GiveWell. We don’t know exactly how the visual presentation will work yet, but we have agreed on the principle that there will be a clear distinction – including on our higher-level and frequently-accessed pages – between GiveWell-designed projects and recommended charities.

Of course, there is still a risk that recommendations for unrestricted funding will have “soft conditions” (i.e., that it will be clear to charities what activities they have to carry out in order to earn or maintain recommendations); this is something that has always been true, though I think the situation is somewhat mitigated by the nature of the “room for more funding” analysis we perform. (Our analysis asks for predicted charity activities based on total unrestricted funding, not based on GiveWell-specific funding. The expectation is that if GiveWell-directed funding falls short of expectations and the gap is made up by other funding, the activities will still be as outlined; this hopefully provides charities an incentive to project the activities they would most like to carry out, rather than projecting the activities they hope will most appeal to GiveWell specifically.)

Even with a clear distinction, there could still be a reasonable concern that GiveWell will over-allocate resources (in terms of investigative capacity) to designing its own projects, as opposed to finding great organizations. We recognize this concern, but wish to note that – philosophically – we greatly prefer unrestricted to restricted funding, and greatly prefer a “hands-off” to a “hands-on” approach. We don’t have the capacity to actively manage projects ourselves, and we believe projects are likely to work out better when they are run by people who fully buy into them (as opposed to people who are fulfilling the requirements of restricted funding).

It’s partly because of this philosophy that we’ve stayed away from restricted funding to date, and we remain highly cautious about it. We would prefer to stick to unrestricted funding and may never in fact deal in restricted funding.

Yet it is worth noting why we are considering restricted funding now in a way that we haven’t before. Our impression is that major funders frequently make extensive use of restricted funding; as a result, the existing landscape consists of many charities whose agendas are set partly or fully by external funders.

  • We’ve been surprised by the disconnect we’ve observed in which there is a large number of promising interventions but few charities that focus on these interventions (in a way such that additional dollars will mean additional execution).
  • More generally, we’ve been surprised that in the majority of conversations in which we ask an organization what it would do with more unrestricted funding, it has no clear answer, and prefers instead to tailor its answer to our priorities.

Practically speaking, charities have to focus on what they can fund; and in today’s world, it seems possible that agendas are largely set by funders. Our ideal role would be to “free” great organizations from restricted funding, allowing them to carry out promising projects that they can’t fund otherwise. However, it seems possible that there are too few charities for whom funding would make this sort of a difference, and there is thus some argument for our taking the sort of active role that other funders do.

Finally, by being open to restricted funding, we’ve come across some opportunities that are similar to “unrestricted funding” in most relevant ways, but that structurally involve restrictions and that we couldn’t have come across using our former approach. For example, we’re currently considering the idea of funding particular parts of UNICEF that work on particular interventions that we’re interested in. This wouldn’t involve laying out our own plan, and it would involve getting money to a specific team and leaving the use of the funds at their discretion; however, we could not find this sort of giving opportunity by talking to general UNICEF representatives and asking what they would do with more unrestricted funding. In some sense it may be appropriate to think of UNICEF (and other organizations like it) as a coalition of teams with their own priorities rather than as a single team with a single set of priorities; so in this case a gift that is formally restricted may have many of the desirable qualities of an unrestricted gift. To avoid confusion, we will still distinguish any recommendations along these lines from purely unrestricted gifts, as laid out above.

The risk of spreading ourselves too thin

We still have a lot to learn about global health and nutrition (as indicated by, among other things, our continued learning from VillageReach’s progress). It has been suggested that we should “stick to our knitting,” focusing on the areas of giving in which (a) we’ve built up the brand we have; (b) data and feedback loops tend to be unusually good by the standards of the nonprofit world, facilitating learning.

In response, I’d observe:

  • GiveWell is still a young organization. I believe we have attracted attention more for “bringing a different perspective and approach to giving” than for “being experts in global health” (the latter certainly does not describe us). We recognize that we’re taking some level of risk in moving into new areas, but we also believe that taking risks and staying open to new approaches is a major part of what makes GiveWell what it is and that part of “sticking to our knitting” is retaining this quality. We believe that GiveWell and the donors who use our research will be best served by our continuing to do whatever we believe will lead to the best giving opportunities, continuing to change course as much as necessary to facilitate this, and continuing to bring a different perspective and approach to giving – not continuing to focus on global health.
  • While we currently believe that global health is the most promising cause given the information available, we are not confident in this conclusion. We believe that other causes are potentially promising as well, and if we never investigate them, we will be failing in our mission of finding the best giving opportunities possible.
  • We are currently expanding our staff; we expect that we will invest at least as much time in global health over the next few years as over the last few (while also investing time in other causes).

The risk of losing transparency and systematicity as we move away from highly measurable interventions

We have written before that the cause of “global health and nutrition” seems unusually well-suited to meaningful measurement and metrics (by the standards of the nonprofit sector). When working within this cause, we have been able to be relatively clear about our process and about what distinguishes a recommended from a non-recommended charity. There is some risk that as we tackle other causes, such as meta-research, we will have less of an evidence base to go off of; our goals will be further out; we will have to use more intuition and may therefore become less systematic and transparent.

We believe this is a real risk. However, we also believe that (a) the best opportunities for good giving don’t necessarily lie in the domains with the highest measurability (though there is something to be said for measurability, all else equal); (b) we have reached the point where we feel we can explore causes such as meta-research in a way that – while not as systematic as our work on global health – will still include a great deal of public discussion of how we’re thinking, why we recommend what we do, what the key assumptions are in our thinking and recommendations, and how our projects progress over time.

We have long advocated that philanthropists should be more systematic and transparent in their work. If our own systematicity and transparency applies only to the cause where measurement is easiest, we won’t have a very strong case; if, however, we can consistently bring an unusually high level of systematicity and transparency to every cause we examine (even those that are less prone to measurement), we will have much more potential to change philanthropy broadly rather than just a single sector of it.

The benefits of our new direction

The above discussion addresses potential concerns over our new direction. We have previously discussed the substantial benefits: finding the best giving opportunities possible and reaching the largest donors possible, both of which are core to our mission. Dealing with the above issues – keeping a focus on recommending unrestricted funding when possible, covering new causes without overly detracting from continued progress on the causes we know well, and remaining systematic and transparent – will be a challenge, but we feel that it is well worth it, especially because we feel we are reaching the limits (for the moment) of our old approach. (We went through a large number of charities in 2011 and are skeptical that we will find new contenders for our top charities, using that basic methodology, anytime in the near future.)

We welcome further comments and criticisms regarding our new approach.

Meta-research

[Added August 27, 2014: GiveWell Labs is now known as the Open Philanthropy Project.]

We previously laid out our working set of focus areas for GiveWell Labs. This post further elaborates on the cause of “meta-research” and explains why meta-research is currently a very high priority for us – it is our #2 highest-priority focus area, after global health and nutrition.

Meta-research refers to improving the incentives in the academic world, to bring them more in line with producing work of maximal benefit to society. Below, we discuss

  • Problems and potential solutions we perceive for (the incentives within) development economics, the area of academia we’re currently most familiar with.
  • Some preliminary thoughts on the potential of meta-research interventions in other fields, particularly medicine.
  • Why we find meta-research so promising and high-priority as a cause.
  • Our plans at the moment for investigating meta-research further.

Meta-research issues for development economics

Through our work in trying to find top charities, we’ve examined a fair amount of the literature on how Western aid might contribute to reducing poverty, which we broadly refer to in this post as “development economics.” In doing so, we’ve noticed – and discussed – multiple ways in which development economics appears to be falling short of its full potential to generate useful knowledge:

Lack of adequate measures against publication bias. We have written extensively about publication bias, which refers broadly to the tendency of studies to be biased toward drawing the “right” conclusions (the conclusions the author would like to believe in, the conclusions the overall peer community would like to believe in, etc.). Publication bias can come both from “data mining” (an author interprets the data in many different ways and publishes/highlights the ways that point to the “right” conclusions) and the “file drawer problem” (studies that do not find the “right” conclusions have more difficulty getting published).

Conceptually, publication bias seems to us like one of the most fundamental threats to academia’s producing useful knowledge – it is a force that pushes research to “find” what is already believed (or what people want to believe), rather than what is true, in a way that is difficult for the users of research to detect. The existing studies on publication bias suggest that it is a major problem. There are potential solutions to publication bias – particularly preregistration – that appear underutilized (we have seen next to no use of preregistration in development economics).
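The mechanics of the “file drawer problem” can be illustrated with a toy simulation (this is an illustrative sketch, not any analysis from GiveWell or the studies cited): even when the true effect of an intervention is exactly zero, if only studies with statistically significant positive results get published, the average published effect will look substantially positive.

```python
import math
import random
import statistics

def simulate_file_drawer(n_studies=2000, n_per_study=50, seed=1):
    """Toy simulation of the file drawer problem.

    The true effect is zero, but a study is 'published' only if its
    estimated effect is significantly positive (roughly one-sided p < 0.05).
    Returns (mean effect across ALL studies, mean effect across PUBLISHED
    studies) -- the gap between the two is the publication bias.
    """
    rng = random.Random(seed)
    all_effects, published_effects = [], []
    for _ in range(n_studies):
        # Each study samples n_per_study observations; true mean is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_per_study)]
        mean = statistics.fmean(sample)
        std_err = statistics.stdev(sample) / math.sqrt(n_per_study)
        all_effects.append(mean)
        if mean / std_err > 1.96:  # only 'significant positive' results published
            published_effects.append(mean)
    return statistics.fmean(all_effects), statistics.fmean(published_effects)

overall, published = simulate_file_drawer()
# 'overall' is close to the true effect of zero, while 'published' is
# noticeably positive, despite no real effect existing at all.
```

Preregistration attacks exactly this mechanism: if the study plan is recorded (and, in the stronger proposals above, publication is promised) before results exist, the null results stay in the record instead of the file drawer.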

A funder recently forwarded us the following comment on a paper under review from a journal, which illustrates this problem:

Overall, I think the paper addresses very important research questions. The authors did well in trying to address issues of causality. But the lack of results has weakened the scope and the relevance of the paper. Unless the authors considerably generate new and positive results by looking say at more heterogeneous treatment effects, the paper cannot, in my view, be published in an academic journal such as the [journal in question].

Lack of open data and code, by which we mean the fact that academic authors rarely share the full details behind their calculations and claims. David Roodman wrote in 2010:

Not only do authors often keep their data and computer programs secret, but journals, whose job it is to assure quality, let them get away with it. For example, it took two relatively gargantuan efforts—Jonathan Morduch’s in the late 1990s, and mine (joining Jonathan) more recently—just to check the math in the Pitt and Khandker paper claiming that microcredit reduced poverty in Bangladesh. And it’s pretty clear now that the math was wrong.

The case he discusses turned out, in our opinion, to be an excellent illustration of the problems that can arise when authors do not share the full details of their calculations: a study was cited for years as some of the best available evidence regarding the impact of microfinance, but it ultimately turned out to be badly flawed, and later more rigorous studies contradicted its conclusions. (See our 2011 discussion of this case.)

Another example of the importance of open data was our 2011 uncovering of errors in a prominent cost-effectiveness estimate for deworming. This estimate had been public and cited since 2006, and it took us months of back-and-forth to obtain the full details behind it; at that point it turned out to contain multiple basic errors that caused it to be off by a factor of ~100.

The lack of open data is significant for reasons other than the difficulty of understanding and examining prominent findings. It also is significant because open data could be a public good for researchers; one data set could be used by many different researchers to generate multiple valuable findings. Currently, incentives to create such public goods seem weak.

Inadequate critical discussion and examination of prominent research results. The above two examples, in addition to illustrating open-data-related problems, illustrate another issue: it appears that there are few incentives within academia to critically examine and challenge others’ findings. And when critical examinations and challenges do occur, they can be difficult to find. Note that Roodman and Morduch’s critique (from the example above) was rejected by the journal that had published the study they were critiquing (the sole reviewer was the author of the critiqued study); as for the case of the DCP2 estimate, the critique came from GiveWell and has been published only on our blog (five years after the publication of the estimate).

Overall, our impression is that there is little incentive for academics to actively investigate and question each other’s findings, and that doing so is difficult due to the lack of open data (mentioned above).

Lack of replication. In addition to questioning the analysis of prominent studies, it would also be useful to replicate them: to try carrying out similar interventions, in similar contexts, and seeing whether similar results hold.

In the field of medicine, it is common for an intervention to be carried out in many different rigorous studies (for example, the literature on the effects of distributing insecticide-treated nets includes 22 different randomized controlled trials, and the programs executed are broadly similar though there are some differences). But in development economics, this practice is relatively rare.

More at a recent post by Berk Ozler.

General disconnect between “incentive to publish” and “incentive to contribute maximally to the stock of useful knowledge.” This point is vaguer, but we have heard it raised in multiple conversations with academics. In general, it seems that academics are encouraged to do a certain kind of work: work that results in frequent insights that can lead to publications. Other kinds of useful work may be under-rewarded:

  • Creating public goods for other researchers, such as public data sets (as discussed above)
  • Work whose main payoff is far in the future (for example, studies that take 20 years to generate the most important findings)
  • Studies that challenge widely held, fundamental assumptions in the field (and thus may have difficulty being published and cited despite having high value)
  • Studies whose findings are important from a policymaking or funding perspective, but not interesting (and thus difficult to publish) in terms of delivering surprising or generalizable new insights. For example, we have only been able to identify one randomized controlled trial of a program for improving rural point-of-source water quality, despite the popularity and importance of this type of intervention.

Potential interventions to address these issues.

We’ve had several conversations with academics and funders who work on development economics about how the above issues might be addressed. Most are directed at the specific problems we’ve listed above, though some are more generally in the category of “creating public goods for the research community as a whole.” Some of the more interesting ideas we’ve come across:

  • Funding efforts to promote the use of preregistration and data/code sharing, such as advocating that journals require these things of their publications (a journal might require preregistration and data/code sharing as a condition of publication) or that funders require these things of their grantees (a funder might require preregistration and data/code sharing from all funded studies).
  • Creating a “journal of good questions” – a journal that makes publication decisions on the basis of preregistered study plans rather than on the basis of results. The idea is to reward (with publication) good choices of topics and hypotheses and plans for investigating them, regardless of whether the results themselves turn out to be “interesting.” (We have previously discussed this idea.)
  • Funding a journal, or special issue of a journal, devoted to open-access data sets. Each data set would be accompanied by an explanation of its value and published as a “publication,” to be cited by any future publication drawing on that data set. This may improve incentives to create and publish useful open-access data sets, since scholars who did so could end up publishing the data sets as papers and having them cited.
  • Funding the creation of large-scale, general-purpose open-access data sets. Currently, researchers generally collect data for the purpose of conducting a particular study; an effort that aimed specifically to create a public good might be better suited to maximizing the general usefulness of the collected data, and may be able to do so at greater scale than would be realistic for a data set aiming to answer a particular question. For example, one might fund a long-term effort to track a representative population in a particular developing country, randomly separating the population into a large “control group” and a set of “treatment groups” that could be treated with different interventions of general interest (cash transfers, scholarships, nutrition programs, etc.)
  • Funding a journal, or special issue of a journal, devoted to discussion, critiques, re-analyses, etc. of existing studies, in order to put more emphasis on – and give more reward to – this activity.
  • Funding awards for excellent public data sets and for excellent replicative studies, reanalysis, and other work that causes either confirmation or re-examination of earlier studies’ findings.
  • Creating a group that specializes in high-quality systematic reviews that summarize the evidence on a particular question, giving heavier weight to more credible studies (similar to the work of the Cochrane Collaboration, which we discuss more below). These reviews might make it easier for funders, policymakers, etc. to make sense of research, and would also provide incentives to researchers to conduct their studies in more credible ways (employing preregistration, data/code sharing, etc.)
  • Creating a web application for sharing, discussing, and rating papers (discussed previously).
  • Awards for the most useful and important research from a policymaker’s or funder’s perspective (these could take practices like data sharing and registration into account as inputs into the credibility of the research).
  • Promoting an “alternative/supplemental reputation system” for papers (and potentially academics) directly based on the value of research from a funder’s or policymaker’s perspective, taking practices like data sharing and registration into account as inputs into the credibility of the research.
  • Creating an organization dedicated to taking quick action to take advantage of “shocks” (natural disasters, policy changes, etc.) that may provide opportunities to test hypotheses. When a “shock” occurred, the organization could poll relevant academics on what the important questions are and what data should be collected, record the academics’ predictions, and fund the collection of relevant data.

Meta-research for other fields

We aren’t as familiar with most fields of research as we are with development economics. However, we have some preliminary reason to think that many fields in academia have a similar story to development economics: multiple issues that keep them short of reaching their full potential to generate useful knowledge, and substantial room for interventions that may improve matters.

  • We recently met with representatives of the Cochrane Collaboration, a group that does systematic reviews of medical literature. We have found Cochrane’s work to be valuable and high-quality, and we were surprised to be told that the U.S. Cochrane Center raises very little in the way of unrestricted funding. After talking to more people in the field, we have formed a preliminary impression that there is little funding available for medical initiatives that cut across biological categories, including the sort of work that Cochrane does (which we would characterize as “meta-research” in the sense that it works toward improved incentives and higher value-added for research in general). We will be further investigating the Cochrane Collaboration’s funding situation and writing more about it in the future.
  • Informal conversations have given me the impression that many of the problems described above – particularly lack of adequate measures against publication bias, lack of preregistration, lack of data/code sharing, and general misalignment between what academics have incentives to study and what would be most valuable – apply to many other fields within the natural and social sciences.
  • I’ve also heard of other problems and ideas that are specific to other fields. For example, a friend of mine in the field of computer science stated to me that
    • There are too few literature reviews in computer science summarizing what is known and what remains to be determined within a particular sub-field, and the reviews that do exist quickly become out of date. More current literature reviews would make it easier for people to contribute to a sub-field without having to be at the right school (and thus in the right social network) for it.
    • There are some sub-fields in computer science that require testing different algorithms on data sets, such that the number of appropriate available data sets is highly limited. (For example, testing an algorithm for analyzing online social networks against a data set based on an actual online social network.) In practice, academics often design algorithms that are “over-fitted” to the data sets in use, such that their predictive power over new data sets is questionable. He proposed a set of centralized “canonical” data sets, each split into an “exploration” half and a “confirmation” half; while the “exploration” half would be open access, the “confirmation” half would be controlled by a central authority and academics would be able to test their algorithms on it only in a limited, controlled way (for example, perhaps each academic would be given 5 test runs per month). These data sets would constitute a public good making it easier to compare different academics’ algorithms in a meaningful way, both by reducing the risk of over-fitting and by bringing more standardization to the tests.
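The quota-limited “confirmation” authority he described can be sketched in a few lines. This is a hypothetical illustration only: the class, its names, and the 5-runs-per-month policy are inventions for the sketch, not an existing system.

```python
import time

class HoldoutServer:
    """Hypothetical sketch of the proposed 'confirmation' data set authority.

    The 'exploration' half of a canonical data set is public; the
    'confirmation' half is held by a central server that only returns an
    aggregate score and rations queries (e.g. 5 runs per researcher per
    month) to reduce the risk of over-fitting.
    """

    def __init__(self, confirmation_data, scorer, runs_per_month=5):
        self.data = confirmation_data   # never released directly
        self.scorer = scorer            # e.g. accuracy on held-out labels
        self.quota = runs_per_month
        self.usage = {}                 # (researcher, month) -> runs used

    def evaluate(self, researcher_id, algorithm):
        month = time.strftime("%Y-%m")
        key = (researcher_id, month)
        if self.usage.get(key, 0) >= self.quota:
            raise PermissionError("monthly confirmation-run quota exhausted")
        self.usage[key] = self.usage.get(key, 0) + 1
        # Only the aggregate score leaves the server, not the data itself.
        return self.scorer(algorithm, self.data)
```

Because every researcher’s algorithm is scored against the same controlled data under the same query limits, results become directly comparable across groups.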

Overall, the conversations I’ve had about meta-research – even with people who aren’t carefully selected, such as personal friends – have resulted in an unusually high density of strong opinions and novel (to me) ideas for bringing about positive change.

Why we find meta-research promising as a cause

High potential impact. As we wrote previously, it seems to us that many of philanthropy’s most impressive success stories come from funding scientific research, and that meta-research could have a leveraged impact in the world of scientific research.

Seeming neglect by other funders. We see multiple preliminary signs that this area is neglected by other funders:

  • In examining what foundations work on today, we haven’t seen anyone who appears to have a focus on meta-research. We recently attended a funders’ meeting on promoting preregistration and got the same impression from that meeting.
  • As mentioned above, informal conversations seem to lead more quickly to “ideas for projects that could be worked on but aren’t currently being worked on” than conversations in other domains.
  • As mentioned above, we are surprised by the U.S. Cochrane Center’s apparent low level of funding and need for more funds, and feel that this may point to meta-research as a neglected area.

Good learning opportunities. We have identified funding scientific research as an important area for further investigation. We believe it is one of the most promising areas in philanthropy and also one of the areas that we know the least about. We believe that investigating the question, “In what ways does the world of academic research function suboptimally?” will lead naturally to a better understanding of how that world operates and where within it we are most likely to find overlooked giving opportunities.

Our plan for further investigation of meta-research as an issue area

We are pursuing the following paths of further investigation:

  • Further investigation of the Cochrane Collaboration, starting with conversations with potential funders about why it is having trouble attracting funding. We believe that the Cochrane Collaboration may turn out to be an excellent giving opportunity, and if it does, that this will provide further evidence that meta-research is a promising and under-invested-in cause; on the other hand, if we discover reasons to doubt Cochrane’s effectiveness or need for more funds, this will likely be highly educational in thinking about meta-research in general.
  • Conversations with academics about meta-research-related issues. Some of the key questions we have been asking and will continue to ask:
    • Are there any ways in which the academic system is falling short of its full potential to generate useful knowledge? What are they?
    • What could be done about them?
    • Who is working on the solutions to these problems? Who would be the logical people for a funder to work with on them?
    • Is there any research that you wish you could do but can’t get funded to do? Is there any research that you generally feel ought to be taking place and isn’t? If so, why is this happening?
    • Are there areas of research that you think are overdone or overinvested in? Why do you think this is?
    • What do you think of the ideas we’ve accumulated so far? To the extent that you find one or more to be good ideas, whom would you recommend working with to move forward on or further investigate them?
    • Whom else would you recommend speaking with?
  • Trying to get a bird’s-eye view of the world of academic research, i.e., a view of what the various fields are, how large they are (in terms of people and funding), and where the funding for them comes from. We hope that this bird’s-eye view will help us be more strategic about which fields best combine “high potential” with “major room for interventions to improve their value-added,” and thus to pick fields to focus on for meta-research in a more systematic manner than we’ve done so far.

Giving cash versus giving bednets

We recently published a new review of GiveDirectly, a “standout” charity that gives cash directly to poor people in Kenya. As we were going through the process of discussing and vetting the new review, I found myself wondering how I would defend my preference to donate to distribute insecticide-treated bednets (ITNs) against a serious advocate for cash transfers. We’ve written before about the theoretical appeal of giving out cash, and the fact that there is a promising charity doing so renews the question of whether we should.

I continue to worry about the potential “paternalism” of giving bednets rather than cash (i.e., the implication that donors are making decisions on behalf of recipients). I believe that by default, we should assume that recipients are best positioned to make their own decisions. However, I see a few reasons to think bednets can overcome this presumption:

  • The positive externalities of ITNs
  • The fact that bednets protect children rather than adults
  • The fact that ITNs may be unavailable in local markets or that people may reasonably expect to be given them for free

I address each of these reasons in more depth below. Note, however, that this discussion is meant to be primarily about the theoretical question of giving cash versus giving bednets; a more practical discussion of giving to the Against Malaria Foundation versus giving to GiveDirectly would focus on the specifics of the two organizations.

The positive externalities of ITNs

We discussed the evidence that ITNs have benefits for community members other than those using the ITNs in our review of the evidence for ITNs. After speaking with several malaria scholars and reviewing the literature, we concluded:

  • The evidence for the efficacy of ITNs is based on studies of universal coverage programs, not targeted programs. In particular, all five studies relevant to the impact of ITNs on mortality involved distribution of ITNs to the community at large, not targeted coverage… Thus, there is little basis available for determining how the impact of ITNs divides between individual-level effects (protection of the person sleeping under the net, due to blockage of mosquitoes) and community-level effects (protection of everyone in communities where ITN coverage is high, due to reduction in the number of infected mosquitoes, caused either by mosquitoes’ being killed by insecticide or by mosquitoes’ becoming exhausted when they have trouble finding a host).
  • The people we spoke to all believe that the community-level effect of ITNs is likely to be a significant component of their effect, though none believe that this effect has been conclusively demonstrated or well quantified.
  • There is some empirical evidence suggesting that the community-level impact of ITNs is significant.

In our main model of the cost-effectiveness of distributing ITNs (XLS), we assumed that 50% of the benefits of ITNs come from the total community coverage of ITNs.

To the extent that ITNs have positive externalities, private actors may underinvest in them, meaning that it may be a good idea to distribute them freely even if individuals would choose not to purchase them at the available price. More generally, since we care about helping whole populations and not any particular specific individual, providing “public goods” of this sort amplifies our impact relative to giving the same amount of money to individuals.

Although it is conceptually possible that giving a large number of individuals small cash grants also has positive externalities, e.g. by boosting the local economy, we haven’t seen any evidence of this, and we doubt that the magnitude of the externality would be as large.

Bednets protect children rather than adults

One of the central reasons that I appreciate cash transfers is that they avoid paternalism. But sometimes, especially with regard to children, paternalism seems morally justifiable. I believe this is one of those cases.

Although AMF distributes ITNs universally, not just to children, the main benefits of ITNs—averting mortality—accrue to children under the age of 5. Children under the age of 5 lack bargaining power, income, and access to credit, not to mention the cognitive faculties to make decisions about their own long-term welfare. Accordingly, purchasing something that is reasonably likely to keep young children alive, even if they don’t or can’t decide to purchase it for themselves, seems to be a justifiable form of paternalism. In general, paternalism towards such young children is unobjectionable.

By distributing bednets, we might be spending money to benefit kids in a way that their parents wouldn’t spend it if we gave it to them instead. Given the magnitude of the benefits to the children, this seems to be justified.

People may not purchase ITNs because they are unavailable in local markets or because they expect to be given them for free

This point is more anecdotal, but Natalie, Holden and I remember being told while we were in Malawi that long-lasting insecticide-treated bednets, of the sort that AMF distributes, are essentially not available for purchase in local markets. Unfortunately, this point is not in our published notes (DOC) from the conversation in which we recall hearing it.

In another case, an RCT in Kenya in which researchers experimentally subsidized the cost of bednets (PDF) found that even very small increases in prices led to substantial reductions in bednet purchases by mothers (e.g. charging $0.60 led to a 60% reduction in take-up). Two different people told us in off-the-record conversations that they thought this occurred because the mothers offered subsidized bednets believed that they would be able to acquire free nets at some other point. There have been periodic free ITN distributions in many sub-Saharan African countries over the last decade, and the international consensus seems to be that governments should distribute ITNs free of charge in malaria-endemic areas. Accordingly, it should not be especially surprising that citizens may expect bednets to be provided free of charge, and may not move to purchase them even if they are available at subsidized prices in the marketplace. If we reasonably expect to be given something for free in a relatively short time window, why buy it now?

This wouldn’t necessarily have been the case if philanthropy had never funded bednets, but having started down this path, I think it provides another consideration in favor of continuing. If we could credibly and cheaply communicate that no more bednets would be forthcoming, this consideration wouldn’t matter, but there is no obvious way to do so.

This is something to keep in mind in the future: philanthropic funding decisions may create an unanticipated form of “lock-in,” in which future philanthropists become effectively committed to continued funding, even if it would not have been necessary in a counterfactual world of no philanthropic support. Although unlikely to be crucial, this consideration may counsel against undertaking certain marginal philanthropic activities.

Conclusion

I think that in order to avoid paternalism, philanthropists working to improve the lives of the global poor should have a fairly strong presumption in favor of cash transfers, and that those who advocate other strategies should have a convincing story to tell about why they beat cash. Above, I’ve tried to justify my view that bednet distributions are one of those philanthropic strategies that may beat cash. In searching for future top charities, I’d like to see a similarly strong case.

Update on Against Malaria Foundation’s costs

New cost estimates for AMF’s 2012 distributions

In a blog post in February, we noted that our estimates had missed some costs incurred by AMF’s distribution partner, Concern Universal. We undertook an assessment of these costs through discussion with Concern Universal.

In the course of our assessment, we revisited our estimates of all other distribution costs as well, and decided that the most informative cost estimates for donors are the projected costs of 2012 distributions. The reason is that as of November 2011, AMF shifted to larger-scale distributions, which it will continue in 2012; these distributions are more cost-effective than previous ones.

We have now calculated the 2012 projected costs. The total cost per net is lower than our previous estimate, even including the extra distribution partner costs mentioned above. We estimate a total cost of $5.54 per net for 2012 distributions, compared to a previous estimated total cost of $6.31 per net. This figure includes estimates of all costs incurred by all organizations participating in the distribution, including AMF, AMF’s distribution partners and local actors that work with AMF’s distribution partners.

The bulk of the change is due to the fact that AMF expects to distribute a million nets in 2012 – over twice the number it distributed in any previous year – while its organizational costs are likely to remain stable. Another contributor to the lower cost is that the equivalent cost of the donated services that AMF receives has decreased (both in the past year and projected for 2012). See our updated AMF review for full details.

We also calculated the marginal cost per net, which is projected to be $5.15 for 2012. The marginal cost excludes AMF organizational costs, because we believe these are unlikely to rise as additional nets are distributed (details in our updated AMF review). The marginal cost per net is slightly higher than our previous estimate (which was about $5 per net), since it includes an extra $0.15 in costs incurred by the distribution partners (for details on these costs, see below).
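The relationship between these figures can be checked with simple arithmetic. Note that the per-net organizational share is our inference from the reported totals, not a figure AMF publishes directly:

```python
# Reconstruction of the reported per-net cost figures.
marginal_cost_previous = 5.00  # prior marginal estimate (~$5/net)
partner_costs = 0.15           # newly identified distribution-partner costs
marginal_cost_2012 = marginal_cost_previous + partner_costs  # $5.15

total_cost_2012 = 5.54         # reported all-in cost per net for 2012
# AMF's own organizational costs, spread over ~1 million nets (inferred):
org_cost_per_net = total_cost_2012 - marginal_cost_2012      # ~$0.39
```

Because the ~$0.39 organizational share is roughly fixed in aggregate, doubling the number of nets distributed shrinks it on a per-net basis, which is the main driver of the drop from $6.31 to $5.54.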

Updated cost per life saved

Using the 2012 projected costs per LLIN, we estimate the cost per child life saved through an AMF LLIN distribution at about $1,600 using the marginal cost ($5.15 per LLIN) and about $1,700 using the total cost ($5.54 per LLIN).

See our spreadsheet analysis for details of our cost per life saved estimate.

Missing distribution partner costs

We have now gathered information on the missing costs from AMF’s distribution partner, Concern Universal. These missing costs have added an additional $0.15 per net. They consist of costs for salaries and office overhead that were incurred by both Concern Universal and by the Malawi government (which pays the salaries of health workers who assisted in the net distribution). Concern Universal did not initially tell us about these costs because they were costs that it incurred regardless of whether the distribution took place. However, we prefer to include all costs incurred to carry out a project, because we believe that this gives the best view of what it costs to achieve a particular impact (such as saving a life), and also avoids the lack of clarity and complications of leverage in charity.

Full details on these costs are available in our costs spreadsheet and our updated AMF review.

Millennium Villages Project

Several people have emailed us in the past few days asking about the new evaluation of the Millennium Villages Project (MVP), published in The Lancet last week. It has received significant attention in the development blogosphere (see, e.g., here, here, here, and here).

The evaluation argues that the MVP was responsible for a substantial drop in child mortality. However, we see a number of problems.

Summary

  • Even if the evaluation’s conclusions are taken at face value, insecticide-treated net distribution alone appears to account for 42% of the total effect on child mortality (though there is high uncertainty).
  • The MVP is much more expensive than insecticide-treated net distribution – around 45x on a per-person basis. Therefore, we believe that in order to make an argument that the MVP is the best available use of dollars, one must demonstrate effects far greater than those attained through distributing bednets. We believe the evaluation falls short on this front, and that the mortality averted by the MVP could have been averted at about 1/35th of the cost by simply distributing bednets. Note that the evaluation does not claim statistically significant impacts beyond health; all five of the reported statistically significant impacts are fairly closely connected to childhood mortality reduction.
  • There are a number of other issues with the evaluation, such that we believe the child mortality effect should not be taken at face value. We have substantial concerns about both selection bias and publication bias. In addition, a mathematical error, discovered by the World Bank’s Gabriel Demombynes and Espen Beer Prydz, overstates the reduction in child mortality, and the corrected effect appears similar to the reduction in child mortality for the countries as a whole that the MVP works in (though still greater than the reduction in mortality for the villages the MVP chose as comparisons for the evaluation). The MVP published a partial retraction with respect to this error (PDF) today.

We would guess that the MVP has some positive effects in the villages it works in – but for a project that costs as much per person as the MVP, that isn’t enough. We don’t believe the MVP has demonstrated cost-effective or sustainable benefits. We also don’t believe it has lived up (so far) to its hopes of being a “proof of concept” that can shed new light on debates over poverty.

Also see coverage of the Millennium Villages Project by David Barry, Michael Clemens, Lee Crawfurd, and Gabriel Demombynes and Espen Beer Prydz, much of which we’ve found helpful in thinking about the MVP and some of which we cite in this post.

Background

The Millennium Villages Project attempts to make significant progress towards achieving the Millennium Development Goals through a package of intensive interventions in 13 clusters of villages in rural Africa. It further aims to serve as a demonstration of the potential of integrated development efforts to cost-effectively improve lives in rural Africa. In its own words, the MVP states, “Millennium Villages are designed to demonstrate how the Millennium Development Goals can be met in rural Africa over 10 years through integrated, community-led development at very low cost.”

The drop in child mortality, and the comparison to insecticide-treated nets

The new evaluation concludes:

“Baseline levels of MDG-related spending averaged $27 per head, increasing to $116 by year 3 of which $25 was spent on health. After 3 years, reductions in poverty, food insecurity, stunting, and malaria parasitaemia were reported across nine Millennium Village sites. Access to improved water and sanitation increased, along with coverage for many maternal-child health interventions. Mortality rates in children younger than 5 years of age decreased by 22% in Millennium Village sites relative to baseline (absolute decrease 25 deaths per 1000 livebirths, p=0.015) and 32% relative to matched comparison sites (30 deaths per 1000 livebirths, p=0.033). The average annual rate of reduction of mortality in children younger than 5 years of age was three-times faster in Millennium Village sites than in the most recent 10-year national rural trends (7.8% vs 2.6%).”

In a later section, we question the size and robustness of this conclusion; here we argue that even taken at face value, it does not imply good cost-effectiveness for the MVP compared to insecticide-treated net distribution alone.

The MVP’s own accounting puts the cost per person served in the third year of treatment, including only field costs, at $116 (see the quote, above). Assuming linear ramp-up of the program, we take the average of baseline ($27/person) and third-year ($116/person) spending and estimate that the MVP spent roughly $72 per person per year during the first three years of the project. Michael Clemens has argued that their spending amounts to “roughly 100% of local income per capita.”

We should expect that amount of spending to make a difference in the short term, especially since some of it is going to cheap, proven interventions, like distributing bednets. In fact, it appears that the biggest and most robust impact of the 18 reported was increasing the usage of bednets.

The proportion of under-5 children sleeping under bednets in the MVP villages in year 3 was 36.7 percentage points higher than the proportion in the comparison villages. The Cochrane Review on bednet distribution estimates that “5.53 deaths [are] averted per 1000 children protected per year.” (See note.) If we assume that 80% of bednets distributed are used, the additional bednet usage rate (36.7 percentage points) found in MVP’s survey indicates that the MVP’s program led to 46 percentage points (36.7 / 80%) more children receiving bednets than in the comparison villages. (Note that using a figure lower than 80% for usage would imply a higher impact of bednets, because of the way the estimate works.) Therefore, we’d estimate that for every 1000 children living in an MVP village, the bednet portion of the MVP’s program alone would be expected to save 2.54 lives per year ((5.53 lives saved per year / 1000 children who receive a bednet) * 0.46 additional children receiving a bednet per child in an MVP village). Said another way, the bednet effect of the MVP program would be expected to reduce a child’s chance of dying by his or her fifth birthday by roughly 1.27 percentage points (a 0.254% reduction in mortality per year over 5 years). The total reduction in under-5 mortality observed in the evaluation was 3.05 percentage points (30.5 per 1000 live births). Thus the expected effect of increased bednet usage accounts for 42% of the observed decrease in under-5 mortality, and is within the 95% confidence interval for the total under-5 mortality reduction. (We can’t say with 95% confidence that the true total effect of the MVP on child mortality is larger than its effect due to increased bednet distribution alone.)
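The chain of arithmetic above can be reproduced directly. All inputs come from the text; the 80% usage rate is the stated assumption:

```python
# Reconstruction of the bednet-effect calculation.
deaths_averted_per_1000_protected = 5.53  # Cochrane estimate, per year
usage_rate = 0.80                         # assumed share of distributed nets actually used
extra_usage_pp = 36.7                     # pp increase in under-5s sleeping under nets

# Implied increase in the share of children *receiving* nets (~0.46):
extra_coverage = extra_usage_pp / 100 / usage_rate

# Expected lives saved per 1000 children per year in an MVP village (~2.54):
lives_saved_per_1000_children_per_year = (
    deaths_averted_per_1000_protected * extra_coverage
)

# Reduction in the chance of dying by age 5 (~1.27 percentage points):
mortality_reduction_pp = lives_saved_per_1000_children_per_year / 10 * 5

# Share of the observed 3.05 pp reduction explained by bednets (~42%):
observed_reduction_pp = 3.05
share_explained = mortality_reduction_pp / observed_reduction_pp
```

Note the lever mentioned in the text: lowering `usage_rate` raises `extra_coverage`, and therefore raises the share of the observed mortality drop attributable to bednets.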

Insecticide-treated nets cost roughly $6.31 (including all costs) to distribute and cover an average of 1.8 people and last 2.22 years (according to our best estimates). That works out to about $1.58 per person per year. At $72 per person per year, the MVP costs about 45 times as much (on a per-person-per-year basis) as net distribution. Although we would expect bednets to achieve a smaller effect on mortality than MVP on a per-person-per-year basis, we estimate that the MVP could have attained the same mortality reduction at ~1/35 of the cost by simply distributing bednets (see our spreadsheet for details of the calculation).
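As a check on the 45x figure (inputs from the text; rounding accounts for small discrepancies):

```python
# Cost comparison: ITN distribution vs. the MVP, per person per year.
cost_per_net = 6.31        # all-in cost per ITN
people_per_net = 1.8       # average people covered per net
net_lifespan_years = 2.22  # average net lifespan

itn_cost_per_person_year = cost_per_net / (people_per_net * net_lifespan_years)  # ~$1.58

mvp_cost_per_person_year = 72.0  # average of $27 baseline and $116 year-3 spending

ratio = mvp_cost_per_person_year / itn_cost_per_person_year  # ~45x
```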

If the MVP evaluation had shown other impressive impacts, then perhaps the higher costs would be well justified, but 3 of the 5 statistically significant results from the study are on bednet usage, malaria prevalence, and child mortality. (The other two are access to improved sanitation and skilled birth attendance, both of which would also be expected to manifest benefits in terms of reductions in under-5 mortality.) There were no statistically significant benefits in terms of poverty or education.

Other issues with the MVP’s evaluation

Lack of randomization in selecting treatment vs. comparison villages. The evaluation uses a comparison group of villages that were selected non-randomly at the time of follow-up, so many of the main conclusions of the evaluation are drawn based simply on comparing the status of the treated and non-treated villages in year 3 of the intervention, without controlling for potential initial differences between the two groups. If the control villages started at a lower baseline level and improved over time at exactly the same rate as the treatment villages, then the treatment would appear to have an impact equal to the initial difference, before the intervention began, between the treatment and control groups, even though it actually had none. Even in cases in which baseline data is available from the control groups, it is possible that the group of villages selected as controls could improve more slowly than the treatment group for reasons having nothing to do with the treatment. Accordingly, there are strong structural reasons to regard the evaluation’s claims with skepticism.

Michael Clemens has written more about this issue here and here. We agree with his argument that the MVP could and seemingly should have randomized its selection of treatment vs. control villages instead, especially given its goal of serving as a proof of concept.

Publication bias concerns. The authors report 18 outcomes from the evaluation; results on 13 of them are statistically insignificant at the standard 95% confidence level (including all of the measures of poverty and education). Even if results were entirely random, we’d expect roughly one statistically significant result out of 18 comparisons. The authors find five statistically significant results, which implies that the results are unlikely to be just due to chance, but they could have explicitly addressed the fact that they checked a number of hypotheses and performed statistical adjustments for this fact, which would have increased our confidence in their results. The authors did register the study with ClinicalTrials.gov, but the protocol was first submitted in May 2010, long after the data had been collected for this study.
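Under the (admittedly strong) assumption that the 18 tests are independent and every null hypothesis is true, both the “roughly one” expectation and the improbability of five significant results follow from the binomial distribution:

```python
from math import comb

# 18 outcomes tested at the standard 95% confidence level.
n, alpha = 18, 0.05

# Expected significant results under the null: ~0.9, i.e. "roughly one".
expected_significant = n * alpha

# Probability of 5 or more significant results by chance alone,
# assuming independent tests (the outcomes are likely correlated,
# so this is only an illustration).
p_five_or_more = sum(
    comb(n, k) * alpha**k * (1 - alpha) ** (n - k) for k in range(5, n + 1)
)
```

The probability comes out well under 1%, which is why the five significant results are unlikely to be pure chance; formal multiple-comparison adjustments would nonetheless have made the reported effects easier to trust.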

We also note that the registration lists 22 outcomes, but the authors only report results for 18 in the paper. They explain the discrepancy as follows: “The outcome of antimalarial treatment for children younger than 5 years of age was excluded because new WHO guidelines for rapid testing and treatment at the household level invalidate questions used to construct this indicator. Questions on exclusive breast-feeding, the introduction of complementary feeding, and appropriate pneumonia treatment were not captured in our year 3 assessments.” But this only accounts for three of the four missing outcomes. This does not explain why the authors do not report results for mid-upper arm circumference (a measure of malnutrition), which the ClinicalTrials.gov protocol said they would collect.

Mathematical error in estimating the magnitude of the child-mortality drop.

Note: the MVP published a partial retraction with respect to this error (PDF) today.

At the World Bank’s Development Impact Blog, Gabriel Demombynes and Espen Beer Prydz point out a mathematical error in the evaluation’s claim that “The average annual rate of reduction of mortality in children younger than 5 years of age was three-times faster in Millennium Village sites than in the most recent 10-year national rural trends (7.8% vs 2.6%).”

Essentially, the authors used the wrong time frame in calculating the decline in the Millennium Villages: to estimate the per-year decline in childhood mortality, they divided the difference between average childhood mortality during the treatment period (3 years long) and during the previous 5-year baseline period by three. As Demombynes and Prydz point out, however, this mistakenly assumes that the time difference between the 3-year average and the 5-year average is 3 years, when it is in fact 4 years:

[When we originally published this post in 2012, we included a link here to an image stored on a World Bank web server. In 2020, we learned that this image link was broken and were unable to successfully replace it. We apologize for the omission of this image.]

This shifts the annual decline in child mortality from 7.8% to 5.9% (though see David Barry and Michael Clemens’ comments here for more discussion of the assumptions behind these calculations).
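The correction can be illustrated using the reported 22% relative drop in under-5 mortality. This is only an approximation: the paper’s exact inputs differ slightly, so these rates roughly, rather than exactly, match the published 7.8% and 5.9% figures.

```python
# Illustrative reconstruction of the timing error.
relative_drop = 0.22                 # reported relative decline vs. baseline
survival_ratio = 1 - relative_drop   # end-period mortality / baseline mortality

# Annualized rate of decline, compounding over the assumed window:
annual_rate_over_3yr = 1 - survival_ratio ** (1 / 3)  # paper's (incorrect) 3-year window
annual_rate_over_4yr = 1 - survival_ratio ** (1 / 4)  # corrected 4-year gap between averages
```

Spreading the same total decline over four years instead of three necessarily yields a lower annual rate, which is the substance of the correction.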

The adjusted figure for child mortality improvement is no better for the MVP villages than for national trends. Demombynes and Prydz go on to argue that using a more appropriate and up-to-date data set for national trends in childhood mortality yields an average trend of -6.4% a year, better than in the Millennium Villages, and that the average reductions in rural areas are even higher.

Note, however, that this argument is saying that the comparison group in the study is not representative of the broader trend, not that the Millennium Villages did not improve relative to the comparison group.

Conclusion

The Millennium Villages Project is a large, multi-sectoral, long-term set of interventions. The new evaluation suggests, though it does not prove, that the MVP is making progress in reducing childhood mortality, but at great cost. It does not provide any evidence that the MVP is reducing poverty or improving education, its other main goals. These results from the first three years of implementation, if taken seriously, are discouraging. The primary benefits of the intervention so far–reductions in childhood mortality–could have been achieved at much lower costs by simply distributing bednets.

Note: the Cochrane estimate of 5.53 deaths averted per 1,000 children protected per year does not assume perfect usage. Our examination of the studies that went into the Cochrane estimate found that most studies report usage rates in the range of 60-80%, though some report 90%+ usage.