The GiveWell Blog

Rethinking VillageReach’s pilot project

Background

Over the past 3 years, VillageReach has received over $2 million as a direct result of our recommendation. VillageReach put these funds towards its work to scale up its health logistics program (implemented as a pilot project in one province of Mozambique between 2002 and 2007) to the rest of the country. A core part of our research process is following the progress of charities to which GiveWell directs a significant amount of funding, and we’ve been following and reporting on VillageReach’s progress.

In addition to following VillageReach’s progress scaling up its program, we’ve recently been reassessing the evidence of the effectiveness of VillageReach’s pilot project. We’ve done this for two reasons. First, when we first encountered VillageReach, in 2009, GiveWell was a younger organization with a less developed research process. We believe our approach to evaluating organizations has improved significantly, and we wanted to see how VillageReach’s evidence would stack up under our current approach. Second, in the course of VillageReach’s scale-up work, new data has become available that is relevant to our assessment of whether the pilot project was successful. GiveWell is now better than it was at understanding the extent to which it, or anyone, can draw conclusions from these sorts of impact evaluations.

In the case of the VillageReach evaluation, while we have not found evidence that other factors were responsible, we now understand that factors other than VillageReach’s program might have contributed to the increase in coverage rates. As a result, we have moderated our earlier confidence that VillageReach’s program was responsible for the increase.

This has major implications for our view of VillageReach, as compared to our current top charities (AMF and SCI), as a giving opportunity. We feel that within the framework of “delivering proven, cost-effective interventions to improve health,” AMF and SCI are solidly better giving opportunities than VillageReach (both now and at the time when we recommended VillageReach). Given the information we have, we see less room for doubt in the cases for AMF’s and SCI’s impact than in the case for VillageReach’s.

That said, we continue to view VillageReach as a highly transparent “learning organization” (capable of conducting its activities in a way that can lead to learning). Over the past few years, VillageReach has provided us with the source data behind its evaluations, enabling us to do our own in-depth analysis and draw our own conclusions. That work has contributed to our own growing ability to evaluate impact evaluations and determine the level of reliance that can be placed on them. We will be talking with VillageReach about how more funding could contribute to more experimentation and learning, and we will likely be interested in recommending such funding – to encourage such outstanding transparency and accountability, and to learn more in the future.

VillageReach’s pilot project’s impact

In our July 2009 review of VillageReach, we attributed an increase in vaccination rates in Cabo Delgado to VillageReach’s program.

Two factors carried substantial weight in our view: (a) drops in the “stockout rate” of vaccines (i.e., the percentage of clinics that did not have all basic vaccine types available, see chart below) and (b) VillageReach’s report that other NGOs were unlikely to have contributed to the increase because they were not very involved with immunization activities during the 2002-2007 period.

In March 2012, we published a re-analysis that somewhat changed the picture presented by these charts. In it, the change in immunization coverage appears more similar between Cabo Delgado and Niassa (though quite different from the other provinces in Mozambique); in addition, some of the “low stockouts” period in the first chart turns out to be a period in which there was substantial missing data (it still appears that stockouts were, in fact, low during this period, so this is something of a minor change, but it still presents a different picture from how we interpreted the data previously).

Since we first reviewed VillageReach in 2009, our understanding of international aid generally has improved, and we now have more context for alternative, non-VillageReach factors that could have led to the increase in immunization. For example, in other charity examinations, there have been cases in which we noted that the charity’s entry into an area appeared to coincide with a generally higher level of interest in the charity’s sector on the part of the local government. We sought to understand the extent to which there may be an alternate explanation for the improvements that were concurrent with VillageReach’s activities.

We have not found any evidence that activities by other NGOs (i.e., non-governmental organizations) contributed to the increase in coverage rates, but reflecting on that question led us to focus on whether activities by governmental aid organizations (multilaterals and bilaterals) could have contributed to the increase in coverage rates. To answer this question we contacted and spoke with groups familiar with Mozambique’s immunization program during the 2002-2007 period. We spoke with Karin Turner, Deputy Director, Health Systems for USAID Mozambique (as well as other staff in that office) and Dr. Manuel Novela, a WHO EPI (Expanded Program on Immunization) Specialist for Mozambique.

Our understanding from these conversations is that:

  • As an alternative to prior separate donor-direct funding mechanisms, major international donors started contributing to “common funds” around the year 2000. Common funds aimed to provide general operating support (and greater decision-making autonomy) to developing countries’ ministries of health. When we spoke with USAID’s Mozambique office in April, representatives told us they recalled that Cabo Delgado and Niassa, the two provinces in Mozambique that experienced the largest increases in immunization rates between 2003 and 2008, spent a larger proportion of their common funds on immunization-related activities than other provinces did. We recently reached out to USAID in Mozambique to confirm this and have not yet received an answer. Unfortunately, we have also not been able to track down data on how common funds were spent. [UPDATE September 10, 2012: Our current understanding is that USAID believes Niassa and Cabo Delgado had the ability to focus common funds on vaccination, but that it does not know whether this was done.]
  • In the early 2000s, other funders became interested in supporting Northern Mozambique (of which Niassa and Cabo Delgado are a part), specifically. According to USAID, Irish Aid and the World Bank provided increased support for immunization activities to Niassa during the 2000s. We have no evidence, however, of additional funders for immunization activities in Cabo Delgado.

At this point, we feel that the fall in stockouts and rise in immunization rates observed in Cabo Delgado could be attributed to VillageReach’s activities (and the improvement in Niassa to the activities of Irish Aid and the World Bank discussed above), but it is also possible that the improvements in both provinces were driven by another factor (perhaps the allocation of common funds) on which we do not have full context. The fact that Niassa, a neighboring province, experienced a large rise in immunization rates over the same period (see chart above), although not to the 90%+ range seen in Cabo Delgado, raises the possibility that non-VillageReach factors contributed to the rise in Cabo Delgado (though it is also possible that Irish Aid/World Bank funds raised coverage rates in Niassa while VillageReach’s program was responsible for the increases in Cabo Delgado). We have also not looked into immunization funding in other provinces (i.e., those other than Niassa and Cabo Delgado) over this period. Were we to find evidence of increased funding for immunization there without commensurate increases in coverage, it would reduce our assessment of the probability that government funds were responsible for the increase in Cabo Delgado.

VillageReach’s perspective

We asked Leah Hasselback, VillageReach’s Mozambique Country Director, about possible additional factors. She told us that in completing its evaluation of the pilot project, VillageReach had spoken with WHO as well as with bilateral donors, and that no one had mentioned Cabo Delgado’s using common funds for immunization or additional immunization-specific funding for Cabo Delgado. Note that VillageReach’s assessment of other factors was completed after the fact.

Additional Data

VillageReach exited Cabo Delgado in 2007. Recently, two different data sets have become available on immunization coverage in the province in 2010-2011. The first is a survey conducted by VillageReach, and the second is the DHS report for 2011. The key question we asked when examining these was whether they demonstrate a worsening of immunization coverage relative to 2008; if immunization coverage had worsened in the years since VillageReach exited (during which time its distribution system was discontinued), this would provide some suggestive evidence for the importance of the VillageReach model.

The two data sets present different pictures. The VillageReach survey data shows different trends in different figures, but overall we feel it does not show worsening of immunization coverage. On the other hand, the DHS report does show signs of worsening in coverage. (Details in the footnote at the end of this post.)

VillageReach’s perspective

Leah Hasselback, VillageReach’s Country Director for Mozambique, notes that several other factors in Cabo Delgado between 2007 and 2010 may have kept immunization rates high after VillageReach’s exit:

  • Mozambique introduced the pentavalent vaccine in November 2009. This vaccine, which combines 5 needed vaccines in one, was accompanied by significant vaccine-related promotion, which also should have improved immunization rates.
  • Cabo Delgado added 20 health centers between the end of VillageReach’s pilot project and the beginning of its scale-up work. During the entire period of the pilot project, Cabo Delgado added only 1 health center.
  • There were immunization campaigns in 2008 that focused specifically on measles and polio.
  • FDC, the local NGO with which VillageReach partnered during the pilot project, ran a social mobilization campaign in 2008-09 in a single district of Cabo Delgado.

Our current take on VillageReach

Though its pilot project evaluation is the single best evaluation we have ever seen from a nonprofit evaluating its own programs (as opposed to academics running randomized controlled trials of aspects of an organization’s activities), and the evaluation is both thoughtful and self-critical, we still feel that there are too many unanswered (and perhaps unanswerable) questions about VillageReach’s impact to have strong confidence that it caused an increase in immunization rates.

This view has major implications for our view of VillageReach, as compared to our current top charities (AMF and SCI), as a giving opportunity. We feel that within the framework of “delivering proven, cost-effective interventions to improve health,” AMF and SCI are solidly better giving opportunities than VillageReach (both now and at the time when we recommended it). Given the information we have, we see less room for doubt in the cases for AMF’s and SCI’s impact than in the case for VillageReach’s.

On the other hand, we wish to emphasize another sense in which VillageReach was – and is – an outstanding giving opportunity. VillageReach is experimenting with a novel approach to health, collecting meaningful data that can lead to learning, and sharing what it finds – both the good and the bad – in a way that is likely to improve the knowledge and efficiency of aid as a whole. In this respect we see it as very unusual: most of the charities we’ve encountered seem to collect little meaningful data, are reluctant to share what they do have, and are especially reluctant to share anything that may call their impact into question.

Groups like VillageReach are creating a new dialogue around charitable giving, and it’s important to us that this type of behavior is supported. We want to encourage VillageReach and other groups to share information about how their programs are going, and we want to continue to see more experimentation and learning. So, we are seriously considering recommending donations to VillageReach, not despite the struggles it’s had but because it’s had these struggles and is being honest about them.

VillageReach has sent us a funding update, which we plan to review and share soon. We will also be writing more, in future posts, about what we’ve learned overall from the experience of working with VillageReach, and what we feel it says about our research process.


Footnote:

VillageReach 2010 survey in Cabo Delgado

For the below analysis we relied on two studies conducted by VillageReach or contractors hired by VillageReach:

  • A July 2008 survey of two groups of children in Cabo Delgado: children aged 12-23 months (likely vaccinated at the end of and after the VR project, which ended in Feb-Apr 2007) and children aged 24-35 months (likely vaccinated during the project).
  • An April 2010 survey of children 12-23 months of age. None of these children would have been vaccinated during the VillageReach pilot project.

There are three main indicators that VillageReach uses as numerators for the “vaccination coverage rate”:

  1. Fully vaccinated: child has received each of 8 vaccinations by the time of the survey (BCG, 3 x DTP, 3 x Polio, Measles). A vaccination is counted if either it is recorded on the child’s vaccination card (kept by the parents) or a caregiver states that the child received the vaccination.
  2. Fully immunized (either by time of survey or before 12 months of age): This is a stricter measure than “fully vaccinated.” In addition to having all the vaccinations, there are additional conditions which must be met:
    • All vaccinations and timings must be verified on the child’s vaccination card (verbal confirmation by a caregiver is not valid).
    • All 3 polio vaccinations must be received at least 28 days apart. Same for DTP vaccinations.
    • Measles vaccination must be given after 9 months of age.
  3. DTP3: Received all 3 diphtheria, pertussis, and tetanus vaccinations. Verification with the vaccination card is not needed.

In Cabo Delgado, rates of “fully vaccinated” and DTP3 remained more or less constant between the 2008 and 2010 surveys:

  • Fully vaccinated:
    • 2008: 92.8% for 24-35 month olds and 87.8% for 12-23 months olds
    • 2010: 89.1% (12-23 month olds)
  • DTP3:
    • 2008: 95.4% for 24-35 month olds and 92.8% for 12-23 months olds
    • 2010: 91.9% (12-23 month olds)

The “fully immunized” figures are harder to interpret. They did fall between 2008 and 2010:

  • Fully immunized at the time of the survey:
    • 2008: 72.2% for 24-35 month olds and 73.0% for 12-23 months olds
    • 2010: 57.9% or 48.8% (both numbers are given in the report; 48.8% is the one that is repeated in summary reports VillageReach has published)
  • Fully immunized by 12 months of age:
    • 2008: 54.9% for 24-35 month olds and 61.2% for 12-23 months olds
    • 2010: 40.8%

The primary reasons that children failed to qualify as fully immunized in the 2010 survey do not appear to be problems that better vaccine logistics – the issue addressed by VillageReach’s program – would likely have solved (note that these categories can overlap):

  • 27% of the whole sample (i.e., at least half of those who didn’t qualify as fully immunized) received their measles vaccine before 9 months of age, up from 8% in the 2008 survey
  • 19% of the sample got polio or DTP shots within 28 days of each other, up from 2% in the 2008 survey
  • Only 11.5% of the sample got a vaccination after 12 months of age
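
The “at least half” parenthetical in the first bullet can be checked with quick arithmetic, assuming the relevant denominator is the 48.8% “fully immunized at the time of the survey” figure for 2010 quoted above:

```python
# Share of the 2010 sample not qualifying as fully immunized at survey time,
# using the 48.8% figure repeated in VillageReach's summary reports.
not_fully_immunized = 1 - 0.488           # ~51.2% of the sample

# 27% of the whole sample received measles vaccine before 9 months of age;
# as a share of the non-qualifiers, that is:
early_measles_share = 0.27 / not_fully_immunized
print(round(early_measles_share, 2))      # ~0.53, i.e. just over half
```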

Demographic and Health Survey (DHS) of Mozambique from 2011 (preliminary report)

In this spreadsheet, we have compiled vaccination rate data from four national, high-quality surveys: 3 DHS surveys in 1997, 2003, and 2011, and a Multiple Indicator Cluster Survey (MICS) from 2008. Note that only a subset of the children included in the 2008 survey were born in time to potentially directly benefit from VillageReach’s pilot project. With that caveat in mind, a few observations:

  • DTP3 vaccination and “fully vaccinated” rates observed in Cabo Delgado were substantially lower in 2011 than in 2008, while rates rose over that period in nearby provinces, including Niassa, the comparison province from VillageReach’s project evaluation.
  • Vaccination rates for vaccines earlier in the vaccination series (such as DTP1, DTP2, and BCG) were about the same or only slightly lower in the 2011 survey than in the 2008 survey.


Some history behind our shifting approach to research

The approach that GiveWell took from 2007-2011 had two crucial qualities:

  • We have been passive. That is, we have focused on finding the best existing organizations and supporting them with no-strings-attached donations, rather than a more “active” approach of designing our own strategy, treating charities as partners in carrying out this strategy, and restricting donations accordingly.
  • We have sought proven cost-effective giving opportunities. That is, we have looked for situations where a donor can be reasonably confident – based on empirical evidence – that his/her donation will result in lives being changed for the better, at a high rate of “expected good accomplished per dollar spent.”

This year, we have been experimenting with giving opportunities that lack one or both of these qualities. We previously defended our shift in this direction; this post gives more context on the history that has led us to this point and discusses why we don’t think we can retain both of the qualities above and continue to find great giving opportunities at an acceptable rate. A future post will go into some of the questions we are addressing as we begin to shift our approach.

The history of GiveWell’s approach to finding outstanding giving opportunities
In our first year, we took an approach that was highly passive and highly focused on proven cost-effective interventions. We invited grant applications from a wide variety of nonprofits, and we didn’t attempt to focus on any particular strategies for helping people; we emphasized only that we wished to see a convincing case for proven, cost-effective, scalable impact, and we picked recommended charities accordingly. (See our first-round application linked from our overview of applications we received for our 2007-2008 research process.)

At that point we weren’t sure what to expect; what we found was that much of the most convincing evidence for effectiveness was at the “intervention level” rather than the “charity level” (i.e., there are programs, such as distribution of insecticide-treated nets, that have strong publicly available evidence bases, and few charities whose in-house evidence seems to add much to the case). Accordingly, for our 2008-2009 process, we did substantial independent review of research on aid interventions, and published a list of “priority programs” with strong evidence bases. This was a step in the direction of being less “passive”: doing our own independent analysis to determine what sorts of charities were most promising, rather than simply asking charities to make their own case.

In 2011, we announced an intensifying focus on deeply examining the best charities we could find (rather than evaluating charities by the standards of their issue areas). The research process that followed (leading up to our 2011 recommendations) was broad and open to many different types of groups, but it was also “active” in the sense that we often deprioritized a charity after the initial phone call, based on our judgment of how likely it seemed to ultimately merit a confident recommendation. In addition to examining charities focused on what we considered to be proven interventions, we also flagged charities for having seeming “high upside” in various ways, and considered these groups for recommendations; we tried to be open to groups we could recommend even though they didn’t work on what we considered priority interventions.

Ultimately, we didn’t find any such groups promising enough, and our top charities ended up being groups focused on interventions with strong evidence bases.

At this point, we feel that

  • There is a distinct set of interventions – concentrated in the area of global health and nutrition – for which there is strong and generalizable empirical evidence of cost-effective impact on saving and improving lives.
  • We have made intensive efforts over the past 3+ years to find all charities that focus on these interventions. Since these interventions are frequently delivered by/through developing-world governments, it is rare to see many charities taking different approaches to a given such intervention; it is more common that for each such intervention, there are a small number of fairly large charities that work with governments to provide funding, technical assistance, etc. in delivering these interventions.
  • Our current two top charities are the groups we’ve identified that both focus on such interventions (e.g., we can confidently predict that additional donations to these groups will result in more of these interventions) and demonstrate the necessary transparency such that we can perform thorough evaluations and updates.
  • Therefore, we do not expect to find any more “top charities” (in the sense we’ve previously used – “charities that will use additional dollars to carry out cost-effective, proven, scalable activities with high transparency and accountability”) in the near future.
  • By focusing our efforts at the project level rather than the organization level, we may be able to generate more options for donors to deliver such interventions (e.g., considering a single nutritional program implemented by UNICEF). By being open to recommending the funding of particular projects – rather than just the writing of unrestricted checks – we would be further shifting in the direction of “active funding.”

Why aren’t there more organizations focused on our priority interventions?
It may seem puzzling that there are relatively few charities focused on what we consider the most proven interventions. Our basic picture of the reasons for this:

  • As mentioned above, there are only a small number of interventions – concentrated in the area of global health and nutrition – for which there is strong and generalizable empirical evidence of cost-effective impact on saving and improving lives. It seems that global health and nutrition are particularly amenable to meaningful data and analysis; we feel that other sectors are very far from having meaningful data on how to improve lives, perhaps due to the inherent difficulty of measurement rather than due to a failure of effort. (We’ll be writing more about this idea in the future.)
  • In many cases, these interventions are generally thought to be best delivered in partnership with the government (and often many other organizations) and at large scale. Large funders (for example, government funders and major foundations), when they seek to roll out these interventions, often work directly with governments; they may pull in nonprofits for specific sorts of support (example: the rollout of ART in Botswana). This dynamic may limit the opportunities for “entrepreneurial” charities working on these interventions (charities that start small and earn prominence through the work they do). Many of the organizations that do focus on these interventions were essentially founded with very large grants from large funders (examples: Schistosomiasis Control Initiative, GAVI).
  • It is rare for a charity to be exclusively focused on one of these interventions; in fact, it is somewhat rare for a charity to be exclusively focused on any particular intervention. (For illustration, see our list of charities considered in 2009 and note how many have “highly diverse activities.”) A common theme in our conversations with charities has been that we often ask, “What would you do with more unrestricted funding?” and have trouble getting a definite answer; charities often come back with multiple possibilities, asking us which we prefer and offering to submit proposals tailored to our interests. Our impression – both from looking at the grants of major funders and from these conversations with charities – is that charities’ agendas are often partly or fully set by major funders, and thus often reflect the diversity of the different major funders the charities work with, with smaller unrestricted donations serving as support for these diverse agendas. This dynamic makes it difficult to find groups that focus exclusively on a particular proven intervention.

Earlier in our development, we expected the nonprofit sector to look something like the for-profit sector in terms of how organizational strategies and agendas are set. That is, we expected to find that nonprofits usually set their own agendas and seek funders who will support these agendas with relatively limited stipulations and modifications. Instead, we found charities constantly asking what agenda we wanted to fund.

Perhaps this reflects one of the fundamental differences between the two sectors. For-profit investment is what we might call “accountable to profits”: the success of a for-profit investment ultimately depends on whether the company can ultimately turn a profit, and thus it depends on things like the company’s ability to win over customers. By contrast, nonprofit investment is ultimately not accountable to anyone or anything: if a funder sets a poor agenda or fails to support a good one, there are no consequences except what the funder chooses to impose on itself. Nonprofit funders thus have fewer reasons (other than self-imposed ones) to defer to others on agenda-setting and strategizing.

Bottom line: it seems to us that agendas are often set by funders, not charities; looking for charities that have their own predefined agendas limits our options; looking for charities that focus on proven interventions further limits our options; so we have few options unless we expand our scope beyond “charities focused on proven interventions.”

What would our options be if we remained committed to both “passive funding” and “proven cost-effective” interventions?
We could “hold out” for more giving opportunities that meet our original criteria, continuing to recommend the best charities we’ve found while waiting for other charities meeting our criteria to emerge organically (and hoping that our money moved to top charities served as an incentive for this to happen). This would be less work per year than what we’ve done so far (which has involved a lot of exploration, getting up to speed on academic literature, examining many different charities, etc.), so if we went this route we would probably either shrink GiveWell (to perform the same role with minimal resources) or further deepen our due diligence on existing top charities (e.g., perform more site visits).

We’ve also considered looking into areas such as clean water provision and surgery, where we might find giving opportunities that still fit the rubric of “passive” and “proven cost-effective,” but with less strong evidence and/or likely inferior cost-effectiveness. Though we doubt we would find giving opportunities here as strong as our top charities, there could be benefits down the line simply to having more absorptive capacity (i.e., if our money moved continues to grow, we may need more options for donors than what we currently offer).

In our view, sticking to either of the above approaches would be leaving a lot of potential for impact on the table. We believe that we can broaden our criteria while continuing to bring a level of transparency and public critical reflection that is absent, but badly needed, in today’s nonprofit sector. We believe that this approach may lead to better giving opportunities than those we’ve found so far (even if not by the original criteria), as well as a broader influence on donors and nonprofits. In the future, we’ll be writing more about how we plan to accomplish this.

New Cochrane review of the effectiveness of deworming

Update 07/20/12: Miguel and Kremer (and others) have responded to the characterization of their 2004 study by the updated Cochrane review here. We find many of their responses to the Cochrane authors’ objections (which are distinct from our reservations) persuasive, especially regarding attrition and sample selection in the haemoglobin data and baseline school attendance data. As we wrote last week, the Baird et al. 2011 follow-up to Miguel and Kremer 2004 remains especially important to our view on deworming; neither the updated Cochrane review nor the authors’ response has changed that.

On Wednesday, the Cochrane Collaboration published a new systematic review of the effectiveness of deworming drugs in improving nutritional status, school performance, and cognitive test scores.

The new Cochrane review of deworming to kill soil-transmitted intestinal worms (STHs) finds almost no evidence of benefits on nutrition, cognitive development, or school performance in mass deworming studies, and small benefits on nutrition in small, screened studies; this is largely the same conclusion as the older Cochrane review, though the new one is updated with more studies and a persuasive response to criticisms. It excludes studies that treat both STHs and schistosomiasis, which is what the Schistosomiasis Control Initiative does, so it does not directly affect our assessment of them. However, the new review reinforces our skepticism about the quality of much of the evidence supporting deworming, and strengthens our view that the evidence in favor of distributing bednets is stronger. Accordingly, SCI continues to hold our #2 rating. We plan to continue to investigate the papers that are most crucial to our assessment of the benefits of deworming.

In a nutshell, the new Cochrane review does not directly challenge the case for SCI as our #2 charity, though we have somewhat less confidence than we did.

In the remainder of this post, we discuss the new review’s findings, how it differs from the previous Cochrane review, and its implications for our view of deworming charities.

The new Cochrane review on STH deworming

In the new Cochrane review on STH deworming, Taylor-Robinson et al. examine randomized controlled trials (RCTs) of deworming to address soil-transmitted intestinal worms (STHs), looking at impacts on nutrition, cognitive skills, and educational outcomes. Excluding studies that treated both STHs and schistosomiasis, they find surprisingly limited evidence of nutritional benefits, and very little support for cognitive or educational benefits.

In particular, they find that:

  • in mass deworming programs that treated everyone without testing them first, there is no consistent evidence for any effect on nutrition, cognitive performance, or school performance (more);
  • in small pilot programs that screened for the presence of worms prior to treatment, treatment was associated with increased weight and haemoglobin, which implies a reduction in anemia (more).

The previous Cochrane review of STH deworming, also by Taylor-Robinson et al., reached many similar conclusions, but we believe the new one to be more robust (more). The older review did not separate out the studies that screened for worm infection to examine their effects separately, as the new review does. Doing so sharpens our take on the evidence without fundamentally changing the picture.

We write more below about how this affects our take on SCI, but it is worth noting that the new systematic review might affect our likelihood of recommending Deworm the World, another deworming charity that we have been investigating: unlike SCI, which conducts combination deworming, Deworm the World (we believe) does some STH-only deworming.

Changes since the last Cochrane review and response to critics

The new review differs from the previous Cochrane review of STH deworming in several ways. Most importantly, from our perspective:

  • it incorporates many additional studies, including more studies focused on haemoglobin/anemia and Miguel and Kremer 2004, which was previously excluded;
  • it stratifies mass deworming studies by the prevalence of infections, so it can determine whether effects are consistently larger in higher-prevalence studies; and
  • it distinguishes between mass and screened deworming programs.

The new review also reaches different conclusions from several systematic reviews published since the last major update to the Cochrane review (Hall 2008, Albonico 2008, and Gulani 2007), all of which found statistically significant benefits to deworming.

Some of the changes since the last review were undertaken in order to respond to criticisms from deworming scholars. Taylor-Robinson et al. write:

Critics of a previous version of this review (Dickson 2000a) stated that the impact must be considered stratified by the intensity of the infection (Cooper 2000; Savioli 2000). We have done this comprehensively in this edition and no clear pattern of effect has emerged….

Other advocates of deworming, such as Bundy 2009, have argued that many of the underlying trials of deworming suffer from three critical methodological problems: treatment externalities in dynamic infection systems, inadequate measurement of cognitive outcomes and school attendance, and sample attrition. We agree with these points. However, externalities will be detected by large cluster-RCTs with a year or more follow up, and there are now five trials such as this included in this review.

We find these responses from Taylor-Robinson et al. compelling and we believe the new review to be a significant improvement over the older Cochrane review of deworming.

The new review’s take on mass-deworming programs

Unlike screened programs, mass deworming programs treat everyone with deworming drugs without testing whether they have a worm infection first (because doing so is costly relative to the price of the deworming drugs). The new Cochrane review finds that there is little evidence from studies of mass deworming programs to show that they improve nutrition, cognitive performance, or school outcomes.

Two studies in one location in Kenya with extremely high worm prevalence found that a single deworming treatment caused weight gain, but seven more studies in different areas found no effect, and studies with multiple doses were similarly mixed: two found large and significant results, while ten others found small, statistically insignificant results (pgs 19-21). There is essentially no evidence from studies of mass STH deworming to show that it improves haemoglobin status, height, cognitive test scores, or school performance; the evidence for an improvement in school attendance comes solely from the Miguel and Kremer 2004 study, with the other unscreened RCT finding no improvement in attendance. (See our update about this study above.)

The older Cochrane review on STH deworming, which we wrote about in our intervention report on deworming, did not distinguish as sharply between mass and screened programs. Though a sensitivity analysis in the old review that focused on mass studies found no significant effect on weight, the main analysis found a small statistically significant benefit by combining screened and mass studies. The new review continues to find that mass deworming has no statistically significant benefit on weight, but it differs from the older review in that it foregrounds this result.

The new Cochrane review also includes haemoglobin status as a main outcome for the first time. It is the first systematic review we’ve seen that distinguishes between the haemoglobin outcomes of mass and screened deworming, finding no statistically significant effect of mass STH deworming.

The new review’s take on smaller programs that screened for worm infection

Despite finding little evidence from mass deworming studies to support deworming, the new Cochrane review does find some evidence from randomized controlled trials to indicate that STH deworming improves nutrition in programs that screen for worm infections (i.e. only give deworming drugs to infected people).

In three small RCTs with a total of 149 participants who were screened for STH infections prior to participation, deworming pills caused a statistically significant increase in weight of about 0.6 kilograms. In a few other small screened RCTs, deworming produced statistically significant improvements in mid-upper arm circumference and skin-fold thickness; similar studies found no effect on height, body mass index, or school attendance. Two screened RCTs with a total of 108 participants found that treating STH infections causes a statistically significant increase in haemoglobin of 3.7 g/L (which implies a reduction in anemia).

What does it mean if smaller programs with screened participants show effects, while larger programs of mass deworming do not? One possibility is that STH deworming does have some impact on nutrition in infected individuals, but that the effect is too small to pick up in unscreened population studies. Another possibility is that the effects seen in smaller programs are spurious. The Cochrane review highlights the latter possibility, stating that “the data on targeted deworming is limited (three small trials, n = 149); the quality of the evidence is ’moderate’ for weight and ’low’ for haemoglobin.” (The Cochrane review also points to a third possibility: “the intervention itself is different … having been screened, and then told they have worms, children are more likely to comply with treatment, and alter their behaviour.” We find this possibility least likely.)

The overall quality of deworming research: publication bias, data-mining, and representativeness

One of our big take-aways from the Taylor-Robinson et al. review is that we should be really worried about publication bias, data-mining, and the representativeness of the research we rely on.

Publication bias

The best example of publication bias comes from the DEVTA study of deworming and Vitamin A supplementation, conducted on a population of more than a million children in Lucknow, India from 1999 to 2004, which remains unpublished to this day. We had already been aware of DEVTA from our research on Vitamin A supplementation, but the particulars of Taylor-Robinson et al.’s correspondence with the authors are new to us:

DEVTA: the world’s largest ever RCT, which includes over a million children randomized in a cluster design with mortality as the primary outcome, remains unpublished six years after completion. We have corresponded with the senior author on several occasions. We also wrote a letter to the Lancet in June 2011, asking for publication of this important study. When this letter was accepted, the authors submitted the manuscript to the Lancet within a week, and we withdrew our letter. However, at the time of writing (June 2012) the paper remains unpublished.

Results presented at a conference in 2007 (PPT) indicate that compliance was high but that the treatment did not cause a statistically significant reduction in mortality. Combining these results with other studies of Vitamin A, there still appears to be an effect on mortality, but the lack of formal publication means that the international consensus continues to overestimate the impact of Vitamin A on mortality.

We don’t think that STH deworming prevents a significant number of deaths, so the impact of the deworming arm of DEVTA on child mortality, whatever it turns out to be, is unlikely to affect our assessment of deworming. However, the fact that such a large and important study remains unpublished eight years after the trial was completed and five years after a conference presentation conveying the key results speaks to the power of publication bias.

Data mining

More generally, Taylor-Robinson et al. make it clear that studies have looked for potential impacts of deworming on a large number of different outcomes. (I count more than ten—weight, height, mid-upper arm circumference, skin-fold thickness, body mass index, measures of physical exertion like the Harvard Step Test, haemoglobin status, school attendance, school persistence, school exam performance, and cognitive test scores—with many potential sub-categories and measures each.) With so many different outcomes measured and little theoretical basis for determining which results are genuine, the potential for spurious results seems large, especially for outcomes which have been measured in only a few studies. (This would be a form of data-mining, and seems to have played a role in the previous systematic reviews that did find significant results.)
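To illustrate the multiple-comparisons worry in the abstract (a hypothetical sketch, not an analysis of any of the studies discussed here): if a treatment with no real effect is tested against roughly eleven independent outcomes at the conventional 5% significance level, the chance of at least one spuriously "significant" finding is large.

```python
import random

# Hypothetical illustration (not data from any deworming study): a
# treatment with NO true effect, tested against 11 independent
# outcomes at the 5% significance level.
n_outcomes = 11
alpha = 0.05

# Analytic probability that at least one outcome comes up
# "significant" by chance alone: 1 - 0.95^11.
analytic = 1 - (1 - alpha) ** n_outcomes  # ≈ 0.431

# Monte Carlo check: under the null hypothesis, each outcome's
# p-value is uniform on [0, 1], so "p < alpha" fires with
# probability alpha for each outcome independently.
random.seed(0)
trials = 100_000
spurious = sum(
    any(random.random() < alpha for _ in range(n_outcomes))
    for _ in range(trials)
)
simulated = spurious / trials

print(f"analytic:  {analytic:.3f}")
print(f"simulated: {simulated:.3f}")
```

In other words, even if deworming did nothing at all, more than four in ten such studies would be expected to report at least one significant outcome; this is why isolated significant results across many measured outcomes warrant caution.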

Representativeness

Taylor-Robinson et al. point to an additional concern about representativeness, which, while not really fitting the rubric of data-mining and publication bias, raises the specter of a set of rigorous research results that nonetheless don’t translate into practice. They write:

Evidence of benefit of deworming on nutrition appears to depend on three studies, all conducted more than 15 years ago, with two from the same area of Kenya where nearly all children were infected with worms and worm burdens were high. Later and much larger studies have failed to demonstrate the same effects. It may be that over time the intensity of infection has declined, and that the results from these few trials are simply not applicable to contemporary populations with lighter worm burdens.

This worry comports with our own reservations about the evidence from the Miguel and Kremer 2004 experiment, which was conducted during a period of abnormally elevated worm prevalence due to flooding caused by El Niño.

Together, these examples heighten our concern about the potential for bias and unrepresentativeness in the key studies we rely on in our assessment of the evidence for deworming.

 

The evidence in favor of the Schistosomiasis Control Initiative

Our intervention report on combination deworming, of the kind conducted by the Schistosomiasis Control Initiative, focuses on three kinds of benefits:

  • Subtle general health impacts, especially on haemoglobin. We drew our conclusions on haemoglobin effects from Smith and Brooker 2010‘s analysis of studies on combination deworming; since the new review examines STH-only deworming and not combination deworming, it does not address these studies.
  • Prevention of potentially severe effects, such as intestinal obstruction. These effects are rare and play a relatively small role in our position on deworming. The Cochrane review does not address these effects for the most part. (As stated above, it does discuss one study, with unavailable results, that examined mortality, but we believe mortality from STHs is rare enough that we wouldn’t expect it to show up in such a study.)
  • Developmental impacts, particularly on income later in life. The new review does not directly address the studies we used here. Bleakley 2004 is outside of the scope of the Cochrane review because it is not an experimental analysis, and Baird et al. 2011 is not mentioned, presumably because it has not yet been published. However, Taylor-Robinson et al. do discuss Miguel and Kremer 2004, which underlies the Baird et al. 2011 follow-up; in their assessment of the risk of bias in included studies, Miguel and Kremer 2004 does poorly (it appears to be the worst-graded of the 42 included trials; Figure 3). (See our update about this study above.) Presumably, the follow-up is subject to most, if not all, of the same worries that characterize the initial study, since it relies on the same underlying experiment. We have written before about our reservations about these studies, and the new Taylor-Robinson et al. review reinforces those reservations without adding substantial new information. We plan to continue to research the details of these papers, which are crucial to our assessment of deworming.

Conclusion

The new Cochrane review does not directly challenge the findings that are core to our view on combination deworming. That said, it does highlight general issues with research on deworming (e.g., potential publication bias and a case for benefit that is generally weaker than what many relevant academics and advocates seem to have believed). We therefore continue to recommend the Schistosomiasis Control Initiative as our #2 charity, though we have somewhat less confidence than we previously did.

Update on GiveWell’s web traffic / money moved: Q2 2012

In addition to evaluating other charities, GiveWell publishes substantial evaluation of itself, from the quality of its research to its impact on donations. We publish quarterly updates regarding two key metrics: (a) donations to top charities and (b) web traffic.

The charts below present basic information about our growth in money moved and web traffic thus far in 2012.

Website traffic tends to peak in December of each year (circled in the chart below). Growth in web traffic has generally remained strong in 2012, though it has slowed somewhat in May and June.

Growth in money moved has remained strong as well. The majority of the funds GiveWell moves comes from a relatively small number of donors giving larger gifts. These larger donors tend to give in December, and we have found that growth in donations from smaller donors throughout the year tends to provide a reasonable estimate of the growth from the larger donors by the end of the year.

Below, we show two charts illustrating growth among smaller donors.

Thus far in 2012, GiveWell has directed $404,775 to our top charities from donors giving less than $10,000. This is approximately 2.5x the amount we had directed at this point last year.

Most donors give less than $1,000; the chart below shows the growth in the number of smaller donors giving to our top charities.

Overall, 1,247 donors have given to GiveWell’s top charities this year (compared to 479 donors at this point last year).

In total, GiveWell donors have directed $964,250 to our top charities this year, compared with $568,250 at this point in 2011. For the reason described above, we don’t find this number to be particularly meaningful at this time of year. One major difference between 2011 and 2012 is that in 2011, Ken Jennings allocated the $150,000 he won participating in a Jeopardy! contest against IBM’s Watson to VillageReach.
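As a rough sketch of how the figures above fit together (using only the numbers quoted in this post), the growth multiples can be computed directly; the adjustment removing the one-off Jeopardy! gift is our own illustrative calculation, not an official GiveWell metric:

```python
# Figures quoted in this post.
small_donor_2012 = 404_775             # from donors giving < $10,000
implied_2011 = small_donor_2012 / 2.5  # "approximately 2.5x" last year

donors_2012, donors_2011 = 1247, 479
donor_growth = donors_2012 / donors_2011  # ≈ 2.6x

total_2012, total_2011 = 964_250, 568_250
# Excluding the one-off $150,000 Jeopardy! gift from the 2011 total
# gives a more like-for-like year-over-year comparison.
adjusted_2011 = total_2011 - 150_000
total_growth = total_2012 / adjusted_2011  # ≈ 2.3x

print(f"donor growth: {donor_growth:.1f}x")
print(f"adjusted total growth: {total_growth:.1f}x")
```

On this adjusted basis, total money moved so far this year has grown by a multiple similar to the small-donor and donor-count multiples, which is consistent with our use of smaller donors as a leading indicator.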

GiveWell and Good Ventures

Last year, we met Cari Tuna and Dustin Moskovitz of Good Ventures, a new foundation that eventually plans to give substantial amounts (Dustin and Cari aim to give away the majority of their net worth within their lifetimes; Dustin is the co-founder of Facebook and, more recently, Asana). We immediately established that Good Ventures and GiveWell share some core values that relatively few others seem to share:

  • Both Good Ventures and GiveWell are aiming to do as much good as possible, from a global-humanitarian perspective.
  • Both are willing to consider any group and any cause in order to accomplish this goal.
  • Both are highly interested in increasing the level of transparency, accountability, and critical discussion and reflection within the world of giving.

Over time, GiveWell and Good Ventures have worked increasingly closely together. In April of last year, Cari joined our Board of Directors; in December of last year, Cari announced substantial grants to our top-rated charities from Good Ventures. In the meantime, Cari was exploring the rest of the world of philanthropy, speaking with a large number of major philanthropists, nonprofit representatives, philanthropic advisors, etc. After a year of exploration, Cari stated to us that while many of the people she had spoken to had been helpful, GiveWell seemed to be most in alignment with the values of Good Ventures and had given the most helpful support in pursuing these values, and that GiveWell’s research appears to her to be at least as high-quality as any foundation research she’s seen. Now, GiveWell and Good Ventures plan to “act as a single team” as we source and vet funding opportunities in areas in which our interests overlap.

This is a partnership, not a merger; we remain separate legal entities. Cari is President of Good Ventures, while Elie and I are Co-Executive Directors of GiveWell; our authorities differ accordingly. If Good Ventures is interested in an area or activity that we aren’t interested in, it will use its resources to pursue this area or activity; likewise, if we are interested in an area or activity that Good Ventures isn’t interested in, we will use GiveWell’s resources to pursue this area or activity.

However, “acting as a single team” does mean that

  • There are substantial areas of overlap between our interests – investigations and activities that rank high on both of our priority lists. The agenda we laid out recently is a close match to current points of intersection.
  • Within these areas, we maintain a common priority list and divide up labor so that we don’t duplicate any work. Division of labor is done by consensus, and if there are unresolvable disagreements, each organization makes its own choices about its own resources (this has not happened so far).
  • Within these areas, funding requests and ideas will go through a common process. I.e., if someone brings an idea or request to Cari and we have agreed that it fits within an area that is being primarily managed by GiveWell, she will refer the request or idea to GiveWell rather than evaluating it herself.
  • When given confidential materials that are “for our eyes only,” we will attempt to share these with each other (though of course this will require permission from those providing the materials).
  • We are currently experimenting with close coordination on screening and training new hires. We look for similar qualities in new hires, so people who are interested in a job with one organization or the other may be interviewed by both simultaneously.
  • Overall, the above items require close coordination. For this and other reasons, the GiveWell team is currently planning to move to the Bay Area (more on this in a future post).

It seems to me that this is a relatively unusual arrangement. Formally, each organization has full authority over its own resources and none over the other’s, and this fact underlies all procedures for resolving disagreements if and when we cannot reach consensus. However, in practice recently, these cases have been rare and it has often felt as though we’re a single team with a single agenda.

Why does this situation seem unusual? One possibility is that it isn’t a good idea and that the problems with it will become apparent in the future; this possibility is why we have been clear about procedures for resolving disagreements. But there is another possible explanation. In my view, nonprofit work is naturally suited to this sort of “teamwork without a single authority” arrangement, in a way that for-profit work is not. Both GiveWell and Good Ventures are mission-driven: there are no financial returns to divide up, just a vision for the world on which we are closely aligned.

I believe that nonprofits sometimes mimic for-profits in ways that don’t make sense given their missions. They raise money beyond what they need for their core work. They keep information confidential rather than publishing it as a public good. And they exaggerate successes and downplay shortcomings, while being more honest would help the rest of the world learn and thus ultimately promote their mission (if not their organization). If I’m right, the relative unusualness of “teamwork without mergers” could be another way in which nonprofits are missing opportunities to be effective that aren’t available for for-profits. I think it’s possible that the sort of collaboration GiveWell and Good Ventures have today will be far more common in the future.

Objections and concerns about our new direction

GiveWell has recently been taking on activities that may seem to represent a pretty substantial change of direction, especially for those who think of us as a “charity evaluator focused on saving the most lives per dollar spent.”

  • Within global health and nutrition, we’re considering restricted funding for specific projects, not just recommendations of particular charities.
  • We’re also exploring other causes that are extremely different from global health and may be far less amenable to measurement and “cost per life saved” type calculations, such as meta-research.

When discussing these activities, we’ve lately been encountering a couple of different objections and concerns; this post discusses the objections and our responses. In a nutshell:

  • Some are concerned that we’ll lose our objectivity if we get involved in providing restricted funding: we’ll be tempted to rank the groups following our plans ahead of the groups following their own plans, and we’ll thus lose the quality of being a disinterested third-party evaluator. We believe we can draw a meaningful line between “charities we recommend for unrestricted funding” and “plans we have designed,” leaving individual donors to decide whether they’d rather take our recommendation unconditionally or only follow our advice in the areas where we’re disinterested; we also believe that being open to providing restricted funding is necessary and important, and justifies the resources we’ll be investing. More
  • Some are concerned that by going into new causes, we’ll be spreading ourselves too thin. Understanding global health is already an ambitious and difficult goal; it’s been suggested that we should “stick to our knitting.” We feel that sticking to global health, when we see other causes as potentially more promising, would be out of line with our fundamental mission and value-added as an organization that seeks to help people do as much good as possible. More
  • Some are concerned specifically about new causes that don’t lend themselves to measurement and cost-effectiveness calculations (such as meta-research). It may be difficult to remain systematic and transparent about how we make decisions in these more speculative areas. We recognize this concern, but feel that we can remain systematic and transparent even where measurement is difficult or impossible; furthermore, we feel that we must find a way to do this if we are to have a strong case that philanthropy as a whole (not just sub-sectors of it) should be more systematic and transparent. More

Despite the concerns and risks above, we feel that the benefits of our new direction outweigh them. A major input into this view is the feeling that sticking to our old process would be extremely unlikely to result in finding more outstanding giving opportunities within a reasonable period of time; this is something we will be writing more about.

That said, we do recognize the concerns and risks, and we are interested in others’ thoughts on them.

The risk of losing our objectivity

To date, all of GiveWell’s recommendations have involved unrestricted support to existing organizations. Because of this, we can be pointed to as a “neutral third party” that recommends organizations based exclusively on impact-related criteria. But we’re now contemplating doing what a lot of major funders do and helping to set the agenda for a funded organization, through the mechanism of restricted funding. If we did this, we might have difficulty being neutral between (a) projects that we help design and (b) charities that are simply asking for unrestricted funds, not contracting with us. In fact, we might be tempted to eschew (b) entirely and focus exclusively on designing – rather than finding – giving opportunities.

One important principle here is that we will draw a clear line between organizations we recommend for unrestricted funding and projects designed by GiveWell. We don’t know exactly how the visual presentation will work yet, but we have agreed on the principle that there will be a clear distinction – including on our higher-level and frequently-accessed pages – between GiveWell-designed projects and recommended charities.

Of course, there is still a risk that recommendations for unrestricted funding will have “soft conditions” (i.e., that it will be clear to charities what activities they have to carry out in order to earn or maintain recommendations); this is something that has always been true, though I think the situation is somewhat mitigated by the nature of the “room for more funding” analysis we perform. (Our analysis asks for predicted charity activities based on total unrestricted funding, not based on GiveWell-specific funding. The expectation is that if GiveWell-directed funding falls short of expectations and the gap is made up by other funding, the activities will still be as outlined; this hopefully provides charities an incentive to project the activities they would most like to carry out, rather than projecting the activities they hope will most appeal to GiveWell specifically.)

Even with a clear distinction, there could still be a reasonable concern that GiveWell will over-allocate resources (in terms of investigative capacity) to designing its own projects, as opposed to finding great organizations. We recognize this concern, but wish to note that – philosophically – we greatly prefer unrestricted to restricted funding, and greatly prefer a “hands-off” to a “hands-on” approach. We don’t have the capacity to actively manage projects ourselves, and we believe projects are likely to work out better when they are run by people who fully buy into them (as opposed to people who are fulfilling the requirements of restricted funding).

It’s partly because of this philosophy that we’ve stayed away from restricted funding to date, and we remain highly cautious about it. We would prefer to stick to unrestricted funding and may never in fact deal in restricted funding.

Yet it is worth noting why we are considering restricted funding now in a way that we haven’t before. Our impression is that major funders frequently make extensive use of restricted funding; as a result, the existing landscape consists of many charities whose agendas are set partly or fully by external funders.

  • We’ve been surprised by the disconnect we’ve observed in which there is a large number of promising interventions but few charities that focus on these interventions (in a way such that additional dollars will mean additional execution).
  • More generally, we’ve been surprised that in the majority of conversations in which we ask an organization what it would do with more unrestricted funding, it has no clear answer, and prefers instead to tailor its answer to our priorities.

Practically speaking, charities have to focus on what they can fund; and in today’s world, it seems possible that agendas are largely set by funders. Our ideal role would be to “free” great organizations from restricted funding, allowing them to carry out promising projects that they can’t fund otherwise. However, it seems possible that there are too few charities for whom funding would make this sort of a difference, and there is thus some argument for our taking the sort of active role that other funders do.

Finally, by being open to restricted funding, we’ve come across some opportunities that are similar to “unrestricted funding” in most relevant ways, but that structurally involve restrictions and that we couldn’t have come across using our former approach. For example, we’re currently considering the idea of funding particular parts of UNICEF that work on particular interventions that we’re interested in. This wouldn’t involve laying out our own plan, and it would involve getting money to a specific team and leaving the use of the funds at their discretion; however, we could not find this sort of giving opportunity by talking to general UNICEF representatives and asking what they would do with more unrestricted funding. In some sense it may be appropriate to think of UNICEF (and other organizations like it) as a coalition of teams with their own priorities rather than as a single team with a single set of priorities; so in this case a gift that is formally restricted may have many of the desirable qualities of an unrestricted gift. To avoid confusion, we will still distinguish any recommendations along these lines from purely unrestricted gifts, as laid out above.

The risk of spreading ourselves too thin

We still have a lot to learn about global health and nutrition (as indicated by, among other things, our continued learning from VillageReach’s progress). It has been suggested that we should “stick to our knitting,” focusing on the areas of giving in which (a) we’ve built up our brand and (b) data and feedback loops tend to be unusually good for the nonprofit world, facilitating learning.

In response, I’d observe:

  • GiveWell is still a young organization. I believe we have attracted attention more for “bringing a different perspective and approach to giving” than for “being experts in global health” (the latter certainly does not describe us). We recognize that we’re taking some level of risk in moving into new areas, but we also believe that taking risks and staying open to new approaches is a major part of what makes GiveWell what it is and that part of “sticking to our knitting” is retaining this quality. We believe that GiveWell and the donors who use our research will be best served by our continuing to do whatever we believe will lead to the best giving opportunities, continuing to change course as much as necessary to facilitate this, and continuing to bring a different perspective and approach to giving – not continuing to focus on global health.
  • While we currently believe that global health is the most promising cause given the information available, we are not confident in this conclusion. We believe that other causes are potentially promising as well, and if we never investigate them, we will be failing in our mission of finding the best giving opportunities possible.
  • We are currently expanding our staff; we expect that we will invest at least as much time in global health over the next few years as over the last few (while also investing time in other causes).

The risk of losing transparency and systematicity as we move away from highly measurable interventions

We have written before that the cause of “global health and nutrition” seems unusually well-suited to meaningful measurement and metrics (by the standards of the nonprofit sector). When working within this cause, we have been able to be relatively clear about our process and about what distinguishes a recommended from a non-recommended charity. There is some risk that as we tackle other causes, such as meta-research, we will have less of an evidence base to go off of; our goals will be further out; we will have to use more intuition and may therefore become less systematic and transparent.

We believe this is a real risk. However, we also believe that (a) the best opportunities for good giving don’t necessarily lie in the domains with the highest measurability (though there is something to be said for measurability, all else equal); (b) we have reached the point where we feel we can explore causes such as meta-research in a way that – while not as systematic as our work on global health – will still include a great deal of public discussion of how we’re thinking, why we recommend what we do, what the key assumptions are in our thinking and recommendations, and how our projects progress over time.

We have long advocated that philanthropists should be more systematic and transparent in their work. If our own systematicity and transparency apply only to the cause where measurement is easiest, we won’t have a very strong case; if, however, we can consistently bring an unusually high level of systematicity and transparency to every cause we examine (even those that are less amenable to measurement), we will have much more potential to change philanthropy broadly rather than just a single sector of it.

The benefits of our new direction

The above discussion addresses potential concerns over our new direction. We have previously discussed the substantial benefits: finding the best giving opportunities possible and reaching the largest donors possible, both of which are core to our mission. Dealing with the above issues – keeping a focus on recommending unrestricted funding when possible, covering new causes without overly detracting from continued progress on the causes we know well, and remaining systematic and transparent – will be a challenge, but we feel that it is well worth it, especially because we feel we are reaching the limits (for the moment) of our old approach. (We went through a large number of charities in 2011 and are skeptical that we will find new contenders for our top charities, using that basic methodology, anytime in the near future.)

We welcome further comments and criticisms regarding our new approach.