The GiveWell Blog

New Incentives update

We’re planning to release updated top-charity recommendations in mid-November, and one of the questions our staff has been debating recently is whether to recommend New Incentives as a top charity.

We’ve decided that New Incentives doesn’t currently meet our criteria for a top charity because its program doesn’t have sufficient evidence supporting it. However, we have been extremely impressed with New Incentives’ staff and think very highly of them; we are considering how best to support them in the future and how to incentivize others to found similar organizations.

In this post, we summarize the answers to the key questions we asked in determining whether New Incentives meets our criteria for a top charity recommendation, as well as the options we’re considering for future support.

Background

New Incentives operates a conditional cash transfer (CCT) program in Nigeria to incentivize pregnant women to deliver in a health facility. New Incentives originally intended its CCT program to focus primarily on prevention of mother-to-child transmission (PMTCT) of HIV. However, under this model the program did not reach enough HIV-positive pregnant women to justify its operating costs, and in 2015, New Incentives expanded its program to target both HIV-positive women and HIV-negative women.

New Incentives was the first organization we supported as part of our experimental work to support the development of future top charities. It has been about two and a half years since New Incentives received its initial grant, and it now has a long enough track record implementing its program to be considered for a top charity designation.

Is New Incentives’ intervention evidence-backed?

New Incentives’ impact is made up of three components: (a) delivering cash to very poor people, (b) incentivizing HIV-positive pregnant women to deliver in clinics and get the medicines that prevent mother-to-child transmission of HIV, and (c) incentivizing pregnant women to deliver their babies in a health facility.

Because a relatively small portion of New Incentives’ beneficiaries are HIV-positive, because it costs New Incentives more than GiveDirectly to deliver each dollar, and because it likely reaches individuals with higher incomes than GiveDirectly does, the factor that dominates our view of whether New Incentives meets our cost-effectiveness standard for a top charity is the impact of facility delivery on neonatal mortality.

The evidence we have for the impact of facility delivery comes from (1) relevant randomized controlled trials (RCTs), (2) monitoring that New Incentives carries out, and (3) non-RCT evidence on the impact of facility delivery.

Overall, the evidence from the RCTs increases our confidence that an intervention that offers improved neonatal care could have a significant impact on neonatal mortality. However, the evidence we have seen and New Incentives’ current monitoring of its program are insufficient to convince us that increasing the number of women who deliver at facilities has a similar impact.

Randomized controlled trial evidence

Two RCTs of low-intensity training programs for traditional birth attendants found significant (30-45%) reductions in neonatal mortality. These interventions are different from New Incentives’ intervention but may have a similar effect, since they aim to increase the knowledge of traditional birth attendants so that they offer care similar to that offered in health facilities. We did not find any RCTs on facility delivery itself; these two RCTs are the closest to New Incentives’ program that we identified. The interventions varied:

  • In Gill et al. 2011, the intervention group received training and supplies related to common practices to reduce neonatal mortality immediately following birth. The study observed significant differences between the treatment and control groups on practices such as drying the baby with a cloth and then wrapping it in a separate blanket (as opposed to using the same blanket), clearing the baby’s mouth and nose with a suction bulb (instead of a cloth), and using a pocket resuscitator (instead of mouth-to-mouth) (see Table 5, Pg. 8). We have not closely vetted this study but note some significant-seeming differences between the treatment and control birth attendants; in particular, the treatment group had significantly more education than the control group (see Table 1, Pg. 4).
  • In Jokhio et al. 2005, the intervention group received supplies and 3 days of training focused on antepartum, intrapartum, and postpartum care, including activities such as: “how to conduct a clean delivery; use of the disposable delivery kit; when to refer women for emergency obstetrical care; and care of the newborn.” The intervention group was “asked to visit each woman at least three times during the pregnancy (at three, six, and nine months) to check for dangerous signs such as bleeding or eclampsia, and to encourage women with such signs to seek emergency obstetrical care.”

New Incentives’ monitoring

New Incentives’ staff interviews a nurse and conducts an additional inspection at each health facility it considers working with. New Incentives reports the results of these interviews. Two questions are most relevant to our assessment of the similarity between the interventions studied in the RCTs discussed above and the care offered in the facilities New Incentives works with.

New Incentives asks nurses at each health facility: 1) “What multiple steps do you take immediately after delivery?” and 2) “What are the essential steps immediately after birth in ensuring that the baby can breathe and is warm?”

For the first question, New Incentives counts how many of the following steps nurses say they take (without being prompted by the New Incentives staff member asking the question): a) Dry baby with cloth, b) Slightly rub baby, c) Clear airways, d) Use air mask if necessary, e) Regulate temperature (put on mother’s belly), f) Don’t know/refused to answer. For the second question, New Incentives captures a free form answer.

We have limited information about the differences in practices between the intervention and control groups in Jokhio et al. 2005, but we do have this information for Gill et al. 2011. (See Gill et al. 2011, Table 5, Pg. 8.) It does not appear that the way New Incentives evaluates answers to its first question can tell us whether nurses in the facilities with which it works follow the improved practices from Gill et al. 2011.

We aggregated the answers to the second question: 17 of 54 answers explicitly mentioned using a bulb syringe or mucus extractor, which we would guess is equivalent to clearing the baby’s mouth and nose with a suction bulb in Gill et al. 2011 (another 11 mentioned “clear airways” or “suck,” which might refer to the procedure used in Gill et al. 2011). We were not able to get additional relevant information from nurses’ answers to the second question.

New Incentives does not appear to ask questions that fully address the other major difference between the intervention and control groups in Gill et al. 2011: use of a resuscitation intervention.

The intervention studied in Jokhio et al. 2005 included antenatal care in addition to intrapartum and postpartum care, and we don’t know what impact each part of the intervention had.

Note that New Incentives does not systematically collect data on the type of care women who enroll in its program would have received had they not delivered in a facility, though it has done some limited surveys of traditional birth attendants in the areas where it works.

Non-randomized evaluations of the impact of facility delivery

We have not carefully reviewed these studies. The studies we identified found mixed effects (including some that found higher neonatal mortality in facilities), and we have major questions about their ability to assess facilities’ causal impact.[1] In particular, women may be more likely to go to a facility for childbirth when they are experiencing complications, which could bias the results.

What is our best guess about New Incentives’ cost-effectiveness?

The most important questions in assessing New Incentives’ cost-effectiveness are (a) the impact its cash transfers have on rates of facility delivery and (b) the impact that increased facility delivery has on neonatal mortality.

New Incentives is conducting an RCT of its impact on (a), and preliminary results indicate that it had a significant impact on facility deliveries: 48% of women in the treatment group (i.e., all those who were offered the opportunity to enroll in the program, even if they chose not to do so) delivered in a facility, versus 27% in the control group. However, there are differences between the program studied in New Incentives’ RCT and its current program. The RCT only targeted HIV-positive women, so some portion of the impact may be attributable to educating women about the importance of PMTCT. The program studied in the RCT also provided larger cash transfers than New Incentives will provide in its ongoing program: the program originally gave 6,000 naira (approximately 19 US dollars) for enrollment, 20,000 naira for delivery, and 6,000 naira for an HIV test; the program currently gives 1,000 naira for enrollment and 10,000 naira for delivery.
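For scale, here is a minimal sketch comparing the two transfer packages, using the exchange rate implied in this post (6,000 naira ≈ 19 US dollars). It simplifies by assuming a woman receives every listed payment:

    # Rough comparison of transfer package sizes, using the exchange rate
    # implied above (6,000 naira ~= 19 USD). Simplifying assumption: a
    # participant receives every listed payment.
    naira_per_usd = 6000 / 19  # ~316 naira per dollar at the time

    rct_package = 6000 + 20000 + 6000  # enrollment + delivery + HIV test (program studied in the RCT)
    current_package = 1000 + 10000     # enrollment + delivery (ongoing program)

    print(f"RCT-era package: {rct_package:,} naira (~${rct_package / naira_per_usd:.0f})")
    print(f"Current package: {current_package:,} naira (~${current_package / naira_per_usd:.0f})")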

As noted above, we have very limited information to rely on when forming an estimate of the impact of facility delivery on neonatal mortality, and we do not see the evidence from the RCTs described above as particularly relevant or informative.

However, in trying to arrive at our best guess of the impact of the program, we also considered the facts that:

  • The interventions described in Gill et al. 2011 and Jokhio et al. 2005 are relatively low-cost and of limited intensity, yet they found significant decreases in neonatal mortality. This increases the plausibility that merely referring women to facilities for childbirth could have a similar, significant impact.
  • Our intuition (supported by what appears to be conventional wisdom in the global health community) strongly suggests that delivering in a facility (in general, without respect to the specific facilities New Incentives works with in Nigeria) is likely to lead to lower mortality than the alternatives.

Philosophical value judgments

Based on the results from the RCTs, we would expect New Incentives’ program to primarily prevent deaths of very young children (largely those within the first days or week of life). In internal staff discussions about New Incentives, we have asked ourselves how we value the lives of newborn children vs. the lives of those saved by malaria nets (the other life-saving intervention we currently recommend). We have not completed a thorough assessment of the ages at which people die from malaria, but our impression is that the median age of death is approximately 1.[2]

We believe there is no “right” answer to this question, but depending on one’s values, the answer could have a significant impact on the relative cost-effectiveness of New Incentives vs. the Against Malaria Foundation, and by extension our other top charities.

Key considerations include:

  • One could simply sum the remaining years of life lost due to the death of a newborn vs. a 1-year-old.
  • One could focus solely on lives saved and treat all lives as equivalent.
  • One might say that families and society have invested more in 1-year-olds and that 1-year-olds have more self-awareness and “personhood” than newborns, leading to valuing the 1-year-old more than the newborn.

Primarily for the last reason, the GiveWell staff who participated in these discussions tend to value the lives of 1-year-olds over those of newborns, though our relative weights vary considerably.

Best guess cost-effectiveness estimate

Ultimately, we don’t have enough information to arrive at a reliable estimate of the impact of facility delivery on neonatal mortality. Our best guess is extremely rough, based primarily on intuitions formed from limited data, and could easily shift significantly. We asked all staff who primarily work on GiveWell research to (a) guess the likely effect of New Incentives’ program on neonatal mortality and (b) enter the philosophical values discussed above. This yielded a median staff estimate that New Incentives is approximately as cost-effective as cash transfers (via GiveDirectly’s program). Our cost-effectiveness model is here (.xlsx).
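For intuition about how a mortality-effect guess and a value weight interact in this kind of estimate, here is a heavily simplified sketch. Every number in it is hypothetical, and our actual model (linked above) includes many more inputs:

    # Heavily simplified, hypothetical sketch of how staff inputs combine;
    # not GiveWell's actual cost-effectiveness model.
    cost_per_marginal_facility_delivery = 100     # hypothetical USD (transfers + operations)
    deaths_averted_per_facility_delivery = 0.005  # hypothetical guess at the neonatal mortality effect
    newborn_vs_1yo_value_weight = 0.5             # hypothetical: newborn death weighted at half a 1-year-old's

    cost_per_weighted_life = (cost_per_marginal_facility_delivery
                              / (deaths_averted_per_facility_delivery
                                 * newborn_vs_1yo_value_weight))
    print(f"Cost per (1-year-old-equivalent) life saved: ${cost_per_weighted_life:,.0f}")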

Is New Incentives transparent?

Yes – extremely. New Incentives has shared all of the information we have requested (and more) in a timely fashion. We feel it performs as well on this criterion as any organization we have ever engaged with.

Options we’re considering for future support of New Incentives and/or its staff

We have discussed each of the following options with New Incentives and plan to let New Incentives’ preference drive our decision about which one to choose. In considering these options, we took into account (a) the likely direct impact funding would have and (b) the incentives that funding would create for others considering starting a new organization like New Incentives.

  1. Recommend that Good Ventures (a foundation with which we work closely that has provided past funding for our experimental work) provide an “exit grant” of approximately $1.2 million to New Incentives. New Incentives relied heavily on funding we recommended in its scale-up, and abruptly stopping funding could cause it significant harm. Our impression is that funders often give grantees exit grants to offer them time to comfortably adjust their fundraising and spending plans; this has been GiveWell’s experience with support from institutional funders. We would plan to benchmark our recommendation to the level of support New Incentives could have expected from us over the next two years (January 2017 – December 2018) as of the last time Good Ventures made a grant (March 2016). $1.2 million represents half of what we would have projected New Incentives’ spending to be in 2017 and 2018 as of March 2016. (New Incentives has grown faster than we expected since March 2016, so this is now less than 50% of its projected operating expenses.)
  2. Recommend that Good Ventures agree to support some portion of New Incentives’ ongoing operations and a randomized controlled trial of New Incentives’ program’s impact on neonatal mortality. New Incentives’ program doesn’t seem cost-effective enough that we’d be willing to recommend that Good Ventures fully fund an RCT and New Incentives’ ongoing operations, but we’d consider recommending some significant support (very roughly, we’d cap a recommendation at 50% of the total cost) if New Incentives could raise the rest of the funding elsewhere. This option would provide New Incentives with the opportunity to demonstrate that its program is more effective and cost-effective than we currently expect, as long as it is able to convince other funders to provide some support as well.
  3. Provide support to New Incentives/the New Incentives team to do something new. If New Incentives or its staff were interested in starting a new charity aiming to be a GiveWell top charity or significantly changing its program to focus on something more cost-effective, we would recommend that Good Ventures provide support.

We hope to decide soon about which option to pursue.

[Added December 19, 2016: GiveWell’s experimental work is now known as GiveWell Incubation Grants.]

Notes
[1] We identified two relevant meta-analyses. Chinkhumba et al. 2014, a meta-analysis of six prospective cohort studies of perinatal mortality in sub-Saharan Africa, found 21% higher perinatal mortality in home deliveries compared to facility deliveries (OR 1.21 [1.02-1.46]) using a fixed-effects model, but this difference was not significant using a random-effects model (OR 1.21 [0.79-1.84]).

We are also concerned that studies limited to the perinatal period may not capture longer-term neonatal effects. Tura et al. 2013, a meta-analysis of 19 studies (of varying methodology) of the effect of facility delivery on neonatal mortality, found mixed results. Pooled results from low- and middle-income countries showed a 29% reduction in the risk of neonatal death associated with facility delivery. However, the results of the studies were highly heterogeneous. Of the 8 studies in sub-Saharan Africa, 4 found an effect near the pooled mean and 4 did not find a statistically significant effect. (Of the four that did not find a significant effect, two found a nonsignificant effect close to the pooled mean of all studies, and two found no effect.)

A retrospective study based on the demographic and health surveys in Nigeria found that facility delivery is associated with increased neonatal mortality (adjusted odds ratio 1.28 [1.11-1.47], Fink et al. 2015, Figure 1, Pg. 5).

[2] Here is one paper we found. We have not vetted this paper. The simple average age of death in it is approximately 1.2 years (see Table 1).

Updates on AMF’s transparency and monitoring

In our mid-year update, we continued to recommend that donors give to the Against Malaria Foundation (AMF), and we wrote that we believe AMF has the most valuable current funding gap among our top charities. We also briefly wrote about some new concerns we have about AMF based on our research from the first half of 2016.

This post describes our new concerns about AMF’s transparency and monitoring in more depth. We continue to believe that AMF stands out among bed net organizations, and among charities generally, for its transparency and the quality of its program monitoring. AMF makes substantial amounts of useful information on the impact of its programs—far more than the norm—publicly available on its website, and has generally appeared to value transparency as much as any organization we’ve encountered. But our research on AMF in 2016 has led us to moderately weaken our view that AMF stands out for these qualities. In short, this is because:

  • The first two post-distribution check-up surveys following AMF’s bed net distribution in Kasaï-Occidental, Democratic Republic of the Congo (DRC) were poorly implemented. AMF told us that it agrees that the surveys were poorly implemented and is working to improve data quality in future surveys.
  • We learned that the methodology for selecting communities and households for post-distribution check-up surveys in Malawi is less rigorous than we had previously thought.
  • AMF was slower to share information with us in 2016 than we would have liked. Unfortunately, we aren’t fully confident about what caused this to happen. We believe that AMF misunderstood the type of information we would value seeing, and this may have caused some (but not all) of this issue.

These updates somewhat lower our confidence in AMF’s track record of distributing bed nets that are used effectively (i.e., present, hanging, and in good condition) over the long term and in its commitment to transparency; however, this is only a moderate update, and we don’t think what we’ve learned is significant enough to outweigh AMF’s strengths. We continue to recommend AMF and believe that it has the highest-value current funding gap of any of our top charities.

Going forward, we plan to continue to learn more about AMF’s transparency and monitoring through reviewing the results of additional post-distribution surveys and continued communication with AMF.

Background on AMF’s strengths and evidence of impact

We recommend AMF because there is strong evidence that mass distribution of long-lasting insecticide-treated bed nets reduces child mortality and is cost-effective, and because of AMF’s strengths as an organization: a standout commitment to transparency and self-evaluation, substantial room for more funding to deliver additional bed nets, and relatively strong evidence overall on whether bed nets reach intended destinations and are used over the long term.

In particular, AMF requires that its distribution partners implement post-distribution check-up surveys every six months among a sample (around 5%) of recipient households for 2.5 years following a bed net distribution, and has publicly shared the results of these surveys from several of its bed net distributions in Malawi. It’s our understanding that AMF is quite unusual in this regard—other organizations that fund bed nets do not typically require post-distribution check-up surveys to monitor bed net usage over time, and do not publicly share monitoring data as AMF does.

Evidence of the impact of AMF’s bed net distribution in Kasaï-Occidental, DRC

AMF has sent us reports and data from two post-distribution check-up surveys (from eight months and twelve months after the distribution) from Kasaï-Occidental, DRC.

Donors may not realize that AMF has a short track record of major distributions. It has historically worked primarily with Concern Universal in Malawi, so these are the first surveys we’ve seen from large-scale AMF bed net distributions outside of Malawi. AMF’s post-distribution check-up surveys are intended to provide evidence on how many AMF nets are used effectively over the long term, but we (and AMF) believe that these surveys in Kasaï-Occidental, DRC were poorly implemented (details in footnote).[1]

Due to the extent of the implementation issues, we don’t think the post-distribution check-up surveys provide a reliable estimate of the proportion of bed nets distributed in Kasaï-Occidental, DRC used effectively over the long term. (Note that AMF earlier provided a distribution report, registration data, and photos and videos as evidence that bed nets originally reached intended destinations.) It seems plausible to us that the reported rates of nets in-use (hung over a sleeping space) from the AMF distribution in the 8-month post-distribution check-up survey (~80%) and the 12-month post-distribution check-up survey (64-69%) are either substantial overestimates or substantial underestimates.

Non-random sampling in post-distribution surveys in Malawi

This year, we learned that Concern Universal, AMF’s distribution partner in Malawi, does not use a completely random process to select participants for post-distribution surveys. We have received some conflicting information from AMF and Concern Universal on the specifics of how the selection process deviates from pure randomization, so we aren’t confident that we fully understand how the selection process works in practice (details in footnote).[2]

Our earlier understanding was that Concern Universal randomly selected villages and households for post-distribution surveys without any adjustments.

We are now concerned that the results from post-distribution surveys from Malawi could be somewhat biased estimates of the long-term impact of AMF’s distributions (though we wouldn’t guess that the effect of the bias on the results would be very large, since AMF and Concern Universal described selection processes that seem likely to produce reasonably representative samples).

AMF told us that it may reconsider its requirements for random selection of participants in future post-distribution surveys and invited us to make suggestions for improvement.

AMF’s transparency and communication

Although we still believe that AMF stands out among bed net organizations for its commitment to transparency, AMF has recently been less transparent with us than we’ve come to expect.

In early 2016, we requested several documents from AMF (including the 8-month and 12-month post-distribution surveys from Kasaï-Occidental, DRC; malaria case rate data from clinics in Malawi; and audits of household registration data from Malawi), which AMF told us it had available and would share once it had the capacity to review and edit them. Although we eventually received reports and data from the two DRC post-distribution surveys in June, we still haven’t seen the other documents we requested. AMF responded to these concerns here.

We are concerned that AMF did not tell us about the poor implementation of the first two Kasaï-Occidental, DRC surveys earlier, and that we only recently learned about the details of Concern Universal’s adjustments to random sampling for post-distribution surveys in Malawi. AMF told us it agrees that it should have communicated more clearly with us about these two issues and believes that it did not because it misunderstood the type of information we would value seeing. We are not confident that this fully explains AMF’s lack of transparency.

What we hope to learn going forward

AMF’s track record of providing evidence of impact on its bed net distributions outside of Malawi is currently very limited. Our impression is that DRC is a difficult country for charities to work in; we’re uncertain whether the methodological issues with the first two surveys from Kasaï-Occidental were due to the difficulty of working in DRC specifically, to more general issues with AMF starting programs in new countries and working with new implementing partners, or to the relatively poor performance of an implementing partner.

AMF has told us that it expects the implementation of future post-distribution surveys in DRC to improve, and that it has made several changes to its practices in response to the issues discussed above, including:

  • Hiring a Program Director, Shaun Walsh, whose primary job is to work in-country with distribution partners on planning, executing, and monitoring bed net distributions.
  • Requiring more detailed budgets and plans from distribution partners for upcoming post-distribution surveys in Ghana, Uganda, and Togo.
  • Focusing on improving timeliness of reporting on distributions and post-distribution surveys.

We plan to communicate closely with AMF on its upcoming post-distribution surveys, and update our views on AMF’s track record outside of Malawi when more survey results are available.

Notes
[1]
AMF’s reports on the surveys indicate that:

  1. It seems likely that different data collectors interpreted ambiguously worded questions differently in both the 8-month and 12-month surveys. “Number of nets available” (translated from French) was variously interpreted as the number of nets hung, the number of nets hung plus the number of nets present but not hung, or the number of nets present but not hung. This led to internally inconsistent data (e.g., different numbers of nets reported for a single household on different survey questions) for a large proportion of households (42% in the 8-month post-distribution survey and around half in the 12-month post-distribution survey). AMF excluded households with internally inconsistent data from its analysis of the proportion of nets from the distribution still in use.
  2. AMF addressed this issue by re-writing survey questions after the 8-month survey, but the corrected survey questions were not put onto the data collectors’ smartphones before the 12-month survey.
  3. Household members sometimes reported inaccurate information to data collectors when survey questions were asked outside of a home. Data collectors later confirmed that the information was inaccurate (e.g. the household owned more bed nets than reported) by direct observation inside the home, but were not able to correct the data already entered into their smartphones.
  4. Data collectors did not distinguish between nets from the late 2014 AMF distribution and bed nets from other sources. AMF notes that the average level of previously-owned nets was around 2.5%, so this would not have materially influenced the results of the post-distribution survey.

[2]

  • AMF told us:
    Concern Universal selects villages for post-distribution surveys in each health center catchment area where AMF nets were distributed. Concern Universal divides each health center catchment area into three “bands”: a short, medium, and far distance from the health center. In each band, Concern Universal randomly selects between 25% and 50% of the villages. In each of those villages, Concern Universal randomly selects around 20% of the households.

  • In April 2016, we spoke with a representative of Concern Universal, who told us that, in addition to the stratification of villages by geographic location described by AMF, villages selected in one post-distribution survey are excluded from selection in the following post-distribution survey. (A sketch combining both descriptions appears below.)
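For concreteness, here is a minimal sketch combining AMF’s and Concern Universal’s descriptions of the selection process. The data shapes, function name, and details of the exclusion rule are our own simplifying assumptions, not their actual procedure:

    # Sketch of the described selection process; the data shapes and details
    # are simplifying assumptions, not Concern Universal's actual procedure.
    import random

    def select_survey_households(bands, previously_surveyed_villages, seed=0):
        """bands: dict mapping a distance band ('short'/'medium'/'far') to a dict
        of {village_name: [household_ids]}. Returns sampled household IDs."""
        rng = random.Random(seed)
        sampled = []
        for band_name, villages in bands.items():
            # Villages surveyed in the previous round are excluded (per Concern Universal).
            eligible = [name for name in villages if name not in previously_surveyed_villages]
            # Randomly select between 25% and 50% of eligible villages in each band (per AMF).
            n_villages = max(1, round(rng.uniform(0.25, 0.50) * len(eligible)))
            for village in rng.sample(eligible, min(n_villages, len(eligible))):
                households = villages[village]
                # Randomly select around 20% of households in each selected village (per AMF).
                n_households = max(1, round(0.20 * len(households)))
                sampled.extend(rng.sample(households, n_households))
        return sampled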

September 2016 open thread

Our goal with hosting quarterly open threads is to give blog readers an opportunity to publicly raise comments or questions about GiveWell or related topics (in the comments section below). As always, you’re also welcome to email us at info@givewell.org or to request a call with GiveWell staff if you have feedback or questions you’d prefer to discuss privately. We’ll try to respond promptly to questions or comments.

If you have questions related to the Open Philanthropy Project, you can post those in the Open Philanthropy Project’s open thread.

You can view our June 2016 open thread here.

Would other organizations have funded AMF’s bednet distributions if AMF hadn’t?

An important question to ask when deciding where to give is “what would happen if this charity didn’t receive my donation?”

To investigate this, we focus on charities’ “room for more funding,” i.e., what will additional funding for this organization allow it to do that it would not be able to do without additional support from the donors GiveWell influences?

This question is relevant to the Against Malaria Foundation (AMF), currently our #1 rated charity, which provides funding to support malaria net distributions in Sub-Saharan Africa. In the past, we focused intensely on the question of whether AMF would be able to absorb and commit additional funds.

Recently, we asked another question: how likely is it that the bednet distributions that AMF supports would have been funded by others if AMF hadn’t provided funding? That is, would another funder have stepped in to provide funding in AMF’s absence?

If this were the case, our assessment of AMF’s impact would be diminished because it would seem likely that, in the absence of giving to AMF, the distributions it might have supported would occur anyway.

We can’t know what other funders might do in the future, so to learn more about this we looked back at cases from 2012 and 2013 where AMF had initially considered a distribution but then didn’t end up providing funding. We asked whether, and when, those distributions were eventually funded by others.

Our investigation

We looked at five cases where AMF considered funding a distribution but did not end up moving forward. In short:

  • In two cases, major delays (18 months and ~36 months) occurred before people in the area received bednets from other sources.
  • In two cases, other funders filled the gap six to nine months later than AMF would have.
  • In one case, funding was committed soon after AMF’s talks fell through.

(For context, we use an “8%-20%-50%” model to estimate the longevity of bednets: cumulative attrition of 8%, 20%, and 50% by the ends of the first, second, and third years, so that 92% of nets are still in use through the first year, 80% through the second, 50% through the third, and none after the end of the third year. On average, then, we estimate that nets last about 27 months.)
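As a rough check on the ~27-month figure, here is a minimal sketch (ours, not GiveWell’s actual model code). It treats a net’s expected lifetime as the sum, over years, of the probability the net is still in use that year, times 12 months:

    # Reproducing the ~27-month average lifespan implied by the "8%-20%-50%"
    # attrition model. Assumes E[lifetime] ~= sum over years k of
    # P(still in use through year k) * 12 months.
    survival_by_year = [0.92, 0.80, 0.50]  # in use through years 1-3; none after year 3

    expected_months = 12 * sum(survival_by_year)
    print(f"Expected lifespan: {expected_months:.1f} months")  # -> 26.6, i.e. about 27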

More details are available in our full report on this investigation.

Of course, these cases aren’t necessarily predictive:

  • It’s possible that the distributions were atypical, and that the reasons that led AMF not to carry out these distributions were the same reasons that led other funders not to fund them. This would mean that a typical AMF distribution might in fact be more likely to be funded by someone else, if AMF doesn’t fund it, than these results predict.
  • It’s possible that the global funding situation has changed since the cases we investigated in 2012 and 2013. If more funding is now available overall, it would be more likely that another funder would step in if AMF didn’t carry out a given distribution.

That said, even if other funders would always step in when AMF didn’t carry out a distribution, AMF could still be increasing the total number of bednets distributed if there’s an overall funding gap for bednets globally; for this to be the case, there would likely need to be some additional pool of funding that can be directed to bednets when necessary. We’ve written more about the global bednet gap here.

Overall, we think the cases we looked at support our conclusion that there is a real need for additional funding for bednets, and that AMF is not primarily displacing other funding for bednets.

Deworming might have huge impact, but might have close to zero impact

We try to communicate that there are risks involved with all of our top charity recommendations, and that none of our recommendations are a “sure thing.”

Our recommendation of deworming programs (the Schistosomiasis Control Initiative and the Deworm the World Initiative), though, carries particularly significant risk (in the sense of possibly not doing much/any good, rather than in the sense of potentially doing harm). In our 2015 top charities announcement, we wrote:

Most GiveWell staff members would agree that deworming programs are more likely than not to have very little or no impact, but there is some possibility that they have a very large impact. (Our cost-effectiveness model implies that most staff members believe there is at most a 1-2% chance that deworming programs conducted today have similar impacts to those directly implied by the randomized controlled trials on which we rely most heavily, which differed from modern-day deworming programs in a number of important ways.)

The goal of this post is to explain this view and why we still recommend deworming.

Some basics for this post

What is deworming?

Deworming is a program that involves treating people at risk of intestinal parasitic worm infections with parasite-killing drugs. Mass treatment is very inexpensive (in the range of $0.50-$1 per person treated), and because treatment is cheaper than diagnosis and side effects of the drugs are believed to be minor, typically all children in an area where worms are common are treated without being individually tested for infections.

Does it work?

There is strong evidence that administration of the drugs reduces worm loads, but many of the infections appear to be asymptomatic and evidence for short-term health impacts is thin (though a recent meta-analysis that we have not yet fully reviewed reports that deworming led to short-term weight gains). The main evidence we rely on to make the case for deworming comes from a handful of longer term trials that found positive impacts on income or test scores later in life.

For more background on deworming programs see our full report on combination deworming.

Why do we believe it’s more likely than not that deworming programs have little or no impact?

The “1-2% chance” doesn’t mean that we think there is a 98-99% chance that deworming programs have no effect at all; rather, we think it’s appropriate to apply a 1-2% multiplier to the impact found in the original trials. This can be thought of as assigning some chance that deworming programs have no impact, and some chance that the impact exists but is smaller than was measured in those trials. For instance, as we describe below, worm infection rates are much lower in present contexts than they were in the trials.
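As a purely illustrative decomposition (the numbers below are invented, not our published inputs), a multiplier in the 1-2% range can arise from combining these two kinds of discounts:

    # Invented numbers, for illustration only: a small overall multiplier can
    # decompose into "chance the effect is real" and "relative size if real".
    p_effect_is_real = 0.30         # hypothetical chance a long-term income effect exists at all
    relative_effect_if_real = 0.05  # hypothetical size relative to the original trials
                                    # (e.g., lower worm loads, no El Nino flooding)

    multiplier = p_effect_is_real * relative_effect_if_real
    print(f"Implied multiplier: {multiplier:.1%}")  # -> 1.5%, within the 1-2% range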

Where does this view come from?

Our overall recommendation of deworming relies heavily on a randomized controlled trial (RCT) (the type of study we consider the “gold standard” for causal attribution) first written about in Miguel and Kremer 2004 and followed by 10-year follow-up data reported in Baird et al. 2011, which found very large long-term effects on recipients’ income. We reviewed this study very carefully (see here and here), and we felt that its analysis largely held up to scrutiny.

There’s also some other evidence, including a study that found higher test scores in Ugandan parishes that were dewormed in an earlier RCT, and a high-quality study that is not an RCT but found especially large increases in income in areas of the American South that received deworming campaigns in the early 20th century. However, we consider Baird et al. 2011 to be the most significant result because of its size and the fact that the follow-up found increases in individual income.

While our recommendation relies on the long-term effects, the evidence for short-term effects of deworming on health is thin, so we have little evidence of a mechanism through which deworming programs might bring about long-term impact (though a recent meta-analysis that we have not yet fully reviewed reports that deworming led to short-term weight gains). This raises concerns about whether the long-term impact exists at all, and may suggest that the program is more likely than not to have no significant impact.

Even if there is some long-term impact, we downgrade our expectation of how much impact to expect, due to factors that differ between real-world implementations and the Miguel and Kremer trial. In particular, worm loads were unusually high during the Miguel and Kremer trial in Western Kenya in 1998, in part due to flooding from El Niño; in addition, baseline infection rates in the places where SCI and Deworm the World work are lower than in the relevant studies.

Our cost-effectiveness model estimates that the baseline worm infections in the trial we mainly rely on were roughly 4 to 5 times as high as in places where SCI and Deworm the World operate today, and that El Niño further inflated those worm loads during the trial. (These estimates combine data on the prevalence of infections and intensity of infections, and so are especially rough because there is limited data on whether prevalence or intensity of worms is a bigger driver of impact). Further, we don’t know of any evidence that would allow us to disconfirm the possibility that the relationship between worm infection rates and the effectiveness of deworming is nonlinear, and thus that many children in the Miguel and Kremer trial were above a clinically relevant “threshold” of infection that few children treated by our recommended charities are above.

We also downgrade our estimate of the expected value of the impact based on: concerns that the limited number of replications and lack of obvious causal mechanism might mean there is no impact at all, expectation that deworming throughout childhood could have diminishing returns compared to the ~2.4 marginal years of deworming provided in the Miguel and Kremer trial, and the fact that the trial only found a significant income effect on those participants who ended up working in a wage-earning job. See our cost-effectiveness model for more information.

Why do we recommend deworming despite the reasonably high probability that there’s no impact?

Because mass deworming is so cheap, there is a good case for donating to support deworming even when in substantial doubt about the evidence. We estimate the expected value of deworming programs to be as cost-effective as any program we’ve found, even after the substantial adjustments discussed above: our best guess considering those discounts is that it’s still roughly 5-10 times as cost-effective as cash transfers, in expectation. But that expected value arises from combining the possibility of potentially enormous cost-effectiveness with the alternative possibility of little or none.
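Here is a stylized sketch of that expected-value logic, again with invented numbers rather than figures from our model:

    # Stylized expected-value calculation with invented numbers: a small chance
    # of a very large effect can dominate the expectation.
    p_large_effect = 0.015         # hypothetical chance the trial-sized effect holds today
    value_if_large = 500           # hypothetical cost-effectiveness multiple of cash transfers
    value_if_little_or_none = 0.5  # hypothetical multiple if the effect is tiny or absent

    expected_multiple = (p_large_effect * value_if_large
                         + (1 - p_large_effect) * value_if_little_or_none)
    print(f"Expected value: ~{expected_multiple:.0f}x cash transfers")  # -> ~8x, within the 5-10x range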

GiveWell isn’t seeking certainty – we’re seeking outstanding opportunities backed by relatively strong evidence, and deworming meets that standard. For donors interested in trying to do as much good as possible with their donations, we think that deworming is a worthwhile bet.

What could change this recommendation – will more evidence be collected?

To our knowledge, there are currently no large, randomized controlled trials being conducted that are likely to be suitable for long-term follow up to measure impacts on income when the recipients are adults, so we don’t expect to see a high-quality replication of the Miguel and Kremer study in the foreseeable future.

That said, there are some possible sources of additional information:

  • The follow-up data that found increased incomes among recipients in the original Miguel and Kremer study was collected roughly 10 years after the trial was conducted. Our understanding is that 15 year follow-up data has been collected and we expect to receive an initial analysis of it from the researchers this summer.
  • A recent study from Uganda didn’t involve data collection for the purpose of evaluating a randomized controlled trial; rather, the paper identified an old, short-term trial of deworming and an unrelated data set of parish-level test scores collected by a different organization in the same area. Because some of the parishes overlap, it’s possible to compare the test scores from those that were dewormed to those that weren’t. It’s possible that more overlapping data sets will be discovered and so we may see more similar studies in the future.
  • We’ve considered whether to recommend funding for an additional study to replicate Baird et al. 2011: run a new deworming trial that could be followed for a decade to track long term income effects. However, it would take 10+ years to get relevant results, and by that time deworming may be fully funded by the largest global health funders. It would also need to include a very large number of participants to be adequately powered to find plausible effects (since the original trial in Baird et al. 2011 benefited from particularly high infection rates, which likely made it easier to detect an effect), so it would likely be extremely expensive.

For the time being, based on our best guess about the expected cost-effectiveness of the program when all the factors are considered, we continue to recommend deworming programs.

Update on GiveWell’s web traffic / money moved: Q1 2016

In addition to evaluations of other charities, GiveWell publishes substantial evaluations of our own work, from progress against our goals to our impact on donations. We generally publish quarterly updates on two key metrics: (a) donations to our top charities and (b) web traffic (though going forward, we may provide less frequent updates).

The tables and chart below present basic information about our growth in money moved and web traffic in the first quarter of 2016 compared to the previous two years (note 1).

Money moved and donors: first quarter

[Table: 2016 Q1 money moved (Table_2016Q1MoneyMoved.png)]

Money moved by donors who have never given more than $5,000 in a year increased about 50% to $1.1 million. The total number of donors in the first quarter increased about 30% to about 4,500 (note 2).

Most of our money moved is donated near the end of the year (we tracked 70% or more of our total money moved in the fourth quarter in each of the last three years) and is driven by a relatively small number of large donors. Because of this, we do not think we can reliably predict our growth, and we think that our year-to-date total money moved provides relatively limited information about what our year-end money moved is likely to be (note 3). We therefore look at the data above as an indication of growth in our audience.

Web traffic through April 2016

[Table: 2016 Q1 web traffic (Table_2016Q1WebTraffic.png)]

Web traffic excluding Google AdWords grew 10% in the first quarter. GiveWell’s website receives elevated traffic during “giving season” around December of each year. To adjust for this and emphasize the trend, the chart below shows the rolling sum of unique visitors over the previous twelve months, starting in December 2009 (the first period for which we have 12 months of reliable data, due to an issue tracking visits in 2008).

[Chart: rolling 12-month unique visitors (Chart_2016Q1WebTraffic.png)]

We use web analytics data from two sources: Clicky and Google Analytics (except for those months for which we only have reliable data from one source). The raw data we used to generate the chart and table above (as well as notes on the issues we’ve had and adjustments we’ve made) is in this spreadsheet. (See note 4 for how we count unique visitors.)
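For concreteness, here is a minimal sketch of the rolling-sum transformation used for the chart, with a fabricated series standing in for our monthly unique-visitor counts:

    # Sketch of the chart's rolling-sum transformation: summing monthly unique
    # visitors over the trailing twelve months. The data here is fabricated.
    def trailing_12_month_sums(monthly_uniques):
        """monthly_uniques: unique-visitor counts in calendar-month order."""
        return [sum(monthly_uniques[i - 11:i + 1])
                for i in range(11, len(monthly_uniques))]

    fake_months = [10_000 + 100 * i for i in range(24)]  # 24 months of fake data
    print(trailing_12_month_sums(fake_months)[:2])       # first two 12-month sums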



Note 1: Since our 2012 annual metrics report we have shifted to a reporting year that starts on February 1, rather than January 1, in order to better capture year-on-year growth in the peak giving months of December and January. Therefore, metrics for the “first quarter” reported here are for February through April.

Note 2: Our measure of the total number of donors may overestimate the true number. We identify individual donors based on their reported name and email address. Donors may donate directly to our recommended charities without opting to share their contact information with us, or they may use different information for subsequent donations (for example, a different email address), in which case we may mistakenly count a donation from a past donor as if it were made by a new donor. We are unsure but would guess that the impact of this issue is relatively small and that the data shown are generally reflective of our growth from year to year.
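To illustrate the overcounting issue, here is a minimal sketch (not our actual metrics pipeline) that keys donors on reported name and email:

    # Illustration of the overcounting issue: keying donors on (name, email)
    # counts one person twice if they give under two email addresses.
    donations = [
        {"name": "Jane Doe", "email": "jane@example.com", "amount": 100},
        {"name": "Jane Doe", "email": "jane@work.example.com", "amount": 250},  # same person, new email
    ]

    unique_donors = {(d["name"].lower(), d["email"].lower()) for d in donations}
    print(len(unique_donors))  # -> 2, though only one real donor gave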

Note 3: In total, GiveWell donors directed $2.6 million to our top charities in the first quarter of 2016, compared to $2.0 million that we had tracked in the first quarter of 2015. For the reason described above, we don’t find this number to be particularly meaningful at this time of year.

Note 4: We count unique visitors over a period as the sum of monthly unique visitors. In other words, if the same person visits the site multiple times in a calendar month, they are counted once. If they visit in multiple months, they are counted once per month.