Investigating the Ebola response

Should you donate to efforts to contain the Ebola outbreak in West Africa? With hundreds of millions of dollars coming in from other donors, will your donation make a difference? How does this compare to giving to GiveWell’s top charities?

These are difficult questions. It’s always hard to estimate how much good a donation does; it’s much harder in the midst of a rapidly evolving situation like this one. It requires predicting the future path of the epidemic and the effects of response efforts. New information (and new donations) is constantly changing the picture. Further complicating matters, the people who best understand the situation are extremely busy, and we need to be careful with how we request their time. Even coming up with a rough take on Ebola involves major effort. However, prompted by some preliminary analysis and estimates, we are now in the midst of a small investigation, and hope to publish our take on donating for Ebola response within the next week or two.

In this post, we lay out the steps we’ve taken and the steps we’re planning next for our investigation. We then discuss what goes into our decisions about how to respond to sudden, prominent donation opportunities like this one, and why we’ve decided to do an investigation in this case.

Our investigation
The basic question we’re trying to answer is: what is the cost-effectiveness (in terms of lives saved and similar benefits per dollar spent) of additional donations to the Ebola response (beyond what’s already been raised, and including factors such as the risk that Ebola might spread to more countries in Africa or become endemic if not contained)?

Unfortunately, we don’t know of any published efforts to answer this question. We also don’t know of efforts to answer related questions such as “What is the expected death toll of the Ebola outbreak conditional on the current planned response, and how would this change if the response were better-funded?” The information and analysis we do have that seems most relevant is:

  • A CDC model that projects Ebola deaths under different assumptions about what proportion of cases are effectively isolated. The projection goes only through January 20, and covers only Liberia and Sierra Leone. There are also some other models with broadly similar properties. We initially tried using these models, but now provisionally believe they cannot be used for our purposes (more below).
  • Some basic information on the status of the UN fundraising appeal. As of now, $988 million has been requested; $486 million has been raised and an additional $233 million has been pledged.
  • Some basic information on the World Health Organization (WHO)’s hopes for the containment effort. A recent press briefing with a WHO representative states: “…the numbers we need to get behind are 70:70:60; that number is 70% safe burials, 70% cases being managed and cared for properly; and within 60 days of our start date which for UNMEER we’re taking as 1st October. So, our goal is to have that in place by 60 days which would be 1st December.”

Initially, we tried to focus on using the CDC model to forecast Ebola cases at higher and lower levels of response efforts, which we tried to map to higher or lower levels of funding. However, we ran into several issues here.

  • One fundamental issue is that we know too little about the relationship between “how much money is raised” and “what sort of response is possible”: it might be that the activities most crucial to containing the epidemic can already be funded at current levels, and that additional donations would do relatively little.
  • Another major issue turned out to be that the CDC model already appears to be out of date (and specifically, overly pessimistic). The model incorporates data on cases through late August; reported Ebola cases since then are lower than the model predicted even in the maximal “strong response effort” scenario. It is possible that the recent reports of Ebola cases reflect issues with data collection (for example, perhaps people with Ebola are now avoiding care or healthcare workers are too overwhelmed to report data); but based purely on the numbers, we don’t feel we can use the CDC model to make good forecasts for cost-effectiveness analysis.
  • Even if we resolved the above two issues, there would be major questions remaining. The CDC model covers only two countries, and only through January 20; it does not address cases in Guinea, the possibility that Ebola becomes endemic, or the possibility that Ebola spreads to other countries. We know little about the organizations involved in the response effort and how well they’re performing, and it’s unlikely that we’ll be able to find out much about this question while the epidemic is ongoing.

We have also experimented with using a model published more recently by the Virginia Bioinformatics Institute, but we haven’t yet determined whether this model could be useful. We haven’t been able to compare this model’s predictions to recent reports directly, but it appears to make similar projections to the CDC’s for Ebola cases conditional on strong control as of December 31. We would need a better understanding of the model, and more discussion, in order to determine whether it might be used for a cost-effectiveness estimate (but even if we did use it for such an estimate, the estimate would remain problematic for many of the reasons listed above).
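To make this concrete, here is a stylized version of the kind of model-versus-reports comparison described above. All of the numbers below are hypothetical placeholders (not actual CDC, VBI, or WHO figures); the sketch is meant only to show the structure of the check.

```python
# A minimal sketch of the check described above: are recent case reports
# consistent with a model's projections? All figures are hypothetical
# placeholders, not actual CDC or WHO numbers.

# Cumulative case projections by date under two model scenarios (hypothetical).
projections = {
    "2014-09-30": {"status_quo": 8000, "strong_response": 6000},
    "2014-10-14": {"status_quo": 14000, "strong_response": 8500},
}

# Reported cumulative cases on the same dates (hypothetical).
reported = {"2014-09-30": 7200, "2014-10-14": 8000}

for date, scenarios in projections.items():
    obs = reported[date]
    if obs < scenarios["strong_response"]:
        # Reports below even the most optimistic scenario suggest the model
        # is out of date -- or that reporting is incomplete.
        print(f"{date}: reported {obs} < strong-response projection "
              f"{scenarios['strong_response']}; model looks too pessimistic")
    else:
        print(f"{date}: reported {obs} within the projected range")
```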

At this point, we’re focusing on trying to set up conversations to gain more information about the following questions:

  • If we were to recommend donations to the response effort, how quickly could donations be utilized on the ground? Would they make a difference to the response effort?
  • What would these donations allow that could not be funded otherwise? Would they expand the most important response activities? Should we think of additional donations as having similar impact to the average dollar in the response effort?
  • How significant is the risk that Ebola spreads to other countries and/or becomes endemic? How should we think about the likely longer-term death toll, factoring in unlikely but extremely bad scenarios?
  • Should we infer from recent data that the CDC model was overly pessimistic, or is there another explanation for the low (relative to the CDC model’s projections) reports of further cases?
  • If one does donate to the response effort, to whom, specifically, should one donate?

We’re first trying to see whether we can gain information by speaking with people who aren’t directly involved with the effort, and who can therefore take time to speak with us in a low-stakes way. If necessary, we may put together an estimate of how much money we might be able to raise for the response, in order to give people more information about whether talking to us is worth their time.

How we decide which crises to investigate
When a humanitarian crisis hits the headlines, we usually get a lot of questions along the lines of “How can I help and where should I give?” At the same time, there are several reasons that headline-dominating crises tend not to make for the best giving opportunities, and particularly tend to be a poor fit for our work.

  • The people best positioned to understand, and help with, Ebola response are probably the people who have been working on pandemic containment, developing-world health systems, and other related areas for years before this crisis emerged. The best opportunities to prevent or contain the epidemic probably came before it was widely recognized as a crisis (and perhaps before Ebola had broken out at all – more funding for preventive surveillance could have made a big difference). We’d guess that a similar dynamic holds in general: it takes years to build expertise and context in an area, and the most crucial opportunities to make a difference will often come before the issue is getting widespread attention. In general, we think we’ll find the best giving opportunities by picking good causes to focus on and working on them for years, not by scrambling to catch up on the state of knowledge about an urgent and chaotic situation. As it happens, biosecurity is one of our leading contenders for a focus area, and we have been actively investigating the area for a few months. One of our main focuses is strengthening routine preventive surveillance. However, we are far from having the network and knowledge needed for a rapid diagnosis of the Ebola outbreak.
  • When an issue is getting a lot of media coverage, it often attracts a lot of funding. All else equal, this makes giving less attractive, since we emphasize room for more funding. In past investigations (2010 Haiti earthquake, Japan tsunami), we found evidence that money was not the limiting factor for the relief effort.
  • Urgent issues also tend to be particularly difficult to investigate. The people who know the most about them tend to be extremely busy, and issues tend to be more newsworthy when they are more unprecedented and chaotic.
  • If we do choose to investigate a crisis, we generally need to make the investigation an urgent top priority in order to keep up with developing news. That means high involvement from senior staff and major disruptions to our workflow. It can be worth it, but the costs are high.

In some past crises, we have made major efforts to put out helpful content – particularly the 2010 Haiti earthquake and 2011 Japan tsunami. Our work attracted a fair amount of media coverage, and helped us formulate general principles for disaster relief giving, but it also took a lot of time and did not result in large amounts of donations (in 2011, when we covered both the Japan tsunami and the Somalia famine and recommended Doctors Without Borders for both, we tracked ~$50,000 in money moved to Doctors Without Borders; note that in these cases, we also stated that we did not feel the giving opportunity was as strong as giving to our top charities). We provided more limited coverage of the 2011 Somalia famine and chose to provide only general tips in response to the 2013 Philippines typhoon.

When a crisis starts getting coverage, we weigh factors such as (a) how many people are asking for our views and (b) how much capacity we have for an investigation, as well as (c) the likely “cost per life saved” (or similar metric) for donating to the relief effort.

In the case of the Ebola outbreak, we initially guessed that the outbreak would remain relatively contained, and that ample funding for the relief effort would come in. (High-profile donations from individuals and significant attention from governments both contributed to this view.) Recently, several things have changed:

  • Over the past week, we’ve heard from more people – particularly people who follow GiveWell closely – than we had in previous weeks.
  • The crisis has now been attracting significant attention, yet funding remains substantially below what has been requested: of the $988 million requested in the UN appeal, roughly $270 million has been neither raised nor pledged.
  • The crisis appears quite relevant to our ongoing investigation of preventive surveillance. Many of the people we are speaking to about surveillance are heavily involved in the Ebola response.
  • In light of the above factors, we decided to put some time into a very rough estimate of what the “cost per life saved” might look like for the Ebola response. Some initial calculations indicated that the cost-effectiveness could be quite strong, consistent with the idea that containing a small number of cases now could prevent a large number of cases later. However, in light of our questions about the CDC model (among other issues), we don’t think our estimate is usable, and decided to gather more information along the lines described above.
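For illustration, here is a stylized version of the kind of back-of-envelope calculation described above. Every number is a hypothetical placeholder (as noted, we don’t consider our own estimate usable); the sketch shows only why preventing escalation can translate into a low cost per life saved.

```python
# A stylized back-of-envelope "cost per life saved" calculation for a
# response effort. Every number is a hypothetical placeholder.

marginal_funding = 100e6         # additional donations being considered ($)
deaths_if_contained = 10_000     # eventual toll if containment succeeds
deaths_if_uncontained = 100_000  # toll in a bad scenario (wider spread,
                                 # endemicity)
p_funding_tips_outcome = 0.05    # chance that the marginal funding is what
                                 # makes the difference between the outcomes

expected_deaths_averted = p_funding_tips_outcome * (
    deaths_if_uncontained - deaths_if_contained
)
print(f"~${marginal_funding / expected_deaths_averted:,.0f} per life saved")
# With these placeholders: ~$22,222 per life saved.
```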

Ebola response may be an outstanding use of funds, largely because the right preventive measures could stop the problem from becoming much larger and more costly to contain. The same logic would apply at an even earlier stage – to the strengthening of everyday preventive surveillance, of the kind that could have led to much earlier detection and containment of this epidemic. If that’s right, surveillance could turn out to be an outstanding cause to specialize in, under the heading of the Open Philanthropy Project.

Our ongoing review of Living Goods

Living Goods runs a network of Community Health Promoters (CHPs) who sell health and household goods door-to-door in their communities in Uganda and Kenya, and who also provide basic health counseling. In addition, Living Goods provides consulting and funding to other organizations to run similar networks in other locations. We have been considering Living Goods for a 2014 recommendation.

We’ve now spent a considerable amount of time talking to Living Goods and analyzing documents Living Goods shared with us. This post shares what we’ve learned so far and what questions we’re planning to focus on throughout the rest of our investigation. (For more detail, see our detailed interim review.)

Living Goods has successfully completed the first phase of our investigation process, and we view it as a contender for a recommendation this year. We now plan to (a) make a $100,000 grant to Living Goods (as part of our “top charity participation grants,” funded by Good Ventures) and (b) continue our analysis to determine whether or not we should recommend Living Goods to donors at the end of the year.

Reasons we prioritized Living Goods

Living Goods contacted us a few months ago to inform us that the initial results from a randomized controlled trial (RCT) of its program were available. The headline result from the study was a 25% reduction in under-five mortality, a remarkable effect size.

Questions we hope to answer in our ongoing analysis

How robust is the RCT?

The authors of the RCT have not yet completed the full report on the study, so we have not been able to vet the results in detail. RCTs are generally less prone than other types of studies to methodological issues that severely undermine the results, but they are not immune to such problems. We discuss potential issues with the RCT in our interim review.

The authors are seeking publication in an academic journal and the paper will be embargoed until a journal publishes it. This may mean that we are unable to discuss the details of the study before releasing our 2014 recommendations. We are unsure how strong a recommendation of Living Goods we might make if we were unable to give the details of the main evidence for its impact.

In addition, we don’t want to overemphasize the strength of the evidence provided by a single RCT (even one with no methodological issues). Interventions such as bednets and cash transfers, by contrast, are supported by multiple RCTs and other evidence.

Will future work be as impactful as past work, and how will we know?

There are some reasons to think future results could be worse than RCT results: locations for the RCT were carefully selected, perhaps to maximize impact, and malaria control in Uganda may have improved in recent years. Even if the program is somewhat less effective in the future, it may still be worth supporting.

Our main concern is about both Living Goods’ ability and our own ability to know how well the program is performing in the future. Living Goods asks CHPs to report on activities such as treatments provided and follow-up visits, but because of the incentive structure and the lack of audits on the accuracy of these reports, we put limited weight on these metrics. Living Goods told us that its branch managers conduct randomized follow-ups with clients, but we have not seen documentation from these audits (or other evidence that these checks are happening). We’re not aware of any other monitoring that Living Goods conducts on its program.

Will other funders fill Living Goods’ funding gap?

Living Goods is looking to significantly scale up its program in the next four years. It is in discussions with current funders to see if they will increase their support. It believes it may be able to fund up to two-thirds of its scale-up through these commitments. It is continuing to seek new sources of funding. We may have to make a decision about how much funding to recommend to Living Goods in 2014 before other funders make their decisions known.

If Living Goods raises more than it needs for its scale-up, it would likely use these funds to co-fund partner organizations to start networks of CHP-like agents in other countries. This would be a riskier bet for donors, and it’s not clear how much we can expect to learn about how these programs turn out.

Is the CHP program cost-effective?

Living Goods estimates that its program will have a cost per life saved of $4,773 in 2015, decreasing to $2,773 in 2018. We have made some adjustments to this model to generate our own estimates. We estimate that Living Goods’ cost per life saved will be roughly $11,000 in 2014-2016. Making assumptions that we would guess are particularly optimistic about Living Goods, we estimate the cost per life saved at about $3,300. Pessimistic assumptions lead to an estimate of $28,000 per life saved. (Details in our interim review.) Our work on this model is ongoing.
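To illustrate how different assumptions can produce a range this wide, here is a minimal sketch of the general structure of such an estimate. The adjustment factors below are hypothetical, chosen only to reproduce the rough shape of the range above; the adjustments we actually made are described in our interim review.

```python
# A minimal sketch of how assumption choices translate into a range of
# cost-per-life-saved estimates. The structure and the adjustment factors
# are hypothetical illustrations, not the actual model.

def cost_per_life_saved(total_cost, charity_estimated_lives,
                        effect_adj, attribution_adj):
    """Scale the charity's estimated lives saved by our adjustments."""
    adjusted_lives = charity_estimated_lives * effect_adj * attribution_adj
    return total_cost / adjusted_lives

total_cost = 10e6         # hypothetical program cost ($)
charity_estimate = 2_000  # hypothetical lives saved per the charity's model

for label, effect_adj, attribution_adj in [
    ("optimistic", 1.2, 1.25),   # the charity's model may be conservative
    ("central", 0.6, 0.75),      # discounts for replicability, attribution
    ("pessimistic", 0.4, 0.45),  # steeper discounts
]:
    cpls = cost_per_life_saved(total_cost, charity_estimate,
                               effect_adj, attribution_adj)
    print(f"{label}: ~${cpls:,.0f} per life saved")
# Prints roughly $3,333 / $11,111 / $27,778 -- the shape of the range above.
```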

Our guess is that Living Goods’ program is in the same range as (though slightly less cost-effective than) the most cost-effective programs we have considered, such as bednets, deworming, and iodization.

(See our page on cost-effectiveness for more on the role these estimates play in our recommendations.)

Expert philanthropy vs. broad philanthropy

It seems to me that the most common model in philanthropy – seen at nearly every major staffed foundation – is to have staff who specialize in a particular cause (for example, specializing in criminal justice policy). Often, such staff have a very strong background in the cause before they come to the foundation, and they generally seem to focus their time exclusively on one cause – to the point of becoming (if they weren’t already) an expert in it.

I think this model makes a great deal of sense, partly for reasons we’ve discussed previously. Getting to know the people, organizations, literature, challenges, etc. most relevant to a particular cause is a significant investment – a “fixed cost” that can then make one more knowledgeable about all giving opportunities within that cause. Furthermore, evaluating and following a single giving opportunity can be a great deal of work. Now that the Open Philanthropy Project has made some early grants, it is hitting home just how many questions we could – and, it feels, should – ask about each. If we want to follow each grant to the best of our abilities, we’ll need to allocate a lot of staff time to each; having staff specialize in causes is likely the only way to do so efficiently.

Yet I’m not convinced that this model is the right one for us. Depth comes at the price of breadth. With our limited management capacity, following each grant to the best of our abilities shouldn’t be assumed to be the right approach. I’ve been asking myself whether there’s a way to be involved in many more causes at a much lower level of depth, looking for the most outstanding giving opportunities to come along in the whole broad set of causes. I’ve been thinking about this question recently mostly in the context of policy, which will be the focus of this post.

Having a “low-depth” involvement in a given issue could take a number of forms – for example:

  • One might make a concerted effort to identify a small number of “big bets” related to an issue, and focus effort on following these “big bets.”
  • One might make a concerted effort to identify a small number of “gaps” – aspects of an issue that get very little attention and have very few people working on them – and focus grantmaking activity on these “gaps.” This approach could be consistent with making a relatively large number of grants in the hopes that some grantee gains traction.
  • One might focus on identifying a trusted advisor in an issue space, and make a small number of grants as recommended by the advisor (this is largely the approach behind our grants so far on labor mobility).
  • One might co-fund the work of another major funder, join a collaboration of major funders, or support the work of a large and established organization, and gain more familiarity with the issue over time by following this partner’s work.
  • One might aim for a very basic level of understanding of an issue – in particular, which way we would like to see policy change relative to the status quo, and whom we feel aligned enough with to take their advice. With this understanding in hand for multiple issues, one might then be well-positioned to support: (a) “cross-issue” organizations and projects that are likely to have a small impact on many issues; (b) campaigns aiming to take advantage of short-term “windows of opportunity” that arise for various issues.

I can see a few arguments in favor of trying one or more of these, all of which make it possible to take some form of a “breadth”-oriented approach (more causes, with a lower degree of depth and expertise, than the standard cause-specialist approach would involve).

First and most importantly, we will never know as much about grantees’ work as they do, and it arguably makes more sense to think of grantees as the relevant experts. The best funder might be the one who picks qualified grantees in an important cause, supports them and otherwise stays out of their way. With this frame in mind, focusing on in-house expertise is arguably inefficient (in the sense that our expertise would become somewhat redundant with grantees’) and possibly even counterproductive (in the sense that it could lead us to be overly “active” with grantees, pushing them toward our theory of the case).

Of course, picking qualified grantees is a serious challenge, and one that is likely harder without deep context. But the question is how much additional benefit deep context provides. Even without expertise, it is possible to get some signals of grantee quality – general reputation, past accomplishments, etc. – and even with expertise, there will be a great deal of uncertainty. In a high-risk model of the world, where perhaps 10% of one’s grants will account for 90% of one’s impact, it may be better to pick “potentially outstanding” grantees from a relatively broad space of possibilities than to limit oneself to a narrower space within which one has more precise and reliable ways of distinguishing marginally better from marginally worse giving opportunities.
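This intuition can be made concrete with a toy simulation (every parameter below is an illustrative assumption, not an estimate): when true impact is heavy-tailed, choosing somewhat noisily from a broad pool of candidates can outperform choosing precisely from a narrow one.

```python
# A toy simulation of the argument above: when impact is heavy-tailed
# ("10% of grants account for 90% of impact"), noisy selection from a
# broad pool can beat precise selection from a narrow pool. All
# parameters are illustrative assumptions.
import random

random.seed(0)

def realized_impact(pool_size, eval_noise_sd, n_grants=10, trials=2000):
    total = 0.0
    for _ in range(trials):
        # Heavy-tailed true impacts: a few candidates dominate.
        true = [random.lognormvariate(0, 2) for _ in range(pool_size)]
        # The funder ranks candidates by a noisy signal of true impact.
        signal = [t * random.lognormvariate(0, eval_noise_sd) for t in true]
        chosen = sorted(range(pool_size), key=lambda i: -signal[i])[:n_grants]
        total += sum(true[i] for i in chosen)
    return total / trials

print("narrow pool, precise evaluation:", realized_impact(20, 0.5))
print("broad pool, noisy evaluation:   ", realized_impact(100, 1.5))
```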

Expertise would also be an advantage for following a grant, learning from it and continuing to help grantees as they progress. However, it seems quite possible to me that the best grantees tend to be self-driven and improvisatory, such that following them closely wouldn’t add value to what they’re doing, and would largely serve to assuage our own anxiety without doing much to increase our impact.

Secondly, the best giving opportunities may sometimes cut across multiple causes and be hard to assess if we’ve engaged seriously with only a small number of causes. This issue seems particularly important to me in the area of U.S. policy, where the idea of strengthening the network of people who share our values – or the platform representing those values – could be very important. If we focus exclusively on a small number of policy areas, and give little attention to others, we could end up lacking the knowledge and networks to perform well on this goal, and we could be ill-positioned to evaluate the ramifications of a giving opportunity for the full set of issues we care about. (An argument for pursuing both breadth- and depth-oriented strategies simultaneously is that the depth-oriented work may surface opportunities that are relevant to a large number of issues, and the breadth-oriented work could then be helpful in assessing such opportunities.)

Finally, it seems to us that there are some issue areas where the giving opportunities are quite limited – particularly issues that we think of as green fields, as well as neglected sub-areas of other issues. Devoting a full staff member to such an issue would pose particular risks in terms of inefficiency, and it might be better to fund the few available opportunities while waiting for more to emerge.

I think the cases of Ed Scott and the Sandler Foundation represent interesting examples of what a philanthropist can accomplish despite not specializing exclusively in a particular cause, and despite not building out a staff of domain experts.

  • Ruth Levine of the Hewlett Foundation writes that Ed Scott has “built at least four excellent organizations from the ground up” – including the Center for Global Development, which we have supported and think positively of. She adds that “Far more than many others seem to be able to do, he lets go – and as he does, the organizations he supports go further and faster than if he were holding on tight.”
  • We know less about the Sandler Foundation, but it seems to have played a founding role in several prominent organizations and to be well-respected by many, despite not having staff who specialize in a particular cause over the long run. It does do deep cause investigations in sequence, in order to identify promising grantees, but staff work on new cause investigations even while maintaining their funding of previous causes and organizations; this approach therefore seems distinct from the traditional foundation model and can be thought of as one approach to the kind of “broad” work outlined here. One of its core principles is that of looking for excellence in organizations and in leadership, and entrusting those it supports with long-term, flexible support (rather than continuously revisiting and revising the terms of grants).

In both cases, from what we can tell (and we are considering trying to learn more via case studies), a funder helped create organizations that shared a broad set of values but weren’t focused on a particular policy issue; the funder did not appear to become or hire a domain expert, and may have been more effective by being less hands-on than is the norm among major foundations. My point isn’t that these funders should be emulated in every way (I know relatively little about them), but that the “cause-focused, domain expert” model of grantmaking is not the only viable one.

I’m not yet sure of exactly what it would look like for us to try a breadth-emphasizing model, and I know that we don’t want this to be the only model we try. The depth-emphasizing model has much to recommend it. I can anticipate that, in some ways, a breadth-emphasizing model could be both genuinely risky and psychologically challenging, as we’d have a lower level of knowledge about our grants than many foundations have of theirs. But I think the potential benefits are big, and I think this idea is worth experimenting with.

Our ongoing review of Development Media International

Development Media International (DMI) produces radio and television programs in developing countries that encourage people to adopt improved health practices, such as exclusive breastfeeding of infants and seeking treatment for symptoms associated with fatal diseases. The program aims to reduce mortality of children under five years old.

In May, we wrote that we were considering DMI for a 2014 top charity recommendation. We’ve now spent a considerable amount of time talking to DMI and analyzing documents DMI shared with us. This post shares what we’ve learned so far and what questions we’re planning to focus on throughout the rest of our investigation. (For more, see our detailed interim review.)

DMI has successfully completed the first phase of our investigation process and we view it as a contender for a recommendation this year. We now plan to (a) make a $100,000 grant to DMI (as part of our “top charity participation grants,” funded by Good Ventures) and (b) continue our analysis to determine whether or not we should recommend DMI to donors at the end of the year, including conducting a site visit to Burkina Faso.

Reasons we prioritized DMI

We’ve long been interested in programs that aim to use mass media (e.g., radio or television programming) to promote and disseminate messages on potentially life-saving practices. It’s quite plausible to us that messages on TV or the radio could influence behavior, and could reach large numbers of people at relatively low cost, leading to high cost-effectiveness in terms of lives saved or improved per dollar spent. However, we previously deprioritized our work in this area due to limitations in the available evidence of effectiveness.
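A minimal sketch of the arithmetic behind that intuition, using hypothetical placeholder numbers rather than DMI’s figures: even a small per-person effect can yield a low cost per life saved when the cost per person reached is very low.

```python
# Why mass media could be highly cost-effective: a small per-listener
# effect times a very low cost per person reached. All numbers are
# hypothetical placeholders, not DMI's figures.

campaign_cost = 5e6           # total campaign cost ($)
people_reached = 5e6          # people reached by the broadcasts
under5_share = 0.18           # fraction of the population under five
baseline_u5_mortality = 0.01  # annual deaths per child under five
mortality_reduction = 0.05    # relative reduction from behavior change

deaths_averted = (people_reached * under5_share
                  * baseline_u5_mortality * mortality_reduction)
print(f"~${campaign_cost / deaths_averted:,.0f} per life saved")
# With these placeholders: ~$11,111 per life saved.
```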

DMI is currently conducting a randomized controlled trial (RCT) of its program, preliminary results from which became available in April.

Questions we hope to answer in our ongoing analysis

How robust are the midline results from the RCT?

Our level of confidence in the success of DMI’s program rests heavily on the midline results from the RCT, but there are reasons these results should be interpreted with caution. In particular:

  • The treatment group (i.e., the regions that were randomly selected to hear DMI’s broadcasts) had noticeably higher levels of child mortality and less access to healthcare at baseline than the control group. Details in our interim review.
  • While DMI plans to collect data on mortality, the only results reported thus far are based on self-reported behavior change, the reliability of which is questionable.

Many details of the RCT are not yet available publicly as the study is ongoing, and we have a number of questions about it that could affect our view of DMI’s impact. In particular, we would like to know more about the activities of other health programs in Burkina Faso during the trial period, and the extent to which the midline results are driven by certain villages or regions versus consistent behavior change across all participating areas.

It’s also important to note that the evidence for DMI’s program relies heavily on a single unpublished RCT; interventions such as bednets and cash transfers are supported by multiple peer-reviewed RCTs and other evidence.

How representative is DMI’s impact in Burkina Faso of its likely impact in other countries?

There are some reasons to expect that DMI’s future results will vary from the RCT. For example, much of DMI’s expected impact comes from behavior changes that require access to health services or products to be effective, such as seeking treatment when a child displays symptoms of malaria. DMI’s ability to predict access in other countries is critical to predicting impact, and may be limited. Details in our interim review.

How cost-effective is DMI’s program?

We have not yet completed a full cost-effectiveness estimate for DMI’s work, but plan to do so for our final review of DMI.

DMI estimates that the cost-effectiveness of its intervention is extremely strong relative to other cost-effective interventions (for example, more than 10x stronger than our estimate for our strongest top charities). We expect our final estimate of DMI’s “cost per life saved” to be substantially less optimistic, though still within the range of our current priority programs. Details in our interim review.

Will future work be as impactful as past work, and how will we know?

We do not know how DMI will design its attempts to measure its future programs’ impact on behavior change.


A promising study on the long-term effects of deworming

This year, Dr. Kevin Croke, a post-doctoral fellow at the Harvard School of Public Health, released a study that we consider an important addition to the evidence for deworming children. The study (Croke 2014) followed up on a randomized controlled trial (RCT) of a deworming program in Uganda and found that, 7 to 8 years later, children living in treatment areas scored higher on tests of literacy and numeracy than children living in control areas. This finding reinforces the findings of the only other RCT examining the long-term effects of deworming, which we had previously considered to be relatively strong but still had substantial reservations about. By providing a second data point that is free of some of our previous concerns, Croke 2014 substantially changes our view of the evidence.

Two of our top charities, the Deworm the World Initiative (DtWI) led by Evidence Action and the Schistosomiasis Control Initiative (SCI), focus on deworming. We have not yet concluded our examination of Croke 2014, but at this point we think it is likely to lead us to view these charities as more cost-effective.

Overview of Croke 2014

Croke 2014 follows up on an RCT that involved 48 parishes (the administrative level above the village and below the sub-county) in 5 districts of Uganda, selected based on the prevalence of worms in those districts. Half of the parishes were randomly assigned to a treatment group and the other half to a control group. In all the districts, community organizations delivered basic health services, like vitamin A supplementation, vaccination, and growth monitoring, through regular Child Health Days (CHDs). Children in the treatment group received albendazole (a deworming drug) during CHDs in addition to the other services offered, while children in the control group received only the usual services.

Croke analyzed surveys conducted by an education nonprofit several years later that happened to sample 22 of the parishes in the original RCT. He compared children in the sampled treatment parishes who were 1 to 7 years old at the time of the program (the age group offered albendazole) to children of the same age in the sampled control parishes, and found that children in the treatment group scored about 1/3 of a standard deviation higher on tests of literacy and numeracy.

Strengths and significance

Few other studies have rigorously examined the long-term effects of deworming. Up until now, we’ve relied heavily on two studies: (a) Bleakley 2004, a study of the Rockefeller Sanitary Commission’s campaign to eradicate hookworm in the American South in the early 20th century; (b) a series of studies in Kenya, in which school deworming was rolled out on a purposefully arbitrary (randomization-like) basis, and children who received more years of deworming were compared to children who had received fewer. These studies suggest the possibility that deworming children dramatically improves their productivity later in life by subtly improving their development throughout childhood. In our view, the case for deworming largely rests on these long-term, developmental effects, because the intervention seems to have few obvious short-term benefits.

Having two relatively recent RCTs from Sub-Saharan Africa increases our confidence in long-term benefits far more than having just one RCT, especially because we have had substantial reservations about the RCT in Kenya – some of which seem notably less applicable to Croke 2014. Specifically:

  • The earlier RCT was a trial of “combination deworming” – treatment of both schistosomiasis (with praziquantel) and soil-transmitted helminths (with albendazole). Croke 2014 looks only at albendazole. This is particularly important because one of our current top charities – the Deworm the World Initiative – operates largely in India, where only albendazole is used.
  • Regarding the earlier study, we also thought it was plausible that efforts to encourage students to attend school in order to receive treatment might have accounted for some of the effect found in Baird et al 2012 (the follow-up of Miguel and Kremer 2004). The intervention examined in Croke 2014 appears far less likely to have introduced other positive changes into the treatment group, because it involved adding albendazole to an existing program, rather than an intensive school-based deworming program in the treatment group compared with no program at all in the control group.
  • We worried that the results of Miguel and Kremer 2004 (the RCT in Kenya) might not generalize to other areas, because of extraordinary flooding caused by the El Niño climate pattern during the study and abnormally high infection rates in the study area (more). Croke 2014 appears to have had a lower (though still high) initial prevalence of infections (the programs selected districts based on high rates of worms found by Kabatereine et al 2001, which estimated that 60% of children ages 5-10 were infected, primarily with hookworm). El Niño may still have affected the study, however, because the parishes examined in Croke 2014 are very close (some within about 10 miles) to the district in which Miguel and Kremer 2004 took place. The program evaluated in Croke 2014 started about 2 years after El Niño, but we’re not sure whether this amount of lag time would lead to lower or higher infection rates.

In our current cost-effectiveness analyses for deworming, we include a “replicability adjustment” to account for the possibility that Baird et al 2012 wouldn’t necessarily hold up on replication, as well as an “external validity adjustment” to account for the fact that most deworming programs likely take place in less heavily infected areas. We will be revisiting both of these adjustments, which will likely result in stronger estimated cost-effectiveness for deworming.
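To illustrate how these adjustments enter the calculation (with hypothetical values, not our actual figures): the adjustments discount the face-value benefit multiplicatively, so relaxing both of them in light of Croke 2014 can raise the bottom line substantially.

```python
# A minimal sketch of how the two adjustments enter a cost-effectiveness
# estimate. The adjustment values are hypothetical placeholders.

base_benefit_per_dollar = 1.0  # benefit implied by taking Baird et al 2012
                               # at face value (arbitrary units)

def adjusted_benefit(replicability_adj, external_validity_adj):
    # Each adjustment discounts the face-value benefit multiplicatively.
    return base_benefit_per_dollar * replicability_adj * external_validity_adj

before = adjusted_benefit(replicability_adj=0.3, external_validity_adj=0.5)
# A second supportive RCT might justify milder discounts (hypothetical):
after = adjusted_benefit(replicability_adj=0.5, external_validity_adj=0.6)
print(f"estimated benefit rises by {after / before - 1:.0%}")  # +100% here
```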

Remaining questions

We still have some concerns about the evidence.

  • El Niño may have affected the parishes examined in Croke 2014 just as it affected the schools in Miguel and Kremer 2004, potentially causing unrepresentatively high infection rates and limiting the generalizability of both studies.
  • Though Croke 2014 finds a large increase in test scores, Baird et al 2012 does not.
  • We worry about the sensitivity of the results to the outcome and control variables used in regressions and about selective reporting of results.
  • We also worry about publication bias. Perhaps other parish-level surveys would have supplied other outcomes. We wonder if other analyses employing a similar methodology that did not find an effect would have been published.
  • The study included a relatively small number of clusters. Croke 2014 reports on a few different regressions and methods of calculating the standard error of the treatment effect, which lead to different estimates of that standard error. In one more conservative analysis, for instance, the effect on the combined literacy and numeracy test scores is significant only at the 90% confidence level (this sensitivity is illustrated in the sketch after this list).
  • Finally, we still can’t articulate any mechanism for the long-term benefits of deworming supported by data (we haven’t seen notable impacts on weight or other health or nutrition measures).
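To illustrate the sensitivity noted in the fifth point above: the same point estimate can clear the 95% threshold under one standard error calculation but only the 90% threshold under a more conservative one. The standard errors below are hypothetical, and we use a normal approximation for simplicity (with few clusters, a t distribution with few degrees of freedom would be more appropriate).

```python
# How the standard error calculation changes apparent significance.
# The SE values are hypothetical; the effect size matches the ~1/3 SD
# result reported above.
import math

def two_sided_p(effect, se):
    # Two-sided p-value under a normal approximation.
    z = abs(effect / se)
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

effect = 0.33  # treatment effect in standard deviations
for label, se in [("less conservative SE", 0.13),
                  ("more conservative SE", 0.18)]:
    print(f"{label}: z = {effect / se:.2f}, p = {two_sided_p(effect, se):.3f}")
# Prints p ~ 0.011 (significant at 95%) vs. p ~ 0.067 (only at 90%).
```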

Bottom line

We have not yet concluded our examination of Croke 2014, though we have looked it over closely enough to feel that it is very likely to result in a substantial positive revision of our view on deworming, and therefore of our views on two current top charities as well.

In combination with the earlier study, Croke 2014 represents a major update regarding the case for deworming; we’re very glad to see this new evidence generated, and hope that it will become a prominent part of the dialogue around deworming. We intend to do our part by updating our content for this giving season.