We’ve long had mixed feelings about cost-effectiveness estimates of charitable programs, i.e., attempts to figure out “how much good is accomplished per dollar donated.”
The advantages of these estimates are obvious. If you can calculate that program A can help many more people – with the same funds, and in the same terms – than program B, that creates a strong case (arguably even a moral imperative) for funding program A over program B. The problem is that by the time you get the impact of two different programs into comparable “per-dollar” terms, you’ve often made so many approximations, simplifications and assumptions that a comparison isn’t much more meaningful than a roll of the dice. In such cases, we believe there are almost always better ways to decide between charities.
This post focuses on the drawbacks of cost-effectiveness estimates. I’m going to go through the details of what we know about one of the best-known, most often-cited cost-effectiveness figures there is: the cost per disability-adjusted life-year (DALY) for deworming schoolchildren. The DALY is probably the single most widely cited and accepted “standardized” measure of social impact within the unusually quantifiable area of health.
Note that various versions of this figure:
- Occupy the “top spot” in the Disease Control Priorities Report’s chart of “Cost-effectiveness of Interventions Related to Low-Burden Diseases” (see page 42 of the full report). (I’ll refer to this report as “DCP” for the rest of this post.)
- Are featured in a policy briefcase by the Poverty Action Lab (which we are fans of), calling deworming a “best buy for education and health.”
- Appear to be the primary factor in the decision by Giving What We Can (a group that promotes both more generous and more intelligent giving) to designate deworming-related interventions as its top priority (see the conclusion of its report on neglected tropical diseases), and charities focused on these interventions as its two top-tier charities.
I don’t feel that all the above uses of this figure are necessarily inappropriate (details in the conclusion of this post). But I do feel that they make the figure worth inspecting closely, and that it is important to be aware of the following issues.
- The estimate is likely based on successful, thoroughly observed programs and may not be representative of what one would expect from an “average” deworming program.
- The estimate appears to rely on an assumption of continued successful treatment over time, an assumption which could easily be problematic in certain cases.
- A major input into the estimate is the prevalence of worm infections. In general, prevalence data is itself the product of yet more estimations and approximations.
- Many factors in cost-effectiveness, positive and negative, appear to be ignored in the estimate simply because they cannot be quantified.
- Different estimates of the same program’s cost-effectiveness appear to strongly contradict each other.
Issue 1: the estimate is likely based on successful, thoroughly observed programs.
The Poverty Action Lab estimate of $5 per DALY is based on a 2003 study by Miguel and Kremer of a randomized controlled trial in Kenya. As the subject of an unusually rigorous evaluation, this program likely had an unusual amount of scrutiny throughout (and may also have been picked in the first place partly for its likelihood of succeeding). In addition, this program was carried out by a partnership between the Kenyan government and a nonprofit, ICS (pg 165), that has figured prominently in numerous past evaluations (for example, see this 2003 review of rigorous studies on education interventions).
In this sense, it seems reasonable to view its results as “high-end/optimistic” rather than “representative of what one would expect on average from a large-scale government rollout.”
Note also that the program included a significant educational component (pg 169). The quality of hygiene education, in particular, might be much higher in a closely supervised experiment than in a large-scale rollout.
It is less clear whether the same issue applies to the DCP estimate, because the details and sources for the estimate are not disclosed (see box on page 476). However,
- The other studies referenced throughout the chapter appear to be additional “micro-level” evaluations – i.e., carefully controlled and studied programs – as opposed to large-scale government-operated programs.
- The DCP’s cost-effectiveness estimate for combination deworming (the program most closely resembling the program discussed in Miguel & Kremer) is very close to the Miguel & Kremer estimate of $5 per DALY. (There is some ambiguity on this point – more on this under Issue 5 below.)
Issue 2: the estimate appears to rely on an assumption of continued successful treatment over time, an assumption which could easily be problematic in certain cases.
Miguel & Kremer states:
single-dose oral therapies can kill the worms, reducing … infections by 99 percent … Reinfection is rapid, however, with worm burden often returning to eighty percent or more of its original level within a year … and hence geohelminth drugs must be taken every six months and schistosomiasis drugs must be taken annually. (pg 161)
Miguel & Kremer emphasizes the importance of externalities (i.e., the fact that eliminating some infections slows the overall transmission rate) in cost-effectiveness (pg 204), and it therefore seems important to ask whether the “$5 per DALY” estimate is made (a) assuming that periodic treatment will be sustained over time, or (b) assuming that it won’t be.
Miguel & Kremer doesn’t explicitly spell out the answer, but it seems fairly clear that (a) is in fact the assumption. The study states that the program averted 649 DALYs (pg 204) over two years (pg 165), of which 99% could be attributed to aversion of moderate-to-heavy schistosomiasis infections (pg 204). Such infections have a disability weight of 0.006 per year, so this is presumably equivalent to averting over 100,000 years ((649*99%)/0.006) of schistosomiasis infection – even though well under 10,000 children were even loosely included in the project (including control groups and including pupils near but not included in the program – see pg 167). Even if a higher-than-standard disability weight was used, it seems fairly clear that many years of “averted infection” were assumed per child.
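As a rough check on the arithmetic above, here is a minimal sketch in Python. The input figures are the ones quoted from Miguel & Kremer in the preceding paragraph; the variable names are my own, and the 10,000-child figure is only the loose upper bound mentioned above ("well under 10,000"), not a count taken from the study.

```python
# Figures as quoted above from Miguel & Kremer.
dalys_averted = 649          # total DALYs averted by the program (pg 204)
schisto_share = 0.99         # share attributed to moderate-to-heavy schistosomiasis
disability_weight = 0.006    # DALYs lost per year of such an infection

# Years of infection that must have been averted to produce those DALYs.
infection_years = dalys_averted * schisto_share / disability_weight

# Assumption: a deliberately generous upper bound on children involved,
# since the text only says "well under 10,000".
children_upper_bound = 10_000
min_years_per_child = infection_years / children_upper_bound

print(f"Implied infection-years averted: {infection_years:,.0f}")
print(f"Implied years averted per child (lower bound): {min_years_per_child:.1f}")
```

Even with the generous upper bound on the number of children, the implied figure comes out above ten averted infection-years per child, which is far more than a two-year program could deliver without assuming sustained treatment.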
In my view, this is the right assumption to make in creating the cost-effectiveness estimate … as long as the estimate is used appropriately, i.e., as an estimate of how cost-effective a deworming program would be if carried out in a near-ideal way, including a sustained commitment over time.
However, it must be noted that sustaining a program over time is far from a given, especially for organizations hoping for substantial and increasing government buy-in over time. As we will discuss in a future post, one of the major deworming organizations appears to have aimed to pass its activities to the government, with unclear/possibly mixed results. And as we have discussed before, there are vivid examples of excellent, demonstrably effective projects failing to achieve sustainability in the past.
Does the DCP’s version of the estimate make a similar assumption? Again, we do not have the details of the estimate, but the DCP chapter – like the Miguel & Kremer paper – stresses the importance of “Regular chemotherapy at regular intervals” (pg 472).
One more concern along these lines: even if a program is sustained over time, there may be “diminished efficacy with frequent and repeated use … possibly because of anthelmintic resistance” (pg 472).
Extrapolation from a short-term trial to long-term effects is probably necessary to produce an estimate, but it further increases the uncertainty.
Issue 3: cost-effectiveness appears to rely on disease incidence/prevalence data that itself is the product of yet more estimations and approximations.
The Miguel & Kremer study took place in an area with extremely high rates of infection: 80% prevalence of schistosomiasis (where schistosomiasis treatment was applied), and 40-80% prevalence of three other infections (see pg 168). The DCP emphasizes the importance of carrying out the intervention in high-prevalence areas (for example, see the box on page 476). Presumably, the program should be carried out in areas with the highest possible prevalence for maximum cost-effectiveness.
The problem is that prevalence data may not be easy to come by. The Global Burden of Disease report describes using a variety of elaborate methods to estimate prevalence, including “environmental data derived from satellite remote sensing” as well as mathematical modeling (see pg 80). Though I don’t have a source for this statement, I recall either a conversation or a paper making a fairly strong case that data on neglected tropical diseases is particularly spotty and unreliable, likely because it is harder to measure morbidity than mortality (the latter can be collected from death records; the former requires more involved examinations and/or judgment calls and/or estimates).
Issue 4: many factors in cost-effectiveness appear to be ignored in the estimate simply because they cannot be quantified.
Both positive and negative factors have likely been ignored in the estimate, including:
- Possible negative health effects of the deworming drugs themselves (DCP pg 479). (Negative impact on cost-effectiveness)
- Possible development of resistance to the drugs, and thus diminishing efficacy, over time (mentioned above). (Negative impact on cost-effectiveness)
- Possible interactions between worm infections and other diseases including HIV/AIDS (DCP pg 479), which may increase the cost-effectiveness of deworming. (Positive impact on cost-effectiveness)
- The question of whether improving some people’s health leads them to contribute back to their families, communities, etc. and improve others’ lives. This question applies to any health intervention, but not necessarily to the same degree, since different programs affect different types of people. From what I’ve seen, there is very little available basis for making any sorts of estimates of such differences.
Issue 5: different estimates of the same program’s cost-effectiveness appear to strongly contradict each other.
The DCP’s summary of cost-effectiveness alone (box on pg 476) raises considerable confusion:
the cost per DALY averted is estimated at US $3.41 for STH infections [the type of infection treated with albendazole] … The estimate of cost per DALY is higher for schistosomiasis relative to STH infections because of higher drug costs and lower disability weights … the cost per DALY averted ranges from US$3.36 to US$6.92. However, in combination, treatment with both albendazole and PZQ proves to be extremely cost-effective, in the range of US$8 to US$19 per DALY averted.
The language seems to strongly imply that the combination program is more effective than treating schistosomiasis alone, but the numbers given imply the opposite. Our guess is actually that the numbers are inadvertently switched. To one taking the numbers too literally, the expected “cost-effectiveness” of a donation could be off by a factor of 2-5 depending on this question of copy editing.
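To give a sense of how much rides on whether the two ranges were switched, here is a minimal sketch that compares them endpoint by endpoint. The dollar figures are those in the DCP passage quoted above; which endpoints to pair up is my own assumption, since the DCP does not say how the ranges correspond.

```python
# Ranges as printed in the DCP box (US$ per DALY averted).
schisto_as_printed = (3.36, 6.92)   # schistosomiasis treatment alone
combo_as_printed = (8.0, 19.0)      # albendazole + PZQ combination

# If the two ranges were inadvertently switched, a reader's estimate of
# either program's cost per DALY would be off by the ratio between them.
low_to_low = combo_as_printed[0] / schisto_as_printed[0]
high_to_high = combo_as_printed[1] / schisto_as_printed[1]
widest_gap = combo_as_printed[1] / schisto_as_printed[0]

print(f"Low end vs. low end:   {low_to_low:.1f}x")
print(f"High end vs. high end: {high_to_high:.1f}x")
print(f"Widest possible gap:   {widest_gap:.1f}x")
```

Pairing like endpoints gives a gap of a bit more than 2x, and the most extreme pairing gives a gap of more than 5x.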
Comparing this statement with the Miguel & Kremer study adds more confusion. The DCP estimates albendazole-only treatment at $3.41 per DALY, which appears to be better than (or at least at the better end of the range for) the combination program. However, Miguel & Kremer estimates that albendazole-only treatment is far less effective than the combination program, at $280 per DALY (pg 204).
Perhaps the DCP envisions albendazole treatment carried out in a different way or in a different type of environment. But given that the Miguel & Kremer study appears to be examining a fairly suitable environment for albendazole-only treatment (see above comments about high infection prevalence and strong program execution), this would indicate that cost-effectiveness is extremely sensitive to subtle changes in the environment or execution.
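The size of the disagreement between the two albendazole-only figures can be made concrete with a small sketch. The per-DALY costs are the ones cited above; the $1,000 donation is a purely hypothetical amount chosen for illustration, and the variable names are my own.

```python
# Cost per DALY averted (US$), as cited above.
dcp_albendazole = 3.41    # DCP estimate for albendazole-only (STH) treatment
mk_albendazole = 280.0    # Miguel & Kremer estimate for albendazole-only (pg 204)

# How many times more expensive per DALY one estimate is than the other.
disagreement = mk_albendazole / dcp_albendazole

# Hypothetical donation, used only to show what each estimate would imply.
donation = 1000.0
dalys_if_dcp = donation / dcp_albendazole
dalys_if_mk = donation / mk_albendazole

print(f"Estimates differ by a factor of about {disagreement:.0f}")
print(f"DALYs averted per ${donation:,.0f} under DCP figure: {dalys_if_dcp:.0f}")
print(f"DALYs averted per ${donation:,.0f} under M&K figure: {dalys_if_mk:.1f}")
```

Under these inputs the two estimates of nominally the same intervention differ by a factor of roughly 80, which is the kind of sensitivity to environment or execution described above.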
There is a lot of uncertainty in this estimate, and this uncertainty isn’t necessarily “symmetrical.” Estimates of different programs’ cost-effectiveness, in fact, could be colored by very different degrees of optimistic assumptions.
Despite all of the above issues, I don’t find the cost-effectiveness estimate discussed here to be meaningless or useless.
Researchers’ best guesses put the cost-effectiveness of deworming in the same ballpark as that of other high-priority interventions such as vaccines, tuberculosis treatment, etc. (I do note that many of these appear to have more robust evidence bases behind their cost-effectiveness – for example, estimated effects of large-scale government programs are sometimes available, giving an extra degree of context.)
I think it is appropriate to say that available evidence suggests that deworming can be as cost-effective as any other health intervention.
I think it is appropriate to call deworming a “best buy,” as the Poverty Action Lab does.
I do not think it is appropriate to conclude that deworming is more cost-effective than vaccinations, tuberculosis treatment, etc. I think it is especially inappropriate to conclude that deworming is several times more cost-effective than vaccinations, tuberculosis treatment, etc.
Most of all, I do not think it is appropriate to expect results in line with this estimate just because you donate to a deworming charity. I believe cost-effectiveness estimates usually represent “what you can achieve if the program goes well” more than they represent “what a program will achieve on average.”
In my view, the greatest factor behind the realized cost-effectiveness of a program is the specifics of who carries it out and how.