Evidence of Impact for Long-Term Benefits

We’ve recently published our updated review of the evidence on cash transfers. It elaborates on a claim we’ve made previously – that there is evidence for long-term benefits from cash transfers at high average rates of return.

Some people have expressed skepticism of this evidence, pointing to several limitations: there are not many studies, some of the key data comes from people’s reports of their own spending, and programs studied may not be representative of GiveDirectly’s program.

We think that some of these limitations are less concerning than they may appear at first glance. More importantly, though, the same limitations broadly apply to the evidence of long-term impact (aside from bednets’ impact on mortality) for our recommended health interventions. The situations are not exactly analogous, but the question of which interventions have stronger vs. weaker evidence for long-term impact (aside from bednets’ impact on mortality) does not have an obvious answer.

We speculate that individual donors instinctively imagine that the evidence around many programs is more robust than it is. (We know that we did so when we first started GiveWell.) If this is the case, we’re glad that our recommendation of a cash transfer program – which many people find intuitively unappealing – has prompted some of our followers to take a closer, more skeptical look at this evidence.

Determining that a given intervention – whether health, cash transfers, or anything else – has long-lasting impacts on quality of life is extremely challenging for a multitude of reasons. It requires researchers to track the same people for a long period of time and to collect accurate and relevant data from these people that can shed light on their quality of life. It requires both funders and researchers to have substantial patience and foresight, making plans 5-10 years in advance (and a lot can change in 5-10 years). As a result, informative data about long-term impacts can be hard to come by, and interpreting such data requires substantial judgment calls.

(An easier form of long-term impact to assess is the impact on mortality. Since most people who make it past their fifth birthday live past age 60, we believe it is relatively safe to equate averting a child’s death to long-term impact. However, different people have different intuitions on how to value averting the death of a child under five vs. improving someone’s long-term quality of life. In addition, while there is strong evidence that bednets avert mortality, the case for deworming rests on life improvement. In this post, we focus on the evidence for long-term life improvement rather than on the evidence for mortality reduction, which is quite robust for bednets.)

Learning about the limitations described in this post has made us (a) more confident that using rigid criteria and definitions of “evidence-based” is the wrong path; (b) more favorably inclined toward interventions that seem to require unusually low burdens of proof (a description that we believe all of our top charities currently fit).

The rest of this post goes into more detail on the limitations to the evidence of long-term impact for cash transfers, and how these compare to the limitations to the evidence for bednet distribution and deworming.

We will discuss the relative cost-effectiveness of cash transfers next week.

Limitation 1: not many studies

The case for the long-term benefits of cash transfers rests largely on one high-quality (randomized) long-term study of conditional cash transfers as well as one high-quality long-term study of unconditional grants to microenterprises. Neither of the programs studied is exactly like GiveDirectly’s program, and both may have taken place in a substantially different context; this issue is discussed in the following section. This section addresses the simple fact that there are not many studies on the topic.

We have long argued that no one study should be considered a “final word” on the effectiveness of a program, even if the program studied was exactly the same as the program of interest; there are many reasons that a single study might be unreliable, and that its results might fail to hold up upon replication. (For more on this topic, see our discussion of meta-research as well as John Ioannidis’s work on replicability in biomedical research.) Thus, having a small number of relevant studies is a significant concern.

However, similar concerns apply in other cases.

The case for the long-term benefits of deworming rests on two studies. One is a high-quality randomized study. The other is a retrospective (non-randomized) examination of a hookworm eradication program in the American South in the early 20th century.

The case for the long-term developmental impacts of insecticide-treated nets includes very little in the way of direct studies: just one retrospective (non-randomized) analysis similar to the second study of deworming mentioned above. However, there are other reasons to be optimistic about the long-term impacts of bednets. One is that bednets have been shown to avert deaths, an impact that can be measured over the short run but has clear long-term significance (though how one ought to value this impact, relative to something like “improved income in adulthood,” is an open question). Another reason is that multiple studies have found substantial short-term health impacts for children under five, and there are studies in a variety of other areas making a case for the connection between under-five health and later-in-life developmental benefits. (We have not written extensively about the latter, though we will do so eventually.)

Even putting aside lives saved, if I had to bet on one intervention to have long-term impacts, I’d bet on nets – though if one rigidly requires top-quality randomized studies of the exact intervention itself, the case for nets is weakest. Regardless, the case for all three interventions is quite limited.

Limitation 2: limited representativeness

The case for the long-term benefits of cash transfers rests on one study of conditional cash transfers (which were made with certain requirements, and which were structured as small recurring transfers while GiveDirectly’s grants are structured as larger one-time transfers) and on one study of grants to microenterprises (which, while made with no strings attached and one time only, were targeted specifically at people running microenterprises and were also smaller than GiveDirectly’s transfers).

We don’t believe conditionality is a major issue. In examining the impacts of cash transfers, we have focused on impacts that we feel aren’t plausibly related to the conditions. Data about how people spend their money, and what returns they earn on it, seem unlikely to be driven by the sorts of conditions imposed in these programs, which generally pertain to sending children to school, bringing them in for health checkups, etc. (In fact, we would guess that following conditions would be likely to reduce rather than increase consumption and investment returns for adults, by reducing child labor and/or reallocating time and other resources toward children.) The size and structure of cash transfers may be a major issue, though we would guess (as reasoned in our writeup) that GiveDirectly’s version would be more conducive to higher rates of investment and thus greater long-term returns.

Again, there are similar issues with the evidence for nets and deworming.

  • The key study of deworming was of an annual deworming program in an area with extremely high rates of infection (particularly of schistosomiasis – see our recent post on the matter). Because of the study area’s proximity to Lake Victoria and the role El Niño played in the study, we’d guess that most of SCI’s work takes place in areas with much lower infection rates.

    Much of SCI’s work involves deworming people less frequently than they were dewormed in the studies.

  • The high-quality studies of nets involved unusually intensive programs, with constant replacement and checking up on nets. In some cases, programs were structured quite differently – involving treatment of existing nets (rather than distribution of long-lasting insecticide-treated nets), social marketing of nets (rather than free distribution), etc. As far as we can tell, the conditions of AMF’s distributions are likely to resemble those in the studies in relevant ways (particularly usage of nets), but this isn’t something we have definitively established.

In addition to these differences, there is a broader difference that we feel is quite important: studies took place at very different places and times, and with different populations, which could be particularly significant for economic impacts – the main things we are taking as evidence of long-term impact. This applies to all the studies in question.

Limitation 3: reliance on self-reported data

The studies of cash transfers rely on recipients’ reports of their earnings and/or consumption.

Of the limitations discussed in this post, this is the one we’re least concerned about. We do believe that self-reported data is likely to be highly misleading in certain contexts, such as when it is clear to the person being surveyed what sort of answer the surveyor is hoping for (or when some answers are more socially acceptable than others). We feel this is a valid reason to be skeptical of e.g. GiveDirectly’s data on how people spent their transfers (and GiveDirectly concedes as much). However, the studies of cash transfers take a quite different approach: they randomly assign people to treatment and control groups and perform highly extensive surveys of people in both groups, attempting to quantify consumption and other factors. (To get a sense for how extensive such surveys can be, see GiveDirectly’s survey instrument for its ongoing study; the survey instrument we examined for our reanalysis of the evidence for deworming was similarly extensive.) There generally seems to us to be a consensus among scholars that more complex surveys of this type are more reliable, because it becomes easier to answer straightforwardly than to intuit what sorts of answers are being sought (for example, see our notes on speaking with Richard Cibulskis of the World Malaria Report).

The long-term followup on deworming we recently discussed relies on similar survey data (i.e., participants’ self-reports of their earnings). And we also rely on such survey data in estimating the rate at which insecticide-treated nets are used.

Two more considerations regarding the relative strength of the evidence for cash transfers vs. deworming

Comments


  1. I see from the survey instrument you linked to that GiveDirectly are also collecting saliva samples, to test for stress hormone. How sensitive is this measure, and have they found any differences between treatment and control groups? It would obviously be good to have evidence on some measure which is not self-reported – though I imagine that the causal relationship between receiving a cash transfer and a change in stress levels is complex.

  2. Rob, thanks for the question. As far as we know, GiveDirectly’s mid-line survey did not collect the cortisol data, so those results will not be available until the final survey is completed.

  3. Thanks for this consideration of evidence. A few points. First, the types of evidence you want (long-term and from many contexts) are often not sexy data for academics to collect. But if donors start to demand these kinds of data, then replicating the same interventions in new places may start to become more appealing.

    Second, these types of data (over time, across the geographic breadth of a country) are what governments should, theoretically, be able to collect as part of surveillance and administrative data. This is a good reason for GiveWell to encourage its top ‘charities’ not only to be transparent but also to help governments build monitoring and delivery capacity, in part because this can help generate the data we need to answer some of the big questions. All the better if there is a rigorous plan to test different delivery mechanisms in different places, so we can learn about efficacy & effectiveness.

    It’s not clear how much power GiveWell has to shape such a data agenda in this way but… worth exploring.

  4. Dear GiveWell, I have a question about the long-term benefits of deworming.

    On your page on deworming you suggest that “the most compelling case for deworming … [is] the possibility that deworming children has a subtle, lasting impact on their development, and thus on their ability to be productive and successful throughout life.” You refer to the Baird et al. 2012 paper as evidence of this and, in your analysis of that paper, you point out that one of the largest positive effects is increased earnings (as a result of employment shifting to the manufacturing sector).

    I am concerned that you do not discuss the possibility that this particular benefit could arise largely from a signalling effect of more time spent in school.

    I.e., deworming children leads to children performing better in school. These children will have higher grades and then get the best jobs. This will displace other people from getting those jobs. So although the average income of the dewormed children rises, the average income of someone else may fall. So the later-life increased earnings of those dewormed at school will have come at a cost of lower earnings to others. Similarly, other benefits such as being healthier later in life could result from these increased earnings. This idea is discussed in more detail here: http://www.copenhagenconsensus.com/sites/default/files/PP_Education_-_Pritchett_0.pdf

    Is this something you have considered? Would this effect reduce your credence in deworming as an effective intervention?

    Cheers, Sam

  5. Samuel, don’t you think it’s a bit pessimistic to assume that better educational outcomes are a zero-sum game?

  6. Sam,

    We have considered this as a possible issue, though we haven’t written about it. We don’t see any particular reason to attribute the income effects of deworming to this dynamic (note that the key deworming study, Baird et al. 2012, does not observe statistically significant increases in total grades of schooling completed, and impacts on test scores are not statistically significant either), though we agree it may be a part of the picture. Generally, any micro-level study focused on earnings impacts (at least where mechanisms are unclear) will have this same issue (so it applies to evidence on long-term impacts of the other interventions as well).

  7. Dear Holden and Ian,

    Thank you for your good replies. It is useful for me, in deciding between the top recommended charities, to know that GiveWell has considered this issue.

    For clarity and in case future people are interested:

    - This issue arose for me specifically because of a few papers (such as http://wber.oxfordjournals.org/content/15/3/367.full.pdf and http://jos.sagepub.com/content/46/1/45.full.pdf) that claimed to have evidence that education interventions in particular were ineffective because of the reasoning laid out above.

    - I have found one meta-analysis on this topic that looks into much of the evidence. This can be found at http://www.copenhagenconsensus.com/projects/copenhagen-consensus-2012/research/education in the Challenge paper in the section titled ‘Do school investments contribute to economic growth?’ The conclusion is that the benefits of education may be high or low depending on the degree of economic and political freedom in the country concerned.

    - Generally I would love it if GiveWell carried out more analyses of potential macro-level effects like this that could affect giving decisions.