The GiveWell Blog

Updates on AMF’s transparency and monitoring

In our mid-year update, we continued to recommend that donors give to the Against Malaria Foundation (AMF), and we wrote that we believe AMF has the most valuable current funding gap among our top charities. We also briefly wrote about some new concerns we have about AMF based on our research from the first half of 2016.

This post describes our new concerns about AMF’s transparency and monitoring in more depth. We continue to believe that AMF stands out among bed net organizations, and among charities generally, for its transparency and the quality of its program monitoring. AMF makes substantial amounts of useful information on the impact of its programs—far more than the norm—publicly available on its website, and has generally appeared to value transparency as much as any organization we’ve encountered. But our research on AMF in 2016 has led us to moderately weaken our view that AMF stands out for these qualities. In short, this is because:

  • The first two post-distribution check-up surveys following AMF’s bed net distribution in Kasaï-Occidental, Democratic Republic of the Congo (DRC) were poorly implemented. AMF told us that it agrees that the surveys were poorly implemented and is working to improve data quality in future surveys.
  • We learned that the methodology for selecting communities and households for post-distribution check-up surveys in Malawi is less rigorous than we had previously thought.
  • AMF was slower to share information with us in 2016 than we would have liked, and we aren’t certain why. We believe that AMF misunderstood the type of information we would find valuable, which may explain some (but not all) of the issue.

These updates somewhat lower our confidence in AMF’s track record of distributing bed nets that are used effectively (i.e., present, hanging, and in good condition) over the long term and in its commitment to transparency; however, this is only a moderate update, and we don’t think what we’ve learned is significant enough to outweigh AMF’s strengths. We continue to recommend AMF and believe that it has the highest-value current funding gap of any of our top charities.

Going forward, we plan to continue to learn more about AMF’s transparency and monitoring through reviewing the results of additional post-distribution surveys and continued communication with AMF.

Background on AMF’s strengths and evidence of impact

We recommend AMF because there is strong evidence that mass distribution of long-lasting insecticide-treated bed nets reduces child mortality and is cost-effective, and because of AMF’s strengths as an organization: a standout commitment to transparency and self-evaluation, substantial room for more funding to deliver additional bed nets, and relatively strong evidence overall on whether bed nets reach intended destinations and are used over the long term.

In particular, AMF requires that its distribution partners implement post-distribution check-up surveys every six months among a sample (around 5%) of recipient households for 2.5 years following a bed net distribution, and has publicly shared the results of these surveys from several of its bed net distributions in Malawi. It’s our understanding that AMF is quite unusual in this regard—other organizations that fund bed nets do not typically require post-distribution check-up surveys to monitor bed net usage over time, and do not publicly share monitoring data as AMF does.

Evidence of the impact of AMF’s bed net distribution in Kasaï-Occidental, DRC

AMF has sent us reports and data from two post-distribution check-up surveys (from eight months and twelve months after the distribution) from Kasaï-Occidental, DRC.

Donors may not realize that AMF has a short track record of major distributions. Historically, it has worked primarily with Concern Universal in Malawi, so these are the first surveys we’ve seen from large-scale AMF bed net distributions outside of Malawi. AMF’s post-distribution check-up surveys are intended to provide evidence on how many AMF nets are used effectively over the long term, but we (and AMF) believe that these surveys in Kasaï-Occidental, DRC were poorly implemented (details in footnote).[1]

Due to the extent of the implementation issues, we don’t think the post-distribution check-up surveys provide a reliable estimate of the proportion of bed nets distributed in Kasaï-Occidental, DRC used effectively over the long term. (Note that AMF earlier provided a distribution report, registration data, and photos and videos as evidence that bed nets originally reached intended destinations.) It seems plausible to us that the reported rates of nets in-use (hung over a sleeping space) from the AMF distribution in the 8-month post-distribution check-up survey (~80%) and the 12-month post-distribution check-up survey (64-69%) are either substantial overestimates or substantial underestimates.

Non-random sampling in post-distribution surveys in Malawi

This year, we learned that Concern Universal, AMF’s distribution partner in Malawi, does not use a completely random process to select participants for post-distribution surveys. We have received some conflicting information from AMF and Concern Universal on the specifics of how the selection process deviates from pure randomization, so we aren’t confident that we fully understand how the selection process works in practice (details in footnote).[2]

Our earlier understanding was that Concern Universal randomly selected villages and households for post-distribution surveys without any adjustments.

We are now concerned that the results from post-distribution surveys from Malawi could be somewhat biased estimates of the long-term impact of AMF’s distributions (though we wouldn’t guess that the effect of the bias on the results would be very large, since AMF and Concern Universal described selection processes that seem likely to produce reasonably representative samples).

AMF told us that it may reconsider its requirements for random selection of participants in future post-distribution surveys and invited us to make suggestions for improvement.

AMF’s transparency and communication

Although we still believe that AMF stands out among bed net organizations for its commitment to transparency, AMF has recently been less transparent with us than we’ve come to expect.

In early 2016, we requested several documents from AMF (including the 8-month and 12-month post-distribution surveys from Kasaï-Occidental, DRC, malaria case rate data from clinics in Malawi, and audits of household registration data from Malawi), which AMF told us it had available and would share once it had the capacity to review and edit them. Although we eventually received reports and data from the two DRC post-distribution surveys in June, we still haven’t seen the other documents we requested. AMF responded to these concerns here.

We are concerned that AMF did not tell us about the poor implementation of the first two Kasaï-Occidental, DRC surveys earlier, and that we only recently learned about the details of Concern Universal’s adjustments to random sampling for post-distribution surveys in Malawi. AMF told us it agrees that it should have communicated more clearly with us about these two issues and believes that it did not because it misunderstood the type of information we would value seeing. We are not confident that this fully explains AMF’s lack of transparency.

What we hope to learn going forward

AMF’s track record of providing evidence of impact on its bed net distributions outside of Malawi is currently very limited. Our impression is that DRC is a difficult country for charities to work in; we’re uncertain whether the methodological issues with the first two surveys from Kasaï-Occidental were due to the difficulty of working in DRC specifically, to more general issues with AMF starting programs in new countries and working with new implementing partners, or to the relatively poor performance of an implementing partner.

AMF has told us that it expects the implementation of future post-distribution surveys in DRC to improve, and that it has made several changes to its practices in response to the issues discussed above, including:

  • Hiring a Program Director, Shaun Walsh, whose primary job is to work in-country with distribution partners on planning, executing, and monitoring bed net distributions.
  • Requiring more detailed budgets and plans from distribution partners for upcoming post-distribution surveys in Ghana, Uganda, and Togo.
  • Focusing on improving timeliness of reporting on distributions and post-distribution surveys.

We plan to communicate closely with AMF on its upcoming post-distribution surveys, and update our views on AMF’s track record outside of Malawi when more survey results are available.

Notes
[1]
AMF’s reports on the surveys indicate that:

  1. It seems likely that different data collectors interpreted ambiguously-worded questions differently for both the 8-month and 12-month surveys. “Number of nets available” (translated from French) was variously interpreted as the number of nets hung, the number of nets hung plus the number of nets present but not hung, or the number of nets present but not hung. This led to internally inconsistent data (e.g. different numbers of nets reported for a single household for different survey questions) for a large proportion of households (42% in the 8-month post-distribution survey and around half in the 12-month post-distribution survey). AMF excluded households with internally inconsistent data from its analysis of the proportion of nets from the distribution still in use.
  2. AMF addressed this issue by rewriting the survey questions after the 8-month survey, but the corrected questions were not loaded onto the data collectors’ smartphones before the 12-month survey.
  3. Household members sometimes reported inaccurate information to data collectors when survey questions were asked outside of a home. Data collectors later confirmed that the information was inaccurate (e.g. the household owned more bed nets than reported) by direct observation inside the home, but were not able to correct the data already entered into their smartphones.
  4. Data collectors did not distinguish between nets from the late 2014 AMF distribution and bed nets from other sources. AMF notes that the average level of previously owned nets was around 2.5%, so this would not have materially influenced the results of the post-distribution survey.

[2]

  • AMF told us:
    Concern Universal selects villages for post-distribution surveys in each health center catchment area where AMF nets were distributed. Concern Universal divides each health center catchment area into three “bands”: a short, medium, and far distance from the health center. In each band, Concern Universal randomly selects between 25% and 50% of the villages. In each of those villages, Concern Universal randomly selects around 20% of the households.

  • In April 2016, we spoke with a representative of Concern Universal, who told us that, in addition to the stratification of villages by geographic location described by AMF, villages selected in one post-distribution survey are excluded from selection in the following post-distribution survey (see the sketch below).
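
To make the two-stage process concrete, here is a minimal sketch in Python of the selection procedure as AMF and Concern Universal described it to us. The data layout, function name, and default rates (chosen within the stated 25-50% and ~20% ranges) are illustrative assumptions, not Concern Universal’s actual implementation:

```python
import random

def select_survey_sample(bands, previously_surveyed, village_rate=0.25,
                         household_rate=0.20, seed=None):
    """Illustrative two-stage sample for a post-distribution check-up survey.

    bands maps a band name ("short", "medium", "far") to that band's
    villages: {village_name: [household_ids]}. previously_surveyed is the
    set of villages visited in the prior round, which Concern Universal
    told us it excludes from the next round.
    """
    rng = random.Random(seed)
    sample = {}
    for band_villages in bands.values():
        # First stage: a simple random draw of villages within each
        # distance band (AMF described rates of 25-50% per band).
        eligible = [v for v in band_villages if v not in previously_surveyed]
        if not eligible:
            continue
        n_villages = max(1, round(village_rate * len(eligible)))
        for village in rng.sample(eligible, n_villages):
            # Second stage: ~20% of households within each chosen village.
            households = band_villages[village]
            n_households = max(1, round(household_rate * len(households)))
            sample[village] = rng.sample(households, n_households)
    return sample
```

The `previously_surveyed` exclusion is the deviation from pure random sampling discussed above: each round draws from a subset of villages that depends on the previous round’s draw.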

September 2016 open thread

Our goal with hosting quarterly open threads is to give blog readers an opportunity to publicly raise comments or questions about GiveWell or related topics (in the comments section below). As always, you’re also welcome to email us at info@givewell.org or to request a call with GiveWell staff if you have feedback or questions you’d prefer to discuss privately. We’ll try to respond promptly to questions or comments.

If you have questions related to the Open Philanthropy Project, you can post those in the Open Philanthropy Project’s open thread.

You can view our June 2016 open thread here.

Would other organizations have funded AMF’s bednet distributions if AMF hadn’t?

An important question to ask when deciding where to give is “what would happen if this charity didn’t receive my donation?”

To investigate this, we focus on charities’ “room for more funding,” i.e., what will additional funding for this organization allow it to do that it would not be able to do without additional support from the donors GiveWell influences?

This question is relevant to the Against Malaria Foundation (AMF), currently our #1 rated charity, which provides funding to support malaria net distributions in Sub-Saharan Africa. In the past, we focused intensely on the question of whether AMF would be able to absorb and commit additional funds.

Recently, we asked another question: how likely is it that the bednet distributions that AMF supports would have been funded by others if AMF hadn’t provided funding? That is, would another funder have stepped in to provide funding in AMF’s absence?

If this were the case, our assessment of AMF’s impact would be diminished because it would seem likely that, in the absence of giving to AMF, the distributions it might have supported would occur anyway.

We can’t know what other funders might do in the future, so to learn more about this we looked back at cases from 2012 and 2013 where AMF had initially considered a distribution but then didn’t end up providing funding. We asked whether, and when, those distributions were eventually funded by others.

Our investigation

We looked at five cases where AMF considered funding a distribution but did not end up moving forward. In short:

  • In two cases, major delays (18 months and ~36 months) occurred before people in the area received bednets from other sources.
  • In two cases, other funders filled the gap six to nine months later than AMF would have.
  • In one case, funding was committed soon after AMF’s talks fell through.

(For context, we use an “8%-20%-50%” model to estimate the longevity of bednets; the name refers to the share of nets assumed to be out of use by the end of each year, which is to say that 92% of nets are still in use through the first year, 80% through the second, 50% through the third, and none after the end of the third year. On average, then, we estimate that nets last about 27 months, as the calculation below illustrates.)
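
One way to recover the ~27-month figure, on our reading of the model, is to count each year’s in-use share as a year of expected use per net distributed:

```python
# 8%-20%-50% model: share of nets assumed still in use in each year of life.
in_use_by_year = [0.92, 0.80, 0.50]   # years 1-3; none in use thereafter
expected_years = sum(in_use_by_year)  # 2.22 expected years of use per net
print(round(expected_years * 12, 1))  # 26.6, i.e. roughly 27 months
```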

More details are available in our full report on this investigation.

Of course, these cases aren’t necessarily predictive:

  • It’s possible that the distributions were atypical: the reasons that led AMF not to carry out these distributions may be the same reasons that led other funders to pass on them. If so, a typical AMF distribution might be more likely to be funded by someone else, if AMF doesn’t fund it, than these results suggest.
  • It’s possible the global funding situation has changed since the cases we investigated in 2012 and 2013; if more funding is now available overall, another funder would be more likely to step in if AMF didn’t carry out a given distribution.

That said, even if other funders would always step in when AMF didn’t carry out a distribution, AMF could still be increasing the total number of bednets distributed if there’s an overall funding gap for bednets globally; for this to be the case, some additional pool of funding would likely need to exist that can be directed to bednets when necessary. We’ve written more about the global bednet gap here.

Overall, we think the cases we looked at support our conclusion that there is a real need for additional funding for bednets, and that AMF is not primarily displacing other funding for bednets.

Deworming might have huge impact, but might have close to zero impact

We try to communicate that there are risks involved with all of our top charity recommendations, and that none of our recommendations are a “sure thing.”

Our recommendation of deworming programs (the Schistosomiasis Control Initiative and the Deworm the World Initiative), though, carries particularly significant risk (in the sense of possibly not doing much/any good, rather than in the sense of potentially doing harm). In our 2015 top charities announcement, we wrote:

Most GiveWell staff members would agree that deworming programs are more likely than not to have very little or no impact, but there is some possibility that they have a very large impact. (Our cost-effectiveness model implies that most staff members believe there is at most a 1-2% chance that deworming programs conducted today have similar impacts to those directly implied by the randomized controlled trials on which we rely most heavily, which differed from modern-day deworming programs in a number of important ways.)

The goal of this post is to explain this view and why we still recommend deworming.

Some basics for this post

What is deworming?

Deworming is a program that involves treating people at risk of intestinal parasitic worm infections with parasite-killing drugs. Mass treatment is very inexpensive (in the range of $0.50-$1 per person treated), and because treatment is cheaper than diagnosis and side effects of the drugs are believed to be minor, typically all children in an area where worms are common are treated without being individually tested for infections.

Does it work?

There is strong evidence that administration of the drugs reduces worm loads, but many of the infections appear to be asymptomatic and evidence for short-term health impacts is thin (though a recent meta-analysis that we have not yet fully reviewed reports that deworming led to short-term weight gains). The main evidence we rely on to make the case for deworming comes from a handful of longer term trials that found positive impacts on income or test scores later in life.

For more background on deworming programs see our full report on combination deworming.

Why do we believe it’s more likely than not that deworming programs have little or no impact?

The “1-2% chance” doesn’t mean that we think there’s a 98-99% chance that deworming programs have no effect at all. Rather, we think it’s appropriate to apply a 1-2% multiplier to the impact found in the original trials. This can be thought of as assigning some chance that deworming programs have no impact at all, and some chance that the impact exists but is smaller than was measured in those trials; for instance, as we describe below, worm infection rates are much lower in present contexts than they were in the trials.

Where does this view come from?

Our overall recommendation of deworming relies heavily on a randomized controlled trial (RCT) (the type of study we consider to be the “gold standard” for causal attribution) first written about in Miguel and Kremer 2004, with 10-year follow-up data reported in Baird et al. 2011, which found very large long-term effects on recipients’ income. We reviewed this study very carefully (see here and here) and felt that its analysis largely held up to scrutiny.

There’s also some other evidence, including a study that found higher test scores in Ugandan parishes that were dewormed in an earlier RCT, and a high-quality study that is not an RCT but found especially large increases in income in areas in the American South that received deworming campaigns in the early 20th century. However, we consider Baird et al. 2011 to be the most significant result because of its size and the fact that the follow-up found increases in individual income.

While our recommendation relies on the long-term effects, the evidence for short-term effects of deworming on health is thin (as noted above), so we have little evidence of a mechanism through which deworming programs might bring about long-term impact. This raises concerns about whether the long-term impact exists at all, and may suggest that the program is more likely than not to have no significant impact.

Even if there is some long-term impact, we downgrade our expectation of how much impact to expect, due to factors that differ between real-world implementations and the Miguel and Kremer trial. In particular, worm loads were especially high during the Miguel and Kremer trial in Western Kenya in 1998, in part due to flooding from El Niño; in addition, baseline infection rates are lower in the places where SCI and Deworm the World work today than they were in the relevant studies.

Our cost-effectiveness model estimates that baseline worm infections in the trial we mainly rely on were roughly 4 to 5 times as high as in places where SCI and Deworm the World operate today, and that El Niño further inflated worm loads during the trial. (These estimates combine data on the prevalence and intensity of infections, and are especially rough because there is limited data on whether prevalence or intensity of worms is a bigger driver of impact.) Further, we don’t know of any evidence that would let us rule out the possibility that the relationship between worm infection rates and the effectiveness of deworming is nonlinear, i.e., that many children in the Miguel and Kremer trial were above a clinically relevant “threshold” of infection that few children treated by our recommended charities are above.

We also downgrade our estimate of the expected value of the impact based on: concerns that the limited number of replications and lack of obvious causal mechanism might mean there is no impact at all, expectation that deworming throughout childhood could have diminishing returns compared to the ~2.4 marginal years of deworming provided in the Miguel and Kremer trial, and the fact that the trial only found a significant income effect on those participants who ended up working in a wage-earning job. See our cost-effectiveness model for more information.
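
As a purely illustrative toy calculation, not our actual cost-effectiveness model, here is how several discounts of this kind can compound to an overall multiplier in the 1-2% range. Only the ~4-5x infection-rate difference comes from the discussion above; every other number is a made-up placeholder:

```python
# Hypothetical discount factors; only the ~4.5x infection-rate difference
# is taken from the discussion above.
p_effect_replicates  = 0.30     # chance the long-term income effect is real
infection_adjustment = 1 / 4.5  # trial worm loads ~4-5x today's levels
el_nino_adjustment   = 0.60     # trial loads further inflated by El Nino
other_discounts      = 0.35     # diminishing returns, wage-earners only, etc.

multiplier = (p_effect_replicates * infection_adjustment
              * el_nino_adjustment * other_discounts)
print(f"{multiplier:.1%}")      # 1.4% -- within the 1-2% range cited above
```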

Why do we recommend deworming despite the reasonably high probability that there’s no impact?

Because mass deworming is so cheap, there is a good case for donating to support deworming even when in substantial doubt about the evidence. We estimate the expected value of deworming programs to be as cost-effective as any program we’ve found, even after the substantial adjustments discussed above: our best guess considering those discounts is that it’s still roughly 5-10 times as cost-effective as cash transfers, in expectation. But that expected value arises from combining the possibility of potentially enormous cost-effectiveness with the alternative possibility of little or none.
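
As a stylized illustration of that combination, treating the 1-2% multiplier purely as a probability for simplicity (the “400x” value of a trial-level impact is a made-up placeholder, not our estimate):

```python
# Stylized expected-value calculation; the 400x figure is hypothetical.
p_trial_level_impact = 0.015  # the 1-2% multiplier, read as a probability
value_if_real        = 400    # suppose trial-level impact is worth 400x cash
value_otherwise      = 0      # little or no impact

expected_value = (p_trial_level_impact * value_if_real
                  + (1 - p_trial_level_impact) * value_otherwise)
print(expected_value)         # 6.0 -- in the 5-10x-cash range we estimate
```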

GiveWell isn’t seeking certainty – we’re seeking outstanding opportunities backed by relatively strong evidence, and deworming meets that standard. For donors interested in trying to do as much good as possible with their donations, we think that deworming is a worthwhile bet.

What could change this recommendation – will more evidence be collected?

To our knowledge, there are currently no large, randomized controlled trials being conducted that are likely to be suitable for long-term follow up to measure impacts on income when the recipients are adults, so we don’t expect to see a high-quality replication of the Miguel and Kremer study in the foreseeable future.

That said, there are some possible sources of additional information:

  • The follow-up data that found increased incomes among recipients in the original Miguel and Kremer study was collected roughly 10 years after the trial was conducted. Our understanding is that 15 year follow-up data has been collected and we expect to receive an initial analysis of it from the researchers this summer.
  • A recent study from Uganda didn’t involve data collection for the purpose of evaluating a randomized controlled trial; rather, the paper identified an old, short-term trial of deworming and an unrelated data set of parish-level test scores collected by a different organization in the same area. Because some of the parishes overlap, it’s possible to compare the test scores from those that were dewormed to those that weren’t. It’s possible that more overlapping data sets will be discovered and so we may see more similar studies in the future.
  • We’ve considered whether to recommend funding for an additional study to replicate Baird et al. 2011: run a new deworming trial that could be followed for a decade to track long-term income effects. However, it would take 10+ years to get relevant results, and by that time deworming may be fully funded by the largest global health funders. It would also need to include a very large number of participants to be adequately powered to detect plausible effects (since the original trial in Baird et al. 2011 benefited from particularly high infection rates, which likely made it easier to detect an effect), so it would likely be extremely expensive.

For the time being, based on our best guess about the expected cost-effectiveness of the program when all the factors are considered, we continue to recommend deworming programs.

Update on GiveWell’s web traffic / money moved: Q1 2016

In addition to evaluating other charities, we publish substantial evaluation of our own work, from progress against our goals to our impact on donations. We generally publish quarterly updates regarding two key metrics: (a) donations to top charities and (b) web traffic (though going forward, we may provide less frequent updates).

The tables and chart below present basic information about our growth in money moved and web traffic in the first quarter of 2016 compared to the previous two years (note 1).

Money moved and donors: first quarter

[Table: money moved and donors, Q1 2014-2016 (Table_2016Q1MoneyMoved.png)]

Money moved by donors who have never given more than $5,000 in a year increased about 50% to $1.1 million. The total number of donors in the first quarter increased about 30% to about 4,500 (note 2).

Most of our money moved is donated near the end of the year (we tracked 70% or more of our total money moved in the fourth quarter each of the last three years) and is driven by a relatively small number of large donors. Because of this, we do not think we can reliably predict our growth and think that our year-to-date total money moved provides relatively limited information about what our year-end money moved is likely to be (note 3). We therefore look at the data above as an indication of growth in our audience.

Web traffic through April 2016

[Table: web traffic through April 2016 (Table_2016Q1WebTraffic.png)]

Web traffic excluding Google AdWords increased 10% in the first quarter. GiveWell’s website receives elevated traffic during “giving season” around December of each year. To adjust for this and emphasize the trend, the chart below shows the rolling sum of unique visitors over the previous twelve months, starting in December 2009 (the first period for which we have 12 months of reliable data, due to an issue tracking visits in 2008).

[Chart: rolling 12-month unique visitors, December 2009 through April 2016 (Chart_2016Q1WebTraffic.png)]

We use web analytics data from two sources: Clicky and Google Analytics (except for those months for which we only have reliable data from one source). The raw data we used to generate the chart and table above (as well as notes on the issues we’ve had and adjustments we’ve made) is in this spreadsheet. (Note on how we count unique visitors.)



Note 1: Since our 2012 annual metrics report we have shifted to a reporting year that starts on February 1, rather than January 1, in order to better capture year-on-year growth in the peak giving months of December and January. Therefore, metrics for the “first quarter” reported here are for February through April.

Note 2: Our measure of the total number of donors may overestimate the true number. We identify individual donors based on the reported name and email. Donors may donate directly to our recommended charities without opting to share their contact information with us, or may use different information for subsequent donations (for example, a different email address); in either case, we may mistakenly count a donation from a past donor as if it were made by a new donor. We are unsure but would guess that the impact of this issue is relatively small and that the data shown are generally reflective of our growth from year to year.
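
A minimal sketch of this matching rule and its failure mode (the field names and normalization are illustrative assumptions, not our actual pipeline):

```python
def donor_key(donation):
    """Identify a donor by reported name and email, as described above."""
    return (donation["name"].strip().lower(),
            donation["email"].strip().lower())

donations = [
    {"name": "Jane Doe", "email": "jane@example.com"},
    {"name": "Jane Doe", "email": "jdoe@other.example"},  # same person, new email
]
# The second donation gets a new key, so this counts 2 donors instead of 1.
print(len({donor_key(d) for d in donations}))
```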

Note 3: In total, GiveWell donors directed $2.6 million to our top charities in the first quarter of 2016, compared to $2.0 million that we had tracked in the first quarter of 2015. For the reason described above, we don’t find this number to be particularly meaningful at this time of year.

Note 4: We count unique visitors over a period as the sum of monthly unique visitors. In other words, if the same person visits the site multiple times in a calendar month, they are counted once. If they visit in multiple months, they are counted once per month.
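
A minimal sketch of this counting rule (the data layout is an illustrative assumption):

```python
from collections import defaultdict

def unique_visitors(visits):
    """Deduplicate visitors within each calendar month, then sum the months."""
    by_month = defaultdict(set)
    for visitor_id, month in visits:      # e.g. ("abc123", "2016-03")
        by_month[month].add(visitor_id)   # counted at most once per month
    return sum(len(ids) for ids in by_month.values())

# Three visits in March and one in April by the same person count twice.
visits = [("a", "2016-03"), ("a", "2016-03"), ("a", "2016-03"), ("a", "2016-04")]
print(unique_visitors(visits))  # 2
```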

Weighing organizational strength vs. estimated cost-effectiveness

A major question we’ve asked ourselves internally over the last few years is how we should weigh organizational quality versus the value of the intervention that the organization is carrying out.

In particular, is it better to recommend an organization we’re very impressed by and confident in that’s carrying out a good program, or better to recommend an organization we’re much less confident in that’s carrying out an exceptional program? This question has been most salient when deciding how to rank giving to GiveDirectly vs giving to the Schistosomiasis Control Initiative.

GiveDirectly vs SCI

GiveDirectly is an organization that we’re very impressed by and confident in, more so than any other charity we’ve come across in our history.

But we estimate that marginal dollars to the program it implements — direct cash transfers — are significantly less cost-effective than bednets and deworming programs. Excluding organizational factors, our best guess is that deworming programs — which SCI supports — are roughly 5 times as cost-effective as cash transfers. As discussed further below, our cost-effectiveness estimates are generally based on extremely limited information and are therefore extremely rough, so we are cautious about assigning too much weight to them.

Despite the better cost-effectiveness of deworming, we’ve had significant issues with SCI as an organization. The two most important:

  • We originally relied on a set of studies showing dramatic drops in worm infection coinciding with SCI-run deworming programs to evaluate SCI’s track record; we later discovered flaws in the study methodology that led us to conclude that they did not demonstrate that SCI had a strong track record. We wrote about these flaws in 2013 and 2014.
  • We’ve seen limited and at times erroneous financial information from SCI over the years. We have seen some improvements in SCI’s financial reporting in 2016, but we still have some concerns, as detailed in our most recent report.

More broadly, both of these cases are examples of general problems we’ve had communicating with SCI over the years. And we don’t believe SCI’s trajectory has generated evidence of overall impressiveness comparable to GiveDirectly’s, discussed above.

Which should we recommend?

One argument is that GiveWell should only recommend exceptional organizations, and so the issues we’ve seen with SCI should disqualify them.

But we think that the ~5x difference in cost-effectiveness is meaningful. There’s a large degree of uncertainty in our cost-effectiveness analyses, something we’ve written a lot about in the past, but this multiplier appears somewhat stable: it has persisted in this range over time and is currently consistent with the individual estimates of many staff members. A ~5x difference gives a fair amount of room for SCI to do more good even accounting both for possible errors in our analysis and for differences in organizational efficiency, as the illustration below suggests.
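
To illustrate with made-up numbers (only the ~5x figure is ours; the two discounts are hypothetical):

```python
# Illustrative only: how a ~5x cost-effectiveness edge can survive discounts.
deworming_vs_cash  = 5.0  # our rough cost-effectiveness multiple for deworming
estimate_error     = 0.5  # suppose our estimate is optimistic by a factor of 2
org_efficiency_gap = 0.6  # suppose SCI converts marginal dollars less well

adjusted = deworming_vs_cash * estimate_error * org_efficiency_gap
print(adjusted)  # 1.5 -- still ahead of cash transfers under these assumptions
```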

A separate argument that we’ve made in the past is that great organizations have upside that goes beyond the value of the specific program they’re implementing. For example, early funding to a great organization may allow it to grow faster and increase the amount of money going to its program globally, either through proving the model or through its own fundraising. And GiveDirectly has shown some propensity for potentially innovative projects, as discussed above.

We think that earlier funding to GiveDirectly had this benefit, but it’s less of a consideration now that GiveDirectly is a more mature organization. We believe this upside exists for what we’ve called “capacity-relevant” funding: funding gaps that we expect will allow organizations to grow in an outsized way in the future, for instance by entering a new country. This is the type of funding need we consider most valuable when ranking the importance of marginal dollars to each of our top charities.

Bottom line

Our most recent recommendations ranked SCI’s funding gap higher than GiveDirectly’s due to SCI’s cost-effectiveness. We think that SCI is a strong organization overall, despite the issues we’ve noted, and we think that the “upside” for GiveDirectly is limited on the margin, so ultimately our estimated 5x multiplier looks meaningful enough to be determinative.

We remain conflicted about this tradeoff and regularly debate it internally, and we think reasonable donors may disagree about which organization to support.