We’ve long been interested in the idea of subjecting our research to formal external evaluation. We publish the full details of our analysis so that anyone may critique it, but we also recognize that digesting and critiquing our analysis takes a lot of work, and we want to subject ourselves to constant critical scrutiny (not just to the theoretical possibility of it).
A couple of years ago, we developed a formal process for external evaluations, and had several such evaluations conducted and published. However, we haven’t had any such evaluations conducted recently. This post discusses why.
- The challenges of external evaluation are significant. Because our work does not fall cleanly into a particular discipline or category, it can be difficult to identify an appropriate reviewer (particularly one free of conflicts of interest) and provide enough structure for their work to be both meaningful and efficient. We put a substantial amount of capacity into structuring and soliciting external evaluations in 2010, and if we wanted more external evaluations now, we’d again have to invest a lot of our capacity in this goal.
- The level of in-depth scrutiny of our work has increased greatly since 2010. While we would still like to have external evaluations, all else equal, we also feel that we are now getting much more value than before from the kind of evaluation we’d ultimately guess is most useful: interested donors and other audience members scrutinizing the parts of our research that matter most to them.
Between these two factors, we aren’t currently planning to conduct more external evaluations in the near future. However, we remain interested in external evaluation and hope eventually to make frequent use of it again. And if someone volunteered to do (or facilitate) formal external evaluation, we’d welcome this and would be happy to prominently post or link to criticism.
The challenges of external evaluation are significant:
- There is a question around who counts as a “qualified” individual for conducting such an evaluation, since we believe that there are no other organizations whose work is highly similar to GiveWell’s. Our work is a blend of evaluating research and evaluating organizations, and it involves both in-depth scrutiny of details and holistic assessments of the often “fuzzy” and heterogeneous evidence around a question.
On the “evaluating research” front, one plausible candidate for “qualified evaluator” would be an accomplished development economist. In practice, however, many accomplished development economists (a) are extremely constrained in terms of the time they have available; (b) have affiliations of their own that may bias their evaluation (the more interested a scholar is in practical implications for aid, the more likely they are to be directly involved with a particular organization or intervention).
- Based on past work on external evaluation, we’ve found that it is very important for us to provide a substantial amount of structure for an evaluator to work within. It isn’t practical for someone to go over all of our work with a fine-toothed comb, and the higher-status the person, the more of an issue this becomes. Our current set of evaluations is based on old research, and to have new evaluations conducted, we’d need to create new structures based on current research. This would take trial-and-error in terms of finding an evaluation type that produces meaningful results.
- There is also the question of how to compensate people for their time: we don’t want to create a pro-GiveWell bias by paying, but not paying further limits how much time we can ask.
I felt that we found a good balance with a 2011 evaluation by Prof. Tobias Pfutze, a development economist. Prof. Pfutze took ten hours to choose a charity to give to – using GiveWell’s research as well as whatever other resources he found useful – and we “paid” him by donating funds to the charity he chose. However, developing this assignment, finding someone who was both qualified and willing to do it, and providing support as the evaluation was conducted required a significant investment of our capacity.
Given the time investment these sorts of activities require on our part, we’re hesitant to go forward with one until we feel confident that we are working with the right person in the right way and that the research they’re evaluating will be representative of our work for some time to come.
Over the last year, we feel that we’ve seen substantially more deep engagement with our research than ever before, even as our investments in formal external evaluation have fallen off.
- We conducted an internal evaluation with employee Jonah Sinick. Jonah went over our work on insecticide-treated nets at a high level of detail, and we posted his full notes (as well as a summary) and an in-depth discussion of the most significant issue he raised.
- We published new analysis on deworming, and in the process, we had significant engagement and back-and-forth with scholars who study deworming. See our post on revisiting the case for developmental effects from deworming (which we published after substantial back-and-forth with the authors of the study in question) and our discussion of a new literature review on deworming (which was reviewed by an author of the Cochrane review prior to publication and by a scholar who dissented from its findings subsequent to publication).
- Reader David Barry did a highly in-depth review of our comparative cost-effectiveness analysis, and published his findings as a guest post on our blog.
- Our recommendation of GiveDirectly prompted substantial pushback from our audience, and we believe this led to an elevated level of critical engagement with our research. For example, see this Giving What We Can post as well as audio and transcripts we’ve posted from in-person meetings and donor calls involving such critical engagement (see the 1/10/13, 12/6/12 and 11/26/12 items in particular). Engaging with this pushback led to our posts comparing evidence quality for our top charities’ interventions and further discussing our rankings.
- We also believe that our content on top charities – and updates on their progress – has been more carefully reviewed than previously; for example, see this discussion of a footnote in an update we published on Schistosomiasis Control Initiative.
We continue to believe that it is important to ensure that our work is subjected to in-depth scrutiny. However, at this time, the scrutiny we’re naturally receiving – combined with the high costs and limited capacity for formal external evaluation – makes us inclined to postpone major effort on external evaluation for the time being.
- If someone volunteered to do (or facilitate) formal external evaluation, we’d welcome this and would be happy to prominently post or link to criticism.
- We do intend eventually to re-institute formal external evaluation.