In a recent guest post for Development Impact, Martin Ravallion writes:
The current fashion [for evaluating aid projects] is for randomized control trials (RCTs) and social experiments more generally … The problem is that the interventions for which currently favored methods are feasible constitute a non-random subset of the things that are done by donors and governments in the name of development. It is unlikely that we will be able to randomize road building to any reasonable scale, or dam construction, or poor-area development programs, or public-sector reforms, or trade and industrial policies—all of which are commonly found in development portfolios. One often hears these days that opportunities for evaluation are being turned down by analysts when they find that randomization is not feasible. Similarly, we appear to invest too little in evaluating projects that are likely to have longer-term impacts; standard methods are not well suited to such projects … (Emphasis mine)
He concludes with a call for “‘central planning’ in terms of what gets evaluated” to ensure that evaluation doesn’t become concentrated among the projects that are easy to evaluate. His post could be seen as a direct retort to the kind of work emphasized in the recent books Poor Economics and More Than Good Intentions (our review). These books present ideas and evidence mostly drawn from high-quality studies, and have little to say on questions that high-quality studies cannot help answer.
My instinct is the opposite of Dr. Ravallion’s: I feel that the move toward high-quality evaluations is a good thing, even if it starts to cause bias in what sorts of programs are evaluated – and carried out. What follows is an attempt to explain my feeling on this. My feeling is a function of my worldview and biases, and this post should be taken less as a “rebuttal” than as an opportunity to explicate my worldview and biases.
My disagreement with Dr. Ravallion has to do with my experience as a “customer” of social science. The vast majority of studies I’ve come across have seemed so methodologically suspect to me that I’ve ended up not feeling they shed much light on anything at all; and many (not all) exceptions are studies that have come out of the “randomista” movement. (Another particularly helpful source of evidence has been Millions Saved, which focused on global health.) Given this situation, I’m not excited about using “central planning” to make sure that researchers continue to try answering questions that they simply don’t have the methods to answer well. I’d rather see them stick to areas where they can be helpful.
What does it look like when we build knowledge only where we’re best at building knowledge, rather than building knowledge on the “most important problems?” A few thoughts jump to mind:
- Over the last several decades, I am not sure whether we’ve generated any useful and general knowledge about how to promote women’s empowerment and equality – from the outside – in developing-world countries. But we’ve generated a lot of knowledge about how to produce affordable, convenient birth control in a variety of forms. I would guess (though this is just a guess, as empowerment itself is so hard to measure) that the latter kind of knowledge generation has done much more for empowerment and equality than attempts to study empowerment/equality directly.
- Similarly, what has done more for political engagement in the U.S.: studying how to improve political engagement, or studying the technology that led to the development of the Internet, the World Wide Web, and ultimately to sites like Change.org (as well as new campaign methods)?
- More broadly, studying areas we’re good at studying and generating knowledge we’re good at generating has led to a lot of wealth generation and poverty reduction. I feel poverty reduction brings a lot of benefits that would be hard to bring about (or even fully understand) directly.
Bottom line – researching topics we’re good at researching can have a lot of benefits, some unexpected, some pertaining to problems we never expected such research to address. Researching topics we’re bad at researching doesn’t seem like a good idea no matter how important the topics are. Of course I’m in favor of developing new research methods so that research becomes good at what it is currently bad at, but I’m against applying current, problematic research methods to current projects just because they’re the best methods available.
If we focus evaluations on what can be evaluated well, is there a risk that we’ll also focus on executing programs that can be evaluated well? Yes and no.
- Some programs may be so obviously beneficial that they are good investments even without high-quality evaluations available; in these cases we should execute such programs and not evaluate them.
- But when it comes to programs where evaluation seems both necessary and infeasible, I think it’s fair to simply de-emphasize these sorts of programs, even if they might be helpful and even if they address important problems. This reflects my basic attitude toward aid as “supplementing people’s efforts to address their own problems” rather than “taking responsibility for every problem clients face, whether or not such problems are tractable to outside donors.” I think there are some problems that outside donors can be very helpful on and others that they’re not well suited to helping on; thus, “helping with the most important problem” and “helping as much as possible” are not at all the same to me.
It’s common in our sphere to warn against the “streetlight effect,” i.e., “looking for your keys where there’s light, rather than where the keys are most likely to be.” In the context of aid, this means executing – and studying – the programs that are easiest to evaluate rather than the programs that are most likely to do good. (Chris Blattman uses this analogy in the context of Dr. Ravallion’s post.)
But for the aid world, the right analogy would acknowledge that there are a lot of keys to be found, and a lot of unexplored territory both in and outside the light. In that context, the “streetlight effect” seems like a good thing to me.
The points raised herein seem very solid for the most part.
I would guess that the main point of contention would be that of whether current research methods in a given domain are problematic (and if so to what degree). It would be interesting to hear exactly what sorts of evaluations Dr. Ravallion has in mind and for what sorts of projects.
There’s also a related question of when programs are “so obviously beneficial that they are good investments even without high-quality evaluations available.”
If we presume that all maladies are more or less equal, or that we are working on the worst of them, then I wholeheartedly agree with this rationale. If two diseases kill lots of people prematurely and one of them is easily preventable, I have no problem focusing on the preventable one for the time being.
But what if the problem for which you don’t have a statistically tested solution is vastly larger than the one you can solve? Admittedly, few things manage to be an order of magnitude worse than malaria, AIDS, or extreme poverty (and since effective treatment is an order of magnitude better, that seems like a fair measurement, roughly speaking). But are we to ignore potential pandemics until the studies are complete and the disease is widespread? Are we content to deworm children in the midst of a genocide, or to ignore climate change because we’ve never faced a threat like it before?
The answer might be that these issues are the responsibility of governments, or that we look too narrowly or play the heart strings too much, as more die from disease than genocide, and nobody knows how to stop war.
However, those answers do not address the heart of the question. There are times when the streetlight effect has us ignore a trump-card issue because we don’t yet know what to do. Is that what we should have done with AIDS before there was treatment, before we knew how it was and wasn’t spread? Ebola? Can we ignore climate change because we can’t design a global controlled experiment?
When crop failures and famine are widespread and fisheries are lost to ocean acidification, I’m afraid we will wish we had done more on that front.
Eric, I think your reasoning is sound. I agree that there can be interventions that ought to be carried out because of their sheer importance, even when they can’t be evaluated – that’s what I was trying to get at in the line, “Some programs may be so obviously beneficial that they are good investments even without high-quality evaluations available; in these cases we should execute such programs and not evaluate them.”
However, the most salient issue to me at this time is that I agree with your statement that “few things manage to be an order of magnitude worse than malaria, AIDS, or extreme poverty.”
Also bear in mind that GiveWell focuses on individual donors, who are less well-positioned than major donors and governments to take big risks and impose “fuzzier” accountability.