We’re excited about the project of making giving more analytical, more intellectual, and overall more rational. At the same time, we have mixed feelings about the project of quantifying good accomplished: of converting the impacts of all gifts into “cost per life saved” or “cost per DALY” type figures that can then be directly compared to each other.
We believe that these two projects are too often confused. Many expect (or assume) GiveWell to make all its recommendations on the basis of “cost per unit of good” formulas, and it often seems that the rare people who want to be intellectual and rational about their giving are overly interested in explicit quantification.
It seems to us that attitudes toward giving are most often classified as “emotional/intuitive” or “rational/quantitative.” We think this framing is problematic on multiple fronts. We don’t think that “rational” should be opposed to “emotional,” and we also don’t think that the very real problems of “quantitative” should be used to tar all those who seek a “rational” approach.
We propose what we think is a better framing: distinguishing “passive,” “rational” and “quantified” decisionmaking. For each, we discuss both how this style of decisionmaking approaches a “normal” purchasing decision – buying a printer – and how it approaches giving.
Passive decisionmaking follows the path of least resistance, favoring options that are salient and/or easy to choose.
A passive approach to buying a printer might involve buying the first printer that comes up in a search, or the first affordable printer that one sees in a store, or the printer suggested by a friend.
A passive approach to giving might involve supporting a charity that calls on the phone, or giving to a charity that a friend is fundraising for. It might also involve reacting directly to short-term emotional incentives: for example, a charity that tells a compelling story can induce immediate guilt/cognitive dissonance prior to a donation, and/or immediate emotional reward after a donation. Supporting such a charity can often be “easier” (in a short-term sense) than not supporting it.
We believe this describes the way most people choose charities to support.
Rational decisionmaking involves an effort to make the best possible choice (according to the decisionmaker’s values) between all viable alternatives, given constraints on how much information is available and what resources are available to gather more information.
A rational approach to buying a printer might involve a process such as
- Searching for “printer” on Amazon and using Amazon ranking, user reviews, and price as initial filters (e.g., making a list of printers that are within the price range and have strong user reviews).
- Creating a document in which the different options can be viewed side by side in terms of a variety of metrics, such as printing speed, printing quality, desk footprint, etc.
- Googling “printer,” searching for rankings and editorials, and entering noteworthy praise or criticism into the document.
- Asking friends for comments on their experiences and entering those into the document as well, as warranted.
- Looking at all the collected information, narrowing the field further based on particularly important criteria, perhaps creating some limited aggregated indices for comparison, and making a decision that considers – but does not commit to a formula regarding – the available information.
- Talking to people about one’s decision and reasoning and seeing whether one has missed any important considerations.
Of course, the amount of effort that is put in depends on how important the decision is (as well as on how much time the decisionmaker has available). In the case of a printer, the above process might or might not be overkill.
In our view, a process broadly similar to this is appropriate for choosing a charity to give to (or a cause to invest in), especially when giving a relatively large amount. We seek to:
- Consider a wide range of possible options.
- Narrow the field in successive stages, using the best heuristics we can come up with.
- Use both internal and outward-facing discussion to identify key questions that might be both (a) important to our bottom line and (b) tractable to further investigation.
- Investigate such questions and write up what we find.
- Collect all of the information into one place and thoroughly discuss it (again, both internally and externally) before making a final decision.
This approach has clear differences with the “passive” approach, but the differences are not – in my view – about “heart vs. head.” In fact, if anything, I think the rational approach is more likely to be associated with strong emotion. People often try hardest to take a “rational” approach when making their most high-stakes, emotionally important decisions; the “passive” approach is more common for decisions that are considered inconsequential.
Quantified decisionmaking (as defined for the purposes of this post) involves committing to a universal metric for comparing all options and then formally quantifying options in terms of this metric.
A quantified approach to buying a printer might involve
- Developing a metric such as “Net present-value dollars gained” for judging printers.
- For each printer, estimating things such as
- The expected time and convenience gained by being able to print things from home, and the present value of this time and convenience in dollars – including scenarios in which the alternative would be to print from a copy shop, scenarios in which the alternative would be to ask friends to print, and scenarios in which the alternative would be to refrain from printing a document and instead read it on the computer or rely on one’s memory.
- The expected time lost to repairing or replacing the printer (a function of its reliability as estimated from reviews and friends’ comments), again converted to present-value dollars.
- The monetary value of the desk space taken up by the printer.
- Creating an estimate of each relevant parameter, sometimes informed by facts and sometimes mostly based on guesswork.
- Calculating the “net present-value dollars gained” for each contending printer, making adjustments that come up on sanity checks, and purchasing the printer with the highest final score.
In our view, this approach is analogous to giving based purely on expected DALYs averted per dollar spent.
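To make the quantified approach concrete, the printer calculation above could be sketched roughly as follows. Every number, parameter name and the discount rate below is a hypothetical placeholder rather than an estimate from this post; the sketch only shows that each input must be converted into dollars before options can be compared.

```python
from dataclasses import dataclass


@dataclass
class PrinterEstimate:
    """Guessed inputs for one printer (all values are illustrative assumptions)."""
    name: str
    price: float                     # upfront cost in dollars
    hours_saved_per_year: float      # time saved versus the alternatives
    hours_lost_per_year: float       # time lost to repairs or replacement
    desk_space_cost_per_year: float  # dollar value placed on desk space


def net_present_value(p, value_of_hour=20.0, years=5, discount_rate=0.05):
    """Discount each year's net benefit to today and subtract the price."""
    yearly = (p.hours_saved_per_year - p.hours_lost_per_year) * value_of_hour
    yearly -= p.desk_space_cost_per_year
    discounted = sum(yearly / (1 + discount_rate) ** t for t in range(1, years + 1))
    return discounted - p.price


def best_printer(options):
    """Pick the option with the highest score, as the quantified approach prescribes."""
    return max(options, key=net_present_value)


options = [
    PrinterEstimate("compact", price=120.0, hours_saved_per_year=10.0,
                    hours_lost_per_year=2.0, desk_space_cost_per_year=5.0),
    PrinterEstimate("office", price=300.0, hours_saved_per_year=12.0,
                    hours_lost_per_year=1.0, desk_space_cost_per_year=30.0),
]
for p in options:
    print(f"{p.name}: {net_present_value(p):.2f}")
print("chosen:", best_printer(options).name)
```

Changing a single guessed input, such as the dollar value of an hour, can reorder the options, which is the sensitivity discussed in this post.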
To be clear, we think this approach has some definite merits when applied to giving. We have found cost-effectiveness analysis to be highly valuable for the degree to which it brings implicit assumptions to the foreground. It often spurs debates that wouldn’t have happened otherwise, and it may provide clarity into our thinking that couldn’t be obtained in any other way. Perhaps it would accomplish similar things when applied to purchasing a printer. And this method is almost certain to be the best method when all of the relevant parameters can be estimated with high precision (this is not the case either with buying a printer or with giving to a charity, but may be the case for simpler and/or more technical questions).
The weakness of this approach, in my view, is that it takes an enormous amount of effort to do well, and even when done well generally involves so much guesswork and uncertainty that it’s questionable whether the results should influence one’s prior beliefs. Valid, high-certainty information that should shift one’s view (for example, “this printer takes up a lot of space”) can be lost in the noise of all the guesswork used to convert the information into a unified framework (for example, converting the space taken up into dollars gained).
When using a single unified equation, one mistake – or omitted parameter – can result in a completely wrong conclusion, even if much of the other analysis that was done is sound. The “rational” approach uses implicit model combination and adjustment, and is more likely to give a good answer even when not all of the inputs are reliable. It can also be more efficient in the sense of view-shifting information gained per person-hour spent.
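One way to see the fragility claim is a toy comparison, sketched here under loudly stated assumptions: the "unified equation" is modeled as a simple product of parameters, and the "rational" approach is crudely stood in for by the median of several independent rough estimates. Neither is a procedure described in this post (the model combination discussed here is implicit, not a formula), and all numbers are made up.

```python
from math import prod
from statistics import median


def unified_estimate(parameters):
    """Single-equation estimate: the answer is the product of every parameter."""
    return prod(parameters)


def combined_estimate(independent_estimates):
    """Crude stand-in for model combination: the median of separate estimates."""
    return median(independent_estimates)


# Hypothetical "true" parameters, and the same set with one parameter off by 100x.
true_params = [2.0, 5.0, 0.5, 10.0]
one_bad_param = [2.0, 5.0, 50.0, 10.0]

# Hypothetical independent estimates of the same quantity, one of them off by 100x.
good_views = [45.0, 50.0, 55.0, 60.0, 48.0]
one_bad_view = [45.0, 5000.0, 55.0, 60.0, 48.0]

print(unified_estimate(true_params), unified_estimate(one_bad_param))
print(combined_estimate(good_views), combined_estimate(one_bad_view))
```

In this toy setup the single error moves the product by the full factor of 100, while the median moves only from 50 to 55. The median is an illustration of robustness to one bad input; it says nothing about how well informal judgment actually combines evidence.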
GiveWell exists to promote rational giving as opposed to passive giving. It doesn’t necessarily seek to promote quantified giving.
When rational and quantified giving are too strongly associated, rational giving suffers from the association. There are many legitimate criticisms of quantified giving that do not apply to rational giving, and future posts will advance some of these.
The primary purpose of this post was to draw as clear a distinction as we could. We don’t want to see the important and exciting project of “rational giving” remain tied to the much more limited and less exciting project of “quantified giving.”
“Collect all of the information into one place and thoroughly discuss it (again, both internally and externally) before making a final decision.”
In the model you discuss here quantified data is fed to intuition, with intuition having the last word. But there are various empirical studies that suggest that this degrades accuracy and performance relative to feeding intuitive judgments into a quantified formula.
In the recent IARPA-sponsored prediction tournaments, aggregation methods deliver large improvements on considered judgment, according to Phil Tetlock:
“There is some truth to that in the IARPA tournament. That simple averaging of the individual forecasters helps. But you can take it further, you can go beyond individual averaging and you can move to more complex weighted averaging kinds of formulas…we’re aggregating individual forecasters in sneaky and mysterious ways. Computers are an important part of this story.
We don’t have geopolitical algorithms that we’re comparing our forecasters to, but we’re turning our forecasters into algorithms and those algorithms are outperforming the individual forecasters by substantial margins…
In our tournament, we’ve skimmed off the very best forecasters in the first year, the top two percent. We call them “super forecasters.” They’re working together in five teams of 12 each and they’re doing very impressive work. We’re experimentally manipulating their access to the algorithms as well. They get to see what the algorithms look like, as well as their own predictions. The question is–do they do better when they know what the algorithms are or do they do worse?
There are different schools of thought in psychology about this and I have some very respected colleagues who disagree with me on it. My initial hunch was that they might be able to do better. Some very respected colleagues believe that they’re probably going to do worse.”
What is your prediction on this question, in light of the ideas in the OP?
I agree with the main claims of the post and found the printer analogy illuminating. In one way, I fear that the printer analogy stacks the deck against the quantified perspective. The issue is that printers are something we have a lot of familiarity with and feedback about. There’s something about this that seems very important for making the “rational” approach more attractive than the quantified approach. And of course a core issue with estimating a charity’s impact is that there are less often people out there who can unproblematically know from experience how much their efforts are helping people, unlike the way in which people can relatively unproblematically know from experience how well their printer is working out.
One relevant point in response to Carl’s comment is that you could use structured evaluation in the “rational” approach, assigning scores along a number of dimensions to different giving opportunities and using some combination of scores to choose between them. I understand there is some relevant psychological literature that says this is supposed to be helpful, but I have not examined it personally and find the application somewhat suspicious. (I know that, according to this literature, people found the idea suspicious but did worse than the algorithmic prediction/decision rules.)
In addition to what Carl wrote above, which I think was very well put, I should say I have a feeling this post may be missing something about quantification advocates. It may be that rational does not mean quantified, but it is important to stress that quantified does not mean precise.
You say “Quantified decisionmaking (as defined for the purposes of this post) involves committing to a universal metric for comparing all options and then formally quantifying options in terms of this metric.” But that leaves out informal estimations as a tool for evaluating different courses. To stick to your printer example, it would surely not make much sense to have a large study conducted just to decide if a small firm should change its printer. But it would certainly help to ask: “how much money would we save, roughly?”, and then make back-of-the-envelope calculations.
Because many things in life differ in orders of magnitude, these back of the envelope calculations can be extremely valuable when compared to no estimates at all.
One calculation – which was certainly not back of the envelope but which is admittedly not very precise – illustrates this very well: Michael Clemens’ estimation that open borders would double world gdp. If any other cause could come close to this number, it would be extremely important to know that, even if estimations are rough.
But I also think that there should be more resources devoted to quantifying initiatives. I think your printer analogy does not apply to discussing development initiatives because of the difference in importance of the decisions. A printer is an inexpensive device, so it would not make any sense to invest a lot of time figuring out whether it should be changed or not. Development decisions, however, have enormous effects – they are arguably the most important decisions we make. But we do not devote nearly as many intellectual resources to taking these decisions as we devote to – say – calculating what would be the best restaurant for me to go to given my location and what my friends like on Facebook.
And this is the main reason I think quantification should be much more explored in this area: where profit opportunities exist, companies are applying the most sophisticated techniques available. In public goods, however, we are moving forward – this website is an amazing change – but not as fast as we could.
Maybe GiveWell does not have the resources to do that at this point. In that case, I would suggest you consider your own organization as one that has room for more funding.
“We have found cost-effectiveness analysis to be highly valuable for the degree to which it brings implicit assumptions to the foreground.” I think that’s at the heart of the matter. The way I’ve simplified this issue in my head is: Think rationally about stuff always, and only bother with quantified thinking when you strongly suspect there are some biases/assumptions at work in your own thinking.
The main reason I, and I suspect many effective altruists, like “$/QALY” or “$/life saved” estimates is that the concreteness of it is good for motivating ourselves and others to give. (On a personal level, it’s one thing to have a high-level goal to give money to the most cost-effective charities, and another to feel compelled to do it at any one moment, and a concrete picture of what your donation will achieve really helps with that latter part EVEN if you think this picture is just a visualization of the expected value. On a broader level, a “Let’s raise $2500 to save someone’s life from malaria!” campaign is more compelling than “Let’s raise $1000 to help save people from malaria!”, EVEN IF you caution that the $2500 figure is very shaky.)
P.S. Looks like this overemphasis on quantitative evaluation permeates the NGO world as a whole at the moment, not just our little effective altruism community: “evaluation academics reported that they get loads of NGO demand for numbers, but barely a flicker on process evaluation or realtime accompaniment” (http://www.oxfamblogs.org/fp2p/?p=15482)
Your citation describes using formulas to aggregate the judgments of many individuals, whereas the “quantified” approach I describe in the post is about how to aggregate an individual’s beliefs on different topics to reach a judgment about the matter of interest. I think these are very different things. This post is discussing the best way for an individual to generate a judgment given the information at hand, not the best way to aggregate individual judgments. I would be surprised if IARPA’s best forecasters were using methods analogous to the “quantified” approach described here.
More generally, I would guess that with sufficient “training” data, it is often (perhaps usually) possible to construct a formula that outperforms intuition, though not necessarily using the “estimate all relevant parameters” approach that I’ve identified as “quantified” above. But without such “training” data, and without a formula that has been calibrated, it’s not clear that our best-guess quantified formula is better than a “data fed to intuition” approach. If you have evidence that it may be, I’d appreciate a citation.
To answer your question, I’d guess – with low confidence – that access to aggregation algorithms would degrade performance for the general population but improve performance for “super forecasters,” especially if the “super forecasters” are strong on a wide variety of topics.
Nick, I don’t agree that the relative ease of evaluating one’s own printer is highly relevant to this analogy. If you imagine shopping for a printer ~20 years ago, before large numbers of first-hand opinions could be aggregated, I think the basic analogy would hold. You’d still want to consider the opinions of trusted friends who happened to have substantial experience with a printer, and you’d still want to collect other information on printer size, price, speed, functions, etc., and I think it remains fairly clear that consolidating this information and making a decision with it in hand would be better than the “quantified” approach, which would also be far less informed regarding reliability.
Tiago, I don’t identify “quantified” with “precise”; indeed, the high uncertainty around quantified estimates is (in my view) the reason they can be problematic. I do think “back of the envelope” calculations can be a helpful part of the process (as discussed above), but I don’t think they should be considered the only or primary tool for decisionmaking.
I strongly agree that we should be prepared to invest substantial resources in making maximally informed and intelligent decisions about where to give. The question this post discusses is how such resources should be invested and what intellectual frameworks should be used to synthesize information. As resources rise, precision may rise, and thus the relative merits of the quantified framework may increase. However, when it comes to giving, I think that for relatively large X as well as small, $X spent on investigation of giving options within the “quantified” framework will produce less of an improvement in decisionmaking than $X spent on investigation within the “rational” framework, because of the inherent limits to precision in many of the most important parameters.
Holly, I understand the appeal of quantified figures for motivation, and if using them this way causes people to give more, that’s a good outcome. But I think we would all be better off if we could acknowledge what these figures are and aren’t, intellectually, regardless of where we’re drawing our inspiration from emotionally. For my part, I find less quantified but more robust observations to be more motivating.
Holden, I think your printers from 20 years ago refinement still stacks the deck against the quantitative perspective. A more fair comparison might be something like quality of classroom education at your college. You have a hard time separating out the impact from other things going on and you can’t unproblematically know how much it helped you in the labor market, but you could get perspectives from a variety of people.
I also think some of the other points you make stack the deck against the quantified approach. Why can’t people using the quantified approach discuss their analysis with lots of people and find out where others disagree? Why can’t they do many of the things the other approach would do, and then use a quantified figure?
I think we’re largely on the same page; I’m just pushing back because I feel the quantified approach is being straw-manned rather than steel-manned. I’d be interested in whether you think you are trying to attack the strongest possible version of the quantified approach.
Nick, I don’t think of this post as a decisive refutation of the quantified approach. I wouldn’t use an analogy for that purpose, and don’t feel any analogy is up to the job. I’ve made tighter arguments against the quantified approach elsewhere and plan to make more in the future.
Rather, this post aims to make clear how we see the two approaches as differing, and it purposefully uses an example in which the quantified approach seems self-evidently inferior, in order to highlight the conceptual (i.e., possible) contrast between “quantified” and “rational” for those who tend to conflate the two.
With that said,
Following up Nick’s first comment:
The book ‘how to measure anything’ has a summary of when structured evaluation (i.e. defining factors, weighting them and assigning scores) is and isn’t better than expert judgement. This could be a way to improve the rational approach as explained here.
I also wonder, why doesn’t GiveWell use a more structured evaluation process? You could score charities on ‘proven’, ‘cost-effective’, ‘scalable’ etc. then make a model combining these scores. This could be especially good for the initial filtering (perhaps with easier to evaluate factors), and it would make the process even more transparent. You could then do deep-dives to see if the model gives sensible answers.
Returning to the post:
How would a detailed structured evaluation process fit into the framework? For instance, you could define a bunch of factors (which may include quantified factors like $/QALY but also other more qualitative ones), assign numbers to each, then turn them into a combined score?
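A minimal sketch of such a combined score follows. The factor names, weights and 1-10 scores are hypothetical, chosen only for illustration; they are not GiveWell's ratings or criteria weights.

```python
def combined_score(scores, weights):
    """Weighted average of factor scores; weights need not sum to 1."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(scores[f] * w for f, w in weights.items()) / total_weight


# Hypothetical weights and scores on a 1-10 scale.
weights = {"proven": 0.4, "cost_effective": 0.4, "scalable": 0.2}
charities = {
    "Charity A": {"proven": 9, "cost_effective": 6, "scalable": 5},
    "Charity B": {"proven": 4, "cost_effective": 9, "scalable": 8},
}

ranked = sorted(charities, key=lambda c: combined_score(charities[c], weights),
                reverse=True)
for name in ranked:
    print(name, round(combined_score(charities[name], weights), 2))
```

The ranking here depends entirely on the chosen weights: shifting weight from "proven" to "cost_effective" would put Charity B first.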
Expanding on Nick’s comment about straw manning:
I think the quantified decision making process outlined in the first section of ‘how to measure anything’, as summarized by Luke [here](http://lesswrong.com/lw/i8n/how_to_measure_anything/) is one of the best I have seen. I’d recommend taking it as the steel man to take on if you really want to argue against the quantified approach (rather than a common misinterpretation of it).
Ben, I think the “quantified” approach is importantly different from the broader idea of “structured decisionmaking” and the idea in the post you link to. Our goal is not to take on a straw man or a steel man, but to discuss a specific approach that we often see promoted as – and confused with – “rational” giving. This approach is one that tries to base all decisions on explicitly quantified “good accomplished”; it represents a particular subcategory of structured evaluation, and a deeply problematic one in our view.
Regarding why we don’t impose more structure on our own evaluations: we have rated our top charities on our major criteria (see the table at this post). One issue is that many of the scores are “multidimensional” in a way that would make them hard to fit strictly into e.g. a 1-10 rating system. More importantly, in my view, we don’t have an independent way of assessing how the different dimensions should be weighted and combined. If we were able to try different weightings, see which lead to the best general track records, and then use those weightings over time even when they contradict our specific intuitions, we would. But with the low sample size we have, there’s no reasonable yardstick I can think of for a weighting scheme except the “circular” yardstick of whether the conclusions match our judgments.
Do you think there is room for more passive EA giving solutions, for people who agree with the cause in general, but cannot, or wish not to, engage with it? I know many people like that.
Do you think EA should adopt tactics other charities, and even commercial companies, employ in their marketing?
Uri, I think it’s possible that an EA could accomplish good by targeting “passive” givers. (That isn’t something we’re planning, though.)
There seems to me to be a large overlap between this rational-vs-quantified issue and the other topic that seems most starkly to divide the EA community, namely the relative value of “high uncertainty but potentially massive value” projects vs. “low uncertainty but merely high value” projects. I would like to ask for people’s views on my own way of addressing these questions, which I have not seen/heard elsewhere.
The essence of my approach is to appeal to quantification-influenced intuition from the opposite direction than is typical. What I think is usually done is to estimate the cost of saving a life or adding a QALY in a given way versus another way and then, in the case where the uncertainty of that cost is higher for the option whose expected benefit per dollar is higher, to use one’s intuition to decide whether the difference in expected benefit is enough to allow one to put the uncertainty out of one’s mind. This works in an insidious way against the high-value high-uncertainty cause, in that the more one tries to be rational, the more one fixates on the lack of quantification of the high uncertainty and the more one overstates/overestimates it. What I prefer is to focus explicitly on quantifying the uncertainty as well as the value – and, critically, to give intuition the last word there too. What this amounts to is asking oneself how plausible it really is that the higher-value option really has only the same value as one thinks the lower-value one has.
This is the sort of thinking that has led me to be so sure that aging is the world’s most important cause (with the possible exception of those relating to existential risks). I’ve seen estimates here that the cost of saving a life by buying malaria nets is a couple of thousand dollars. My current estimate of the cost of saving a life by contributing to anti-aging research and thereby hastening the medical defeat of aging is a couple of dollars – yes really, 1000 times less. And that’s even without taking into account the fact that defeating one person’s aging gives them far more expected QALYs than stopping them from getting malaria. So the question has to be: how likely is it that I am 1000-fold overoptimistic about the prospects for anti-aging research?
Perhaps I should sketch how I reached that “couple of dollars” number. Basically I start by estimating how much money could usefully be spent over the next decade on anti-aging research, how many years the defeat of aging would be likely to be hastened by that investment, and how many people would benefit. (I specify “the next decade” because that’s the sort of timeframe in which I think there will be sufficient progress that society in general starts to view the uncertainty of success as much lower, such that there is no need to perform this sort of analysis.) These numbers are, respectively: half a billion dollars, five years, and 200 million people (the number of people who die every five years from age-related causes). That means the cost per life saved is $2.50.
So, applying my method above, the question that I feel an anti-aging-research-donation skeptic needs to ask themselves is whether they can really convince themselves that half a billion dollars would actually bring the defeat of aging forward by only two days (1/1000 of five years).
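The arithmetic behind these figures can be checked with a short sketch. The spend, years and lives are the numbers stated above; the $2,500 malaria-net figure is an assumed stand-in for "a couple of thousand dollars", and 365.25 days per year is a convention chosen here.

```python
# Inputs stated in the comment above.
research_spend = 500_000_000  # dollars over the next decade
years_hastened = 5            # years by which the defeat of aging is brought forward
lives_saved = 200_000_000     # deaths from age-related causes over five years

cost_per_life = research_spend / lives_saved
print(f"cost per life saved: ${cost_per_life:.2f}")

# Assumed stand-in for "a couple of thousand dollars" per life via malaria nets.
malaria_cost_per_life = 2_500
ratio = malaria_cost_per_life / cost_per_life
print(f"ratio of costs: {ratio:.0f}x")

# How little the research would need to hasten progress for the two costs to match.
break_even_days = years_hastened * 365.25 / ratio
print(f"break-even hastening: {break_even_days:.1f} days")
```

The break-even step assumes that lives saved scale linearly with the time by which the defeat of aging is hastened, which is implicit in the "two days" framing.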
Aubrey, I don’t agree with this approach to estimation. My reasons for disagreeing are mostly laid out in this post and this post. In a nutshell: estimates that are sufficiently (a) rough/uncertain (b) optimistic should cause very little in the way of updates to our prior. The more optimistic the rough estimate, the more your way of framing things will make it look like a “large” adjustment is needed to reject it; but in fact, I believe that the proper approach penalizes rough estimates for getting more optimistic past a certain point, rather than rewarding them.
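One standard way to illustrate the first part of this point is a normal-normal Bayesian update, in which the posterior mean is a precision-weighted average of the prior and the estimate. This is a sketch with assumed numbers, not a reproduction of the analysis in the linked posts. It shows only that a rough estimate moves the prior very little; the stronger claim about penalizing greater optimism needs further assumptions that are not modeled here.

```python
def posterior_mean(prior_mean, prior_sd, estimate, estimate_sd):
    """Posterior mean for a normal prior and a normally distributed noisy estimate."""
    prior_precision = 1.0 / prior_sd ** 2
    estimate_precision = 1.0 / estimate_sd ** 2
    weighted = prior_mean * prior_precision + estimate * estimate_precision
    return weighted / (prior_precision + estimate_precision)


# Hypothetical units of "good per dollar": a prior centered on 1 with sd 1,
# and an optimistic estimate of 100 from a rough calculation.
prior_mean, prior_sd, estimate = 1.0, 1.0, 100.0

for estimate_sd in (1.0, 10.0, 100.0):
    updated = posterior_mean(prior_mean, prior_sd, estimate, estimate_sd)
    print(f"estimate sd {estimate_sd:>5}: posterior mean {updated:.2f}")
```

With these made-up inputs, an estimate as precise as the prior pulls the posterior halfway to 100, while an estimate with a standard deviation of 100 leaves it close to the prior of 1.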
In the case of your specific argument, I think it’s worth noting that the NIH has spent billions per year on cancer for several decades (though the link only shows the past ~10 years), and we could still be a very long way off from eliminating the burden of cancer. Therefore, it’s not hard to imagine that $500 million spent on aging research could make only a very small contribution toward the goal you lay out. Of course, it’s also possible that $500 million could make a large difference; what I think is problematic is arguing from ignorance/uncertainty/guesswork (i.e., taking our best guess at how much a large amount of money will speed progress on a scientific goal we know so little about) to high confidence that this cause is outstanding. Laying out why that is problematic is something I have tried to do in the past (as in the two posts linked above) and may attempt more of in the future.