The GiveWell Blog

Transparency, measurement, humility

Transparency is the one thing about GiveWell that everyone seems to like. Our focus on measurement is much more contested. I believe that the connection is tight, though, because both are necessary consequences of humility, which is probably the last word you ever thought I’d ask to be associated with.

Transparency is a really big deal to us because we believe that no matter how much we learn and no matter how hard we work, we can always be wrong. That’s why we invite as many people as possible into the conversation.

(When I look at large foundations making multimillion-dollar decisions while keeping their data and reasoning “confidential” – all I see is a gigantic pile of the most unbelievably mind-blowing arrogance of all time. I’m serious. Deciding where to give is too hard and too complex – with all the judgment calls and all the different kinds of thinking it involves, there is just no way Bill Gates wouldn’t benefit from having more outside perspectives. I don’t care how smart he is.)

Measurement is about inviting someone else into the conversation: The Facts. The Facts have a lot to say, and they often contradict what we would have thought. That’s why we have to listen to them. Like transparency, measurement takes a lot of extra effort and expense; like transparency, it can’t solve all your problems by itself; and like transparency, it’s easily worth it if you agree that the issues are extremely complex, and that no matter how much sense something makes in your head, The Facts might disagree.

(And yes, when I hear people talking about how they’re “too busy helping people” to measure, I hear arrogance there too. The only way you could ever take that position is if you are so sure of what you’re doing that you think learning more about it isn’t worth the 10% more you can spend doing the same untested thing. How high on your own infallibility do you have to be to come to that conclusion?)

To a lot of people, humility means speaking with a certain tone of voice, or just plain keeping your mouth shut. If that’s what you think humility is, we don’t have it. To us, humility is constantly saying “The things that make sense to me could be wrong – that’s why I’m going to do everything I can to test them, against others’ ideas and against reality.” Instead of being silently dissatisfied with charity, we’re loudly dissatisfied, so that anyone who disagrees can respond. Instead of happily assuming our dollars are doing good, we demand to see The Facts, so they can respond too.

Laugh if you must, but in the end humility is the defining value of the GiveWell project. We’re not here to impose our solutions on the sector; we’re here because we want to see more questions.

Comments

  • Carl Shulman on December 27, 2007 at 3:54 pm said:

    “large foundations making multimillion-dollar decisions while keeping their data and reasoning “confidential””
    I was quite pleased to see that you had recorded your board meeting, and am very curious about the motivations for that secrecy. It could be as simple as board members not wanting to risk looking stupid or ignorant on camera.

    Also, this post reminds me of an old Overcoming Bias discussion:
    http://www.overcomingbias.com/2006/12/the_proper_use_.html

  • You may not be “imposing solutions,” but, as far as I can tell, you do criticize organizations for not providing the measurements you expect to see. I would like to see a section of your website devoted to the question, What are “hard data” in social services? I am not a social scientist, and I would venture to guess that most of us in this field aren’t either. How does one reconcile the need to provide proof of program success to each of our various funders, when our “subjects” are very complex human beings with very complex — and private — issues? Often, our only indicators of success are those few clients who come back years later to tell us how their lives changed for the better. Many people are helped, and many are not, and it is impossible to prove why. We go by our direct experience with person after person every day. We see what works and what doesn’t. I’m not saying we shouldn’t try to emulate successful programs, but my clients are different from your clients with different needs and circumstances. So comparing organizations’ effectiveness is based on what, exactly? People are not transparent nor trackable, so, really, the question is HOW do you measure them? What ARE the “facts”?

  • Good questions. I think the best place to start answering them is the Poverty Action Lab. I find abstract arguments over measurement methodology to be less illuminating than these guys, who give concrete example after concrete example of how well-designed studies can shed a heck of a lot of light on what works.

    Like any other participant in dialogue, The Facts don’t have all the answers, and they can paint a very complicated picture. I agree with you that there is no such thing as the “definitive” evaluation. But the kind of stuff PAL does is darn good, and a lot better (in my opinion) than what we see from most charities.

  • If you want a more generic picture, I laid out my basic framework for rigorous evaluation in this blog post. Comments on the post convinced me that I had underestimated (in that post) how difficult and expensive this sort of evaluation is likely to be … but I still think it’s often worth it (obviously depends on the specific case).

  • Jeffrey B. on December 27, 2007 at 7:48 pm said:

    Holden! Maybe no one will say this to your face, but…it’s not exactly that you lack humility; you are simply extraordinarily naive.

    “there is just no way Bill Gates wouldn’t benefit from having more outside perspectives. I don’t care how smart he is.”

    How can you possibly think Gates does *not* bring in outside perspectives? He has advisory board on top of advisory board; he has staff who talk with people, study, and read, and who make the decisions with him (he doesn’t just make these decisions all by himself!); he attends talks and has conversations with scads of people, reads books, travels, attends conferences, and has consulted with probably thousands of people during the course of his life. Just like you plan to do! He talks frequently about why he gives money to the causes he supports. The programs themselves are planned and conducted by people who devote their whole lives to learning about the problems they attempt to ameliorate.

    “Measurement is about inviting someone else into the conversation: The Facts. ”

    Honestly! Do you really think you are the first person to have thought of this? Your naivety is stunning. You are lecturing here as if to children. It appears you haven’t yet caught on that that’s why people call you arrogant. It’s not that your ideas are not good, or your intentions either.

    Your depiction of the people who disagree with your approach on the basis that it is more difficult than you seem to realize, by portraying them as simply saying they don’t have time to measure, is a straw man. Of course it’s worth it to evaluate! Everybody would love to evaluate! Everybody would love it if money grew on trees too! It’s as if you are entering the party halfway through and can’t seem to comprehend that anybody came before you.

  • Caledonian on December 27, 2007 at 9:57 pm said:

    I’m sure Bill Gates could get lots of valuable advice by opening up and inviting input.

    He would also be inundated by a tsunami of scam, flim-flam, fraud, ignorance, desperation, good intentions, poor judgement, and pure greed. Wealth attracts economic predators the way a candle attracts moths – except it’s the charitable wealthy that are likely to be burned.

    Sorting through all the dross to find and identify the gold isn’t particularly likely to be worthwhile – at least I wouldn’t care to make the effort, if I were Gates.

    Are you sure it’s arrogance? Maybe it’s simple pragmatism. Gates may not be the most trustworthy advisor for Gates, but he’s probably a whole lot better than the hordes of advisor-wanna-bes will be.

  • This past July, the Bill & Melinda Gates Foundation hired Christopher Murray, a world-renowned expert in global health M&E, to begin the Institute for Health Metrics and Evaluation in Seattle. The purpose of the Institute is to help the Foundation allocate health dollars more effectively through rigorous evaluation. The Institute is also creating a global data bank so that academic work can be replicated by others. Leading public health journals are slowly embracing the idea that with publication comes responsibility to make source code and data available free of charge. Conclusions drawn from ex post evaluations of public health programs too often reflect the prior beliefs of the authors. The analyst runs a large number of statistical models and publishes the results from a single specification. Readers are not informed of the process that the authors underwent to elevate one model over the rest or whether the authors used an objective metric with which to select the model. Placing data and code in the public domain will enforce accountability by allowing others to analyze the results as well as test and defend alternate specifications.

  • On Bill Gates: OK, I overstated this. You’re right, Gates has a lot of ways of getting exposure to outside perspectives without going all GiveWell.net, and he was a particularly bad example because I believe he is trying harder to do this than most foundations are.

    I do think he has been failing (perhaps I’ll post about this later – this is based on the materials that are available on the Gates website) and that with his resources, he has both the ability to root through the trash and a large benefit to be gained from it. But Andrew’s comment implies that he is starting to do exactly this anyway – this kind of transparency (and recognition of the problems with refusing to publicly publish data) is exactly what I am advocating. It could easily lead to my ceasing to lump them in with other non-transparent foundations, in the near future.

    Foundation opacity is frustrating because of the lost opportunity to help individual donors, not just because of the lost opportunity to get feedback. They both matter, but I overstated the importance of the latter in this post.

    Jeff B: first off, “we’re too busy to measure” is not a straw man. It factually isn’t. If you insist, I will go through my emails and pull out some ridiculous number of quotes along these lines, and probably find a bunch on the discussions and websites around here too. I recognize that charities may literally not have the funds to do evaluation (I have consistently leaned toward blaming funders rather than charities for this problem); funders are most definitely choosing not to do it, while the Poverty Action Lab is showing quite clearly that it can be done.

    You are very far from the first person to call me naive to my face. If you think “It’s been said a million times before” is any kind of legitimate refutation (as you seem to), consider yourself refuted.

    I get extraordinarily annoyed with the repeated “You are naive, you should know that these issues have been raised before” “argument,” not because it is confrontational – or because it bothers me to be considered annoying/arrogant/whatever – but because it’s completely vague and unhelpful. I’m not going to take your word that what we’re trying to do is futile. The world has a long history of making great changes that seemed obvious in retrospect, and pushing for these ones seems worthwhile both to me and to the growing group of GiveWell supporters. If we’re overlooking something, I’d think this would lead to a substantive criticism of the project; I would much prefer to stick to those.

  • michael vassar on December 28, 2007 at 12:11 pm said:

    Jeff B: Eventually one notices that most of the people you are talking to are essentially little children. At that point one ends up lecturing as if to little children.
    Exhibit a) http://www.nickbostrom.com/ethics/human-enhancement.pdf

  • Jeffrey B. on December 28, 2007 at 2:23 pm said:

    Holden: When people say to you “we’re too busy to measure,” they mean they are too busy because they don’t have enough people to run their program and evaluate at the same time, which is a money issue. In your essay you accused them of arrogance, and implied that they weren’t interested in finding out if their programs work, and that they weren’t sophisticated enough to understand the importance of “Facts.” This is a naive assumption.

    I didn’t say that what you are trying to do is futile. In fact I specifically said “It’s not that your ideas are not good, or your intentions either.” You said “The world has a long history of making great changes that seemed obvious in retrospect, and pushing for these ones seems worthwhile both to me and to the growing group of GiveWell supporters. If we’re overlooking something, I’d think this would lead to a substantive criticism of the project; I would much prefer to stick to those.” In fact, what many are saying is that it is precisely *not* futile, and that many others have been working on this issue of how to know what works, and have progressed much further with it than you seem to realize. Your assumption that you’ve discovered this anew keeps you from understanding this.

    I just think you will be listened to more if you didn’t lecture as if you are speaking to children, and if you slow down a bit in your criticisms of an entire field of endeavor that’s existed for a long time. The mature thing to do would be to get up to speed before you insult people’s incompetence. Is that substantive enough?

  • In Holden’s defense, efforts to engage a wide audience in “evaluations science” are extremely timely. While efforts are underway in academic institutions and debates are raging in journals of public health and political science regarding the best approaches to evaluation, these forums are dominated by experts and are largely inaccessible to the public. What Holden and others are doing is to extend the debate beyond experts who are trained in a particular discipline and jargon, who think in terms of statistical life and death, and whose work rests on black box metrics and assumptions. I don’t blame GiveWell for not being entirely up to date with the latest and greatest from academia since a lot of this work is occurring in venues to which we were not invited.

  • Rob Klein on December 28, 2007 at 3:11 pm said:

    How do you not put up a picture of a bear with this post?

  • Holden – some of the arguments about arrogance may be compounded because you are late in your decision-making. $40K is a lot of money to a charity such as mine; like a lot of non-profits, we hold “spots” in our operating budget each fiscal year for potential funders. I feel the arguments on this blog have become increasingly abstract; there are real young people at the bottom of what we do who are really affected by funds received/not received. I’m not trying to make anyone feel bad; just want to bring the arguments back down to earth.

    For the amount of work you have required, keeping to deadlines should be absolutely sacred. Our non-profit is excited to embark on this experiment with you; I applaud your arguments for more transparency and accountability in the funding arena. That said – the phenomenal overhead of larger foundations also allows the process from initial contact to RFP to submission to decision to run more smoothly and on time. Being on time shows respect for those you are potentially funding. Is it possible to have more explanation of why, for example, the review process for Saving Lives and Employment Assistance took longer than you expected? Was it the sheer amount of data; additional research you needed to conduct on microfinance and skills training; all of these or something else? That would be a transparent response, and constructive for us as well. Unless I’m missing something, I found more generalities than specifics on your blog and site in this area.

    Another question: Will there be a refutation period after the final decisions are made? There is always the possibility for misunderstanding of data or intent. Also, why not conduct more site visits, at least in NYC? As the Director of Research, I would love to sit around crunching numbers in my office all day. However, meeting our students and staff lends both urgency and context to my work. Making the trek from Brooklyn to Harlem may not be your first choice of how to spend an afternoon, but it would be some proof that you are invested in the on the ground reality of charities as well as the concept (or, is in-person dialogue less important as a metric?).

    Finally – it’s hard to get away from arbitrariness in funding. For example, you could have made cut-offs in education based on charities who employed randomized studies alone; or those who used value-added measurement of test scores; or an equal balance between qualitative and quantitative approaches. The entire concept of cohort studies in education is fairly novel in my experience in practice and pretty intimidating to a lot of people. I wonder how many extremely effective and very rigorous education charities you missed just because they hadn’t made that cognitive leap. Along those lines…how will you be systematically collecting reflective data from the charities themselves? Questions you might ask include their impressions of the smoothness of the process; the balance of the work required vs. the potential outcome; and thoughts about transparency and the potential ramifications of posting all submitted data online…

  • Wally G on December 28, 2007 at 4:27 pm said:

    I, for one, applaud all three values: transparency, measurement (or validation) and humility, as well as GiveWell’s steadfastness to integrate them in its business practice. Further, I agree that the future of philanthropy will be changed for the better by the prevalence of these values and the ongoing dialogue on what they mean to the growing continuum of funders and grantees.

    GiveWell, I encourage you to review The California Wellness Foundation’s (TCWF) white paper on their experience with evaluation (www.tcwf.org/pub_lessons/ezine6/index.htm).

    Years ago, TCWF required all funded grantees to include 10% of their budget for evaluation, probably for many of the same reasons GiveWell (and any other progressive funder) values measurement. Tom David (now at http://www.caseyfunds.org) was the brain behind this effort. In my opinion, TCWF embraced measurement but not transparency or humility.

  • Carl Shulman on December 28, 2007 at 5:26 pm said:

    “Also, why not conduct more site visits, at least in NYC? As the Director of Research, I would love to sit around crunching numbers in my office all day. However, meeting our students and staff lends both urgency and context to my work. Making the trek from Brooklyn to Harlem may not be your first choice of how to spend an afternoon, but it would be some proof that you are invested in the on the ground reality of charities as well as the concept (or, is in-person dialogue less important as a metric?).”

    Heidi,

    I’m curious about how the visit is supposed to affect the decision. Any charity can show off cute students and put its best face forward in a site visit, and emotional influences on evaluators shouldn’t shape grant decisions. Wal-Mart forbids its employees from receiving gifts of any kind, even coffee, to ensure that they make the best purchasing decisions rather than the ones supported by blandishments.

    If we’re talking about in-person dialogue with staff, why can’t the statements in question be written down in a submission? Written communication can be made transparent for all the donors using GiveWell as a resource, while oral communications cannot (without using an audio tape and cumbersome transcription). There is time for repeated exchanges and refutation/comments from charities or others during the analysis process.

    I recall an earlier discussion of the matter on this blog:
    http://blog.givewell.org/?p=193

  • Peter Burgess on December 29, 2007 at 1:28 am said:

    Dear Colleagues

    Transparency … Measurement … Humility

    There is a lot of talk about transparency in the international relief and development sector … World Bank, UN, USAID, NGOs, Government Agencies, etc., etc. … but in fact most of these organizations keep their information very much to themselves, and merely make accessible and publish a tiny subset of information that reflects favorably on them.

    Measurement in the relief and development sector is, in my view, unacceptably poor. There are four key metrics: cost, activity done, result of activity, benefit from activity. Taken together, they are very interesting.

    In my view measurement is an integral part of management … and many of the needed metrics come out of a decent accounting system. Why in heaven’s name is the accounting so poor, when good accounting is so simple and so critical to having control of any resource-using operation?

    Humility … arrogance … whatever! Why has this come up to divert you all from the critical challenge of getting substantial improvement in transparency and the critical metrics? There is a need for results … which are long overdue.

    Sincerely

    Peter Burgess
    The Tr-Ac-Net Organization

  • Jeff B: yes, I now understand the substance of your criticism – you think that I should speak as though the things I want to happen are already happening, even though I see no direct evidence of it, because this will show more respect and make people listen to me more. My problems with this are:

    • Again, you’re factually wrong that I’m arguing with a straw man. Yes, some people say they’d like to measure and just don’t have enough money. But others raise objections along the lines of “If we measured, that would mean less spent helping people,” which is directly along the lines of the “straw man” I am attacking in this post. It is a very common objection that I will document if you insist. I’m actually pretty surprised at how sure you are of what all the people I talk to “mean” when they talk to me. That seems like a tough assumption to be sure of (and it is in fact wrong).
    • Just because something makes sense doesn’t mean it’s happening. Just because you say something is happening doesn’t mean it’s happening. As the end of this post states, my preferred approach is to state things exactly as they appear to me, and let people prove me wrong if I’m in fact wrong. I don’t want to give foundations credit for measurement and transparency until I see some reason to believe they are practicing these things, other than your logic that they must be because they’re smart and experienced and funded.

    I think I’ve taken quite a bit of effort to get up to speed. I’ve read loads of material telling me that foundations are already doing the things I advocate, but have seen nothing to show me this (and in fact I’ve seen direct evidence to the contrary). As for being listened to, the attention we’ve gotten has been inversely proportional to our diplomacy – we started off assuming the best of everyone, but no one would give us the time of day. We describe exactly the problems we see – not exaggerating but also not holding anything back, or giving the benefit of the doubt to those who don’t share their knowledge.

  • Heidi:

    I completely agree with you about the importance of being on time. And for your cause, we weren’t. That sucks. We messed up. I said as much, and apologized, in my email to you. As for why this happened, I’m not sure there is a lot to say: this is our first experience ever with evaluation, so we had absolutely no basis on which to form time projections. Given that, the fact that getting to the bottom of things takes about twice as long as we thought is really not much of a surprise at all. It’s probably this (the fact that we really had no idea how long things would take) that I should have thought about more and been more straightforward about earlier.

    I told applicants decisions would be made “by the end of the calendar year” because at the pace we were moving, it seemed like we’d easily be done by then. But our pace slowed late and sharply. The big factors were:

    • Generally, we had initially expected to see a lot more prepackaged/ready-made answers to our questions; in contrast to the way our process is often described, we thought they were pretty simple, basic, standard questions that should have been asked many times before. And most of our finalists assured us that they could easily answer them; it was only on looking closely at the materials that the holes in what we had access to started becoming apparent.
    • The Uncle Bob dynamic described in this blog post. Basically, every time we wrote down a central claim with documentation, we realized just how thin the documentation was, or how easily we could have misinterpreted what we read in the materials, and had to go back and find out more. And yes, this generally led to our doing a lot more of our own work and independent analysis than we had anticipated.

    We also thought it wasn’t a huge deal because we were offering relatively small grants to relatively large organizations, and had been extremely clear with all of them that they were competing with many other organizations (and thus shouldn’t be holding a spot in their budget for our funds, for the moment). If we weren’t clear enough in this, or if there is a faulty assumption in there too, then that’s another rookie mistake. We’re rookies – no one’s hiding that.

    In sum, we had no way of estimating how long things would take, we didn’t give ourselves enough leeway because we thought we were moving at a good enough pace (and didn’t fall off the pace until close to the end), we may have underestimated our importance to the organizations we were dealing with, and the result is that we communicated badly with you. These are all problems, we apologize, and we are learning from them and should not have them again next year.

    On your other points:

    • Will there be a “refutation period”? Our practice has been to make our full set of reviews available only to our finalists, and collect feedback from them – giving them a week or more – before making the reviews public. This has led to changes in our reviews and rankings. Our minds are also open to any new information we receive with enough time to process it before the Board meeting. That said, we are on a “give and learn” model – we prefer to make our gifts based on limited information and keep learning, rather than wait until “all” the facts are in (they never will be).
    • Will we conduct more site visits? I hope so. As I write in this post, I think we’ve underestimated the value of site visits and the role they can play in our process. Part of me agrees with Carl that they are of limited and often overstated value, but they are a good way to see what an organization thinks is important about itself. Next year, I think we will do site visits to finalists before further data collection. This time around, in the interest of time, I think we are going to stick to the original plan and conduct site visits only to the 2-3 finalists we ultimately decide are the strongest candidates for grants.
    • Is our focus on rigorous academic-style studies overly rigid? We never intended to impose a particular methodology. Our application asks broadly for evidence of effectiveness, and our minds are open to anything persuasive. But to date, I have seen no way to check your activities against reality, in a cause like improving educational outcomes, except by some sort of comparison-group study that controls in some way for selection bias. Charities that haven’t made this “cognitive leap” won’t be able to check their activities against The Facts (unless they’re doing something else that I haven’t thought of, and that I’d be open to looking at). That’s a real problem for the charity’s ability to learn, and for our ability to recommend it to strangers, and I don’t feel bad about excluding such charities.
    • Will we collect feedback from participants? We’ve been thinking that discussion.clearfund.org is the place for this – it lets everyone speak freely with us, with or without disclosing their identity. We have reminded all applicants to share their thoughts on this forum several times. That said, perhaps we would get better response from a more formal/time-limited invitation to give feedback on specific questions. I’m open to this. Do you think we should do it?

    Thanks for your thoughts. I really appreciate honest feedback from an applicant.

  • Jeffrey B. on December 29, 2007 at 11:54 am said:

    The New York Times will publish anything if they get to use the words “hedge fund.” Good luck to you.

  • Peter Burgess on December 29, 2007 at 9:31 pm said:

    Dear Colleagues

    I do not pretend to know very much about the not-for-profit operations going on in the USA, but I have considerable experience of the relief and development operations around the world.

    I am absolutely thrilled that GiveWell is trying to get a handle on performance, and is trying to find information about how useful their activities are. GiveWell deserves all the help it can get.

    There will not be much easy transparency. There is a critical mass of very poor performance that absolutely must remain hidden in order for the donors and the public to continue flowing funds. It is important to break this information loose!

    There are also some amazing activities that deserve funding, but are unlikely to get funding simply because they are not “plugged in” to any system that will fund them. These organizations need to be on the funding radar.

    I was struck by the comment: “As for being listened to, the attention we’ve gotten has been inversely proportional to our diplomacy – we started off assuming the best of everyone, but no one would give us the time of day.”

    Diplomatic or aggressive … at the end of the day, I don’t expect transparency of any duration or of any substance unless there is leverage … especially leverage that gets donor attention. The media needs material, and anyone interested in having any influence over the not for profit community, especially the larger organizations, needs to have a good feed into the mainstream media.

    In the malaria area, the only metrics of any substance are those that relate to the number of bednets distributed, and the “coverage” and other rather simplistic measures. The cost information tends to be unclear, and the result in terms of the reduction in the burden of malaria in the society does not seem to exist. Cost effectiveness metrics really don’t exist … and to the extent that they do, they are based on such small samples as to be hopelessly inaccurate. Tr-Ac-Net is working on this with the Integrated Malaria Management Consortium out of the University of Alabama, Birmingham.

    Wishing you well … and wondering how best to help.

    Sincerely

    Peter Burgess
    The Tr-Ac-Net Organization

  • Alex Reynolds on December 31, 2007 at 2:38 pm said:

    The actions of GiveWell’s founders on Metafilter reflect badly on your group’s lip service to transparency and humility. It may do well for you to ask questions about whether your astroturfing benefits your mission in the long-term.

  • M Bitsko on January 2, 2008 at 3:16 am said:

    “People can get away with some incredible things as soon as they say that what they’re doing is “for charity”.”

    Holden Karnofsky
    29 May 2007
    http://blog.givewell.org/?p=88

    But one incredible thing you cannot get away with, Holden, is fraud.

    http://metatalk.metafilter.com/15547/GiveWell-or-Give-em-Hel

    M Bitsko
    Fall River, MS

  • Alberto on February 19, 2008 at 4:39 am said:

    I’m working as a project manager at an Italian corporate foundation (Umana Mente).
    Thanks for your work, but my opinion is very different: transparency is not enough to know if a charity works well. My foundation believes in creating TRUST, which means more RELATIONS between us and charity organizations, which means spending a lot of TIME. It’s the only way to create a real partnership.
    Alberto
