# Evaluating charter schools

Note: a positive effect of recent events was a set of substantive concerns about our model raised by non-profit insiders and others. We put all our reasoning and assumptions on our website precisely because we value critiques that will help us improve, and we look forward to responding to and discussing all concerns shortly. This post, however, focuses on the work we’ve been doing recently, evaluations for Cause 4 (K-12 education).

Three of our finalists focus on creating new schools. Replications, Inc. does so through the public school system, while the Knowledge is Power Program (KIPP) and Achievement First create charter schools. In evaluating these programs, we’re hesitant to focus too much on test scores – the link between test scores and useful skills, or later life outcomes, is far from clear – but test scores are a start, and both the media coverage and the organizations themselves have focused on test scores as the primary evidence of their value-added. So what does the evidence say?

So far, it’s very unclear. All three of the organizations discussed sent us data from an incomplete set of their schools; when it came to providing comparison-school data (to put it in context), what we were provided is even more incomplete. The same is true for the independent reports available on the organizations’ websites. For example, the independent reports that KIPP lists focus either on very few schools, or examine many schools but look only at one year. It’s often unclear to us why particular schools, grades, and years were examined and others weren’t, especially in light of the fact that nearly all test-score data is publicly available through state governments. (Collecting such data may be time-consuming, and we wouldn’t necessarily expect the organizations themselves to do so – but what about funders?)

In our minds, if we want to gauge these organizations’ impact, it’s necessary to collect all easily available and relatively recent data (to avoid concerns about cherry-picking particularly successful schools), and study not just the schools relative to district- or state-wide averages, but the trajectory of students’ scores. The question has been raised whether KIPP, for example, effectively selects stronger students for participation, both through its acceptance process and through attrition (students’ dropping out as time goes on). So even if their students outperform “comparison group” students, this isn’t necessarily an effect of the schools themselves; if we can establish, on the other hand, that their students enter at similar levels to their peers but improve gradually over time, the “KIPP effect” will become much clearer.

This sort of observation wouldn’t support a “KIPP effect” as strongly as a true randomized study would, but it would still be much stronger than what we’ve seen to date. We’ve seen comparisons of particular schools to nearby schools, and to statewide averages, but nothing that is either comprehensive or attempts systematically to address the issues discussed above. For example, this report by the Education Policy Institute looks across many KIPP schools, but only at 5th grade test scores in 2003-04. Other available reports look only at a few schools, or at schools in a particular city for one academic year.

So, at this point our plan is to bite the bullet: get all the data we can from state governments, and do our own gruntwork putting it in a form that can be systematically analyzed. We certainly don’t want to reinvent the wheel if someone has already done this work – we’d rather read research from experts that have spent much more time on this than we have – but at this point we don’t see an alternative, because we can’t find anyone who is publishing research on the effect of these programs in a way that addresses concerns about selection/attrition/publication bias.
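To make the comparison we have in mind concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the record layout, the `cohort_summary` helper, and every number are invented, not real data from any school or state. It follows one cohort per school type and reports the score at entry, the gain since entry, and the share of the entering cohort still being tested (a rough proxy for attrition).

```python
# Hypothetical records: (school_type, cohort_id, grade, mean_scaled_score, n_tested).
# All values are invented for illustration; they are NOT real test-score data.
records = [
    ("charter", "c1", 5, 650.0, 80),
    ("charter", "c1", 6, 665.0, 72),
    ("charter", "c1", 7, 680.0, 65),
    ("district", "d1", 5, 648.0, 400),
    ("district", "d1", 6, 655.0, 395),
    ("district", "d1", 7, 661.0, 390),
]


def cohort_summary(rows, school_type):
    """Summarize one cohort's trajectory for the given school type.

    Assumes exactly one cohort per school type, with one row per grade.
    """
    # Order the cohort's rows by grade so the first row is the entry year.
    cohort = sorted((r for r in rows if r[0] == school_type), key=lambda r: r[2])
    entry, latest = cohort[0], cohort[-1]
    return {
        "entry_score": entry[3],            # where students start
        "gain": latest[3] - entry[3],       # change since the entry year
        "retention": latest[4] / entry[4],  # share of entering cohort still tested
    }


for school_type in ("charter", "district"):
    print(school_type, cohort_summary(records, school_type))
```

If entering scores look similar across the two groups but gains diverge, that is the pattern we would want to examine more closely; a low retention figure would be a flag that attrition, rather than the school itself, may be driving the result.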

What do you think? Are we going too far? Is there a faster and better way to answer these questions? Do you know of any research that already exists on these issues (for these charities)?

• MoonOverJackson on January 15, 2008 at 4:22 pm said:

Hi Elie.

Great post. Why should we trust you to evaluate anything again? Are you done spamming the message boards with fake identities?

Or are you and Holden using new aliases now?

What on earth qualifies either of you to judge how professional educators do their jobs?

MoJ

• As the operator of a nonprofit alternative education program, the thought of you folks coming in and “evaluating” me (especially given the somewhat odd rambling in this post) scares the heck out of me.

There are qualified organizations, staffed by educators and researchers, with an understanding of the dynamics of teaching and learning…let THEM do the evaluation. Let’s NOT have former ahh…hedge fund guys… tell us about education…

But, then, I do get stock market advice from a kindergarten teacher, so I guess it’s all good..

• Bob, given your experience in the field, which organizations or experts do you think have done research which could provide insight on how best to answer the questions which Elie poses in his post?

More generally, the process of determining which organization’s programming will allow a donor to make the largest impact for their dollar is challenging. As a result, we encourage the community to challenge our thinking and point us in the direction of excellent research (this is the reason Elie asks the questions at the end of his post) that helps us answer these tough questions. Ultimately this will improve the public resource we create for donors who use the main GiveWell site and ensure that the donations that we make as an organization make the most impact possible.

• One crazy idea might be to try to find small individual communities where these charities have operated and look for changes in broader statistics — maybe adolescent crime or truancy rates if such things exist.

The other idea might be to randomly sample yourself instead of getting all the available data. Pick 100 students who left public schools for charity sponsored charter schools and pick 100 who did not. Then try to find some statistically significant change around the time they entered the new school.

The type of test is important as well. I’ve read that scores on tests of general intelligence tend to be remarkably consistent after the age of 5 or so. But, on the other hand, a history test might be completely dependent on how consistently and effectively a teacher presents the material. Does a change in score on a history test indicate the new school is doing more good for the student?

I guess the question is, if you had every statistic in the world at your fingertips, what would you be looking for?

• Removed on January 16, 2008 at 9:13 am said:

Inappropriate comment removed – see it here

• Removed on January 16, 2008 at 10:06 am said:

Inappropriate comment removed – see it here

• Michelle on January 16, 2008 at 11:25 am said:

What you’re seeking to do is extremely ambitious. You’re asking questions that evaluators from within the worlds of education and government have been asking for at least a hundred years. What constitutes educational success? What data do you look for? How do you compare apples to apples? How do you control for the affluence of the neighborhood, nutrition, parental support at home, or other outside-of-school factors? How do you control for the tremendous variance in schools due to local control and differences in state legislation and type of school charter?

It is a nightmare to prove the success or failure of even one testing program or federal school initiative, let alone to define and compare schools as though the school were the only factor in student effectiveness. You might want to begin understanding the world of education evaluation here: http://ies.ed.gov/ncee/
Or begin reading about educational evaluation at ERIC, the Educational Resources Information Center
http://eric.ed.gov/

The wheel doesn’t need to be re-invented, it needs to be invented. That’s a lot for you guys to bite off. Can you be a little more clear or specific about a single point of difference you are seeking, which might stand a chance of being evidenced in the data you’re requesting?

• Michelle on January 16, 2008 at 11:28 am said:

Also, do you know about the many existing rankings sites for parents?

http://www.greatschools.net/ is just one.

• Josh Millard on January 16, 2008 at 4:05 pm said:

Dear weird obsessive penis-enlargement-joke guy:

Please stop. It’s weird and kind of embarrassing to behold and generally pretty much the opposite of useful in any sense of the word.

Thanks a bunch.

• Michelle on January 17, 2008 at 12:33 pm said:

You’re in New York. Why not pay a few hundred dollars to someone from Bank Street College of Education for a three-hour tutorial on the problems of educational evaluation?

Better yet, why not take a course in social science research? As you’re seeing, measuring human progress is a matter of much more than crunching numbers. It’s choosing what numbers to crunch, and understanding why you believe those numbers might mean anything to long-term success, and whether you have any reason to think you’re correct about that. Any conclusions you come to with heterogenous data and a poor understanding of social science research and reporting will forever be of very limited usefulness.

http://forums.wsj.com/viewtopic.php?p=43325#43325

• Let me elaborate: the date was Dec. 11. The wording was exactly the same as Holden’s usual spiel. And the handle: Research24. Totally anonymous, and with only 1 comment to its name. This is yet another previously-undisclosed instance of Holden’s illicit activities. I’d like someone with a WSJ subscription to check Research24’s profile to see if any further information can be found, but I doubt that any can.

• Erich Riesenberg on January 17, 2008 at 2:00 pm said:

Hi Matt-

To quote from the WSJ comments: Doing good work means having great people, great technology, and self-evaluation. That’s all “overhead.” The problem in the nonprofit sector is not too much of it, but too little.

I would bet $100 to the charity of my choice it is Holden or Elie for two reasons. First, it quotes the same idiotic line that great people, technology, and evaluation all lead to increased overhead, which is nonsensical to anyone familiar with accounting. Holden has that line memorized. And two, it uses the term overhead, which is not a nonprofit accounting term, but a word consistently misused by Holden.

I have a WSJ subscription and it doesn’t help identify the authors; it does not have user profiles, as far as I know.

• Dave Stanford on January 17, 2008 at 2:06 pm said:

It seems like you aren’t convinced that test scores are a meaningful method of comparing schools, but then you go ahead and jump right in and use them because that’s what the media does and what the organizations do. Why? Is it because evaluating something like the effectiveness of schools is difficult to do?

It seems like you skip over a pretty important question in doing this. You seem to be saying right off the bat that test scores of school A compared to test scores of school B don’t tell you enough about the schools. How does taking the derivative of the test scores make this a more useful number?

There’s a common criticism that statewide testing encourages teachers to teach the test alone and ignore untested subjects (the arts, history, etc.). This seems like it could improve test scores, but unless you believe better test scores are the goal of schools you still don’t have a tool that’s useful for evaluating them.

• jfundraiser on January 17, 2008 at 3:39 pm said:

I’ve got to come clean. I feel like I’m a gauker (sp?) at a crime scene. I know I should look away from the blood and gore, but can’t seem to walk away. So is the case with continuing to come to Givewell daily. Givewell has imploded and I should delete the blog from my records. But I can’t seem to do it. I come again and again to hear the “professionals” at Givewell talk with no logic or reason at all.

But mostly I come to “watch the dead horse” get beaten. If Givewell wants to know how to further its mission, listen, listen very closely to what folks are saying. Read a great quote on my Starbucks cup yesterday: “you can learn a lot more from listening than you can from talking. Find someone with whom you don’t agree in the slightest and ask them to explain themselves at length. Then take a seat, shut your mouth, and don’t argue back. It’s physically impossible to listen with your mouth open.” – John Moe. I’m signing out for good now! Maybe.

• What do you think? Are we going too far? Is there a faster and better way to answer these questions? Do you know of any research that already exists on these issues (for these charities)?

It’s one thing to encourage feedback. It’s another to have a research effort so flawed that it requires a massive public intervention. If you want this to be a public effort, make it a public effort. Provide us with some incentive to do your work for you. Make a wiki, say, so we feel like we own the contributions we make instead of like we’re feeding a parasitic host. And lose the 65k/yr babysitters if we’re the ones you want doing the heavy lifting – deciding your whole methodology. Same goes for branding: if this is going to be a community instead of a private concern, it’s got to be represented differently. What I want you to do is lose the whiz-kid branding, the corporate facade, and make this place a user-driven public resource, spurred by the same curiosity that made the kids start this up in the first place.

That’s a pipe dream, though. I expect business as usual.

• Removed on January 18, 2008 at 5:19 am said:

Inappropriate comment removed – see it here

• Matt, I think your idea that this project might be better designed as a forum for public discussion, perhaps a wiki, is excellent.

Givewell certainly was trying to re-invent the wheel, with its founders refusing to do their homework and choosing instead to fly by the seat of their pants (for a cut in pay to $65K per year, no less), basically, though the process of public discussion facilitated by this website seemed a good one.

• Sean Stannard-Stockton on January 19, 2008 at 2:15 pm said:

Something for people to consider. Whether you think Elie and Holden are qualified or not to evaluate anything, every donor (foundation or individual) in the world is evaluating nonprofit programs before donating. GiveWell is just showing you what they are thinking.

They are not doing anything different than any of the many foundations in the world. Except they are letting you in on their thinking.

Michelle, I’ve been very impressed with your comments across the range of sites discussing GiveWell. But when you say, “What you’re seeking to do is extremely ambitious.” I have to disagree. GiveWell is trying to decide which nonprofits to give money to. Almost every American makes a decision like this every year.

• MoonOverJackson on January 19, 2008 at 4:09 pm said:

LOL. This is still going on? I think GiveWell had best fold up the tent, find a name that DOESN’T belong to another legitimate operation, and come back in with a project to evaluate HEDGE FUNDS.

You guys are so over. No one will ever take you seriously again.

• MoonOverJackson on January 19, 2008 at 4:10 pm said:

BTW — word on the street is that the Chronicle of Philanthropy is going to do a *print* story on the Givewell Scam. Rock on!

• Removed on January 21, 2008 at 6:03 pm said:

Inappropriate comment removed – see it here

• Mitch Nauffts on January 21, 2008 at 6:30 pm said:

I’m astounded, but not surprised, by the juvenility of most of the posts to this thread. Elementary school education in this country is failing way too many kids, and well-meaning people — including, in their own earnest way, Holden and Elie — are doing their best to identify solutions to the problem. And what do self-styled vigilantes from online “communities” (I use the term loosely) have to add to the conversation? Scatological references and insults. Holden and Elie, maybe it’s time you guys thought about moderating your comments and creating a space for a civil discussion about these issues. Don’t worry about the nimrods; they can always hang out at MetaFilter.

• Josh Millard on January 21, 2008 at 7:05 pm said:

Mitch, your initial take on Givewell and Metafilter wasn’t exactly glowing, so I’m not expecting some sudden reversal here, but you’re being awfully lazy about blaming the juvenility here on Metafilter — and failing to credit the civil discussion from those of us who are interested in the conversation.

I can’t see any reason why Bob et al shouldn’t moderate the true crap comments here — one of the nimrods you’re talking about is trying to start a fight about it, essentially — but the goal should be to keep out noise, not criticism, and I can see a pretty clear distinction in the comments here between those two camps.

• Mitch Nauffts on January 21, 2008 at 7:11 pm said:

Josh —

Yes, maybe I was being lazy, but the language and tone of many of the posts from the “waste of time” camp sounded suspiciously familiar. Otherwise, I agree with you 100 percent.

• Erich Riesenberg on January 22, 2008 at 11:46 am said:

Sean Stannard-Stockton writes: They are not doing anything different than any of the many foundations in the world. Except they are letting you in on their thinking. … GiveWell is trying to decide which nonprofits to give money to. Almost every American makes a decision like this every year.

Are you truly this clueless Sean? Givewell is not merely letting us in on their thinking, they are in fact telling people to follow their advice, and ridiculing people who reach different conclusions, in much the way you do. You need to pick a new horse, Sean.

• BTW — word on the street is that the Chronicle of Philanthropy is going to do a *print* story on the Givewell Scam

The Chronicle of Philanthropy: Charities Urged to Set Online Guidelines Following One Group’s ‘Lapse’ (subscription required).

• Crystal on January 22, 2008 at 1:27 pm said:

I agree 100% with Matt (see his post below). No one, other than Michelle, has offered up any serious advice to these two because no one can take Givewell seriously. Surely there are education evaluators reading this, but why would they offer up their decades worth of experience to a couple of kids with no experience in the field? Academics and educators share their wisdom and research all of the time in the form of published work, at conferences, etc. Believe it or not, Givewell isn’t the only forum for information sharing. I suspect these two are going to have a really tough time trying to get anyone to participate.

Matt says:

January 17th, 2008 at 4:42 pm

It’s one thing to encourage feedback. It’s another to have a research effort so flawed that it requires a massive public intervention. If you want this to be a public effort, make it a public effort. Provide us with some incentive to do your work for you. Make a wiki, say, so we feel like we own the contributions we make instead of like we’re feeding a parasitic host. And lose the 65k/yr babysitters if we’re the ones you want doing the heavy lifting – deciding your whole methodology. Same goes for branding: if this is going to be a community instead of a private concern, it’s got to be represented differently. What I want you to do is lose the whiz-kid branding, the corporate facade, and make this place a user-driven public resource, spurred by the same curiosity that made the kids start this up in the first place.

That’s a pipe dream, though. I expect business as usual.