The GiveWell Blog

Followup on Fryer/Dobbie study of “Harlem miracle”

I recently posted about a new, intriguing study on the Harlem Children’s Zone. It’s now been a little over a week since David Brooks’s op-ed brought the study some major attention, and I’ve been keeping up with the reaction of other blogs. Here’s a summary:

Methodology: unusually strong

I haven’t seen any major complaints about the study’s methodology (aside from a couple of authors who appear to have raised possible concerns without having fully read the study – concerns that I don’t believe apply to it). The Social Science Statistics Blog noted it as “a nice example of careful comparisons in a non-experimental situation providing useful knowledge.”

Many studies in this area – particularly those put out by charities – have major and glaring methodological flaws/alternative hypotheses (example). We feel that this one doesn’t, which is part of what makes it so unusual and interesting.

Significance: possibly oversold

David Brooks came under a lot of criticism for his optimistic presentation of the study, stating “We may have found a remedy for the achievement gap.” Thoughts on Education Policy gives a particularly thorough overview of reasons to be cautious, including questions about whether improved test scores really point to improved opportunities and about whether this result can be replicated (“Each school has an inordinate number of things that make it unique — the Promise Academy more so than most”).

Its “What should we learn from the Promise Academy?” series (begun today) looks interesting; it is elaborating on the latter point by highlighting all the different ways in which this school is unusual.

We feel that these concerns are valid, and expressed similar concerns ourselves (here and here). However, given the weak results from past rigorous studies of education, we still feel that the results of this study bear special attention (and possible replication attempts).

Teaching to the test?

Aaron Pallas’s post on Gotham Schools raises the most interesting and worrying concern that I’ve seen.

In the HCZ Annual Report for the 2007-08 school year submitted to the State Education Department, data are presented on not just the state ELA and math assessments, but also the Iowa Test of Basic Skills. Those eighth-graders who kicked ass on the state math test? They didn’t do so well on the low-stakes Iowa Tests. Curiously, only 2 of the 77 eighth-graders were absent on the ITBS reading test day in June, 2008, but 20 of these 77 were absent for the ITBS math test. For the 57 students who did take the ITBS math test, HCZ reported an average Normal Curve Equivalent (NCE) score of 41, which failed to meet the school’s objective of an average NCE of 50 for a cohort of students who have completed at least two consecutive years at HCZ Promise Academy. In fact, this same cohort had a slightly higher average NCE of 42 in June, 2007. [Note that the study shows a huge improvement on the high-stakes test over the same time period, 2007-2008.]

Normal Curve Equivalents (NCE’s) range from 1 to 99, and are scaled to have a mean of 50 and a standard deviation of 21.06. An NCE of 41 corresponds to roughly the 33rd percentile of the reference distribution, which for the ITBS would likely be a national sample of on-grade test-takers. Scoring at the 33rd percentile is no great success story.
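
For readers who want to check the arithmetic, here is a minimal sketch (in Python; not part of the original post or the study) of the NCE-to-percentile conversion described above, assuming NCEs are simply a linear rescaling of a normal distribution with mean 50 and standard deviation 21.06:

    from math import erf, sqrt

    # Convert an NCE score to an approximate national percentile rank.
    # Assumes NCEs follow a normal scale with mean 50 and SD 21.06, as stated above.
    def nce_to_percentile(nce, mean=50.0, sd=21.06):
        z = (nce - mean) / sd                      # standardize the score
        return 100 * 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF, as a percentage

    print(round(nce_to_percentile(41)))  # ~33rd percentile, matching the figure quoted above
    print(round(nce_to_percentile(42)))  # ~35th percentile (the cohort's June 2007 average)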

One possible interpretation is that cheating occurred on the higher-stakes tests, but this seems unlikely since performance was similarly strong on lower-stakes practice tests (specifics here). Another possible interpretation is that Harlem Children’s Zone teachers focused so narrowly on the high-stakes tests that they did not teach transferable skills (as Mr. Pallas implies).

We haven’t worried much about the “teaching to the test” issue to date, if only because so few interventions have shown any impact on test scores; at the very least, raising achievement test scores doesn’t appear to be easy. But this is a major concern.

Another possible interpretation is that stronger students were absent on the day of the low-stakes test, for some irrelevant reason – or that Mr. Pallas is simply misinterpreting something (I’ve only read, not vetted, his critique).

Bottom line

We know that the Fryer/Dobbie study shows an unusually encouraging result with unusual rigor. We don’t know whether it’s found a replicable way to improve important skills for disadvantaged children.

We feel that the best response to success, in an area such as this one, is not to immediately celebrate and pour in funding; it’s to investigate further.

Comments

  • A Kasse on May 19, 2009 at 7:37 am said:

    Cheating could have occurred on the higher-stakes tests, for example before the examination dates. We don’t know whether the questions leaked, especially if a teacher was part of the team that prepared them, but this seems just as likely on lower-stakes practice tests.

  • Jennifer on May 19, 2009 at 1:40 pm said:

    Thought you might like this fun quiz.

    http://www.humanitarianiq.com/
