Our Landscape of the Open Science Community

[Added August 27, 2014: GiveWell Labs is now known as the Open Philanthropy Project.]

We previously wrote about a decision to complete a “medium-depth investigation” of the cause of open science: promoting new ways of producing, sharing, reviewing, and evaluating scientific research. The investigation broadly fits under the heading of GiveWell Labs research, which we are conducting in partnership with Good Ventures.

We have now completed the “medium-depth investigation,” led by Senior Research Analyst Alexander Berger, and have written up the investigative process we followed and the output that process produced (XLS). This post synthesizes that output, and gives our current views on the following questions:

  • What is the problem? The traditional journal system, which plays a key role in academic research across a variety of fields, has many limitations that might be addressed by a less traditional, more web-based, more generally “open” approach to publishing research.
  • What are possible interventions? Organizations are engaged in a wide variety of approaches, including building tools that facilitate new ways of publishing and evaluating research and conducting campaigns to increase the extent to which researchers share useful (but not generally traditional-journal-worthy) information.
  • Who else is working on this? Some for-profit organizations have gotten significant funding; on the nonprofit side, there are several foundations working on various aspects of the problem, though most are relatively new to the space. We have the sense that there is currently little funding available for groups focused on changing incentives and doing advocacy (as opposed to building tools and platforms), though we don’t have high confidence in this view.
  • What are possible steps for further investigation? If we were to investigate this cause more deeply, we’d seek a better understanding of the positive consequences that a shift to “open science” might bring, the rate at which such a shift is already occurring, and the organizations and funders that are currently in this space.

Note that these questions match those we have been asking in our shallow investigations.

Overall, we feel that we’ve significantly improved our understanding of this space, though major questions remain. Our main takeaways are as follows:

  • We see less “room for more philanthropy” in the space of supporting tools and platforms than we expected, partly because of the presence of for-profit organizations, some of which have substantial funding.

  • We see more such room in the space of “advocacy and incentives” than we expected, as most of the organizations in that category seem to have relatively little in terms of funding.
  • We still have some major questions about this space. One set of questions regards how beneficial a transition to “open science” would be, and how much a philanthropist might hope to speed it along; we think we could gain substantial ground on this question with further work. Another set of questions, however, involves how new funders who are entering this space will approach the problem. These questions will be hard to answer without letting time pass.

Details follow.

What is the problem?

The general picture that we felt emerged from our conversations was as follows:
The traditional journal system plays a crucial role in modern academic research. Academics seek to publish in prestigious journals; academics largely assess each other (for purposes of awarding tenure among other things) by their records of publishing in prestigious journals. Yet the traditional system is problematic in many ways:

  • Journals usually charge fees for access to publications; an alternative publication system could include universal open access to academic research.
  • Journals use a time-consuming peer-review process that doesn’t necessarily ensure that a paper is reliable or error-free.
  • Journals often fail to encourage or facilitate optimal sharing of data and code (as well as preregistration), and the journal system gives authors little reason to go out of their way to share.
  • Journals often have conventions that run counter to the goal of producing as much social value as possible. They may favor “newsworthy” results, leading to publication bias; they may favor publishing novel analysis over replications, reanalyses and debates; they may have arbitrary length requirements that limit the amount of detail that can be included; they may have other informal preferences that discourage certain forms of investigation, even when those investigations would be highly valuable. This is particularly problematic because considerations about “what a top journal might publish” appear to drive much of the incentive structure for researchers.

It is not difficult to imagine a world in which scientists habitually publish their work in online venues other than (or in addition to) traditional journals, and follow substantially different practices from those encouraged by the journal system. Depending on the norms and tools that spring up around such a practice, this could lead to:

  • More widespread sharing of data and code.
  • More and better replications, and therefore potentially improved reproducibility.
  • More online debate and discussion that could provide alternatives to peer review in terms of evaluating the value of research. Such alternative evaluation methods could be faster, more reliable, and more flexible than peer review, thus encouraging many of the valuable practices that peer review does not sufficiently encourage.
  • More efficient and flexible collaboration, as researchers could more easily find other researchers working on similar topics and could more easily synthesize the work relevant to theirs.

A unifying theme is the possibility of science’s becoming more “open” – of sharing academic research both widely (such that anyone can access it) and deeply (sharing far more information than is in a typical journal article) – leading to more possibilities for both critique and collaboration.

Such changes could span a wide range of fields, from biology to development economics to psychology, leading to many difficult-to-forecast positive impacts. If we were to recommend this cause, we would ultimately have to do the best we could to evaluate the likely size of such benefits, but we haven’t undertaken to do so at this time, focusing instead on the landscape of people, organizations and approaches working to bring this transition about. (Much of our investigation to date on “open science” has focused on biomedical research, because we believe that biomedical research is likely to deliver significant humanitarian value over the long term and because it constitutes roughly half of all research funded in the U.S., but this is something we would investigate further before committing to this area.)

What are possible interventions?

The “Organizations” sheet of our landscape spreadsheet (XLS) lists groups working on many different aspects of open science:

  • Altmetrics – metrics for evaluating the use/influence/importance of research that go beyond the traditional measures of “where a paper is published and how many citations it has.”
  • Post-publication peer review – tools that allow online critique and discussion of research, beyond the traditional journal-based prospective peer review process.
  • Innovative open access publishing, including preprints – models that facilitate sharing research publicly rather than simply publishing it in closed journals, sometimes prior to any peer review occurring.
  • Sharing data and code – projects that encourage researchers to share more information about their research, by providing tools to make sharing easier or by creating incentives to share.
  • Reproducibility – projects that focus on assessing and improving the reproducibility of research, something that the traditional journal system has only very limited mechanisms to address.
  • Attribution – tools allowing researchers to cite each other’s work in nontraditional ways, thus encouraging nontraditional practices (such as data-sharing).
  • Advocacy – public- or government-focused campaigns aiming to encourage open access, data/code sharing, and other practices that might have social benefits but private costs for researchers or publishers.
  • Alternative publication and peer review models – providing novel ways for researchers to disseminate their research processes and findings and have them reviewed (pre-publication).
  • Social networks – platforms encouraging researchers to connect with each other, and in the process to share their research in nontraditional forums.

The process by which we found these groups and categorized them is described on our process page. We’ve posted an editable version of the spreadsheet on Google Drive, and we welcome any edits or additions to that version.

Who else is working on this?

The “Funders” sheet of our landscape spreadsheet (XLS) lists the major funders we’ve come across in this field.

One important presence in the funding landscape is for-profit capital. Macmillan Publishers, owner of Nature (one of the most prestigious scientific journals), owns a group called Digital Science, which runs and/or funds multiple projects working to address these issues. In addition, there are three organizations we know of in the “social networks for researchers” category that have gotten substantial for-profit funding. According to TechCrunch, Academia.edu has raised several million dollars, ResearchGate has raised at least $35 million, and Mendeley was acquired by a major journal publisher this year for $69-100 million. It’s not clear to us just how we should think of for-profit capital; there seem to be large amounts of funding available for groups that are successfully changing the way researchers share their work, but it’s an open question how aligned the incentives of for-profit investors are with the vision of “making science more open” discussed in the previous section. All three of these companies do explicitly discuss efforts to “make science more open” as being an important part of their overall goals.

Another important presence is the Alfred P. Sloan Foundation, from which we have published conversation notes. The Sloan Foundation appears to be mostly focused on funding platforms and tools that will make it easier for researchers to operate along the lines of “open science”:

Researchers have various reasons for not sharing their data and code, but the difficulty of sharing it in a public context is often the easiest explanation for not doing so. If it became easier to share, then researchers might feel more pressure to share, because the technical excuse would cease to be credible.

Other funders we encountered in this area were generally newer to the space:

  • The Gordon and Betty Moore Foundation is currently launching a 5-year, $60 million Data-Driven Discovery Initiative.
  • The Laura and John Arnold Foundation recently made a $5 million grant through their Research Integrity program to launch the Center for Open Science.
  • The Andrew W. Mellon Foundation, which typically focuses on the humanities, has a Scholarly Communication and Information Technology program that spent $26 million in 2011 (big PDF), much of it going to support libraries and archives but some going to the kinds of novel approaches described above.

In general, it seems to us that there is currently much more organizational activity on the “building tools and platforms” front than on the “changing incentives and advocating for better practices” front. This can be seen by comparing the “Advocacy” groups in our landscape spreadsheet to the other groups, as well as through the preceding two paragraphs, though the relative youth of the Moore and Arnold Foundations in this space is a source of significant uncertainty in that view. Another possibility is that much of the work being done to change incentives and improve practices happens at the disciplinary or journal level in ways that aren’t caught by the interview process that we conducted.

What are possible next steps for further investigation?

We are unlikely to put substantially more time into this cause until we’ve examined some other causes. A major justification for doing a “medium-depth” investigation of this cause was to experiment with the idea of a “medium-depth review” itself, and we intend to do more “medium-depth reviews” as our research progresses. That said, we are likely to take minor steps to improve our understanding and stay updated on the cause, and we are open to outstanding giving opportunities in this cause if they meet our working criteria.

If we were to aim for the next level of understanding of this cause, we would:

  • Improve our understanding of the size, scope and consequences of the problems listed in the “What is the problem?” section, seeking to understand how much benefit we could expect from a transition from traditional to “open” science. We would also attempt to gauge the progress that has been made on this front so far, to get a sense of the likely returns to further funding (with the possibility that speedy progress to date may reflect an underlying inevitable process that may limit the need for much greater funding).
  • Try to improve our relationships with and understanding of other funders in the space. Since there are several funders that are relatively new and/or have agendas that we don’t know a great deal about, it is very important to understand how they’re thinking so that we can focus on underfunded areas.
  • Have further conversations with the organizations included in our landscape, with the hope of understanding their missions and funding needs.
  • Do general-purpose networking in order to deepen our understanding of the landscape and improve our odds of running into potential strong giving opportunities. Alexander plans to attend the Peer Review Congress in Chicago in September, since we see this as a relatively efficient way to interact with a lot of relevant people in a short amount of time. (We’re also hoping that the conference will give us more of a sense of the work going on in what we previously called the “efficiency and integrity of medical research” subset of the metaresearch community, which we have explicitly not included in this discussion.)

We think these steps would be appropriate ones to take prior to committing substantial funding or undertaking a full-blown strategy development process, though we could envision recommending some funding to particular outstanding giving opportunities that we encountered in the process of learning more about this field.


Comments

Our Landscape of the Open Science Community — 12 Comments

  1. There was a very fascinating conference at Stanford a few weeks ago on the Future of Academic Publishing. It brought together publishers, researchers, librarians, federal agencies, and other stakeholders. The main purpose was to discuss how the $9 billion scientific publishing industry will comply with the recent OSTP memo from the White House that requires all federally funded science to be made available for free (both results and possibly data/code). There was also some discussion of other efforts to make science more open, like preregistration of studies. I wish I had known how interesting the conference would be so I could have invited you guys. I was personally surprised at how much effort and expertise is already being directed at changing the industry. It’s hard for me to see where the movement could use more funding, but I imagine the biggest possible impact would come from lobbying federal research agencies in the next couple months as they finish deciding how to comply with the OSTP memo.

  2. Incentives in research fields stem from many sources, but the major drivers in the scientific disciplines are funding agencies. Award decisions would need to shift simultaneously with the development of a more open science community in order for researchers to embrace more openness. A significant challenge will be ensuring that intellectual property is protected and credit is given where it is due. Nonetheless, the movement is certainly growing rapidly, and I look forward to further developments.

  3. If the tools are there, academics will use them. Tools that are not being used are not good enough. If the internet revolution of the last 20 years taught us anything, it taught us that much. Besides, academics are people who are interested in furthering their personal research, as well as pushing forward the understanding and knowledge in their chosen field as a whole. In other words, most of the time they are hungry for intellectual dialogue and collaboration. Sure, they need to protect their IP and publish articles to further their careers, but any worthwhile tool will provide protection and improve the chance of publication in a peer-reviewed journal.
    Simply put, do not put the cart before the horse – incentives will follow the technology, not the other way around. This may be slow, because traditions die out slowly, but it will be much faster than we think (I keep reminding people that there were no tablets prior to 2010), and no slower than when the millennium generation gets prominent academic positions.

    What worries me more is the secretive nature of private sectors like the pharmaceutical industry. They are responsible for all sorts of research misconduct – hiding valuable information, caring about their bottom line and not the science, etc. Perhaps, though, that does not matter to you, and if academia pushes ahead that will be sufficient for science in general to leap forward.

  4. Uri: Although I think you might be right, I don’t think we can afford to be complacent. New tools have been available to educators for decades and have been ignored, because educators are too invested in the status quo to promote change. There is nothing stopping researchers today from sharing their data, or attempting to publish confirming results, but the existing system actively discourages such work.

  5. It is slightly off topic, but I personally would find it interesting if GiveWell would research “the best” volunteer grid computing non-profit. These projects utilize volunteers’ idle computer time in order to chew on small portions of a larger scientific problem such as computational chemistry and biology research in AIDS and Schistosoma (a GiveWell favorite). Right now, I’m using IBM’s World Community Grid, but it would be nice to know how best to use my computer resource in this respect and to get the word out in general since it doesn’t divert financial resources from GiveWell’s recommended charities.

  6. Hello Ben – do you have any information about whether this shortens a computer’s life span, and how much it increases the computer’s electricity consumption?
    Can it be used as a background service while other things are being done on the computer, or must the computer be idle?

  7. Uri – great questions.
    1. Computer life – Based on what I’ve read, your computer is liable to be obsolete before you wear it out. People have reported running this software on the same computer for 8+ years with no problems.
    2. Power usage – Some have seen significant increases in power usage, but costs vary widely with local electricity costs. It is easy enough to download the program, run it for a couple of months, and see how your bill changes. Obviously, significant money would have to be weighed against the assumption that you would otherwise donate it to effective charities.
    3. Schedule – Yes, it can run all the time or just some of the time with more or less of the computational capacity used at any one time. It is very customizable using the preferences of the program. Unless you are doing intensive computing (e.g. gaming, music recording, etc.), you shouldn’t experience any slowing.

    My explanation is not very rigorous, but this is why I’d like to see GiveWell weigh in. The good news is that the software is free so experimentation is cheap.

  8. Ted: Thanks for the comments and suggestions. We were aware of the Stanford conference beforehand, partly because of our connection with Vannevar. I agree that there could conceivably be opportunities around the current regulatory process, though our best guess is that it’s too late for us to get involved in that (and we are accordingly not actively pursuing opportunities to do so). Our view about the relative lack of funding for advocacy work on both the open data and open publication sides stems largely from our conversation with Heather Joseph, executive director of SPARC, the main open access advocacy group in DC. Notes from that conversation are available here (PDF). Thanks for posting the links to the presentations!

    Uri: We’re planning to do more research on the kind of issue you discuss with respect to pharmaceutical companies. For instance, the conference I’m planning to attend in September has a number of papers related to that topic. We’re somewhat sympathetic to the view that sufficiently good tools would be used, but that doesn’t imply to us that additional philanthropic funding towards the development of tools is the way to go.

    Ben: sorry, we don’t have any info on this topic, and we don’t expect to get to it.

  9. To add a 5th point to your clarification of the problem, the traditional system also poses a problem for the timing of data publication: how long collected data and analysis might be held before sharing (publication) is defined by internal priorities such as publication timing, thesis completion, research load, tenure position, etc., rather than by external priorities such as community needs, minimization of time between collection and sharing, contributing to a development goal, etc. The timing itself of science’s data publishing and holding is, I think, a distinct element to add to your list for what needs to change to maximize its contribution to global problem solving. Beyond the many issues of journal publishing, vital data can often be held for years before it gets included in a journal or conference paper and shared in any meaningful way, and that delay alone is a concern.