GitWikXiv follow-up: A path to forkable papers

[Other posts in this series: 1,2,3.]

I now have a more concrete idea of some of the pie-in-the-sky changes I would like to see in academic publishing in the long term. I envision three pillars:

  • “Scientifica”: a linked, universally collaborative document that takes the reader from the most basic introductory concepts to the forefront of research.[a] Imagine a Wikipedia for all of science, maintained by researchers. Knowen and Scholarpedia are early prototypes, although I believe a somewhat stronger consensus mechanism akin to particle physics collaborations will be necessary. [Footnote a: Edit 2016-4-22: I am embarrassed that I did not make it clear when this was initially posted that the Scientifica idealization is mostly a product of Godfrey Miller. Hopefully he didn’t notice…]
  • ArXiv++: a central repository of articles that enables universal collaboration through unrestricted forking of papers. This could arise by equipping the arXiv with an open attribution standard and moving toward a copyleft norm (see below).
  • Discussion overlay: There is a massive need for quick, low-threshold commentary on articles, although I have fewer concrete things to say about this at the moment. For the time being, imagine that each arXiv article accumulated nested[b] comments (or other annotations) that the reader could choose to view or suppress, and which could be added to with the click of a button. [Footnote b: Nested comments are just comments that allow comment-specific replies, organized in a hierarchy; see here for a visual example.]
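To make the idea of nested comments concrete, here is a minimal sketch in Python. It assumes nothing about how an actual overlay would be built: the `Comment` class, the `render` function, and the sample thread are all hypothetical names and data invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    """One comment in a thread; replies hang off the comment they answer."""
    author: str   # hypothetical display name
    text: str
    replies: List["Comment"] = field(default_factory=list)


def render(comment: Comment, depth: int = 0) -> List[str]:
    """Flatten a thread into indented lines, one level of indent per reply depth."""
    lines = ["  " * depth + f"{comment.author}: {comment.text}"]
    for reply in comment.replies:
        lines.extend(render(reply, depth + 1))
    return lines


# Illustrative thread attached to a single (imaginary) article.
thread = Comment(
    "reader_1",
    "Is Eq. 3 missing a factor of 2?",
    [Comment("author", "Yes, fixed in v2.", [Comment("reader_1", "Thanks.")])],
)

for line in render(thread):
    print(line)
```

Because each reply is stored under the comment it answers, a reader could collapse or suppress a whole sub-thread at once, which is the view-or-suppress behavior described above.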

The conceptual flow here is that bleeding-edge research is documented on the arXiv, is discussed on the overlay, and, when it has been hashed out through consensus, is folded into Scientifica.…

GitWikXiv follow-up: An open attribution standard?

[Other posts in this series: 1,2,4.]

My GitWikXiv post on making the academic paper universally collaborative got a lot of good comments. In particular, I recommend reading Ivar Martin, who sees a future of academic writing that is very different from what we have now.

Along a slightly more conventional route, the folks working on Authorea made a good case that they have several of the components that are needed to allow universal collaboration, and they seem to have a bit of traction.[a] I was asked what it would take to solve the remaining problems by my lights, and I sketched a hypothetical way to let Authorea (which is a for-profit company) interface with the arXiv to enable universal collaboration with proper attribution. The key step would be the introduction of an open file standard for attribution that could be agreed upon by the academic community, and especially by the arXiv advisory board.…

[Footnote a: More generally, the comments on the post gave me the impression that lots of people are working on tools, but not many people are working on open standards. (This isn’t surprising, since software tools are a lot easier for a handful of people to develop.) It may be that a lot of the social/cultural obstacles (in contrast to technical ones) that we all seem to agree are the most difficult aren’t actually mental problems so much as coordination problems. In other words, it might not have anything to do with old researchers being set in their ways as much as tragedy-of-the-commons-type obstacles. So maybe there should be more focus on open standards like ORCID, smart citations, data accessibility, and an attribution standard like I discuss here.]

GitWikXiv follow-up: Distinctions in academic tools

[Other posts in this series: 1,3,4.]

In a follow-up to my GitWikXiv post on making the academic paper more collaborative, I’d like to quickly lay out two important distinctions as a way to anchor further discussion.

Revision vs. attribution vs. evaluation

Any system for allowing hundreds of academics to collaborate on new works needs to track and incentivize who contributes what. But it’s key to keep these parts separate conceptually (and perhaps structurally).

  • Revisions are the bare data necessary to reconstruct the evolution of a document through time. This is the well-trodden ground of revision control software like Git and hosting platforms like GitHub.
  • Attribution is the assigning of credit. At the minimum this includes tagging individual revisions with the name/ID of the revisor(s). But more generally it includes the sort of information that can be found in footnotes (“I thank J. Smith for alerting me to this possibility”), acknowledgements (“We are grateful to J. Doe for discussion”), and author contributions statements (“A. Atkins ran the experiment; B. Bonkers analyzed the data”).
  • Evaluation of the revisions is done to assess how much they are worth. This can be expressed as an upvote (as on StackExchange), as a number of citations or another bibliometric like the h-index, or as being published in a certain venue like Nature. [Footnote: In general I am against most evaluation metrics. I actually think that these metrics correlate pretty strongly with academic accomplishment, all else being equal, but I think all else is very not equal, and that the metrics become gamed as soon as you attach incentives to them. For instance, the number of times an actor is mentioned on Twitter probably correlates pretty strongly with how good an actor they are, but it drastically underrates Broadway actors compared to movie actors, or niche art-film actors compared to Adam Sandler.]
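One way to picture keeping the three parts separate is to store them as three independent kinds of record that only share a revision ID. The following Python sketch is purely illustrative: the class names, fields, and sample data are assumptions made for this example, not a proposed standard or any existing tool's format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Revision:
    """Bare data needed to reconstruct the document's history."""
    rev_id: str
    parent_id: Optional[str]  # None for the first revision
    diff: str


@dataclass(frozen=True)
class Attribution:
    """Credit attached to a revision, stored apart from the diff itself."""
    rev_id: str
    contributor: str  # hypothetical name or ID of the person credited
    role: str         # free-form, e.g. "discussion" or "analyzed the data"


@dataclass(frozen=True)
class Evaluation:
    """Someone's judgment of a revision's worth (here, a simple vote)."""
    rev_id: str
    evaluator: str
    score: int


@dataclass
class Paper:
    revisions: List[Revision] = field(default_factory=list)
    attributions: List[Attribution] = field(default_factory=list)
    evaluations: List[Evaluation] = field(default_factory=list)

    def credits(self) -> Dict[str, List[str]]:
        """Map each contributor to the revisions they are credited on."""
        out: Dict[str, List[str]] = {}
        for a in self.attributions:
            out.setdefault(a.contributor, []).append(a.rev_id)
        return out


paper = Paper()
paper.revisions.append(Revision("r1", None, "+ Initial draft"))
paper.revisions.append(Revision("r2", "r1", "+ Clarified the key derivation"))
paper.attributions.append(Attribution("r1", "A. Atkins", "ran the experiment"))
paper.attributions.append(Attribution("r2", "B. Bonkers", "analyzed the data"))
paper.attributions.append(Attribution("r2", "J. Doe", "discussion"))
paper.evaluations.append(Evaluation("r2", "reader-1", 1))

print(paper.credits())
```

Note that `credits()` never looks at `evaluations`: in this sketch, credit can be recorded (including for someone like J. Doe, who only contributed discussion) without anyone having scored the revision, and the scoring layer could be swapped out or dropped without touching the history or the credit.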

Beyond papers – GitWikXiv

[Other posts in this series: 2,3,4.]

I had the chance to have dinner tonight with Paul Ginsparg of arXiv fame, and he graciously gave me some feedback on a very speculative idea that I’ve been kicking around: augmenting — or even replacing — the current academic article model with collaborative documents.

Even after years of mulling it over, my thoughts on this aren’t fully formed. But I thought I’d share my thinking, however incomplete, after incorporating Paul’s commentary while it is still fresh in my memory. First, let me start with some of the motivating problems as I see them:

  • People still reference papers from 40 years ago for key calculations (not just for historical interest or apportioning credit). These papers often have such poor typesetting that they are hard to read, and they lack machine-readable text, URL links, etc.
  • Getting oriented on a topic often requires reading a dozen or more scattered papers with varying notation, where the key advances (as judged with hindsight) are mixed in with material that is much less important.
  • More specifically, papers sometimes have a small crucial idea that is buried in tangential details having to do with that particular author’s use for the idea, even if the idea has grown way beyond the author.
  • Some authors could contribute the key idea, but others could contribute clarity of thought, or make connections to other fields. In general these people may not know each other, or be able to easily collaborate.
  • There aren’t enough good review articles. [Footnote: When the marginal cost of producing a textbook is near zero, the fact that no one gets proper credit for writing good textbooks isn’t so bad simply because you only need one or two good ones, and the audience is huge.]