Research debt

Chris Olah coins the term “research debt” to discuss a bundle of related destructive phenomena in research communities:

  • Poor Exposition – Often, there is no good explanation of important ideas and one has to struggle to understand them. This problem is so pervasive that we take it for granted and don’t appreciate how much better things could be.
  • Undigested Ideas – Most ideas start off rough and hard to understand. They become radically easier as we polish them, developing the right analogies, language, and ways of thinking.
  • Bad abstractions and notation – Abstractions and notation are the user interface of research, shaping how we think and communicate. Unfortunately, we often get stuck with the first formalisms to develop even when they’re bad. For example, an object with extra electrons is negative, and pi is wrong.
  • Noise – Being a researcher is like standing in the middle of a construction site. Countless papers scream for your attention and there’s no easy way to filter or summarize them. We think noise is the main way experts experience research debt.

Shout it from the rooftops (my emphasis):

It’s worth being clear that research debt isn’t just about ideas not being explained well. It’s a lack of digesting ideas – or, at least, a lack of the public version of ideas being digested. It’s a communal messiness of thought.

Developing good abstractions, notations, visualizations, and so forth, is improving the user interfaces for ideas. This helps both with understanding ideas for the first time and with thinking clearly about them. Conversely, if we can’t explain an idea well, that’s often a sign that we don’t understand it as well as we could…

Distillation is also hard.

[continue reading]

Sank argues for a SciRate issue tracker

SciRate is the best location I know of for public discussion and feedback on academic papers, and is an impressive open-source achievement by Aram Harrow and collaborators. Right now it has the most traction in the field of quantum information,[a] but it could stand to become more popular, and to expand into other fields.

[a] Quantum info leading the way, as usual…

My colleague and good friend Dan Sank proposes a small but important tweak for SciRate: issue tracking, à la GitHub.

Issues in Scirate?

Scirate enables us to express comments/opinions on published works. Another very useful kind of feedback for research papers is issues. By “issue” I mean exactly the kind of thing I’m writing right now: a description of

  1. a problem with the work which can be definitively fixed, or
  2. a possible improvement to that product.

This differs from comments which are just statements of opinion which don’t require any reaction from the author. We all know that issues are essential in developing software, and based on a recent experience where I used github to host development of a research paper with three coauthors and more than a dozen group members providing feedback, I think that issues should also be used for research papers.

It might be nice to attach an issue tracker to Scirate, or at least have Scirate give links to an external issue tracker attached to each paper.

Why not just use a public github repo and get the issue tracker for free?

Making a github repo public makes everything public, including any sensitive information including comments about particular works/people. Having written a paper using github, I can imagine the authors would not want to make that repo public before going through the entire issue history making sure nobody said anything embarrassing/demeaning/etc.

[continue reading]

Bullshit in science

Francisco Azuaje (emphasis mine):

According to American philosopher Harry Frankfurt,[a] a key difference between liars and bullshitters is that the former tend to accept that they are not telling the truth, while the latter simply do not care whether something is true or not.

[a] Here’s Frankfurt’s popular essay [PDF].

Bullshitters strive to maximize personal gain through a continuing distortion of reality. If something is true and can be manipulated to achieve their selfish objectives, then good. If something is not true, who cares? All the same. These attributes make bullshitting worse than lying.

Furthermore, according to Frankfurt, it is the bullshitter’s capacity to get away with bullshitting so easily that makes them particularly dangerous. Individuals in prominent positions of authority may be punished for lying, especially if lying has serious damaging consequences. Professional and casual bullshitters at all levels of influence typically operate with freedom. Regardless of their roles in society, their exposure is not necessarily accompanied by negative legal or intellectual consequences, at least for the bullshitter…

Researchers may also be guilty of bullshitting by omission. This is the case when they do not openly challenge bullshitting positions, either in the public or academic settings. Scientists frequently wrongly assume that the public always has knowledge of well-established scientific facts. Moreover, scientists sometimes over-estimate the moderating role of the media or their capacity to differentiate facts from falsehood, and solid from weaker evidence.

Bullshitting happens. But very often it is a byproduct of indifference. Indifference frequently masking a fear of appearing confrontational to peers and funders. Depending on where you are or with whom you work, frontal bullshit fighting may not be good for career advancement.

[continue reading]

ArXiv and Zotero surveys

Quick note: the arXiv is administering a survey of user opinion on potential future changes, many of which were discussed previously on this blog. It can be reached by clicking the banner on the top of the arXiv homepage. I encourage you to take the survey if you haven’t already. (Doubly so if you agree with me…)

Likewise, Zotero is administering a somewhat shorter survey about what sorts of folks use Zotero and what they do with it.

To the question “Do you have suggestions for any of the above-mentioned new services, or any other new services you would like to see in arXiv?”, I responded:

I think the most important thing for the arXiv to do would be to “nudge” authors toward releasing their work with a copyleft license, e.g., Creative Commons – Attribution. (Or at least stop nudging them toward the minimal arXiv license, as is done now in the submission process.) For instance, make it clear to authors that if they publish in various open access journals, they should release the arXiv post under a similarly permissive license. Also, make it easier for authors to make the license more permissive at a later date once they know where they are publishing. So long as there is informed consent, anything that would increase the number of papers which can be built on (not just distributed) would be an improvement.

I would also like the arXiv to think about allowing for more fine-grained contribution tracking in the long term. I predict that collaboratively written documents will become much more common, and for this it will be necessary to produce a record of who changes what, like GitHub, with greater detail than merely the list of authors.

[continue reading]


Question: What sort of physics — if any — should be funded on the margin right now by someone trying to maximize positive impact for society, perhaps over the very long term?

First, it’s useful to separate the field into fundamental physics and non-fundamental physics, where the former is concerned with discovering new fundamental laws of the universe (particle physics, high-energy theory, cosmology, some astrophysics) and the latter applies accepted laws to understand physical systems (condensed matter, material physics, quantum information and control, plasma physics, nuclear physics, fluid dynamics, biophysics, atomic/molecular/optical physics, geophysics).[a]

[a] Some folks like David Nelson dispute the importance/usefulness of this distinction: PDF. In my opinion, he is correct, but only about the most boring part of fundamental physics (which has unfortunately dominated most of those subfields). More speculative research, such as the validity (!!!) of quantum mechanics, is undeniably of a different character from the investigation of low-energy field theories. But that point isn’t important for the present topic.

That distinction made, let’s dive in.

Non-fundamental physics

Let’s first list some places where non-fundamental physics might have a social impact:

  1. condensed matter and material science discoveries that give high-temperature superconductors, stronger/lighter/better-insulating/better-conducting materials, higher density batteries, new computing architectures, better solar cells;
  2. quantum information discoveries that make quantum computers more useful than we currently think they will be, especially a killer app for quantum simulations;
  3. plasma physics discoveries that make fusion power doable, or fission power cheaper;
  4. quantum device technologies that allow for more precise measurements;
  5. climate physics (vague);[b]
  6. biophysics discoveries (vague);
  7. nanotech discoveries (vague).

[b] Added 2016-Dec-20.

In my mostly uninformed opinion, only fusion power (#3) could be among the most valuable causes in the world, plausibly scoring very highly on importance, tractability, and neglectedness — with the notable caveat that the measurable progress would necessitate an investment of billions rather than millions of dollars.… [continue reading]

Comments on Stern, journals, and incentives

David L. Stern on changing incentives in science by getting rid of journals:

Instead, I believe, we will do better to rely simply on the scientific process itself. Over time, good science is replicated, elevated, and established as most likely true; bad science may be unreplicated, flaws may be noted, and it usually is quietly dismissed as untrue. This process may take considerable time—sometimes years, sometimes decades. But, usually, the most egregious papers are detected quickly by experts as most likely garbage. This self-correcting aspect of science often does not involve explicit written documentation of a paper’s flaws. The community simply decides that these papers are unhelpful and the field moves in a different direction.

In sum, we should stop worrying about peer review….

The real question that people seem to be struggling with is “How will we judge the quality of the science if it is not peer reviewed and published in a journal that I ‘respect’?” Of course, the answer is obvious. Read the papers! But here is where we come to the crux of the incentive problem. Currently, scientists are rewarded for publishing in “top” journals, on the assumption that these journals publish only great science. Since this assumption is demonstrably false, and since journal publishing involves many evils that are discussed at length in other posts, a better solution is to cut journals out of the incentive structure altogether.

(H/t Tyler Cowen.)

I think this would make the situation worse, not better, in bringing new ideas to the table. For all of its flaws, peer review has the benefit that any (not obviously terrible) paper gets a somewhat careful reading by a couple of experts.… [continue reading]

PI accepting 2016 master’s student applications

Perimeter Institute runs a pretty great and unusual 1-year master’s program called Perimeter Scholars International.[a] If you’re in your last year as an undergrad, I strongly advise you (seriously) to consider applying. Your choice of grad school is 80% of the selection power determining your thesis topic, and that topic places very strong constraints on your entire academic career. The more your choice is informed by actual physics knowledge (rather than the apparent impressiveness of professors and institutions), the better. An additional year at a new institution taking classes with new teachers can really help.

[a] PSI…ha!

(Older academics can advertise this to students by printing out this poster.)

Here’s the blurb:

Each year, Canada’s Perimeter Institute for Theoretical Physics recruits approximately 30 exceptional science graduates for an immersive, 10-month physics boot camp: Perimeter Scholars International (PSI). This unique Master’s program seeks not only students with stellar undergraduate physics track records, but also those with diverse backgrounds, collaborative spirit, creativity, and other attributes that will set them apart as future innovators.

Features of the program include:

  • All student costs (tuition and living) are covered, removing financial and/or geographical barriers to entry
  • Students learn from world-leading theoretical physicists – resident Perimeter researchers and visiting scientists – within the inspiring environment of Perimeter Institute.
  • Collaboration is valued over competition; deep understanding and creativity are valued over rote learning and examination.
  • PSI recruits worldwide: 85 percent of students come from outside of Canada.
  • PSI takes calculated risks, seeking extraordinary talent who may have non-traditional academic backgrounds but have demonstrated exceptional scientific aptitude.

PSI is now accepting applications for the class of 2016/17. Applications are due by February 1, 2016.

[continue reading]

China to lead particle physics

China will build the successor to the LHC.

Note that the China Daily article above incorrectly suggests that they will build a 50–70 km circular electron-positron accelerator at ~100 TeV CoM. In fact, the project comes in two phases inside the same tunnel: first a 250 GeV electron-positron ‘precision’ machine,[a] the Circular Electron-Positron Collider (CEPC), followed by an upgrade to a 70 TeV proton-proton ‘discovery’ machine, the Super Proton-Proton Collider (SPPC). The current timeline for operations, which will inevitably be pushed back, projects that data taking will start in 2028 and 2042, respectively. (H/t Graeme Smith.)

[a] Note that the 250 GeV electron-positron collisions will produce only one Higgs, and the fact that the CoM energy is double the Higgs mass is a coincidence. See slides 9-16 here for some of the processes that will be studied.

The existence of this accelerator has lots of interesting implications for accelerators in the Western Hemisphere. For instance, the International Linear Collider (ILC) was planning on using a ‘push-pull’ configuration where they would alternate beam time between two detectors (by keeping them on huge rolling platforms!). The idea is that having two completely separate and competing detectors is critical for maintaining objectivity in a world where you only have a single accelerator. Since the ILC is linear, there is only one interaction region (unlike for the common circular accelerator). So to use two detectors, you need to be able to swap them in and out! But this becomes largely unnecessary if CEPC exists to keep the ILC honest.

I think this is a bad development for physics because I am pessimistic about particle accelerators telling us something truly deep and novel about the universe, at least in the next century.… [continue reading]

PI accepting 2016 Postdoc applications

Perimeter Institute is now accepting applications for 3- and 5-year postdoc positions to start Fall 2016. After having been here a year, I can tell you that PI is amazing. This is the greatest place for fundamental physics research in the world. Stop working on problems that someone else would do anyway and come tackle the big questions with me!

Here is the poster, and here is the blurb:

Perimeter Institute for Theoretical Physics invites applications for postdoctoral positions from new and recent PhDs working in fundamental theoretical physics. Our areas of strength include classical gravity, condensed matter theory, cosmology, particle physics, mathematical physics, quantum fields and strings, quantum foundations, quantum information, and quantum gravity. We also encourage applications from scientists whose work falls in more than one of these categories. Our postdoctoral positions are normally for a period of three years. Outstanding candidates may also be considered for a senior postdoctoral position with a five-year term.

Perimeter Institute offers a dynamic, multi-disciplinary environment with maximum research freedom and opportunity to collaborate within and across fields. Our postdoctoral positions are intended for highly original and intellectually adventurous young theorists. Perimeter offers comprehensive support including a generous research and travel fund, opportunities to invite visiting collaborators, and help in organizing workshops and conferences. A unique mentoring system gives early-career scientists the feedback and support they need to flourish as independent researchers.

The Institute offers an exceptional research environment and is currently staffed with 40 full-time and part-time faculty members, 42 Distinguished Visiting Research Chairs, 55 Postdoctoral Researchers, 47 Graduate Students, and 28 exceptional master’s-level students participating in Perimeter Scholars International. Perimeter also hosts hundreds of visitors and conference participants throughout the academic year.

[continue reading]

Additional material on the arXiv

The arXiv admin board is considering adding more options for linking to material related to a submission. Some examples: blog posts, news items, video lectures, scientific video, software, lecture slides, simulations, follow-up articles, author’s personal website. What else might be useful?

Here is a mockup of what things could look like (link to HTML):

[continue reading]

Megajournals unbundling assessments of correctness and importance

Can the judgement of scientific correctness and importance be separated in journal publishing? Progress in this direction is being made by megajournals (a misleading name) that assess only correctness, leaving impact evaluation to other post-publication metrics. The first link suggests that such journals may have saturated the market, but actually this result is overwhelmingly dominated by PLOS ONE, and the other megajournals look like they are still growing. (H/t Tyler Cowen.)

Although I am generally for the “unbundling” of the various roles played by the journal, I think this actually could have bad results. There currently is a stupendous amount of academic writing being produced, and only a tiny fraction of it can be read carefully by thoughtful people. Folks are fighting for the attention of their colleagues, and most papers are not worth it. Right now, if you think you have a good result you can submit to a high-impact journal, and there is at least a chance that the editor will send it out for review, and at least two reasonably qualified referees will be forced to read it. If they decide your paper is important, it gets published in a way that marks its importance.

But consider the alternate universe, where everything correct just goes up on the arXiv, and ex post facto certifications are applied to work that someone important later decides is super interesting. In this case, an article is not guaranteed to get any qualified readers at all. Rather, new articles will be read or not read based on some combination of author prestige, abstract salesmanship, and the amplification of initial random noise.[a]

[a] To explain the latter: In any given set of papers with indistinguishable external features, some will get read and others won’t by chance.

[continue reading]

GitWikXiv follow-up: A path to forkable papers

[Other posts in this series: 1,2,3.]

I now have a more concrete idea of some of the pie-in-the-sky changes I would like to see in academic publishing in the long term. I envision three pillars:

  • “Scientifica”: a linked, universally collaborative document that takes the reader from the most basic introductory concepts to the forefront of research.[a] Imagine a Wikipedia for all of science, maintained by researchers. Knowen and Scholarpedia are early prototypes, although I believe a somewhat stronger consensus mechanism akin to particle physics collaborations will be necessary.
  • ArXiv++: a central repository of articles that enables universal collaboration through unrestricted forking of papers. This could arise by equipping the arXiv with an open attribution standard and moving toward a copyleft norm (see below).
  • Discussion overlay: There is a massive need for quick, low-threshold commentary on articles, although I have fewer concrete things to say about this at the moment. For the time being, imagine that each arXiv article accumulated nested[b] comments (or other annotations) that the reader could choose to view or suppress, and which could be added to with the click of a button.

[a] Edit 2016-4-22: I am embarrassed that I did not make it clear when this was initially posted that the Scientifica idealization is mostly a product of Godfrey Miller. Hopefully he didn’t notice…

[b] Nested comments are just comments that allow comment-specific replies, organized in a hierarchy; see here for a visual example.
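To make the nested-comment idea concrete, here is a minimal sketch in Python of a comment hierarchy that a reader could view or suppress. The names `Comment`, `reply`, and `render`, along with the sample authors and comment text, are hypothetical illustrations; nothing here corresponds to an existing arXiv or overlay API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    """One comment in a discussion overlay (hypothetical structure)."""
    author: str
    text: str
    replies: List["Comment"] = field(default_factory=list)

    def reply(self, author: str, text: str) -> "Comment":
        """Attach a comment-specific reply and return it."""
        child = Comment(author, text)
        self.replies.append(child)
        return child


def render(comment: Comment, depth: int = 0, show: bool = True) -> List[str]:
    """Flatten the hierarchy into indented lines; show=False suppresses it."""
    if not show:
        return []
    lines = ["  " * depth + f"{comment.author}: {comment.text}"]
    for child in comment.replies:
        lines.extend(render(child, depth + 1))
    return lines


# Sample thread (made-up content, for illustration only).
root = Comment("alice", "Is the bound in Sec. 2 tight?")
answer = root.reply("bob", "Only for pure states.")
answer.reply("alice", "Thanks, that resolves it.")
root.reply("carol", "See also the appendix.")

print("\n".join(render(root)))
```

Each reply hangs off the specific comment it answers, so the indentation depth of a rendered line reflects its position in the hierarchy, and passing `show=False` models the reader choosing to suppress the overlay.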

The conceptual flow here is that bleeding-edge research is documented on the arXiv, is discussed on the overlay, and — when it has been hashed out through consensus — it is folded into Scientifica.… [continue reading]

GitWikXiv follow-up: An open attribution standard?

[Other posts in this series: 1,2,4.]

My GitWikXiv post on making the academic paper universally collaborative got a lot of good comments. In particular, I recommend reading Ivar Martin, who sees a future of academic writing that is very different from what we have now.

Along a slightly more conventional route, the folks working on Authorea made a good case that they have several of the components that are needed to allow universal collaboration, and they seem to have a bit of traction.[a] I was asked what it would take to solve the remaining problems by my lights, and I sketched a hypothetical way to let Authorea (which is a for-profit company) interface with the arXiv to enable universal collaboration with proper attribution. The key step would be the introduction of an attribution open file standard that could be agreed upon by the academic community, and especially by the arXiv advisory board.… [continue reading]

[a] More generally, the comments on the post gave me the impression that lots of people are working on tools, but not many people are working on open standards. (This isn’t surprising, since software tools are a lot easier to develop by a handful of people.) It may be that a lot of the social/cultural obstacles (in contrast to technical ones), that we all seem to agree are the most difficult, aren’t actually mental problems so much as coordination problems. In other words, it might not have anything to do with old researchers being set in their ways as much as tragedy-of-the-commons-type obstacles. So maybe there should be more focus on open standards like ORCID, smart citations, data accessibility, and an attribution standard like I discuss here.

GitWikXiv follow-up: Distinctions in academic tools

[Other posts in this series: 1,3,4.]

In a follow-up to my GitWikXiv post on making the academic paper more collaborative, I’d like to quickly lay out two important distinctions as a way to anchor further discussion.

Revision vs. attribution vs. evaluation

Any system for allowing hundreds of academics to collaborate on new works needs to track and incentivize who contributes what. But it’s key to keep these parts separate conceptually (and perhaps structurally).

  • Revisions are the bare data necessary to reconstruct the evolution of a document through time. This is the well-trodden ground of revision control software like Git (the system underlying GitHub).
  • Attribution is the assigning of credit. At the minimum this includes tagging individual revisions with the name/ID of the revisor(s). But more generally it includes the sort of information that can be found in footnotes (“I thank J. Smith for alerting me to this possibility”), acknowledgements (“We are grateful to J. Doe for discussion”), and author contributions statements (“A. Atkins ran the experiment; B. Bonkers analyzed the data”).
  • Evaluation of the revisions is done to assess how much they are worth. This can be expressed as an upvote (as on StackExchange), as a number of citations or other bibliometric like the h-index, or as being published in a certain venue like Nature.[a]

[a] In general I am against most evaluation metrics. I actually think that these metrics correlate pretty strongly with academic accomplishment, all else being equal, but I think all else is very not equal, and that the metrics become gamed as soon as you attach incentives to them. For instance, the number of times an actor is mentioned on Twitter probably correlates pretty strongly with how good an actor they are, but it drastically underrates Broadway actors compared to movie actors, or niche art-film actors compared to Adam Sandler.
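One way to picture keeping these three parts separate is as three independent record types that only share a revision ID. The sketch below is a hypothetical data model in Python, not an existing standard: the class names, the `role` labels, and the `credit_by_contributor` scoring rule are all assumptions made for illustration, and the rule shown is just one of many possible evaluation policies.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Revision:
    """Bare data needed to reconstruct the document's history."""
    rev_id: int
    parent_id: int  # -1 marks the first revision
    text: str


@dataclass(frozen=True)
class Attribution:
    """Credit attached to a revision, stored apart from the revision."""
    rev_id: int
    contributor: str
    role: str  # hypothetical labels, e.g. "author" or "discussion"


@dataclass(frozen=True)
class Evaluation:
    """A judgment of worth (here an upvote-like score), stored apart again."""
    rev_id: int
    evaluator: str
    score: int


def credit_by_contributor(
    attributions: List[Attribution], evaluations: List[Evaluation]
) -> Dict[str, int]:
    """Illustrative policy: each contributor gets the total score of every
    revision they are attributed to. Swapping this policy for another does
    not touch the revision or attribution records."""
    score_per_rev: Dict[int, int] = {}
    for ev in evaluations:
        score_per_rev[ev.rev_id] = score_per_rev.get(ev.rev_id, 0) + ev.score
    credit: Dict[str, int] = {}
    for at in attributions:
        credit[at.contributor] = (
            credit.get(at.contributor, 0) + score_per_rev.get(at.rev_id, 0)
        )
    return credit


revisions = [
    Revision(0, -1, "Draft."),
    Revision(1, 0, "Draft. Added analysis."),
]
attributions = [
    Attribution(0, "A. Atkins", "author"),
    Attribution(1, "B. Bonkers", "author"),
    Attribution(1, "J. Doe", "discussion"),
]
evaluations = [
    Evaluation(0, "reader-1", 1),
    Evaluation(1, "reader-1", 1),
    Evaluation(1, "reader-2", 1),
]

print(credit_by_contributor(attributions, evaluations))
```

Because the revision history never stores names or scores, the same history could be paired with a different attribution record or a different evaluation rule without being rewritten.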
[continue reading]

Beyond papers – GitWikXiv

[Other posts in this series: 2,3,4.]

I had the chance to have dinner tonight with Paul Ginsparg of arXiv fame, and he graciously gave me some feedback on a very speculative idea that I’ve been kicking around: augmenting — or even replacing — the current academic article model with collaborative documents.

Even after years of mulling it over, my thoughts on this aren’t fully formed. But I thought I’d share my thinking, however incomplete, after incorporating Paul’s commentary while it is still fresh in my memory. First, let me start with some of the motivating problems as I see them:

  • People still reference papers from 40 years ago for key calculations (not just for historical interest or apportioning credit). These papers often have such poor typesetting that they are hard to read, and they lack machine-readable text, URL links, etc.
  • Getting oriented on a topic often requires reading a dozen or more scattered papers with varying notation, where the key advances (as judged with hindsight) are mixed in with material that is much less important.
  • More specifically, papers sometimes have a small crucial idea that is buried in tangential details having to do with that particular author’s use for the idea, even if the idea has grown way beyond the author.
  • Some authors could contribute the key idea, but others could contribute clarity of thought, or make connections to other fields. In general these people may not know each other, or be able to easily collaborate.
  • There aren’t enough good review articles.[a]

[a] When the marginal cost of producing a textbook is near zero, the fact that no one gets proper credit for writing good textbooks isn’t so bad simply because you only need one or two good ones, and the audience is huge.
[continue reading]