Effective Code Review for Journals

Nature Biotechnology recently announced that it will require authors to ‘check the accessibility of code used in computational studies’, in an effort to mitigate retractions and errors resulting from bugs and under-validated code. The article quoted the Science Lab’s director, Kaitlin Thaney, on the Science Lab’s position that openness in research is not only a matter of releasing information, but also of making sure that information is effectively reusable, so that others can reproduce and confirm results and carry the work forward.

But technical challenges remain. As the series of code review pilot studies run by the Science Lab and Marian Petre of the Open University in 2013 and 2014 discovered, asking third parties to review code they had no hand in writing leads to superficial reviews without much value; see reflections on these studies from Thaney as well as Greg Wilson, in addition to recent comments to the same effect from Wilson here.

However, journals like Nature Biotech can still compel some very valuable change by marshaling a system of code review for their submissions. As we discuss in our teaching kit on code review (and as was originally investigated in this study), much value can be derived from setting expectations for code clarity and integrity. By requiring authors to submit a high-coverage test suite for any original code used, journals can encourage researchers to adopt this fundamental technique for ensuring code quality; and, as discussed in depth in the study linked above, the act of requiring authors to describe and justify the changes made in each pull request results in measurably fewer bugs being committed – before code review has even begun. Specifically, journals could require:

  • a passing test suite with a minimum standard of coverage (>90%)
  • a commit log consisting of small pull requests (<500 lines each), each with an accompanying description and justification of the changes made and strategies taken.

Neither of these requires reviewers to read code in depth, but both push authors to reflect seriously on their code, and thus improve its quality.
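To make the first requirement concrete, here is a minimal sketch of the kind of self-contained test suite an author might submit alongside their analysis code. The `rescale` function and the `test_rescale.py` file name are hypothetical stand-ins for whatever original code a study actually uses, and pytest is assumed only as one convenient framework; anything that can report coverage would serve equally well.

```python
# test_rescale.py -- a self-contained sketch; in a real submission the
# function under test would live in the authors' own analysis package.
import pytest


def rescale(values):
    """Rescale a sequence of numbers to the range [0, 1] (hypothetical example)."""
    if not values:
        raise ValueError("cannot rescale an empty sequence")
    lo, hi = min(values), max(values)
    if lo == hi:
        # Degenerate case: all values identical, so map everything to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_spans_unit_interval():
    # Ordinary input should be stretched to exactly cover [0, 1]
    assert rescale([2.0, 4.0, 6.0]) == [0.0, 0.5, 1.0]


def test_constant_input_maps_to_zero():
    # Constant input exercises the degenerate branch
    assert rescale([3.0, 3.0, 3.0]) == [0.0, 0.0, 0.0]


def test_empty_input_raises():
    # The error path counts toward coverage too
    with pytest.raises(ValueError):
        rescale([])
```

Coverage tools can then enforce the threshold mechanically – for instance, the pytest-cov plugin will fail the run if coverage drops below a chosen minimum (`pytest --cov --cov-fail-under=90`) – so an editor can verify the requirement from a single pass/fail result without reading the code at all.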

For more strategies on how to implement a system of code review for scientific software, check out our curriculum on code review. The ideas and strategies presented there are crafted with busy scientists in mind, and explore how to get the most out of short, low-time-commitment reviews; feedback and contributions always welcome over at the project repo.