Today we got to talk with Titus Brown, a computer scientist/microbiologist formerly at MSU who has moved to UC Davis and now works on data-intensive biology, about how to determine authorship. Authorship is becoming more of an issue in massively collaborative work. Some of the traditional rules followed by most biological and medical journals may not be appropriate for open science and digitally tracked projects. Our class already discussed this at the beginning of the semester, but today we went much deeper into the core question: should we give authorship to everyone who contributed to the Git repo, or to any digitally captured project?
A little background for the class:
With the development of genomic sequencing technology, biologists are handling far more data than they did a few decades ago. Computers and analysis software have become indispensable. Yet most commercial analysis tools are quite limited in function or not user-friendly. The reasons include that not many scientists use them, that most biologists are not well trained in software programming, and that early-stage scientists, who need to advance quickly, are more motivated to produce journal publications than to write software packages. To change this situation, Dr. Titus Brown and his colleagues started writing an open source software package, khmer, in 2008/2009. Upon finishing, they published a paper that shifted authorship from the paper to the software, and announced that anyone who had contributed edits to the GitHub repo could join the author list. This action and the accompanying blog post triggered a lot of discussion on Twitter. To discuss this topic fully, we did the SWOT analysis (Strengths, Weaknesses, Opportunities and Threats) suggested by Dr. Titus Brown.
The best thing about using a Git repo to track authorship is that it is explicit, verifiable and transparent. Because all changes are clearly dated and recorded, the arguments that arise in traditional lab settings are much less likely. Even with open access, project managers can easily set up automatic tests that run whenever changes are made to the repo, and checklists that help guarantee contributions are good for the software. When there are hundreds or more authors on one paper, a website like Depsy (which helps build the software-intensive science of the future by promoting credit for software as a fundamental building block of science) can give detailed credit to authors by analyzing the code.
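To make the "explicit and verifiable" point concrete, here is a minimal sketch of how a candidate author list could be read out of a repo's history. It is not the method the khmer authors used; it simply assumes that the output of `git shortlog -sne` (one line per contributor: commit count, name, email) is the starting point. The names `Contributor`, `parse_shortlog`, `contributors_from_repo` and `author_line` are hypothetical, and ordering by commit count is only one crude proxy for contribution.

```python
import re
import subprocess
from typing import Dict, List, NamedTuple


class Contributor(NamedTuple):
    """Hypothetical record for one person in the repo history."""
    name: str
    email: str
    commits: int


# Matches one line of `git shortlog -sne` output, e.g.
# "    12\tAlice Example <alice@example.org>"
_LINE = re.compile(r"^\s*(\d+)\s+(.*?)\s*<([^>]*)>\s*$")


def parse_shortlog(text: str) -> List[Contributor]:
    """Turn shortlog text into contributors, merging entries by email."""
    merged: Dict[str, Contributor] = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match is None:
            continue  # skip blank or unexpected lines
        commits = int(match.group(1))
        name = match.group(2)
        email = match.group(3).lower()
        if email in merged:
            # Same email seen under another spelling: keep the first name,
            # add up the commits.
            previous = merged[email]
            merged[email] = Contributor(previous.name, email,
                                        previous.commits + commits)
        else:
            merged[email] = Contributor(name, email, commits)
    # Most commits first; ties broken alphabetically by name.
    return sorted(merged.values(),
                  key=lambda c: (-c.commits, c.name.lower()))


def contributors_from_repo(repo_path: str) -> List[Contributor]:
    """Run git in repo_path (assumes git is installed) and parse the result."""
    result = subprocess.run(
        ["git", "-C", repo_path, "shortlog", "-sne", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return parse_shortlog(result.stdout)


def author_line(contributors: List[Contributor]) -> str:
    """Format the names as a comma-separated author list."""
    return ", ".join(c.name for c in contributors)
```

Calling `author_line(contributors_from_repo("."))` inside a checkout would print everyone with at least one commit. Whether a single typo fix deserves the same listing as a major feature is exactly the question the class was debating, and commit counts alone do not settle it.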
After talking about so many good things about the Git repo as a gold standard, is it perfect? Currently the answer is no. A lot of STEM research is not digitally captured yet, and in many areas the work depends on software only to a limited extent. Thus, in environments where people cannot rely much on digital records, lab culture has to come in.
Another drawback that more researchers are becoming aware of is the miscitation of original packages and interfaces. Scientists usually cite the publications that are most recent and most closely related to their work. The original packages that served as foundations for all the subsequent packages are often left out, although they should be given credit. The open source environment and elastic authorship might make it harder to trace work back to the original contributor.
The third problem with open-access authorship arises when candidates apply for jobs: it is more difficult to evaluate their contributions and select among them using their publication lists when hundreds or thousands of people share an author list. This is when strong recommendations and other differences in background come into play.
Although there are some concerns about this action, like a moral crisis in ‘devaluating in currency system’, we still believe that opening up the traditional standard for citation is a good chance to adapt to a scientific world that is highly digital and increasingly based on large datasets. It is also an opportunity to explore better metrics for evaluating research, at a time when researchers are preoccupied with the number of their publications and citations. Don’t forget Goodhart’s law: when a measure becomes a target, it ceases to be a good measure.
We can try dividing scientists into three groups: those who find fulfillment in scientific progress, those who will use anything to advance their careers, and trolls who only attack software looking for bugs. The first type are usually self-motivated and would not misuse authorship. The other two groups, however, need better regulations to steer their efforts and guarantee fairness in publications.
We ran out of time for discussion.