TechEcho

We need a hub for software in science

9 points by juretriglav over 9 years ago

2 comments

dalke over 9 years ago
While the principle is sound, I have a few issues with the explanatory text.

> By building a hub for research software, where we would categorize it and aggregate metrics about its use and reuse, we would be able to shine a spotlight on its developers,

What are these metrics? Download statistics? Number of forks? Number of stars? How do they help 'shine a spotlight'?

Organizations have download statistics already, though they are far from accurate. For example, I co-authored the structure visualization program VMD. It included several third-party components, for example, the STRIDE program to assign secondary structures, and the SURF program to compute molecular surfaces. How would the original authors know about those uses?

(In actuality, we told them we used their software, and the SURF developer's PI once asked us for download statistics.)

> if you're a department head and a visit to our hub confirms that one of your researchers is in fact a leading expert for novel sequence alignment software, while you know her other "actual research" papers are not getting traction, perhaps you will allow her to focus on software.

The hub proposal offers nothing better for this use case than the current system. People who use a successful sequence alignment program end up publishing the results. These papers cite the software used. If the software is indeed one of the best in class, then the department head right now can review citation statistics. What does the hub add?

Suppose, as is often the case, that one of the researchers is a contributor to a large and successful project. How does the department head evaluate whether the researcher's contribution is significant to the overall project?

As it says, this is a rabbit hole. But it's one that has to be solved, and solved clearly enough for the department head to agree with the solution, in order to handle this use case. I'm not sure that it can be.

Personally, the best solution I know of is a curated list (like ASCL).

Perhaps as good would be something like PubPeer, to allow reviews of the software.

> Research software is often incredibly specific, and trying to Google for it is more often than not, an exercise in futility ... "sickle"

More often, the research software that people write is incredibly generic. "Call four different programs, parse their outputs, combine the results into a spreadsheet, and make some graphs." This might take a couple of weeks, and doesn't result in any publishable paper or good opportunities for code reuse.

Yet this is surely more typical of what a 'research software engineer' does than developing new, cutting-edge software.

This leads to another possible use case. Suppose you want to read in a FITS file using Python. Which package should you use? A search of ASCL (http://ascl.net/code/search/FITS) has "WINGSPAN: A WINdows Gamma-ray SPectral Analysis program" as the first hit, and the much better fit "FTOOLS: A general package of software to manipulate FITS files" as the second.

Way down the list is 'PyFITS: Python FITS Module'. And then there's 'Astropy: Community Python library for astronomy', which has merged in "major packages such as PyFITS, PyWCS, vo, and asciitable".

The task then is: which metrics would help a user make the right decision?
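(Editorial aside: for readers unfamiliar with the FITS example above, this is roughly what reading a FITS file via Astropy looks like today. The API calls are real `astropy.io.fits` functions; the file name and data are placeholders, and the commenter does not endorse any particular package.)

```python
import numpy as np
from astropy.io import fits

# Write a tiny FITS file first so the example is self-contained.
hdu = fits.PrimaryHDU(data=np.arange(6).reshape(2, 3))
hdu.writeto("example.fits", overwrite=True)

# A FITS file is a list of HDUs (header-data units);
# memmap=False keeps the data as a plain in-memory array.
with fits.open("example.fits", memmap=False) as hdul:
    header = hdul[0].header  # metadata keyword "cards"
    data = hdul[0].data      # the image/table payload
```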
Comment #10419100 not loaded.
khinsen over 9 years ago
The analysis of the problem is good, but I am not convinced by the solution.

There has never been a central hub for scientific papers, but that has never been a problem. Why should it be a problem for software? We have technologies for making software citable with a DOI, so we could aim for the same network-of-references approach to discoverability that has worked reasonably well for journal articles.

As for impact metrics, they have done more harm than good for papers, so I don't see why we should rush to make the same mistake for software. Moreover, we could do much better. Given that we are slowly moving towards provenance tracking and workflow management for replicability, we could use that same provenance information for measuring software use in a way that is verifiable and hard to game. I have outlined such an approach in a recent paper (http://dx.doi.org/10.12688/f1000research.5773.3, see the Conclusions), which should be combined with transitive credit (http://openresearchsoftware.metajnl.com/articles/10.5334/jors.be/). Such a metric would measure how much a piece of software has contributed to published computational results.
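(Editorial aside: as a toy illustration only, not the scheme from khinsen's paper, provenance-based usage counting could work along these lines. The record format and package names below are entirely hypothetical.)

```python
from collections import Counter

# Hypothetical provenance records: each published result lists the
# software that produced it (format invented for illustration).
provenance = [
    {"result": "paper-A/fig1", "software": ["numpy", "astropy"]},
    {"result": "paper-A/fig2", "software": ["numpy"]},
    {"result": "paper-B/tab1", "software": ["astropy"]},
]

# Count how many published results each package contributed to.
# Unlike downloads or stars, this is derived from the provenance
# graph of actual publications, so it is verifiable and harder to game.
usage = Counter(pkg for rec in provenance for pkg in rec["software"])
```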