A proposal for recommending and ranking scientific literature

This blog post was motivated by a Twitter conversation involving Andy Revkin.

Scientists need a system to help them find the papers that are really worth taking the time to read carefully.

Right now, working scientists and those who would like to follow scientific literature have difficulty wading through the thousands of papers that are published every day to get to the papers that are worth reading.

The problem is caused by the emphasis on quantitative publication-based metrics to assess scientific productivity. These metrics give authors an incentive to publish many papers describing micro-advances, and to divide a single integrated study into several papers. (These metrics also provide incentives to add co-authors who have contributed little, but that is another story.)

Working scientists, and people who would like to follow the work of scientists, need help.

The following proposal is a rough sketch and not all of the details have been worked out.

The basic idea is to create an online platform that would help people understand what they should read to keep up with the scientific conversation in a given topic area or sub-discipline.

The platform would be about recommending reading.

It would not be about criticizing content that is found in the literature, and it would not be about saying what not to read. Aside from looking at the statistics of recommendations, the only action someone could take would be to recommend a paper to people interested in a topic area.

It is inspired by things like Stack Exchange and Reddit. In the ideal set-up, perhaps on some platform similar to Google Scholar, there would be a way to tell the system that you recommend people interested in, for example, metamorphic petrology to take the time to read this paper.

The key would be to enable recommendations to be sorted in different ways.

Disciplinary expertise. Recommendations from different people could be weighted differently depending on how many (weighted) recommendations their own work has received within that topic area (sub-discipline). So, a metamorphic petrologist whose work has received many recommendations would have more influence on the ranking of papers within the metamorphic petrology topic area.
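Because a recommender's weight depends on weighted recommendations of their own work, the weighting is circular and has to be computed iteratively. Here is a minimal sketch of one way that could be done. Everything in it is an assumption made for illustration: the function name `rank_papers`, the input format, the base weight of 1 for every recommender, and the rule that authors gain weight in proportion to their papers' share of the total score. It also does nothing to stop people from recommending their own papers.

```python
from collections import defaultdict


def rank_papers(recommendations, authors, iterations=20, base_weight=1.0):
    """Score papers in one topic area with expertise-weighted recommendations.

    recommendations: list of (recommender, paper) pairs (hypothetical format).
    authors: dict mapping each paper to a list of its authors.
    Returns a dict mapping each paper to its weighted recommendation score.
    """
    # Everyone starts with the same weight.
    weight = defaultdict(lambda: base_weight)
    scores = {}
    for _ in range(iterations):
        # A paper's score is the sum of the weights of its recommenders.
        scores = defaultdict(float)
        for recommender, paper in recommendations:
            scores[paper] += weight[recommender]

        # An author's weight is the base weight plus the share of the
        # total score earned by the papers they wrote.
        total = sum(scores.values()) or 1.0
        new_weight = defaultdict(lambda: base_weight)
        for paper, score in scores.items():
            for author in authors.get(paper, []):
                new_weight[author] += score / total
        weight = new_weight
    return dict(scores)


recs = [("alice", "p1"), ("bob", "p1"), ("carol", "p2"), ("dave", "p3")]
paper_authors = {"p1": ["carol"], "p2": ["erin"], "p3": ["frank"]}
scores = rank_papers(recs, paper_authors)
# carol wrote the well-recommended p1, so her recommendation of p2
# counts for more than dave's recommendation of p3.
print(scores["p2"] > scores["p3"])
```

In this toy example p2 and p3 each have a single recommendation, but p2 ends up ranked higher because its recommender has work that others in the topic area have recommended.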

Different time periods. One could look at recommendations as a time series, and use the net present value of weighted recommendation instances to sort reading recommendations. If the user wanted to see what the most important papers were on the decadal time scale, they could discount on a ten-year time scale. If the user wanted to see what the most important papers were over the last few weeks, they could discount on a one-week time scale. The time discounting could also be used to reduce the weight of recommendations from people who make recommendations very frequently.
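To make the discounting idea concrete, here is a rough sketch that assumes exponential discounting: a recommendation of weight w made at time t contributes w * exp(-(now - t) / timescale) to a paper's score. The exponential form, the function name `discounted_score`, and the use of days as the time unit are all illustrative assumptions rather than part of the proposal.

```python
import math


def discounted_score(events, now, timescale_days):
    """Sum recommendation weights, discounted by age.

    events: list of (day, weight) pairs, one per recommendation
            (hypothetical format; days counted from any fixed origin).
    now: the current day on the same scale.
    timescale_days: the discounting time scale chosen by the user.
    """
    return sum(
        weight * math.exp(-(now - day) / timescale_days)
        for day, weight in events
        if day <= now  # ignore recommendations dated in the future
    )


now = 4000
classic = [(now - 3650, 1.0)] * 10   # ten recommendations, ten years old
recent = [(now - 1, 1.0)] * 2        # two recommendations from yesterday

# On a one-week time scale the recent paper ranks first.
print(discounted_score(recent, now, 7) > discounted_score(classic, now, 7))
# On a ten-year time scale the older, more recommended paper ranks first.
print(discounted_score(classic, now, 3650) > discounted_score(recent, now, 3650))
```

The same two papers swap places depending only on the time scale the reader picks, which is the behavior the sorting is meant to provide.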

Sorting and searching. While the institution that hosts the database should provide basic search functions, if the resulting database is open access, as it should be, many people could provide services filtering and sorting results in different ways.  One could imagine constructing associations between topic areas by looking at papers recommended in more than one topic area, and searching for important papers to read based on those associations.
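As one illustration of what a third-party service could build on an open database, here is a small sketch that measures how strongly two topic areas are associated by the papers recommended in both. The choice of measure (shared papers divided by all papers recommended in either topic, often called Jaccard similarity), the function name `topic_associations`, and the input format are assumptions for the sake of the example.

```python
from collections import defaultdict
from itertools import combinations


def topic_associations(recommendations):
    """Measure the overlap between topic areas.

    recommendations: list of (topic, paper) pairs (hypothetical format).
    Returns a dict mapping (topic_a, topic_b) to the fraction of papers
    recommended in either topic that are recommended in both.
    Pairs of topics with no shared papers are left out.
    """
    papers_by_topic = defaultdict(set)
    for topic, paper in recommendations:
        papers_by_topic[topic].add(paper)

    associations = {}
    for a, b in combinations(sorted(papers_by_topic), 2):
        shared = papers_by_topic[a] & papers_by_topic[b]
        if shared:
            either = papers_by_topic[a] | papers_by_topic[b]
            associations[(a, b)] = len(shared) / len(either)
    return associations


recs = [
    ("metamorphic petrology", "p1"),
    ("metamorphic petrology", "p2"),
    ("geochemistry", "p2"),
    ("geochemistry", "p3"),
    ("oceanography", "p4"),
]
print(topic_associations(recs))
```

A reader following metamorphic petrology could then be pointed to highly ranked papers in the topic areas most strongly associated with it.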

Questions remain:

— How should other aspects of the platform be designed, including how to create topic areas within the system?

— To what extent can or should anonymity be provided?

— How can we design a system such that when people try to game the system, they are doing what is best for the system?

Of course, proposals like this suffer from a chicken-and-egg problem. If everyone were already using a system like this, the system would be useful and busy scientists would have an incentive to use it. But if nobody is using the system, then nobody has an incentive to contribute to it. Therefore, a system like this would need to be initiated by people or institutions with some standing, perhaps Google, professional associations, or national academies.

It would be great if there were some kind of community-wide reading recommendation service with the granularity to be useful even on topics of extremely narrow interest.