Thursday 19 August 2021

Addressing The Threat That R Poses To Reproducible Research (2021)


  

Data Colada is an excellent blog written by three behavioural scientists: Uri Simonsohn, Leif Nelson and Joe Simmons.

HERE is a wonderful solution to a common problem. The problem affects users of R, the open-source statistical software. Millions of people rely on R for statistical analysis, and although it was developed by academic statisticians, it is now very widely used.

Here is Data Colada's explanation of the problem:

R itself has some reproducibility problems..., but the big problem is its packages: the addon scripts that users install to enable R to do things like run meta-analyses, scrape the web, cluster standard errors, format numbers, etc. The problem is that packages are constantly being updated, and sometimes those updates are not backwards compatible. This means that the R code that you write and run today may no longer work in the (near or far) future because one of the packages your code relies on has been updated. But worse, R packages depend on other packages. Your code could break after a package you don't know you are using updates a function you have never even used.
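To make the problem concrete, here is a minimal sketch in base R of how you might at least detect this kind of drift. It is not the solution Data Colada proposes. The helper names `record_versions` and `check_versions` are hypothetical, and the example uses the built-in `stats` and `utils` packages only so that it runs on any R installation.

```r
# Hypothetical helpers for illustration; only base R functions are called.

# Return a named character vector: package name -> installed version string.
record_versions <- function(pkgs) {
  vapply(pkgs, function(p) as.character(utils::packageVersion(p)), character(1))
}

# Compare the versions installed now with a previously recorded set
# and warn about any package whose version has changed.
check_versions <- function(recorded) {
  current <- record_versions(names(recorded))
  changed <- names(recorded)[current != recorded]
  if (length(changed) > 0) {
    warning("Package versions differ from the recorded ones: ",
            paste(changed, collapse = ", "))
  }
  invisible(changed)
}

# At analysis time: record what the script was run with.
recorded <- record_versions(c("stats", "utils"))

# Later: check whether anything has moved.
changed <- check_versions(recorded)

# Hidden dependencies: list every installed package that "stats" relies on,
# directly or indirectly. These are the packages you may not know you are using.
hidden <- tools::package_dependencies("stats",
                                      db = utils::installed.packages(),
                                      recursive = TRUE)
```

A check like this only tells you that something changed. It does not restore the old versions, which is the harder part and the part the quoted post is concerned with.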

 

An elegant solution to a real problem.