Wednesday 23 May 2018

Data science is science’s second chance to get causal inference right (2018)



Data science is science’s second chance to get causal inference right. A classification of data science tasks

Miguel A. Hernán, John Hsu, & Brian Healy

Abstract

Causal inference from observational data is the goal of many health and social scientists. However, academic statistics has often frowned upon data analyses with a causal objective. The advent of data science provides a historical opportunity to redefine data analysis in such a way that it naturally accommodates causal inference from observational data. We argue that the scientific contributions of data science can be organized into three classes of tasks: description, prediction, and causal inference. An explicit classification of data science tasks is necessary to describe the role of subject-matter expert knowledge in data analysis. We discuss the implications of this classification for the use of data to guide decision making in the real world.
