Statistics students need to develop the capacity to make sense of the staggering amount of information collected in our increasingly data-centered world. Data science is an important part of modern statistics, but our introductory and second statistics courses often neglect this fact. This paper discusses ways to provide a practical foundation for students to learn to "compute with data" as defined by Nolan and Temple Lang (2010), as well as develop "data habits of mind" (Finzer, 2013). We describe how introductory and second courses can integrate two key precursors to data science: the use of reproducible analysis tools and access to large databases. By introducing students to commonplace tools for data management, visualization, and reproducible analysis in data science and applying these to real-world scenarios, we prepare them to think statistically in the era of big data.
@TechReport{data-sci-precursors,
  author = {Nicholas Horton and Ben Baumer and Hadley Wickham},
  journal = {ICOTS 2014},
  institution = {had.co.nz},
  title = {Teaching precursors to data science in introductory and second courses in statistics},
  year = {2014},
}