Data science is an emerging field in industry, and as yet, it is not well-defined as an academic subject. This book represents an ongoing investigation into the central question: “What is data science?” It’s based on a class called “Introduction to Data Science,” which I designed and taught at Columbia University for the first time in the Fall of 2012. In order to understand this book and its origins, it might help you to understand a little bit about me and what my motivations were for creating the class.
Motivation
In short, I created a course that I wish had existed when I was in college, but that was the 1990s, and we weren’t in the midst of a data explosion, so the class couldn’t have existed back then. I was a math major as an undergraduate, and the track I was on was theoretical and proof-oriented. While I am glad I took this path, and feel it trained me for rigorous problem-solving, I would have also liked to have been exposed then to ways those skills could be put to use to solve real-world problems.
Table of Contents
- Introduction: What Is Data Science?
- Statistical Inference, Exploratory Data Analysis, and the Data Science Process
- Spam Filters, Naive Bayes, and Wrangling
- Logistic Regression
- Time Stamps and Financial Modeling
- Extracting Meaning from Data
- Recommendation Engines: Building a User-Facing Data Product at Scale
- Data Visualization and Fraud Detection
- Social Networks and Data Journalism
- Lessons Learned from Data Competitions: Data Leakage and Model Evaluation
- Data Engineering: MapReduce, Pregel, and Hadoop
- The Students Speak
- Next-Generation Data Scientists, Hubris, and Ethics