Preface to Python Data Science Handbook (Early Release)

Data Science Venn Diagram

O’Reilly Publishing has released a preface to Python Data Science Handbook (Early Release) by Jake VanderPlas, eScience’s Senior Data Scientist and Director of Research, Physical Sciences.

“What is data science?” VanderPlas writes. “It’s a surprisingly hard definition to nail down, especially given how ubiquitous the term has become. Despite its hype-laden veneer, [data science] is perhaps the best label we have for the cross-disciplinary set of skills that are becoming increasingly important in many applications across industry and academia.”

And why Python? “[It] has emerged over the last couple decades as a first-class tool for scientific computing tasks, including the analysis and visualization of large datasets.”

VanderPlas’ book is geared toward technically minded students, researchers, and developers with a strong background in writing code and using computational and numerical tools. It is organized around a “mental model” of data science as the overlap of computational, statistical, and domain expertise, known as the Data Science Venn Diagram. The first four sections of Python Data Science Handbook focus on the computational component: the Python language and the extensive ecosystem of data-focused tools available within it. The rest of the book discusses the fundamental concepts of statistics and mathematics and their use in analyzing datasets. “The goal,” says VanderPlas, “is that by the end readers will be poised to use these Python tools [to] process, describe, model, and draw inferences from the various data they encounter.”

VanderPlas encourages readers to think of data science not as a new domain or expertise to learn, but as “a new set of skills that you can apply within your current area of expertise. Whether you are reporting election results, forecasting stock returns, optimizing online ad clicks, identifying microorganisms in microscope photos, seeking new classes of astronomical objects, or working with data in any other field, my goal is that the content of this book would give you the ability to ask and answer new questions about your chosen subject area.”

You can read the preface to Python Data Science Handbook (Early Release) here:
https://beta.oreilly.com/learning/introduction-to-pandas