Data Science for Business – A serious guide for those who need to know – #bigdata #bookreview

Data Science for Business

What You Need to Know about Data Mining and Data-Analytic Thinking
Foster Provost and Tom Fawcett
(O’Reilly – paperback, Kindle)

This is not an introductory text for casual readers curious about the hoopla over data science and Big Data.

And you definitely won’t find code here for simple screen scrapers written in Python 2.7 or programs that access the Twitter API to scoop up messages containing certain hashtags.

Data Science for Business is based on an MBA course Foster Provost teaches at New York University, and it is aimed at three specific, serious audiences:

  • “Aspiring data scientists”
  • “Developers who will be implementing data science solutions…”
  • “Business people who will be working with data scientists, managing data science-oriented projects, or investing in data science ventures….”

Provost and Fawcett’s book “concentrates on the fundamentals of data science and data mining,” the two authors state. But it specifically avoids “an algorithm-centered approach” and instead focuses on “a relatively small set of fundamental concepts or principles that underlie techniques for extracting useful knowledge from data. These concepts serve as the foundation for many well-known algorithms of data mining,” they note.

“Moreover, these concepts underlie the analysis of data-centered business problems, the creation and evaluation of data science solutions, and the evaluation of general data science strategies and proposals.”

The book is well written and adequately illustrated with charts, diagrams, equations and worked mathematical examples. The text, while technical and dense in some places, is organized into short sections. Most of the chapters end with insightful summaries that help the lessons stick.

Both authors are veterans in the use of data science in business. Their new book includes two helpful appendices. The first shows how to “assess potential data mining projects” and “uncover potential flaws in proposals.” The second presents a sample proposal and discusses its flaws.

“If you are a business stakeholder rather than a data scientist,” the authors caution, “don’t let so-called data scientists bamboozle you with jargon: the concepts of this book plus knowledge of your own business and data systems should allow you to understand 80% or more of the data science at a reasonable enough level to be productive for your business.”

They also challenge data scientists to “think deeply about why your work is relevant to helping the business and be able to present it as such.”

Si Dunn

Getting Started with D3 – A guide to working with data-driven documents – #bookreview #javascript

Getting Started with D3
Mike Dewar
(O’Reilly – paperback, Kindle)

This focused, 58-page how-to guide introduces the basics of D3, a JavaScript library written by Mike Bostock.

The D3 library, a free download, can be used to manipulate documents based on data. According to the Data-Driven Documents website, “D3 allows you to bind arbitrary data to a Document Object Model (DOM), and then apply data-driven transformations to the document. For example, you can use D3 to generate an HTML table from an array of numbers. Or, use the same data to create an interactive SVG bar chart with smooth transitions and interaction.”
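To make the "HTML table from an array of numbers" idea concrete, here is a minimal sketch in plain JavaScript. It does not use D3 and is not taken from the book or the D3 site; the helper name `tableFromNumbers` and the sample values are hypothetical, chosen only to show the general pattern of turning each data item into one piece of the document.

```javascript
// Hypothetical helper, not part of D3: returns an HTML table as a string,
// with one row per number in the input array.
function tableFromNumbers(numbers) {
  // Each datum becomes exactly one <tr> containing one <td>.
  const rows = numbers.map((n) => `  <tr><td>${n}</td></tr>`);
  return ["<table>", ...rows, "</table>"].join("\n");
}

// Sample data, made up for illustration.
const html = tableFromNumbers([4, 8, 15]);
console.log(html);
```

D3 itself works against the live DOM rather than building strings, so treat this only as an illustration of the one-datum-to-one-element mapping that the quoted description refers to.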

Mike Dewar’s book is aimed at “the data scientist: someone who has data to visualize and who wants to use the power of the modern web browser to give his visualizations additional impact.” However, if you don’t consider yourself a data scientist but are comfortable with coding and manipulating data, this book can still show you how to use a combination of JavaScript and SVG [Scalable Vector Graphics] “to build everything from simple bar charts to complex infographics.”

Getting Started with D3 has six chapters, and they are illustrated with code samples and examples of graphics produced using D3.

  1. Introduction
  2. The Enter Selection
  3. Scales, Axes, and Lines
  4. Interactions and Transitions
  5. Layout
  6. Conclusion

In his conclusion, Mike Dewar, a data scientist at Bitly, offers encouragement and additional resources for digging deeper into D3. “The documentation for D3 is extensive,” he writes, “and is available at along with a huge gallery of examples. This is an essential resource, both for reference and inspiration.”

His book is also an essential resource for learning the basics of D3.

Si Dunn