Data Science for Business – A serious guide for those who need to know – #bigdata #bookreview

Data Science for Business

What You Need to Know about Data Mining and Data-Analytic Thinking
Foster Provost and Tom Fawcett
(O’Reilly – paperback, Kindle)

This is not an introductory text for casual readers curious about the hoopla over data science and Big Data.

And you definitely won’t find code here for simple screen scrapers written in Python 2.7 or programs that access the Twitter API to scoop up messages containing certain hashtags.

Data Science for Business is based on an MBA course Foster Provost teaches at New York University, and it is aimed at three specific, serious audiences:

  • “Aspiring data scientists”
  • “Developers who will be implementing data science solutions…”
  • “Business people who will be working with data scientists, managing data science-oriented projects, or investing in data science ventures….”

Provost and Fawcett’s book “concentrates on the fundamentals of data science and data mining,” the two authors state. But it specifically avoids “an algorithm-centered approach” and instead focuses on “a relatively small set of fundamental concepts or principles that underlie techniques for extracting useful knowledge from data. These concepts serve as the foundation for many well-known algorithms of data mining,” the authors note.

“Moreover, these concepts underlie the analysis of data-centered business problems, the creation and evaluation of data science solutions, and the evaluation of general data science strategies and proposals.”

The book is well-written and adequately illustrated with charts, diagrams, mathematical equations and mathematical examples. And the text, while technical and dense in some places, is organized into short sections. Most of the chapters end with insightful summaries that help the lessons stick.

Both authors are veterans in the use of data science in business. Their new book includes two helpful appendices. One shows how to “assess potential data mining projects” and “uncover potential flaws in proposals.” The second appendix presents a sample proposal and discusses its flaws.

“If you are a business stakeholder rather than a data scientist,” the authors caution, “don’t let so-called data scientists bamboozle you with jargon: the concepts of this book plus knowledge of your own business and data systems should allow you to understand 80% or more of the data science at a reasonable enough level to be productive for your business.”

They also challenge data scientists to “think deeply about why your work is relevant to helping the business and be able to present it as such.”

Si Dunn

The LEGO Build-It Book 1: Amazing Vehicles – Creating with 1 brick collection – #bookreview

The LEGO Build-It Book 1: Amazing Vehicles

Nathanaël Kuipers and Mattia Zamboni
(No Starch Press, paperback)
ISBN: 978-1-59327-503-7

Using just one collection of LEGO bricks and this colorful how-to guide, you can build 10 different model vehicles, starting with a simple go-kart and working your way up to a muscle car, a street rod, and a rescue truck, among others.

No Starch Press recently launched its LEGO Build-It Book series with this well-crafted volume, aimed at readers age 7 and up. Volume 2, due out in September 2013, will offer another group of 10 construction projects that can be built from just one collection of LEGO bricks.

Many young readers will appreciate the new LEGO book because its many illustrations mostly just show, step by numbered step, how each vehicle goes together.

Nathanaël Kuipers is a Dutch design professional who spent several years working for the LEGO Group in Denmark, where he was mainly responsible for engineering LEGO Technic models. Co-author Mattia Zamboni has a background in graphic design, photography, and LEGO, as well as electrical engineering.

A key message from this book and the evolving Build-It Book series, Kuipers says, is: “You don’t need to buy the really expensive products or lots and lots of sets to make interesting models. With a little creativity and some useful techniques, you can build endless models from a simple collection of bricks.”

Si Dunn

The Practice of Network Security Monitoring – You’re compromised, so deal with it. #security #bookreview

The Practice of Network Security Monitoring

Understanding Incident Detection and Response
Richard Bejtlich
(No Starch Press – paperback, Kindle)

Security expert Richard Bejtlich’s focus in his new book is not on “the planning and defense phases of the security cycle.” Instead, he emphasizes how to handle “systems that are already compromised or that are on the verge of being compromised.”

His well-organized, well-written, 341-page book aims to help you “start detecting and responding to digital intrusions using network-centric operations, tools, and techniques.”

Bejtlich has long emphasized a “detection-centered philosophy” built around a straightforward central tenet: “Prevention eventually fails.” No matter how many digital walls and moats you build around your network, someone will find a way to tunnel in, parachute in, or sneak in via an unsuspecting employee’s $9.95 thumb drive.

“It’s becoming smarter,” he writes, “to operate as though your enterprise is always compromised. Incident response is no longer an infrequent, ad-hoc affair. Rather, incident response should be a continuous business process with defined metrics and objectives.”

You may recognize some of Bejtlich’s previous books on network security monitoring (NSM): The Tao of Network Security Monitoring; Extrusion Detection; and Real Digital Forensics.

The Practice of Network Security Monitoring is tailored toward two key audiences: (1) security professionals who have little or no experience with NSM; and (2) “more senior incident handlers, architects, and engineers who need to teach NSM to managers, junior analysts, or others who may be technically less adept.”

Readers, he adds, should understand “the basic use of the Linux and Windows operating systems, TCP/IP networking, and the essentials of network attack and defense.”

The examples in Bejtlich’s book rely on open source and vendor-neutral tools, primarily from Doug Burks’ Security Onion (SO) distribution.

The 13-chapter book is organized into four parts:

  • Part I: Getting Started – Introduces NSM and sensor placement issues.
  • Part II: Security Onion Deployment – Shows how to install and configure SO.
  • Part III: Tools – Examines the “key software shipped with SO and how to use these applications.”
  • Part IV: NSM in Action – Looks at “how to use NSM processes and data to detect and respond to intrusions.”

Following the technical chapters, Bejtlich offers some concluding thoughts on network security management, cloud computing, and establishing an effective workflow for NSM. “NSM isn’t just about tools,” he writes. “NSM is an operation, and that concept implies workflow, metrics, and collaboration. A workflow establishes a series of steps that an analyst follows to perform the detection and response mission. Metrics, like the classification and count of incidents and time elapsed from incident detection to containment, measure the effectiveness of the workflow. Collaboration enables analysts to work smarter and faster.”
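The metrics Bejtlich names (incident counts by classification and the time from detection to containment) are simple to compute once incidents are logged. Here is a minimal Python sketch of that idea. It is not taken from the book: the `Incident` record, the `summarize` helper, the category labels, and the sample timestamps are all made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    category: str        # hypothetical classification label
    detected: datetime   # when the incident was detected
    contained: datetime  # when the incident was contained


def summarize(incidents):
    """Return (count per category, mean detection-to-containment time)."""
    counts = Counter(i.category for i in incidents)
    elapsed = [i.contained - i.detected for i in incidents]
    # Average the timedeltas; fall back to zero when there are no incidents.
    mean = sum(elapsed, timedelta()) / len(elapsed) if elapsed else timedelta()
    return counts, mean


# Invented sample data, for illustration only.
incidents = [
    Incident("malware", datetime(2013, 8, 1, 9, 0), datetime(2013, 8, 1, 11, 0)),
    Incident("malware", datetime(2013, 8, 2, 14, 0), datetime(2013, 8, 2, 18, 0)),
    Incident("phishing", datetime(2013, 8, 3, 8, 30), datetime(2013, 8, 3, 9, 30)),
]
counts, mean_time = summarize(incidents)
print(dict(counts))
print(mean_time)
```

Tracking these two numbers over successive weeks or months is one plausible way to see whether a workflow is getting faster at containment.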

He also observes: “It is possible to defeat adversaries if we stop them before they accomplish their mission. As it has been since the early 1990s, NSM will continue to be a powerful, cost-effective way to counter intruders.”

Si Dunn

Programming Groovy 2 – For the experienced Java developer seeking dynamic productivity – #programming #bookreview

Programming Groovy 2

Dynamic Productivity for the Java Developer

Venkat Subramaniam
(Pragmatic Bookshelf, paperback)


The programming language Groovy has a bit of a checkered past. Before it reached Release 1 in early 2007, it was almost abandoned because of a series of development problems. But some dedicated developers reworked it, gave it new life, and helped it gain acceptance in a widening array of commercial projects. Release 2.0 became available last year, and you can download 2.1, with 2.2 in beta. Groovy is doing groovy now, thank you very much (and thank the Grails web application framework, too).

In his new book, which builds on his 2008 Groovy 1 edition, Dr. Venkat Subramaniam describes Groovy 2 as “lightweight, low-ceremony, dynamic, object-oriented, and runs on the JVM [the Java Virtual Machine].”

He notes: “Groovy is open sourced under the Apache License, version 2.0. It derives strength from various languages such as Smalltalk, Python, and Ruby, while retaining a syntax familiar to Java programmers. Groovy compiles into Java bytecode and extends the Java API and libraries. It runs on Java 1.5 and newer. For deployment, all we need is a Groovy Java archive (JAR) in addition to the regular Java stuff, and we’re all set.”

Groovy is not for coding beginners, nor is it a means to avoid learning Java. This book–well written and nicely illustrated with short code examples and screenshots–“is aimed at Java programmers who already know the JDK [Java Development Kit] well but are interested in learning the Groovy language and its dynamic capabilities,” Dr. Subramaniam says.

He has organized his 19-chapter, 333-page book into three major parts:

Part I: Beginning Groovy – Focuses on the fundamentals of the language but deliberately skips the basics of programming. The book, after all, is aimed at “experienced Java programmers.”

Part II: Using Groovy – Shows “how to use Groovy for everyday coding–working with XML, accessing databases, and working with multiple Java/Groovy classes and scripts….” Also delves into Groovy extensions and additions to the JDK.

Part III: “MOPping Groovy” – The odd title may conjure up a brief image of mopping up spilled gravy. But this part deals with “Groovy’s metaprogramming capabilities….” The coverage includes: (1) metaobject protocol (MOP); (2) “how to do AOP-like operations in Groovy” [AOP = “aspect-oriented programming”]; and (3) “dynamic method/property discovery and dispatching,” as well as Groovy’s “compile-time metaprogramming capability….”

If you are a Java developer seeking new tools, new challenges, and new horizons, this could be the right time and right way to get your groove on: with Groovy 2 and Dr. Venkat Subramaniam’s fine how-to guide.

“Groovy is an attractive language for a number of reasons,” the author says, naming four key ones:

“It has a flat learning curve.”

“It follows Java semantics.”

“It bestows dynamic love.”

“It extends the JDK.”

“Groovy,” he adds, “feels like the Java language we already know, with a few augmentations.”

Si Dunn

The Healthy Programmer – Better coding through better living – #programming #bookreview

The Healthy Programmer

Get Fit, Feel Better, and Keep Coding

Joe Kutner

(Pragmatic Bookshelf – paperback)

Yes, you know it is unhealthy to spend all day and much of the night hunched at a keyboard, staring at a computer screen, gripping a mouse and nervously clawing at bags of vending-machine snacks because you haven’t had time to eat proper meals.

Yet that is exactly how many of us earn a living: spending long hours writing code, fixing code, or writing about the processes of writing and fixing code.

The work of a programmer can be devilishly complex and tiring. Often, it can be highly stressful, too. And over the long run, it can damage your health or even help shorten your life, if you aren’t careful.

Joe Kutner’s The Healthy Programmer takes a pragmatic and low-key approach to showing you how you can start improving the conditions of your body and brain without disrupting your job. His tips, tricks, and “best practices” are backed up by advice and commentary from doctors, therapists, nutritionists, scientists, and fitness experts.

“Having a system or a process is crucial to getting things done,” Kutner says. “In software, we often use an agile method to guide our development efforts. Agile processes are characterized by an iterative and incremental approach to development, which allows us to adapt to changing requirements. The method you use to stay healthy shouldn’t be any different.”

In his book, he shows “how to define a system of time-boxed iterations that will improve your health. We’ll start with two-week intervals, but like with any agile method, you’ll be allowed to change that as needed. At the end of each iteration you’ll do a retrospective to assess your progress.”

Crucially, Kutner’s approach is to start small, by changing one habit, and start gently, by doing some walking. “You won’t be bombarded with exercises and activities right away,” he emphasizes. “Instead, we’ll spend the first few chapters introducing some very simple, but essential, components of a healthy lifestyle. Don’t think that they are too simple, though. These are the activities that will have the biggest effect on your life.”

Kutner’s well-researched, well-written book takes a whole-body approach, with a keen understanding of how programmers work. He has been one for more than a decade and has spent much of that time researching the physical hazards of sedentary coding.

Chair exercises, standing desks, wrist braces, eye-care tips, and dietary recommendations are some of the areas covered. A “Pomodoro break,” for example, can help people involved in many different types of creative work, including programming. The basic approach involves working on a single task for a specific amount of time, such as 25 minutes, with short periods of exercise interspersed.

You might set a timer for 25 minutes, then focus on debugging some code. When the timer goes off, you reset it for five minutes and take a short walk. Then spend another 25 minutes doing a code review. When the timer goes off again, get up from your desk and do some exercises for five minutes. Then start a new task (or continue a previous problem) and repeat the cycle.
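The cycle described above can be laid out as a simple schedule. The following Python sketch is only an illustration, not anything from Kutner's book: the `pomodoro_schedule` function, its parameter names, the break label, and the sample start time are all assumptions, with the 25-minute work and five-minute break lengths taken from the example above.

```python
from datetime import datetime, timedelta


def pomodoro_schedule(start, tasks, work_minutes=25, break_minutes=5):
    """Return (begin, end, label) blocks that alternate work and breaks."""
    blocks = []
    now = start
    for task in tasks:
        # One focused work interval on a single task.
        work_end = now + timedelta(minutes=work_minutes)
        blocks.append((now, work_end, task))
        # Followed by a short break away from the keyboard.
        break_end = work_end + timedelta(minutes=break_minutes)
        blocks.append((work_end, break_end, "break: walk or exercise"))
        now = break_end
    return blocks


# Hypothetical morning: two tasks starting at 9:00.
schedule = pomodoro_schedule(datetime(2013, 8, 5, 9, 0), ["debug code", "code review"])
for begin, end, label in schedule:
    print(f"{begin:%H:%M}-{end:%H:%M}  {label}")
```

The interval lengths are parameters, so the same sketch works if you prefer longer work blocks or longer breaks.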

You may already have a daily exercise routine. But Kutner warns that it “can interfere with your job [as a coder] if you don’t coordinate the two activities. If you do coordinate them, you may actually improve your ability to write code. That’s because immediately after exercise, blood shifts rapidly back to the brain, which makes it the perfect time to focus on tasks that require complex analysis and creativity.”

The Healthy Programmer has many good tips for avoiding or minimizing back pain, wrist pain, headaches and other irritants, as well as good techniques for “upgrading your hardware,” meaning your body. Numerous easy-to-perform exercises are described and illustrated, including some you can do while seated or standing at your workstation. “[Y]our lifestyle can enhance your ability to do your job well,” Kutner emphasizes. “That’s why staying healthy is the best way to ensure you keep doing this job you love for years to come.”

Si Dunn