Natural Language Annotation for Machine Learning – #programming #bookreview

Natural Language Annotation for Machine Learning
James Pustejovsky and Amber Stubbs
(O’Reilly, paperback, Kindle)

You may not be sure what’s going on here at first, even after you’ve read the tag line on the book’s cover: “A Guide to Corpus-Building for Applications.”

Fortunately, a few definitions inside this book can enlighten you quickly and might even get you interested in delving deeper into natural language processing and computational linguistics as a career.

“A natural language,” the authors note, “refers to any language spoken by humans, either currently (e.g., English, Chinese, Spanish) or in the past (e.g., Latin, ancient Greek, Sanskrit). Annotation refers to the process of adding metadata information to the text in order to augment a computer’s ability to perform Natural Language Processing (NLP).”

Meanwhile: “Machine learning refers to the area of computer science focusing on the development and implementation of systems that improve as they encounter more data.”

And, finally, what is a corpus? “A corpus,” the authors explain, “is a collection of machine-readable texts that have been produced in a natural communicative setting. They have been sampled to be representative and balanced with respect to particular factors; for example, by genre—newspaper articles, literary fiction, spoken speech, blogs and diaries, and legal documents.”
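To make “balanced with respect to particular factors” concrete, here is a minimal Python sketch that draws the same number of documents from each genre. It is not taken from the book: the function name `balanced_sample`, the `(genre, text)` pair layout, and the placeholder documents are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def balanced_sample(documents, per_genre, seed=0):
    """Pick the same number of documents from every genre.

    `documents` is a list of (genre, text) pairs (a hypothetical layout).
    """
    by_genre = defaultdict(list)
    for genre, text in documents:
        by_genre[genre].append(text)

    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sample = []
    for genre in sorted(by_genre):
        texts = by_genre[genre]
        if len(texts) < per_genre:
            raise ValueError(f"not enough documents for genre {genre!r}")
        for text in rng.sample(texts, per_genre):
            sample.append((genre, text))
    return sample


# Placeholder documents; a real corpus would hold full texts.
documents = [
    ("news", "news article 1"),
    ("news", "news article 2"),
    ("news", "news article 3"),
    ("blog", "blog post 1"),
    ("blog", "blog post 2"),
    ("blog", "blog post 3"),
    ("legal", "legal document 1"),
    ("legal", "legal document 2"),
    ("legal", "legal document 3"),
]

for genre, text in balanced_sample(documents, per_genre=2):
    print(genre, "->", text)
```

Equal counts per genre are only one possible reading of “balanced”; a real corpus might instead weight genres by how common they are in the language being studied.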

The Internet is delivering vast amounts of information in many different formats to researchers in the fields of theoretical and computational linguistics. And, in turn, specialists are now working to develop new insights and algorithms “and turn them into functioning, high-performance programs that can impact the ways we interact with computers using language.”

This book’s central focus is on learning how an efficient annotation development cycle works and how you can use such a cycle to add metadata to a training corpus that helps machine-learning algorithms work more effectively.

Natural Language Annotation for Machine Learning is not light reading. But it is well structured and well written, and it offers detailed examples. Using an effective hands-on approach, it takes the reader from annotation specifications and designs to the use of annotations in machine-learning algorithms. And the final two chapters of the 326-page book “give a complete walkthrough of a single annotation project and how it was recreated with machine learning and rule-based algorithms.”

“[I]t is not enough,” the authors emphasize, “to simply provide a computer with a large amount of data and expect it to learn to speak—the data has to be prepared in such a way that the computer can more easily find patterns and inferences. This is usually done by adding relevant metadata to a dataset. Any metadata tag used to mark up elements of the dataset is called an annotation over the input. However,” they point out, “in order for the algorithms to learn efficiently and effectively, the annotation done on the data must be accurate, and relevant to the task the machine is being asked to perform. For this reason, the discipline of language annotation is a critical link in developing intelligent human language technology.”
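As a rough illustration of what “a metadata tag used to mark up elements of the dataset” can look like in practice, here is a minimal Python sketch of annotations stored as labeled character spans, kept separate from the raw text. It is not the authors’ own format: the `Annotation` class, the `tagged_spans` helper, and the `PERSON` and `LOCATION` labels are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A metadata tag attached to a character span of the raw text."""
    start: int  # index of the first character in the span
    end: int    # index one past the last character in the span
    label: str  # hypothetical tag name, e.g. "PERSON"


def tagged_spans(text, annotations):
    """Return (covered text, label) pairs, ordered by position in the text."""
    ordered = sorted(annotations, key=lambda a: a.start)
    return [(text[a.start:a.end], a.label) for a in ordered]


text = "Ada Lovelace was born in London."
annotations = [
    Annotation(25, 31, "LOCATION"),
    Annotation(0, 12, "PERSON"),
]

for covered, label in tagged_spans(text, annotations):
    print(f"{label}: {covered}")
```

Keeping the tags apart from the text leaves the original document unchanged, so several annotators can mark up the same source and their spans can be compared afterward.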

Si Dunn

HTML5 and JavaScript Web Apps – With emphasis on the Mobile Web – #programming #bookreview

HTML5 and JavaScript Web Apps
Wesley Hales
(O’Reilly, paperback, Kindle)

Increasingly, the world of Web development is taking on a “mobile first” attitude. And for good reason. Sales of desktop and laptop computers are shrinking, while sales of mobile devices seem to be swelling into a flood.

“Consumers are on track to buy one billion HTML5-capable mobile devices in 2013,” Wesley Hales writes in his new book. “Today, half of US adults own smartphones. This comprises 150 million people, and 28% of those consider mobile their primary way of accessing the Web. The ground swell of support for HTML5 applications over native ones is here, and today’s developers are flipping their priorities to put mobile development first.”

Hales’ HTML5 and JavaScript Web Apps focuses on using HTML5, JavaScript, and the latest W3C specifications to create mobile and desktop web apps that can work on a wide range of browsers and devices.

Indeed, deciding what to support is a key point in this useful, well-focused how-to guide. Hales notes: “Unfortunately the Mobile Web isn’t write-once-run-anywhere yet. As specifications become final and features are implemented, interoperability will be achieved. In today’s world of mobile browsers, however, we don’t have a largely consistent implementation across all browsers. Even though new tablets and phones are constantly being released to achieve a consistent level of HTML5 implementation, we all know that we’re [also] stuck with supporting the older, fragmented devices for a set amount of time.”

The 156-page book straddles “the gap between the Web and the Mobile Web” but puts a lot of emphasis on developing mobile applications. Here are its nine chapters:

  1. Client-Side Architecture
  2. The Mobile Web
  3. Building for the Mobile Web
  4. The Desktop Web
  5. WebSockets
  6. Optimizing with Web Storage
  7. Geolocation
  8. Device Orientation API
  9. Web Workers

This is not a book for JavaScript, HTML, or CSS beginners. But if you have at least some basic experience with Web application development, Hales can help you get on track toward becoming a Mobile Web guru. Meanwhile, if you are already well-versed in the ways of the Web app world, you may still learn some new and useful things from HTML5 and JavaScript Web Apps.

Si Dunn

Master Your Mac – Useful how-to projects for intermediate users – #bookreview

Master Your Mac
Matt Cone
(No Starch Press, paperback, Kindle)

This well-written how-to book will please many new Mac users, as well as many who have been using Macs for years.

But, to fully benefit from this excellent new guide, you must be willing to go beneath the Mac’s easy-to-use OS X surface and work at the command line.

In other words, if you are happy sticking to a regular routine of basics, such as email, Facebook, Twitter, documents and iTunes, you probably don’t need this book very much.

However, if you are curious about what lies beneath “the obvious applications and documented uses of OS X,” you will find plenty to like in the 400 pages.

The author is offering “a workbook full of advanced projects that push the limits of OS X. You’ll get started with scripting and automation, configure new shortcuts, secure your Mac against invisible threats, and learn how to repair your hard drive.”

One of the key strengths of this book is its organization. First you are shown how to create “an immediate solution to a real problem.” Then you are given explanations and examples on how to go “above and beyond the project.” For example, “[w]hen you learn AppleScript in Chapter 12…you’ll create your very own script, but you’ll also learn how to incorporate other data structures and interface elements to build a much more advanced script.”

Also, you can tackle the book’s seven parts and 38 chapters in any order that fits your interests and needs. Curious about how to encrypt your hard disk and backups? See Chapter 32. Need to attach multiple monitors to your machine? See Chapter 9. Want to use your Mac as a web server or FTP server? See Chapter 24. Need to create a Bluetooth proximity monitor that automatically locks your screen when you step away from your keyboard? See Chapter 13.

Matt Cone is a well-known and experienced Apple specialist who has been using Macs for more than 20 years. He also is a very good technical writer. His new book is heavily illustrated with steps, screen shots, code samples, and other images. If you are a Macintosh user who wants to get more than just the usual basics from OS X (including Mountain Lion), Master Your Mac can be your handy go-to guide.

Si Dunn

JavaScript as Compilation Target: ClojureScript and Dart – #programming #bookreview

Despite its widespread success, JavaScript has a reputation for being a computer language with many flaws. Still, it is now everywhere on the planet, so it is here to stay, very likely for a long, long time.

Not surprisingly, several new languages have emerged that jump over some of JavaScript’s hurdles, offer improved capabilities, and also compile to optimized JavaScript code.

Two of these languages are the focus of noteworthy new “Up and Running” books from O’Reilly: ClojureScript: Up and Running and Dart: Up and Running.

Here are short reviews of each book:

ClojureScript: Up and Running
Stuart Sierra and Luke VanderHart
(O’Reilly, paperback, Kindle)

ClojureScript, the authors contend, “provides developers with a language that is more powerful than JavaScript, which can reach all the same places JavaScript can, with fewer of JavaScript’s shortcomings.”

The primary targets of ClojureScript are “web browser applications, but it is also applicable to any environment where JavaScript is the only programmable technology available,” they add.

“ClojureScript is more than Clojure syntax layered on top of JavaScript: it supports the full semantics of the Clojure language, including immutable data structures, lazy sequences, first-class functions, and macros,” they emphasize.

Their 100-page book focuses on how to use ClojureScript’s features, starting at the “Hello world” level and gradually advancing to “Development Process and Workflow” and “Integrating with Clojure.” (ClojureScript is designed for building client-side applications, but it can be merged with Clojure on the JVM to create client-server applications.)

Early in the book, they also describe how to compile a ClojureScript file to JavaScript and emit code “that is fully compatible with the Advanced Optimizations mode of the Google Closure Compiler.”

The two writers are Clojure/ClojureScript developers with a previous book to their credit.

ClojureScript: Up and Running is written well and appropriately illustrated with code samples, flow charts, and other diagrams. The authors recommend using the Leiningen build system for Clojure, plus the lein-cljsbuild plug-in for ClojureScript.

Their book is a smooth introduction to ClojureScript that requires no prior knowledge of Clojure. But you do need a basic working knowledge of JavaScript, HTML, CSS, and the Document Object Model (DOM).

#

Dart: Up and Running
Kathy Walrath and Seth Ladd
(O’Reilly, paperback, Kindle)

Google created Dart to be “an open-source, batteries-included developer platform for building structured HTML5 web apps,” the two authors note.

“Dart provides not only a new language, but libraries, an editor, a virtual machine (VM), a browser that can run Dart apps natively, and a compiler to JavaScript.”

Indeed, Dart looks very similar to JavaScript and is “easy to learn,” the two writers state. “A wide range of developers can learn Dart quickly. It’s an object-oriented language with classes, single inheritance, lexical scope, top-level functions, and a familiar syntax. Most developers are up and running with Dart in just a few hours.”

The authors work at Google and note that some of the software engineers who helped develop the V8 JavaScript engine that is “responsible for much of Chrome’s speed” are now “working on the Dart project.”

Dart has been designed to scale from simple scripts all the way up to complex apps, and it can run on both the client and the server.

Those who choose to code with Dart are urged to download the open-source Dart Editor tool, because it also comes with a “Dart-to-JavaScript compiler and a version of Chromium (nicknamed Dartium) that includes the Dart VM.”

Since Dart is new, the writers also urge readers to keep an eye periodically on the Dart website and on their book’s GitHub site, where code can be downloaded and errors and corrections noted.

Dart: Up and Running is a well-structured, well-written how-to book, nicely fortified with short code examples and other illustrations. While the book appears very approachable and simple, it is not for complete beginners. You should have a basic working knowledge of JavaScript, HTML, CSS, and the Document Object Model (DOM).

If you are looking for a web development language that matches JavaScript’s dynamic nature but also addresses JavaScript’s sometimes-aggravating shortcomings, consider trying Dart—with this book in hand.

Si Dunn

Adobe Edge Animate: The Missing Manual – #bookreview

Adobe Edge Animate: The Missing Manual
Chris Grover
(O’Reilly, paperback, Kindle)

Chris Grover’s well-written and updated new book shows you how to build animated HTML 5 graphics for the iPhone, the iPad, and the Web, using familiar Adobe features. By the sixth page of the first chapter, you are using the software to begin creating your first animation.

The previous edition of this book, covering Adobe Edge Animate Preview 7, was released just two months ago, shortly before Adobe released the 1.0 commercial version of its Edge Animate product. This new edition has been updated and expanded to cover the commercial version.

Prior to the 1.0 release, seven Preview versions of Adobe Edge Animate had been issued as free downloads, and user feedback was gathered so the product could be enhanced and expanded.

Here is what I reported about this book’s Preview 7 edition in an October 2012 review:

First, this book can help you get started with the 1.0 commercial version of Adobe Edge Animate. Second, O’Reilly will soon bring out an Adobe Edge Animate “Missing Manual” that covers the new commercial release. And, third, sources at O’Reilly tell me that readers who purchase this Preview 7 edition of Chris Grover’s book will get access to “the e-book version of Adobe Edge Animate the 1.0 version and all of its updates.”

The new edition of Adobe Edge Animate: The Missing Manual has ten chapters organized into five parts, even though page xiv of the paperback version states that the book is “divided into three parts.” (It then lists four parts, instead of five or three.) The new part in this edition is titled “Publishing Animate Compositions” and focuses on “Publishing Responsive Web Pages” that will look good “in web browsers of all shapes and sizes….” Here are the new edition’s parts and chapters:

Part One: Working with the Stage

  • Chapter 1: Introducing Adobe Edge Animate
  • Chapter 2: Creating and Animating Art
  • Chapter 3: Adding and Formatting Text

Part Two: Animation with Edge Animate

  • Chapter 4: Learning Timeline and Transition Techniques
  • Chapter 5: Triggering Actions
  • Chapter 6: Working Smart with Symbols

Part Three: Edge Animate with HTML 5 and JavaScript

  • Chapter 7: Working with Basic HTML and CSS
  • Chapter 8: Controlling Your Animations with JavaScript and jQuery
  • Chapter 9: Helpful JavaScript Tricks

Part Four: Publishing Your Composition

  • Chapter 10: Publishing Responsive Web Pages

Part Five: Appendixes

  • Appendix A: Installation and Help
  • Appendix B: Menu by Menu

Where keystrokes are appropriate, Chris Grover lists both the Macintosh and Windows versions, so you do not have to translate between systems, as some how-to manuals require.

“Animate works almost precisely the same in its Macintosh and Windows versions,” he assures. “Every button in every dialog box is exactly the same; the software response to every command is identical. In this book, the illustrations have been given even-handed treatment, rotating between the two operating systems where Animate is at home (Windows 7 and Mac OS X).”

Si Dunn

Spring Data: Modern Data Access for Enterprise Java – #java #bookreview

Spring Data: Modern Data Access for Enterprise Java
Mark Pollack, Oliver Gierke, Thomas Risberg, Jonathan L. Brisbin and Michael Hunger
(O’Reilly, paperback, Kindle)

Big Data keeps getting wider and deeper by the second. And so do the demands for analyzing and profiting from all of those piled-up terabytes.

Meanwhile, the once whiz-bang technology known as the relational database is having a very hard time keeping pace. The sheer amount of data that companies now gather, store, access, and analyze is pushing traditional relational databases to the breaking point.

Many Java developers who are trying to keep these overloaded systems held together with baling wire are also starting to learn to work with some of the “alternative data stores that are being used in mission-critical enterprise applications,” the authors of Spring Data point out.

A lot of data is now being stored outside relational databases. Yet companies cannot abandon what they have already gathered and invested heavily to access. So they need to keep using and supporting their relational databases, plus some newer, faster, more voracious solutions lumped under the heading “NoSQL databases” (even though you can query them).

In “the new data access landscape,” the authors note: “there is a revolution taking place, which for data geeks is quite exciting. Relational databases are not dead; they are still central to the operations of many enterprises and will remain so for quite some time. The trends, though, are very clear: new data access technologies are solving problems that traditional relational databases can’t, so we need to broaden our skill set as developers and have a foot in both camps.”

They add: “The Spring Framework has a long history of simplifying the development of Java applications, in particular for writing RDBMS-based data access layers that use Java database connectivity (JDBC) or object-relational mappers.”

Their new book “is intended to give you a hands-on introduction to the Spring Data project, whose core mission is to enable Java developers to use state-of-the-art data processing and manipulation tools but also use traditional databases in a state-of-the-art manner.”

They have organized their 288-page book into six parts and 14 chapters:

Part I – Background

  • Chapter 1 – The Spring Data Project
  • Chapter 2 – Repositories: Convenient Data Access Layers
  • Chapter 3 – Type-Safe Querying Using Querydsl

Part II – Relational Databases

  • Chapter 4 – JPA Repositories
  • Chapter 5 – Type-safe JDBC Programming with Querydsl SQL

Part III – NoSQL

  • Chapter 6 – MongoDB: A Document Store
  • Chapter 7 – Neo4j: A Graph Database
  • Chapter 8 – Redis: A Key/Value Store

Part IV – Rapid Application Development

  • Chapter 9 – Persistence Layers with Spring Roo
  • Chapter 10 – REST Repository Exporter

Part V – Big Data

  • Chapter 11 – Spring for Apache Hadoop
  • Chapter 12 – Analyzing Data with Hadoop
  • Chapter 13 – Creating Big Data Pipelines with Spring Batch and Spring Integration

Part VI – Data Grids

  • Chapter 14 – GemFire: A Distributed Data Grid

“Many of the values that have made Spring the preferred platform for enterprise Java developers deliver particular benefit in a world of fragmented persistence solutions,” states Rod Johnson, creator of the Spring Framework. Writing in the book’s foreword, he notes: “Part of the value of Spring is how it brings consistency (without descending to a lowest common denominator) in its approach to different technologies with which it integrates.

“A distinct ‘Spring way’ helps shorten the learning curve for developers and simplifies code maintenance. If you are already familiar with Spring, you will find that Spring Data eases your exploration and adoption of unfamiliar stores. If you aren’t already familiar with Spring, this is a good opportunity to see how Spring can simplify your code and make it more consistent.”

Spring Data definitely is not light reading, but it is well written and provides a good blend of procedures, steps, explanations, code samples, screenshots and other illustrations.

Si Dunn

Bruce Barnbaum’s ‘Tone Poems’ – Beautiful photographs, with music – #bookreview

Bruce Barnbaum is a superb black-and-white photographer, and Rocky Nook, Inc., recently has brought forth new editions of two of his beautifully crafted image collections.

Styled as part of a four-volume series, these two coffee-table books should appeal to almost anyone who loves good visual images and good music and appreciates opportunities to enjoy them together.

The two books, originally published by Photographic Arts Editions, are:

Tone Poems – Book 1, Opuses 1, 2 & 3
Bruce Barnbaum
(Rocky Nook, hardback)

Tone Poems – Book 2, Opuses 4, 5 & 6
Bruce Barnbaum
(Rocky Nook, hardback)

“It was the land, specifically the magnificent landscape of the Sierra Nevada Mountains of California, that initially drew me into photography,” Barnbaum writes, in a Tone Poems chapter titled “Opus 3, Lyricism of the Land.” Almost 40 years later, he is “still drawn to that landscape, but filled with ideas about photography—and about the land—that I never dreamed of having back in my younger days.” Barnbaum also is drawn to the landscapes of many other parts of the world and is keenly aware of their frailties, as well as the increasing threats that human activity and commercial development pose to their natural beauty.

Why two photography books that also have commentary about the compositions and CDs of music intended to be played as accompaniment to the stunning images?

“Sometimes, even the combination of words and pictures are insufficient to adequately convey my feelings,” Barnbaum notes. “Music, added to the mix, helps convey it much more strongly.”

The CDs included with these books feature selections of classical music played by noted pianist Judith Cohen, artistic director of the Governor’s Chamber Music Series in the state of Washington.

“The music and the images are meant to celebrate the life, the light and the poetic lyricism of the land,” Barnbaum emphasizes.

The two books succeed in reaching these lofty goals.

Si Dunn