Mule in Action, 2nd Edition – Want to be an integration developer? Here’s a good start – #bookreview

 

Mule in Action, Second Edition

David Dossot, John D’Emic, Victor Romero

(Manning – paperback)

 

An enterprise service bus (ESB) can help you link together many different types of platforms and applications–old and new–and keep them communicating and passing data between each other.

“Mule,” this book’s authors note, “is a lightweight, event-driven enterprise service bus and an integration platform and broker. As such, it resembles more a rich and diverse toolbox than a shrink-wrapped application.”

Mule in Action, Second Edition, is a comprehensive and generally well-written overview of Mule 3 and how to assemble its open-source building blocks into integration solutions. The book focuses well on sending, receiving, routing, and transforming data, all key aspects of an ESB.

More attention, however, could have been paid to clarity and detail in Chapter 1, the all-important chapter that helps Mule newcomers get started and enthused.

This second edition is a recent update of the 2009 first edition. Unfortunately, the Mule screens have changed a bit since the book’s screen shots were created for the new edition. Therefore, some of the how-to instructions and screen images do not match what the user now sees. This gets particularly confusing while trying to learn how to configure a JMS outbound endpoint for the first time, using Mule Studio’s graphical editor. The instructions seem insufficient, and the mismatch of screens can leave a beginner unsure how to proceed.

The same goes for configuring the message setting in the Logger element. The text instructs: “You’ll set the message attribute to print a String followed by the payload of the message, using the Mule Expression Language.” But no example is given. Fortunately, a reviewer on Amazon has posted a correct procedure. In his view, the message attribute should be: We received a message: #[message.payload] (without any quote marks around it). It works.

Of course, this book is not really aimed at beginners–it’s for developers, architects, and managers (even though there will be Mule “beginners” in those ranks). Fortunately, it soon moves away from relying solely on Mule Studio’s graphical editor. The book’s examples, as the authors note, “mostly focus on the XML configurations of flows.” Thus, there are many XML code examples to work with, plus occasional screen shots of the flows as they appear in Mule Studio. And you can use other IDEs to work with the XML, if you prefer.

Indeed, the authors note, “no functionality in the CE version of Mule is dependent on Mule Studio.”

Overall, this is a very good book, and it definitely covers a lot of ground, from “discovering” Mule to becoming a Mule developer of integration applications, and using certain tools (such as business process management systems) to augment the applications you develop. I just wish a little more how-to clarity had been delivered in Chapter 1.

Si Dunn

Optimizing Hadoop for MapReduce – A practical guide to lowering some costs of mining Big Data – #bookreview

Optimizing Hadoop for MapReduce

Learn how to configure your Hadoop cluster to run optimal MapReduce jobs

Khaled Tannir

(Packt Publishing, paperback, Kindle)

Time is money, as the old saying goes. And that saying especially applies to the world of Big Data, where much time, computing power and cash can be consumed while trying to extract profitable information from mountains of data.

This short, well-focused book by veteran software developer Khaled Tannir describes how to achieve a very important, money-saving goal: improve the efficiency of MapReduce jobs that are run with Hadoop.

As Tannir explains in his preface:

“MapReduce is an important parallel processing model for large-scale, data-intensive applications such as data mining and web indexing. Hadoop, an open source implementation of MapReduce, is widely applied to support cluster computing jobs that require low response time.

“Most of the MapReduce programs are written for data analysis and they usually take a long time to finish. Many companies are embracing Hadoop for advanced data analytics over large datasets that require time completion guarantees.

“Efficiency, especially the I/O costs of MapReduce, still needs to be addressed for successful implications. The experience shows that a misconfigured Hadoop cluster can noticeably reduce and significantly downgrade the performance of MapReduce jobs.”

Tannir’s well-focused, seven-chapter book zeroes in on how to find and fix misconfigured Hadoop clusters and numerous other problems. But first, he explains how Hadoop parameters are configured and how MapReduce metrics are monitored.

Two chapters are devoted to learning how to identify system bottlenecks, including CPU bottlenecks, storage bottlenecks, and network bandwidth bottlenecks.

One chapter examines how to properly identify resource weaknesses, particularly in Hadoop clusters. Then, as the book shifts strongly to solutions, Tannir explains how to reconfigure Hadoop clusters for greater efficiency.

Indeed, the final three chapters deliver details and steps that can help you improve how well Hadoop and MapReduce work together in your setting.

For example, the author explains how to make the map and reduce functions operate more efficiently, how to work with small or unsplittable files, how to deal with spilled records (those written to local disk when the allocated memory buffer is full), and ways to tune map and reduce parameters to improve performance.

“Most MapReduce programs are written for data analysis and they usually take a lot of time to finish,” Tannir emphasizes. However: “Many companies are embracing Hadoop for advanced data analytics over large datasets that require completion-time guarantees.” And that means “[e]fficiency, especially the I/O costs of MapReduce, still need(s) to be addressed for successful implications.”

He describes how to use compression, Combiners, the correct Writable types, and quick reuse of types to help improve memory management and the speed of job execution.
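To illustrate why a Combiner can help, here is a minimal sketch in plain Python. It does not use the Hadoop API, and it is not taken from Tannir's book; the function names and the sample input are hypothetical. It simulates a word-count job and compares how many key/value pairs would reach the reducer with and without local pre-aggregation on each mapper.

```python
from collections import Counter, defaultdict

def map_words(line):
    """Map step: emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]

def combine(pairs):
    """Combiner: pre-aggregate one mapper's output locally before the shuffle."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return list(totals.items())

def reduce_counts(all_pairs):
    """Reduce step: sum the counts for each word across all mappers."""
    totals = defaultdict(int)
    for word, count in all_pairs:
        totals[word] += count
    return dict(totals)

def run_job(splits, use_combiner):
    """Run the toy job; return (result, number of pairs sent to the reducer)."""
    shuffled = []
    for split in splits:  # each split stands in for one mapper's input
        pairs = [pair for line in split for pair in map_words(line)]
        if use_combiner:
            pairs = combine(pairs)
        shuffled.extend(pairs)
    return reduce_counts(shuffled), len(shuffled)

# Hypothetical input: two splits, as if handled by two separate mappers.
SPLITS = [
    ["big data big cluster", "big data"],
    ["data data cluster"],
]

if __name__ == "__main__":
    plain, sent_plain = run_job(SPLITS, use_combiner=False)
    combined, sent_combined = run_job(SPLITS, use_combiner=True)
    print("same result:", plain == combined)
    print("pairs shuffled without combiner:", sent_plain)
    print("pairs shuffled with combiner:", sent_combined)
```

In this toy run the final counts are identical either way, but fewer pairs cross the simulated shuffle when the combiner is on. In a real cluster, that reduction is what would lower network and disk I/O.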

And, along with other tips, Tannir presents several “best practices” to help manage Hadoop clusters and make them do their work more quickly and with fewer demands on hardware and software resources.

Tannir notes that “setting up a Hadoop cluster is basically the challenge of combining the requirements of high availability, load balancing, and the individual requirements of the services you aim to get from your cluster servers.”

If you work with Hadoop and MapReduce or are now learning how to help install, maintain or administer Hadoop clusters, you can find helpful information and many useful tips in Khaled Tannir’s Optimizing Hadoop for MapReduce.

Si Dunn

Making Sense of NoSQL – A balanced, well-written overview – #bigdata #bookreview

Making Sense of NoSQL

A Guide for Managers and the Rest of Us
Dan McCreary and Ann Kelly
(Manning, paperback)

This is NOT a how-to guide for learning to use NoSQL software and build NoSQL databases. It is a meaty, well-structured overview aimed primarily at “technical managers, [software] architects, and developers.” However, it also is written to appeal to other, not-so-technical readers who are curious about NoSQL databases and where NoSQL could fit into the Big Data picture for their business, institution, or organization.

Making Sense of NoSQL definitely lives up to its subtitle: “A guide for managers and the rest of us.”

Many executives, managers, consultants and others today are dealing with expensive questions related to Big Data, primarily how it affects their current databases, database management systems, and the employees and contractors who maintain them. A variety of problems can fall upon those who operate and update big relational (SQL) databases and their huge arrays of servers pieced together over years or decades.

The authors, Dan McCreary and Ann Kelly, are strong proponents, obviously, of the NoSQL approach. It offers, they note, “many ways to allow you to grow your database without ever having to shut down your servers.” However, they also realize that NoSQL may not be a good, or affordable, choice in many situations. Indeed, a blending of SQL and NoSQL systems may be a better choice. Or, making changes from SQL to NoSQL may not be financially feasible at all. So they have structured their book into four parts that attempt to help readers “objectively evaluate SQL and NoSQL database systems to see which business problems they solve.”

Part 1 provides an overview of NoSQL, its history, and its potential business benefits. Part 2 focuses on “database patterns,” including “legacy database patterns (which most solution architects are familiar with), NoSQL patterns, and native XML databases.” Part 3 examines “how NoSQL solutions solve the real-world business problems of big data, search, high availability, and agility.” And Part 4 looks at “two advanced topics associated with NoSQL: functional programming and system security.”

McCreary and Kelly observe that “[t]he transition to functional programming requires a paradigm shift away from software designed to control state and toward software that has a focus on independent data transformation.” (Erlang, Scala, and F# are some of the functional languages that they highlight.) And, they contend: “It’s no longer sufficient to design a system that will scale to 2, 4, or 8 core processors. You need to ask if your architecture will scale to 100, 1,000, or even 10,000 processors.”

Meanwhile, various security challenges can arise as a NoSQL database “becomes popular and is used by multiple projects” across “department trust boundaries.”

Computer science students, software developers, and others who are trying to stay knowledgeable about Big Data technology and issues should also consider reading this well-written book.

Si Dunn

Testing Cloud Services – How to Test SaaS, PaaS and IaaS – #cloud #bookreview

Testing Cloud Services

How to Test SaaS, PaaS & IaaS
Kees Blokland, Jeroen Mengerink and Martin Pol
(Rocky Nook – paperback, Kindle)

Cloud computing now affects almost all of us, at least indirectly. But some of us have to deal directly with one or more “clouds” on a regular basis. We select or implement particular cloud services for our employers or for our own businesses. Or, we have to maintain those services and fix any problems encountered by co-workers or employees.

Testing Cloud Services, written by three well-experienced test specialists, emphasizes that the time to begin testing SaaS (Software as a Service), PaaS (Platform as a Service), or IaaS (Infrastructure as a Service) is not after you have made your selections. You should begin testing them during the selection and installation processes and keep testing them regularly once they are live.

“Cloud computing not only poses challenges for testing, it also provides interesting new testing options,” the authors note. “For example, cloud computing can be used for test environments or test tools. It can also mean that all test activities and the test organization as a whole are brought to the cloud. This will be called Testing as a Service.”

Their well-written, six-chapter book deals with numerous topics related to using and testing cloud services, including the role of the test manager, identifying the risks of cloud computing and testing those risks, and picking the right test measures for the chosen services.

Chapter 5, which makes up a significant portion of the book, is devoted to both test measures and test management. “Testing SaaS is very different from testing PaaS or IaaS,” the writers state. Much of the lengthy chapter focuses on SaaS, but it also addresses PaaS and IaaS, and the authors describe the following test measures:

  • Testing during selection of cloud services
  • Testing performance
  • Testing security
  • Testing for manageability
  • Testing availability/continuity
  • Testing functionality
  • Testing migrations
  • Testing due to legislation and regulations
  • Testing in production

Particularly if you are a newcomer to choosing, testing, and maintaining cloud services, this book can be an informative and helpful how-to guide.

Si Dunn

The Practice of Network Security Monitoring – You’re compromised, so deal with it. #security #bookreview

The Practice of Network Security Monitoring

Understanding Incident Detection and Response
Richard Bejtlich
(No Starch Press – paperback, Kindle)

Security expert Richard Bejtlich’s focus in his new book is not on “the planning and defense phases of the security cycle.” Instead, he emphasizes how to handle “systems that are already compromised or that are on the verge of being compromised.”

His well-organized, well-written, 341-page book aims to help you “start detecting and responding to digital intrusions using network-centric operations, tools, and techniques.”

Bejtlich has long emphasized a “detection-centered philosophy” built around a straightforward central tenet: “Prevention eventually fails.” No matter how many digital walls and moats you build around your network, someone will find a way to tunnel in, parachute in, or sneak in via an unsuspecting employee’s $9.95 thumb drive.

“It’s becoming smarter,” he writes, “to operate as though your enterprise is always compromised. Incident response is no longer an infrequent, ad-hoc affair. Rather, incident response should be a continuous business process with defined metrics and objectives.”

You may recognize some of Bejtlich’s previous books on network security monitoring (NSM): The Tao of Network Security Monitoring; Extrusion Detection; and Real Digital Forensics.

The Practice of Network Security Monitoring is tailored toward two key audiences: (1) security professionals who have little or no experience with NSM; and (2) “more senior incident handlers, architects, and engineers who need to teach NSM to managers, junior analysts, or others who may be technically less adept.”

Readers, he adds, should understand “the basic use of the Linux and Windows operating systems, TCP/IP networking, and the essentials of network attack and defense.”

The examples in Bejtlich’s book rely on open source and vendor-neutral tools, primarily from Doug Burks’ Security Onion (SO) distribution.

The 13-chapter book is organized into four parts:

  • Part I: Getting Started – Introduces NSM and sensor placement issues.
  • Part II: Security Onion Deployment – Shows how to install and configure SO.
  • Part III: Tools – Examines the “key software shipped with SO and how to use these applications.”
  • Part IV: NSM in Action – Looks at “how to use NSM processes and data to detect and respond to intrusions.”

Following the technical chapters, Bejtlich offers some concluding thoughts on network security management, cloud computing, and establishing an effective workflow for NSM. “NSM isn’t just about tools,” he writes. “NSM is an operation, and that concept implies workflow, metrics, and collaboration. A workflow establishes a series of steps that an analyst follows to perform the detection and response mission. Metrics, like the classification and count of incidents and time elapsed from incident detection to containment, measure the effectiveness of the workflow. Collaboration enables analysts to work smarter and faster.”
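The two metrics named in that passage, incident counts by classification and time elapsed from detection to containment, can be sketched in a few lines of Python. This is only an illustration and is not drawn from Bejtlich's book or from Security Onion; the record fields, categories, and timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical incident records; the field names and categories are
# assumptions made for this sketch, not a format from the book.
INCIDENTS = [
    {"category": "malware",
     "detected": datetime(2013, 8, 1, 9, 0),
     "contained": datetime(2013, 8, 1, 11, 30)},
    {"category": "phishing",
     "detected": datetime(2013, 8, 2, 14, 0),
     "contained": datetime(2013, 8, 2, 15, 0)},
    {"category": "malware",
     "detected": datetime(2013, 8, 3, 10, 0),
     "contained": datetime(2013, 8, 3, 12, 30)},
]

def count_by_category(incidents):
    """Count how many incidents fall into each classification."""
    return Counter(incident["category"] for incident in incidents)

def mean_time_to_containment(incidents):
    """Average time elapsed between detection and containment."""
    elapsed = [i["contained"] - i["detected"] for i in incidents]
    return sum(elapsed, timedelta()) / len(elapsed)

if __name__ == "__main__":
    print(count_by_category(INCIDENTS))
    print(mean_time_to_containment(INCIDENTS))
```

Tracked over time, numbers like these would let a team see whether its workflow is getting faster at containing intrusions.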

He also observes: “It is possible to defeat adversaries if we stop them before they accomplish their mission. As it has been since the early 1990s, NSM will continue to be a powerful, cost-effective way to counter intruders.”

Si Dunn

Puppet 3 Beginner’s Guide – Automate configuration management & become a better system admin – #programming #bookreview

Puppet 3 Beginner’s Guide
John Arundel
(Packt Publishing – paperback, Kindle)

If you administer a small network built around just a few servers, you may still be doing at least some of the configuration management by hand. You literally move from machine to machine, manually entering updates, changes, or fixes. And your small network may be running several different brands–and vintages–of hardware and software, which complicates the update and repair process.

However, infrastructure consultant John Arundel warns, once you get “[b]eyond ten or so servers, there simply isn’t a choice. You can’t manage an infrastructure like this by hand. If you’re using a cloud computing architecture, where servers are created and destroyed minute-by-minute in response to changing demand, the artisan approach to server crafting just won’t work.”

In his new book, Puppet 3 Beginner’s Guide, Arundel emphasizes: “Manual configuration management is tedious and repetitive, it’s error-prone, and it doesn’t scale well. Puppet is a tool for automating this process.”

Among “UNIX-like systems,” there are at least three major configuration management (CM) packages: Puppet, Chef, and CFEngine, along with a few other competitors. Arundel calls them “all great solutions to the CM problem…it’s not very important which one you choose as long as you choose one.” But he hopes, of course, you will favor Puppet and his well-written how-to guide.

Puppet 3 Beginner’s Guide is structured to help system administrators “start from scratch…and learn how to fully utilize Puppet through simple, practical examples,” he writes.

He places important emphasis on the rapidly closing “divide between ‘devs,’ who wrangle code, and ‘ops,’ who wrangle configurations. Traditionally, the skills sets of the two groups haven’t overlapped much,” he notes. “It was common until recently for system administrators not to write complex programs, and for developers to have little or no experience of building and managing servers.”

Today, system admins are “facing the challenge of scaling systems to enormous size for the web, [and] have had to get smart about programming and automation.” Meanwhile, “[d]evelopers, who now often build applications, services, and businesses by themselves, couldn’t do what they do without knowing how to set up and fix servers,” he says.

Therefore, “[t]he term ‘devops’ has begun to be used to describe the growing overlap between these skill sets…Devops write code, herd servers, build apps, scale systems, analyze outages, and fix bugs. With the advent of CM systems, devs and ops are now all just people who work with code.”

Arundel’s 184-page Puppet 3 Beginner’s Guide offers 10 chapters smoothly structured with headings, short paragraphs, code examples, and other illustrations. He has generated his code examples using the Ubuntu 12.04 LTS “Precise” distribution of Linux. But he explains how to load the software using “Red Hat Linux, CentOS, or another Linux distribution that uses the Yum package system,” as well.

The chapters are:

  • Chapter 1, Introduction to Puppet
  • Chapter 2, First Steps with Puppet
  • Chapter 3, Packages, Files, and Services
  • Chapter 4, Managing Puppet with Git
  • Chapter 5, Managing Users
  • Chapter 6, Tasks and Templates
  • Chapter 7, Definitions and Classes
  • Chapter 8, Expressions and Logic
  • Chapter 9, Reporting and Troubleshooting
  • Chapter 10, Moving on Up

That final chapter covers a range of topics, including how to make Puppet code “more elegant, more readable, and more maintainable.” The author offers “links and suggestions for further reading.” And he describes several projects to help you “improve your skills and your infrastructure at the same time.” Those projects, he says, “provide a series of stepping-stones from your first use of Puppet to a completely automated environment.”

Besides Linux, Puppet will run on several other platforms, including Windows and Macs. But there is almost no help for those in Arundel’s book. Essentially, it’s Linux or bust. For other operating systems, you will need to refer to the Puppet Labs website.

It can take a bit of work to get Puppet installed and properly configured. But once you have Puppet running, the Puppet 3 Beginner’s Guide can help you become both a proficient Puppet user and a more efficient, knowledgeable, and versatile system administrator.

Si Dunn

Windows PowerShell 3.0: Step by Step – A huge guide to things you can do after you’ve found PowerShell – #bookreview

Windows PowerShell 3.0: Step by Step
Ed Wilson
(Microsoft Press – paperback, Kindle)

 

Wondering what the “Open Windows PowerShell” option does on your Windows 8 PC?

There’s a book for that: Windows PowerShell 3.0: Step by Step by Ed Wilson.

According to Wilson, “Windows PowerShell 3.0 is an essential management and automation tool that brings the simplicity of the command line to the next generation operating systems.” It is “included in Windows 8 and Windows Server 2012, and portable to Windows 7 and Windows Server 2008 R2” and “offers unprecedented power and flexibility to everyone from power users to enterprise network administrators and architects.”

Windows PowerShell is accessed as a command console that also offers a programming language. This means you can create files that will perform some automated actions using “cmdlets” (pronounced “command-lets”) at the PowerShell prompt. The cmdlets, Wilson writes, “are like executable programs, but they take advantage of the facilities built into Windows PowerShell, and therefore are easy to write.” Cmdlets are not scripts, he adds, “because they are built using the services of a special .NET Framework namespace.”

In one basic, introductory example in Wilson’s book, you create a batch file (TroubleShoot.bat) that automatically runs four commands in sequence and redirects the output of each command to a text file. The first command creates or overwrites the file, and the other three append to it:

ipconfig /all >C:\tshoot.txt
route print >>C:\tshoot.txt
hostname >>C:\tshoot.txt
net statistics workstation >>C:\tshoot.txt

Wilson’s book spans 666 pages, so there are many other features and uses for PowerShell that should please power users, technical staff, Windows network administrators, and Windows networking consultants. Some programmers also will relish its opportunities to write various types of PowerShell files and create functions, subroutines, modules, and other processes.

If you are studying to become a Microsoft Certified Solutions Expert (MCSE) or Microsoft Certified Trainer (MCT), you may know this already: Windows PowerShell is considered “a key component of many Microsoft courses and certification exams.”

Windows PowerShell 3.0: Step by Step is well written, and it is solidly illustrated with code examples, screenshots, and other graphics. The author is a senior consultant at Microsoft and a well-known scripting expert. Readers are not expected to have “any background in programming, development, or scripting.” So, it is a good (albeit hefty) how-to guide for PowerShell beginners and intermediate users.

Si Dunn