The Hidden Gold Mine: IBM Analyzes Log Data
for IT Troubleshooting

Smart log analysis means a faster, more effective response to problems

Cloud & Smarter Infrastructure Weekly. An IBM service management perspective.

IBM's ongoing investment in automated big data analytics is paying heavy dividends, particularly in the case of data center troubleshooting.

Why is this the case? Consider the information sources available to IT professionals in order to resolve performance shortfalls or outages:

Of the three, logs represent by far the highest untapped potential for troubleshooting purposes. While metrics and data can reflect the existence or extent of a problem, logs often reflect something far more useful: its root cause. This can often be derived via system health statistics, software errors and exceptions, stack traces, customer transactions and sessions, transaction timing, configuration information, and ticketing data (among other data).

But is manual log analysis practical? An infrastructure of 5,000 servers running more than 100 applications typically generates more than a terabyte of such data every day, and some 97% of it is unstructured, such as logs.
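To put those figures in perspective, here is a rough back-of-the-envelope sketch. It assumes exactly 1 TB per day (the article says "more than", so this is a lower bound), decimal units, and an even spread of data across servers; the function names are illustrative only.

```python
# Rough sizing based on the figures quoted above.
# Assumptions: 1 TB/day is a lower bound, decimal units (1 TB = 10**12 bytes),
# and data volume is spread evenly across all servers.
SERVERS = 5_000
DAILY_BYTES = 10**12
UNSTRUCTURED_SHARE = 0.97


def per_server_megabytes(daily_bytes: int, servers: int) -> float:
    """Average data generated per server per day, in megabytes."""
    return daily_bytes / servers / 10**6


def unstructured_gigabytes(daily_bytes: int, share: float) -> float:
    """Portion of the daily volume that is unstructured, in gigabytes."""
    return daily_bytes * share / 10**9


if __name__ == "__main__":
    print(f"Per server: {per_server_megabytes(DAILY_BYTES, SERVERS):.0f} MB/day")
    print(f"Unstructured: {unstructured_gigabytes(DAILY_BYTES, UNSTRUCTURED_SHARE):.0f} GB/day")
```

Under these assumptions, each server contributes roughly 200 MB per day, and about 970 GB of the daily total is unstructured, which is far more than a team could read by hand.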

That’s why organizations are increasingly looking for a smart, automated solution capable of scaling to any necessary level, and supporting all the heterogeneous platforms on which their workloads are executing.

A whole new family of analytics tools from IBM

Enter the new IBM SmartCloud Analytics family of solutions—a key offering of which is Log Analysis.

This solution, based on core technology from IBM's proven Big Data solution portfolio, brings the power of automated analysis to IT asset logs—seeking out, discovering, and reporting technical issues far more quickly than could be achieved by IT team members alone. Yet because it is exceptionally feature-rich, and extensible through customization, Log Analysis also supports the specific business and IT context of almost any organization, delivering the same kind of tailored insight a human expert might (given unlimited time)—or better.

Consider how complex the human version of this task typically is. Following a problem, IT team members must collect and parse many forms of unstructured data, interpret logs in a governed and comprehensive fashion, and (ideally) do this in a way that goes beyond simple searches or ad hoc scripts that only address a limited number of problem types. For the most complex problem types, multiple experts may be required, slowing problem resolution even more. And even in the best such case, the resulting fix is reactive, not proactive; it only takes place at all following a problem, when the business impact has already begun to take a toll.

How does Log Analysis transcend this paradigm? Essentially, by performing much the same tasks, but far more quickly, thoroughly, consistently, and continuously. It collects log files in very large volumes (via either an agent-driven push method or an agentless pull method), enables quick searches for patterns, and visualizes relationships via dashboard displays.
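The general idea of searching logs for patterns and summarizing the matches can be illustrated with a minimal sketch. This is not how Log Analysis itself is implemented; the pattern names, regular expressions, and sample log lines below are all invented for illustration.

```python
import re
from collections import Counter
from typing import Iterable

# Hypothetical patterns; a real deployment would tune these to its own logs.
PATTERNS = {
    "error": re.compile(r"\bERROR\b"),
    "exception": re.compile(r"\b\w+Exception\b"),
    "timeout": re.compile(r"\btimed? ?out\b", re.IGNORECASE),
}


def count_patterns(lines: Iterable[str]) -> Counter:
    """Count how many lines match each named pattern."""
    counts: Counter = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Invented sample lines, standing in for collected log files.
SAMPLE_LOG = [
    "2013-06-01 10:00:01 INFO  Server started",
    "2013-06-01 10:00:05 ERROR Connection timed out after 30s",
    "2013-06-01 10:00:06 ERROR java.lang.NullPointerException at Foo.bar",
    "2013-06-01 10:00:09 WARN  Request timeout for session 42",
]

if __name__ == "__main__":
    # A crude text "dashboard": one summary row per pattern.
    for name, hits in count_patterns(SAMPLE_LOG).most_common():
        print(f"{name}: {hits}")
```

A summary like this is the kind of aggregate a dashboard might chart over time; a production tool would additionally handle indexing, time windows, and much larger volumes.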

In this, it is aided by extensive troubleshooting insight drawn from IBM's long history of successful customer engagements, concerning how problems tend to manifest and what log characteristics arise as a result. The solution then presents the results via a single actionable dashboard for tracking dynamic infrastructure conditions in near real time: a unified view of overall performance that can be made far more granular on demand.

Both IT generalists and application specialists will benefit from the new insight they receive. When technical problems emerge, those problems can be identified, isolated, and repaired far more quickly than they could via a manual approach—and the business consequence of the problem will be minimized as a result.

Create insight packs tailored to fit your IT and business needs

Additionally, the solution can be customized to suit specific assets, business requirements, or other considerations. Using the provided tools, solution users can create "insight packs" that capture and package advanced knowledge on specific topics. This feature, available for customers who also use IBM WebSphere application environments or IBM's enterprise-class DB2 database offering, multiplies the total business value of the solution by combining and leveraging domain knowledge previously only available from human experts.

How should log data from a specific asset be interpreted? How should log data be linked with metric or application data? What should be searched for, given a particular problem and a particular class of log? Such knowledge can be made a part of the solution as soon as an expert creates a new insight pack. Once that happens, Log Analysis can apply it automatically and continuously, even if those experts subsequently leave the organization altogether. Insight packs can also be shared across technology teams, operational sites, or even with business partners outside company walls.

In every such scenario, expert knowledge will be rapidly reused to best effect, instead of slowly and painstakingly having to be reinvented from scratch.

Many forms of value from a single solution

The range of possible use cases for Log Analysis is broad, but a few common scenarios help illustrate how its smart, automated, continuous log analysis can be leveraged in different ways:

  • Application developers will benefit from it in developing and testing new builds. Perhaps an application isn't as stable as it should be prior to deployment into production; via Log Analysis, developers can perform detailed analyses of logs such as stack traces, determining exactly where, when, and under what conditions the code breaks.
  • Application owners, on the other hand, typically have a business perspective. While they cannot repair code per se, they often develop insight into the circumstances under which applications begin to underperform, as well as the best practices to respond. Using Log Analysis, these team members can capture and package those best practices into an insight pack, passing them on to other members of the team so that everyone can respond to such situations in a consistent and optimized manner.
  • IT operations team members may receive reports from certain end users that an application performs in an inconsistent way—but it's not clear what leads to the inconsistency. Through Log Analysis, it's possible for these team members to identify when application performance declined for those particular users—then, given that extra context, search the expert advice for available solutions.

And if that weren't enough, future integrations with other solutions from the IBM Cloud and Smarter Infrastructure portfolio promise even more value in the months to come. Specifically, monitoring and event management solutions will soon incorporate insight from Log Analysis, and vice versa, providing a truly end-to-end, federated way to identify and solve technical problems as completely as possible, and in the shortest possible time, with the least impact.


 
