Observability aims to proactively provide full visibility into the source of known and unknown problems in any type of environment.
Application performance monitoring (APM) has been evolving for years. It has gone from monitoring single systems to much more sophisticated solutions that cover highly distributed, complex applications composed of many independent, dynamic systems. As a result, APM’s role has shifted from a largely passive one to a more dynamic one that incorporates artificial intelligence and observability.
The change is driven by the way modern businesses operate. Customer experience and application responsiveness are critical differentiators. Anything that degrades either of them can drive away customers, infuriate internal workers, or alienate partners.
Today, rather than waiting for problems (ranging from performance degradation to outright disruption and downtime) to happen, businesses need to get ahead of them. They need to anticipate problems in the making and take corrective action before the application user is affected.
To that end, new tools are being incorporated into APM solutions to expand their functionality. A good indication of the change is how APM is categorized. For example, Gartner defines APM suites as one or more software or hardware components that facilitate monitoring to meet three main functional dimensions:
- Digital experience monitoring (DEM)
- Application discovery, tracing, and diagnostics (ADTD)
- Artificial intelligence for IT operations (AIOps) for applications.
Evolving to observability
APM solutions are changing as they take on additional functionality and capabilities.
Traditionally, an APM workflow would collect data from applications, look for anomalous patterns, and generate alerts based on the anomalies. IT staff or SREs would then drill into the data to determine the source of the performance issue. In other words, the goal of APM is to detect performance problems, then diagnose their source.
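The collect, detect, and alert loop described above can be illustrated with a minimal Python sketch. It assumes a simple z-score check against a baseline window, which is only one of many possible anomaly detection methods and is not taken from any particular APM product. The function name `detect_anomalies`, the threshold value, and the sample latencies are all hypothetical.

```python
from statistics import mean, stdev


def detect_anomalies(baseline, samples, z_threshold=3.0):
    """Flag samples that deviate strongly from a baseline window.

    Assumption: a z-score test stands in for whatever anomaly
    detection a real APM tool would apply.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    alerts = []
    for index, value in enumerate(samples):
        # Guard against a perfectly flat baseline (sigma == 0).
        z = (value - mu) / sigma if sigma > 0 else 0.0
        if abs(z) > z_threshold:
            alerts.append({"index": index, "value": value, "z_score": round(z, 2)})
    return alerts


# Hypothetical request latencies in milliseconds.
baseline_ms = [100, 102, 98, 101, 99, 103, 97, 100]
recent_ms = [101, 99, 250, 100]

alerts = detect_anomalies(baseline_ms, recent_ms)
for alert in alerts:
    print(f"ALERT: sample {alert['index']} = {alert['value']} ms (z = {alert['z_score']})")
```

Note that the alert only says that latency is abnormal. Someone still has to drill into the data to find out why, which is the diagnostic half of the workflow.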
Increasingly, solutions include tracing, which is the process of tracking transactions within an application as different parts of the application respond to them. For example, instead of simply knowing that application latency is high, which is something that APM could determine, tracing lets staff pinpoint which part of the application (the frontend, the database, the business logic, or something else) is the weak link that is causing the latency problem. Most modern APM products use tracing data behind the scenes to connect the dots and infer causal relationships and dependencies, but they rarely offer the ability to inspect each transaction in detail.
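To make the idea of finding the weak link concrete, here is a small Python sketch that attributes time to each component in a single trace. The `Span` structure, its field names, and the sample trace are hypothetical and do not follow any specific tracing standard. The sketch also assumes that child spans run one after another rather than in parallel, so a span's own time is its duration minus the durations of its direct children.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    component: str
    duration_ms: float


def slowest_component(spans):
    """Return (component, self_time_ms) for the component with the most self time."""
    # Total time spent in the direct children of each span.
    child_total = {}
    for span in spans:
        if span.parent_id is not None:
            child_total[span.parent_id] = child_total.get(span.parent_id, 0.0) + span.duration_ms

    # Self time = own duration minus time delegated to children
    # (assumes children do not overlap in time).
    self_time = {}
    for span in spans:
        own = span.duration_ms - child_total.get(span.span_id, 0.0)
        self_time[span.component] = self_time.get(span.component, 0.0) + own

    return max(self_time.items(), key=lambda item: item[1])


# Hypothetical trace for one slow request.
trace = [
    Span("a", None, "frontend", 500.0),
    Span("b", "a", "business-logic", 450.0),
    Span("c", "b", "database", 400.0),
]

component, self_ms = slowest_component(trace)
print(f"{component} accounts for {self_ms} ms of the request")
```

In this example the frontend span is the longest overall, but almost all of its time is spent waiting on downstream calls. That is the distinction a latency metric alone cannot show.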
Today, many businesses want more. Specifically, they want observability. In general, where APM focuses on well-known problem patterns and application architectures, observability aims to provide full visibility into the source of problems in any type of environment. It does that mainly by correlating a variety of data points with each other to determine the root cause of a performance issue more easily or, at a minimum, to point staff toward the most likely root cause so that they can make a manual determination.
Where does tracing fit into the observability picture? Tracing is increasingly a data type for APM. However, it has also been deemed one of the so-called “pillars of observability.” The other main data sources for observability are logs and metrics.
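As a rough illustration of how these data sources might be correlated, the Python sketch below takes the timestamp of a metric alert and ranks components by how much supporting evidence (error logs and slow trace spans) appears near that moment. This is a deliberately simple heuristic under stated assumptions, not how any particular observability platform works. The function name `likely_causes`, the record fields, the time window, and the slow-span threshold are all hypothetical.

```python
from collections import Counter


def likely_causes(alert_ts, logs, spans, window_s=30.0, slow_ms=200.0):
    """Rank components by evidence found within window_s seconds of a metric alert."""
    start, end = alert_ts - window_s, alert_ts + window_s
    evidence = Counter()

    # Logs: count error-level records close to the alert.
    for record in logs:
        if start <= record["ts"] <= end and record["level"] == "ERROR":
            evidence[record["component"]] += 1

    # Traces: count spans close to the alert that exceeded the slow threshold.
    for span in spans:
        if start <= span["ts"] <= end and span["duration_ms"] >= slow_ms:
            evidence[span["component"]] += 1

    # Most likely candidates first.
    return evidence.most_common()


# Hypothetical data; timestamps are seconds on an arbitrary clock.
logs = [
    {"ts": 995.0, "level": "ERROR", "component": "database"},
    {"ts": 400.0, "level": "ERROR", "component": "frontend"},  # outside the window
    {"ts": 1001.0, "level": "INFO", "component": "frontend"},  # not an error
]
spans = [
    {"ts": 998.0, "component": "database", "duration_ms": 420.0},
    {"ts": 999.0, "component": "frontend", "duration_ms": 35.0},  # not slow
]

ranked = likely_causes(1000.0, logs, spans)
print(ranked)
```

The output is a ranked list of candidates rather than a verdict, in keeping with the goal of pointing staff toward the most likely root cause so that they can make the final determination.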