Not all data and event monitoring and analysis situations are alike. Here, RTInsights contributor Chris Bird explains when you need in-stream detection of event anomalies and when out-of-band, or after-the-fact, detection is the best choice.
The Danish philosopher Søren Kierkegaard is credited with saying, “Life can only be understood backwards but it must be lived forwards.” The same thought applies to transactional systems. Transactions must be “lived forward” and applied as they happen, yet the full context of what they mean can only be understood after the event. Even so, many applications, such as risk management, require instantaneous analysis in order to make rapid decisions about a transaction. Unfortunately, our ability to analyze events in real time, or “in-stream,” still needs improvement.
When a system is “listening” to an event stream, it can perform analysis on the fly, while the event data is coming in. The system can also recognize when a noteworthy event has occurred that requires further analysis. Even though software developers are working to move analysis capabilities closer to the moment of event notification, it still isn’t feasible to perform all analytics processing on every transaction at the moment of event detection. Sometimes that is because the compute workload is too high at that point, and sometimes it is because any delay introduced into the transaction flow is unacceptable.
To work around this limitation, the industry has adopted two complementary patterns: “in-stream detection,” in which the system or infrastructure recognizes that something interesting has happened as the event arrives, and “out-of-band” processing, in which the analytics run outside the transaction window. When there is a large volume of data and events, and a need to evaluate their performance or characteristics, both patterns may be needed.
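The split between the two patterns can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event fields (`id`, `amount`), the `SUSPICIOUS_AMOUNT` threshold, and the function names are hypothetical and do not come from any particular product. The in-stream check is deliberately cheap, and anything it flags is handed to a queue that a separate worker drains outside the transaction window.

```python
import queue
import threading

# Hypothetical threshold for the cheap in-stream check.
SUSPICIOUS_AMOUNT = 10_000


def looks_interesting(event):
    """In-stream detection: fast enough to run on every event."""
    return event["amount"] >= SUSPICIOUS_AMOUNT


def process_stream(events, backlog):
    """Apply each event as it arrives and flag the interesting ones."""
    applied = []
    for event in events:
        applied.append(event["id"])   # the transaction is applied right away
        if looks_interesting(event):
            backlog.put(event)        # hand off without delaying the stream
    return applied


def analyze_out_of_band(backlog, findings):
    """Out-of-band processing: the expensive analysis would happen here."""
    while True:
        event = backlog.get()
        if event is None:             # sentinel value: no more work
            break
        findings.append((event["id"], "needs review"))


backlog = queue.Queue()
findings = []
worker = threading.Thread(target=analyze_out_of_band, args=(backlog, findings))
worker.start()

events = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": 25_000},
    {"id": 3, "amount": 300},
]
applied = process_stream(events, backlog)
backlog.put(None)                     # tell the worker to stop
worker.join()
```

Every event is applied in order, while only the flagged one reaches the slower analysis path.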
Here are three examples from three different professional fields where both in-stream detection and out-of-band analysis are needed to perform the necessary tasks.
Law Enforcement: Code Breaking
A surveillance system is typically alerted to possible criminal activity over a communications network by an increase in encrypted message chatter. While it might be technically possible for the system to intercept and decode every message and transaction it encounters in real time, doing so would require an enormous amount of processing capacity. It is simply not feasible to decode everything that flows over the network without delaying transmission.
Instead, combining an abbreviated in-stream detection step with out-of-band analysis of only the suspect transactions is the faster, more practical approach. When the system detects an increase in “chatter,” it may decide at that point to devote its resources to evaluating and decrypting the traffic.
Finance: Money Transfer
Another example of in-stream and out-of-band working together is when a system set to monitor money transfers (such as in a financial institution) records an upswing in the value of each money transaction or in the total volume of transactions. Because it isn’t practical to expend a lot of time investigating every money transfer, the system instead searches for abnormal patterns of activity that indicate unusual, potentially fraudulent transactions. This general pattern is part of the standard Detect/Act messaging pattern in which explicit separation is made between the Detect and Act sides of the analysis.
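As a rough sketch of keeping the Detect and Act sides apart, the Python below flags a transfer whose value is far above the recent average and passes it to a separate action. The rolling window, the three-standard-deviation rule, the floor of 1.0 on the deviation, and the names `TransferDetector` and `act` are all assumptions made for illustration; they are not a description of how any financial institution detects fraud.

```python
from collections import deque
from statistics import mean, pstdev


class TransferDetector:
    """Detect side: a cheap statistical check on each transfer amount."""

    def __init__(self, window=5, sigmas=3.0):
        self.recent = deque(maxlen=window)   # most recent amounts only
        self.sigmas = sigmas

    def detect(self, amount):
        """Return True if the amount is unusually high for the window."""
        suspicious = False
        if len(self.recent) == self.recent.maxlen:
            mu = mean(self.recent)
            # Floor the deviation so a flat window does not flag tiny changes
            # (an assumption of this sketch).
            sd = max(pstdev(self.recent), 1.0)
            suspicious = amount > mu + self.sigmas * sd
        # The amount joins the window either way; a real system might
        # exclude flagged values so they do not skew the baseline.
        self.recent.append(amount)
        return suspicious


def act(transfer_id, review_list):
    """Act side: here it only records the transfer for later investigation."""
    review_list.append(transfer_id)


detector = TransferDetector(window=5, sigmas=3.0)
review = []
amounts = [100, 110, 95, 105, 90, 102, 5000, 98]
for transfer_id, amount in enumerate(amounts):
    if detector.detect(amount):
        act(transfer_id, review)
```

Because `detect` and `act` share nothing but the transfer identifier, the action could just as easily be queued for out-of-band investigation without changing the detector.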
IT: Systems Management
A standard tenet of systems management is that it’s only necessary to alert on an event once. Here’s an example of why this is a good tenet. Once a disk starts to run low on remaining capacity, it is likely to keep losing capacity. Operators certainly don’t want to receive a new notification of the problem at every system operation, so they set thresholds that let the system alert at the appropriate times. We want an in-stream alert when a disk is low on capacity, but we want to provide the details to the IT maintenance station out-of-band, when there is more time to analyze the situation that led to the underlying condition.
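One way to picture “alert once” is a threshold that latches, sketched below in Python. The class name `DiskAlert`, the 10 percent alert threshold, and the separate 15 percent clearing threshold are assumptions for illustration; the clearing threshold in particular is an added design choice so the alert can fire again after the disk has recovered.

```python
class DiskAlert:
    """Fire once when free space drops below a threshold, then stay quiet."""

    def __init__(self, low_pct=10.0, clear_pct=15.0):
        self.low_pct = low_pct       # alert when free space falls below this
        self.clear_pct = clear_pct   # re-arm once free space is back above this
        self.active = False

    def observe(self, free_pct):
        """Return True only on the reading that first triggers the alert."""
        if not self.active and free_pct < self.low_pct:
            self.active = True
            return True
        if self.active and free_pct >= self.clear_pct:
            self.active = False      # condition cleared; allow a future alert
        return False


alert = DiskAlert()
readings = [22.0, 12.0, 9.5, 9.0, 8.2, 16.0, 9.9]
fired = [i for i, pct in enumerate(readings) if alert.observe(pct)]
```

In this run the alert fires on the third reading, stays silent while the disk remains low, and fires again only after capacity has recovered and dropped a second time.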
Conclusion
When we design systems, whether computerized or manual, it is important to separate how we detect that something has happened (i.e., an event) from what we do about it (i.e., the action). We want to decide what is suspicious and worthy of further analysis, balancing the timeliness of the analysis against the effort required to perform it. The in-stream/out-of-band pattern gives us a way to separate the detection and action components of the system and helps us decide how urgently to act on incoming events.