Organizations need transparency into real-time applications to understand interdependencies, prevent unexpected problems, and avoid incorrect results.
Real-time applications for decision support and decision automation typically draw on multiple data sets. In many cases, application complexity masks the influence any one data source may have on derived insights, predictions, and suggested courses of action.
The issue is particularly acute with modern development and deployment practices where applications are highly distributed, loosely coupled entities. In many cases, organizations have no control over the data they are using in their applications. There are numerous examples from the financial services market where an institution will use third-party indices, datasets, and other information in day-to-day operations.
All is fine if everything works as planned. If data streams as it should, data quality holds up, and all the intertwined elements of a real-time application work in unison, then all is right with the world. But what happens if there is a glitch?
A recent New York Times article highlights the consequences of a hidden dependency that disrupts a smoothly operating real-time system. The article notes that researchers noticed that the quality of weather forecasting models declined as air travel dropped off during the pandemic. One might easily conjecture there was a physical cause and effect. Jet engine exhaust includes water vapor, soot, carbon dioxide, unburned fuel, and several oxides. Perhaps the diminished amounts of these particulates due to fewer commercial flights had some impact on lower stratospheric dynamics.
It turns out that was not the case. The article reported that researchers found “that when a short-term forecasting model received less data on temperature, wind, and humidity from aircraft, the forecast skill (the difference between predicted meteorological conditions and what actually occurred) was worse.”
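For a concrete sense of what a skill metric looks like, here is a minimal sketch in Python. It assumes forecast skill is summarized as root-mean-square error (RMSE) between predicted and observed temperatures; the variable names and sample values are hypothetical, not taken from the research the article describes.

```python
import math

def rmse(predicted, observed):
    """Root-mean-square error: one common way to score forecast skill
    (lower is better). Assumes both sequences are the same length."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)
    )

# Hypothetical 5-day temperature forecasts (degrees C) vs. what occurred.
observed           = [21.0, 23.5, 22.0, 19.5, 18.0]
with_aircraft_data = [21.4, 23.1, 22.6, 19.0, 18.5]  # model fed aircraft obs
without_aircraft   = [22.8, 21.9, 24.1, 17.8, 19.9]  # same model, feed missing

print(f"RMSE with aircraft data:    {rmse(with_aircraft_data, observed):.2f}")
print(f"RMSE without aircraft data: {rmse(without_aircraft, observed):.2f}")
```

The specific numbers are invented; the point is that a larger error statistic for the data-starved model is exactly the kind of degradation the researchers observed when aircraft reports dried up.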
Atmospheric observations made by instruments on commercial and cargo planes are “among the most important data used in forecasting models,” according to the article. The measurements are transmitted in real-time to forecasting organizations around the world, including the National Weather Service.
The degradation of predictions caused by such a data disruption can have many downstream implications. For example, many industries (logistics, agriculture, entertainment, public safety, and more) use short-term forecasts to make business decisions.
Similar situations can occur in a wide variety of businesses that use real-time applications. Financial services firms routinely make real-time decisions on customer credit limits, risk, and fraud detection using many data sources. The impact of COVID-19 has focused attention on whether models, predictions, and assumptions are still valid as the pandemic’s economic impact rapidly changed financial conditions.
Having some knowledge of these changes and of their impact on outcomes is useful. You might be able to take such factors into account and adjust on a case-by-case basis. (That defeats the purpose of automating the decision, but at least it is an attempt to adjust to the circumstances.) But what happens if you do not know the changes have occurred? That can easily be the case in a complex, distributed application.
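One pragmatic defense is to watch the inputs themselves. The sketch below is illustrative only; the feed names, baseline figures, and 50% threshold are assumptions. It flags any feed whose message volume falls well below its historical baseline, so a silent upstream disruption surfaces as an alert rather than as quietly worse predictions.

```python
from dataclasses import dataclass

@dataclass
class FeedStats:
    name: str
    baseline_per_hour: float  # long-run average message rate
    observed_per_hour: float  # rate over the most recent window

def check_feeds(feeds, min_ratio=0.5):
    """Return feeds whose current volume has dropped below min_ratio
    of baseline -- a cheap proxy for 'something upstream changed'."""
    return [
        f for f in feeds
        if f.observed_per_hour < min_ratio * f.baseline_per_hour
    ]

# Hypothetical input feeds for a real-time decisioning application.
feeds = [
    FeedStats("aircraft_obs", baseline_per_hour=700_000, observed_per_hour=180_000),
    FeedStats("surface_stations", baseline_per_hour=50_000, observed_per_hour=49_200),
]

for feed in check_feeds(feeds):
    # In production this would page an operator or trip a circuit breaker.
    print(f"ALERT: {feed.name} volume at "
          f"{feed.observed_per_hour / feed.baseline_per_hour:.0%} of baseline")
```

Volume is only the crudest signal; the same pattern extends to monitoring distributions, schemas, or freshness, but even this simple check would have surfaced the drop in aircraft observations.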
Today’s complex real-time applications can have many interdependencies. Not understanding those dependencies can lead to incorrect predictions. What’s needed is transparency into the workings of real-time applications.