Are the massive data lakes that have been built by enterprises over the last few years becoming a bigger stumbling block to transformation plans?
As more organizations start to connect the dots between digital business transformation and the need for real-time analytics, many of them are starting to discover that all the time and effort spent building massive data lakes over the last few years may prove to be more of a hindrance than an enabler.
Most business leaders equate digital business with being able to change or alter a process in real time in response to changing business conditions. The issue with data lakes is that they presuppose all the data will be collected in some central location before it is analyzed. By the time that analysis occurs, the opportunity to act on the data has all too often passed.
DataTorrent president and CEO Guy Churchward predicts data lakes in 2018 will prove to be the Achilles’ heel of most digital transformation projects because such initiatives require access to real-time analytics. Running analytics in memory against stale data stored in a data lake does not equate to applying analytics in real time, says Churchward.
“You can’t run real-time analytics against a data lake,” says Churchward.
Parking data in a central repository, contends Churchward, results in IT organizations delivering analytics of events based on stale data. Data lakes still have a role to play in terms of analyzing historical data. But Churchward says many IT leaders are going to find they will need to adopt a new philosophy when it comes to analyzing data in real time. Much of that analytics needs to be applied as close as possible to the processes it impacts, which Churchward says will mean pushing data analytics out to the edge of the network.
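The distinction between analyzing events as they arrive and analyzing them after they have been parked in a repository can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Event` record, the `SlidingWindowMonitor` class, the window length and the threshold are all hypothetical, and none of it represents DataTorrent's or Apache Apex's actual API.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float  # seconds since the stream started (hypothetical unit)
    value: float      # e.g. a sensor reading or a transaction amount


class SlidingWindowMonitor:
    """Keeps a rolling mean over the most recent `window_seconds` of events."""

    def __init__(self, window_seconds: float, threshold: float):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._events = deque()
        self._total = 0.0

    def observe(self, event: Event) -> bool:
        """Ingest one event; return True if the rolling mean exceeds the threshold."""
        self._events.append(event)
        self._total += event.value
        # Evict events that have fallen out of the time window.
        while event.timestamp - self._events[0].timestamp > self.window_seconds:
            old = self._events.popleft()
            self._total -= old.value
        return self._total / len(self._events) > self.threshold


def first_alert_time(events, window_seconds, threshold):
    """Return the timestamp of the first event that triggers an alert, or None."""
    monitor = SlidingWindowMonitor(window_seconds, threshold)
    for event in events:
        if monitor.observe(event):
            return event.timestamp
    return None


if __name__ == "__main__":
    stream = [Event(0, 10), Event(1, 10), Event(2, 50), Event(3, 60)]
    print(first_alert_time(stream, window_seconds=2, threshold=30))
```

In this sketch the alert is raised while the triggering event is being processed, which is the property Churchward argues is lost when the same calculation runs later against data already stored in a central repository.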
Churchward joined DataTorrent late last year after spending many years at EMC, where much of his focus was on helping IT organizations construct data lakes. DataTorrent provides a real-time analytics and data ingestion engine based on the Apache Apex project. Since joining the company and spending time with its customers, Churchward says, his eyes have been opened to the central role digital analytics plays in any digital business transformation. In fact, Churchward says many IT leaders who fail to shift analytics applications away from data lakes and toward real-time processing at the network edge are likely to be out of a job this time next year.
Of course, many of those same IT leaders spent much of the last two years allocating IT budget dollars to building data lakes that they personally championed. In too many cases, Churchward notes, IT departments are now finding themselves at loggerheads with data scientists advocating for real-time analytics to be embedded within a specific business process.
Much of the impetus for applying real-time analytics is being driven by consumer applications that employ advanced analytics; the people who use them have a greater appreciation of what’s now possible. Many business executives want to know why such capabilities can’t also be manifested in enterprise applications. The good news is that agile development practices will soon make it possible to employ multiple analytics models based on disparate classes of algorithms in real time, assuming, of course, the IT organization is applying those models to the freshest data available.