Discover the role data-in-motion plays in enterprise GenAI and how to augment LLMs trained on historical data with fresh or even real-time data.
The number of GenAI applications springing up daily is mind-boggling. There are horizontal applications as well as solutions for specific industries such as manufacturing, healthcare, legal, and retail. It makes Dinesh want to dig in and find out how data-in-motion (DiM) fits into this surge of new applications.
For example, the LLMs behind GenAI have been trained on historical data, but the solutions people are excited about typically need additional “fresh data.” Dinesh and Manish discuss options for using DiM in the context of LLMs.
Some topics highlighted in this video are:
- The high value of fresh data
- Food delivery app example
- Contextual data defined
- Changes to traditional data platforms for GenAI
Guest: Manish Devgan, a product executive with a proven track record of delivering industry-leading software. He has successfully led the development of numerous products at companies such as BEA Systems, Oracle, Terracotta, Software AG, and Hazelcast. Manish is an innovator and holds several patents. Most recently, he was Chief Product Officer at Hazelcast, where he helped build a category-leading real-time data platform for application builders leveraging streaming data, low-latency datastore, and real-time ML/AI. In his free time, Manish likes to go on long walks while listening to Punjabi music.
Host: Dinesh Chandrasekhar is a technology evangelist, a thought leader, and a seasoned IT industry analyst. With close to 30 years of experience, Dinesh has worked on B2B enterprise software as well as SaaS products, delivering and marketing sophisticated solutions for customers with complex architectures. He has also defined and executed highly successful GTM strategies to launch several high-growth products into the market at companies such as LogicMonitor, Cloudera, Hortonworks, CA Technologies, Software AG, and IBM. He is a prolific speaker, blogger, and weekend coder. Dinesh holds an MBA from Santa Clara University and a Master’s degree in Computer Applications from the University of Madras. Currently, Dinesh runs his own company, Stratola, a customer-focused business strategy consulting and full-stack marketing services firm.
Resources:
Watch Smart Talk Episode 1: The Data-in-Motion Ecosystem Landscape
View the data-in-motion ecosystem map here
Learn more about data-in-motion on RTInsights here
Transcript
Dinesh Chandrasekhar (00:22):
Shifting gears a little bit, given that we have now established the base of what we see as data in motion, how real-time data platforms work in this space, and how critical freshness of data is, let’s talk about what’s happening around us right now. The topic of the day, and the topic of the last several months I think, is GenAI. Generative AI applications have been on the rise. To say that there is an explosion happening around us is probably an understatement, because the number of applications springing up on a day-to-day basis is mind-boggling. Every day you open one of these social media sites or go to one of the tech news sites, and you see that a dozen new applications have come up in a particular area. These are not just horizontal technology applications; these are also solutions for specific industries, pertaining to manufacturing, healthcare, retail, and so forth.
It kind of makes you really want to gnaw at it and find out how data in motion fits into this movement that’s happening around us. Just to set the context for people who are listening and need to know why we are focusing on GenAI: if you haven’t been paying attention, the GenAI market is growing at a pace where it is going to become a $300 billion market by 2027 and a $1.3 trillion market by 2032. That’s the kind of growth we are talking about, and it’s expected to skyrocket. In 2023, GenAI was something like 8% of software spending, and by 2027 it is expected to be around 35%.
Every company is making an announcement one way or another about GenAI-based applications or features they are incorporating into their existing product stack. What we are also seeing is that more than 80% of enterprises will have used or deployed some kind of GenAI application by 2026. That is literally a couple of years from now, and 80% of enterprises will be doing it.
With all that said, there is no denying that a significant movement is happening all around us, and we need to pay attention to it. One of the key things I know is important here is the data that feeds into these GenAI applications, and I’m using “GenAI applications” as a very broad term; you could include LLMs and everything related under it. In your view, with your experience across ML and AI, how do you look at this GenAI space that is emerging around us? What do you think of it? Particularly in the context of data in motion, what does that have to do with it? I would love to hear your thoughts.
Manish Devgan (03:40):
Yeah, thanks. As we have agreed, there’s very high value in fresh data, which is typically data in motion, because either it’s being created at that point or it’s being moved, even reverse ETL from a data warehouse to an operational store. When we talk about AI and ML, let me classify it into two; let’s talk about predictive AI first. People refer to predictive AI as traditional AI or simply machine learning. In an autonomous application like fraud detection, as you are processing streams of data, you may be calling a predictive model. In technical terms, you’re running inference against a pre-trained model. That model will typically leverage real-time features. What is a real-time feature? A real-time feature is basically something coming from data in motion, like your location information or your average number of transactions in the last five minutes. You can see how important data in motion is to predictive AI in real time.
It’s the same when you, let’s say, order using Uber Eats. Based on the fresh data, the app is able to give you an estimated time for your order. The real-time features being considered in an Uber Eats-style application might be traffic patterns, how many orders are in the pipeline at the restaurant you ordered from, and so on. Now that’s predictive AI.
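A minimal Python sketch of the pattern described here, using the fraud detection example: real-time features computed over a stream feeding inference on a pre-trained model. The model file, event fields, window size, and threshold are illustrative assumptions, not any specific vendor’s API.

```python
# Hedged sketch: score each streaming transaction with a pre-trained fraud
# model using real-time features (recent-activity window, location-derived
# distance). All names and the 0.9 threshold are hypothetical.
import time
from collections import deque

import joblib  # assumes a scikit-learn-style model was trained and saved earlier

model = joblib.load("fraud_model.joblib")   # trained offline on historical data
recent = deque()                            # sliding window of (timestamp, amount)
WINDOW_SECONDS = 300                        # "average transactions in the last five minutes"

def score_transaction(event: dict) -> float:
    """Compute features from data in motion, then run inference."""
    now = time.time()
    recent.append((now, event["amount"]))
    while recent and now - recent[0][0] > WINDOW_SECONDS:
        recent.popleft()                    # drop events older than the window

    features = [
        event["amount"],
        len(recent),                                  # transaction count, last 5 min
        sum(a for _, a in recent) / len(recent),      # average amount, last 5 min
        event["distance_from_home_km"],               # location-derived feature
    ]
    return model.predict_proba([features])[0][1]      # probability of fraud

# In a streaming job this runs per event, e.g. inside a Kafka consumer loop:
# if score_transaction(event) > 0.9: flag_for_review(event)
```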
Now, in the world of generative AI, I have so far seen customers build mostly conversational applications. Even if the use case is centered around data in motion, there is some human intelligence in the loop. Take the example I gave about the robotic arm detecting that there is a problem with the pressure sensor and then stopping the process. In the generative AI case, that robotic arm on the assembly line raises an alarm and sends a notification to the supervisor with a recommendation to stop the assembly line. Now, this recommendation may be based on an LLM plus vector search providing the context. What is the context there? It could be the specs, it could be the manual of the robotic arm, plus additional contextual information like pressure from the nozzles or what the average pressure has been for the last 10 minutes.
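A minimal sketch of that LLM-plus-vector-search pattern: static documents (the arm’s specs and manual) are retrieved by similarity and combined with fresh sensor readings in the prompt. The toy embedding, stub LLM call, and all names are illustrative stand-ins, not any particular product’s API.

```python
# Hedged sketch: vector search over manual excerpts plus live sensor context,
# assembled into a prompt for an LLM-drafted recommendation. embed() and
# call_llm() are toy stand-ins for a real embedding model and LLM endpoint.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy hashing 'embedding' so the sketch runs; replace with a real model."""
    v = np.zeros(64)
    for word in text.lower().split():
        v[hash(word) % 64] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def call_llm(prompt: str) -> str:
    """Stub; replace with a call to your LLM of choice."""
    return "Recommend stopping the line: pressure exceeds the specified range."

# Manual/spec chunks are embedded ahead of time (data at rest).
manual_chunks = [
    "Nozzle pressure must stay between 4.0 and 6.5 bar during operation.",
    "If pressure deviates for more than 5 minutes, halt the line and inspect.",
]
manual_vectors = np.stack([embed(c) for c in manual_chunks])

def recommend(alert: str, current_pressure: float, avg_pressure_10min: float) -> str:
    # 1. Vector search: find the manual sections most relevant to the alert.
    q = embed(alert)
    scores = manual_vectors @ q
    top_chunks = [manual_chunks[i] for i in np.argsort(scores)[-2:]]

    # 2. Add contextual data in motion: live and recent sensor readings.
    prompt = (
        "You assist an assembly-line supervisor.\n"
        f"Alert: {alert}\n"
        f"Current nozzle pressure: {current_pressure} bar; "
        f"10-minute average: {avg_pressure_10min} bar.\n"
        "Relevant manual excerpts:\n" + "\n".join(top_chunks) + "\n"
        "Recommend whether to stop the assembly line, and explain why."
    )
    # 3. The LLM drafts a recommendation; the human supervisor still decides.
    return call_llm(prompt)

print(recommend("Pressure sensor anomaly on nozzle 3", 7.2, 6.9))
```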
Now, the supervisor interacts with this conversational app, because the app could have notified the supervisor that there’s a problem, and then the supervisor, based on that information, decides what to do. In a strict sense, this is really not autonomous decision-making yet. The kind of use case where you are actually making a decision, and one of the decision points could be an external LLM, I think that will come with trust as we become more mature. Trust and risk are yet another topic.
Dinesh Chandrasekhar (06:50):
I’m sure. I love the way you broke down existing or traditional or predictive AI models versus how you’re looking at generative AI applications and how they are going to evolve, particularly the point about autonomy. Just curious, out of recent news reading, and I’m thinking out loud as you explain this particular set of use cases: what do you think about the dynamic pricing changes that one of the fast food chains announced recently? It backfired on them a bit, and they said, “Well, we are not going to be doing that anymore,” because people didn’t want to be overcharged during peak periods. Dynamic pricing has been around for quite some time, but does it also have something to do with data in motion, particularly in the context of leveraging AI and saying, “Here’s how you can price your burger,” because it’s a peak period and the $6 burger is now suddenly $8? Is that an application, or am I overthinking this?
Manish Devgan (07:56):
No, possibly. I mean, I personally don’t like the surge pricing from the ride-sharing companies, but it’s an opportunity for a business, and I think that’s what’s going to happen. The opportunity, the value, and the new revenue streams are going to drive a lot of these use cases.
Dinesh Chandrasekhar (08:15):
Fantastic. Maybe as a way to take this conversation to the next level: you spoke about autonomy, and I’ve been hearing a lot about autonomous applications and the kind of digital agents being put out there. People are naturally fearing their jobs are going to be replaced by these digital agents. There’s a lot happening on the autonomy side; chatbots are replacing a lot of call centers, and companies have laid off plenty of people because that work is being done by chatbots now. Is there a broader set of meaningful applications on the autonomy side, beyond the traditional chatbots we are looking at right now, where GenAI can add even more value to how we do things today, so that we can see the forest for the trees, the bigger picture, and maybe do better things with GenAI?
Manish Devgan (09:23):
Yeah, I think healthcare and legal. There’s so much information being created every day that the professionals, in healthcare that’s the physician, are not able to keep up with the new medical research that is coming out. I think things like generative AI will help augment their work and make them more productive. More than taking away jobs, I feel that people will become more productive. They will have a conversational buddy they can interact with to figure out how it could help them be more productive. That’s how I look at it.
Dinesh Chandrasekhar (10:07):
Fantastic. On that note, you are obviously a product leader and an evangelist. You are very vocal, and you have been in this space for quite a long time. How do you see the technology vendor landscape changing as you look at so many people doing so many different things, with everybody wanting to say GenAI? Are there any specific trends that you want to highlight? Anything that, a year from now, we should be able to say, “Manish already called this out on our episode,” and take pride in?
Manish Devgan (10:39):
Yeah, if only I knew the future.
Dinesh Chandrasekhar (10:43):
We all wish we had an eight-ball, I know that.
Manish Devgan (10:47):
I think, because I’ve been building data platforms for many, many years, I see this coming: existing data platforms need to address these new, I’ll call them GenAI workloads. People will build things that require capabilities like vector search, and if they already have a data platform, they will demand those kinds of capabilities in that platform. If you’re a database, vector search becomes table stakes.
Now, having said that, I still see big gaps in the use of predictive AI in applications. I’ve seen so many high-value use cases where real-time machine learning is being used, but overall that percentage is still very small. I still see a lot of batch pipelines, and the real-time serving pipelines are separate; these need to come together. In the whole area of MLOps, you train your model somewhere else and then you have online inferencing going on, but those pipelines are different. Those, I think, still need a lot of work.
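One way to read “these need to come together” in code: share the exact same feature logic between the batch training pipeline and the online serving path, which avoids training/serving skew. A hedged Python sketch; the column and function names are made up for illustration.

```python
# Hedged sketch: a single feature function reused by the offline training
# pipeline and the online (streaming) inference path. Names are hypothetical.
import pandas as pd

def make_features(amount: float, txn_count_5min: int, avg_amount_5min: float) -> list[float]:
    """Single source of truth for feature computation."""
    return [amount, float(txn_count_5min), avg_amount_5min]

# Batch path: build a training frame from historical data at rest.
def build_training_frame(history: pd.DataFrame) -> pd.DataFrame:
    rows = [
        make_features(r.amount, r.txn_count_5min, r.avg_amount_5min)
        for r in history.itertuples()
    ]
    return pd.DataFrame(rows, columns=["amount", "count_5min", "avg_5min"])

# Online path: the same function applied, per event, to data in motion.
def features_for_event(event: dict, window_stats: dict) -> list[float]:
    return make_features(event["amount"], window_stats["count"], window_stats["avg"])
```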
I also see a big opportunity for app platforms to close the skills gap. Vendors need to make it simpler to build applications: products that can help orchestrate workflows, allow more use of fresh data, provide more low-code tools, and support things like domain-specific languages or declarative means of processing data, where the data might be coming from an unbounded stream. I recently saw an amazing developer experience from a company called VANTIQ; they were showing how easy it is to help their customers integrate LLMs with data in motion. Those kinds of orchestration tools are going to be important because there is a skills gap. Not every company has developers who can wire these systems together to build an application.
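For a rough sense of the wiring such tools hide, here is a minimal sketch of a declarative-style pipeline over an unbounded stream with an LLM step. This is illustrative only; it is not VANTIQ’s product or any specific vendor’s API, and the enrichment values and LLM call are placeholders.

```python
# Hedged sketch: a tiny pipeline runner where steps are declared as a list and
# applied to every event from an unbounded source.
from typing import Callable, Iterable

Step = Callable[[dict], dict]

def run_pipeline(source: Iterable[dict], steps: list[Step], sink: Callable[[dict], None]) -> None:
    for event in source:               # in practice: a Kafka/MQTT consumer loop
        for step in steps:
            event = step(event)
        sink(event)

def enrich_with_context(event: dict) -> dict:
    event["avg_pressure_10min"] = 5.1  # would come from a streaming aggregate
    return event

def ask_llm_for_recommendation(event: dict) -> dict:
    event["recommendation"] = "stub LLM response: inspect nozzle 3"  # placeholder LLM call
    return event

if __name__ == "__main__":
    live_events = [{"sensor": "nozzle-3", "pressure": 7.2}]  # stands in for a live stream
    run_pipeline(live_events, [enrich_with_context, ask_llm_for_recommendation], print)
```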
Like you said, Dinesh, there is also a shift in mindset required, because data doesn’t have to come to rest for processing to happen. We used to do things a certain way because that’s what software solutions supported at the time. Now things have changed; they do help you process data in motion. Of course, the other big trend we talked about is providing governance around AI in general, specifically if you’re going to have enterprise apps making active decisions based on high-value data in motion, with parts of the application potentially tapping into an external LLM. The whole area of AI observability, for building trust as you start making autonomous decisions using GenAI, will be very, very important.
Dinesh Chandrasekhar (13:46):
Wow. Thank you, Manish. I think that requires a blog post by itself. There are several points that made absolute sense to me. Having been involved in this space for the last several years, I’ve seen how much more attention is being paid to data in motion compared to a few years ago, when there were only a few major players. Now, with the kinds of applications that have come up, and as you explained how relevant data in motion is today in the context of the GenAI applications we are building, and maybe the low-code, no-code, declarative applications we are building, it makes perfect sense. There has been insane growth of vendors and offerings in this space, and we are just about to get into an even more complex, sophisticated world with more excitement coming our way as technologists. I think we are going to be super thrilled about it.
Thank you for joining us on this episode. I appreciate your insights. We totally loved having you on this episode. Maybe at some future point we’ll bring you back in and discuss what you predicted and how things are going and all that as well. I think that might be a good way to do this too. I wish you wonderful success with your career and look forward to seeing more insights from you in the tech world. Thank you, Manish.
Manish Devgan (15:15):
Thank you, Dinesh.