Arcadia’s update makes it possible for end users to query all the data stored in Hadoop rather than requiring IT organizations to prepare datasets.
Arcadia Data took another step today toward eliminating the need to ask IT staff to query data, adding a search interface based on a natural language engine to its Big Data analytics application.
A forthcoming release of Arcadia Enterprise, due out in the fourth quarter, will let end users query all the data stored in Hadoop, rather than requiring IT organizations to prepare datasets for them to query with rival self-service business intelligence and analytics applications, says Dale Kim, senior director for products and solutions at Arcadia Data.
Instead, end users can make use of a Google-like search engine experience to not just query Arcadia Enterprise, but also build their own dashboards using visual tools built into Arcadia Enterprise, adds Kim.
The result is a more intuitive approach to iteratively interrogating massive amounts of data in near real-time without having to rely on an IT department to set up each query and produce a report, says Kim.
Reliance on data scientists and IT staff has become especially problematic in an era when end users want to continuously update data sets, notes Kim.
“There are a lot more changes to data sets being made these days,” says Kim.
The Arcadia natural language engine makes use of machine learning algorithms to provide type-ahead and suggestion capabilities that recommend related questions users may be interested in. Arcadia Enterprise also scores questions against all datasets in the system. The best answer is displayed along with a list of other possible answers with lower scores. When a user selects an alternative answer, the system learns that that result is a potentially more relevant answer to that question.
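The score-and-learn loop described above can be illustrated with a toy sketch. This is not Arcadia's implementation, which the article does not detail: the class name `ToyAnswerRanker`, the keyword-overlap scoring, the dataset names and the fixed feedback step are all assumptions chosen only to show the general idea of ranking candidate answers and nudging the ranking when a user picks an alternative.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and split it into a set of words."""
    return set(text.lower().split())


class ToyAnswerRanker:
    """Hypothetical ranker: keyword overlap plus a learned per-question boost."""

    def __init__(self, datasets):
        # datasets maps a dataset name to a short keyword description (assumed input).
        self.datasets = {name: tokenize(desc) for name, desc in datasets.items()}
        # (normalized question, dataset name) -> boost accumulated from user feedback
        self.boost = defaultdict(float)

    @staticmethod
    def _key(question):
        # Normalize the question so word order does not matter.
        return " ".join(sorted(tokenize(question)))

    def rank(self, question):
        """Score the question against every dataset, best score first."""
        words = tokenize(question)
        key = self._key(question)
        scored = []
        for name, terms in self.datasets.items():
            overlap = len(words & terms) / len(words) if words else 0.0
            scored.append((name, overlap + self.boost[(key, name)]))
        # Highest score first; ties broken by dataset name for stable output.
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        return scored

    def record_selection(self, question, chosen, step=0.5):
        """Raise the score of the dataset the user actually picked."""
        self.boost[(self._key(question), chosen)] += step


ranker = ToyAnswerRanker({
    "sales_2018": "sales revenue by region",
    "web_logs": "page views by region and day",
})
question = "revenue by region"
print(ranker.rank(question))                  # best answer plus lower-scored alternatives
ranker.record_selection(question, "web_logs")  # the user picks the alternative
print(ranker.rank(question))                  # the picked alternative now ranks higher
```

A production system would presumably rely on far richer signals than word overlap, but the shape is the same: every dataset gets a score, the top score is shown first, and a user's choice of a lower-ranked answer feeds back into later rankings for that question.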
Arcadia Data works natively with big data environments such as Apache Hadoop; cloud stores like Amazon S3 and Azure ADLS; streaming systems like Apache Kafka; and database platforms such as Oracle, Teradata, Snowflake and Amazon Redshift.
Many organizations are struggling to get a return on their Hadoop investments because analyzing data stored in Hadoop requires a small army of data scientists and IT professionals to both set up the platform and then manage how it gets accessed. The Arcadia Data approach essentially takes data scientists and IT staff out of the query management process.
It’s not clear to what degree access to search-based analytics tools will drive increased demand for a more modern approach to building data warehouses. But it is clear end users are being asked to make more business decisions at a faster rate. By providing access to platforms capable of storing massive amounts of data, it becomes possible to generate predictive analytics that tend to be more accurate than those of legacy analytics applications, which rely on samples drawn from subsets of the available data.
Of course, not every business user trusts the quality of the data their organization collects enough to base a business decision on it. But as analytics gets applied across massive amounts of data, it becomes easier to distinguish the signal from the noise and make decisions based more on facts than intuition.