Real-time Analytics News for the Week Ending August 31


In this week’s real-time analytics news: MLCommons released the results of its latest benchmark tests.

Keeping pace with news and developments in the real-time analytics and AI market can be a daunting task. Fortunately, we have you covered with a summary of the items our staff comes across each week. And if you prefer it in your inbox, sign up here!

MLCommons announced results for its industry-standard MLPerf Inference v4.1 benchmark suite, which delivers machine learning (ML) system performance benchmarking in an architecture-neutral, representative, and reproducible manner. This release includes first-time results for a new benchmark based on a mixture of experts (MoE) model architecture. It also presents new findings on power consumption related to inference execution.

MLPerf Inference v4.1 includes 964 performance results from 22 submitting organizations: AMD, ASUSTek, Cisco Systems, Connect Tech Inc, CTuning Foundation, Dell Technologies, Fujitsu, Giga Computing, Google Cloud, Hewlett Packard Enterprise, Intel, Juniper Networks, KRAI, Lenovo, Neural Magic, NVIDIA, Oracle, Quanta Cloud Technology, Red Hat, Supermicro, Sustainable Metal Cloud, and Untether AI. The benchmark results also included the debut of six newly available or soon-to-be-shipped processors.

NVIDIA announced new libraries in accelerated computing to deliver order-of-magnitude speedups and reduce energy consumption and costs in data processing, generative AI, recommender systems, AI data curation, and more. The new libraries include:

  • LLM applications: NeMo Curator, to create custom datasets, adds image curation and Nemotron-4 340B for high-quality synthetic data generation.
  • Data processing: cuVS for vector search to build indexes in minutes instead of days and a new Polars GPU Engine in open beta.

In other NVIDIA news, the company announced NVIDIA NIM Agent Blueprints, a catalog of pre-trained, customizable AI workflows that equip millions of enterprise developers with a full suite of software for building and deploying generative AI applications for canonical use cases. Global system integrators and technology solutions providers Accenture, Deloitte, SoftServe, and World Wide Technology (WWT) are bringing NVIDIA NIM Agent Blueprints to enterprises worldwide. Cisco, Dell Technologies, Hewlett Packard Enterprise, and Lenovo are offering full-stack NVIDIA-accelerated infrastructure and solutions to speed NIM Agent Blueprints deployments.

Accenture and Google Cloud announced that their strategic alliance is advancing solutions for enterprise clients and seeing strong momentum across industries in two critical and related areas: generative AI and cybersecurity. To that end:

  • The two companies are increasing their investments in services that support businesses through every stage of their gen AI projects, including providing the expertise to determine optimal use cases, piloting projects for strategic innovation, and deploying the engineering prowess needed to scale the technology and secure the enterprise.
  • Accenture and Google Cloud are also deepening their security work as clients adapt to new risks unique to gen AI, including securing model data, managing cyberattacks, and delivering remediation services that minimize breach impact and enable faster recovery.

Other real-time analytics news in brief

IBM announced architecture details for the upcoming IBM Telum II Processor and IBM Spyre Accelerator. The new technologies are designed to significantly scale processing capacity across next-generation IBM Z mainframe systems and IBM LinuxONE platforms, helping accelerate the use of traditional AI models and large language models (LLMs).

The key items announced this week include:

  • IBM Telum II Processor: Designed to power next-generation IBM Z systems, the new IBM chip offers higher frequency and memory capacity than the first-generation Telum chip, a 40 percent growth in cache, an integrated AI accelerator core, and a coherently attached Data Processing Unit (DPU). The new processor is expected to support enterprise compute solutions for LLMs, servicing the industry’s complex transaction needs.
  • IO acceleration unit: A completely new Data Processing Unit (DPU) on the Telum II processor chip is engineered to accelerate complex IO protocols for networking and storage on the mainframe. The DPU simplifies system operations and can improve key component performance.
  • IBM Spyre Accelerator: Provides additional AI compute capability to complement the Telum II processor. Working together, the Telum II and Spyre chips form a scalable architecture to support ensemble methods of AI modeling – the practice of combining multiple machine learning or deep learning AI models with encoder LLMs.

In other IBM news, the company and Intel announced a global collaboration to deploy Intel Gaudi 3 AI accelerators as a service on IBM Cloud. This offering is expected to be available in early 2025. The collaboration will also enable support for Gaudi 3 within IBM’s watsonx AI and data platform.

Cerebras Systems announced Cerebras Inference. Delivering 1,800 tokens per second for Llama 3.1 8B and 450 tokens per second for Llama 3.1 70B, Cerebras Inference is, according to the company, 20 times faster than NVIDIA GPU-based solutions in hyperscale clouds. Unlike alternative approaches that compromise accuracy for performance, Cerebras says it offers high performance while maintaining state-of-the-art accuracy by staying in the 16-bit domain for the entire inference run.
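
To put the quoted throughput figures in perspective, the short sketch below converts tokens per second into the time needed to generate a single response. The two rates come from the announcement; the 900-token response length is a hypothetical value chosen for illustration, and the calculation assumes a steady decode rate and ignores prompt processing and network latency.

```python
# Throughput figures quoted in the announcement (tokens per second).
THROUGHPUT_TOKENS_PER_SEC = {
    "Llama 3.1 8B": 1800,
    "Llama 3.1 70B": 450,
}

# Hypothetical response length, chosen only for illustration.
RESPONSE_TOKENS = 900


def generation_seconds(num_tokens: int, tokens_per_sec: float) -> float:
    """Time to emit num_tokens at a steady decode rate.

    Assumption: ignores prompt processing, queuing, and network latency.
    """
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return num_tokens / tokens_per_sec


if __name__ == "__main__":
    for model, rate in THROUGHPUT_TOKENS_PER_SEC.items():
        seconds = generation_seconds(RESPONSE_TOKENS, rate)
        print(f"{model}: {seconds:.2f} s for {RESPONSE_TOKENS} tokens")
```

Under these assumptions, a 900-token response would take about half a second on the 8B model and about two seconds on the 70B model.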

Dremio announced the general availability of new features to its Unified Lakehouse Platform to enhance data analytics capabilities. New features include Live Reflections to ensure that the materialized views and aggregations are automatically updated and Result Set Caching to accelerate query responses across all data sources by storing frequently accessed query results. Additional platform advancements include Reflections, a query acceleration technology; Reflection Recommendations, which analyzes query patterns and recommends views and aggregations; and automatic Iceberg data ingestion, which simplifies and automates the development and management of Apache Iceberg data pipelines.

Lenovo announced that it is one of the first NVIDIA partners to deliver NVIDIA NIM Agent Blueprints to global enterprises. The first NIM Agent Blueprints include a digital human workflow for customer service, a generative virtual screening workflow for accelerated drug discovery, and a multimodal PDF data extraction workflow for enterprise retrieval-augmented generation (RAG) that lets generative AI applications talk to business data for more accurate responses.

Newgen Software announced the release of NewgenONE Marvin – APEX Edition. Features of the APEX Edition include auto-classification and metadata pre-fill to automate document categorization, as well as speech-to-text capabilities. UI design through natural language prompts uses GenAI to speed interface development. Additionally, the new Marvin edition prioritizes security for businesses with role-based access permissions to protect sensitive data and promote Responsible AI. The GenAI update also includes pre-built guardrails, such as Llama Guard 3 and Prompt Guard.

Orby AI (Orby) announced that it has partnered with Databricks to advance enterprise automation powered by Orby’s Large Action Model (LAM). To that end, Orby has joined Databricks’ Built On Partner Program and is leveraging Databricks Mosaic AI to pretrain, build, deploy, and monitor its Large Action Model, ActIO, a deep learning model able to interpret actions and perform complex tasks based on user inputs.

Progress announced the latest release of Progress Semaphore, its metadata management and semantic AI platform. The Semaphore 5.10 platform release introduces intuitive AI-assisted knowledge modeling, extended mapping capabilities, and classification filtering to enhance user productivity and simplify the knowledge management experience. Specifically, new items in Semaphore 5.10 include an AI model builder, filtering of classification results by publish set, mapping relations within the current model, and easy import and export of Shapes Constraint Language (SHACL).

Quest Software released the newest iterations of erwin Data Intelligence and erwin Data Modeler, with advanced data intelligence, quality, and modeling capabilities. The solutions provide a comprehensive approach to managing, governing, and leveraging data across complex enterprise landscapes, ensuring organizations are equipped to meet the challenges of AI and beyond.

Red Hat announced the general availability of Red Hat OpenStack Services on OpenShift, the next major release of the Red Hat OpenStack Platform. Red Hat OpenStack Services on OpenShift opens up a new pathway for organizations to rethink their virtualization strategies, making it easier for them to scale, upgrade, and add resources to their cloud environments. With this offering, enterprises can better unify traditional and cloud-native networks into a singular, modernized network fabric.

Sumo Logic announced that it has signed a Strategic Collaboration Agreement (SCA) with Amazon Web Services (AWS). The SCA will focus on continued innovation to accelerate cybersecurity, application observability, and automation fueled by artificial intelligence (AI). Specifically, service enhancements such as Sumo Logic’s SaaS Log Analytics Platform with Amazon Bedrock and Amazon Security Lake will drive cloud security and observability, providing powerful visibility and transparency across all AWS environments.

Vultr announced that SQream joined the Vultr Cloud Alliance, a partnership program consisting of solutions enabling composable cloud services. By combining Vultr’s high-performance cloud compute, accelerated by NVIDIA GPUs, with SQream’s next-generation, patented GPU-powered data processing, AI-driven enterprises can fast-track their data analysis and machine learning projects without the burden of traditional data processing limitations.

If your company has real-time analytics news, send your announcements to [email protected].

Salvatore Salamone

About Salvatore Salamone

Salvatore Salamone is a physicist by training who has been writing about science and information technology for more than 30 years. During that time, he has been a senior or executive editor at many industry-leading publications including High Technology, Network World, Byte Magazine, Data Communications, LAN Times, InternetWeek, Bio-IT World, and Lightwave, The Journal of Fiber Optics. He also is the author of three business technology books.
