In this week’s real-time analytics news: NVIDIA announced a new NVIDIA AI Foundry service and NVIDIA NIM inference microservices to supercharge GenAI.
Keeping pace with news and developments in the real-time analytics and AI market can be a daunting task. Fortunately, we have you covered with a summary of the items our staff comes across each week. And if you prefer it in your inbox, sign up here!
NVIDIA announced a new NVIDIA AI Foundry service and NVIDIA NIM inference microservices to supercharge generative AI for the world’s enterprises with the Llama 3.1 collection of openly available models, also introduced today.
With NVIDIA AI Foundry, enterprises can create custom “supermodels” for their domain-specific industry use cases using Llama 3.1 and NVIDIA software, computing, and expertise. Enterprises can train these supermodels with proprietary data as well as synthetic data generated from Llama 3.1 405B and the NVIDIA Nemotron Reward model.
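The workflow described above — a large model generating candidate training data that is then filtered by a reward model — can be sketched generically. All function names below are illustrative placeholders, not NVIDIA AI Foundry APIs; `generate()` and `score()` stand in for calls to a teacher model (e.g., Llama 3.1 405B) and a reward model (e.g., Nemotron Reward).

```python
# Generic sketch of reward-filtered synthetic data generation.
# generate() and score() are placeholders for real model calls;
# they are NOT actual NVIDIA or Llama APIs.

def generate(prompt):
    # Placeholder for a call to a large "teacher" model.
    return f"response to: {prompt}"

def score(prompt, response):
    # Placeholder for a reward-model quality score in [0, 1].
    return 0.9

def build_synthetic_dataset(prompts, threshold=0.8):
    """Keep only teacher responses the reward model rates highly."""
    dataset = []
    for p in prompts:
        r = generate(p)
        if score(p, r) >= threshold:  # filter out low-quality pairs
            dataset.append({"prompt": p, "response": r})
    return dataset
```

The resulting prompt/response pairs would then be mixed with proprietary data to fine-tune the custom model.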
In related news, Accenture announced the launch of the Accenture AI Refinery framework, built on NVIDIA AI Foundry, to enable clients to build custom large language models (LLMs) with the Llama 3.1 collection of openly available models. The framework, which sits within Accenture's foundation model services, will help clients build custom LLMs with domain-specific knowledge and deploy powerful AI systems that reflect their unique business needs.

Other real-time analytics news in brief
Yandex Research, in collaboration with researchers from IST Austria, NeuralMagic, and KAUST, has developed two innovative compression methods for large language models: Additive Quantization for Language Models (AQLM) and PV-Tuning. When combined, these methods can reduce model size by up to eight times while retaining 95% of response quality. The methods aim to optimize resources and enhance efficiency in running large language models. The research article detailing this approach was featured at the recent International Conference on Machine Learning (ICML).
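To put the "up to eight times" figure in perspective, the arithmetic below shows what an 8x reduction implies in bits per weight for a model stored in fp16. This is illustrative storage arithmetic only, not the AQLM or PV-Tuning algorithms themselves.

```python
# Illustrative arithmetic only (not the AQLM algorithm itself):
# a 7B-parameter model stored in fp16 uses 2 bytes (16 bits) per weight.
params = 7_000_000_000
fp16_bytes = params * 2            # ~14 GB uncompressed

# An ~8x reduction corresponds to roughly 2 bits per weight on average.
compressed_bytes = params * 2 / 8  # ~1.75 GB

ratio = fp16_bytes / compressed_bytes  # → 8.0
```

At roughly 2 bits per weight, models that previously required data-center GPUs start to fit on a single consumer GPU, which is the practical payoff of this line of research.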
The U.S. National Science Foundation announced the launch of a new initiative that will invest in the development of artificial intelligence-ready test beds, an infrastructure designed to propel responsible AI research and innovation forward. These test beds, or platforms, will allow researchers to study new AI methods and systems in secure, real-world settings. The initiative calls for planning grants from the research community to accelerate the development of the test beds. The test beds will support interdisciplinary collaborations, bringing together private AI laboratories, academia, and third-party evaluators to support the design, development, and deployment of AI systems.
Amazon Web Services (AWS) announced that the next generation of Llama models from Meta are now available on AWS via Amazon Bedrock and Amazon SageMaker. They are also available via Amazon Elastic Compute Cloud (Amazon EC2) using AWS Trainium and Inferentia. The Llama 3.1 models are particularly suited for developers, researchers, and businesses to use for text summarization and classification, sentiment analysis, language translation, and code generation.
Apica announced that its Ascent platform has achieved Powered by Oracle Cloud Expertise and is now available in the Oracle Cloud Marketplace, offering added value to Oracle Cloud Infrastructure (OCI) customers. Powered by OCI, Apica Ascent offers OCI customers benefits, including a visual builder for creating telemetry pipelines; data transformation controls for filtering, normalizing, and enriching data; controls to help reduce data and infrastructure costs; built-in fleet management of data collection agents; and more.
C3 AI announced that it has achieved the Amazon Web Services (AWS) Generative AI Competency. This specialization recognizes C3 AI as an AWS Partner that helps customers and the AWS Partner Network (APN) drive the advancement of services, tools, and infrastructure pivotal for implementing generative AI technologies. Achieving the AWS Generative AI Competency differentiates C3 AI as an AWS Partner that has demonstrated technical proficiency and customer success in areas such as minimizing hallucinations, prompt engineering, model customization, and data privacy.
Census announced its expansion into data transformation and governance tooling with the release of its new Universal Data Platform (UDP), a collaborative workspace where data teams and business teams can come together to work more efficiently. The platform aims to solve the data collaboration challenges that prevent those teams from working together effectively.
Denodo announced that the Denodo Platform now offers seamless integration with Amazon Bedrock large language models (LLMs) to streamline the development of new GenAI enterprise applications with security, privacy, and responsible AI. To that end, the Denodo Platform feeds Amazon Bedrock LLMs with governed, trusted data from all applicable data sources within a retrieval-augmented generation (RAG) framework, leveraging Amazon OpenSearch as the vector database.
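The RAG pattern described above — retrieve relevant governed data from a vector store, then ground the LLM's prompt in it — can be sketched in a framework-agnostic way. All names below are illustrative; Denodo's and Amazon Bedrock's actual APIs differ, and the toy `embed()` stands in for a real embedding model backing a vector database such as OpenSearch.

```python
# Minimal, framework-agnostic sketch of the RAG pattern.
# All names are illustrative placeholders, not Denodo or Bedrock APIs.
import math

def embed(text):
    # Toy "embedding": a bag-of-letters vector. Real systems use a
    # learned embedding model behind a vector database.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - 97] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

def retrieve(query, documents, k=1):
    # Retrieval step: rank governed documents by vector similarity.
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents):
    # Augmentation step: ground the LLM prompt in retrieved data,
    # so answers come from trusted sources rather than model memory.
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

In a production deployment, `build_prompt`'s output would be sent to the hosted LLM (here, one served via Amazon Bedrock); the governance value comes from the retrieval step drawing only on approved, curated sources.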
Groq launched Llama 3.1 models powered by its LPU AI inference technology. Groq partnered with Meta on this launch and runs the latest Llama 3.1 models, including 405B Instruct, 70B Instruct, and 8B Instruct. The three models are available on GroqCloud Dev Console, a community of over 300K developers already building on Groq systems, and on GroqChat for the general public.
Hitachi Vantara, a subsidiary of Hitachi, Ltd., announced the general availability of the initial offerings in the Hitachi iQ portfolio of AI-ready infrastructure, solutions, and services. The first Hitachi iQ infrastructure offering, which includes NVIDIA DGX BasePOD certification, meets the highest standards of performance and reliability and helps customers seamlessly power their most critical AI applications.
Iterative announced the upcoming release of DataChain, a new open-source tool for processing and evaluating unstructured data. DataChain democratizes popular AI-based analytical capabilities such as 'large language models (LLMs) judging LLMs' and multimodal GenAI evaluations, greatly leveling the playing field for data curation and pre-processing. DataChain can also store and structure Python object responses using the latest data model schemas, such as those used by leading LLM and AI foundation model providers.
John Snow Labs announced the release of Automated Responsible AI Testing Capabilities in the Generative AI Lab. This no-code tool tests and evaluates the safety and efficacy of custom language models. It enables non-technical domain experts to define, run, and share test suites for AI model bias, fairness, robustness, and accuracy. This capability is based on John Snow Labs’ open-source LangTest library, which includes more than 100 test types for different aspects of Responsible AI, from bias and security to toxicity and political leaning. LangTest uses Generative AI to automatically generate test cases, making it practical to produce a comprehensive set of tests in minutes instead of weeks.
Mistral AI and NVIDIA released a new state-of-the-art language model, Mistral NeMo 12B, that developers can easily customize and deploy for enterprise applications supporting chatbots, multilingual tasks, coding, and summarization. By combining Mistral AI’s expertise in training data with NVIDIA’s optimized hardware and software ecosystem, the Mistral NeMo model offers high performance for diverse applications.
Progress announced the availability of Progress MarkLogic FastTrack, a UI toolkit for building data- and search-driven applications to visually explore complex connected data stored in the Progress MarkLogic platform. With this release, Progress enables IT and data experts to better collaborate on providing decision-makers with easy, visual access to the enterprise data they need to inform strategic business initiatives. In addition, organizations can accelerate business insights and add depth to their AI-driven applications and systems by enhancing their ability to automate data analysis and interpretation.
Soracom announced two new services designed to accelerate the deployment of larger and more complex IoT projects by embedding Generative AI (GenAI) capabilities more deeply in the IoT connectivity stack. Soracom Flux is a low-code application builder that allows even non-technical users to build AI-integrated IoT applications in real time by defining data flows between sensors, cameras, actuators, GenAI engines, and the cloud. Soracom Query Intelligence simplifies the management of large IoT deployments with natural-language network data analysis.
Snowflake announced that it will host the Llama 3.1 collection of multilingual open-source large language models (LLMs) in Snowflake Cortex AI for enterprises to easily harness and build AI applications at scale. This offering includes Meta’s largest and most powerful open-source LLM, Llama 3.1 405B, with Snowflake developing and open-sourcing the inference system stack to enable real-time, high-throughput inference and further democratize powerful natural language processing and generation applications.
If your company has real-time analytics news, send your announcements to [email protected].
In case you missed it, here are our most recent previous weekly real-time analytics news roundups:
- Real-time Analytics News for the Week Ending July 20
- Real-time Analytics News for the Week Ending July 13
- Real-time Analytics News for the Week Ending June 29
- Real-time Analytics News for the Week Ending June 22
- Real-time Analytics News for the Week Ending June 15
- Real-time Analytics News for the Week Ending June 8
- Real-time Analytics News for the Week Ending June 1
- Real-time Analytics News for the Week Ending May 25
- Real-time Analytics News for the Week Ending May 18