eCommerceNews US - Technology news for digital commerce decision-makers

Red Hat unveils enhanced AI tools for hybrid cloud deployments


Red Hat has expanded its AI portfolio, introducing Red Hat AI Inference Server along with validated models and new API integrations, aimed at enabling more efficient enterprise AI deployments across diverse environments.

Red Hat AI Inference Server, now included in the Red Hat AI suite, provides scalable, consistent, and cost-effective inference for hybrid cloud setups. This server is integrated into the newest releases of both Red Hat OpenShift AI and Red Hat Enterprise Linux AI, while also being available as a standalone product. The offering is designed to optimise performance, flexibility, and resource usage for organisations deploying AI-driven applications.
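The article does not describe the server's client interface, but inference servers in this space (vLLM among them) commonly expose an OpenAI-compatible HTTP API. A minimal, illustrative sketch of constructing such a request — the endpoint path, model name, and prompt are assumptions for illustration, not details from the announcement:

```python
import json

# Hypothetical endpoint; a real deployment would supply its own host and port.
ENDPOINT = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model, prompt, max_tokens=128):
    """Construct an OpenAI-compatible chat completion request body,
    the de facto wire format accepted by many inference servers."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# Model name is illustrative only.
payload = build_chat_request("example-8b-instruct", "Summarise our Q3 results.")
print(json.dumps(payload, indent=2))
```

Because the payload shape is standardised, the same client code can target different serving backends — one practical sense in which an inference layer decouples applications from any single model host.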

To address the challenge many enterprises face in model selection and deployment, Red Hat has announced the availability of third-party validated AI models, accessible on Hugging Face. These models are tested to ensure optimal performance on the Red Hat AI platform. Red Hat also offers deployment guidance to assist customers, with select models benefiting from model compression techniques that reduce their size and increase inference speed. This approach is intended to minimise computational resources and operating costs, while the ongoing validation process helps customers keep pace with the latest in generative AI innovation.
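The memory saving from compression is straightforward to estimate for weight-only quantisation, one common compression technique (the parameter count and bit widths below are illustrative, not figures from the announcement):

```python
def model_size_gb(n_params, bits_per_weight):
    """Approximate weight memory for a model stored at a given precision."""
    return n_params * bits_per_weight / 8 / 1e9  # bits -> bytes -> GB

# An example 8-billion-parameter model at common precisions:
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{model_size_gb(8e9, bits):.0f} GB")
# 16-bit: ~16 GB, 8-bit: ~8 GB, 4-bit: ~4 GB
```

Halving the bits per weight roughly halves the memory footprint, which is why compressed models can fit on smaller accelerators and serve more concurrent requests per GPU.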

The company has begun integrating the Llama Stack, developed by Meta, alongside Anthropic's Model Context Protocol (MCP), offering standardised APIs for building and deploying AI applications and agents. Currently available in developer preview in Red Hat AI, Llama Stack delivers a unified API that includes support for inference with vLLM, retrieval-augmented generation, model evaluation, guardrails, and agent functionality. MCP, meanwhile, enables AI models to connect with external tools using a standardised interface, facilitating API and plugin integrations during agent workflows.
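MCP messages are JSON-RPC 2.0 requests; a tool invocation uses the protocol's `tools/call` method. A minimal sketch of building such a message — the tool name and arguments are hypothetical, for illustration only:

```python
import itertools
import json

# JSON-RPC requires a unique id per request.
_ids = itertools.count(1)

def mcp_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking an MCP server to run a tool,
    per the Model Context Protocol's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Hypothetical tool exposed by some MCP server:
msg = mcp_tool_call("search_orders", {"customer_id": "42"})
print(json.dumps(msg))
```

Because every tool is addressed through the same message shape, an agent can discover and invoke tools from any compliant server without bespoke plugin code — the standardisation the article describes.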

The new version of Red Hat OpenShift AI (v2.20) introduces enhancements that support the development, training, deployment, and monitoring of both generative and predictive AI models at scale. A technology-preview model catalogue offers access to validated Red Hat and third-party models, while distributed training capabilities via the Kubeflow Training Operator enable efficient scheduling and execution of AI model tuning across multiple nodes and GPUs, with support for remote direct memory access (RDMA) networking and optimised GPU utilisation to reduce operational costs. A feature store based on the Kubeflow Feast project is also available in technology preview, providing a central repository for managing and serving features, intended to improve the accuracy and reusability of models.
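The Kubeflow Training Operator schedules distributed jobs through Kubernetes custom resources; a multi-node tuning job might be declared roughly as follows (the job name, container image, and replica counts are illustrative, not from the announcement):

```yaml
apiVersion: kubeflow.org/v1
kind: PyTorchJob
metadata:
  name: model-tuning            # illustrative name
spec:
  pytorchReplicaSpecs:
    Master:
      replicas: 1
      restartPolicy: OnFailure
      template:
        spec:
          containers:
            - name: pytorch     # the Training Operator expects this container name
              image: example.registry/tuning:latest   # illustrative image
              resources:
                limits:
                  nvidia.com/gpu: 1
    Worker:
      replicas: 3               # the operator fans tuning out across these nodes
      restartPolicy: OnFailure
      template:
        spec:
          containers:
            - name: pytorch
              image: example.registry/tuning:latest
              resources:
                limits:
                  nvidia.com/gpu: 1
```

Declaring the job this way lets Kubernetes handle placement, restarts, and GPU allocation across the worker replicas, rather than the user orchestrating each node by hand.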

Red Hat Enterprise Linux AI 1.5 introduces updates that extend the platform's reach and its multi-language support. The platform is now available on Google Cloud Marketplace, joining AWS and Azure among the public clouds where customers can run their AI workloads. Enhanced language capabilities for Spanish, German, French, and Italian have been added through InstructLab, enabling model customisation in these languages. Customers can also bring their own teacher models for more detailed tuning, with support for Japanese, Hindi, and Korean planned for the future.

Additionally, the Red Hat AI InstructLab on IBM Cloud service is now generally available, aimed at simplifying model customisation and improving scalability for customers wishing to use unique data sets for AI development.

Red Hat states its long-term aim is to provide a universal inference platform that allows organisations to deploy any AI model on any accelerator and across any cloud provider. The company's approach seeks to help enterprises avoid infrastructure silos and better realise the value of their investments in generative AI.

Joe Fernandes, Vice President and General Manager of the AI Business Unit at Red Hat, said, "Faster, more efficient inference is emerging as the newest decision point for gen AI innovation. Red Hat AI, with enhanced inference capabilities through Red Hat AI Inference Server and a new collection of validated third-party models, helps equip organisations to deploy intelligent applications where they need to, how they need to and with the components that best meet their unique needs."

Michele Rosen, Research Manager at IDC, commented on shifting enterprise AI needs: "Organisations are moving beyond initial AI explorations and are focused on practical deployments. The key to their continued success lies in the ability to be adaptable with their AI strategies to fit various environments and needs. The future of AI not only demands powerful models, but models that can be deployed with agility and cost-effectiveness. Enterprises seeking to scale their AI initiatives and deliver business value will find this flexibility absolutely essential."

Red Hat's recent portfolio enhancements are in line with the view outlined by Forrester, which stated that open source software will be instrumental in accelerating enterprise AI programmes.
