DeepInfra in the Media

7 mentions across press, blogs, and newsletters

March 2026

z.ai debuts faster, cheaper GLM-5 Turbo model for agents and 'claws' — but it's not open-source

Chinese AI startup Z.ai, known for its powerful, open source GLM family of large language models (LLMs), has introduced GLM-5-Turbo, a new, proprietary variant of its open source GLM-5 model aimed at agent-driven workflows, with

Mar 16, 2026

February 2026

WebProNews

Nvidia’s $68 Billion Quarter Proves the AI Gold Rush Is Far From Over — It’s Accelerating

Nvidia posted record quarterly revenue of $68.1 billion, with data center sales surging 75% to $62.3 billion. Guidance of $78 billion for next quarter crushed estimates, signaling t

Feb 26, 2026

Dataconomy

NVIDIA Blackwell Ultra delivers 50x higher efficiency for agentic AI

Feb 17, 2026

TechBuzz

Nvidia's Blackwell Ultra Slashes AI Agent Costs 35x

New benchmarks show Nvidia's latest chip delivers 50x performance boost for agentic AI

Feb 16, 2026

Nvidia Blog

New Data Shows NVIDIA Blackwell Ultra Delivers up to 50x Better Performance and 35x Lower Costs for Agentic AI

The NVIDIA Blackwell platform has been widely adopted by leading inference providers such as Baseten, DeepInfra, Fireworks AI and Together AI to reduce cost per token by up to 10x. Now, the NVIDIA Blackwell Ultra platform is taking this momentum further for agentic AI. AI agents and coding assistant

Feb 16, 2026

Nvidia Blog

Leading Inference Providers Cut AI Costs by up to 10x With Open Source Models on NVIDIA Blackwell

A diagnostic insight in healthcare. A character’s dialogue in an interactive game. An autonomous resolution from a customer service agent. Each of these AI-powered interactions is built on the same unit of intelligence: a token. Scaling these AI interactions requires businesses to consider whether t

Feb 12, 2026

VentureBeat

AI inference costs dropped up to 10x on Nvidia's Blackwell — but hardware is only half the equation

Lowering the cost of inference is typically a combination of hardware and software. A new analysis released Thursday by Nvidia details how four leading inference providers are reporting 4x to 10x reductions in cost per token. The dramatic cost reductions were achieved using Nvidia's Bl

Feb 12, 2026