DeepInfra Raises $107M in Series B to Scale AI Inference Infrastructure

The spotlight moved, and most people missed the handoff. Training soaked up the applause, but inference kept the receipts. DeepInfra just locked in $107M in Series B funding, and the message is clear. Inference isn’t the backend anymore. It’s the business.

Nikola Borisov, Founder and CEO, alongside co-founders Georgios Papoutsis and Yessenzhar Kanapin, has built this from the metal up since September 2022. Not metaphorically. Literally. GPU to API. No shortcuts, no dependency theater. Because when a single token demands 100B+ operations, you don’t outsource performance and hope for the best. You own it, tune it, and squeeze every millisecond until it behaves.

This isn’t their first time dancing at scale. The same crew previously engineered the infrastructure behind imo, pushing 1B+ downloads and 200M monthly active users through pipes that didn’t blink under pressure. That kind of scar tissue doesn’t show up in pitch decks, but it shows up in products that actually work when demand spikes and excuses run dry.

The May 4, 2026, round pulls in a serious table: 500 Global and Georges Harik co-led, with NVIDIA, Felicis, A Capital, Crescent Cove, PEAK6, Samsung Next, Supermicro, Upper90, SV Angel, Silkroad Innovation Hub, BrightCap Ventures, and Chamaeleon all stepping in. That’s not just capital. That’s alignment across silicon, systems, and scale. When the hardware guys and the capital guys nod at the same time, something real is happening.

DeepInfra’s bet is simple, but not easy. Open-source models win, but only if the infrastructure doesn’t choke. Companies want control without lock-in, performance without mystery pricing, and reliability that doesn’t require a support ticket and a prayer. So DeepInfra built an inference cloud that stays fast, stays predictable, and stays out of the way. Over 100 models, clean APIs, and infrastructure that treats latency like a bug, not a feature.

They’re also not hiding behind vague trust claims. DeepInfra is SOC 2 and ISO 27001 certified. That’s the difference between saying you’re enterprise-ready and actually being invited into the room.

Look at the team and you see the pattern. Engineers like Iskren Chernev, Pernekhan Utemuratov, Patrick Horn, Stefan Fidanov, Vasil Lyutskanov, Shang-Pin Shang, and Thach Nguyen building the core. Operators across customer success, sales, finance, and recruiting tightening the system around it. This isn’t a science project. It’s a machine.

The takeaway sits right under the surface. They didn’t chase the loudest problem. They chased the most expensive one. Inference is where cost, speed, and reliability collide. Fix that, and everything upstream suddenly looks smarter.