Gimlet Labs Raises $80M Series A to Optimize AI Inference Across Multi-Silicon Infrastructure
Funding Details: $80M Series A
There’s a tax in AI nobody budgeted for. It doesn’t show up on paper or in pitch decks, but it shows up fast once systems hit scale. It’s waste: expensive silicon running at a fraction of its potential while everyone pretends throwing more GPUs at the problem is a strategy. The industry loves to talk about power. Very few are talking about precision.
Enter Gimlet Labs, stepping out of San Francisco with $80M in Series A funding and a point to prove. Menlo Ventures led the round, with Eclipse, Factory, Prosperity7 Ventures, and Triatomic Capital doubling down like they’ve seen this movie before and know how it ends. Total funding now sits at $92M, and this isn’t a science project. They launched in October 2025 already printing 8-figure revenue. That’s not hype. That’s velocity.
Credit where it’s due. Zain Asgar, co-founder and CEO, along with co-founders Michelle Nguyen, Omid Azizi, Natalie Serrino, and James Bartlett, didn’t just stumble into this lane. They’ve been in the infrastructure trenches, building systems that actually survive contact with reality. You can feel that in the product. No theatrics, just execution.
Gimlet Labs is building a serverless inference layer for AI agents, but calling it that almost undersells the move. This is a multi-silicon inference cloud that treats hardware like a lineup, not a limitation. Every stage of an agent workload gets routed to the chip that actually fits: GPUs, CPUs, whatever gets the job done, without forcing developers to rewrite everything just to play nice. It’s less about brute force and more about precision. Like using a scalpel in a world addicted to sledgehammers.
And the timing is sharp. AI has officially graduated from demo to deployment, and suddenly inference is the bill nobody wants to look at. Enterprises are realizing their infrastructure is running at a fraction of its potential, while costs keep climbing like they’ve got something to prove. Gimlet’s answer is simple in theory, brutal in execution. Decouple the workload from the hardware and let the system decide what runs where. Efficiency stops being a buzzword and starts showing up on the balance sheet.
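To make the decoupling idea concrete, here is a minimal sketch of what stage-to-chip routing could look like. Everything in it is a hypothetical illustration, not Gimlet's actual system or API: the backend names, the supported-stage sets, and the cost figures are invented for the example.

```python
# Hypothetical sketch of hardware-aware stage routing.
# Backends, stages, and costs are illustrative assumptions, not Gimlet's real system.
from dataclasses import dataclass, field

@dataclass
class Backend:
    name: str
    cost_per_unit: float          # assumed relative cost per unit of work
    supports: set = field(default_factory=set)  # stages this chip handles well

BACKENDS = [
    Backend("gpu-node", 0.60, {"llm_decode", "embedding"}),
    Backend("cpu-node", 0.08, {"tokenize", "retrieval", "embedding"}),
]

def route(stage: str) -> Backend:
    """Pick the cheapest backend that supports this stage of the agent workload."""
    candidates = [b for b in BACKENDS if stage in b.supports]
    if not candidates:
        raise ValueError(f"no backend supports stage {stage!r}")
    return min(candidates, key=lambda b: b.cost_per_unit)

# An agent pipeline: each stage is dispatched independently,
# so cheap stages never occupy expensive silicon.
for stage in ["tokenize", "retrieval", "embedding", "llm_decode"]:
    print(stage, "->", route(stage).name)
```

The point of the sketch is the shape of the decision, not the numbers: once the workload is split into stages, placement becomes a per-stage cost problem instead of a one-size-fits-all GPU reservation.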
The traction tells its own story. Customer base tripled in 5 months. Deployed across AI native companies and Fortune 500 environments. Quiet nods to a major model provider and a hyperscaler in the mix. No name dropping, just results. That’s usually how you know it’s real.
What stands out here isn’t just the funding. It’s the discipline. They didn’t chase noise. They found the bottleneck everyone else was stepping over and built the release valve. In a market obsessed with building the next brain, Gimlet Labs is making sure the body can actually move.
There’s a lesson in that for anyone building right now. The winners won’t just be the ones who create intelligence. It’ll be the ones who make it usable, scalable, and economically sane. The ones who turn chaos into flow.