
Human Archive Raises $8.2M Seed to Build the Data Layer Behind Physical AI

Human Archive raised $8.2M from Wing Venture Capital, NVP Capital, and Y Combinator to build multimodal datasets powering robotics and embodied AI.

Human Archive, a San Francisco-based robotics data company and member of Y Combinator's Winter 2026 batch, has raised $8.2M in seed funding from Wing Venture Capital, NVP Capital, and Y Combinator. The company is building large-scale multimodal datasets designed to train robotics systems and embodied AI models operating in the physical world. Human Archive was founded by Raj Patel, Rushil Agarwal, Samay Maini, and Shloke Patel. The team is focused on a problem receiving increasing attention across robotics, AI infrastructure, and venture capital: high-quality real-world training data remains one of the scarcest resources in physical AI development.

The funding arrives as investors increasingly look beyond model development and toward the infrastructure layer supporting next-generation AI systems. Just as cloud computing became foundational to software, data infrastructure is becoming foundational to robotics. For operators, investors, and researchers building embodied intelligence systems, the Human Archive funding round is less about a seed check and more about where capital is beginning to concentrate across the AI stack.

What Happened

The artificial intelligence industry spent the last decade teaching machines to understand text and images. Robotics presents a different challenge entirely. A robot doesn't just need to understand the world. It needs to move through it, manipulate it, react to it, and survive contact with reality. Reality is an unforgiving teacher.

Human Archive has built a business around collecting multimodal datasets designed specifically for robotics and embodied AI systems. The company's platform captures synchronized streams of RGB video, depth data, tactile information, motion capture signals, and audio from real-world environments. Those environments are not controlled research labs. They include homes, restaurants, hotels, farms, construction sites, transportation environments, and industrial settings where unpredictability is the operating system.
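The article does not describe how Human Archive synchronizes its streams, so the following is only an illustrative sketch of one common approach: aligning every sensor stream to a reference stream (here, RGB frame timestamps) by nearest timestamp, and dropping frames where any sensor is too far out of step. The function names, the stream names, the sample values, and the 20 ms tolerance are all hypothetical.

```python
from bisect import bisect_left


def nearest_index(times, t):
    """Index of the timestamp in sorted, non-empty `times` closest to `t`."""
    i = bisect_left(times, t)
    if i == 0:
        return 0
    if i == len(times):
        return len(times) - 1
    # Pick whichever neighbour is closer; ties go to the earlier sample.
    return i if times[i] - t < t - times[i - 1] else i - 1


def align(reference, others, tolerance):
    """Pair each reference timestamp with the nearest sample of every stream.

    reference: sorted list of timestamps in seconds (e.g. RGB frames)
    others:    dict mapping a stream name to a sorted list of (timestamp, value)
    tolerance: maximum allowed time gap in seconds; a reference frame is
               dropped if any stream has no sample within this gap
    """
    times = {name: [ts for ts, _ in samples] for name, samples in others.items()}
    aligned = []
    for t in reference:
        frame = {"t": t}
        for name, samples in others.items():
            if not samples:
                break  # an empty stream can never be aligned
            ts, value = samples[nearest_index(times[name], t)]
            if abs(ts - t) > tolerance:
                break  # this sensor is too far out of step for this frame
            frame[name] = value
        else:
            aligned.append(frame)  # every stream matched within tolerance
    return aligned


# Hypothetical sample data: three RGB frames plus depth and tactile readings.
rgb_times = [0.00, 0.10, 0.20]
streams = {
    "depth": [(0.01, "d0"), (0.11, "d1"), (0.35, "d2")],
    "tactile": [(0.00, "t0"), (0.05, "t1"), (0.10, "t2"), (0.15, "t3"), (0.20, "t4")],
}
frames = align(rgb_times, streams, tolerance=0.02)
```

In this toy data the third RGB frame is discarded because the closest depth sample is 90 ms away, which is the kind of gap a real collection pipeline would have to handle at much larger scale.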

Human Archive's dataset offerings include HA-Multi and HA-Ego, which capture synchronized multimodal interactions so that robotics companies and embodied AI developers can train systems on real-world human behavior. The funding round gives Human Archive additional capital to expand its data collection infrastructure, contributor network, hardware deployment, and dataset offerings. Its customers are robotics developers, frontier AI labs, robotics foundation model teams, and research organizations building embodied intelligence systems.

Why This Matters

The robotics industry is facing a problem that sounds deceptively simple. Robots need experience. Not simulated experience alone. Not carefully curated demonstrations. Actual exposure to how humans interact with tools, objects, environments, and each other. Collecting that type of data is expensive, operationally complex, and difficult to scale. Human Archive is attacking that problem directly.

According to company-reported figures, Human Archive has built a contributor network reaching into the tens of thousands, developed custom hardware systems, assembled a 25-person operations team, and created infrastructure capable of collecting thousands of hours of multimodal data daily. Many AI companies can build models. Far fewer can build operational systems capable of generating proprietary real-world data at scale.

History has a habit of rewarding the companies supplying critical inputs. The headlines usually celebrate the applications. The infrastructure providers often end up controlling the leverage. Human Archive is making a bet that physical AI and multimodal robotics data become critical inputs for the next generation of intelligent machines.

Market Context

Physical AI has moved from academic discussions to boardroom strategy. Companies building humanoid robots, warehouse automation systems, autonomous machines, industrial robotics platforms, and embodied AI systems are all chasing the same objective: creating machines that can function reliably in unstructured environments.

The market has attracted growing investment because investors recognize a familiar pattern. Large language models accelerated when internet-scale datasets became available. Robotics may be entering a similar phase, in which progress is tied to access to large-scale, high-quality physical-world data. That shift creates a new category of AI infrastructure companies.

Human Archive sits within that emerging layer, building the data foundation required by robotics companies, frontier AI labs, and embodied intelligence developers. The company isn't selling robots. It is building one of the resources robots need most.

Competitive Landscape

Every major technology cycle creates a hidden layer of companies operating beneath the headlines. Consumers recognize ChatGPT. Enterprise buyers recognize Microsoft. Investors recognize Nvidia. Yet every breakthrough rests on infrastructure most people never see. Robotics is beginning to develop a similar stack.

Human Archive differentiates itself through multimodal data collection. Rather than relying exclusively on video, the company combines vision, depth, tactile feedback, motion capture, and audio into unified datasets that more closely resemble how humans experience the physical world. Humans don't learn from a single sensor. Robots probably won't either.

The more accurately a dataset captures physical interactions, the more valuable that dataset becomes for training embodied AI systems capable of operating outside controlled environments.

What This Signals

The Human Archive funding round highlights a broader shift occurring across artificial intelligence. Attention is moving beyond models. Investors are increasingly evaluating the infrastructure required to support the next generation of AI systems. Data pipelines, robotics datasets, evaluation frameworks, collection platforms, and physical-world training environments are becoming strategic assets.

Raj Patel, Rushil Agarwal, Samay Maini, and Shloke Patel recognized a reality many operators are now discovering firsthand. Building robotics intelligence requires building data infrastructure first. That may not be the most glamorous layer of the stack. It is often the layer that determines who wins.

The Bigger Industry Shift

The AI industry spent years digitizing language. The next decade may focus on digitizing physical experience. That transition creates opportunities for companies capable of capturing, organizing, labeling, and structuring real-world interactions at scale.

Human Archive's $8.2M seed round reflects growing investor conviction that physical AI and embodied intelligence will require their own ecosystem of infrastructure providers. The founders are building one piece of that foundation. Whether robotics ultimately becomes a trillion-dollar market remains unknown. What appears increasingly clear is that the winners will need more than smarter models. They will need better data. And somebody has to collect it.

Frequently Asked Questions

What is Human Archive?

Human Archive is a San Francisco-based robotics data company that builds multimodal datasets for robotics, physical AI, and embodied AI systems.

How much funding did Human Archive raise?

Human Archive raised $8.2M in seed funding from Wing Venture Capital, NVP Capital, and Y Combinator.

Who founded Human Archive?

Human Archive was founded by Raj Patel, Rushil Agarwal, Samay Maini, and Shloke Patel.

What are HA-Multi and HA-Ego?

HA-Multi and HA-Ego are Human Archive datasets designed to capture synchronized multimodal data for robotics and embodied AI training.

Who are Human Archive's customers?

Human Archive serves robotics companies, frontier AI labs, robotics foundation model developers, and research organizations building embodied intelligence systems.

Why is multimodal data important for robotics?

Multimodal data combines video, depth, tactile signals, motion capture, and audio, helping robots learn from multiple sensory inputs and operate more effectively in real-world environments.

What is embodied AI?

Embodied AI refers to artificial intelligence systems that learn from and interact with the physical world through sensors, movement, perception, and environmental feedback.

Why does this funding matter?

The funding highlights growing investor interest in robotics infrastructure and signals that real-world training data is becoming a strategic asset for physical AI development.