In February 2024, three people sat in a San Francisco apartment and decided to build a robot brain. Chelsea Finn had spent years at Google Brain and Stanford teaching machines to understand the physical world. Sergey Levine had been at Berkeley doing the same. Lachy Groom, the Stripe alum who had placed early bets on some of the most valuable companies in tech, wrote the first check.
With $1.1 billion raised, one open-source release, and a $5.6 billion valuation behind it, their company Physical Intelligence released π0.7, a model that does what no robot has reliably done before: figure out a task it was never taught.
That is not incremental progress. It is the kind of threshold the robotics industry has been trying to cross for thirty years.
What π0.7 actually does
Physical Intelligence's core insight is that robot software has been built backwards. Historically, every task has required its own training pipeline: collect data on Task A, train a model to do Task A, deploy. Task B means starting over. That approach works for factory arms that repeat the same weld ten thousand times. It breaks down the moment a robot encounters something new.
π0.7 takes a different approach. The model treats robotic skills as combinable building blocks — like words in a sentence. It learns the components of manipulation (grasping, folding, inserting, pouring) from diverse data across multiple robot types, then recombines them for tasks it has never seen.
Physical Intelligence put π0.7 on a bimanual UR5e robot setup the model had never been trained on, gave it a pile of laundry, and watched it fold shirts with zero task-specific demonstrations. The success rate matched that of expert human teleoperators using the same hardware for the first time.
The model also used an air fryer to cook a sweet potato, relying on just two fragmented training episodes (closing a drawer, placing a bottle) plus web pretraining. With step-by-step verbal coaching, success jumped from 5% to 95%.
This matters because it moves robotics away from the "one task, one model" paradigm that has limited commercial deployment. In principle, a warehouse robot running π0.7 could be asked to unload a truck, fold cardboard boxes, and sort returns without being separately programmed for each task.
The people behind the model
Physical Intelligence's founding team reads like a who's who of modern robotics AI. Finn and Levine are two of the most cited researchers in the field. Groom, who previously built Stripe's payments infrastructure and invested in companies like Figma and Deel at their earliest stages, handles the business side. The engineering team includes alumni from DeepMind, Google Robotics, Tesla, and Waymo.
The company has raised $1.1 billion to date, with its Series B led by CapitalG, Alphabet's growth fund. Jeff Bezos, OpenAI, Lux Capital, Sequoia, and Thrive Capital are also investors. At a $5.6 billion valuation, PI is the most valuable pure-play robotics AI company in the world.
$1.1B total raised · $5.6B valuation · 50+ researchers and engineers · 8 distinct robot platforms supported · π0.7 matches RL-tuned specialists on espresso-making, box assembly, and laundry
Why this changes the robotics economics
For founders and CTOs deploying automation, the cost structure of traditional robotics is punishing. Programming a single manipulation task can take months and cost hundreds of thousands of dollars. Each facility modification — different conveyor height, different box size, different product — requires re-engineering.
Physical Intelligence's model runs in the cloud, not on-device. A developer streams RGB-D camera images to PI's runtime, which tokenizes the visual stream along with movement history and feeds it to a 3-to-5-billion-parameter transformer. Users give plain-language goals: "pack chocolates into this box." The model outputs 50 motor commands per second.
That architecture means updates ship to an API, not to physical hardware. The same model improvement reaches every deployed robot simultaneously.
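To make the cloud-inference loop above concrete, here is a minimal Python sketch of what the robot-side client might look like. It is not PI's actual API. The names Observation, Policy, and run_episode are hypothetical, and so is the assumption that each request returns a short chunk of commands rather than a single one. Only the 50-commands-per-second rate, the RGB-D input, the movement history, and the plain-language goal come from the description above.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

CONTROL_HZ = 50               # from the article: 50 motor commands per second
PERIOD_S = 1.0 / CONTROL_HZ   # 20 ms between consecutive commands


@dataclass
class Observation:
    rgbd_frame: bytes                      # placeholder for one encoded RGB-D image
    recent_actions: List[Sequence[float]]  # movement history sent with the frame
    goal: str                              # plain-language instruction


# Hypothetical policy signature: one observation in, a chunk of commands out.
Policy = Callable[[Observation], List[Sequence[float]]]


def run_episode(policy: Policy,
                get_frame: Callable[[], bytes],
                send_command: Callable[[Sequence[float]], None],
                goal: str,
                n_steps: int,
                history_len: int = 10) -> int:
    """Send n_steps commands to the robot; return how many policy requests were made."""
    history: List[Sequence[float]] = []
    pending: List[Sequence[float]] = []
    requests = 0
    for _ in range(n_steps):
        if not pending:
            # Buffer is empty: ask the (assumed remote) policy for more commands.
            obs = Observation(get_frame(), history[-history_len:], goal)
            pending = list(policy(obs))
            requests += 1
            if not pending:
                raise RuntimeError("policy returned no commands")
        command = pending.pop(0)
        send_command(command)
        history.append(command)
        # A real client would wait here until the next PERIOD_S tick.
    return requests
```

Under these assumptions, a chunk of 25 commands covers half a second of motion, so the client needs only two round trips per second to sustain a 50 Hz command stream. That is one way a cloud-hosted model could tolerate network latency, though the article does not say how PI handles it.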
What skeptics are watching
Open question: Does the cloud dependency limit PI's addressable market to indoor logistics environments, or can edge inference eventually bridge the gap?
The GPT moment argument
The comparison to GPT-3 is not casual. Before GPT-3, language models required task-specific fine-tuning for every application. GPT-3 showed that a sufficiently large, diverse model could generalize across tasks with just a prompt. The entire LLM application ecosystem — from chatbots to code generation to document analysis — emerged from that demonstration.
π0.7 is making a similar claim about the physical world. Its compositional generalization, zero-shot cross-embodiment transfer, and ability to follow natural language instructions point toward a future where robot software is no longer the bottleneck. As PI researcher Kyle Vedder put it: "Any robot hardware maker will be able to buy physical intelligence, collect some data on their embodiment, and see our many capabilities transfer."
As we wrote in June, Neura Robotics raised $1.4 billion for humanoid robots, and the sector's total funding is on pace to exceed $15 billion in 2026. The capital is flowing not just to hardware makers but increasingly to the intelligence layer — the software that decides what the hardware actually does.
What comes next
Physical Intelligence is exploring an automated robotic research scientist: an agent that ingests multimodal evaluation data, identifies why a robot failed, and suggests hypotheses to improve the model. The company has hinted that its next version will move beyond manipulation into mobile manipulation — robots that can navigate while carrying out tasks.
For founders building in automation, logistics, or manufacturing, the message is straightforward. The unit economics of deploying robots have just shifted. The first wave of physical AI is not about hardware breakthroughs. It is about a model that finally learned how to fold a shirt.