There’s a video of a robot at Figure’s lab making coffee. Not the “robot arm precisely follows a pre-programmed path to operate a Keurig” kind of making coffee. The kind where you say “hey, make me a coffee” and it figures out the steps — find the mug, identify the coffee machine, press the right buttons, bring it to you. The kind of making coffee that requires understanding what coffee is.
We’ve had industrial robots for decades. Welding robots in car factories. Pick-and-place machines in electronics manufacturing. Assembly line arms that repeat the same motion 10,000 times a day with sub-millimeter precision. Those robots are impressive but dumb. They do exactly what they’re programmed to do and nothing else.
What’s different now is the AI part. Robots are learning to see, understand, and adapt. And that changes everything about what robots can do.
The Three Things AI Gives Robots
Eyes that understand. Computer vision combined with depth sensors lets robots build a 3D model of their environment in real time. Not just “there’s an object at coordinates (3, 4, 2)” but “that’s a coffee mug, it’s upright, it’s on the edge of the table, and it looks fragile.” This semantic understanding is what makes a robot useful in an unstructured environment like your kitchen versus a structured one like a factory floor.
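To make the “coordinates plus meaning” distinction concrete, here’s a minimal sketch of how a detection from a vision model gets combined with a depth reading to place a labeled object in 3D. Everything here is hypothetical — the `Detection` class, the attribute names, and the camera parameters are illustrative stand-ins, and the math is just the standard pinhole back-projection, not any particular robot’s perception stack.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """Hypothetical output of a vision model: a semantic label plus
    the pixel location of the object and a predicted attribute."""
    label: str
    u: int       # pixel column of the object's center
    v: int       # pixel row of the object's center
    fragile: bool

def to_world(det, depth_map, fx=600.0, fy=600.0, cx=320.0, cy=240.0):
    """Back-project a 2D detection into 3D camera coordinates using the
    depth at its pixel and a pinhole camera model (fx, fy: focal lengths
    in pixels; cx, cy: principal point)."""
    z = depth_map[(det.u, det.v)]          # metres, from the depth sensor
    x = (det.u - cx) * z / fx
    y = (det.v - cy) * z / fy
    return {"label": det.label,
            "xyz": (round(x, 3), round(y, 3), z),
            "fragile": det.fragile}

# Toy depth map with a single reading at the mug's pixel.
depth = {(400, 260): 0.8}
mug = Detection(label="coffee mug", u=400, v=260, fragile=True)
print(to_world(mug, depth))
# -> {'label': 'coffee mug', 'xyz': (0.107, 0.027, 0.8), 'fragile': True}
```

The point of the output is that both halves travel together: the geometry a factory robot would need, and the semantics (“coffee mug”, “fragile”) that an unstructured environment demands.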
A brain that plans. This is where LLMs come in, and honestly, it surprised me how well this works. Google’s RT-2 takes a natural language instruction like “pick up the Coke can and put it in the recycling” and figures out the motor commands to make it happen — including for objects and situations it wasn’t explicitly trained on. The same language understanding that powers ChatGPT turns out to be really useful for telling robots what to do.
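For a rough sense of the pattern, here’s a sketch of the “language model as task planner” idea: an instruction goes in, a sequence of named skill calls comes out. To be clear, this is not how RT-2 itself works — RT-2 emits low-level action tokens directly — and the LLM is mocked with a lookup stub. The skill names and the plan format are invented for illustration.

```python
# Skills the robot is assumed to already know how to execute.
SKILLS = {"locate", "grasp", "move_to", "release"}

def mock_llm_plan(instruction: str) -> list[tuple[str, str]]:
    """Stand-in for a real LLM call that would decompose the
    instruction into (skill, argument) pairs."""
    if "recycling" in instruction.lower():
        return [("locate", "coke can"), ("grasp", "coke can"),
                ("move_to", "recycling bin"), ("release", "coke can")]
    return []  # unknown instruction: plan nothing rather than guess

def execute(instruction: str) -> list[str]:
    """Validate each planned step against the known skill set,
    then 'run' it (here: just log it)."""
    log = []
    for skill, arg in mock_llm_plan(instruction):
        assert skill in SKILLS, f"unknown skill: {skill}"
        log.append(f"{skill}({arg})")
    return log

print(execute("pick up the Coke can and put it in the recycling"))
# -> ['locate(coke can)', 'grasp(coke can)', 'move_to(recycling bin)',
#     'release(coke can)']
```

The validation step matters: a planner that hallucinates a skill the robot doesn’t have should fail loudly before anything moves.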
Hands that learn. Traditional robots need every motion pre-programmed. AI-powered robots learn from demonstration — you show them how to fold a towel, and they figure out the general principle of towel-folding, then adapt it to towels of different sizes and shapes. Stanford’s Mobile ALOHA system learned cooking, cleaning, and organizing tasks from watching humans. Not perfectly, but well enough to be useful.
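The learn-from-demonstration idea can be sketched in a few lines: record (observation, action) pairs from a human, then have the policy imitate the closest recorded observation. Real systems like Mobile ALOHA train neural networks on camera images and joint trajectories; this toy version uses a nearest-neighbour lookup over made-up towel dimensions purely to show the principle of generalizing from a handful of demos.

```python
# Demonstrations: (towel_width_cm, towel_length_cm) -> fold point in cm.
# These numbers are invented for illustration.
demos = [
    ((30.0, 50.0), 25.0),
    ((40.0, 70.0), 35.0),
    ((25.0, 25.0), 12.5),
]

def policy(obs):
    """Behaviour-cloning toy: return the action paired with the
    demonstrated observation closest to the current one."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, action = min(demos, key=lambda d: dist(d[0], obs))
    return action

# A towel size that was never demonstrated still gets a sensible fold point.
print(policy((34.0, 58.0)))  # -> 25.0
```

Swap the lookup for a trained network and the table for thousands of recorded trajectories, and you have the basic shape of imitation learning — including its weakness: far from any demonstration, the policy’s output stops being trustworthy.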
Where Robots Actually Work Today
Warehouses are the success story. Amazon has over 750,000 robots in its fulfillment centers. These aren’t humanoid robots strolling around — they’re mostly flat platforms that carry shelving units to human pickers, plus robotic arms that sort and pack items. The AI handles navigation in a dynamic environment where thousands of robots and humans share space. It’s the largest deployment of AI robotics in the world, and it works.
Surgery has gone further than most people realize. The da Vinci surgical system has been used in over 12 million procedures. AI provides tremor filtering (steadier than any human hand), 3D magnified visualization, and assistance with instrument positioning. Surgeons are still in control — the robot enhances their capabilities rather than replacing them.
Agriculture is surprisingly advanced. There are robots that selectively pick strawberries — only the ripe ones, leaving unripe fruit for later. Other robots identify and eliminate weeds without herbicides. The challenge here is variability — every field is different, every plant is slightly different, and the lighting changes throughout the day. AI handles this variability in ways traditional programming simply cannot.
The Humanoid Race
Everyone’s building humanoid robots now, and opinions vary wildly on whether this is genius or hubris.
Figure 01 and 02 are the most impressive demos I’ve seen. Natural language interaction, adaptive behavior, and manipulation that actually looks fluid rather than jerky. The partnership with OpenAI means Figure’s robots understand context and instructions in a way that feels genuinely intelligent.
Tesla’s Optimus gets the most press because it’s Tesla. Progress has been faster than critics expected — recent demos show Optimus walking, picking up objects, and performing simple tasks. Whether Elon’s timeline promises are realistic is a separate question (spoiler: they’re probably not).
Boston Dynamics’ Atlas is the OG humanoid robot, and the electric version is genuinely impressive athletically. Backflips, parkour, dynamic obstacle navigation. But the gap between “impressive demo” and “useful product” remains wide.
My honest take on humanoids: the question isn’t whether they’ll eventually work — they will. The question is whether a humanoid form factor is the right approach. Why build a human-shaped robot to operate a dishwasher when you could build a better dishwasher? Humanoid robots make sense for environments designed for humans (homes, offices, stores). Purpose-built robots make sense for specific tasks (warehouses, surgery, farming).
The Unsolved Problems
Generalization remains the hard part. A robot trained to make coffee in Kitchen A struggles with Kitchen B, where the coffee machine is different and the mugs are in a different cabinet. Humans handle this effortlessly. Robots need either extensive retraining or foundation models that can generalize — and we’re not quite there yet.
Safety around humans. A warehouse robot that bumps into a shelf is annoying. A home robot that bumps into a toddler is unacceptable. The safety requirements for human-proximate robots are orders of magnitude higher than for industrial robots, and we’re still developing the standards and technologies to meet them.
Cost is prohibitive for consumers. Figure hasn’t announced consumer pricing, but estimates put humanoid robots at $50,000-100,000 initially. That’s a car, not an appliance. Consumer robotics needs to get to the $5,000-10,000 range to achieve mass adoption.
My Five-Year Prediction
Warehouse and logistics robots will be everywhere. Surgical robots will expand to more procedure types. Agricultural robots will become common on large farms. Home robots will still be expensive novelties — useful enough to justify the price for wealthy early adopters, but not yet the Roomba-like household staple.
The wildcard is whether foundation models for robotics achieve a breakthrough in generalization. If a robot can learn a new task from a 30-second demonstration rather than hours of training, the economics change completely. Several research groups are getting close. The next few years will be fascinating.
Originally published: March 15, 2026