Robotics Foundation Models: Early Ecosystems and First Movers; Backflipping Robots are Nothing New
By George Chowdhury | 09 Mar 2026 | IN-8075
NEWS: Backflipping Robots and the Fevered Dream of Generalist Robotics
Spearheaded by NVIDIA, the promise of robotics foundation models (generalist behavioral models that can extend robot capabilities) has captured the imagination of investors. Events such as Unitree’s captivating humanoid performance during Chinese New Year celebrations in Beijing further stoke the furor. Progress has undoubtedly been made in cost, hardware capabilities, and the use of reinforcement learning to dramatically speed up the deployment of policies (such as dancing and acrobatics). However, decision makers must remain grounded. Backflipping humanoids are not new: Boston Dynamics demonstrated such behaviors in 2017. The value of this technology, beyond entertainment, has not yet fully materialized.
CES 2026 saw a new wave of excitement surrounding robotics and (ambiguously defined) Physical Artificial Intelligence (AI) technologies. Various facets of the supply chain—notably semiconductor vendors—have pivoted their portfolios away from declining automotive markets toward the promise of humanoids and robotics. Qualcomm, AMD, Intel, and Ambarella have developed Systems-on-Chip (SoCs) optimized for deploying foundation models at the edge. Other silicon vendors, including NXP, Microchip, and STMicroelectronics, have a renewed interest in the market and are discovering the best ways to meet the demands of a potential boom. Motor and actuator vendors, including Maxon and Harmonic Drive, are also developing portfolios. Big Tech, hyperscalers, and automotive vendors all want a piece of the pie.
The market has responded with unprecedented valuations for early-stage innovators (generally humanoid vendors). Robotics company Figure currently leads the humanoid-specific cohort with a staggering US$39 billion valuation, followed by 1X at US$10 billion and Apptronik at US$5 billion. Conversely, the only companies with live, revenue-generating deployments—Agility Robotics and Unitree—hold significantly lower valuations at US$2.1 billion and US$1.7 billion, respectively. In the software-centric robot-agnostic foundation model space, Physical Intelligence and Skild AI have secured valuations of US$5.6 billion and US$4.5 billion, indicating a market preference for the promise of generalized intelligence over today’s tangible, if limited, embodied AI products. This influx of capital highlights a massive bet on the arrival of general-purpose automation in the physical world, despite the widening gap between speculative worth and current industrial utility. The market is showing major signs of a misalignment between valuations and value.
IMPACT: Foundation Models for Robotics: Why Do Microsoft, Meta, OpenAI, Apple, Tesla, NVIDIA, and Google Have One?
Robotics foundation models, commonly built on the transformer architecture, are pitched as ChatGPT for robotics: generalists that can field any request and perform any action. At least that is the vision. Robots are only as good as the training (programming) they are given. Teaching many different actions (make a cup of tea and then vacuum the floor) is highly difficult because of the number of environmental parameters and the possible permutations of how a task can be performed. A robotics foundation model is designed to store these myriad policies and their variants and then, at runtime, infer which of its trained policies fits the scenario it faces. There is a lot that can go wrong, which is why Unitree’s demonstrations rely on very simple, pre-trained sequences.
Foundation models for robotics come in several flavors today. Vision Language Models (VLMs) are trained on language and visual data (multimodal), but control of the robot is left to separate circuitry that contains strictly taught policies or behaviors. A newer iteration is the Vision Language Action (VLA) model, which extends the foundation model’s modalities further by incorporating action, or actuator control, directly into the model as output. Proponents envision VLAs as the ultimate generalists and target them at humanoid robots. Each of the Big Tech players, and many humanoid vendors, has a proprietary (or open-source) solution in this space, primarily because they can afford the significant compute overhead of training and running these models.
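To make the VLM/VLA distinction concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the stub functions `stub_vlm` and `stub_vla` stand in for real models with simple keyword rules, and the skill table and three-joint action vectors are invented. It does not represent any vendor's API.

```python
from typing import Callable, Dict, List

Observation = Dict[str, object]   # e.g. camera frames, joint states (assumed shape)
Action = List[float]              # one target value per joint (assumed: 3 joints)


# --- VLM-style pipeline: the model only chooses WHAT to do ------------------
def pick_policy(obs: Observation) -> Action:
    """Hand-taught behavior living outside the model (illustrative values)."""
    return [0.1, 0.0, -0.2]


def place_policy(obs: Observation) -> Action:
    """Second hand-taught behavior (illustrative values)."""
    return [0.0, 0.3, 0.1]


SKILLS: Dict[str, Callable[[Observation], Action]] = {
    "pick": pick_policy,
    "place": place_policy,
}


def stub_vlm(obs: Observation, instruction: str) -> str:
    """Stand-in for a VLM: maps vision + language to a skill label."""
    return "pick" if "pick" in instruction.lower() else "place"


def vlm_pipeline(obs: Observation, instruction: str) -> Action:
    skill = stub_vlm(obs, instruction)   # model output is a symbol
    return SKILLS[skill](obs)            # a separate controller produces motion


# --- VLA-style pipeline: the model emits the action itself ------------------
def stub_vla(obs: Observation, instruction: str) -> Action:
    """Stand-in for a VLA: maps vision + language directly to joint targets."""
    bias = 0.1 if "pick" in instruction.lower() else -0.1
    return [bias * (joint + 1) for joint in range(3)]


def vla_pipeline(obs: Observation, instruction: str) -> Action:
    return stub_vla(obs, instruction)    # no hand-written skill table in between
```

In the first pipeline the set of possible motions is fixed by the skill table, which makes behavior easier to audit but limits generality. In the second, the model's output space is the actuator space itself.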
But there is a critical flaw: foundation models are black boxes. As has recently been widely publicized, OpenAI’s own AI engineers have no idea what connections are being formed within ChatGPT. This is a major issue for robotics. Indeterminacy means that unexpected (unsafe) behaviors can occur and a robot may not do the same task in the same way repeatedly—damaging the overall value offering. Robotics vendors attempt to mitigate these issues by limiting the motion of actuators—the range of motion an arm, leg, or hand can produce.
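The mitigation described above, limiting actuator motion, can be sketched as a filter between the model and the motors. This is a minimal illustration under stated assumptions, not a description of any vendor's safety layer: the `JointLimit` type, the per-tick step bound, and the numeric limits are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class JointLimit:
    lower: float     # minimum allowed position, radians (assumed unit)
    upper: float     # maximum allowed position, radians
    max_step: float  # largest allowed change per control tick, radians


def clamp_command(current: float, target: float, limit: JointLimit) -> float:
    """Bound the per-tick change first, then the absolute range."""
    step = max(-limit.max_step, min(limit.max_step, target - current))
    return max(limit.lower, min(limit.upper, current + step))


def filter_action(
    current_positions: List[float],
    model_action: List[float],
    limits: List[JointLimit],
) -> List[float]:
    """Apply joint limits to whatever the (black-box) model requested."""
    return [
        clamp_command(cur, tgt, lim)
        for cur, tgt, lim in zip(current_positions, model_action, limits)
    ]
```

For example, with `JointLimit(-1.0, 1.0, 0.1)` a requested jump from 0.0 to 5.0 is reduced to 0.1 for that tick. A filter like this bounds how far an unexpected output can move the robot, but it does not make the model's choices any more repeatable.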
Despite these teething issues, all stakeholders want to be the one to develop the generalist robotics brain and optimize deployment for their hardware or cloud infrastructure.
RECOMMENDATIONS: Opportunity Does Not Weigh All Market Verticals Equally
Beyond the dream of humanoids, incumbent robotics Original Equipment Manufacturers (OEMs) (Yaskawa, FANUC, ABB, KUKA) have bought into the potential of foundation models for robotics by partnering with NVIDIA to augment robot controllers. Many such companies have maintained closed technology ecosystems throughout their 150-year-plus histories. This sudden change in mentality is brought on by the advancement, and encroachment, of Chinese robotics OEMs into industrial manufacturing verticals. Major stumbling blocks to deploying foundation models in existing processes remain. First, for most robot manufacturing cells, the technology offers only marginal efficiency improvements over procedurally coded counterparts. Second, data gaps and exception-causing edge cases challenge the near-term practical value of AI for robotics. Finally, Systems Integrators (SIs) do not want to manage the complexity of these deployments, which require niche product knowledge. As a result, the retrofit potential of robotics AI is severely limited. Primary applications within industrial manufacturing include Computer Numerical Control (CNC) machine tending and augmenting inspection.
Warehousing and logistics shows much higher potential for the uptake of robotics foundation models in the near term. Generally greenfield deployment environments and highly manual work provide opportunity for machines that can improve efficiency, quality control, and throughput. Accordingly, Third-Party Logistics (3PL) providers such as GXO, Amazon, DHL, and FedEx are all experimenting with emerging robotics technologies, humanoids, and foundation models. Innovators such as Boston Dynamics have found commercial opportunity: its Stretch robot is used for truck unloading. Startups such as Covariant, Dexterity, Nomagic, and Plus One Robotics have also found a receptive audience in the warehousing and logistics market, although they often must offer assurances in the form of teleoperation. Verticalized vendors, including Symbotic, Zebra, and Ocado, lean heavily into these emerging robotics technologies. With all these solutions, robotics foundation models are restricted to the manipulation of specific goods in specific ways; common applications include picking and palletization with Collaborative Robots (cobots). Deployers still regularly come up against data gaps and edge cases. Innovators should look to this market first, but be aware of razor-thin profit margins.
Both market verticals discussed above benefit from well-established integrator and Service Lifecycle Management (SLM) support networks. When deploying technologies in new markets—in order for generalist robotics to capture more of the economy—similar infrastructure will first have to be established. Those that build this infrastructure—in the form of cloud services, resilient AI models, or highly transparent and serviceable hardware—will ultimately win out.
Written by George Chowdhury