Nvidia CEO Jensen Huang said last year that we are now entering the age of physical AI. While the company continues to offer LLMs for software use cases, Nvidia is increasingly positioning itself as a provider of AI models for fully AI-powered systems, including agentic AI in the physical world.
At CES 2026, Nvidia announced a slate of new models designed to push AI agents beyond chat interfaces and into physical environments.
Nvidia launched Cosmos Reason 2, the latest version of its vision-language model designed for embodied reasoning. Cosmos Reason 1, released last year, introduced a two-dimensional ontology for embodied reasoning and currently leads Hugging Face’s physical reasoning for video leaderboard.
Cosmos Reason 2 builds on the same ontology while giving enterprises more flexibility to customize applications and enabling physical agents to plan their next actions, similar to how software-based agents reason through digital workflows.
Nvidia also released a new version of Cosmos Transfer, a model that lets developers generate training simulations for robots.
Other vision-language models, such as Google’s PaliGemma and Pixtral Large from Mistral, can process visual inputs, but not all commercially available VLMs support reasoning.
“Robotics is at an inflection point. We’re moving from specialist robots limited to single tasks to generalist specialist systems,” said Kari Briski, Nvidia vice president for generative AI software, in a briefing with reporters. She was referring to robots that combine broad foundational knowledge with deep task-specific skills. “These new robots combine broad fundamental knowledge with deep proficiency and complex tasks.”
She added that Cosmos Reason 2 “enhances the reasoning capabilities that robots need to navigate the unpredictable physical world.”
Moving to physical agents
Briski noted that Nvidia’s roadmap follows “the same pattern of assets across all of our open models.”
“In building specialized AI agents, a digital workforce, or the physical embodiment of AI in robots and autonomous vehicles, more than just the model is needed,” Briski said. “First, the AI needs the compute resources to train, simulate the world around it. Data is the fuel for AI to learn and improve and we contribute to the world's largest collection of open and diverse datasets, going beyond just opening the weights of the models. The open libraries and training scripts give developers the tools to purpose-build AI for their applications, and we publish blueprints and examples to help deploy AI as systems of models.”
The company now has open models for each of these areas: Cosmos for physical AI, the open-reasoning vision-language-action (VLA) model Gr00t for robotics, and its Nemotron models for agentic AI.
Nvidia is making the case that open models across different branches of AI form a shared enterprise ecosystem that feeds data, training, and reasoning to agents in both the digital and physical worlds.
Additions to the Nemotron family
Briski said Nvidia plans to continue expanding its open models, including its Nemotron family, beyond reasoning to include a new RAG and embeddings model to make information more readily available to agents. The company released Nemotron 3, the latest version of its agentic reasoning models, in December.
Nvidia announced three new additions to the Nemotron family: Nemotron Speech, Nemotron RAG and Nemotron Safety.
In a blog post, Nvidia said Nemotron Speech delivers “real-time low-latency speech recognition for live captions and speech AI applications” and is 10 times faster than other speech models.
Nemotron RAG technically comprises two models: an embedding model and a rerank model, both of which can understand images to provide more multimodal insights that data agents will tap.
“Nemotron RAG is on top of what we call the MMTab, or the Massive Multilingual Text Embedding Benchmark, with strong multilingual performance while using less computing power and memory, so they’re a good fit for systems that must handle multiple requests very quickly and with low delay,” Briski said.
Nemotron Safety detects sensitive data so AI agents don’t accidentally unleash personally identifiable data.

