Wujie Dongli Unveils World's First Long-Sequence Bidirectional Physical Causal Chain Latent Space World Model MWA
Embodied AI startup Wujie Dongli has officially released MWA™ (Embodied General Brain), which it describes as the world's first latent space world model built on a 'long-sequence bidirectional physical causal chain'. The model adopts a 'bidirectional dynamics' architecture that performs inference in a unified latent space, and it introduces inverse dynamics modeling at the level of temporal chunks, which the company says enables stable planning of continuous action sequences longer than 10 seconds. According to the company, this addresses two persistent challenges for robots: generalization across scenarios and high-precision execution.
Technical Approach: Latent Space World Model + Reinforcement Learning
Wujie Dongli chose a 'latent space world model + reinforcement learning' route, setting it apart from mainstream VLA (Vision-Language-Action) models. In the company's framing, VLA models rely on imitation learning, lack an understanding of physical causality, and therefore generalize poorly. MWA instead builds a 'worldview' through the latent space world model, so that the robot can comprehend physical laws and causal relationships, while reinforcement learning shapes its 'values', turning that understanding into precise execution strategies through trial and error and reward feedback.
Core Innovation: Latent Actions and Long-Sequence Bidirectional Causal Chain
MWA uses 'Latent Actions' as carriers of physical causality. An inverse dynamics encoder transforms the visual change between observations into a high-dimensional vector, which removes the dependence on manual action labels and enables pre-training on massive amounts of unlabeled internet video. The model adopts a 'bidirectional dynamics' architecture: inverse dynamics infers causes from effects, forward dynamics predicts effects from causes, and a 'forward-inverse mutual review mechanism' repeatedly checks the two directions against each other to improve the accuracy of causal reasoning.
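The forward-inverse check described above can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not MWA's implementation: the additive dynamics, the `noise` parameter, and the function names `forward_dynamics`, `inverse_dynamics`, `review_error`, and `best_reviewed_action` are all hypothetical. It shows the general idea of inferring a latent action from a state change and then using the forward model to verify that the inferred action reproduces the observed effect.

```python
from typing import List, Sequence

Vector = List[float]


def forward_dynamics(state: Sequence[float], latent_action: Sequence[float]) -> Vector:
    """Toy forward model (cause -> effect): predict the next latent state.

    Assumption: dynamics are purely additive, which a real model would not be.
    """
    return [s + a for s, a in zip(state, latent_action)]


def inverse_dynamics(state: Sequence[float], next_state: Sequence[float],
                     noise: float = 0.0) -> Vector:
    """Toy inverse model (effect -> cause): infer the latent action.

    `noise` is a hypothetical knob that simulates an imperfect encoder.
    """
    return [n - s + noise for s, n in zip(state, next_state)]


def review_error(state: Sequence[float], next_state: Sequence[float],
                 latent_action: Sequence[float]) -> float:
    """Squared error between the forward prediction and the observed next state."""
    predicted = forward_dynamics(state, latent_action)
    return sum((p - n) ** 2 for p, n in zip(predicted, next_state))


def best_reviewed_action(state: Sequence[float], next_state: Sequence[float],
                         candidates: Sequence[Vector]) -> Vector:
    """Keep the candidate action whose forward prediction best matches the effect."""
    return min(candidates, key=lambda a: review_error(state, next_state, a))


state, next_state = [0.0, 1.0], [0.5, 0.5]
exact = inverse_dynamics(state, next_state)              # [0.5, -0.5]
noisy = inverse_dynamics(state, next_state, noise=0.25)  # [0.75, -0.25]
print(review_error(state, next_state, noisy))            # 0.125
print(best_reviewed_action(state, next_state, [noisy, exact]) == exact)  # True
```

In this toy setup the forward pass acts as a verifier for the inverse pass: an inferred action that does not reproduce the observed change is scored worse and discarded.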
Building on this, MWA introduces what the company calls a 'long-sequence bidirectional physical causal chain', moving beyond single-step, instantaneous inference. It performs inverse dynamics modeling at the level of temporal chunks, producing continuous multi-step Latent Action Chunks from visual sequences longer than 10 seconds, which significantly reduces the 'snowball effect' of error accumulation.
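To make the 'snowball effect' concrete, the following back-of-the-envelope sketch compares how error might compound under single-step inference versus chunk-level inference. Every number here is an assumption made for illustration: the 30 Hz control rate, the 1% relative error per inference, the chunk size of 30 steps, and the multiplicative compounding model itself. None of these come from the announcement, and the helper names are hypothetical.

```python
def inferences_needed(horizon_steps: int, chunk_size: int) -> int:
    """Number of model calls needed to cover the horizon (ceiling division)."""
    return -(-horizon_steps // chunk_size)


def compounded_error(per_inference_error: float, num_inferences: int) -> float:
    """Relative error after chaining inferences, assuming each call multiplies
    the accumulated error by (1 + per_inference_error)."""
    return (1.0 + per_inference_error) ** num_inferences - 1.0


HORIZON_STEPS = 10 * 30   # assumed: 10 seconds at a 30 Hz control rate
PER_CALL_ERROR = 0.01     # assumed: 1% relative error per inference

single_step = compounded_error(PER_CALL_ERROR, inferences_needed(HORIZON_STEPS, 1))
chunked = compounded_error(PER_CALL_ERROR, inferences_needed(HORIZON_STEPS, 30))

print(f"single-step: {single_step:.2f}, chunked: {chunked:.2f}")
print(chunked < single_step)  # True
```

Under these assumptions the chunked rollout chains 10 inferences instead of 300, so far less error is carried forward. A real system would not follow such a clean compounding law, and per-chunk error would likely differ from per-step error.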
Benchmark Results and Funding
On the RoboCasa GR1 TableTop benchmark co-organized by Stanford University, MWA reportedly achieved the world's highest average task success rate, 75.2%, surpassing models such as NVIDIA's GR00T-N1.6. The company has completed over $200 million in angel round funding, and its Pre-A round of nearly $200 million is nearing completion, with investors including Sequoia China, Linear Capital, and JD.com-affiliated funds.
Negative Sample Data System: AnyPhys for RL
To address the industry's data bias toward positive samples, Wujie Dongli built AnyPhys, a core data system centered on negative samples. It interleaves deep negative samples, boundary instability samples, suboptimal samples, and positive samples to build what the company calls a high-information-density 'physical boundary coordinate system'. This supplies the sample dimensions needed for dense reinforcement learning training and improves the model's robustness to interference in real-world conditions.
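One simple way to interleave the four sample categories named above is weighted sampling when assembling each training batch. The sketch below is illustrative only: the mixing weights, the pool contents, and the `sample_batch` helper are hypothetical and do not describe how AnyPhys is actually built.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical mixing weights; the announcement does not give actual ratios.
SAMPLE_WEIGHTS: Dict[str, float] = {
    "deep_negative": 0.2,
    "boundary_instability": 0.3,
    "suboptimal": 0.2,
    "positive": 0.3,
}


def sample_batch(pools: Dict[str, List[str]], weights: Dict[str, float],
                 batch_size: int, rng: random.Random) -> List[Tuple[str, str]]:
    """Draw a batch that mixes categories in proportion to `weights`.

    Returns (category, episode_id) pairs so that downstream reward shaping
    can treat each category differently.
    """
    categories = list(weights)
    picks = rng.choices(categories,
                        weights=[weights[c] for c in categories],
                        k=batch_size)
    return [(category, rng.choice(pools[category])) for category in picks]


# Placeholder episode identifiers standing in for real trajectories.
pools = {c: [f"{c}_{i}" for i in range(3)] for c in SAMPLE_WEIGHTS}
batch = sample_batch(pools, SAMPLE_WEIGHTS, batch_size=8, rng=random.Random(0))
for category, episode in batch:
    print(category, episode)
```

Keeping the category label attached to each episode is a design choice of this sketch: it lets a training loop assign different rewards or penalties to failures, near-failures, and successes.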