Tencent Unveils Industry’s First Ultra-Long-Range World Model Supporting Native 3D Reconstruction

Sep 07, 2025

On September 2, 2025, Tencent Hunyuan officially announced the release of HunyuanWorld-Voyager (Hunyuan Voyager for short), the industry’s first ultra-long-range world model supporting native 3D reconstruction.

The model ranked first overall on WorldScore, a world model benchmark released by Fei-Fei Li’s team at Stanford University, surpassing existing open-source methods in both video generation and 3D reconstruction.

Voyager ranked first in the world model rankings

Voyager also achieved superior results in both video generation and video 3D reconstruction.

According to the official introduction, Hunyuan Voyager focuses on extending AI into spatial intelligence, providing high-fidelity 3D scene navigation for fields such as virtual reality, physical simulation, and game development.

The model overcomes the limits of traditional video generation in spatial consistency and exploration range. It can generate long-distance, world-consistent navigation scenes and supports exporting the generated video directly to 3D formats.

Hunyuan Voyager’s 3D input and 3D output are highly compatible with the previously open-sourced Hunyuan World Model 1.0. This extends the 1.0 model’s roaming range, improves generation quality for complex scenes, and allows stylized control and editing of generated scenes.

Furthermore, the model supports a variety of 3D understanding and generation applications, including video scene reconstruction, 3D object texture generation, customized video style generation, and video depth estimation.

Officials stated that Hunyuan Voyager, for the first time, combines spatial and feature-based approaches to support native 3D memory and scene reconstruction, avoiding the latency and accuracy loss associated with traditional post-processing.

Additionally, the 3D input conditions ensure accurate perspective, and the model outputs 3D point clouds directly, making it suitable for a variety of application scenarios.

The additional depth information also supports video scene reconstruction, 3D object texture generation, stylized editing, and depth estimation.
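
To illustrate how per-pixel depth relates to a 3D point cloud in general, here is a minimal sketch of standard pinhole-camera back-projection. It is not Hunyuan Voyager’s code or method. The function name `depth_to_point_cloud`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the tiny sample depth map are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_point_cloud(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a depth map into camera-space 3D points.

    Assumes an ideal pinhole camera (hypothetical intrinsics):
    fx, fy are focal lengths in pixels, (cx, cy) is the principal point.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):        # v: pixel row
        for u, z in enumerate(row):        # u: pixel column, z: depth
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Illustrative 2x2 depth map; one pixel has no valid depth.
depth = [[2.0, 2.0],
         [0.0, 4.0]]
cloud = depth_to_point_cloud(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(cloud)
```

Each valid pixel becomes one 3D point, so a video with per-frame depth can in principle be lifted into a point cloud once camera poses are known. How Voyager itself produces and fuses its point clouds is not described in the announcement.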