Sicheng Zuo
I am a third-year Ph.D. student in the i-VisionGroup in the Department of Automation, Tsinghua University, advised by Prof. Jiwen Lu. In 2023, I received my B.S. degree from the Department of Automation, Tsinghua University.
I am interested in computer vision and deep learning. My current research focuses on autonomous driving and vision foundation models.
Email / Google Scholar / GitHub
|
|
|
News
2025-09: One paper on 3D occupancy prediction is accepted to NeurIPS 2025.
2025-06: One paper on embodied 3D occupancy prediction is accepted to ICCV 2025.
2025-02: One paper on 3D occupancy prediction is accepted to CVPR 2025.
2024-07: One paper on image representation learning is accepted to ECCV 2024.
|
|
Publications
* Equal contribution. † Project leader.
|
|
QuadricFormer: Scene as Superquadrics for 3D Semantic Occupancy Prediction
Sicheng Zuo*, Wenzhao Zheng*†, Xiaoyong Han*, Longchao Yang, Yong Pan, Jiwen Lu
The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025.
[arXiv]
[Code]
[Project Page]
QuadricFormer proposes geometrically expressive superquadrics as scene primitives, enabling efficient and powerful object-centric representation of driving scenes.
|
|
GaussianWorld: Gaussian World Model for Streaming 3D Occupancy Prediction
Sicheng Zuo*, Wenzhao Zheng*†, Yuanhui Huang, Jie Zhou, Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025.
[arXiv]
[Code]
GaussianWorld reformulates 3D occupancy prediction as a 4D occupancy forecasting problem conditioned on the current sensor input and proposes a Gaussian World Model to exploit the scene evolution for perception.
|
|
EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding
Yuqi Wu*, Wenzhao Zheng*†, Sicheng Zuo, Yuanhui Huang, Jie Zhou, Jiwen Lu
IEEE/CVF International Conference on Computer Vision (ICCV), 2025.
[arXiv]
[Code]
[Project Page]
EmbodiedOcc formulates an embodied 3D occupancy prediction task and employs a Gaussian-based framework to accomplish it.
|
|
SpatialFormer: Towards Generalizable Vision Transformers with Explicit Spatial Understanding
Han Xiao*, Wenzhao Zheng*, Sicheng Zuo, Peng Gao, Jie Zhou, Jiwen Lu
European Conference on Computer Vision (ECCV), 2024.
[Paper]
SpatialFormer proposes an efficient vision transformer architecture with explicit spatial understanding for generalizable image representation learning.
|
|
PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction
Sicheng Zuo*, Wenzhao Zheng*, Yuanhui Huang, Jie Zhou, Jiwen Lu
arXiv, 2023.
[arXiv]
[Code]
[中文解读 (in Chinese)]
As the first 2D-projection-based method for 3D semantic occupancy prediction, PointOcc outperforms all other methods by a large margin while running much faster.
|
© Sicheng Zuo | Last updated: October 8, 2025.