Zuoyue Li - 李 作越
I am a Ph.D. student in the Computer Vision and Geometry (CVG) group at ETH Zurich, supervised by Prof. Marc Pollefeys. My research interests focus on 3D vision and 3D generative models, and I collaborate closely with Prof. Martin R. Oswald and Prof. Zhaopeng Cui. My doctoral research was mainly funded by the Swiss Data Science Center (SDSC) fellowships. I am currently a research intern at Google Zurich, working on generative AI and digital humans. Previously, I was a research engineer intern at Meta Zurich, working on 3D object detection and scene understanding, and an overseas researcher in the Computer Vision Group at the Institute of Industrial Science (IIS), The University of Tokyo (東京大学), supervised by Prof. Yoichi Sato and funded by the Japan Society for the Promotion of Science (JSPS) fellowships. I obtained my M.Sc. degree in Computer Science with distinction at ETH Zurich and completed my B.Eng. degree in Electronic and Information Engineering as an outstanding graduate at Zhejiang University (浙江大学).
Email / GitHub / Google Scholar / LinkedIn
Research
3D Urban Scene Generation from Satellite Images with Diffusion
Generalize diffusion models to 3D sparse space and perform urban scene generation on a given or predicted geometry, followed by neural rendering techniques to render arbitrary views with excellence in both single-frame quality and inter-frame consistency.
CompNVS: Novel View Synthesis with Scene Completion
Synthesize novel views from RGB-D images with largely incomplete scene coverage. Perform generation on a sparse grid-based neural representation to complete unobserved scene parts. Extrapolate the missing area and render consistent photorealistic image sequences.
Factorized and Controllable Neural Re-rendering of Outdoor Scene for Photo Extrapolation
Expand tourist photos from a narrow field of view to a wider one while maintaining a similar visual style. Propose a factorized neural re-rendering model to produce photorealistic novel views from cluttered outdoor Internet photo collections, which enables applications such as controllable scene re-rendering, photo extrapolation, and 3D photo generation.
Sat2Vid: Street-view Panoramic Video Synthesis from a Single Satellite Image
Synthesize both temporally and geometrically consistent street-view panoramic video from a single satellite image and camera trajectory. Explicitly create a 3D point cloud representation of the scene and maintain dense 3D-2D correspondences across frames that reflect the geometric scene configuration inferred from the satellite view. Generation adopts GAN-based methods in the 3D sparse space.
NVS-MonoDepth: Improving Monocular Depth Prediction with Novel View Synthesis
Apply novel view synthesis to improve monocular depth estimation, using the estimated depth to warp the input image to an additional viewpoint. The same depth network is then applied to the synthesized view, which provides an additional supervision signal.
Spatio-Temporal Perturbations for Video Attribution
Pay special attention to the evaluation metrics for video attribution methods. Specifically, propose a new reliability measurement that is used to screen for reliable and objective metrics. The effectiveness of the proposed attribution method is extensively investigated through both subjective and objective evaluation, and through comparison with multiple significant baseline attribution methods.
Towards Visually Explaining Video Understanding Networks with Perturbation
Aim to provide an easy-to-use visual explanation method for video understanding networks with diversified structures. Propose a generic perturbation-based visual explanation method, enhanced by a novel spatiotemporal smoothness constraint. The method enables the comparison of explanation results between different video classification networks and avoids generating pathological adversarial explanations for video inputs.
Geometry-Aware Satellite-to-Ground Image Synthesis for Urban Areas
Generate panoramic street-view images that are geometrically consistent with a given satellite image via a GAN-based network with the proposed geo-transformation layer that retains the physical satellite-to-ground relation. The synthesized images retain well-articulated and authentic geometric shapes, as well as the texture richness of the street view in various scenarios.
Topological Map Extraction from Overhead Images
Circumvent the conventional pixel-wise segmentation of aerial images and predict objects directly in a vector representation. Extract the topological map of a city from overhead images as collections of building footprints and road networks.
Awards
Doctoral Consortium Participant, CVPR 2024
Outstanding Reviewer, CVPR 2023
Outstanding Reviewer, ECCV 2022
National Scholarship for Outstanding Students Abroad, 2022
Japan Society for the Promotion of Science (JSPS) Fellowships for Research in Japan, 2020
Swiss Data Science Center (SDSC) Fellowship, 2019
Graduate with Distinction (M.Sc.) at ETH Zürich, 2018
Second Runner-up (student teams) at Helvetic Coding Contest, 2017
Outstanding Graduate at Zhejiang University, 2015
Academic Service
Conference Reviewer: CVPR, ICCV, ECCV, NeurIPS, WACV.
Journal Reviewer: TPAMI, TGRS.
Teaching
Teaching Assistant, 252-0579-00L 3D Vision, ETH Zürich | Spring 2024
Teaching Assistant, 263-5902-00L Computer Vision, ETH Zürich | Autumn 2023
Teaching Assistant, 263-5904-00L Deep Learning for Computer Vision: Seminal Work, ETH Zürich | Spring 2023
Teaching Assistant, 252-0579-00L 3D Vision, ETH Zürich | Spring 2023
Teaching Assistant, 252-0847-00L Computer Science, ETH Zürich | Autumn 2022
Teaching Assistant, 263-5904-00L Deep Learning for Computer Vision: Seminal Work, ETH Zürich | Spring 2022
Teaching Assistant, 252-0579-00L 3D Vision, ETH Zürich | Spring 2022
Teaching Assistant, 252-0579-00L 3D Vision, ETH Zürich | Spring 2021
Teaching Assistant, 263-5902-00L Computer Vision, ETH Zürich | Autumn 2020
Teaching Assistant, 252-0579-00L 3D Vision, ETH Zürich | Spring 2020
Teaching Assistant, 263-5904-00L Deep Learning for Computer Vision: Seminal Work, ETH Zürich | Spring 2020
Teaching Assistant, 263-5902-00L Computer Vision, ETH Zürich | Autumn 2019
Teaching Assistant, 252-0579-00L 3D Vision, ETH Zürich | Spring 2019
Teaching Assistant, 263-5904-00L Deep Learning for Computer Vision: Seminal Work, ETH Zürich | Spring 2019
Contact
Zuoyue Li
CAB G 85.2
Universitätstrasse 6
8092 Zürich
Switzerland
Last update: 28 Apr 2024