Xiyi Chen

I am a first-year PhD student in Computer Science at the University of Maryland, College Park, advised by Prof. Ming Lin.

Before that, I obtained my master's degree in Computer Science at ETH Zürich, where I worked on human avatar synthesis and human mesh recovery, advised by Dr. Sergey Prokudin and Prof. Siyu Tang. I also obtained my bachelor's degree in Computer Science at Maryland, where I worked with Prof. David Jacobs on surveillance-quality face recognition.

My research interests are 3D vision and digital humans.

Email / GitHub / LinkedIn / Gallery

News

08/2024 I started as a PhD student at the University of Maryland, College Park.
02/2024 Our work Morphable Diffusion was accepted to CVPR 2024!

Publications

SplatFormer: Point Transformer for Robust 3D Gaussian Splatting
Yutong Chen, Marko Mihajlovic, Xiyi Chen, Yiming Wang, Sergey Prokudin, Siyu Tang
ICLR, 2025 (Spotlight Presentation)
project page / arXiv / code

We analyzed the performance of novel view synthesis methods in challenging out-of-distribution (OOD) camera views and introduced SplatFormer, a data-driven 3D transformer designed to refine 3D Gaussian Splatting primitives for improved quality in extreme camera scenarios.

Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation
Xiyi Chen, Marko Mihajlovic, Shaofei Wang, Sergey Prokudin, Siyu Tang
CVPR, 2024
project page / arXiv / code

We proposed the first diffusion model to enable the creation of fully 3D-consistent, animatable, and photorealistic human avatars from a single image of an unseen subject.

Projects

Towards Robust 3D Body Mesh Inference of Partially-observed Humans
Semester Project at Computer Vision and Learning Group
report / slides / code

We improved the SMPLify-X optimization pipeline by applying keypoint blending and a stronger pose prior for robust human mesh inference on images of partially observed humans.

Leveraging Motion Imitation in Reinforcement Learning for Biped Character
Final Project for Digital Humans
report / code

We reproduced the imitation tasks in DeepMimic with a curriculum training strategy to extend our algorithm’s applicability to various biped robots with different shapes, masses, and dynamics models.

Learning to Reconstruct 3D Faces by Watching TV
Final Project for 3D Vision
report / code

We proposed using the abundant temporal information in TV series videos for 3D face reconstruction by modifying DECA's encoders and including bidirectional RNN-based temporal feature extractors to propagate and aggregate temporal information across frames.

Gentrification Exploration in Zürich
Final Project for Interactive Machine Learning: Visualization & Explainability
demo / report / poster

We designed an interactive machine learning tool with a Variational Nearest Neighbor Gaussian Process (VNNGP) model to uncover and visualize pricing dynamics and gentrification developments in the housing rental market of Zürich.

Services

Reviewer: TPAMI, NeurIPS

Template credits: Jon Barron