Xiyi Chen

I am a master's student in Computer Science at ETH Zürich. I am currently working with Dr. Sergey Prokudin and Prof. Siyu Tang in the VLG lab.

I obtained my bachelor's degree in Computer Science from the University of Maryland, College Park, where I worked with Prof. David Jacobs on surveillance-quality face recognition.

My research interests are 3D vision and digital humans, specifically building high-fidelity human avatars with neural radiance fields and diffusion models.

Email / GitHub / LinkedIn / Gallery

News

04/2024 I will join the University of Maryland, College Park as a PhD student in Fall 2024.
02/2024 Our work Morphable Diffusion has been accepted to CVPR 2024!

Publications

Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation
Xiyi Chen, Marko Mihajlovic, Shaofei Wang, Sergey Prokudin, Siyu Tang
CVPR, 2024
paper / project page / code

We proposed the first diffusion model to enable the creation of fully 3D-consistent, animatable, and photorealistic human avatars from a single image of an unseen subject.

Projects

Towards Robust 3D Body Mesh Inference of Partially-observed Humans
Semester Project at Computer Vision and Learning Group
report / slides / code

We improved the SMPLify-X optimization pipeline by applying keypoint blending and a stronger pose prior for robust human mesh inference on images of partially observed humans.

Leveraging Motion Imitation in Reinforcement Learning for Biped Character
Final project for Digital Humans
report / code

We reproduced the imitation tasks in DeepMimic with a curriculum training strategy to extend our algorithm’s applicability to various biped robots with different shapes, masses, and dynamics models.

An Ensemble Based Approach to Collaborative Filtering
Final project for Computational Intelligence Lab
report / code

We built a collaborative filtering pipeline that ensembles multiple neural-network-based and matrix-factorization-based methods, including Variational Autoencoder, Bayesian Factorization Machine, etc.

Learning to Reconstruct 3D Faces by Watching TV
Final project for 3D Vision
report / code

We proposed to use the abundant temporal information in TV series videos for 3D face reconstruction. We modified DECA's encoders and included bidirectional-RNN-based temporal feature extractors to propagate and aggregate temporal information across frames.

Human Motion Prediction with Scene Context
Final project for Virtual Humans
report / code

We evaluated the performance of Transformer and GRU-based RNN models on human motion prediction with joint skeletons from the PROX dataset.

Apply Machine Learning Models and Reverse Engineering on Predicting Optical Properties of Semiconductor Bilayers
Xiyi Chen, Vivian Hu, André Schleife, Michal Ondrejcek and Erick I. Hernandez Alvarez
Summer research at National Center for Supercomputing Applications (NCSA) (* equal contribution)
poster

We reverse-designed spectra of high-dimensional material combinations with multi-layer perceptrons, LSTM networks, and a differential evolution algorithm.

Template credits: Jon Barron