I am a Research Scientist at Meta working on computer vision, with an emphasis on 3D modeling and 3D generation. I did my PhD in the amazing Imagine lab at ENPC under the guidance of Mathieu Aubry. During my PhD, I was fortunate to work with Jean Ponce (Inria), Matthew Fisher (Adobe Research), and Alyosha Efros and Angjoo Kanazawa (UC Berkeley). Before that, I completed my engineering degree (equivalent to an M.Sc.) at Mines Paris.
My research focuses on learning from images without annotations, with a particular interest in recovering the underlying 3D structure (see representative papers). I am always looking for PhD interns, so feel free to reach out!
We introduce a large reconstruction model (LRM) capable of recovering the illumination, geometry, and material properties of real object scenes from a few posed images.
We present a new object-centric dataset for 3D deep learning and 3D generative AI.
We propose a method for compositional part-level 3D generation and reconstruction from various modalities, including text, images, and 3D models.
We combine Meta 3D AssetGen and TextureGen for high-quality mesh generation.
We introduce a novel text- or image-conditioned generator of 3D assets with physically-based rendering materials and detailed geometry.
We propose a method that encodes 2D images into any 3D representation, without requiring a pre-trained image feature extractor.
We build upon sprite-based image decomposition approaches to design a generative method for character analysis and recognition in text lines.
We compute a primitive-based 3D reconstruction from multiple views by optimizing textured superquadric meshes with learnable transparency.
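For context, superquadrics are a standard primitive family; their implicit surface (the textbook definition, notation mine, not taken from the paper) is:

\left( (x/a_1)^{2/\epsilon_2} + (y/a_2)^{2/\epsilon_2} \right)^{\epsilon_2/\epsilon_1} + (z/a_3)^{2/\epsilon_1} = 1

where a_1, a_2, a_3 are axis scales and \epsilon_1, \epsilon_2 control how box-like or sphere-like the primitive is.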
We introduce MACARONS, a method that learns in a self-supervised fashion to explore new environments and reconstruct them in 3D using RGB images only.
A Transformer-based framework to evaluate off-the-shelf features (object-centric and dense representations) on the reasoning task of visual question answering (VQA).
We present UNICORN, a self-supervised approach leveraging the consistency across different single-view images for high-quality 3D reconstructions.
We characterize 3D shapes as affine transformations of linear families learned without supervision, and showcase the advantages of this representation on large shape collections.
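One plausible way to write this representation down (the notation is mine, not the paper's): a shape S is expressed as

S = A \left( \bar{S} + \sum_{i=1}^{K} \alpha_i V_i \right) + t

where \bar{S} is a learned mean shape, the V_i span a learned linear family of deformations, the \alpha_i are per-shape coefficients, and (A, t) is a per-shape affine transformation.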
We discover objects that recur in unlabeled image collections by modeling each image as a composition of learnable sprites.
A simple adaptation of K-means to make it work on pixels! We align prototypes to each sample image before computing cluster distances.
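A minimal sketch of the idea in Python, assuming a toy transformation family of small translations (the paper's transformations and optimization are richer, and it aligns during the update step too); best_aligned_distance and transformation_invariant_kmeans are hypothetical names:

import numpy as np

def best_aligned_distance(image, prototype, max_shift=3):
    """Align a prototype to an image over small translations (toy
    transformation family), then return the best squared distance."""
    best = np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(prototype, dy, axis=0), dx, axis=1)
            best = min(best, np.sum((image - shifted) ** 2))
    return best

def transformation_invariant_kmeans(images, k, n_iters=10, seed=0):
    """K-means where each prototype is aligned to each sample before the
    assignment step; a sketch, not the paper's implementation."""
    rng = np.random.default_rng(seed)
    protos = images[rng.choice(len(images), k, replace=False)].copy()
    for _ in range(n_iters):
        # Assignment: distances are computed after per-sample alignment.
        dists = np.array([[best_aligned_distance(img, p) for p in protos]
                          for img in images])
        labels = dists.argmin(axis=1)
        # Update: recompute each prototype as its cluster mean.
        for j in range(k):
            if np.any(labels == j):
                protos[j] = images[labels == j].mean(axis=0)
    return labels, protos

For example, on a stack of 28x28 digit images of shape (N, 28, 28), labels, protos = transformation_invariant_kmeans(images, k=10) clusters the images while ignoring small misalignments.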
Last updated: March 2025