1 Tianjin University 2 Cardiff University * Corresponding author
Recovering 3D human meshes from monocular images is an inherently ambiguous and challenging task due to depth ambiguity, joint occlusion and truncation. However, most recent works avoid modeling uncertainty and typically produce a single reconstruction for a given input. In contrast, this paper acknowledges the ambiguity of the reconstruction and treats the task as an inverse problem for which multiple feasible solutions exist. Our method, MHPro, first constructs a probability distribution and obtains a set of feasible recovery results (i.e., multiple hypotheses) from monocular images. Intra-hypothesis refinement is then performed to enhance the features of each hypothesis independently. Finally, the multi-hypothesis features are aggregated by inter-hypothesis communication to recover the final 3D human mesh. The effectiveness of our method is validated on two benchmark datasets, Human3.6M and 3DPW, where experimental results show that our method achieves state-of-the-art performance and recovers more accurate human meshes. Our results validate the importance of intra-hypothesis refinement and inter-hypothesis communication in probabilistic modeling and show optimal performance across a variety of settings.
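The three stages described above (sampling hypotheses from a distribution, refining each one independently, then letting them exchange information before aggregation) can be illustrated with a toy NumPy sketch. Everything here is an assumption made for illustration: a diagonal Gaussian stands in for the normalizing flow, a single residual layer stands in for the refinement module, and plain single-head self-attention stands in for the communication module. The function names are hypothetical and this is not the authors' implementation.

```python
import numpy as np

def sample_hypotheses(mean, log_std, n_hyp, rng):
    # Assumption: a diagonal Gaussian replaces the normalizing flow.
    # Returns an (n_hyp, D) array, one row per pose hypothesis.
    eps = rng.standard_normal((n_hyp, mean.shape[0]))
    return mean + np.exp(log_std) * eps

def intra_refine(hyps, w):
    # Row-wise residual layer: each hypothesis is refined on its own,
    # with no information flowing between rows.
    return hyps + np.tanh(hyps @ w)

def inter_communicate(hyps):
    # Single-head self-attention over the hypothesis axis, so every
    # hypothesis can attend to all the others.
    d = hyps.shape[1]
    scores = hyps @ hyps.T / np.sqrt(d)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)
    return hyps + attn @ hyps

def recover(mean, log_std, w, n_hyp=5, seed=0):
    rng = np.random.default_rng(seed)
    hyps = sample_hypotheses(mean, log_std, n_hyp, rng)
    hyps = intra_refine(hyps, w)
    hyps = inter_communicate(hyps)
    # Toy aggregation: average the communicated hypotheses.
    return hyps.mean(axis=0)
```

In this sketch, `intra_refine` gives the same result for a hypothesis whether it is processed alone or in a batch, while `inter_communicate` makes every output row depend on all input rows. That difference is the distinction the two module names are meant to convey.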
Compared to our conference version, this work extends it in the following aspects: 1) Since multi-hypothesis fusion and communication play an important role in model performance, we propose the Hypothesis-Mixing Multi-Layer Perceptron to explore the relationships between channels across different hypotheses, and a new configuration of Multi-Head Cross-Attention to achieve a more thorough exchange of information among the hypotheses; 2) We demonstrate that our module designs and the multi-hypothesis formulation can effectively facilitate the multi-view fusion task by making better use of information from different views; 3) We provide more details, more comprehensive experiments, and more thorough discussions to validate our performance.
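One plausible reading of a hypothesis-mixing MLP, modeled on the token-mixing layer of MLP-Mixer, is an MLP applied along the hypothesis axis so that each channel is mixed across hypotheses. The sketch below follows that reading only. The function name, the weight shapes, the GELU activation and the residual connection are all assumptions, not details taken from the paper.

```python
import numpy as np

def gelu(x):
    # Tanh approximation of the GELU activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def hypothesis_mixing_mlp(x, w1, b1, w2, b2):
    # x:  (N, C) features, N hypotheses with C channels each.
    # w1: (N, H), b1: (H,), w2: (H, N), b2: (N,)  (hypothetical shapes).
    y = x.T                           # (C, N): one row per channel
    y = gelu(y @ w1 + b1) @ w2 + b2   # MLP acts along the hypothesis axis
    return x + y.T                    # residual connection, back to (N, C)

# Usage with random weights (illustrative values only).
rng = np.random.default_rng(0)
n_hyp, channels, hidden = 4, 8, 16
x = rng.standard_normal((n_hyp, channels))
w1 = rng.standard_normal((n_hyp, hidden)) * 0.1
b1 = np.zeros(hidden)
w2 = rng.standard_normal((hidden, n_hyp)) * 0.1
b2 = np.zeros(n_hyp)
mixed = hypothesis_mixing_mlp(x, w1, b1, w2, b2)
```

Under this reading the layer shares one set of weights across all channels and mixes only along the hypothesis axis. It therefore complements an attention layer, which computes input-dependent mixing weights.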
Fig 1. We propose a multi-hypothesis method for recovering 3D human meshes from monocular images.
Fig 2. Overview of the proposed method. Given an input monocular image I, we first perform probabilistic modeling (a) with normalizing flows to extract image features, predict a pose distribution and generate multiple initial human mesh hypotheses (N denotes the number of hypotheses). These hypotheses are fed into the intra-hypothesis refinement module (b) for independent refinement and feature enhancement. The inter-hypothesis communication module (c) then lets the hypotheses exchange information, and a final regression yields the recovered human mesh M.
Fig 3. Qualitative results on the LSP dataset.
Fig 4. Plausible human mesh recovery results generated by our method, especially for ambiguous parts with depth ambiguity, joint occlusion and truncation.
Haibiao Xuan, Jinsong Zhang, Yu-Kun Lai, Kun Li. MH-HMR: Human Mesh Recovery from Monocular Images via Multi-Hypothesis Learning. CAAI Transactions on Intelligence Technology (CAAI TRIT). 2023.
@article{xuan2023mhhmr,
author = {Haibiao Xuan and Jinsong Zhang and Yu-Kun Lai and Kun Li},
title = {MH-HMR: Human Mesh Recovery from Monocular Images via Multi-Hypothesis Learning},
journal = {CAAI Transactions on Intelligence Technology (CAAI TRIT)},
year = {2023}
}
Haibiao Xuan, Jinsong Zhang, Kun Li. MHPro: Multi-Hypothesis Probabilistic Modeling for Human Mesh Recovery. Proceedings of the CAAI International Conference on Artificial Intelligence (CICAI), Beijing, China. 2022.
@inproceedings{xuan2022mhpro,
author = {Haibiao Xuan and Jinsong Zhang and Kun Li},
title = {MHPro: Multi-Hypothesis Probabilistic Modeling for Human Mesh Recovery},
booktitle = {Proceedings of the CAAI International Conference on Artificial Intelligence (CICAI)},
year = {2022}
}