This work focuses on synthesizing human poses from natural-language descriptions. We propose a conditional generative adversarial network (GAN) that generates 2D human poses conditioned on human-written text. The model is trained and evaluated on the COCO dataset, which contains images of complex everyday scenes with diverse human poses. Qualitative and quantitative results show that the model synthesizes plausible poses matching the given text, indicating that poses consistent with the described semantics can be generated, especially for actions with distinctive poses.
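To make the text-conditioning idea concrete, below is a minimal sketch of a conditional generator: a noise vector is concatenated with a text embedding and mapped to 2D keypoint coordinates. All dimensions, the MLP architecture, and the random weights are illustrative assumptions for this sketch, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (assumptions, not from the paper):
# 128-d noise, 256-d text embedding, 17 COCO keypoints with (x, y) each.
NOISE_DIM, TEXT_DIM, HIDDEN, NUM_KEYPOINTS = 128, 256, 512, 17

# Randomly initialized weights stand in for trained generator parameters.
W1 = rng.normal(0, 0.02, (NOISE_DIM + TEXT_DIM, HIDDEN))
W2 = rng.normal(0, 0.02, (HIDDEN, NUM_KEYPOINTS * 2))

def generate_pose(noise, text_embedding):
    """Conditional generator sketch: concatenate noise with the text
    embedding, pass through a two-layer MLP, and emit normalized
    2D keypoint coordinates in [-1, 1]."""
    h = np.maximum(0.0, np.concatenate([noise, text_embedding]) @ W1)  # ReLU
    return np.tanh(h @ W2).reshape(NUM_KEYPOINTS, 2)

pose = generate_pose(rng.normal(size=NOISE_DIM), rng.normal(size=TEXT_DIM))
print(pose.shape)  # (17, 2)
```

In an actual cGAN, the text embedding would come from a trained sentence encoder and the same embedding would also condition the discriminator, so that poses are judged for both realism and text consistency.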
Citation: Zhang, Y., Briq, R., Tanke, J., & Gall, J. (2021). Adversarial Synthesis of Human Pose from Text. In Lecture Notes in Computer Science (Vol. 12544 LNCS, pp. 145–158). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-71278-5_11