



Let’s all dance: Enhancing amateur dance motions

Qiu Zhou1,*, Manyi Li2,*, Qiong Zeng1(✉), Andreas Aristidou3, Xiaojing Zhang1, Lin Chen4, Changhe Tu1(✉)
1 School of Computer Science & Technology, Shandong University, Qingdao 266000, China
2 School of Software, Shandong University, Jinan 250101, China
3 Department of Computer Science, University of Cyprus, Nicosia 1678, Cyprus; CYENS Centre of Excellence, Nicosia 1016, Cyprus
4 Qingdao Institute of Humanities and Social Sciences, Shandong University, Qingdao 266000, China

* Qiu Zhou and Manyi Li contributed equally to this work.

Abstract

Professional dance is characterized by high impulsiveness, elegance, and aesthetic beauty. Reaching this level of professionalism requires years of long and exhausting practice, good physical condition, and musicality, as well as a good understanding of choreography. Capturing dance motions and transferring them to digital avatars is common in the film and entertainment industries. However, access to high-quality dance data remains very limited, mainly due to the many practical difficulties in capturing the movements of dancers, making large-scale data acquisition prohibitive. In this paper, we present a model that enhances the professionalism of amateur dance movements, improving movement quality in both the spatial and temporal domains. Our model consists of a dance-to-music alignment stage, responsible for learning the optimal temporal alignment path between dance and music, and a dance-enhancement stage that injects features of professionalism in both the spatial and temporal domains. To learn a homogeneous distribution and credible mapping between the heterogeneous professional and amateur datasets, we generate amateur data from professional dances taken from the AIST++ dataset. We demonstrate the effectiveness of our method by comparing it with two baseline motion transfer methods via thorough qualitative visual controls, quantitative metrics, and a perceptual study. We also provide temporal and spatial module analyses to examine the mechanisms and necessity of key components of our framework.
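The alignment stage described above learns a temporal warping path between dance and music features. As a rough intuition for what such a path is (this is a minimal illustrative sketch of classic dynamic time warping on toy 1-D sequences, not the paper's learned model; the function name and inputs are hypothetical):

```python
# Illustrative sketch: dynamic time warping (DTW) finds a minimal-cost,
# monotonic alignment path between two feature sequences, e.g. per-frame
# motion features and per-frame music features.
import numpy as np

def dtw_path(a, b):
    """Return (path, total_cost) aligning 1-D sequences a and b."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])          # local frame distance
            cost[i, j] = d + min(cost[i - 1, j],      # repeat frame of b
                                 cost[i, j - 1],      # repeat frame of a
                                 cost[i - 1, j - 1])  # advance both
    # Backtrack from (n, m) to recover the optimal warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1], cost[n, m]

# A sequence delayed by one repeated frame aligns with zero cost.
path, total = dtw_path([0, 1, 2, 3], [0, 0, 1, 2, 3])
print(path, total)  # [(0, 0), (0, 1), (1, 2), (2, 3), (3, 4)] 0.0
```

The monotonic path is the discrete analogue of the "optimal temporal alignment path" the model's alignment stage learns; the paper's approach replaces this hand-crafted distance with learned features.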

Keywords: animation, music-to-motion alignment, dance motion enhancement, dance motion analysis

Electronic supplementary material

Video: 41095_0292_ESM2.mp4
File: 41095_0292_ESM1.pdf (2.2 MB)

Publication history

Received: 12 February 2022
Accepted: 05 May 2022
Published: 31 March 2023
Issue date: September 2023

Copyright

© The Author(s) 2023.

Acknowledgements

This research was supported by National Natural Science Foundation of China (Grant No. 62072284), Natural Science Foundation of Shandong Province (Grant No. ZR2021MF102), a Special Project of Shandong Province for Software Engineering (Grant No. 11480004042015), and internal funds from the University of Cyprus. The authors would like to thank Anastasios Yiannakidis (University of Cyprus) for capturing the amateur dances, and the volunteers for participating in the perceptual studies. The authors would also like to thank the anonymous reviewers and editors for their fruitful comments and suggestions.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
