
7 Conclusions

7.1 Conclusions

This thesis describes a method of pose estimation for humans under cloth-like objects such as blankets. We use depth images as inputs to avoid sensitivity to illumination conditions and to reduce privacy concerns. We utilize a cloth deformation simulation to generate pairs of depth images of humans under cloth and the corresponding locations of joint keypoints in pixel coordinates. These pairs of depth images and keypoints are then used to train a network. The performance evaluation using synthetic test data shows the potential of the proposed method for human pose estimation under cloth-like objects. Even though the postures in the input data are unknown or the human body is covered with a cloth-like object, the network successfully estimates the human pose. The evaluation using RMSE and PCKh@0.5 showed high accuracy on both synthetic test datasets, with and without cloth. On the other hand, the application to real data has not been as successful. As mentioned above, the difference between synthetic and real data makes reliable estimation difficult. To alleviate this reality gap, we added real data to the training set. The network was trained using the synthetic dataset together with a few real-world depth images covering 20 kinds of postures, and the results showed that the RMSE and PCK were 7.061 and 0.841, respectively. It is expected that this accuracy can be further improved with more variation of postures in the synthetic dataset.
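The two evaluation metrics above can be sketched as follows. This is a minimal illustration, assuming predictions and ground truth are given as (N, K, 2) arrays of pixel coordinates and taking one common definition of RMSE (root of the mean squared Euclidean keypoint distance); the thesis implementation may differ in detail.

```python
import numpy as np

def rmse(pred, gt):
    """Root-mean-square keypoint error in pixels.

    pred, gt: (N, K, 2) arrays of predicted / ground-truth pixel
    coordinates for N images with K keypoints each.
    """
    sq_dist = np.sum((pred - gt) ** 2, axis=-1)   # (N, K) squared distances
    return float(np.sqrt(np.mean(sq_dist)))

def pckh(pred, gt, head_len, alpha=0.5):
    """PCKh@alpha: fraction of keypoints whose error falls below
    alpha times the head-segment length of the corresponding image.

    head_len: (N,) array of per-image head segment lengths in pixels.
    """
    dist = np.linalg.norm(pred - gt, axis=-1)     # (N, K) pixel errors
    thresh = alpha * head_len[:, None]            # (N, 1) per-image threshold
    return float(np.mean(dist < thresh))
```

With `alpha=0.5` this yields PCKh@0.5; a single scalar per test set, directly comparable to the values reported above.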

7.2 Future work

One direction for future work is to enrich the training dataset. Data augmentation techniques such as flipping, cropping, and adding noise can be used. In particular, adding noise to the image may help to alleviate the reality gap between the synthetic dataset and the real data.
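The augmentations mentioned above can be sketched for a single depth image as follows; the parameter values (crop margin, noise level) are illustrative assumptions, not values from the thesis, and a real pipeline would also transform the keypoint coordinates consistently with the image.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_depth(img, noise_std=10.0, crop_margin=8):
    """Augmentation sketch for one depth image of shape (H, W),
    values in millimetres. Returns a flipped, randomly cropped,
    noise-perturbed copy of size (H - crop_margin, W - crop_margin).
    """
    out = img[:, ::-1].copy()                       # horizontal flip
    h, w = out.shape
    top = rng.integers(0, crop_margin + 1)          # random crop offsets
    left = rng.integers(0, crop_margin + 1)
    out = out[top:h - crop_margin + top, left:w - crop_margin + left]
    out = out + rng.normal(0.0, noise_std, out.shape)  # sensor-like Gaussian noise
    return out
```

Note that after a horizontal flip the keypoint x-coordinates must be mirrored as well, and left/right joint labels swapped, so that image and annotation stay consistent.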

Another future work is the implementation of abnormal posture detection. The ultimate goal of our research is to monitor a person sleeping in a bed using depth images and to detect whether the person is in a dangerous posture that could cause physical stress. The position of each joint keypoint can be expected to serve as an important clue for detecting such an abnormal posture. In practice, it may be necessary to use time-series images instead of static depth images, or to combine the keypoints with data obtained from other sensors, such as a pressure sensor under the bed.
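As a rough illustration of how estimated keypoints could feed such a detector, the sketch below flags a frame when the head is strongly tilted sideways relative to the shoulder line. The keypoint indices, the chosen angle feature, and the thresholds are all hypothetical; an actual detector would be designed from clinical requirements, likely over time series rather than single frames.

```python
import numpy as np

# Hypothetical keypoint layout; indices are for illustration only.
HEAD, NECK, L_SHOULDER, R_SHOULDER = 0, 1, 2, 3

def neck_angle_deg(kp):
    """Angle between the neck->head vector and the shoulder line,
    given a (K, 2) array of keypoints in pixel coordinates."""
    neck_head = kp[HEAD] - kp[NECK]
    shoulders = kp[R_SHOULDER] - kp[L_SHOULDER]
    cos = np.dot(neck_head, shoulders) / (
        np.linalg.norm(neck_head) * np.linalg.norm(shoulders))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def is_abnormal(kp, lo=60.0, hi=120.0):
    """Flag the posture when the head deviates far from perpendicular
    to the shoulder line (thresholds are illustrative)."""
    ang = neck_angle_deg(kp)
    return not (lo <= ang <= hi)
```

A head pointing straight "up" relative to the shoulders gives an angle near 90 degrees and is treated as normal; a head lying along the shoulder line gives an angle near 0 or 180 degrees and is flagged.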

Acknowledgements

I would like to express my appreciation to my supervisor, Professor Jun Miura. He was always accommodating and gave me a helping hand whenever I had questions about my research or writing.

He supported me in everything I did and led me in the right direction for my thesis writing. I would also like to thank him for giving me the great chance to take part in the Double Degree Program between Toyohashi University of Technology and the University of Eastern Finland.

In addition to my supervisors, I would also like to thank Professor Yasushi Kanazawa and Senior Researcher Ville Hautamäki, the co-examiners of this thesis, for their kind comments and suggestions.

Finally, I must express my very profound gratitude to my parents and to my friends for providing me with unfailing support and continuous encouragement throughout my years of study and through the process of researching and writing this thesis. This accomplishment would not have been possible without them. Thank you.

