Simulating Anthropomorphic Upper-body Actions in Virtual Reality with Minimal Tracking

As virtual reality (VR) is adopted across many industries, there is a growing need to address the problems that prevent users from having a realistic virtual experience. One such problem is representing VR users as virtual avatars in the digital world.

Human anatomy is complex, so creating a digital humanoid model is complex as well: character riggers must carefully design the articulations around the joints of a digital character to make the 3D model functional in animated movies and games. To make character actions and movements more realistic, real human actions are recorded with motion capture systems. Several kinds of motion capture systems are used for this, and they vary in fidelity and in the level of realism they offer.

While a virtual avatar would strengthen the sense of presence experienced by a VR user, it is not easy to digitally simulate human actions without tracking multiple parts of the body. For example, considering only the upper body, without tracking the position and rotation of a user’s hip, neck and head it is difficult to represent actions such as leaning forward in place, turning the neck in place, turning the whole body, or physically moving. A feasible, consumer-friendly way to address this would be to use technologies similar to Microsoft Kinect, which do not need physical markers attached to the user’s joints. However, the tracking and resulting simulation from these technologies are not as realistic as tracking with physical markers attached to the user’s body.

Our project aims to simulate approximate poses for a user’s torso in the virtual world during a VR experience, without adding new hardware requirements to existing VR systems and without placing additional markers or tracking equipment on the user’s body. Our approach uses only the tracking data received from the headset and the hand controllers to estimate the pose of the virtual avatar’s torso. Currently, our goal is to simulate only a few upper-body actions: leaning forward, physical walking, looking up and down, and turning around in place. Although the entire human body cannot yet be simulated with this approach, we consider it a first step toward customized rigs and centralized functionality for simulating believable humanoid avatars with minimal tracking.
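To make the idea of estimating a torso pose from headset and controller data concrete, here is a minimal sketch of one possible heuristic for the torso's heading. It is not the project's actual method. It assumes a y-up coordinate system in which +x is right and +z is forward (as in Unity), and the function name `estimate_torso_yaw`, its parameters, and the blending weight are all hypothetical. The sketch takes the direction perpendicular to the line between the two hands as a hint for where the chest faces, then blends it with the head's facing direction.

```python
import math

def _normalize(x, z):
    """Return the unit vector of (x, z), or None if it has no length."""
    length = math.hypot(x, z)
    if length < 1e-9:
        return None
    return x / length, z / length

def estimate_torso_yaw(head_forward, left_hand, right_hand, hand_weight=0.5):
    """Estimate torso yaw in degrees (0 = +z, positive toward +x).

    Hypothetical heuristic, not the project's method.
    head_forward: (x, z) horizontal facing direction of the headset.
    left_hand, right_hand: (x, y, z) controller positions, y up.
    hand_weight: how much the hand-based hint counts (assumed value).
    """
    head = _normalize(*head_forward)
    if head is None:
        raise ValueError("head_forward must have non-zero length")

    # Horizontal vector from the left hand to the right hand.
    hx = right_hand[0] - left_hand[0]
    hz = right_hand[2] - left_hand[2]
    # Rotate it by 90 degrees so that right = (1, 0) maps to forward = (0, 1).
    hands = _normalize(-hz, hx)

    # Ignore the hands when they give no hint, or when they point
    # behind the head (for example, when the arms are crossed).
    if hands is None or hands[0] * head[0] + hands[1] * head[1] < 0:
        blended = head
    else:
        # Blend direction vectors rather than angles to avoid wrap-around.
        bx = hand_weight * hands[0] + (1 - hand_weight) * head[0]
        bz = hand_weight * hands[1] + (1 - hand_weight) * head[1]
        blended = _normalize(bx, bz) or head

    return math.degrees(math.atan2(blended[0], blended[1]))
```

With the hands held level at the sides and the head turned 90 degrees to the right, this sketch places the torso halfway between the two hints, which is one simple way to keep a neck turn from rotating the whole avatar.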

So far, we have succeeded in recognizing the leaning-forward action (gesture) using a base dataset that contains positional tracking data from the VR headset. The application we built recognized and simulated this gesture successfully when it was used by the same user who provided the motion tracking input for the base dataset. Our plan is to add a machine learning component to the application so that the input dataset and the recognition conditions can grow over time with different users, thereby reducing the gesture recognition error considerably. A video showing our work in progress can be seen above.
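As an illustration of how a leaning-forward gesture might be recognized from headset positions alone, the sketch below derives two thresholds from a small set of recorded lean positions and then tests new samples against them. It is a simplified stand-in, not the application's actual recognizer. The function names, the `margin` parameter, and the sample values are assumptions, and positions are taken to be (x, y, z) in metres with y up.

```python
import math
from statistics import mean

def lean_features(rest, sample, forward):
    """Return (forward_shift, height_drop) of the headset relative to rest.

    rest, sample: (x, y, z) headset positions in metres, y up.
    forward: (x, z) horizontal facing direction in the rest pose.
    """
    length = math.hypot(forward[0], forward[1])
    if length < 1e-9:
        raise ValueError("forward must have non-zero length")
    fx, fz = forward[0] / length, forward[1] / length
    dx, dz = sample[0] - rest[0], sample[2] - rest[2]
    forward_shift = dx * fx + dz * fz   # movement along the facing direction
    height_drop = rest[1] - sample[1]   # how far the head has come down
    return forward_shift, height_drop

def fit_thresholds(rest, forward, lean_samples, margin=0.5):
    """Derive thresholds from recorded lean positions (the base dataset).

    margin is an assumed fraction of the mean recorded lean that must be
    reached before a sample counts as leaning.
    """
    feats = [lean_features(rest, s, forward) for s in lean_samples]
    return (margin * mean(f[0] for f in feats),
            margin * mean(f[1] for f in feats))

def is_leaning_forward(rest, sample, forward, thresholds):
    """True when the head has moved both forward and down far enough."""
    shift, drop = lean_features(rest, sample, forward)
    return shift >= thresholds[0] and drop >= thresholds[1]

# Hypothetical calibration data for one user.
rest_pose = (0.0, 1.70, 0.0)
facing = (0.0, 1.0)
recorded_leans = [(0.0, 1.55, 0.30), (0.0, 1.50, 0.40)]
thresholds = fit_thresholds(rest_pose, facing, recorded_leans)
```

Requiring both a forward shift and a drop in height is what separates leaning from walking in this sketch: a user who walks forward moves the headset along the facing direction but keeps it at roughly the same height. Because the thresholds come from one user's recordings, they would likely need to be refitted or learned for other users.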