Motion tracking is nothing new for Microsoft; we have seen its expertise in commercial products like the Microsoft Kinect. Microsoft Research's computer vision team is now working on the latest advances in detailed hand tracking. They have created a system that tracks hands smoothly, quickly and accurately – in real time – yet runs on a regular consumer device. The system can be used with virtual reality applications.
The system, still a research project for now, can track detailed hand motion with or without a virtual reality headset, allowing the user to poke a soft, stuffed bunny, turn a knob or move a dial.
What’s more, the system lets you see what your hands are doing, fixing a common and befuddling disconnect that happens when people are interacting with virtual reality but can’t see their own hands.
Fully articulated hand tracking promises to enable fundamentally new interactions with virtual and augmented worlds, but the limited accuracy and efficiency of current systems have prevented widespread adoption. Today's dominant paradigm uses machine learning for initialization and recovery followed by iterative model-fitting optimization to achieve a detailed pose fit. We follow this paradigm, but make several changes to the model-fitting, namely using: (1) a more discriminative objective function; (2) a smooth-surface model that provides gradients for non-linear optimization; and (3) joint optimization over both the model pose and the correspondences between observed data points and the model surface. While each of these changes may actually increase the cost per fitting iteration, we find a compensating decrease in the number of iterations. Further, the wide basin of convergence means that fewer starting points are needed for successful model fitting. Our system runs in real-time on CPU only, which frees up the commonly over-burdened GPU for experience designers. The hand tracker is efficient enough to run on low-power devices such as tablets. We can track up to several meters from the camera to provide a large working volume for interaction, even using the noisy data from current-generation depth cameras. Quantitative assessments on standard datasets show that the new approach exceeds the state of the art in accuracy. Qualitative results take the form of live recordings of a range of interactive experiences enabled by this new approach.
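To give a flavor of idea (3) above — jointly optimizing the model pose and the point-to-surface correspondences on a smooth model — here is a toy sketch in Python. It is not the paper's hand model: a fixed-radius circle stands in for the smooth surface, its center stands in for the pose, and per-point angles `u` stand in for the correspondences. All function and variable names are illustrative assumptions; only the structure (gradient steps on pose and correspondences against one smooth least-squares objective) reflects the approach described.

```python
import numpy as np

def fit_circle_pose(points, r=1.0, iters=500, lr=0.1):
    """Toy joint optimization: recover the center c of a fixed-radius
    circle (the 'pose') and per-point surface parameters u_i (the
    'correspondences') from observed 2D points, by gradient descent on
    E = 0.5 * sum_i || c + r*(cos u_i, sin u_i) - p_i ||^2.
    Because the surface model is smooth, dE/du_i exists everywhere."""
    c = points.mean(axis=0)  # initialize pose from the data centroid
    u = np.arctan2(points[:, 1] - c[1], points[:, 0] - c[0])
    for _ in range(iters):
        # Model surface points for the current pose and correspondences.
        s = c + r * np.stack([np.cos(u), np.sin(u)], axis=1)
        resid = s - points
        # Gradient w.r.t. each correspondence u_i: residual dotted with
        # the (smooth) surface tangent at u_i.
        tang = r * np.stack([-np.sin(u), np.cos(u)], axis=1)
        u -= lr * np.sum(resid * tang, axis=1)
        # Gradient step on the pose (here just the circle center).
        c -= lr * resid.mean(axis=0)
    return c, u
```

Updating correspondences by gradient alongside the pose, rather than re-snapping them to closest points each iteration, is one way to read the paper's "joint optimization" change; the wide, smooth objective is what allows fewer restarts and fewer iterations overall.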
Read more about this project here.