Ergonomic Gesture Recognition

Denis Lalanne
Tom Forrer
Project status: 


In the last two decades, the mouse and the keyboard have been the dominant input devices for computer interfaces. Nowadays, new forms of interfaces are emerging, such as touch interfaces and even touchless interfaces that require no additional device from the user. For simple or short tasks, the pointing finger has been widely accepted, but for prolonged sessions at a computer interface, users still prefer mice and keyboards. This is due to the ergonomic features of these input devices, mainly the arm support and the limited action space, but also their comfort and precision. A gesture is, according to the definition of Kurtenbach and Hulteen, "a motion of the body containing information". Several classifications of gestures exist, but gestural interfaces mainly focus on symbolic and deictic gestures according to the classification of Rime and Schiaratura. Early attempts at vertical gestural interfaces had an ergonomic problem, later known as "gorilla arm": a user is not at ease lifting the arms for a prolonged period, nor comfortable performing specific gestures in a large action space. This project aims to provide gesture recognition for the hand, fingers, and forearm in a situation where the forearm is supported on a table or the elbow rests on a chair. These conditions limit the action space and focus on more subtle gestures. Applications of such an ergonomic gesture recognition range from video collaboration to a completely new way of natural interaction, as seen in the recent Microsoft Project Natal.


The goal of this project is to recognize temporal aspects of confined, fine-grained gestures. This is done by extracting features of the forearm, hand, and fingers in a vision-based recognition system, and mapping these features onto a physiological model of the hand. In order to focus on the temporal representation of a gesture, the feature extraction is assisted in a first phase by providing the vision system with visual markers on a glove. This feature extraction can then be replaced by an unassisted hand/finger feature detection. The gesture recognition system will be tested on several deictic and symbolic gestures such as point, zoom, rotate, swipe, confirmation, and denial.
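The marker-assisted first phase can be illustrated with a minimal sketch. The proposal does not specify the detection method; the function and colour thresholds below are hypothetical, assuming each glove marker is a saturated colour patch whose centroid can be found by simple per-channel thresholding (a real implementation would work on camera frames, e.g. with OpenCV):

```python
def marker_centroid(frame, lower, upper):
    """Find the (row, col) centroid of pixels whose colour channels all fall
    inside [lower, upper] -- a stand-in for detecting one glove marker.

    frame: list of rows, each a list of (r, g, b) pixel tuples.
    Returns None if no pixel matches.
    """
    hits = [(y, x)
            for y, row in enumerate(frame)
            for x, px in enumerate(row)
            if all(lo <= c <= hi for c, lo, hi in zip(px, lower, upper))]
    if not hits:
        return None
    n = len(hits)
    return (sum(y for y, _ in hits) / n, sum(x for _, x in hits) / n)

# Synthetic 8x8 RGB frame with a red "marker" patch at rows 2-3, cols 5-6.
frame = [[(0, 0, 0) for _ in range(8)] for _ in range(8)]
for y in (2, 3):
    for x in (5, 6):
        frame[y][x] = (255, 0, 0)

centroid = marker_centroid(frame, lower=(200, 0, 0), upper=(255, 50, 50))
```

With two calibrated cameras, the per-camera centroids of each marker could then be triangulated into 3D positions before being mapped onto the hand model.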


  1. Environment setup and gesture specification: Setup two cameras for stereoscopic vision, prepare gloves with visual markers, establish a list of targeted gestures
  2. Establish a ground truth: Record videos of several gestures, such as a pointing gesture following a predefined motion, and label them
  3. Feature extraction: Extract the positions of the visual markers on the glove
  4. Mapping features on the physiological hand model: Implement a model of the forearm and the hand and map the extracted features on this model
  5. Temporal representation of gestures: Store the temporal characteristics of a gesture
  6. Temporal analysis: Analyse the extracted features and match them against the set of targeted gestures
  7. Real-time gesture recognition: Replace the marker-based feature extraction with a vision-based hand recognition
  8. Report