Computer Vision Techniques for Tracking a Moving, Deformable Object: Application for estimating the 3D location and pose of an animal’s head.
Neuroscience and Robotics Lab, Northwestern University
Project Advisors: Professor Hartmann and Jarvis Schultz
GitHub link. [Private Repo. Please ask permission to view.]
Overview
Tracking the location and shape of deformable objects is a challenging problem in computer vision with many useful applications. The present work applies image processing techniques to track the head of an animal as it explores its environment. As the animal moves through the environment, its mouth, ears, and nose can all deform. The focus here was specifically on the rat head and whisker array, because rats are a common model system used in neuroscience to study sensorimotor control.
The goal of the project was to design and implement software that automatically tracks the rat's head in three dimensions (3D) from video recorded by two high-speed (300 fps) cameras. The two cameras were temporally synchronized and placed approximately orthogonal to each other. The project leveraged a morphologically accurate 3D model of the rat head taken from the DigitalRat software package, which was developed by the Hartmann Group at Northwestern University.
Tracking the rat’s head was achieved in three steps: 1) computer vision techniques were used to isolate the rat; 2) custom algorithms were developed to separate the rat’s head from its body; and 3) a brute-force algorithm was used to find the best fit of the 3D morphological template, projected into the two 2D camera views.
This project was implemented in Python and leveraged open-source libraries, including OpenCV.
Results
To facilitate use of the morphological model, a library was first developed for manipulating the 3D RatMap Head template. The library includes functions for extracting the template point cloud from a .mat file, as well as functions for rotation, translation, scaling, plotting, etc. (Video 2)
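The kind of operations such a library provides can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the `variable_name` argument, and the assumption that the point cloud is stored as an N x 3 array are all hypothetical.

```python
import numpy as np

def load_template(mat_path, variable_name):
    """Load a point cloud from a .mat file.

    variable_name is hypothetical: it depends on how the file was saved.
    The array is assumed (not confirmed) to be stored as N x 3.
    """
    from scipy.io import loadmat  # imported lazily; only needed for loading
    return np.asarray(loadmat(mat_path)[variable_name], dtype=float)

def rotate_about_z(points, angle):
    """Rotate N x 3 points about the z axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return points @ rot.T

def translate(points, shift):
    """Shift every point by the 3-vector `shift`."""
    return points + np.asarray(shift, dtype=float)

def scale_about_centroid(points, factor):
    """Scale the cloud about its centroid so its position is unchanged."""
    centroid = points.mean(axis=0)
    return (points - centroid) * factor + centroid
```

Scaling about the centroid rather than the origin is a design choice made here so that scale and translation can be varied independently during fitting.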
The tracking algorithm involves two major processing stages: rat head detection in each video frame and RatMap head template fitting via a brute force algorithm.
The first stage uses OpenCV together with custom computer vision functions to identify the location of the rat’s head. The following techniques were used: low-pass filter convolution, thresholding, Gaussian mixture-based background segmentation, Canny edge detection for contour generation with a specific area threshold, dilation, and contrast-limited adaptive histogram equalization (CLAHE).
The second processing stage uses a brute-force algorithm to estimate the 3D location, orientation, and scale of the rat head. All combinations of rotation, translation, and scale shifts were tested to find the one that gives the best fit of the 3D template's 2D projection onto each camera view. Information from both views was used simultaneously in order to produce more accurate results.
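A brute-force fit of this kind can be sketched as a grid search. The sketch below makes several assumptions that are not stated in the write-up: orthographic projection, one camera looking along z and the other along x, rotation limited to yaw and pitch, and a cost equal to the mean nearest-neighbor distance between projected and observed points, summed over both views.

```python
import itertools
import numpy as np

def rotation(yaw, pitch):
    """Rotation matrix: pitch about x, then yaw about z (radians)."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    return rz @ rx

def pose_points(template, yaw, pitch, scale, shift):
    """Scale, rotate, then translate an N x 3 template."""
    return scale * template @ rotation(yaw, pitch).T + np.asarray(shift)

def project(points):
    """Orthographic projections for two orthogonal views (an assumption)."""
    top = points[:, [0, 1]]   # camera looking along z
    side = points[:, [1, 2]]  # camera looking along x
    return top, side

def mean_nearest_distance(a, b):
    """Mean distance from each point in a to its nearest point in b."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return d.min(axis=1).mean()

def brute_force_fit(template, obs_top, obs_side, yaws, pitches, scales, shifts):
    """Try every parameter combination; return (best cost, best parameters)."""
    best = (np.inf, None)
    for yaw, pitch, scale, shift in itertools.product(
            yaws, pitches, scales, shifts):
        top, side = project(pose_points(template, yaw, pitch, scale, shift))
        # Both views contribute to one cost, so they are fitted jointly.
        cost = (mean_nearest_distance(top, obs_top)
                + mean_nearest_distance(side, obs_side))
        if cost < best[0]:
            best = (cost, (yaw, pitch, scale, shift))
    return best
```

The number of evaluations is the product of the grid sizes, so the cost of the search grows quickly with finer grids or wider ranges.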
Once the tracked rat head locations and best orientation and translation estimates are determined for each video frame, bright green contours and 2D red template projections are superimposed on top of the video for visualization purposes. (Video 1)
Suggestions for algorithm improvements:
Estimate the direction of the rat's motion and incorporate it into the brute-force algorithm in order to reduce the tracking error.
The software was tested under the assumption that the rat moves at a constant speed, which is not the case in practice. Incorporating speed changes would increase tracking accuracy.
Feature detection and matching can also provide more information on the location and orientation of the rat. For instance, the ORB (Oriented FAST and Rotated BRIEF) algorithm can be used to extract key points and descriptors from each video frame, match the descriptors between consecutive frames, and then estimate how the matched key points move from one frame to the next (Video 3).