What Do I Need?
- Any Dedicated or Virtual Server
What Is Background Subtraction?
Background subtraction is a major preprocessing step in many vision-based applications. For example, consider a visitor counter where a static camera counts the number of visitors entering or leaving a room, or a traffic camera extracting information about vehicles. In all these cases, you first need to extract the people or vehicles alone. Technically, you need to extract the moving foreground from the static background.
If you have an image of the background alone, like an image of the room without visitors or the road without vehicles, it's an easy job: just subtract the new image from the background, and you get the foreground objects alone. In most cases, though, you won't have such an image, so the background has to be estimated from whatever frames you do have. Things get more complicated still when objects cast shadows. Since a shadow moves too, simple subtraction marks it as foreground as well. Fortunately, several awesome algorithms have been developed for this purpose, and OpenCV implements a number of them in a way that's super easy to use.
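The "easy job" above can be sketched in a few lines of NumPy. This is only a toy illustration of naive frame differencing (the function name and threshold value are made up for the example), not one of the OpenCV subtractors covered below:

```python
import numpy as np

def simple_subtraction(background, frame, threshold=25):
    """Naive background subtraction: absolute difference + threshold.

    background, frame: uint8 grayscale images of the same shape.
    Returns a binary mask (255 = foreground, 0 = background).
    """
    # Cast to a signed type so the difference can't wrap around
    diff = np.abs(background.astype(np.int16) - frame.astype(np.int16))
    return np.where(diff > threshold, 255, 0).astype(np.uint8)

# Toy 5x5 "room": flat gray background, one bright "visitor" pixel
background = np.full((5, 5), 100, dtype=np.uint8)
frame = background.copy()
frame[2, 2] = 200                     # the visitor
mask = simple_subtraction(background, frame)
print(mask[2, 2], mask[0, 0])         # prints: 255 0
```

As the prose notes, this breaks down as soon as the background itself changes or shadows move, which is exactly what the algorithms below are designed to handle.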
BackgroundSubtractorMOG
It's a Gaussian mixture-based background/foreground segmentation algorithm, introduced in 2001 by P. KadewTraKuPong and R. Bowden in the paper 'An improved adaptive background mixture model for real-time tracking with shadow detection'. It models each background pixel by a mixture of K Gaussian distributions (K = 3 to 5). The weights of the mixture represent the time proportions that those colors stay in the scene. The probable background colors are the ones that stay longer and are more static.
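The weight rule can be sketched with a toy example. The function name, the example weights, and the 0.7 ratio below are all made up for illustration; the real subtractor maintains and updates these weights online per pixel, and also factors in each component's variance:

```python
import numpy as np

def background_components(weights, background_ratio=0.7):
    """Pick which Gaussian components model the background.

    Components are ranked by weight (the time proportion their color
    stays in the scene); the top-ranked components whose cumulative
    weight first exceeds `background_ratio` are treated as background.
    """
    order = np.argsort(weights)[::-1]                     # most persistent first
    cumulative = np.cumsum(np.asarray(weights)[order])
    n_background = np.searchsorted(cumulative, background_ratio) + 1
    return order[:n_background]

# Three components for one pixel: a long-lived road color (weight 0.6),
# a frequent tree-shadow tone (0.3), and a fleeting car color (0.1).
print(background_components([0.6, 0.3, 0.1]))   # prints: [0 1]
```

The fleeting car color never accumulates enough weight to count as background, so pixels matching it are reported as foreground.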
While coding, you'll need to create a background subtractor object using the function cv2.bgsegm.createBackgroundSubtractorMOG(). It takes some optional parameters, such as history length, number of Gaussian mixtures, and threshold, all of which have sensible defaults. Then, inside the video loop, call the subtractor's apply() method on each frame to get the foreground mask.
```python
import numpy as np
import cv2

cap = cv2.VideoCapture('hollywood-people-walking-around.mp4')
fgbg = cv2.bgsegm.createBackgroundSubtractorMOG()

while True:
    ret, frame = cap.read()
    if not ret:          # end of video
        break
    fgmask = fgbg.apply(frame)
    cv2.imshow('frame', fgmask)
    k = cv2.waitKey(30) & 0xff
    if k == 27:          # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```
BackgroundSubtractorMOG2
It's also a Gaussian mixture-based background/foreground segmentation algorithm, based on two papers by Z. Zivkovic: 'Improved adaptive Gaussian mixture model for background subtraction' (2004) and 'Efficient Adaptive Density Estimation per Image Pixel for the Task of Background Subtraction' (2006). One important feature of this algorithm is that it selects the appropriate number of Gaussian distributions for each pixel, whereas the previous algorithm used a fixed K throughout. This gives better adaptability to scenes that vary due to illumination changes, etc.
As in the previous case, you have to create a background subtractor object. Here you have the option of selecting whether shadows should be detected. If detectShadows = True, the algorithm detects and marks shadows, at some cost in speed. Shadows are marked in gray.
```python
import numpy as np
import cv2

cap = cv2.VideoCapture('hollywood-people-walking-around.mp4')
fgbg = cv2.createBackgroundSubtractorMOG2()

while True:
    ret, frame = cap.read()
    if not ret:          # end of video
        break
    fgmask = fgbg.apply(frame)
    cv2.imshow('frame', fgmask)
    k = cv2.waitKey(30) & 0xff
    if k == 27:          # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```
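Because shadows are marked in gray (127 is OpenCV's default shadow value) while true foreground is 255, you can strip them with a plain NumPy comparison. The helper name below is made up for illustration:

```python
import numpy as np

def remove_shadows(fgmask):
    """Drop gray shadow pixels from a MOG2 mask.

    MOG2 marks foreground as 255, shadows as a gray value (127 by
    default in OpenCV), and background as 0. Keeping only the
    full-intensity pixels yields a strictly binary mask.
    """
    return np.where(fgmask == 255, 255, 0).astype(np.uint8)

# Toy mask: one background, one shadow, one foreground pixel
fgmask = np.array([[0, 127, 255]], dtype=np.uint8)
print(remove_shadows(fgmask))   # prints: [[  0   0 255]]
```

Alternatively, a fixed binary threshold above the shadow value achieves the same thing.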
BackgroundSubtractorGMG
This algorithm is really awesome because it's able to detect and differentiate shadows. It combines statistical background image estimation and per-pixel Bayesian segmentation. It was introduced by Andrew B. Godbehere, Akihiro Matsukawa, and Ken Goldberg in their 2012 paper 'Visual Tracking of Human Visitors under Variable-Lighting Conditions for a Responsive Audio Art Installation'. As per the paper, the system ran a successful interactive audio art installation called 'Are We There Yet?' from March 31 to July 31, 2011 at the Contemporary Jewish Museum in San Francisco, California.
It uses the first 120 frames, by default, for background modeling. It employs a probabilistic foreground segmentation algorithm that identifies possible foreground objects using Bayesian inference. The estimates are adaptive: newer observations are weighted more heavily than older ones to accommodate variable illumination. Several morphological filtering operations, such as closing and opening, are applied to remove unwanted noise. You'll get a black window during the first few frames while the model is being built. It's preferable to apply morphological opening to the result in order to remove the noise.
```python
import numpy as np
import cv2

cap = cv2.VideoCapture('hollywood-people-walking-around.mp4')
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
fgbg = cv2.bgsegm.createBackgroundSubtractorGMG()

while True:
    ret, frame = cap.read()
    if not ret:          # end of video
        break
    fgmask = fgbg.apply(frame)
    # Opening removes small speckles of noise from the mask
    fgmask = cv2.morphologyEx(fgmask, cv2.MORPH_OPEN, kernel)
    cv2.imshow('frame', fgmask)
    k = cv2.waitKey(30) & 0xff
    if k == 27:          # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```
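To see what that opening step is actually doing, here's a NumPy-only sketch of binary opening with a 3x3 square element (OpenCV uses an elliptical element above, and its implementation is far more general; the function names here are made up for the example):

```python
import numpy as np

def _shift_stack(mask):
    """Stack each pixel's 3x3 neighborhood (zero-padded at the border)."""
    padded = np.pad(mask, 1)
    h, w = mask.shape
    return np.stack([padded[i:i + h, j:j + w]
                     for i in range(3) for j in range(3)])

def opening(mask):
    """Binary morphological opening with a 3x3 square element:
    erosion (neighborhood minimum) followed by dilation (neighborhood
    maximum). Speckles smaller than the element are erased, while
    larger blobs survive with their shape restored."""
    eroded = _shift_stack(mask).min(axis=0)
    return _shift_stack(eroded).max(axis=0)

# A 7x7 mask: a solid 3x3 foreground blob plus one pixel of noise
mask = np.zeros((7, 7), dtype=np.uint8)
mask[2:5, 2:5] = 255   # real foreground blob
mask[0, 6] = 255       # single-pixel noise
cleaned = opening(mask)
# The lone noise pixel is gone; the 3x3 blob is preserved intact.
```

This is why the GMG script above looks much cleaner with the morphologyEx call than without it.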
Using these techniques, it's possible to start building applications that successfully separate foreground and background elements. OpenCV has matured a great deal over the past couple of years and now includes more than six background segmentation algorithms as standard. Be sure to check out the Python scripts included with this how-to guide. If you want to try your own videos with the scripts, just change cv2.VideoCapture('your-video-name.mp4') to match the name of the video you want to process.
python3 backgroundSubtractorMOG.py
python3 backgroundSubtractorMOG2.py
python3 backgroundSubtractorGMG.py