Computer vision cheat sheet
Computer vision is a broad field with many algorithms, and it can be hard to tell which one best fits a given problem. This cheat sheet aims to be a guide to what to use.
What this isn't
- The bleeding edge. CV moves very quickly. I haven't always heard about the newest bestest way to analyze $YOUR_FAVORITE_THING. Email me and I'll take it under consideration though!
- The best way. "Best" often isn't even well-defined. Preprocessing time, CPU usage, memory usage, accuracy, and simplicity are all axes you might weigh differently for different applications. All else being equal, I will try to show the simplest way to implement a solution.
- Comprehensive. There are many, many areas in CV. Covering them all is beyond the scope of this guide.
What this is
- An illustration of what is possible. It's not always obvious that a task can be done, let alone which magic words to Google. I want to provide an overview of what is out there.
- Translation. Often you know what you want but have no way of knowing that it's called, say, histogram backprojection. This should help bridge that gap.
- A living document. I'll be adding entries, updating existing ones, and adding examples. If you have suggestions for more, let me know!

| I want to... | You should use... |
| --- | --- |
| Detect lines in an image | Hough transform |
| Detect circles in an image | Circular Hough transform |
| Find images similar to the one I have | Perceptual hashing (pHash, dHash) |
| Find images containing a particular object | Feature detector: ORB, SIFT, SURF, KAZE/AKAZE |
| Find images containing any object of a particular type | Haar cascade, convolutional neural net |
| Track an object in a video | Mean shift, CAMShift |
| Detect faces in an image | Haar cascade |
| Detect people in an image | Histogram of Oriented Gradients (HOG) people detector |
| Remove defects (like passers-by) from an image | Median filter over several shots of the same scene |
| Detect motion in a video | Background subtraction + frame differences |
| Read words or numbers in an image | Optical character recognition (OCR) |
| Classify what an image contains (e.g. horse, Golden Gate Bridge, human) | Deep learning, especially convolutional neural nets |
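
To make the "Find images similar to the one I have" entry more concrete, here is a minimal sketch of a difference hash (dHash) in plain Python. It assumes the image is already a grayscale list of rows. The nearest-neighbor `resize_nearest` helper and the toy `gradient` images are my own stand-ins for illustration; a real implementation would normally load and downscale images with an imaging library and a smoother resize.

```python
def resize_nearest(pixels, new_w, new_h):
    """Shrink a grayscale image (list of rows) by nearest-neighbor sampling.

    Hypothetical helper: a crude stand-in for a proper smoothing resize.
    """
    old_h, old_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


def dhash(pixels, hash_size=8):
    """Return a (hash_size * hash_size)-bit difference hash as an int."""
    # One extra column so each row yields hash_size left/right comparisons.
    small = resize_nearest(pixels, hash_size + 1, hash_size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            # Each bit records whether brightness drops from left to right.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming_distance(a, b):
    """Count differing bits; a small distance suggests similar images."""
    return bin(a ^ b).count("1")


# Toy data: a 32x32 horizontal gradient, a brighter copy, and a mirror image.
gradient = [[x * 8 for x in range(32)] for _ in range(32)]
brighter = [[p + 5 for p in row] for row in gradient]
mirrored = [row[::-1] for row in gradient]

print(hamming_distance(dhash(gradient), dhash(brighter)))  # 0
print(hamming_distance(dhash(gradient), dhash(mirrored)))  # 64
```

Because the hash only records whether brightness rises or falls between neighboring pixels, a uniform brightness change leaves it untouched, while mirroring the image flips every bit. What counts as "similar enough" is a threshold you would have to tune for your own images.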
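
For the "Remove defects (like passers-by) from an image" entry, one common reading is a median taken per pixel across several shots of the same scene. The sketch below assumes a fixed camera, aligned grayscale frames stored as lists of rows, and that each pixel shows the background in more than half of the frames. The `temporal_median` name and the toy frames are illustrative, not taken from any library.

```python
from statistics import median


def temporal_median(frames):
    """Per-pixel median over equally sized grayscale frames (lists of rows)."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


# Toy scene: a flat background of brightness 100, 3 rows by 4 columns.
background = [[100] * 4 for _ in range(3)]

# Three shots, each with a dark "passer-by" occupying a different column.
frames = []
for col in range(3):
    frame = [row[:] for row in background]
    for y in range(3):
        frame[y][col] = 0
    frames.append(frame)

cleaned = temporal_median(frames)
print(cleaned == background)  # True
```

Each pixel is covered by the passer-by in at most one of the three frames, so the median picks the background value every time. With real photos you would want more frames and some way of aligning them first.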