Installing OpenCV in a virtualenv

There are a handful of answers strewn across the internet about installing OpenCV’s Python bindings, but none of them seem to apply to installing them in a virtualenv in Linux. In the interest of collecting all that information in one place, here’s what I did to get it running. I’m using:

Ubuntu 14.04 64 bit
Python 2.7.6
OpenCV 2.4.9.0

This will install OpenCV, the Python bindings, and Numpy system-wide, but afterward you will be able to use them inside a virtualenv. This assumes you have pip, virtualenv, and virtualenvwrapper installed and properly configured. If you aren’t familiar with these, Googling yields many resources, for example this tutorial.

First, install OpenCV’s dependencies, per the installation instructions. (Some of these were pre-installed on my system)

$ sudo apt-get install build-essential cmake libgtk2.0-dev pkg-config python-dev libavcodec-dev libavformat-dev libswscale-dev

The next part tripped me up for a little bit. OpenCV doesn’t play particularly well with virtualenvs, so numpy needs to be installed for the system Python:

$ sudo pip install numpy

After that, continue to build OpenCV per the instructions. Download the source (I’m using version 2.4.9.0, from here) and unzip it in the directory of your choice.

$ unzip opencv-2.4.9.zip
$ cd opencv-2.4.9
$ mkdir build
$ cd build

Configure the make files using cmake. There is a flag required for the Python bindings that I couldn’t find in the official documentation, only in StackOverflow questions: BUILD_NEW_PYTHON_SUPPORT. Also note the two trailing periods.

$ cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D BUILD_NEW_PYTHON_SUPPORT=ON ..

This will output a lot of text, but if you scroll up you should find a section referring to Python. It will refer to the system Python binary; this is fine, since we will set up our virtualenv later.

After the build is configured, it’s time to make the project. This will take a few minutes to run.

$ make
$ sudo make install

After the build completes, you need to set up your virtualenv if you haven’t already. Numpy also needs to be installed in the virtualenv.

$ mkvirtualenv opencv
[...]
(opencv) $ pip install numpy

Now that our virtualenv is ready to go, we just need to copy the cv2.so binary from the build’s lib directory into the virtualenv’s site-packages directory.

(opencv) $ cd lib
(opencv) $ cp cv2.so ~/.virtualenvs/opencv/lib/python2.7/site-packages

Everything should be good to go:

(opencv) $ python
>>> import cv2
>>>
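For a slightly more thorough check, a short snippet like this (purely an illustrative sanity check, not part of the build) exercises cv2 and numpy together inside the virtualenv:

# Sanity check: exercise cv2 and numpy together inside the virtualenv.
import cv2
import numpy as np

print(cv2.__version__)  # should report the version you built (2.4.9 here)

# Build a small test image with numpy and push it through an OpenCV call.
img = np.zeros((100, 100, 3), dtype=np.uint8)
cv2.rectangle(img, (20, 20), (80, 80), (0, 255, 0), 2)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(gray.shape)  # (100, 100)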

If you get ImportError: No module named cv2, double-check that you had the BUILD_NEW_PYTHON_SUPPORT flag set and that numpy is installed for the system Python. If you get the error message

ImportError: numpy.core.multiarray failed to import

you need to install numpy in your virtualenv. From inside the virtualenv,

(opencv) $ pip install numpy

should fix that. Good luck with your computer vision!

New host

Fox row was down for a while due to some issues with the old web host. It’s on a new and improved hosting service now, so there shouldn’t be any more problems.

Sweeping the one-seeds

The Bracket is out, and once Virginia was announced as the final 1-seed, I realized Wisconsin may be in a unique position. The Badgers beat both Florida and Virginia during the regular season, and have the chance to beat both Arizona and Wichita State in the tournament. If they do, they will have beaten all four 1-seeds this season. I had to find out if anyone ever had. There have been 116 1-seeds since the field expanded to 64 in 1985. Naturally, the 1-seeds themselves are excluded, since a team can’t beat itself.

It turns out there have only been 2 teams with 4 wins against 1-seeds in the same year, and they’re both Arizona squads. In 1997, the Wildcats beat North Carolina twice, Kansas, and Kentucky en route to the national championship. They never played Minnesota, the other 1-seed. In an even crazier 2001 season, they came within a game of doing it: they split 2 regular-season games against Illinois before beating them in the tournament, split regular-season games against Stanford, and beat Michigan State in the Final Four. The last 1-seed was Duke, who they lost to in the National Championship. They played 7 games against the 1-seeds that year!

There have been 7 teams with 3 wins against 1-seeds (what was it with the Wildcats those 4 years?):
1985 Illinois
1985 Georgetown
1986 Duke
1991 Duke
1992 Southern California
1992 Indiana
2000 Arizona

The distribution falls off rapidly after that: there have been 64 teams with 2 wins against 1-seeds and 352 with 1 win.
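Tallying this sort of thing boils down to a simple count over a game log. Purely as an illustration (assuming a hypothetical games.csv with season, winner, and a flag for whether the loser was a 1-seed — not how the Sports-Reference numbers were actually pulled), the count would look something like:

# Illustrative tally of wins over 1-seeds per team-season, assuming a
# hypothetical games.csv with columns: season, winner, loser, loser_was_1seed.
# This is not how the numbers above were produced.
import csv
from collections import Counter

wins_vs_one_seeds = Counter()
with open('games.csv') as f:
    for row in csv.DictReader(f):
        if row['loser_was_1seed'] == '1':
            wins_vs_one_seeds[(row['season'], row['winner'])] += 1

# Teams that beat 1-seeds four or more times in a single season
for (season, team), n in wins_vs_one_seeds.items():
    if n >= 4:
        print(season, team, n)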

The all-time leaders contain no surprises. If we assume that most 1-seeds come from the historically “power” conferences, then being in one of those conferences provides more opportunities over the course of a season to play them. Going deep in the tournament doesn’t hurt either.

Duke 26
Arizona 18
North Carolina 17
Indiana 16
Kansas 16
Maryland 16

On the flip side, Wake Forest has the dubious honor of most losses to 1-seeds in a year. In 2002 they also played 7 games against the eventual 1-seeds. They went 0-7, losing to Cincinnati and Kansas once each, Maryland twice, and Duke 3 times. There have been 16 5-loss teams and 73 4-loss teams. In terms of all-time futility, NC State takes the cake. Since 1985, they have a paltry .114 winning percentage against 1-seeds:

NC State 62
Virginia 56
Georgia Tech 54
Clemson 53
Michigan 52

So if the Badgers pull it off, they’ll be the first in the 64+ team era to do so.

Stats courtesy Sports-Reference.

Pypeline update

It’s been a while since I’ve posted any updates for pypeline. I’ve recently been getting familiar with OpenCV, which has a great feature recognition API, so I’ve ditched PIL/pillow in favor of OpenCV, and it’s looking promising. I was also getting sick of dealing with the .NEF files output by my camera, so I’ve switched over to using Dave Coffin’s dcraw, which is great for converting just about any RAW filetype into TIFF. The code has undergone substantial changes, but it’s still available publicly at github. The only downside is that OpenCV is currently only compatible with Python 2, so I guess py3k is out for now. Now I just need to get some decent photos so I can start working with them…
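As a rough sketch of the new conversion step (the real code lives in the pypeline repo; the filename here is just a placeholder), dcraw can be driven from Python and handed off to OpenCV like this:

# Sketch of the NEF -> TIFF -> OpenCV hand-off (illustrative only; the real
# code lives in the pypeline repo, and the filename here is a placeholder).
import subprocess
import cv2

def nef_to_tiff(nef_path):
    # dcraw's -T flag writes a TIFF next to the input file
    subprocess.check_call(['dcraw', '-T', nef_path])
    return nef_path.rsplit('.', 1)[0] + '.tiff'

img = cv2.imread(nef_to_tiff('DSC_0001.NEF'))
print(img.shape)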

In a tweeting world

For the fun of it, I made a twitter bot. It searches for usages of the phrase “in a world where” and combines them. The results are occasionally comedic, poignant, and nonsensical.

I have found that tweets occasionally get parsed incorrectly, and there are a few phrases I’ve seen repeated. Apparently the pool of “in a world” tweets isn’t all that large. It’s a little hit-or-miss, but it turns up good ones from time to time. I built it with Python and tweepy. It wasn’t difficult with the tweepy API, and the learning experience was fun. You can follow it at @TweetInAWorld.
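The bot’s actual source isn’t shown here, but the core of something like it with tweepy is roughly this (the keys and the recombination step are illustrative placeholders, not the bot’s real code):

# Sketch of an "in a world where" mashup bot with tweepy. The keys and the
# recombination logic are placeholders, not the actual bot's source.
import random
import tweepy

auth = tweepy.OAuthHandler('CONSUMER_KEY', 'CONSUMER_SECRET')
auth.set_access_token('ACCESS_TOKEN', 'ACCESS_SECRET')
api = tweepy.API(auth)

phrase = 'in a world where'
results = api.search(q='"%s"' % phrase)
tweets = [t.text for t in results if phrase in t.text.lower()]

# Glue the start of one tweet to the end of another at the shared phrase.
first, second = random.sample(tweets, 2)
head = first.lower().split(phrase, 1)[0]
tail = second.lower().split(phrase, 1)[1]
api.update_status(head + phrase + tail)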

Pypeline update

I’ve pushed some changes to the pypeline repository, adding basic stacking functionality. There isn’t any registration; it only takes the median of each channel (R, G, B) for each pixel. It’s currently way slow, but I suspect there is substantial room for improvement there. Wrangling NEF files has proven more difficult than I anticipated, so currently the state of the art in pypeline is JPGs.
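At its core the stacking boils down to a per-pixel median over the frames. A minimal numpy/OpenCV sketch of that idea (not the actual pypeline code; the glob pattern is a placeholder and no registration is done) looks something like this:

# Per-pixel, per-channel median stack (a sketch of the idea, not the actual
# pypeline code; the glob pattern is a placeholder and no registration is done).
import glob
import cv2
import numpy as np

frames = [cv2.imread(p) for p in sorted(glob.glob('shots/*.jpg'))]
stack = np.array(frames)                       # shape: (n_frames, H, W, 3)
result = np.median(stack, axis=0).astype(np.uint8)
cv2.imwrite('stacked.jpg', result)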

My camera is rated down to 32°F, and nightly lows have been around 0°F, so I’m scared to take it out into the elements. On the plus side, the stacking works with regular images too! Any particular pixel just needs to have the “right” value for at least half the shots.

Inputs:

[Input images: DSC_0031, DSC_0030, DSC_0029, DSC_0028, DSC_0027]

And the stacked result:
[Image: stacked output]

There is a little ghosting, but it’s quite good considering, I’d say. I am not sure how to get rid of that entirely. More pictures should quash the error, but at 5 I would have thought it would wipe out any traces of the marker. Also, a better algorithm should be able to push the > 50% requirement down to only a plurality. Maybe with some sort of clustering of values? I’m also taking the median of each channel independently; maybe a better way would be to use luminance. In any case, baby steps!

Visual cryptography

Inspired by this post at DataGenetics, I implemented a couple of quick-and-dirty scripts in Python to test it out. The first takes an input image and iterates over it pixel by pixel, splitting it into two output images. Ideally, the pixels are assigned randomly, so it is impossible to recover the original without both outputs. The second combines the outputs to recover the original. Here’s an example of it in action:

Original image:

Intermediate images (hopefully look like static):

Final output:

Not perfect, but it is definitely recognizable. The idea can apparently be extended to 9×9 (and 16×16, and 25×25… I presume) images, for a more widely shared secret. In any case, this scheme should make it possible for any number of people to share a secret that none of them can recover individually. I uploaded the code here on github.
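That code is the reference; just to sketch the idea, here is a minimal version of the 2-out-of-2 scheme described in the DataGenetics post. It assumes a black-and-white input, the filenames are placeholders, and it is not the exact script from the repo:

# Minimal sketch of the 2-out-of-2 scheme from the DataGenetics post, not the
# exact script in the repo. Assumes a black-and-white input; filenames are
# placeholders.
import random
import numpy as np
import cv2

PATTERNS = [np.array([[0, 255], [255, 0]], np.uint8),
            np.array([[255, 0], [0, 255]], np.uint8)]

secret = cv2.imread('secret.png', 0)                      # load as grayscale
_, secret = cv2.threshold(secret, 128, 255, cv2.THRESH_BINARY)

h, w = secret.shape
share1 = np.zeros((2 * h, 2 * w), np.uint8)
share2 = np.zeros((2 * h, 2 * w), np.uint8)

for y in range(h):
    for x in range(w):
        p = random.choice(PATTERNS)
        share1[2*y:2*y+2, 2*x:2*x+2] = p
        # White pixel: same pattern in both shares. Black pixel: the complement.
        share2[2*y:2*y+2, 2*x:2*x+2] = p if secret[y, x] == 255 else 255 - p

# Overlaying the shares keeps the darker value at each subpixel.
combined = np.minimum(share1, share2)
cv2.imwrite('share1.png', share1)
cv2.imwrite('share2.png', share2)
cv2.imwrite('combined.png', combined)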