1. Software for computer vision

Pyramid Match

Feature Extraction for LIBPMK

November 2008

This package provides a framework for detecting interest points and extracting descriptors, mostly for images, but it is extendable to just about anything (such as videos or text). In the image domain, it provides further capabilities for automatically detecting image types and converting between formats so that particular detector/extractor implementations need not worry about low-level details. This library is written as an extension to LIBPMK and fully integrates with it.
» Project page and source code
Implicit Shape Model

Object Localization with an Implicit Shape Model

MIT Vision Interfaces group; Spring 2008

This is an implementation of the Leibe et al. implicit shape model, which is an algorithm for detecting and localizing instances of an object in a large image. We used this as a baseline in a recent NIPS submission (details coming shortly if it gets in) to detect instances of books in a cluttered environment. This library is written as an extension of LIBPMK.
» Project page and source code
Spatial PMK

Object Recognition with the Spatial PMK

MIT Vision Interfaces group; Fall 2007

This is an implementation of the Lazebnik et al. spatial pyramid match, which is essentially a uniform PMK on spatial features that have been quantized by appearance. This method was shown to perform well on image databases where object classes have some spatial consistency or where the object is well-localized. This library is written as an extension of LIBPMK.
» Project page and source code
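
To give a feel for the spatial pyramid match described above, here is a minimal Python sketch. It is not taken from the library and does not use LIBPMK's API. It assumes each feature is a hypothetical `(x, y, word)` tuple with coordinates normalized to [0, 1) and `word` being the index of the quantized appearance cluster, and it uses the level weights from the Lazebnik et al. formulation.

```python
from collections import Counter


def spatial_histogram(features, level):
    """Count features per (word, cell) on a 2^level x 2^level grid."""
    cells = 2 ** level
    hist = Counter()
    for x, y, word in features:
        # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
        cx = min(int(x * cells), cells - 1)
        cy = min(int(y * cells), cells - 1)
        hist[(word, cx, cy)] += 1
    return hist


def intersection(h1, h2):
    """Histogram intersection: sum of per-bin minimum counts."""
    return sum(min(n, h2[key]) for key, n in h1.items())


def spatial_pyramid_match(a, b, max_level):
    """Weighted sum of histogram intersections over grid levels 0..max_level."""
    total = 0.0
    for level in range(max_level + 1):
        if level == 0:
            weight = 1 / 2 ** max_level
        else:
            weight = 1 / 2 ** (max_level - level + 1)
        total += weight * intersection(
            spatial_histogram(a, level), spatial_histogram(b, level)
        )
    return total


image_a = [(0.1, 0.1, 0), (0.9, 0.9, 1)]
image_b = [(0.2, 0.2, 0), (0.1, 0.9, 1)]
print(spatial_pyramid_match(image_a, image_b, 2))  # 1.25
```

Finer grid levels get larger weights, so two features with the same word only score highly if they also land in the same region of the image.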
Adaptive Vocabulary Forests

Adaptive Vocabulary Forests for Recognition and Retrieval

MIT Vision Interfaces group; ICCV 2007

This is an implementation of adaptive vocabulary forests, which is joint research that I submitted to ICCV 2007. It is based on LIBPMK and implements vocabulary trees on top of it, along with some toy demos. It includes the source code we used for our ICCV demo, in which photos were taken with my camera phone and uploaded to a laptop running a tree server in the background, so that any client could upload photos and perform searches. The package contains over 15,000 lines of code!
» Project page, source code, and paper (PDF)
Pyramid Match

LIBPMK: A Pyramid Match Toolkit

Fall 2006-2007, MIT Vision Interface Lab

LIBPMK provides a fast C++ implementation of the Pyramid Match algorithm, as well as a flexible framework with which users can easily and quickly run experiments. The library includes a lot of built-in functionality written from scratch, such as k-means and hierarchical clustering, handling data sets too large to fit in memory, creating multi-resolution histograms, and performing fast pyramid matches. The experimental framework wraps around LIBSVM to provide an easy way to train and test SVMs.
» Project page (includes documentation and C++ source code)
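
For readers who have not seen the pyramid match before, the following Python sketch shows the basic idea on tiny point sets. It is an illustration only, not LIBPMK code: the function names are made up here, bins are uniform with side length doubling at each level, and the weight 1/2^i on matches first found at level i is an assumption taken from the standard Grauman and Darrell formulation.

```python
from collections import Counter


def histogram(points, side):
    """Count points per grid cell, where each cell has the given side length."""
    return Counter(tuple(int(c // side) for c in p) for p in points)


def intersection(h1, h2):
    """Histogram intersection: sum of per-bin minimum counts."""
    return sum(min(n, h2[b]) for b, n in h1.items())


def pyramid_match(xs, ys, levels):
    """Sum the new matches found at each level, weighted by 1 / bin side."""
    score = 0.0
    previous = 0
    for i in range(levels):
        side = 2 ** i
        current = intersection(histogram(xs, side), histogram(ys, side))
        # Only matches that did not already exist at a finer level count here.
        score += (current - previous) / side
        previous = current
    return score


xs = [(0.5,), (3.2,)]
ys = [(0.7,), (5.1,)]
print(pyramid_match(xs, ys, 4))  # 1.125
```

In this example one pair of points matches at the finest level (weight 1) and the other pair only matches once the bins have side 8 (weight 1/8), giving 1.125.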
Optical flow of rotating sphere

Optical Flow: Motion Field and Focus of Expansion

Fall 2005, MIT 6.866 Machine Vision project

This project is an implementation of an iterative method for computing optical flow. Its input is a movie file in any format playable by mplayer (most things should work). The program will overlay the estimated motion field on a grayscale version of the original video. In the case of translational motion along the z-axis (the camera zooming in and out), you can also optionally have it estimate the focus of expansion and draw a dot there. One of my secondary goals was to make memory usage efficient so it scales well with the length of the input movie.
» source code
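
As a rough illustration of the focus-of-expansion step described above, here is a small Python sketch. It is not the project's code, and the approach is an assumption: under pure translation every flow vector (u, v) at pixel (x, y) points directly away from the focus of expansion (x0, y0), so each vector gives the linear constraint v*x0 - u*y0 = v*x - u*y, and the constraints can be solved by least squares. The function name and the sample format are hypothetical.

```python
def estimate_foe(samples):
    """Least-squares focus of expansion from (x, y, u, v) flow samples."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for x, y, u, v in samples:
        # Constraint row is (v, -u) with right-hand side r.
        r = v * x - u * y
        a11 += v * v
        a12 += -u * v
        a22 += u * u
        b1 += v * r
        b2 += -u * r
    # Solve the 2x2 normal equations with Cramer's rule.
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("flow vectors are all parallel; FOE is not determined")
    x0 = (b1 * a22 - a12 * b2) / det
    y0 = (a11 * b2 - a12 * b1) / det
    return x0, y0


# Synthetic flow field expanding away from the point (3, 2).
flow = [
    (5.0, 2.0, 2.0, 0.0),
    (4.0, 5.0, 0.5, 1.5),
    (0.0, 0.0, -3.0, -2.0),
    (6.0, 1.0, 6.0, -2.0),
]
foe = estimate_foe(flow)
print(foe)
```

With real, noisy flow estimates it would probably make sense to weight each constraint by the magnitude or confidence of its flow vector.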

2. Other stuff

Bluetooth

A Remote Slide Advancer for your Nokia S60 Phone

Just for fun

Since both my mobile phone and my laptop have Bluetooth, I made a little clicker that lets me advance slides using the arrow keys on my cell phone. I've used this while giving talks rather than buying a separate slide advancer. From the back, my phone (Nokia N95) actually looks like a camera, so it occasionally raises eyebrows when people see me pressing buttons on it. You can also program any of the keypad buttons to execute an arbitrary command; I once used it to start movies in the middle of a presentation.
» Source code
Pong

Real Life Pong

Spring 2007, MIT 6.883

This is an implementation of the game Pong where the paddles are controlled by moving in the physical world. You run around with a GPS device and a cell phone. The game is displayed on the cell phone, and as you move around, your paddle moves with you. The GPS device and the phone communicate with each other over Bluetooth, and the phone talks to a game server over Bluetooth or Wi-Fi. When indoors, the GPS device can be replaced with a Cricket indoor location sensor. The game can also be displayed by a projector mounted on the ceiling and pointed at the floor, so you can actually move around in the game world.
» Source code
Two spinning tori

Motion Description Language Interpreter

Spring 2002, MCS6, Stuyvesant High School

This was my final project for MCS6, Computer Graphics. We defined a simple motion description language (MDL) and wrote a script parser that takes a sequence of commands and generates pretty images and animations. It can generate a number of 3D shapes (box, sphere, torus) or arbitrary polygons. It implements Gouraud shading for the lighting effects. A number of sample scripts are included in the tarball. Our group also made a web-based interface to the MDL renderer and a ray tracer.
» Source code and demos
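
For anyone curious about the Gouraud shading mentioned above, the core idea is to compute lighting only at the vertices of a polygon and then interpolate the resulting colors across its interior. The Python sketch below illustrates that interpolation for a single 2D triangle using barycentric weights. It is not code from the project; the function names and the RGB tuples are made up for the example.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    denom = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if denom == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / denom
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / denom
    return wa, wb, 1.0 - wa - wb


def gouraud_color(p, vertices, colors):
    """Blend the per-vertex colors at point p inside the triangle."""
    weights = barycentric(p, *vertices)
    return tuple(
        sum(w * color[i] for w, color in zip(weights, colors))
        for i in range(3)
    )


triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
vertex_colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
corner = gouraud_color((0.0, 0.0), triangle, vertex_colors)
center = gouraud_color((4.0 / 3.0, 4.0 / 3.0), triangle, vertex_colors)
print(corner, center)
```

At a vertex the result is exactly that vertex's color, and at the centroid the three colors are mixed equally.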
Liar's Poker

Liar's Poker for the TI-89

Fall 2002, Stuyvesant High School

We liked playing Liar's Poker instead of paying attention to Dr. Majewski in Physics C, so when he banned cards from the classroom, I made a TI-89 version. We would pass the calculator around, so it looked like we were doing work when we were actually playing 6-player games of Liar's Poker.
» Source code