EdgeDetection  
EdgeDetection Demo
Updated Feb 4, 2012 by ri...@cs.utexas.edu

Introduction

The first step in many vision systems is to detect the edges in the image (see section 24.2.1 of AIMA). Given an image, an edge detection algorithm determines which pixels lie along the boundary between two different elements, such as an agent's head and a distant mountain. The edge maps and histograms can then be classified into different objects. In this demo you will drive a NERO agent around the environment and take snapshots of different objects in it. Each snapshot will then be processed through smoothing and edge detection; building a system that labels these objects is left as an exercise.

The video below shows how it works; you can also run OpenNERO yourself to test it interactively.

Running the Demo

To run the demo,

  1. Build OpenNERO using the cs343vision2 branch.
  2. Install the Python Imaging Library by downloading the latest version (1.1.7) for your OS (Linux, Windows, or Mac OS X).
  3. Start the NERO mod.
  4. Click the First Person Agent button.
  5. Position your agent with the W, A, S, and D keys.
  6. When you have an object in sight, press the Snapshot button.

Each snapshot will automatically be processed by OpenNERO and the results will be displayed in a four-panel window. Note that the default version included in the demo is a slower, more canonical implementation of the edge detection algorithm; it may take up to 2 minutes to process an image, depending on the speed of your computer. An optimized version is available in show_image_fast.py and should complete its analysis in about 10 seconds, but requires numpy and scipy to be installed.

Four-Panel Window Results

Each time you take a snapshot, it is processed using a method similar to that in section 24.2.1 of the AIMA (third edition) textbook. First, the brightness is extracted from the original color image, resulting in a gray-scale image. Next, the image is convolved with a Gaussian filter, resulting in a smoothed image. The smoothing step is necessary to detect edges more reliably, because it suppresses pixel-level noise that would otherwise show up as spurious gradients. The last step is to calculate the brightness gradient for each pixel in the smoothed image. When the gradient is high, there is an edge through that pixel. The gradients are thresholded, resulting in a map of edges in the image. The results of these four processing steps are displayed together in a four-panel window that pops up. Below is a description of each panel.

Color

The upper left panel displays the raw, color snapshot taken.

Black and White

The upper right panel displays the color snapshot converted to black and white, that is, a gray-scale image holding only the brightness of each pixel.
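The brightness extraction can be sketched in pure Python as follows. This is an illustration only: the demo's actual conversion call is not shown on this page, and the helper names `luminance` and `to_grayscale` are hypothetical. The weights are the ITU-R 601-2 luma weights that PIL documents for converting a color image to mode "L"; PIL's own rounding may differ slightly from the integer division used here.

```python
def luminance(rgb):
    """Return an 8-bit brightness value for one (R, G, B) pixel.

    Assumption: ITU-R 601-2 luma weights (299, 587, 114 per thousand).
    """
    r, g, b = rgb
    return (r * 299 + g * 587 + b * 114) // 1000


def to_grayscale(pixels):
    """Convert a 2-D list of (R, G, B) tuples into a 2-D list of brightness values."""
    return [[luminance(p) for p in row] for row in pixels]


# A tiny image with one row: pure red, pure green, pure blue.
gray = to_grayscale([[(255, 0, 0), (0, 255, 0), (0, 0, 255)]])
print(gray)
```

Green contributes the most to perceived brightness and blue the least, which is why the three pure colors map to quite different gray levels.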

Smoothed

The lower left panel displays the black and white image after a Gaussian filter has been applied to smooth it and prepare it for edge detection. The code to perform this smoothing is found in show_image.py and excerpted below:

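import itertools

# Apply Gaussian filter
# (bw is the gray-scale PIL image; gaussian() is not part of this excerpt)
sigma = 1
width, height = bw.size
bw_pix = bw.load()
convolution = [ [0 for col in range(bw.size[1])] for row in range(bw.size[0])]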
# Loop over every pixel in the image and apply the Gaussian filter
for (x,y) in itertools.product(range(bw.size[0]), range(bw.size[1])):
    gauss_totals = 0
    # For each neighboring pixel
    for (u, v) in itertools.product(range(-3*sigma + x, 3*sigma + 1 + x), range(-3*sigma + y, 3*sigma + 1 + y)):
        if u < 0 or v < 0 or u >= width or v >= height:
            continue
        gauss_totals += gaussian(x-u,y-v,sigma)
        convolution[x][y] += (bw_pix[u,v]) * gaussian(x-u,y-v,sigma)
    convolution[x][y] /= gauss_totals

smoothed = Image.new(bw.mode, bw.size)
smoothed_pix = smoothed.load()
for x, xval in enumerate(convolution):
    for y, yval in enumerate(convolution[x]):
        smoothed_pix[x,y] = int(convolution[x][y])
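
The excerpt calls a `gaussian` helper that is not shown. A minimal sketch of what such a helper could look like, assuming the standard 2-D isotropic Gaussian exp(-(dx^2 + dy^2) / (2 sigma^2)) / (2 pi sigma^2), is given below; the real definition in show_image.py may differ. `smooth_pixel` is a hypothetical function that repeats the inner loop of the excerpt on a plain 2-D list, so it can be tried without PIL.

```python
import math


def gaussian(dx, dy, sigma):
    """2-D isotropic Gaussian evaluated at offset (dx, dy) from its center.

    Assumption: this is the textbook formula, not necessarily the demo's code.
    """
    two_sigma_sq = 2.0 * sigma * sigma
    return math.exp(-(dx * dx + dy * dy) / two_sigma_sq) / (math.pi * two_sigma_sq)


def smooth_pixel(image, x, y, sigma=1):
    """Gaussian-weighted average of the neighborhood of image[x][y].

    Neighbors outside the image are skipped, and the result is divided by the
    sum of the weights actually used, so border pixels are not darkened.
    """
    width, height = len(image), len(image[0])
    total = 0.0
    weight_sum = 0.0
    for u in range(x - 3 * sigma, x + 3 * sigma + 1):
        for v in range(y - 3 * sigma, y + 3 * sigma + 1):
            if u < 0 or v < 0 or u >= width or v >= height:
                continue
            w = gaussian(x - u, y - v, sigma)
            weight_sum += w
            total += image[u][v] * w
    return total / weight_sum


# A flat image stays flat; a single bright pixel is spread out and dimmed.
flat = [[100] * 5 for _ in range(5)]
spike = [[0] * 5 for _ in range(5)]
spike[2][2] = 255
print(smooth_pixel(flat, 0, 0), smooth_pixel(spike, 2, 2))
```

Dividing by the accumulated weights, as the excerpt does with `gauss_totals`, means the constant factor in front of the Gaussian cancels out; only the relative weights matter.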

Edges

The lower right panel displays the edge-detected image. The code to perform this process is found in show_image.py and excerpted below:

gradient_threshold = 10

# (x offset, y offset, angle in degrees) for each of the eight neighboring pixels
neighbor_offsets = [(-1,-1,225), (-1,0,270), (-1,1,315), (0,1,0), (0,-1,180), (1,0,90), (1,1,45), (1,-1,135)]

gradients = [ [(0,0) for col in range(bw.size[1])] for row in range(bw.size[0])]

# Loop over every pixel in the image and calculate the largest gradient
for (x,y) in itertools.product(range(bw.size[0]), range(bw.size[1])):
    pixel = convolution[x][y]
    # For each immediately neighboring pixel
    for (uidx, vidx, angle) in neighbor_offsets:
        u = x + uidx
        v = y + vidx
        if u < 0 or v < 0 or u >= width or v >= height:
            continue
        gradient = abs(convolution[u][v] - pixel)
        if gradient > gradients[x][y][0]:
            gradients[x][y] = (gradient, angle)

# Draw the edges
edges = bw.point(lambda i: 255)
edges_pix = edges.load()

results = [ [(False,0) for col in range(bw.size[1])] for row in range(bw.size[0])]

# Loop over every pixel in the image and determine if it's an edge pixel
for (x,y) in itertools.product(range(bw.size[0]), range(bw.size[1])):
    gradient = gradients[x][y]
    # the gradient must be greater than some threshold set by the user
    if gradient[0] < gradient_threshold:
        continue
    is_max = True
    # determine if the gradient is a local maximum
    for (uidx, vidx, angle) in neighbor_offsets:
        u = x + uidx
        v = y + vidx
        if u < 0 or v < 0 or u >= width or v >= height:
            continue
        if gradients[u][v][0] > gradient[0]:
            is_max = False
            break
    if is_max:
        edges_pix[x,y] = 0
        # store the direction of this pixel's largest gradient
        results[x][y] = (True, gradient[1])
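
To see the two passes at work without PIL, here is a compact sketch that applies the same idea (largest neighbor difference, then threshold plus local-maximum test) to a plain 2-D list. The function names are hypothetical and the angles are left out for brevity; it illustrates the approach and is not a replacement for the demo code.

```python
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (0, -1), (1, 0), (1, 1), (1, -1)]


def neighbors(x, y, width, height):
    """Yield the coordinates of the in-bounds neighbors of (x, y)."""
    for dx, dy in NEIGHBOR_OFFSETS:
        u, v = x + dx, y + dy
        if 0 <= u < width and 0 <= v < height:
            yield u, v


def largest_gradient(image, x, y):
    """Largest absolute brightness difference between (x, y) and a neighbor."""
    width, height = len(image), len(image[0])
    best = 0
    for u, v in neighbors(x, y, width, height):
        best = max(best, abs(image[u][v] - image[x][y]))
    return best


def detect_edges(image, threshold):
    """Return the set of (x, y) whose gradient passes the threshold and is a local maximum."""
    width, height = len(image), len(image[0])
    grad = [[largest_gradient(image, x, y) for y in range(height)] for x in range(width)]
    edges = set()
    for x in range(width):
        for y in range(height):
            g = grad[x][y]
            if g < threshold:
                continue
            # Ties count as maxima, matching the strict ">" test in the excerpt.
            if all(grad[u][v] <= g for u, v in neighbors(x, y, width, height)):
                edges.add((x, y))
    return edges


# A step image: dark for x < 3, bright for x >= 3 (indexed as image[x][y]).
step = [[0] * 4 for _ in range(3)] + [[100] * 4 for _ in range(3)]
print(sorted(detect_edges(step, threshold=10)))
```

On the step image, only the two columns on either side of the brightness jump are reported; a uniformly colored image yields no edges at all.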

Next Steps

Now that you have seen how edge detection works, you can build an object recognition system on top of it. One very basic approach to object recognition is to train a machine learning classifier on a training set of edge-detected images. (Each snapshot that you take and its processed versions are saved into the snapshots folder with a timestamped filename, allowing you to easily collect such a training set.) To implement your classifier, see the Object Recognition Exercise page. Once you have completed the exercise, the result of the classification is also displayed in the four-panel window, so you can walk around the environment, labeling the objects that you see!
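
As one possible starting point, the sketch below turns an edge map into a histogram of edge orientations and labels it with a nearest-neighbor lookup. Everything here is an assumption made for illustration: the function names, the choice of features, the choice of classifier, and the labels "wall" and "tree" are hypothetical, and the Object Recognition Exercise page may prescribe a different approach. The input is assumed to have the same shape as `results` in the edge detection excerpt, a 2-D list of (is_edge, angle) pairs.

```python
ANGLES = [0, 45, 90, 135, 180, 225, 270, 315]


def orientation_histogram(results):
    """Fraction of edge pixels at each of the eight neighbor angles.

    Angles outside ANGLES are ignored.
    """
    counts = dict((a, 0) for a in ANGLES)
    for column in results:
        for is_edge, angle in column:
            if is_edge and angle in counts:
                counts[angle] += 1
    total = sum(counts.values())
    return [counts[a] / float(total) if total else 0.0 for a in ANGLES]


def classify(features, training_set):
    """Return the label of the closest training example (squared Euclidean distance).

    training_set is a list of (feature_vector, label) pairs.
    """
    def distance(example):
        return sum((a - b) ** 2 for a, b in zip(features, example[0]))
    return min(training_set, key=distance)[1]


# Hypothetical training data: one example per label.
training_set = [
    ([1, 0, 0, 0, 0, 0, 0, 0], "wall"),
    ([0, 0, 1, 0, 0, 0, 0, 0], "tree"),
]
edge_map = [[(True, 0), (False, 0)], [(True, 90), (True, 0)]]
features = orientation_histogram(edge_map)
print(features, classify(features, training_set))
```

Normalizing the histogram makes the features independent of how many edge pixels a snapshot contains, so near and distant views of the same object are more comparable.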

