EdgeDetection
EdgeDetection Demo
Introduction
The first step in many vision systems is to detect the edges in the image (see section 24.2.1 of AIMA). Given an image, an edge detection algorithm determines which pixels lie along the boundary between two different elements, such as an agent's head and a distant mountain. The resulting edge maps and histograms can then be classified into different objects. In this demo you will drive a NERO agent around the environment and take snapshots of different objects in it. Each snapshot is then processed through smoothing and edge detection; building a system that labels these objects is left as an exercise. The video below shows how it works; you can also run OpenNERO yourself to test it interactively.
Running the Demo
To run the demo,
Each snapshot will automatically be processed by OpenNERO and the results will be displayed in a four-panel window. Note that the default version included in the demo is a slower, more canonical implementation of the edge detection algorithm; it may take up to 2 minutes to process an image, depending on the speed of your computer. An optimized version is available in show_image_fast.py and should complete its analysis in about 10 seconds, but it requires numpy and scipy to be installed.
Four-Panel Window Results
Each time you take a snapshot, it is processed using a method similar to that in section 24.2.1 of the AIMA (third edition) textbook. First, the brightness is extracted from the original color image, resulting in a gray-scale image. Next, the image is convolved with a Gaussian filter, resulting in a smoothed image; this smoothing step is necessary to detect edges more reliably. The last step is to calculate the brightness gradient at each pixel of the smoothed image. Where the gradient is high, an edge passes through that pixel. The gradients are thresholded, resulting in a map of edges in the image. The results of these four processing steps are displayed together in a four-panel window that pops up. Below is a description of each panel.
Color
The upper left panel displays the raw color snapshot.
Black and White
The upper right panel displays the color snapshot converted to black and white (gray-scale).
Smoothed
The lower left panel displays the black and white image after a Gaussian filter has been applied to smooth it and prepare it for edge detection. The code to perform this smoothing is found in show_image.py and excerpted below:
# Apply Gaussian filter
sigma = 1
width, height = bw.size  # image dimensions used in the bounds checks below
bw_pix = bw.load()
convolution = [[0 for col in range(bw.size[1])] for row in range(bw.size[0])]
# Loop over every pixel in the image and apply the Gaussian filter
for (x, y) in itertools.product(range(bw.size[0]), range(bw.size[1])):
    gauss_totals = 0
    # For each neighboring pixel within 3*sigma of (x, y)
    for (u, v) in itertools.product(range(-3*sigma + x, 3*sigma + 1 + x), range(-3*sigma + y, 3*sigma + 1 + y)):
        # Skip neighbors that fall outside the image
        if u < 0 or v < 0 or u >= width or v >= height:
            continue
        gauss_totals += gaussian(x-u, y-v, sigma)
        convolution[x][y] += bw_pix[u, v] * gaussian(x-u, y-v, sigma)
    # Normalize by the weights actually used, so border pixels are not darkened
    convolution[x][y] /= gauss_totals
# Copy the smoothed values into a new image
smoothed = Image.new(bw.mode, bw.size)
smoothed_pix = smoothed.load()
for x, xval in enumerate(convolution):
    for y, yval in enumerate(convolution[x]):
        smoothed_pix[x, y] = int(convolution[x][y])
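
The excerpt above calls a helper, gaussian(x-u, y-v, sigma), whose definition is not shown. The following is a minimal sketch of what such a helper could look like, assuming it is the standard two-dimensional isotropic Gaussian density; the actual definition in show_image.py may differ. The function smooth_pixel is a hypothetical name introduced here only to show the same weighted average on a plain list of lists, without needing an image library.

```python
import math


def gaussian(dx, dy, sigma):
    """Assumed helper: 2-D isotropic Gaussian density at offset (dx, dy)."""
    norm = 2.0 * math.pi * sigma * sigma
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma)) / norm


def smooth_pixel(pixels, x, y, sigma=1):
    """Gaussian-weighted average around (x, y) in a grid indexed pixels[x][y].

    Hypothetical illustration, not part of show_image.py. Neighbors outside
    the grid are skipped, and the sum is divided by the weights actually
    used, mirroring the gauss_totals normalization in the excerpt.
    """
    width, height = len(pixels), len(pixels[0])
    total = 0.0
    weight_sum = 0.0
    for u in range(x - 3 * sigma, x + 3 * sigma + 1):
        for v in range(y - 3 * sigma, y + 3 * sigma + 1):
            if u < 0 or v < 0 or u >= width or v >= height:
                continue
            w = gaussian(x - u, y - v, sigma)
            weight_sum += w
            total += pixels[u][v] * w
    return total / weight_sum


# A flat image stays flat, even at the corner where most neighbors are missing.
flat = [[100 for _ in range(5)] for _ in range(5)]
corner_value = smooth_pixel(flat, 0, 0)

# A single bright pixel is spread out, so its smoothed value drops.
spike = [[0 for _ in range(7)] for _ in range(7)]
spike[3][3] = 255
spike_value = smooth_pixel(spike, 3, 3)
```

Because the result is divided by the sum of the weights, the constant factor in front of the exponential cancels out; any helper proportional to this one would produce the same smoothed image.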
Edges
The lower right panel displays the edge-detected image. The code to perform this process is found in show_image.py and excerpted below:
gradient_threshold = 10
# (x offset, y offset, direction in degrees) for each of the 8 neighbors
neighbor_offsets = [(-1,-1,225), (-1,0,270), (-1,1,315), (0,1,0), (0,-1,180), (1,0,90), (1,1,45), (1,-1,135)]
gradients = [[(0, 0) for col in range(bw.size[1])] for row in range(bw.size[0])]
# Loop over every pixel in the image and calculate the largest gradient
for (x, y) in itertools.product(range(bw.size[0]), range(bw.size[1])):
    pixel = convolution[x][y]
    # For each immediately neighboring pixel
    for (uidx, vidx, angle) in neighbor_offsets:
        u = x + uidx
        v = y + vidx
        # Skip neighbors that fall outside the image
        if u < 0 or v < 0 or u >= width or v >= height:
            continue
        gradient = abs(convolution[u][v] - pixel)
        if gradient > gradients[x][y][0]:
            gradients[x][y] = (gradient, angle)
# Draw the edges on an all-white image
edges = bw.point(lambda i: 255)
edges_pix = edges.load()
results = [[(False, 0) for col in range(bw.size[1])] for row in range(bw.size[0])]
# Loop over every pixel in the image and determine if it's an edge pixel
for (x, y) in itertools.product(range(bw.size[0]), range(bw.size[1])):
    gradient = gradients[x][y]
    # the gradient must be at least the threshold set by the user
    if gradient[0] < gradient_threshold:
        continue
    is_max = True
    # determine if the gradient is a local maximum
    for (uidx, vidx, angle) in neighbor_offsets:
        u = x + uidx
        v = y + vidx
        if u < 0 or v < 0 or u >= width or v >= height:
            continue
        if gradients[u][v][0] > gradient[0]:
            is_max = False
            break
    if is_max:
        # mark the pixel black and record the direction of its largest gradient
        edges_pix[x, y] = 0
        results[x][y] = (True, gradient[1])
Next Steps
Now that you have seen how edge detection works, you can build an object recognition system on top of it. One very basic approach to object recognition is to train a machine learning classifier on a training set of edge-detected images. (Each snapshot that you take and its processed versions are saved into the snapshots folder with a timestamped filename, allowing you to easily collect such a training set.) To implement your classifier, see the Object Recognition Exercise page. After you have done this exercise, the result of the classification is also displayed in the four-panel window. You can then walk around the environment, labeling objects that you see!
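
As a starting point for collecting features, the edge map can be summarized as a histogram of edge directions. The sketch below assumes a results grid shaped like the one built in the edge detection excerpt, that is, a list of columns whose entries are (is_edge, angle) pairs. The function name angle_histogram and the choice of eight 45-degree bins are assumptions made for illustration; the Object Recognition Exercise may expect a different representation.

```python
def angle_histogram(results, bins=8):
    """Fraction of edge pixels falling into each direction bin.

    Hypothetical helper: each angle is assigned to the nearest multiple of
    360/bins degrees, so with 8 bins the directions are 0, 45, ..., 315.
    """
    bin_width = 360.0 / bins
    counts = [0] * bins
    for column in results:
        for is_edge, angle in column:
            if not is_edge:
                continue
            index = int(round(angle / bin_width)) % bins
            counts[index] += 1
    total = sum(counts)
    if total == 0:
        # No edge pixels at all: return an all-zero vector of the same length
        return [0.0] * bins
    return [count / float(total) for count in counts]


# Toy 2x2 edge map: one edge at 0 degrees, two at 90 degrees, one non-edge.
toy_results = [[(True, 0), (False, 0)],
               [(True, 90), (True, 90)]]
toy_histogram = angle_histogram(toy_results)
```

The histogram has the same length for every snapshot regardless of image size, which is what most off-the-shelf classifiers expect of a feature vector. It discards where in the image each edge lies, so it is only a rough description of an object's shape.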