Converting Geotagged Photos to KML PhotoOverlays

Mano Marks, Google Geo APIs Team
January 2009

Objective

This tutorial teaches you how to use geotagged photos to create KML PhotoOverlays. While the sample code is written in Python, many similar libraries exist in other programming languages, so it shouldn't be a problem to translate this code into another language. The code in this article relies on an open-source Python library, EXIF.py.

Introduction

Digital cameras are pretty amazing things. Many users don't realize it, but they do more than just take pictures and video. They also tag those videos and photos with metadata about the camera and its settings. In the last few years, people have found ways to add geographic data to that information, either embedded by the camera itself, as with some Ricoh and Nikon models, or through devices such as GPS loggers and the EyeFi Explore. Camera phones like the iPhone and phones using the Android operating system, like T-Mobile's G1, embed that data automatically. Some photo upload sites, such as Panoramio, Picasa Web Albums, and Flickr, will parse out GPS data automatically and use it to geotag a photo. You can then get that data back in feeds. But where's the fun in that? This article explores how to get at that data yourself.

Exif headers

The most common way to embed data into an image file is using the Exchangeable Image File Format, or EXIF. The data is stored in binary form in the EXIF headers in a standard way. If you know the specification for EXIF headers, you can parse them out yourself. Fortunately, someone has already done the hard work and written a Python module for you. The EXIF.py open source library is a great tool for reading the headers of JPEG files.

The Code

The sample code for this article is in this file: exif2kml.py. If you want to skip directly to using it, download that module, as well as EXIF.py, and place them in the same directory. Run python exif2kml.py foo.jpg, replacing foo.jpg with the path to a geotagged photo. It will produce a file called test.kml.

Parsing the Exif headers

EXIF.py provides an easy interface for pulling out the Exif headers. Simply run the process_file() function and it will return the headers as a dict object.

def GetHeaders(the_file):
  """Handles getting the Exif headers and returns them as a dict.

  Args:
    the_file: A file object

  Returns:
    a dict mapping keys corresponding to the Exif headers of a file.
  """

  data = EXIF.process_file(the_file, 'UNDEF', False, False, False)
  return data
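
Not every JPEG carries GPS tags, so it can help to check the headers before trying to parse a position. The following is a minimal sketch and not part of exif2kml.py: HasGps is a hypothetical helper, and it assumes only that the headers behave like a dict keyed by the tag names used later in this article.

```python
# Hypothetical helper, not part of EXIF.py or exif2kml.py.
REQUIRED_GPS_KEYS = (
    'GPS GPSLatitude',
    'GPS GPSLatitudeRef',
    'GPS GPSLongitude',
    'GPS GPSLongitudeRef',
)


def HasGps(data):
  """Returns True if the headers dict holds every tag needed for a position.

  Args:
    data: A dict-like object mapping Exif tag names to values.
  """
  return all(key in data for key in REQUIRED_GPS_KEYS)


# An empty dict stands in for a photo without any GPS tags.
print(HasGps({}))  # False
```

Calling a check like this before extracting coordinates lets you skip untagged photos instead of failing with a KeyError.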

Once you have the Exif headers, you need to extract the GPS coordinates. Each coordinate is stored as degrees, minutes, and seconds, and EXIF.py represents each of those values as a Ratio object, which holds a numerator and a denominator. This preserves the exact ratio rather than relying on a floating-point number. But KML expects decimal numbers, not ratios. So you extract each coordinate and convert its numerators and denominators to a single floating-point number in decimal degrees:

def DmsToDecimal(degree_num, degree_den, minute_num, minute_den,
                 second_num, second_den):
  """Converts the Degree/Minute/Second formatted GPS data to decimal degrees.

  Args:
    degree_num: The numerator of the degree object.
    degree_den: The denominator of the degree object.
    minute_num: The numerator of the minute object.
    minute_den: The denominator of the minute object.
    second_num: The numerator of the second object.
    second_den: The denominator of the second object.

  Returns:
    A decimal degree.
  """

  degree = float(degree_num)/float(degree_den)
  minute = float(minute_num)/float(minute_den)/60
  second = float(second_num)/float(second_den)/3600
  return degree + minute + second


def GetGps(data):
  """Parses out the GPS coordinates from the file.

  Args:
    data: A dict object representing the Exif headers of the photo.

  Returns:
    A tuple representing the latitude, longitude, and altitude of the photo.
  """

  lat_dms = data['GPS GPSLatitude'].values
  long_dms = data['GPS GPSLongitude'].values
  latitude = DmsToDecimal(lat_dms[0].num, lat_dms[0].den,
                          lat_dms[1].num, lat_dms[1].den,
                          lat_dms[2].num, lat_dms[2].den)
  longitude = DmsToDecimal(long_dms[0].num, long_dms[0].den,
                           long_dms[1].num, long_dms[1].den,
                           long_dms[2].num, long_dms[2].den)
  if data['GPS GPSLatitudeRef'].printable == 'S': latitude *= -1
  if data['GPS GPSLongitudeRef'].printable == 'W': longitude *= -1
  altitude = None

  try:
    alt = data['GPS GPSAltitude'].values[0]
    altitude = float(alt.num)/float(alt.den)
    # An altitude reference of 1 means the value is below sea level.
    if data['GPS GPSAltitudeRef'].values[0] == 1: altitude *= -1

  except KeyError:
    altitude = 0

  return latitude, longitude, altitude
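
As a worked check of the arithmetic above, the sketch below converts 37 degrees, 47 minutes, 22.6 seconds to decimal degrees. It uses the standard library's fractions.Fraction as a stand-in for EXIF.py's Ratio objects; the stand-in, the function name, and the sample coordinate are assumptions for illustration. Only the formula (degrees + minutes/60 + seconds/3600, negated for S and W) comes from the code above.

```python
from fractions import Fraction


def DmsToDecimalExact(degrees, minutes, seconds, ref='N'):
  """Converts degree/minute/second rationals to signed decimal degrees.

  Args:
    degrees: A Fraction holding the degrees.
    minutes: A Fraction holding the minutes.
    seconds: A Fraction holding the seconds.
    ref: The hemisphere reference: 'N', 'S', 'E', or 'W'.
  """
  # Sum as exact fractions, then convert to float once at the end.
  total = degrees + minutes / 60 + seconds / 3600
  if ref in ('S', 'W'):
    total = -total
  return float(total)


# Illustrative value: the seconds are stored as 226/10, that is, 22.6.
latitude = DmsToDecimalExact(Fraction(37, 1), Fraction(47, 1),
                             Fraction(226, 10))
print(round(latitude, 6))  # 37.789611
```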

Once you have the coordinates, it is easy to create a simple PhotoOverlay for each photo:

def CreatePhotoOverlay(kml_doc, file_name, the_file, file_iterator):
  """Creates a PhotoOverlay element in the kml_doc element.

  Args:
    kml_doc: An XML document object.
    file_name: The name of the file.
    the_file: The file object.
    file_iterator: The file iterator, used to create the id.

  Returns:
    An XML element representing the PhotoOverlay.
  """

  photo_id = 'photo%s' % file_iterator
  data = GetHeaders(the_file)
  coords = GetGps(data)

  po = kml_doc.createElement('PhotoOverlay')
  po.setAttribute('id', photo_id)
  name = kml_doc.createElement('name')
  name.appendChild(kml_doc.createTextNode(file_name))
  description = kml_doc.createElement('description')
  description.appendChild(kml_doc.createCDATASection('<a href="#%s">'
                                                     'Click here to fly into '
                                                     'photo</a>' % photo_id))
  po.appendChild(name)
  po.appendChild(description)

  # KML element names are case-sensitive, so this must be 'Icon'.
  icon = kml_doc.createElement('Icon')
  href = kml_doc.createElement('href')
  href.appendChild(kml_doc.createTextNode(file_name))

  camera = kml_doc.createElement('Camera')
  longitude = kml_doc.createElement('longitude')
  latitude = kml_doc.createElement('latitude')
  altitude = kml_doc.createElement('altitude')
  tilt = kml_doc.createElement('tilt')

  # Determines the proportions of the image and uses them to set FOV.
  width = float(data['EXIF ExifImageWidth'].printable)
  length = float(data['EXIF ExifImageLength'].printable)
  lf = str(width/length * -20.0)
  rf = str(width/length * 20.0)

  longitude.appendChild(kml_doc.createTextNode(str(coords[1])))
  latitude.appendChild(kml_doc.createTextNode(str(coords[0])))
  altitude.appendChild(kml_doc.createTextNode('10'))
  tilt.appendChild(kml_doc.createTextNode('90'))
  camera.appendChild(longitude)
  camera.appendChild(latitude)
  camera.appendChild(altitude)
  camera.appendChild(tilt)

  icon.appendChild(href)

  viewvolume = kml_doc.createElement('ViewVolume')
  leftfov = kml_doc.createElement('leftFov')
  rightfov = kml_doc.createElement('rightFov')
  bottomfov = kml_doc.createElement('bottomFov')
  topfov = kml_doc.createElement('topFov')
  near = kml_doc.createElement('near')
  leftfov.appendChild(kml_doc.createTextNode(lf))
  rightfov.appendChild(kml_doc.createTextNode(rf))
  bottomfov.appendChild(kml_doc.createTextNode('-20'))
  topfov.appendChild(kml_doc.createTextNode('20'))
  near.appendChild(kml_doc.createTextNode('10'))
  viewvolume.appendChild(leftfov)
  viewvolume.appendChild(rightfov)
  viewvolume.appendChild(bottomfov)
  viewvolume.appendChild(topfov)
  viewvolume.appendChild(near)

  po.appendChild(camera)
  po.appendChild(icon)
  po.appendChild(viewvolume)
  # KML element names are case-sensitive, so this must be 'Point'.
  point = kml_doc.createElement('Point')
  coordinates = kml_doc.createElement('coordinates')
  coordinates.appendChild(kml_doc.createTextNode('%s,%s,%s' %(coords[1],
                                                              coords[0],
                                                              coords[2])))
  point.appendChild(coordinates)

  po.appendChild(point)

  document = kml_doc.getElementsByTagName('Document')[0]
  document.appendChild(po)
  return po

As you can see, the code uses only standard W3C DOM methods, because those are available in most programming languages. To see how the whole thing fits together, download the complete exif2kml.py sample.
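
CreatePhotoOverlay expects kml_doc to already contain a Document element. One way to build that skeleton with Python's standard xml.dom.minidom module is sketched below. The helper name MakeKmlDoc is hypothetical, and exif2kml.py may set up its document differently.

```python
import xml.dom.minidom

KML_NAMESPACE = 'http://www.opengis.net/kml/2.2'


def MakeKmlDoc():
  """Builds an empty KML DOM: a <kml> root holding a single <Document>.

  Returns:
    An XML document object.
  """
  kml_doc = xml.dom.minidom.Document()
  kml = kml_doc.createElement('kml')
  # Declare the KML namespace explicitly on the root element.
  kml.setAttribute('xmlns', KML_NAMESPACE)
  kml_doc.appendChild(kml)
  kml.appendChild(kml_doc.createElement('Document'))
  return kml_doc


kml_doc = MakeKmlDoc()
print(kml_doc.toprettyxml(indent='  '))
```

A document built this way can be passed to a function like CreatePhotoOverlay once per photo and then written out to a .kml file.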

This sample doesn't take advantage of the full power of PhotoOverlays, which allow you to create deep explorations of high-resolution photos. But it does demonstrate how to hang a photo billboard-style over Google Earth. Here's a sample of a KML file created using this code:

<?xml version="1.0" encoding="utf-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <PhotoOverlay id="photo0">
      <name>
        1228258523134.jpg
      </name>
      <description>
<![CDATA[<a href="#photo0">Click here to fly into photo</a>]]>      </description>
      <Camera>
        <longitude>
          -122.3902159196034
        </longitude>
        <latitude>
          37.78961266330473
        </latitude>
        <altitude>
          10
        </altitude>
        <tilt>
          90
        </tilt>
      </Camera>
      <Icon>
        <href>
          1228258523134.jpg
        </href>
      </Icon>
      <ViewVolume>
        <leftFov>
          -26.6666666667
        </leftFov>
        <rightFov>
          26.6666666667
        </rightFov>
        <bottomFov>
          -20
        </bottomFov>
        <topFov>
          20
        </topFov>
        <near>
          10
        </near>
      </ViewVolume>
      <Point>
        <coordinates>
          -122.3902159196034,37.78961266330473,0
        </coordinates>
      </Point>
    </PhotoOverlay>
  </Document>
</kml>
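
The leftFov and rightFov values in the sample above follow from the image's aspect ratio. The sketch below repeats the same linear scaling used in CreatePhotoOverlay. The function name and the 1600 by 1200 dimensions are assumptions, chosen because a 4:3 image reproduces the sample's value of roughly 26.67 degrees.

```python
def HorizontalFov(width, length, vertical_half_fov=20.0):
  """Returns (leftFov, rightFov) in degrees, scaled by the aspect ratio.

  Args:
    width: The image width in pixels.
    length: The image height in pixels.
    vertical_half_fov: The topFov value; bottomFov is its negative.
  """
  # Same linear scaling as the sample code: aspect ratio times 20 degrees.
  aspect = float(width) / float(length)
  return -aspect * vertical_half_fov, aspect * vertical_half_fov


left, right = HorizontalFov(1600, 1200)
print('%.4f %.4f' % (left, right))  # -26.6667 26.6667
```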

And here's what it looks like in Google Earth:

Cautions

Geotagging photos is still in its infancy.

Here are a number of things to be aware of:

GPS devices aren't always 100% accurate, particularly those that come in cameras, so you should check the positions of your photos.
Many devices don't track altitude, setting it instead to 0. If altitude is important to you, you should find another way to capture that data.
The GPS position is the position of the camera, not of the subject of the photo. That is why this sample positions the Camera element in the GPS position, and the actual photo away from that position.
Many devices don't record the direction your camera is pointing. Exif defines a GPSImgDirection tag, but it is often left unset, so you may need to adjust your PhotoOverlays because of that. The good news is that some devices, such as phones built on the Android operating system, do allow you to capture data such as compass direction and tilt directly, just not in the Exif headers.
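
A quick sanity check can catch some of the problems listed above before they reach the KML. The sketch below is a hypothetical helper, not part of exif2kml.py. It only checks that the coordinates are in range and flags an altitude of 0, so it is no substitute for looking at the positions yourself.

```python
def CheckPosition(latitude, longitude, altitude):
  """Returns a list of warnings for a decoded GPS position.

  Args:
    latitude: The latitude in decimal degrees.
    longitude: The longitude in decimal degrees.
    altitude: The altitude in meters.
  """
  warnings = []
  if not -90.0 <= latitude <= 90.0:
    warnings.append('latitude out of range')
  if not -180.0 <= longitude <= 180.0:
    warnings.append('longitude out of range')
  if altitude == 0:
    # Many devices store 0 instead of a measured altitude.
    warnings.append('altitude is 0 and may not have been recorded')
  return warnings


print(CheckPosition(37.7896, -122.3902, 0))
```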

All that being said, this is still a powerful way to visualize your photos. Hopefully, we'll be seeing more and more accurate geotagging of photos in the near future.

Where to Go from Here

Now that you've started using EXIF headers, you might explore the EXIF spec. There's plenty of other data stored there, and you might be interested in capturing it and putting it in a description balloon. You might also consider creating richer PhotoOverlays using ImagePyramids. The Developer's Guide article on PhotoOverlays has a good overview of using them.
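
As one possible starting point for the description balloon idea, the sketch below turns a handful of header values into an HTML table inside a CDATA section. The helper name and the sample tag names and values are illustrative assumptions. In practice, the values would come from the printable attribute of the tags that EXIF.py returns.

```python
import html
import xml.dom.minidom


def CreateExifDescription(kml_doc, tags):
  """Builds a description element listing tag names and values.

  Args:
    kml_doc: An XML document object.
    tags: A dict mapping tag names to printable string values.

  Returns:
    An XML element representing the description.
  """
  # Escape names and values so they cannot break the balloon's HTML.
  rows = ''.join('<tr><td>%s</td><td>%s</td></tr>'
                 % (html.escape(name), html.escape(value))
                 for name, value in sorted(tags.items()))
  description = kml_doc.createElement('description')
  description.appendChild(
      kml_doc.createCDATASection('<table>%s</table>' % rows))
  return description


# Illustrative values only, not read from a real photo.
sample_tags = {
    'Image Model': 'Example Camera',
    'EXIF DateTimeOriginal': '2008:12:02 14:55:23',
}
doc = xml.dom.minidom.Document()
print(CreateExifDescription(doc, sample_tags).toxml())
```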