The Joys of Geocoding

In my last post, I was asked how accurate the locations in our generated Google Earth files are. Before I divulge that information, I’d like to explain some of the challenges of getting accurately geocoded data. (I’ll get on my soapbox and complain about the state of NWMLS data in my next post.)

Now, in partial defense of realtors and the MLS, it is unrealistic to expect perfect data. For example, consumer-level GPS receivers aren’t always as accurate as one might think. This weekend I loaded up Microsoft Streets & Trips 2006 on my desktop computer, hooked up my GPS receiver, turned on GPS tracking, created a GPS trail, and walked away for an hour. An hour later, my map had a line drawing that resembled the type my 3-year-old son likes to create, even though the receiver never moved. So even if a realtor were to use a GPS receiver to get a latitude & longitude reading, it’s entirely possible that the measurement would be off by a house or two (or four).

Another problem is that most digital maps are created with data sold by companies like TeleAtlas or NavTeq. These companies compile their data by driving around previously unknown streets & neighborhoods with computers & GPS receivers (kinda like how that annoying guy in the Verizon ads tests their network). I should note that in-vehicle navigation systems are more accurate than GPS receivers alone, because the vehicle’s navigation system can also use the vehicle’s steering wheel position and the speedometer to determine your location.

Unfortunately, by the time the Microsofts, Yahoos, and Googles of the world get their hands on the data, it is at least 3-6 months out of date (and probably closer to 12-18 months out of date by the time it gets on the web or is published on a CD). This is a problem because about 25% of the properties in the NWMLS are new construction (where new construction is defined as a property that was built in 2005 or later). Since new construction is often located near new roads, the giants of digital mapping may be unable to help and are always playing catch-up.

Then, when the companies convert the raw data into digital maps, they end up combining multiple sources of data into the single data set they use for a map. However, the data sources don’t always agree on where a point of interest is.

For example, Google Earth thinks the top of the Seattle Space Needle is at 47.620367° north latitude & 122.349005° west longitude. Meanwhile, Microsoft’s Virtual Earth seems to think it’s located at 47.620336° north latitude & 122.348515° west longitude. Now, a few ten-thousandths of a degree means the difference between the tip of the needle & one of the air conditioning units on the roof (a few dozen yards). But if they can’t agree on where the top of the Space Needle is, it’s likely they aren’t going to agree on where 742 Evergreen Terrace is either. However, a few dozen yards of error is better than a few miles of error (which is what can happen when I use raw NWMLS data).
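For the curious, here is a rough check of how far apart those two Space Needle coordinates are. It’s only a sketch: it uses the haversine formula and assumes a spherical Earth with a mean radius of 6,371 km, and the function and variable names are mine, not anything from Google or Microsoft. West longitudes are entered as negative numbers.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance in meters between two points given in degrees.

    Assumes a spherical Earth (mean radius 6,371 km), which is plenty
    accurate for comparing two points a block apart.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))

# Coordinates quoted above; west longitude is negative.
google_earth = (47.620367, -122.349005)
virtual_earth = (47.620336, -122.348515)

gap_m = haversine_m(*google_earth, *virtual_earth)
gap_yd = gap_m / 0.9144  # 1 yard = 0.9144 m
print(f"Disagreement: {gap_m:.0f} m (about {gap_yd:.0f} yd)")
```

Nearly all of the gap comes from the longitude difference. At Seattle’s latitude a degree of longitude covers only about two thirds of the ground distance that a degree of latitude does, which is why you can’t just eyeball the decimals.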

Because of this, I have to geocode every single property in the database, since I don’t trust the NWMLS data. So I call the Yahoo! Maps Web Services – Geocoding API to get a latitude & longitude for everything. Although Yahoo is far from perfect, at least it’s free and tries harder than the MLS. So without further delay, here is the current geocoding precision of the points on our generated maps.

Geocoding Precision    No. of properties    Percentage
address                            16341        80.20%
street                              1975         9.69%
zip+4                                 43         0.21%
zip+2                                343         1.68%
zip                                 1644         8.07%
city                                  25         0.12%
state                                  5         0.02%
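If you want to check my arithmetic, the percentages can be reproduced from the property counts with a few lines of Python. This is just a sketch of the calculation; the dictionary and variable names are made up for illustration, and the counts are the ones from the table.

```python
# Property counts per geocoding precision level, copied from the table.
counts = {
    "address": 16341,
    "street": 1975,
    "zip+4": 43,
    "zip+2": 343,
    "zip": 1644,
    "city": 25,
    "state": 5,
}

total = sum(counts.values())

# Share of all geocoded properties at each precision level, in percent.
shares = {level: 100.0 * n / total for level, n in counts.items()}

for level, n in counts.items():
    print(f"{level:<8} {n:>6} {shares[level]:6.2f}%")
print(f"{'total':<8} {total:>6}")
```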

In closing, I’d like to ask real estate professionals to be as complete and as accurate as possible when submitting listing data to their local MLS. I’d also like to state that even if the MLS were accurate, it’s unrealistic to expect perfect geocoding from imperfect data. If digital mapping companies and GPS technology can’t get it exactly right, a house or two off is probably as accurate as you can realistically hope for given the current state of the art.

Robbie
Caffeinated Software

6 thoughts on “The Joys of Geocoding”

  1. Robbie, as a former GIS / GPS guy, I can tell you that good, post-processed GPS data like those companies (should?) use is good to the meter or so. However, they don’t type in each address – they divvy the block up into even sections and “guess” the address from that. Part of what you’re seeing with Google Earth is also geo-referencing error on aerial imagery; it’s hard to get aerial photos of 3-D surfaces accurate to the meter. If you look at the Space Needle, it looks like it’s leaning to the side. This is because the airplane (yes, airplane!) was not directly over it when it took the picture.

    I’m guessing you’re well aware of this, but most folks aren’t.

    So, you’re right: no complaining people – those points are pretty close!

  2. That was a great insight into the world of Geocoding.

    I brought the point up because when searching Google Earth before and looking up houses of friends and family, I often noticed that some of the time it identified one of the neighboring houses as the target property and thought it was a little odd.

    It’s looking good though. Keep up the great work!

  3. Pingback: Real Central VA - Tracking the Charlottesville and Central VA real estate market and more » Geocoding public notices
