Realtor Association Cleans Up Their Housing Data

Today, two days after the originally scheduled release date, the Illinois Association of Realtors (IAR) announced the June home sales statistics for Chicago, the greater metro area, and all of Illinois. The delay came because the IAR was working to get to the bottom of some pretty serious data errors that had led it to overstate the trend in median home prices for Chicago.
As expected, the June home sales numbers for Chicago came in 27.1% below the 2010 levels. I won’t bother to repeat here the discussion of these numbers that I posted a couple of weeks ago.
What was more interesting was the announcement by the IAR regarding what their investigation of the data issues uncovered. Apparently they were accessing fairly raw data with SQL and were somehow missing part of the data:

It has been determined, beginning with the November 2010 city of Chicago median price reports, that the program did not recognize some of the data fields thus eliminating, on average, 11 percent of the records per month, which resulted in a higher median price determination.
The city of Chicago sequel program has been corrected. All data from the inception of the city of Chicago reports (2008) has been verified and all reports since November 2010 have been adjusted to reflect the inclusion of this data.

Realtors have access to a much easier query interface, but such interfaces are only good for accessing data from a single multiple listing service (MLS). The IAR has to aggregate data from multiple MLSs, so they probably dump this data into a single database and then access it with SQL. Having used SQL myself, I can assure you that, with the convoluted design of some of these databases and the limits of SQL, it’s easy to make a mistake.
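To illustrate how this kind of mistake can happen, here is a minimal sketch in Python using the built-in sqlite3 module. Everything in it is hypothetical: the `sales` table, the `property_type` column, and the prices are made up for illustration, and this is not the IAR's actual schema or query. It only shows one plausible way a filter that fails to recognize some field values can silently drop records and push the median up.

```python
import sqlite3
from statistics import median

# Hypothetical schema and made-up prices; NOT the IAR's actual data or query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, price INTEGER, property_type TEXT)"
)
rows = [
    (1, 90000, None),       # type field left blank in the source feed
    (2, 120000, "condo"),
    (3, 150000, "condo"),
    (4, 180000, "house"),
    (5, 210000, "house"),
    (6, 240000, "house"),
    (7, 60000, None),       # another record with an unrecognized field
    (8, 300000, "condo"),
    (9, 330000, "house"),
]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)


def count_and_median(sql):
    """Run a query that returns one price column; return (row count, median)."""
    prices = [row[0] for row in conn.execute(sql)]
    return len(prices), median(prices)


# Query over every record.
all_count, all_median = count_and_median("SELECT price FROM sales")

# NULL never matches IN (...), so records 1 and 7 vanish without any error.
kept_count, kept_median = count_and_median(
    "SELECT price FROM sales WHERE property_type IN ('condo', 'house')"
)

print(f"all records:      n={all_count}, median={all_median}")
print(f"filtered records: n={kept_count}, median={kept_median}")
```

In this toy data the two dropped records happen to be the cheapest ones, so the median rises from 180,000 to 210,000 even though the query runs without complaint. Comparing the row count of the filtered query against a plain count of the table is a cheap sanity check that would flag the gap.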
