Great Temperatures for Data Harvesting


This chart shows the daily high temperatures recorded at Hartsfield-Jackson Atlanta International Airport in 2011, and I created it using several techniques common among data journalists. The source of the temperatures is wunderground.com, where each location’s page has a “history/almanac” section showing past weather conditions at that location. Using that section, I modified code written by Nathan Yao, author of Visualize This, to “scrape” the maximum temperatures, that is, to go through the page for each day of the airport’s 2011 weather history on wunderground.com and pull out the high. I then imported the data into a Microsoft Excel spreadsheet and created a line chart from it.
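
For readers curious what such a scraper looks like, here is a minimal Python sketch of the idea. It is not Yao’s code or the exact script I ran. The URL pattern follows the wunderground.com daily history links for KATL, which may have changed since 2011. The `max-temp` markup and the helper names (`daily_urls`, `extract_max_temp`, `to_csv_row`) are hypothetical, so the real page would need to be inspected before this could work.

```python
import re
from datetime import date, timedelta

# URL pattern based on the KATL daily history links; the site may have changed.
URL_TEMPLATE = (
    "http://www.wunderground.com/history/airport/KATL/"
    "{y}/{m}/{d}/DailyHistory.html"
)

def daily_urls(year):
    """Yield (date, url) for every day of the given year."""
    day = date(year, 1, 1)
    while day.year == year:
        yield day, URL_TEMPLATE.format(y=day.year, m=day.month, d=day.day)
        day += timedelta(days=1)

# ASSUMPTION: pretend the page marks the high as <span class="max-temp">89</span>.
# The real markup is different and has to be checked by viewing the page source.
MAX_TEMP_RE = re.compile(r'<span class="max-temp">\s*(-?\d+)\s*</span>')

def extract_max_temp(html):
    """Return the high temperature as an int, or None if it is not found."""
    match = MAX_TEMP_RE.search(html)
    return int(match.group(1)) if match else None

def to_csv_row(day, temp):
    """Format one line of comma-separated output that Excel can open."""
    return "{},{}".format(day.strftime("%Y%m%d"), "" if temp is None else temp)
```

A full run would download each URL from `daily_urls(2011)`, pass the page text to `extract_max_temp`, and write the `to_csv_row` results to a text file for Excel. The download step is left out here because it depends on the site still serving these pages.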

Scraping the data was the most difficult part. Although I took, and thoroughly enjoyed, an AP Computer Science class in Java in high school, this was quite different from anything I had done before. Yao had already written the program in his book, which made things easier, but I still had to know which parts to modify for this exercise. Working with Excel was very easy, since I’ve used it a lot in the past, both to organize numbers and to create charts.

3 Comments

  1. Fact error: High temp. on June 11, 2011 was 89 °F.

    http://www.wunderground.com/history/airport/KATL/2011/6/11/DailyHistory.html?req_city=NA&req_state=NA&req_statename=NA

    High temp. on July 11, 2011 was 95 °F.

    http://www.wunderground.com/history/airport/KATL/2011/7/11/DailyHistory.html?req_city=NA&req_state=NA&req_statename=NA

    Note that your label says June 11, but the data point indicated is in July.

    Jan. 10 is correct. Nov. 29 is correct.