Thursday, September 12, 2013
Simple plots can reveal complex patterns
Visualization is a big topic in its own right, and plots can get quite sophisticated. Even so, a simple plot can reveal complex information.
I took a shot at visualizing power generation data from the Kaggle competition. My goal was just to make a "heat map" of the power generation data: for every <week, hour of week>, plot the power generated. I had to rearrange the data a bit, but the result was not only pretty but, more importantly, revealing and compact: the plot summarizes ~25k data points and shows cycles over days and months across several years.
Enjoy. Courtesy of pandas for data munging and matplotlib for the plot.
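For readers who want to try the same idea, here is a minimal sketch of the rearrangement, not the code behind the plot above. It assumes the data is already an hourly pandas Series with a DatetimeIndex. The synthetic signal and the helper names `week_by_hour_grid` and `plot_grid` are made up for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def week_by_hour_grid(power: pd.Series) -> pd.DataFrame:
    """Reshape an hourly series into rows = week, columns = hour of week."""
    idx = power.index
    # Hour of week: 0 is Monday 00:00, 167 is Sunday 23:00.
    hour_of_week = idx.dayofweek * 24 + idx.hour
    # Label each week by its Monday so weeks stay ordered across years.
    week_start = idx.normalize() - pd.to_timedelta(idx.dayofweek, unit="D")
    frame = pd.DataFrame({
        "week": week_start,
        "hour_of_week": hour_of_week,
        "power": power.to_numpy(),
    })
    # Missing hours simply become NaN cells in the grid.
    return frame.pivot_table(index="week", columns="hour_of_week",
                             values="power", aggfunc="mean")


def plot_grid(grid: pd.DataFrame):
    """Draw the grid as a heat map: one row per week, one column per hour."""
    fig, ax = plt.subplots(figsize=(10, 6))
    image = ax.imshow(grid.to_numpy(), aspect="auto", origin="lower",
                      interpolation="nearest")
    ax.set_xlabel("hour of week (0 = Monday 00:00)")
    ax.set_ylabel("weeks since start")
    fig.colorbar(image, ax=ax, label="power generated")
    return fig


# Synthetic stand-in for the real data (an assumption): 150 full weeks of
# hourly values with a daily and a yearly cycle, about 25k points.
index = pd.date_range("2010-01-04", periods=24 * 7 * 150,
                      freq=pd.Timedelta(hours=1))
signal = (10
          + 5 * np.sin(2 * np.pi * index.hour / 24)
          + 3 * np.cos(2 * np.pi * index.dayofyear / 365))
power = pd.Series(np.asarray(signal, dtype=float), index=index)

grid = week_by_hour_grid(power)
fig = plot_grid(grid)
```

Add `plt.show()` or `fig.savefig(...)` at the end to see the result. Each row is one week and each column one hour of that week, so daily cycles show up as vertical bands repeating every 24 columns, and seasonal cycles as gradual changes from row to row.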
Dear Majid
This sort of plot looks great! I recently began working with Python (I used Matlab before) and tried to create a plot similar to this one. But I ran into problems (array size) when I tried to group my data by day and hour of day with a lambda function.
So my question is how you managed to arrange your data, and in particular which type of data you used. I used a DataFrame that I fetched from a MySQL server. Could you give me some advice or even share your code as an IPython Notebook?
Thanks in advance
JDB
I could post my code if you insist, but I'm not sure it would be that useful to you (my input data was not arranged as nicely with a timestamp!). Your main difficulty is in arranging the data, and for that the solution is in pandas time series indexing: http://pandas.pydata.org/pandas-docs/stable/timeseries.html
In my case, once I had my hourly data, I was able to iterate by week.
As for the plot, I've used matplotlib to plot a million points without much trouble.
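For anyone with the same question, a rough sketch of the arranging step follows. This is not the code used for the post. It assumes readings that come with a timestamp column, and the column names `timestamp` and `value` and the synthetic data are placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a table fetched from a database: one reading
# every 10 minutes for three weeks, starting on a Monday.
stamps = pd.date_range("2013-01-07", periods=6 * 24 * 21,
                       freq=pd.Timedelta(minutes=10))
raw = pd.DataFrame({
    "timestamp": stamps,
    "value": stamps.hour + stamps.minute / 60.0,  # synthetic signal
})

# 1. Move the timestamps into the index so pandas treats this as a time series.
series = raw.set_index("timestamp")["value"]

# 2. Downsample to one value per hour; hours with no readings become NaN.
hourly = series.resample(pd.Timedelta(hours=1)).mean()

# 3. Group on vectorized index attributes instead of a per-row lambda.
#    Each (day, hour) cell holds one hourly value, so mean() passes it through.
day = hourly.index.normalize().rename("day")
hour = hourly.index.hour.rename("hour")
day_by_hour = hourly.groupby([day, hour]).mean().unstack()

# 4. Iterate by week. Plain slicing is enough here because the hourly series
#    is regular and is assumed to start at the beginning of a week.
hours_per_week = 24 * 7
weeks = [hourly.iloc[i:i + hours_per_week]
         for i in range(0, len(hourly), hours_per_week)]
```

`day_by_hour` has one row per day and one column per hour of day, which is the shape a heat map needs. Grouping on `hourly.index.hour` and similar attributes works on whole arrays at once, which tends to avoid the size problems of row-by-row lambda grouping.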