Thursday, March 28, 2013

A failed experiment - step by step


After I finished with the four weeks of climate data, I hoped to make some headway on removing the noise from the temperature data to make the trends easier to see.

This data is the average Winter temperature in Greenland from 1969 to 2010.


The dotted line shows the average of the averages.  The first half of the data is 1969-1989, the second half 1990-2010.  It's easy to see the second half has more above-average years than the first half.
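The comparison of the two halves can be made concrete with a small count. This is only a sketch: the function name and the sample numbers are made up for illustration and are not the Greenland data.

```python
from statistics import mean


def count_above_mean_by_half(values):
    """Return the overall mean and how many values in each half exceed it."""
    overall = mean(values)
    mid = len(values) // 2  # 42 yearly values split into 21 and 21
    first = sum(1 for v in values[:mid] if v > overall)
    second = sum(1 for v in values[mid:] if v > overall)
    return overall, first, second


# Hypothetical numbers, for illustration only.
sample = [-1.0, 0.5, -0.5, 0.0, 1.0, 2.0, -0.5, 1.5]
overall, first, second = count_above_mean_by_half(sample)
print(overall, first, second)
```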



A different way to measure things is the trendline, also known as the regression line or the line of best fit. The picture also shows two equations. Let me explain the constants.

0.0507 is the slope of the line. This means that if the rate continued into the future - and we cannot assume it will - the temperature would rise about one degree every twenty years. That's pretty fast.
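For readers who want to see where a slope like this comes from, here is a minimal sketch of an ordinary least-squares fit, assuming that is what the charting software does when it draws a linear trendline. The function name fit_line is hypothetical, and the only number taken from the post is the slope 0.0507.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Slope from the post, in degrees per year.
slope = 0.0507
years_per_degree = 1 / slope  # a little under 20 years per degree
print(round(years_per_degree, 1))
```

Dividing 1 by the slope is all it takes to turn "degrees per year" into "years per degree".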

The -99.019 would be the temperature if the trendline were taken back to 1 B.C. As I teach my statistics students, YOU ARE NOT SUPPOSED TO DO THAT!  Notice that the trendline is a line segment that doesn't continue into either the future or the past beyond the time period we decided to look at. Any methods for modeling future behavior have to be a lot more sophisticated than a trendline.
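One way to respect that rule in code is to refuse to evaluate the trendline outside the years that were fitted. This is just a sketch: trend_value is a hypothetical helper, and the constants are the rounded ones from the picture.

```python
def trend_value(year, slope=0.0507, intercept=-99.019,
                first_year=1969, last_year=2010):
    """Evaluate the trendline, but only inside the fitted range of years."""
    if not first_year <= year <= last_year:
        raise ValueError(
            f"{year} is outside {first_year}-{last_year}; do not extrapolate"
        )
    return slope * year + intercept


print(trend_value(1969))  # value of the line at the start of the range
print(trend_value(2010))  # value of the line at the end of the range
```

Calling trend_value(0) raises an error instead of quietly returning the intercept.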

The 0.07203 is R², the coefficient of determination, which is the square of the correlation coefficient r. R² is always a number between 0 and 1, and the closer it is to 1, the more confident we are that the line tells us important information about the data. 0.07203 is not close to 1. Even with a data set with 42 points, the lowest R² that makes us 95% confident about what we can say is 0.0973.
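A threshold of this kind can be sketched from the usual t-test for a correlation, assuming a two-tailed test at the 5% level with n - 2 degrees of freedom. The critical t value below is copied from a standard t table rather than computed, and the function name is hypothetical. Under these assumptions 42 points give a threshold near 0.093, a little below the 0.0973 quoted above (which is what 38 degrees of freedom would give). Either way, 0.07203 falls short.

```python
def critical_r_squared(t_crit, n):
    """Smallest R² that passes a t-test for correlation with n data points.

    Uses r² = t² / (t² + df) with df = n - 2.
    """
    df = n - 2
    return t_crit ** 2 / (t_crit ** 2 + df)


# Two-tailed 5% critical t for 40 degrees of freedom, from a t table.
T_CRIT_DF40 = 2.021

threshold = critical_r_squared(T_CRIT_DF40, 42)
observed = 0.07203  # R² from the picture

print(round(threshold, 4))
print(observed >= threshold)  # False: not significant at the 5% level
```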


Okay, the data isn't really close to linear. We can see that ourselves.  The reason I chose the years 1969 and 2010 is that those were peak years of the Atlantic Oscillation, the change in ocean temperatures that should have the most impact on Greenland temperature data.

As you can see, the Atlantic Oscillation is puny compared to the changes in average Winter temperatures in Greenland.  Be that as it may, let's factor it out to see what difference that makes.


The green line is the blue line minus the average minus the red line.  I subtracted the average just so the two lines wouldn't be right on top of each other. As you can see, factoring the oscillation out did not change much.
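The subtraction behind the green line is simple enough to write out. This is a sketch only: remove_oscillation is a hypothetical name, and the sample lists are made-up numbers rather than the real temperature or oscillation series.

```python
from statistics import mean


def remove_oscillation(temps, oscillation):
    """Return temps minus their own average minus the oscillation, year by year."""
    if len(temps) != len(oscillation):
        raise ValueError("both series must cover the same years")
    avg = mean(temps)
    return [t - avg - o for t, o in zip(temps, oscillation)]


# Hypothetical values, for illustration only.
blue = [1.0, 2.0, 3.0]   # average Winter temperatures
red = [0.5, 0.0, -0.5]   # oscillation index for the same years
green = remove_oscillation(blue, red)
print(green)
```

Subtracting the average shifts the whole line up or down without changing its shape, so it has no effect on the slope or on R².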


And now the trendline. Again, the constant -80.798 in the trendline equation is not important, but the slope of 0.0407 and the R² of 0.04628 are important. The slope says that the upward movement is slower, only increasing a degree in 25 years instead of 20. (Note: Scientists agree that a degree in 50 years is fast, possibly catastrophic.  Either of these speeds is a cause for concern.)

The more important number is R², which shrank from about 0.072 to 0.046.  If the correlation had gone up enough to create 95% confidence, I would have considered that I was on the right track. But this experiment didn't work very well and it was not the only one.  Other seasons also showed more variation instead of less when the Oscillation was factored out. Working with the stronger El Niño/La Niña oscillations on the data of a place like tropical India did not bring the correlation up either.

I will continue to look at climate data and will occasionally present data on areas based on similarities of weather rather than just arbitrary longitude and latitude lines. I'm going to see if I can find a climate scientist who has time to talk about this stuff. But subtracting out the oscillations of ocean currents was a promising lead that turned out to be a dead end.

Stuff like that happens when you do research.

