Climate and Carpentry

Crossposted from Scimatic

Jamie and I had the pleasure of going out for dim sum with Greg Wilson the other day. You might remember Greg from such classics as Science 2.0 and Stack Overflow Dev Days Toronto. He was talking to us about his Software Carpentry course that he’s been running out of the University of Toronto and University of Alberta.

Software Carpentry is a two-week course for science graduate students who need to do software development but may not have had CS training. It covers things like scripting, version control, and unit testing. Based on my own experience in graduate school, I know this kind of course would have saved me a lot of time and pain.

Back in the dark ages (1990s) when I was in grad school, we coded in Fortran, didn’t write unit tests, and our version control was something like "Get my latest source out of this directory: /home/jim/calibrations/beware_of_the_jaguar_new_calibration_1997_06_09/". Needless to say, I learned the lessons taught in Greg’s course the hard way.

The course is great and desperately needed; however, there is push-back from some in the academic community as to its necessity. The arguments run along the lines of "grad students are cheap, why should we pay more to train them outside of their field?" and "they can learn all the coding they need from books and more senior students."

Well, there’s a great counter-example to that in the news recently: ClimateGate!

The Climatic Research Unit at the University of East Anglia has one of the most-cited climate change databases in the world. This database has been the basis for some of the UN’s conclusions that climate change is real and is driven by mankind (Anthropogenic Global Warming). Recently, either an external hacker or an internal whistleblower released to the internet a series of emails and source code that imply some non-transparent dealings by the scientists with respect to their data. Whether it’s merely academic infighting or actual fraud is up for debate.

Regardless of your feelings about anthropogenic global warming and the politics of academia, one thing is clear from the released source code: everyone working on that code base needs to attend Greg’s Software Carpentry course. One has to feel bad for Ian "Harry" Harris, whose HARRY_README.TXT shows that the CRU source code has just about every mistake possible. And because of those mistakes, we can’t answer some very real questions about the conclusions coming out of the CRU.

The point here is not that "ClimateGate" has killed AGW. My point is simply that bad software development practices have not helped settle the AGW question; they have further muddied the waters. If the folks at CRU had been open about their practices, we could have an honest debate about AGW and what to do about it. Because of their bad development practices, at least in part, we can’t have that discussion, and the anti-AGW folks now say there’s no point in talking about it at all: global warming is dead.

I wouldn’t go that far, but once you lose Rex Murphy, you’re in trouble.

So, to all the academic supervisors and grant approvers out there: get grad students trained up on software development (and statistics, but that’s another post). Every major bit of physical science from now on will involve programming. Training students to do software development is the same as teaching them calculus. It’s not a luxury, it’s a necessity.

What? They don’t learn calculus anymore?! Kids!
