Email storage and a lump of coal

TreeHugger is not the first to notice that computing technology can have an environmental impact and that some “systems” can be greener than others. In an invited talk at Microsoft Research in 2004, Andrew Shapiro, of the Berkman Center and author of The Control Revolution, raised the question of whether Linux could be deemed more environmentally friendly because it ran on lower-end hardware that would not meet the base requirements for modern Windows SKUs. (He was polite enough not to answer this question given the audience.) Similarly, it is widely acknowledged that data centers today are gated by cooling and power consumption, with air conditioning being one of the prime resource hogs, and that the availability of power generation is a significant factor in selecting “hot-spot” locations for building them.

The TreeHugger post frets over the cost of email storage and wonders whether deleting email will curb carbon emissions. Good intentions for sure, but the calculation may have been slightly off base for several reasons. First the bad news: storage in large-scale services like the one cited in the article is replicated. There can’t be just one copy of the message sitting around. Try explaining to a user that you lost all of their vacation pictures because drive #3385 failed (the so-called “we blame Seagate” approach). Taken alone, that implies the figures underestimate the true impact. The good news is that this would hold only in a simplistic model where power consumption scales with the amount of data stored. Transaction capacity is often the determining factor for data center design. If one million people are checking email at the same time, enough servers have to be up and running to process those requests with tolerable latency. That’s true even if everyone keeps an empty inbox.
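
To make the point concrete, here is a toy sizing model, a rough sketch rather than how any real service is provisioned. It assumes drives are bolted onto servers, and every number in it (requests handled per server, storage per server, replica count) is made up for illustration. The number of servers that must be powered is whichever is larger: the count needed for traffic or the count needed to hold the replicated data.

```python
def servers_needed(users_online, inbox_mb_per_user, total_users,
                   users_per_server=5_000,      # hypothetical: concurrent users one server can handle
                   mb_per_server=20_000_000,    # hypothetical: storage attached to one server (20 TB)
                   replicas=3):                 # hypothetical: copies kept of every message
    """Toy model: servers required is the larger of the traffic-driven
    and the storage-driven requirement. Not a real capacity plan."""
    # Ceiling division with integers, written as -(-a // b).
    for_traffic = -(-users_online // users_per_server)
    stored_mb = total_users * inbox_mb_per_user * replicas
    for_storage = -(-stored_mb // mb_per_server)
    return max(for_traffic, for_storage)


if __name__ == "__main__":
    for inbox_mb in (0, 100, 1000):
        n = servers_needed(1_000_000, inbox_mb, 10_000_000)
        print(f"{inbox_mb:>5} MB per inbox -> {n} servers")
```

With these invented figures, an empty inbox and a 100 MB inbox both come out at 200 servers because traffic is the binding constraint; only at 1000 MB per inbox does storage take over and push the count to 1500. Where the crossover falls depends entirely on the assumed parameters.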

Similarly, different storage architectures can lead to very different resource consumption patterns. If drives are directly attached to servers, then more storage means more servers, even if the servers sit idle CPU-wise. If the service uses a storage area network (SAN), then only drives are being powered and not all the extra baggage that would come with a full-fledged server. This is similar to the difference between using a networked drive at home versus another general-purpose PC for handling backups. Finally, there is the storage corollary to Moore’s law: disk sizes increase, price drops and so does power consumption per GB. (Unfortunately there is also a storage corollary to Parkinson’s law, which states that data expands so as to fill the drive available.) It’s true that less storage will achieve some reduction, but the TreeHugger article probably overestimates this by several orders of magnitude. And if a hosted cloud service were compared to storing the same amount of data at home, there would be no contest: those massive data centers achieve economies of scale and a corresponding eco-efficiency not available to the average consumer who is not living off the grid with solar panels.

