Data Storage Problems

This week the New York Times published a long article about the problem of data storage, and I would like to summarize some of its findings. The article is available here in Saturday’s technology section.

The article is an attack on what the author sees as wasteful use of resources in data storage centres. There are now hundreds of thousands of these huge centres spread throughout the world, and the problem is that they use an incredible amount of electricity. The servers have to be kept cool, and they have to have spare capacity so that we can download whatever we want whenever we want.

Inside a US data centre

Worldwide, these centres use about 30 billion watts of electricity, roughly the output of 30 nuclear power plants. A single data centre uses about as much as a small town, and the main criticism concerns the nature of that usage.

In the US, 2% of all electricity used goes to these data centres, but the vast majority of this resource is wasted. Typically, many servers are left to run 24 hours a day but are never or rarely used (more than half in this study), and the average machine in operation uses less than 10% of its capacity. Servers are left running obsolete programs or in a ‘comatose’ state because nobody wants to risk a mistake by turning them off.

All of this means that any data center might use 30 times as much electricity as is needed to carry out the functions it performs.

All of these centres also have to have a backup in case of power failure, and so are surrounded by diesel generators and stacks of batteries. Many have been found in breach of environmental regulations and fined. The article gives details, and the companies are names that we all know and use.

If you read the more than 300 comments, however, you will discover that a lot of people do not agree with the findings as reported. Many technicians argue that the companies cited are investing huge amounts of money in making the storage of data more efficient, and are constructing wind farms and using solar power in an attempt to cut costs and emissions. The article has its agenda and exploits it fully, but the problem is real.

I personally believe that we are witnessing the results of a digital culture change. We no longer have to store data on our machines; we can store it in some mythical cloud out there in the cyber-universe. This makes us think that it somehow exists without the need for a hard drive, but this is not true. As a result we keep things that we do not need. I have 500 emails in my inbox, with attachments, photos that I will never look at again and other useless things, and they are all in storage somewhere.

Technology advances, storage gets cheaper and uses less space, but the amount of data created is growing at an incredible rate. My question is, can we do anything about it? Are we not the ones who should take some responsibility and think about the consequences of our actions? We think about not using paper to print emails, but we don’t think about not sending them!

Apple and some bad press

This week I would just like to do a short follow-up on Christopher’s article entitled Why do we stick by Google and Apple but not Microsoft?

I will start with a little story about my 6-year-old boy. He loves making things, and last week he made a laptop computer from cardboard. It has keys with letters on them, a mouse and a black screen with icons. When it was finished he showed it to me, pointing out the detail, and said, “I am not going to put the apple here on the top though, because they don’t all have apples.”

This is the power of branding. In a few years when he wants a laptop of his own what will he want? The one with the apple?

My kind of branding

Recently, though, here in the US I have met a few people who do not use Apple products on principle. The reason, I think, is the bad press that the company’s working practices have received in US newspapers.

I am not in any way endorsing these reports, but I feel that they are worth analyzing in a bit of detail, both for their content and for their political or ethical standpoint.

The New York Times ran an entire series in which it looked at the human costs of the iPad and Apple revolution, and you can read it here (not too long).

The opening lines speak of an explosion in which 2 people were killed as they polished iPad cases. This is not the only reported explosion either, and there are plenty of reports of people being burnt as they use chemicals without proper safety procedures, of excessively long days spent entirely standing up, and of child labour.

It is not just Apple that use these manufacturing plants, however, and the scale of the operations is incredible. One of the names often cited for criticism is Foxconn, and they do a lot of Apple’s assembly. They have 1.2 million employees in China, and some plants have more than 100,000 workers. They operate 24 hours a day, can call upon their workforce at any time and can start production within minutes of receiving orders. This is what the technology of today requires and produces.

Apple do have a code of conduct for their supply chain, drawn up in 2005 and expanded upon since. Audits are conducted and violations unearthed, and they say that this is a sign of their commitment to improvement, but some say that the fact that the problem is continuous points to a toleration of non-compliance.

Last year they found 4 deaths and 77 injuries within their production system, and several suicides. Now one death or injury is too many, but with a workforce of well over a million, accidents will happen, and some might even see this as a good record. Apple state that they train their workforce and explain their rights to them.

One thing is for sure: the stakes are high and there is a lot of money to be made, but Apple is a demanding company. They are not the only ones with dodgy working practices, but they seem to be singled out for criticism.

Why might that be, I wonder? Maybe it is because, as Christopher hinted, they inspire such loyalty amongst their users, and some circles do not like that.

How to proceed in the age of big data?

A couple of weeks ago I read an article in the New York Times about the age of big data, and today at a science and technology conference I got into a conversation about the same thing with a US public health official.

Much has been written (and I am a guilty party) about Google’s quest for information, including allegations of infringements of privacy, but not all of this capability should be seen in a negative light. I would like to give you a few examples of why.

A wealth of data

Google collect all of the search terms used by every user and categorize them. Let’s take a hypothetical situation. You are director of a large hospital in Manchester. What can Google tell you about your job? Well, probably a lot. Let’s say that this week there is an enormous peak in the search terms “flu symptoms” or “rash on back and neck” across the Greater Manchester area. Indirectly, the knowledge of these search trends tells you that you should prepare your hospital, because late next week you will have a massive influx of patients with the flu or some other contagious disease as it takes hold of the population.

This information is potentially lifesaving, as one of the main problems with epidemics is that they seem to come out of nowhere, and so health centres are not properly prepared.
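To make the hospital example a little more concrete, here is a minimal sketch of how a peak in search volume might be flagged. Everything in it is an assumption of mine for illustration: the weekly counts are invented, the function name `find_spikes` is hypothetical, and the rule (flag a week that sits more than 3 standard deviations above the average of all earlier weeks) is just one simple choice, not a description of how Google or anyone else actually does it.

```python
from statistics import mean, stdev


def find_spikes(weekly_counts, threshold=3.0):
    """Return the indices of weeks whose count is more than `threshold`
    standard deviations above the mean of all earlier weeks."""
    spikes = []
    # Start at week 3 so there are at least three earlier weeks to compare against.
    for i in range(3, len(weekly_counts)):
        history = weekly_counts[:i]
        mu = mean(history)
        sigma = stdev(history)
        # Skip weeks where the history is perfectly flat (no spread to measure).
        if sigma > 0 and (weekly_counts[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes


# Invented weekly counts of "flu symptoms" searches in one area.
flu_searches = [120, 130, 125, 118, 135, 128, 410]
spike_weeks = find_spikes(flu_searches)
print(spike_weeks)  # [6]
```

In this made-up series only the last week is flagged, which is the kind of early signal the hospital director in the example would want to see.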

Search terms can also give an indication of how the housing market will behave, with a rise in searches for houses in a certain area being reflected 6 months later in new sales. The type of house searched for could also improve planning, as developers would see what people were looking for and where.

Analysts and programmers are currently working on how to expand on the simple examples above, using search terms as wider indicators. A system called ‘sentiment analysis’ looks particularly promising.

This form of analysis looks at terms used during online communication and categorizes them according to the sentiments they express. The logic is that in an area that is prospering the terms will be generally positive, but in an area that is threatened by decline, such as the closure of industry or other societal problems, the terms will differ. This is not dissimilar to the conversation analysis sociologists use to obtain a person’s own sentiments about their position in life, with their true feelings reflected in the terms they use without thought. The hope is that an accurate analysis of this type might signal unfolding problems before they become a reality, so that action can be taken in specific areas to avoid social breakdown.
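As a rough illustration of the idea, the sketch below scores a piece of text by counting words from two small lists. The word lists, the function name `sentiment_score` and the sample sentences are all invented for this example, and real sentiment analysis systems are far more sophisticated than a simple word count.

```python
# Tiny, made-up word lists; a real system would use a much larger lexicon.
POSITIVE = {"good", "great", "hopeful", "growing", "happy"}
NEGATIVE = {"bad", "closing", "worried", "unemployed", "angry"}


def sentiment_score(text):
    """Return a score from -1.0 (all negative terms) to 1.0 (all positive terms).

    Text containing none of the listed terms scores 0.0.
    """
    # Lowercase each word and strip surrounding punctuation before matching.
    words = [w.strip(".,!?;:'\"").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


print(sentiment_score("Business is growing, people are hopeful!"))  # 1.0
print(sentiment_score("The factory is closing and I am worried."))  # -1.0
```

Averaging scores like these over all the messages from one area, week by week, would give the kind of trend that might show a mood shifting before the problems become visible in other ways.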

I have addressed these issues in more depth on the Bassetti Foundation website, but want to conclude by saying the following: in my posts I have often raised the issue of data collection as a problem, and the collection of personal data for advertising, or for any other purpose for that matter, does raise serious ethical issues. Here, though, Google et al could be sitting on a mine of extremely useful and possibly globally important data, if the technology and political will are developed to use it correctly.