EDITOR NOTE: This is Jonny’s 75th post on Technology Bloggers! Jonny was a complete newbie to blogging when he wrote his first post (about prosthetic limbs) but he is now somewhat of an expert – although he probably wouldn’t agree! – note by Christopher
Recently a couple of articles have appeared on large US websites about a type of search engine called Shodan. This search engine has been around for about 3 years, but it is different from Google and its cohorts in many ways. I looked at it and could not understand it at all, so what is it, and why is it causing such concern?
Expose online devices
I have seen Shodan described as “The scariest search engine on the Internet”. This CNN Money article explains that Shodan navigates the Internet’s back channels. It’s a kind of “dark” Google, looking for the servers, webcams, printers, routers and all the other stuff that is connected to and makes up the Internet.
What interest could there be in such a capability? Well, a lot, apparently. The system allows an individual to find security cameras, cooling systems and all types of home control systems that we have connected to the Internet. (See Christopher’s series about his British Gas system here).
One serious problem is that many of these systems have little or no security because they are not perceived as being under threat. Shodan searchers have, however, found control systems for a water park, a gas station, a hotel wine cooler and a crematorium. Cybersecurity researchers have even located command and control systems for nuclear power plants and a particle-accelerating cyclotron by using Shodan.
Hacking apart, it turns out that the world is full of systems that are attached via a router to the office computer and web server, and on to the outside world. They are accessible to anyone who can find them and might like to turn off the refrigeration at the local ice rink, shut down a city’s traffic lights or just turn off a hydroelectric plant.
The Shodan system was designed to help police forces and others who might have legitimate need for such a tool, but what happens when it gets into the wrong hands? Security is non-existent: just get your free account, do a few searches and see what you find.
See this Tech News World article for a further look at the ethical and practical issues that such a freely available product might bring.
Regular readers will be aware of my interest in these types of problems through my work at the Bassetti Foundation for Responsible Innovation. I am not sure how the development and marketing of such a tool could be seen as responsible behaviour, but, as I have been told on many occasions during interviews, there are plenty of other ways of finding out such things. The argument goes that these types of systems are only gathering already available information to make it usable, nothing more, and so are not doing anything wrong.
The internet is big, right? Okay, it is massive. With that massiveness one naturally associates extreme diversity. Don’t get me wrong, across the entire internet there is amazing variation, with billions of people adding their spin to the net.
What I am going to investigate in this post though is how diverse the ‘main’ internet is. What I mean by that is the internet that we use every day. How diverse is the most regularly used/visited content? Is there really as much choice as we think, or is the majority of the internet dominated by a few firms?
In order to go about this research I am going to use Alexa, which gathers statistics on website traffic. For most sites the data isn’t that accurate, but for really busy sites the numbers are so large that the data is much more reliable, which is why I can use it here.
According to Alexa, Google.com is the most visited site on the web. How could it not be? Alexa estimates that 50% of all internet users visited Google.com in the last three months. Second on the list for most visited sites is Facebook, which is trailing with just 45% of internet users visiting the site.
Remember, however, that this is just Google.com. Google has a massive monopoly over the internet. In the 100 most visited sites on the web, 18 of the sites are owned by Google – 16 localised sites, Google.com and GoogleUserContent.com (the site you see when there is an error finding/displaying a page).
Google undoubtedly has reduced diversity on the internet, having such a monopoly on the sites we all visit. The thing is, it isn’t just 18 sites. Google also owns YouTube (the third most visited site on the net), Blogspot, which is ranked 10th, Blogger at 47 (Blogger and Blogspot are now one) and Blogspot.in (India), ranked 73. Counting Blogger and Blogspot as one, that means 21 of the most visited sites on the net belong to Google, meaning it owns more than one fifth of the ‘main’ internet.
Google’s dominance on the web suggests that a lot of us are Googlites!
Can you call the internet diverse when, in the top one hundred sites, one firm owns more than a fifth of all sites? Maybe. What does the rest of the field look like?
Unsurprisingly, the company that is arguably Google’s main rival is in second place. Yahoo and Microsoft are currently in a ‘Search Alliance’, which restricts competition, so I am going to count Yahoo’s sites in the list of sites that Microsoft owns/influences. Here is the list of sites that Microsoft owns/influences which are top 100 websites:
Yahoo.com – Ranked 3rd
Live.com – Ranked 7th
Yahoo.co.jp – Ranked 16th
MSN.com – Ranked 17th
Bing.com – Ranked 29th
Microsoft.com – Ranked 30th – ironic how it is lower than many of the other sites it owns!
Flickr.com – Ranked 53rd and Yahoo owned
Therefore Microsoft owns/influences 7 of the top 100 sites. Add that to Google’s 21, and 28 of the top sites on the net are owned by two firms. More than a quarter.
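For anyone who wants to check the sums so far, here is a rough sketch in Python. The ranks are the ones quoted in the list above and the figure of 21 is my count for Google; the variable names are just ones I have made up for illustration.

```python
# Ranks as quoted in the list above (Alexa, 6 July 2012).
# All names here are illustrative, not part of any Alexa tool.
microsoft_yahoo_sites = {
    "yahoo.com": 3,
    "live.com": 7,
    "yahoo.co.jp": 16,
    "msn.com": 17,
    "bing.com": 29,
    "microsoft.com": 30,
    "flickr.com": 53,
}

GOOGLE_SITES = 21  # the count for Google given earlier in this post
TOP_N = 100        # size of the 'main' internet being examined

microsoft_count = len(microsoft_yahoo_sites)
combined = GOOGLE_SITES + microsoft_count
share = combined / TOP_N

print(f"Microsoft/Yahoo: {microsoft_count} sites")
print(f"Google + Microsoft/Yahoo: {combined} of {TOP_N} ({share:.0%})")
```

Swapping in fresh ranks from Alexa would not change the count unless a site dropped out of the top 100 altogether.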
I am starting to think the ‘main’ internet is not as diverse as one may first assume.
Next on the list of internet giants comes Amazon. Amazon.com is ranked 10th, whilst Amazon Germany (Amazon.de) is ranked 91st and Amazon Japan (Amazon.co.jp) is 95th. Amazon also owns the Internet Movie Database (IMDB.com) which is the 50th most visited site. Amazon owns 4 of the top 100 sites.
32 sites gone.
The Alibaba Group is a privately owned Chinese business, which owns Alibaba.com, Tmall (tmall.com), Taobao (Taobao.com) and Sogou.com. The group therefore account for four of the sites that make up what I am calling the ‘main internet’.
36 sites taken by just 4 companies. How diverse is our internet?
Next we come to eBay.com which sits 23rd on the list of top 100 sites. eBay International AG (ebay.de) is in 80th place, followed by eBay UK (ebay.co.uk) in 86th. eBay also owns PayPal (paypal.com) which is ranked 46th.
eBay steals another 4 sites, leaving just 60 of our hundred left, and so far only 5 firms are involved.
CNN (cnn.com), AOL (aol.co.uk) and The Huffington Post (huffingtonpost.com) are all sites owned by Time Warner. Time Warner is now the sixth business involved, leaving just 57 sites.
The blogging platform WordPress (wordpress.com) is ranked 19th, and its brother, which allows users to host the content management system on their own site (wordpress.org) is ranked 83rd.
There goes another two sites, meaning just 55 left, and only seven players so far.
Ranked number 8 on the list is Twitter, and its URL shortener (t.co) is ranked 31st, meaning Twitter is also one of the big players in the top 100 sites, arguably with some form of domination over the internet.
47 sites of the top 100 accounted for and a mere eight organisations involved.
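The running count above is easy to lose track of, so here is a minimal sketch that replays it. The per-company figures are the ones I have given in this post; the names `owned` and `running` are only for illustration.

```python
from itertools import accumulate

TOP_N = 100

# Sites per organisation, in the order they appear in this post.
owned = [
    ("Google", 21),
    ("Microsoft/Yahoo", 7),
    ("Amazon", 4),
    ("Alibaba Group", 4),
    ("eBay", 4),
    ("Time Warner", 3),
    ("WordPress", 2),
    ("Twitter", 2),
]

# Cumulative number of top-100 sites accounted for after each organisation.
running = list(accumulate(count for _, count in owned))

for (name, count), total in zip(owned, running):
    print(f"{name:<16} +{count:<2} -> {total} taken, {TOP_N - total} left")
```

The last line of output should show 47 taken and 53 left, which is where the next paragraph picks up.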
Of the final 53 sites, 5 are adult only sites, leaving 48 sites, although many of these are either part of a much bigger group or are big groups in their own right.
Some familiar faces appear in the other 48 sites, Facebook (2nd), Wikipedia (6th), LinkedIn (11th), Apple (34th), Tumblr (37th), Pinterest (47th), BBC Online (48th), Ask (54th), AVG (62nd), Adobe Systems Incorporated (67th), About.com (81st), ESPN (82nd), Go Daddy (85th), Netflix (89th), The Pirate Bay (92nd) and CNET (97th).
Remove these very well known, well established, and massive brands, and we are left with 32 sites – less than a third. Of the remaining sites, around half are Chinese, showing the growing influence and usage of the internet in China.
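The last step of the count can be checked the same way. This is only a sketch using the names listed above; the variable names are my own.

```python
TOP_N = 100
OWNED_BY_EIGHT = 47  # sites belonging to the eight organisations above
ADULT_SITES = 5

# The familiar names listed in the previous paragraph.
familiar = [
    "Facebook", "Wikipedia", "LinkedIn", "Apple", "Tumblr", "Pinterest",
    "BBC Online", "Ask", "AVG", "Adobe", "About.com", "ESPN",
    "Go Daddy", "Netflix", "The Pirate Bay", "CNET",
]

others = TOP_N - OWNED_BY_EIGHT - ADULT_SITES  # sites not yet accounted for
remaining = others - len(familiar)             # what is left after the big brands

print(f"{others} other sites, {len(familiar)} familiar names, {remaining} remaining")
```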
In this post I have established that of the sites we visit most regularly, 47 are owned by just eight organisations. Does that really represent the freedom that we all believe the internet offers?
I was surprised by the type of content, and the limited number of different sites that there are in the global top 100. It would seem that the most visited sites consist of search engines, social media sites and news websites. Interesting statistics.
So, what is your verdict on how diverse the internet we use every day is? Personally, I am not as convinced as I was before writing this article that the internet is as free and diverse as we all believe.
Please note these rankings are changing all the time, and all content was correct according to Alexa.com at the time of writing – the 6th of July 2012.