Adobe’s Flex-ibility to the Deep web

We’re making our way to Web 2.0 and everything is going smoothly. Well, at least we think it is. Most of you probably don’t know that our migration to Web 2.0 could become a major problem for the future web experience. Whoa, wait up: our web experience will become a problem in the future? Wasn’t Web 2.0 supposed to be all about the best web experience we could ever imagine?

This question has multiple answers, depending on who you ask:

  1. Web 2.0 sucks, we can’t index it properly (search engine builders)
  2. Web 2.0 is the best I’ve ever seen (users)
  3. Web 2.0 takes web development to the next level (web developers)
  4. Web 2.0 can be a pain in the ass, our servers are crashing as we speak (web hosting companies)

Most people will agree that numbers 2 and 3 are positive, and we all know why they say this. Users like it because of the nice interfaces and the smooth effect shown when the information refreshes. Web developers like it because they can do pretty much everything they could do on their desktop.

Number 4 is a little strange. Web 2.0 was also ‘designed’ to reduce the data transfer between server and client by not sending the same page every time, but only the parts of the page that change. But it seems to work out differently (in most cases). MySpace, for example, is still slow because of the data its servers have to process, and that load even increased after the website was converted to Web 2.0 with AJAX. But there are also a lot of websites that reduced the data their servers process by using Web 2.0 technologies. It therefore seems to depend on the developer whether or not it will be effective. Proper use and implementation of Web 2.0 technologies will reduce the amount of data a server has to process.

Number 1 is the one we have to discuss, because it can’t be solved as easily as number 4 above. This is all about search engines and Web 2.0 technologies. As I have explained in Google + Ajax = Troubles, search engines don’t index (crawl) client-side generated content. That is exactly the case with the Web 2.0 technology called AJAX, where JavaScript provides the connection between server and client and requests the changed content of a page.
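To make this a bit more concrete, here is a minimal Python sketch of what a crawler that does not run JavaScript gets to see. Everything in it is made up for illustration: the page, the `/latest.html` URL, and the `TextOnlyCrawler` and `index_page` names are my own assumptions, and real search engine crawlers are of course far more sophisticated than this.

```python
from html.parser import HTMLParser

# Hypothetical AJAX page: the real content is requested by JavaScript
# after the page has loaded, so it is not part of the HTML itself.
PAGE = """<html><body>
<h1>News</h1>
<div id="content"></div>
<script>
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/latest.html");
  xhr.onload = function () {
    document.getElementById("content").innerHTML = xhr.responseText;
  };
  xhr.send();
</script>
</body></html>"""


class TextOnlyCrawler(HTMLParser):
    """Collects the text a crawler sees when it never executes scripts."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script source is skipped: it is code, not page content.
        if not self.in_script:
            self.words.extend(data.split())


def index_page(html):
    """Return the list of words that would end up in the index."""
    crawler = TextOnlyCrawler()
    crawler.feed(html)
    crawler.close()
    return crawler.words


print(index_page(PAGE))
```

The only word this toy crawler picks up is the heading, "News". Whatever `/latest.html` would have put into the empty `content` div never reaches the index, because nothing ever ran the script that fetches it.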

This information, which only real (human) users can access, either because it is only shown for a particular user/pass combination or, as described above, because it only pops up when some client-side script requests it, is stored in the so-called Deep web (or Deepnet, invisible Web or hidden Web). In 2000, it was estimated that the deep Web contained approximately 7,500 terabytes of data and 550 billion individual documents. Estimates based on extrapolations from the University of California, Berkeley study entitled How much information is there? show that the deep Web consists of about 91,000 terabytes. By contrast, the surface Web, which is easily reached by search engines, is only about 167 terabytes.

Imagine the information you could access if search engines could reach the deep web!

Not knowing this, almost every self-respecting web developer is trying to build his/her own website in Adobe’s Flex or whatever other nice-looking Web 2.0 technology, unaware that he/she is building for the deep web and that, for the time being, this website will not be completely indexed by any search engine.

Many search engines have picked up on this problem and are busy trying to find a way to index the deep web. But until a solution is found, many Web 2.0 websites will not be indexed by search engines and the deep web will grow faster than ever before.

Pakku
