Monday, November 14, 2011

Week 12 Blog Post and Muddy Point

The surface web consists of sites that have been indexed by automated spider programs, which crawl static pages linked to from multiple other pages. This is the portion of the Internet that most of us are familiar with, but there is another portion called the 'Deep Web' that is difficult to measure because its content can only be retrieved by entering a precisely correct query into a search engine. Much of the Internet falls into this "Deep Web" category.

BrightPlanet developed a search technology capable of running multiple simultaneous searches in an attempt to make deep web content quantifiable and accessible. It found that the deep web is 2,000 times larger than the World Wide Web most of us are familiar with. Deep web sites are more frequently used than surface sites but are less well known; I don't understand how that makes sense. Even Google is only searching 1 in every 3,000 pages available.

It's amazing to think how much more access to, and accurate retrieval of, information we could have with even 50% efficiency. Considering how convenient and available information seems now, it's exciting to imagine a world in which information could be hundreds of times more retrievable. I have to admit that I never imagined there was still that much untapped potential, even with the current generation of technology. Given the speed at which storage and transmission technologies advance, I wonder how likely it is that search and retrieval will ever come close to bridging the gap of this vast unused potential.

Muddy Point: No Muddy Point from last week's stuff.
