How Google's Spider and PageRank Work

Post by funke on Fri Jan 11, 2008 9:22 am

It is amazing how much the Internet borrows from the living world. Take the web, for example. We are so used to talking about the web of the Internet that we never stop to think about the spider web in the garden. And yet it is from that garden spider web that the web derived its name. Like the spider web in the garden, the web in cyberspace consists of interconnecting threads, or links, tying one website to another. The only difference is size: the web of the Internet consists of billions and billions of links. And it is nowhere near as neat as its cousin in the garden.

To make sense of this massive, entangled heap of threads, we use search engines such as Google to find what we want. We go to the Google search engine, type in what we want to find, and it serves us a list of results. Have you ever wondered how Google does it?

Google uses a program that browses through the World Wide Web to find, index and compile content. Just as a spider crawls through the web in the garden, the program used to crawl through the World Wide Web is also called a "spider". A spider program, also called a web crawler or a web spider, is used by all the search engines to look for content and compile it.

If you want your website or webpage to come up when someone uses a search engine, you want to make sure the web spider has crawled your website. To achieve that, there has to be a link from some other website to your website. Websites that stand alone never appear in the search rankings. That's not what you want. You want people to find your website, and to find it repeatedly.

On the Internet today, there are millions and millions of websites. Some are linked to one another while others stand alone. A link exists when you link to somebody, when somebody links to you, or when you both link to each other. This linkage, called a hyperlink, is a very important component of the Internet, for it allows the spider to reach your website or webpage, and ultimately to give it a ranking. Websites that don't link to anybody, and that nobody links to, stand alone and are virtually ignored by the spider. Quite simply, they are out of reach.

What is a web spider? It is a program or automated script that browses through the World Wide Web in a methodical, automated manner. The process of browsing through the pages is called web crawling or web spidering.
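To make the idea concrete, here is a minimal sketch of a spider in Python. It is not how Googlebot is actually written; it only assumes a tiny made-up web (the `TOY_WEB` dictionary and the `.example` site names are hypothetical) and walks it breadth-first from one starting page, following every link it finds.

```python
from collections import deque

# Hypothetical toy web: each page maps to the list of pages it links to.
TOY_WEB = {
    "portal.example": ["news.example", "shop.example"],
    "news.example": ["blog.example"],
    "shop.example": ["portal.example"],
    "blog.example": [],
    "lonely.example": [],  # nobody links to this page
}

def crawl(web, seeds):
    """Return pages in the order a breadth-first spider would discover them."""
    seen = set(seeds)
    queue = deque(seeds)
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        # Follow every hyperlink on the page that has not been seen yet.
        for target in web.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order

visited = crawl(TOY_WEB, ["portal.example"])
print(visited)
print("lonely.example" in visited)
```

In this toy web the spider never reaches `lonely.example`, because no page it visits carries a link to it. That is the "standing alone" situation described earlier.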

All the major search engines such as Google, Yahoo, MSN, etc. have their own web spiders. Google's spider is called Googlebot. There are two types of Googlebot, in fact, called deepbot and freshbot. The deepbot is a spider that tries to follow every link on your webpage. It brings the information back to the Google indexers to analyze and index. The freshbot is a spider that crawls through the web looking for new content, and may visit your website frequently.

In order to determine the importance of every website on the Internet, Google devised a ranking system called PageRank. The name PageRank is a trademark of Google, and the PageRank process is patented - the patent was assigned to Stanford University, where the founders of Google, Larry Page and Sergey Brin, developed it. PageRank ensures that the most important websites are duly accorded their place on the Internet. Every page of every website on the web is assigned a PageRank from 0 to 10, with 0 being the least important and 10 being the most. Every new website starts at 0, and tries to work its way up.

How do the spiders work? I am simplifying things, but basically, the spider starts from the websites with the highest ranking, say PageRank 10, and works its way down towards PageRank 0. Pages with PageRank 10 get enormous attention from the spiders - the freshbot spider might visit them many, many times in an hour. On the other hand, PageRank 0 sites might not get any attention at all. What the spiders do is read through a page, starting from the top left and ending at the bottom right. If they encounter a hyperlink, they follow that link to the next page, and start reading there. What you want is for the spider to follow a hyperlink from someone's website to your website. It doesn't help you if you place a link from your website to someone else's website - the link must come from their website to yours.
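The "read the page from top to bottom and pick up the hyperlinks" step can be sketched with Python's standard `html.parser` module. This is only an illustration, not Google's code: the `LinkCollector` class, the `extract_links` helper, the sample page and its `.example` URLs are all made up for the example.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, in the order it appears on the page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being read.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    """Return the hyperlinks found in an HTML string, top to bottom."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links

# Hypothetical page on someone else's website that links to yours.
SAMPLE_PAGE = """
<html><body>
  <h1>Garden spiders</h1>
  <p>See <a href="http://their-site.example/webs">this guide</a>.</p>
  <p>Also <a href="http://your-site.example/">a friend's page</a>.</p>
</body></html>
"""

print(extract_links(SAMPLE_PAGE))
```

Each address collected this way is a page the spider can go on to visit, which is why a link on their page pointing to yours is what gets you crawled.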

When someone links to your website, the page carrying the link gives a "ballot" to your webpage. The more people link to your website, the more votes you collect, and the higher your PageRank. But in this balloting process, votes are not equal. A webpage that has a high PageRank casts a stronger vote than pages with low PageRank. If you can get important pages to link to you, you earn their strong votes, significantly elevating your webpage's PageRank position. At the same time, you should not link generously to other websites: the vote a page passes on is shared among all of its outgoing links, and many webmasters see careless linking as bleeding PageRank away. In short, you want incoming links from high PageRank webpages, and you do not want to give outgoing links to just anybody.
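The voting idea can be tried out on a toy example. The sketch below follows the simplified PageRank calculation from Page and Brin's published work, not whatever Google runs today. The damping factor of 0.85, the number of iterations, the even spreading of rank from pages with no outgoing links, and the page names are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated voting.

    links maps each page to the list of pages it links to; every
    target must also appear as a key. Parameter values are assumptions.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                # A page's vote is split evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over everyone.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical web: "big" has three fans and links to "you";
# "small" has no incoming links and links to "rival".
LINKS = {
    "big": ["you"],
    "fan1": ["big"],
    "fan2": ["big"],
    "fan3": ["big"],
    "small": ["rival"],
    "you": [],
    "rival": [],
}

ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:6s} {score:.3f}")
```

In this toy web `you` ends up ranked above `rival`, even though each has exactly one incoming link, because the vote from `big` carries more weight than the vote from `small`.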

It is enormously difficult to reach PageRank 10. In fact, the only website I can think of with a PageRank 10 is the Google homepage itself. The Yahoo, MSN, Dell and Apple Computer homepages all carry PageRank 9. Even Wikipedia reaches only PageRank 8. These are already some of the biggest players on the Internet. Most of the important websites have a PageRank between 4 and 6. It is therefore your goal to reach that level. And mind you, it gets harder and harder to go from one PageRank to the next. PageRank is believed to use a logarithmic scale, similar to the Richter scale, so each step up may be many times harder than the one before it. Google has not published the exact factor.
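To get a feel for what a Richter-style scale means, here is a small sketch. The base of 10 and the idea of a "raw score" are assumptions made purely for illustration. Google has not published the real numbers, and `toolbar_score` is a made-up name.

```python
def toolbar_score(raw_score, base=10, max_score=10):
    """Map a hypothetical raw score onto a 0..max_score logarithmic scale.

    Each level requires `base` times the raw score of the level below it.
    Both the base and the raw score are illustrative assumptions.
    """
    score = 0
    threshold = base
    while raw_score >= threshold and score < max_score:
        score += 1
        threshold *= base
    return score

# Raw score needed to reach each of the first few levels with base 10.
for level in range(4):
    print(level, 10 ** level)

print(toolbar_score(999))
print(toolbar_score(1000))
```

With these assumed numbers, a page needs ten times the raw score to climb each level, so the gap between levels 5 and 6 is far wider than the gap between levels 1 and 2.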

A fast way to view the PageRank of any page on the web is to download the Google Toolbar, which displays the PageRank of each page you load. Just use the Google search engine and search for "Google Toolbar". Note, however, that although Google does not publicly disclose the precise method it uses for determining the Toolbar PageRank value, Google representative Matt Cutts announced that Google updates the Toolbar PageRank once every three months, meaning that what we see is not real time.

Now that we have understood how Google ranks webpages, the next big question is: how do we get important pages to link to us?

