When you sit down at your computer and do a Google search, you're almost
instantly presented with a list of results from all over the web. How does
Google find web pages matching your query, and determine the order of search
results? Three components do the work:
- Googlebot, a web crawler that finds and fetches web pages.
- The indexer that sorts every word on every page and stores the resulting index of words in a huge database.
- The query processor, which compares your search query to the index and recommends the documents that it considers most relevant.
Let’s take a closer look at each part.
1. Googlebot, Google’s Web Crawler:-
Googlebot (also known as a robot, bot, or spider) is Google’s web crawling
program. It finds and retrieves pages on the web and hands them off to the
Google indexer. Google uses a huge set of computers to fetch, or crawl,
billions of pages on the web. Googlebot uses an algorithmic process: computer
programs determine which sites to crawl, how often, and how many pages to
fetch from each site.
Google's crawl process begins with a list of web page URLs, generated from
previous crawl processes and augmented with Sitemap data provided by
webmasters. As Googlebot visits each of these pages, it detects the links on
them and adds those links to its list of pages to crawl. New sites, changes to
existing sites, and dead links are noted and used to update the Google index.
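The crawl loop described above can be sketched in a few lines of Python. This is only a toy illustration, not Googlebot's actual code: the `TOY_WEB` dictionary, the example URLs, and the `crawl` function are all made up for this post, and a dictionary lookup stands in for a real HTTP fetch.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
# A URL with no entry stands in for a dead link.
TOY_WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["c.example/missing"],
}


def crawl(seed_urls, web, max_pages=100):
    """Breadth-first crawl: visit each URL once and queue newly found links."""
    frontier = deque(seed_urls)  # URLs waiting to be fetched
    seen = set(seed_urls)        # URLs already queued, so nothing is fetched twice
    fetched, dead = [], []
    while frontier and len(fetched) < max_pages:
        url = frontier.popleft()
        links = web.get(url)     # stands in for an HTTP fetch
        if links is None:
            dead.append(url)     # dead link: note it, nothing to follow
            continue
        fetched.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return fetched, dead


fetched, dead = crawl(["a.example/"], TOY_WEB)
print(fetched)
print(dead)
```

The seed list plays the role of the URLs left over from previous crawls, and the `seen` set is what keeps the crawler from looping forever on pages that link to each other.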
2. Google’s Indexer:-
Googlebot gives the indexer the full text of the
pages it finds. These pages are stored in Google’s index database. This index
is sorted alphabetically by search term, with each index entry storing a list
of documents in which the term appears and the location within the text where
it occurs. This data structure allows rapid access to documents that contain
user query terms.
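This kind of structure is usually called an inverted index. Here is a minimal sketch of one in Python, assuming a very simple tokenizer (lowercase letters and digits only). The names `build_index`, `search`, and `DOCS` are invented for this example and say nothing about how Google's real indexer is built.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric terms (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: [positions where the term occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(position)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [set(index.get(term, {})) for term in terms]
    return sorted(set.intersection(*postings))


DOCS = {
    "doc1": "Googlebot crawls the web",
    "doc2": "The indexer stores the web pages",
}
index = build_index(DOCS)
print(index["web"])
print(search(index, "indexer web"))
```

Looking up a term is a single dictionary access, no matter how many documents there are, which is why this layout gives rapid access to the documents containing the query terms.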
3. Google’s Query Processor:-
When a user enters a query, Google's machines search the index for matching
pages and return the results Google believes are the most relevant to the
user. Relevancy is determined by over 200 factors, one of which is the
PageRank of a given page. PageRank is a measure of the importance of a page
based on the incoming links from other pages. In simple terms, each link to a
page on your site from another site adds to your site's PageRank. Not all
links are equal: Google works hard to improve the user experience by
identifying spam links and other practices that negatively impact search
results. The best types of links are those that are given based on the quality
of your content.
In order for your site to rank well in search results pages, it's important to
make sure that Google can crawl and index your site correctly. Google's
Webmaster Guidelines outline some best practices that can help you avoid
common pitfalls and improve your site's ranking.
Google's “Did you mean” and “Autocomplete” features are designed to help users
save time by displaying related terms, common misspellings, and popular
queries. Google displays these predictions only when it thinks they might save
the user time. If a site ranks well for a keyword, it's because Google has
algorithmically determined that its content is more relevant to the user's
query.
