Showing posts with label Search Engine Optimization. Show all posts

Friday 11 July 2014

How Google Searches for the Content You Want



When you sit down at your computer and do a Google search, you're almost instantly presented with a list of results from all over the web. How does Google find web pages matching your query, and determine the order of search results? Three key parts work together to deliver those results:



  •  Googlebot, a web crawler that finds and fetches web pages.
  •  The indexer, which sorts every word on every page and stores the resulting index of words in a huge database.
  •  The query processor, which compares your search query to the index and recommends the documents that it considers most relevant.

       Let’s take a closer look at each part.
       

       1. Googlebot, Google’s Web Crawler:- 

                   Googlebot is Google’s web-crawling program (also known as a robot, bot, or spider). It finds and retrieves pages on the web and hands them off to the Google indexer. Google uses a huge set of computers to fetch, or crawl, billions of pages on the web. Googlebot uses an algorithmic process: computer programs determine which sites to crawl, how often, and how many pages to fetch from each site.
                                      
            Google's crawl process begins with a list of web page URLs, generated from previous crawl processes and augmented with Sitemap data provided by webmasters. As Googlebot visits each of these pages, it detects the links on them and adds those links to its list of pages to crawl. New sites, changes to existing sites, and dead links are noted and used to update the Google index.
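
The crawl loop described above can be sketched in a few lines of Python. This is only an illustration of the general idea (a list of URLs to visit that grows as links are discovered), not Googlebot's actual implementation. The `FAKE_WEB` dictionary, the `LinkExtractor` class, and the `crawl` function are all hypothetical names, and the "web" here is an in-memory dictionary rather than real HTTP requests.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web": URL -> HTML. A real crawler would fetch over HTTP.
FAKE_WEB = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/b">B</a> <a href="/missing">gone</a>',
    "http://example.com/b": '<a href="/">home</a>',
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch=FAKE_WEB.get, max_pages=100):
    """Breadth-first crawl; returns (pages fetched, dead links found)."""
    frontier = deque(seed_urls)   # URLs still waiting to be crawled
    seen = set(seed_urls)         # avoids queueing the same URL twice
    pages, dead = {}, []
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:          # nothing came back: record a dead link
            dead.append(url)
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)   # resolve relative links
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return pages, dead


pages, dead = crawl(["http://example.com/"])
print(sorted(pages))  # the three pages that exist
print(dead)           # ['http://example.com/missing']
```

Starting from a single seed URL, the sketch discovers the other two pages through their links and notes the one link that leads nowhere. Deciding how often to revisit each site, which the text also mentions, is left out to keep the example short.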
         

       2. Google’s Indexer:- 

                    Googlebot gives the indexer the full text of the pages it finds. These pages are stored in Google’s index database. This index is sorted alphabetically by search term, with each index entry storing a list of documents in which the term appears and the location within the text where it occurs. This data structure allows rapid access to documents that contain user query terms. 
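
A data structure of this kind is usually called an inverted index. The sketch below is a toy version, assuming whitespace-separated words and made-up page names; `build_index` and `lookup` are illustrative helpers, not part of any Google API.

```python
from collections import defaultdict


def build_index(documents):
    """Map each term to {doc_id: [word positions]} (a positional inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    # Sort entries by term, mirroring the alphabetical ordering described above.
    return dict(sorted(index.items()))


def lookup(index, term):
    """Return the documents containing a term, with the positions where it occurs."""
    return index.get(term.lower(), {})


# Hypothetical documents, used only for illustration.
docs = {
    "page1": "Google crawls the web",
    "page2": "the web is large and the web grows",
}
index = build_index(docs)
print(lookup(index, "web"))  # {'page1': [3], 'page2': [1, 6]}
```

Looking up a term is a single dictionary access, which is why this layout gives fast access to the documents that contain a query term without rescanning every page.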
                            
                    To improve search performance, Google ignores (doesn’t index) common words called stop words, such as the, is, on, or, of, how, and why, as well as certain single digits and single letters. Stop words are so common that they do little to narrow a search, so they can safely be discarded. The indexer also ignores some punctuation and multiple spaces, and converts all letters to lowercase, to improve Google’s performance.
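
As a rough illustration of this clean-up step, the sketch below lowercases text, strips punctuation, collapses whitespace, and drops stop words. The `STOP_WORDS` set contains only the examples listed above, not Google's actual list, and dropping every single-character token is a simplification of "certain single digits and single letters".

```python
import re

# Illustrative stop-word list taken from the examples in the text (an assumption,
# not Google's real list).
STOP_WORDS = {"the", "is", "on", "or", "of", "how", "why"}


def normalize(text):
    """Lowercase, drop punctuation, collapse whitespace, and remove stop words."""
    lowered = text.lower()
    # Replace anything that is not a word character or whitespace with a space.
    no_punct = re.sub(r"[^\w\s]", " ", lowered)
    tokens = no_punct.split()  # split() also collapses runs of whitespace
    # Simplification: every single-character token is dropped.
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


print(normalize("How  is the  Weather, on Mars?"))  # ['weather', 'mars']
```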
    

       3. Google’s Query Processor:-                            

                   When a user enters a query, Google's machines search the index for matching pages and return the results Google believes are the most relevant to the user. Relevancy is determined by over 200 factors, one of which is the PageRank of a given page. PageRank is a measure of the importance of a page based on the incoming links from other pages. In simple terms, each link to a page on your site from another site adds to your site's PageRank. Not all links are equal: Google works hard to improve the user experience by identifying spam links and other practices that negatively impact search results. The best types of links are those that are given based on the quality of your content.
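
To make the link-counting idea concrete, here is a minimal sketch of the PageRank calculation in its simplified, textbook form. It is not Google's production ranking, which combines many other factors. The `pagerank` function, the three-page link graph, and the damping value of 0.85 (the figure commonly cited for the original algorithm) are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: pages "a" and "b" both link to "c"; "c" links back to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this small graph, page "c" ends up with the highest score because two pages link to it, and "b" ends up lowest because nothing links to it.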

                   For your site to rank well in search results pages, it's important to make sure that Google can crawl and index it correctly. Google's Webmaster Guidelines outline some best practices that can help you avoid common pitfalls and improve your site's ranking.

                   Google's “Did you mean” and “Autocomplete” features are designed to help users save time by displaying related terms, common misspellings, and popular queries. Google displays these predictions only when it thinks they might save the user time. If a site ranks well for a keyword, it's because Google has algorithmically determined that its content is more relevant to the user's query.
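
The two features can be imitated on a very small scale with Python's standard library. The sketch below assumes a made-up query log (`QUERY_COUNTS`) and uses simple prefix matching plus `difflib` string similarity. Google's real systems are far more sophisticated, so treat this purely as an illustration of the idea.

```python
import difflib

# Hypothetical query log: query -> how often it was searched.
QUERY_COUNTS = {
    "search engine": 120,
    "search engine optimization": 300,
    "search console": 80,
    "sitemap": 50,
}


def autocomplete(prefix, limit=3):
    """Return the most popular logged queries that start with the prefix."""
    prefix = prefix.lower()
    matches = [q for q in QUERY_COUNTS if q.startswith(prefix)]
    return sorted(matches, key=lambda q: -QUERY_COUNTS[q])[:limit]


def did_you_mean(query, cutoff=0.8):
    """Suggest the closest logged query for a likely misspelling, or None."""
    query = query.lower()
    if query in QUERY_COUNTS:
        return None  # already a known query, so there is nothing to correct
    close = difflib.get_close_matches(query, list(QUERY_COUNTS), n=1, cutoff=cutoff)
    return close[0] if close else None


print(autocomplete("search e"))  # ['search engine optimization', 'search engine']
print(did_you_mean("sitmap"))    # sitemap
```

The `cutoff` parameter plays the role of "only when it might save the user time": a suggestion is offered only if it is similar enough to what was typed.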