Search Engine Robots - How They Work, What They Do (Part I)
Automated search engine robots, sometimes called "spiders" or "crawlers", are the seekers of web pages. How do they work? What is it they really do? Why are they important?

You'd think, with all the fuss about indexing web pages to add to search engine databases, that robots would be great and powerful beings. Wrong. Search engine robots have only basic functionality, like that of early browsers, in terms of what they can understand in a web page. Like early browsers, robots just can't do certain things. Robots don't understand frames, Flash movies, images or JavaScript. They can't enter password-protected areas and they can't click all those buttons you have on your website. They can be stopped cold while indexing a dynamically generated URL and slowed to a stop with JavaScript navigation.

How Do Search Engine Robots Work?

Think of search engine robots as automated data retrieval programs, traveling the web to find information and links. When you submit a web page to a search engine at the "Submit a URL" page, the new URL is added to the robot's queue of websites to visit on its next foray out onto the web. Even if you don't directly submit a page, many robots will find your site because of links from other sites that point back to yours. This is one of the reasons why it is important to build your link popularity and to get links from other topical sites back to yours.

When arriving at your website, the automated robots first check to see if you have a robots.txt file. This file is used to tell robots which areas of your site are off-limits to them. Typically these may be directories containing only binaries or other files the robot doesn't need to concern itself with.

Robots collect links from each page they visit, and later follow those links through to other pages. In this way, they essentially follow the links from one page to another. The entire World Wide Web is made up of links, the original idea being that you could follow links from one place to another.
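The robots.txt check and the link-following behavior described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine works. The names LinkCollector, crawl and fetch and the "ExampleBot" user agent are made up for the example, and the sketch assumes a single site whose robots.txt text has already been retrieved.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.robotparser import RobotFileParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, robots_txt, agent="ExampleBot", max_pages=50):
    """Visit one site's pages breadth-first, honoring its robots.txt rules.

    fetch is any callable (hypothetical here) that maps a URL to its HTML,
    or returns None when the page is unavailable.
    """
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    site = urlparse(start_url).netloc

    queue = deque([start_url])   # URLs waiting to be visited
    seen = {start_url}           # URLs already queued, to avoid loops
    visited = []

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if not rules.can_fetch(agent, url):
            continue             # off-limits area, skip it
        html = fetch(url)
        if html is None:
            continue             # page could not be retrieved
        visited.append(url)

        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Resolve relative links and drop any "#fragment" part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == site and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

To try it without touching the network, pass the get method of a dictionary that maps URLs to HTML strings as fetch. A real robot would download each page instead, and would also pace its requests.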
This is how robots get around.

The "smarts" about indexing pages online come from the search engine engineers, who devise the methods used to evaluate the information the search engine robots retrieve. Once introduced into the search engine database, the information is available to searchers querying the search engine. When a user enters a query, the search engine performs a number of quick calculations to make sure it presents just the right set of results, giving the visitor the most relevant response to the query.

You can see which pages on your site the search engine robots have visited by looking at your server logs or the results from your log statistics program. Identifying the robots will show you when they visited your website, which pages they visited and how often they visit. Some robots are readily identifiable by their user agent names, like Google's "Googlebot"; others are a bit more obscure, like Inktomi's "Slurp". Still other robots may be listed in your logs that you cannot readily identify; some of them may even appear to be human-powered browsers.

Along with identifying individual robots and counting the number of their visits, the statistics can also show you aggressive bandwidth-grabbing robots or robots you may not want visiting your website. In the resources section at the end of this article, you will find sites that list names and IP addresses of search engine robots to help you identify them.

How Do They Read The Pages On Your Website?

When the search engine robot visits your page, it looks at the visible text on the page, the content of the various tags in your page's source code (title tag, meta tags, etc.), and the hyperlinks on your page. From the words and the links that the robot finds, the search engine decides what your page is about.
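As a rough illustration of what "reading" a page can mean, the sketch below pulls out the same pieces mentioned above: the title tag, the meta tags, the visible words and the hyperlinks. It is an assumption about how a simple text-only robot might do it, not a description of any real search engine; the class name PageReader and its attributes are invented for the example.

```python
from html.parser import HTMLParser


class PageReader(HTMLParser):
    """Gathers the title, meta tags, visible words and links of one page."""

    SKIPPED = {"script", "style"}   # content a text-only robot ignores

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}      # meta name (lowercased) -> content
        self.links = []     # href values, in page order
        self.words = []     # visible words, in page order
        self._open = []     # stack of open <title>/<script>/<style> tags

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "title" or tag in self.SKIPPED:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if self._open and self._open[-1] == tag:
            self._open.pop()

    def handle_data(self, data):
        if self._open and self._open[-1] == "title":
            self.title += data
        elif not self._open:
            # Text outside title, script and style counts as visible.
            self.words.extend(data.split())
```

Feeding a page's HTML to a PageReader instance with feed() fills in title, meta, links and words. Note what is missing: nothing in the sketch runs JavaScript or looks inside images, which matches the limitations described earlier in the article.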
There are many factors used to figure out what "matters", and each search engine has its own algorithm to evaluate and process the information. Depending on how the robot is set up through the search engine, the information is indexed and then delivered to the search engine's database.

The information delivered to the databases then becomes part of the search engine and directory ranking process. When the search engine visitor submits their query, the search engine digs through its database to give the final listing that is displayed on the results page.

The search engine databases update at varying times. Once you are in the search engine databases, the robots keep visiting you periodically to pick up any changes to your pages and to make sure they have the latest info. The number of times you are visited depends on how the search engine sets up its visits, which can vary per search engine.

Sometimes robots are unable to access the website they are visiting. If your site is down, or you are experiencing huge amounts of traffic, the robot may not be able to access your site. When this happens, the website may not be re-indexed, depending on the frequency of the robot visits to your website. In most cases, robots that cannot access your pages will try again later, hoping that your site will be accessible then.

Resources

* SpiderSpotting - Search Engine Watch
http://searchenginewatch.com/webmasters/spiders.html

* Robotstxt.org
List of robots and protocols for setting up a robots.txt file.
http://www.robotstxt.org/

* Spider-Food
Tutorials, forums and articles about Search Engine spiders and Search Engine Marketing.
http://spider-food.net/

* Spiderhunter.com
Articles and resources about tracking Search Engine spiders.
http://www.spiderhunter.com/

* Sim Spider Search Engine Robot Simulator
Search Engine World has a spider that simulates what the Search Engine robots read from your website.
http://www.searchengineworld.com/cgi-bin/sim_spider.cgi

Daria Goetsch is the founder and Search Engine Marketing Consultant for Search Innovation Marketing, a Search Engine Optimization company serving small businesses. She has specialized in Search Engine Promotion since 1998, including three years as the Search Engine Specialist for O'Reilly Media, Inc., a technical book publishing company.

Copyright © 2002-2005 Search Innovation Marketing. http://www.searchinnovation.com All Rights Reserved.

Permission to reprint this article is granted if the article is reproduced in its entirety, without editing, including the bio information. Please include a hyperlink to http://www.searchinnovation.com when using this article in newsletters or online.