
Why and How to Write a Robots.txt File for Your Website or Blog

A Robots.txt file is a great way to tell search engines about your site and manage how crawlers index it, according to your preferences. I think most newbie bloggers are not familiar with this file; many are even a bit afraid of it, because they think editing Robots.txt is hard and that a single mistake can harm their site.

That was the case for me too when I started blogging two years back, but now all of my blogs have a Robots.txt file installed on their servers. It is great when search engine bots crawl and index your website or blog frequently, but not when they index something you never intended. These bots will index everything unless you tell them what to index and what to skip. Here I am going to explain everything about the Robots.txt file: how to create one, how to edit it, and why you need it.

What is a Robots.txt file?

So, I think most of you already have an idea of what this file is, but for reference, here is a more detailed definition: Robots.txt is a plain text file on your server that tells search engine bots what not to index. It is also known as the Robots Exclusion Standard, and it works as a protocol between search engines and websites for crawling and indexing content. When Googlebot or Bingbot comes to your site, it first reads the Robots.txt file, and from there it learns what to exclude while indexing the site. You can easily view this file by typing http://www.domain_name.com/robots.txt and see what restrictions you have placed on search engine bots so far. If you cannot see this file, it means there is no robots.txt file on your server.
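If you want to check programmatically what a robots.txt file allows, Python's standard library ships a parser for exactly this protocol. A minimal sketch (the rules and the example.com URLs below are made-up placeholders, not from any real site):

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt that blocks one directory for all bots
rules = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# can_fetch() answers: may this user agent crawl this URL?
print(parser.can_fetch("*", "http://example.com/private/page.html"))  # False
print(parser.can_fetch("*", "http://example.com/index.html"))         # True
```

This is the same logic a well-behaved crawler applies before fetching any page from your site.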


Importance of the Robots.txt file for a website or blog

This file can be very helpful if you have limited server resources, and a few simple changes in Robots.txt can also reduce the load that crawlers put on your website. I think there are three main advantages to having this file on your website.

Better resource management and improved load time

You have often seen sites with forms, search bars, and many other scripts running, which still do not take much time to load. One reason is the Robots.txt file: these sites have excluded such scripts from crawl bots, because those small scripts are made for users, not for search engines. No one is going to search for them on a search engine, and keeping bots away from them reduces the load on your server.

Disallowing search robots from indexing your pages or content

Sometimes you do not want some of your pages, categories, or tags to be indexed. Robots.txt is the simplest way to handle this.

How to write a Robots.txt file

Creating a Robots.txt file is as simple as creating any Notepad file on your computer, but it needs to be placed on the server where your website resides, in the root folder where all of your site's data is stored. To access your server you can simply use your hosting cPanel, or any FTP software (such as FileZilla). Here I am going to cover a few basic and advanced commands that you can use to write a Robots.txt file for your website.

Simple commands to use in a Robots.txt file

User-agent: *
Disallow:

Here User-agent: * means that this section of commands applies to all robots; * is a wildcard that matches every bot. Using Disallow: without any directory in this segment means crawlers can access every directory without restriction.

Trying some advanced commands in Robots.txt now

User-agent: *
Disallow: /cgi-bin/

This rule means no web spider or bot is permitted to access the cgi-bin folder of your site. For example, a file like /cgi-bin/abc.html will not be indexed.
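A quick way to confirm the rule behaves as described is Python's urllib.robotparser (example.com and the file names here are just placeholders):

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse(["User-agent: *", "Disallow: /cgi-bin/"])

# Anything under /cgi-bin/ is off-limits; the rest of the site is fine
print(parser.can_fetch("*", "http://example.com/cgi-bin/abc.html"))  # False
print(parser.can_fetch("*", "http://example.com/blog/post.html"))    # True
```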

User-agent: Googlebot-Image
Disallow: /

As you can see, this command specifies Googlebot-Image instead of *. Here we have named a specific bot, whereas in the sections above * applied to all robots.

This command means Googlebot-Image will not be able to access any file on your website. You can also restrict the bot to just your images folder:

E.g.

User-agent: Googlebot-Image
Disallow: /images/

This command tells Googlebot-Image not to index any file from the images folder.
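To see that this rule applies only to Googlebot-Image and leaves other bots untouched, here is a small sketch with urllib.robotparser (the domain, file names, and the "SomeOtherBot" agent are made-up examples):

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse(["User-agent: Googlebot-Image", "Disallow: /images/"])

# The image bot is blocked from /images/ but not from other paths...
print(parser.can_fetch("Googlebot-Image", "http://example.com/images/logo.png"))  # False
print(parser.can_fetch("Googlebot-Image", "http://example.com/about.html"))       # True

# ...and a bot with no matching User-agent section is unaffected
print(parser.can_fetch("SomeOtherBot", "http://example.com/images/logo.png"))     # True
```

This is why naming a bot in User-agent is useful: you can shape the rules per crawler instead of blocking everyone at once.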

Similarly, you can use other parameters in your Robots.txt file to stop search engine bots from indexing your pages, posts, images, and so on.

Eg.

User-agent: *
Disallow: /secrets/
Disallow: /techewire/mypost.html
Disallow: /wp-admin/
Disallow: /wp-content/
Disallow: /index.php/
Disallow: /comments/feed/

Etc.

The example above shows how to write a Robots.txt file for a website and restrict search engine bots from indexing files that are not meant to appear in search results.
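You can sanity-check a multi-rule file like the one above the same way before uploading it (again, example.com and the paths are placeholders):

```python
from urllib.robotparser import RobotFileParser

# A subset of the example rules from above, applied to all bots
rules = """\
User-agent: *
Disallow: /secrets/
Disallow: /wp-admin/
Disallow: /wp-content/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The listed folders are blocked for every bot; everything else stays crawlable
print(parser.can_fetch("*", "http://example.com/wp-admin/options.php"))  # False
print(parser.can_fetch("*", "http://example.com/my-first-post/"))        # True
```

Testing the file like this before placing it on your server is a cheap way to avoid the "single mistake can harm your site" fear mentioned at the start.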

Final words on the Robots.txt file


So, you have now seen some good examples of Robots.txt files, and I think most of you will be able to write this file on your own without any problem. It is an important aspect of your website, and you should think about which content does not need to be indexed. I hope you will all now experiment a little with your own site's robots.txt file.

But if you still have problems writing a Robots.txt file for your website, feel free to leave a comment below, and before leaving, please like and share us on social media like Facebook and Google+.


Copyright 2017 Dorlis. All Rights Reserved.