Sunday 17 July 2022

What is robots.txt syntax?


What is robots.txt in Google Search Console?

First, you need your own website, such as “www.example.com”. You can get a domain and hosting from providers such as Blogger (with a custom robots.txt), WordPress, Bluehost, GoDaddy, or a cheap domain registrar; choose whichever suits your requirements. You will also need one if you are interested in placing ads on your website and earning money through Google AdSense.

After creating your website, log in to Google Search Console. At the top left, click “Add property” and enter your website. Two property types are offered: Domain and URL prefix.

Screenshot 1 | Add property type

A good choice is URL prefix: paste your website URL and press the Continue button, then go to Settings and complete ownership verification. Several verification methods are available: HTML file upload, HTML tag, Google Analytics, Domain (DNS) verification, and Google Tag Manager. For the HTML tag method, copy the short piece of HTML code (meta tag) or the Google Analytics code and paste it into your website's theme editor near the header, inside the <head> … </head> section.

Screenshot 2 | Theme editor HTML code
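For the HTML tag method, the pasted result looks roughly like this; the content value below is only a placeholder, since Search Console generates your actual verification token:

```html
<head>
  <!-- Google Search Console ownership verification (token is a placeholder) -->
  <meta name="google-site-verification" content="your-verification-token" />
  <title>Your website</title>
</head>
```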

For the Domain property type, the verification code must be added as a TXT record in your domain's DNS settings. Then click the Verify button in Google Search Console (formerly Google Webmaster Tools) and the verification completes successfully.

Screenshot 3 | Settings | Ownership verification
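A typical DNS TXT record for Domain verification looks like the sketch below; the token value is a placeholder, and the exact field names vary by domain host:

```
Type:  TXT
Host:  @   (the root of your domain)
Value: google-site-verification=your-verification-token
TTL:   3600
```

DNS changes can take some time to propagate, so the Verify button may not succeed immediately.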

It can take two to four days for Google to crawl your website; check the status through the URL Inspection tool or the Performance and Overview reports. Now for the robots.txt syntax part: type ‘www.yourwebsite.com/robots.txt’ into the address bar and press Enter; see Screenshot 4 below for reference.

Screenshot 4 | Robots.txt Tester

The robots.txt file controls access with the directives “Allow”, “Disallow”, and “Crawl-delay” (e.g. Crawl-delay: 10). See the example below:

User-agent: *

Disallow: /search

Allow: /

Sitemap: https://www.yourwebsite.com/sitemap.xml

A sitemap tells crawlers about your posts and pages; whenever you upload a new or updated post, the sitemap helps it get crawled and indexed for users.

To submit it, go to Google Search Console, open the Index section in the menu, and select Sitemaps. In the field after your website URL's forward slash (/), type “sitemap.xml” or “sitemap_index.xml” and submit. On success it shows a green “Success” status. Alternatively, search Google for a “sitemap generator”, enter your website URL, and generate one.

Screenshot 5 | Sitemaps | Sitemap.xml
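For reference, a minimal sitemap.xml looks like the sketch below (the URL and date are placeholders); platforms such as Blogger and WordPress, and the sitemap generators mentioned above, produce this format automatically:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.yourwebsite.com/first-post.html</loc>
    <lastmod>2022-07-17</lastmod>
  </url>
</urlset>
```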

Crawl requests are served by your HTTP (web) server: Blogger sites are served by Google's own servers, while self-hosted sites typically run Nginx or Apache, the most popular web servers. Search engines send bots (user agents) that crawl your website like a spider so it can be found by users worldwide. Common user agents include Googlebot, Googlebot-Image, AdsBot-Google, Mediapartners-Google, Bingbot, MSNBot, Applebot, etc. To control how they crawl or index your website, name the user agent in a User-agent line (or use * for all) and give it Allow or Disallow rules. See the example below:

User-agent: AdsBot-Google

Disallow: /

Note: this tells AdsBot-Google not to crawl or index any page of the site.

User-agent: *

Disallow: /search

Disallow: /comment

Disallow: /ping?

Disallow: /label/

Allow: /

Note: If Google Search Console shows “Blocked by robots.txt” warnings for label posts, the Disallow: /label/ rule above is the cause: robots.txt prevents those URLs from being crawled or indexed. This error typically appears when you added a label to your website and removed it a few days later, since the old label URLs remain blocked by robots.txt.
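You can check how such rules behave before publishing them. A minimal sketch using Python's standard urllib.robotparser module, fed the example rules from above (the test URLs are placeholders):

```python
from urllib import robotparser

# The example robots.txt rules from above, as a list of lines
rules = """\
User-agent: *
Disallow: /search
Disallow: /comment
Disallow: /ping?
Disallow: /label/
Allow: /
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Blocked: the path matches the Disallow: /search rule
print(rp.can_fetch("*", "https://www.example.com/search?q=test"))       # False
# Allowed: an ordinary post URL falls under Allow: /
print(rp.can_fetch("*", "https://www.example.com/my-first-post.html"))  # True
# Blocked: a label page matches Disallow: /label/
print(rp.can_fetch("*", "https://www.example.com/label/news"))          # False
```

This is the same prefix-matching logic crawlers apply, so it is a quick way to confirm a rule does what you intend before you see warnings in Search Console.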


Computer stuff kit tricks - Topics 3.

Mobile Internet connection - Browsers like Google Chrome, Mozilla Firefox, or any other browser.

GSM / GPRS tracking - Mobile data on 2G, 3G and 4G LTE cellular networks.

Embedded systems - 8051 microcontroller; SRAM, Flash and EEPROM memories.

Fix website browsing errors - Proxy settings, Internet Properties, antivirus.

PC or Laptop - Network diagnostics and the network adapter.

Desktop or Laptop troubleshooting - VGA, monitor, processor, RAM and ROM.