{"id":2104,"date":"2019-12-13T14:50:47","date_gmt":"2019-12-13T14:50:47","guid":{"rendered":"https:\/\/www.seotesteronline.com\/?p=2104"},"modified":"2021-05-19T13:33:53","modified_gmt":"2021-05-19T13:33:53","slug":"robots-txt","status":"publish","type":"post","link":"https:\/\/www.seotesteronline.com\/blog\/technical-seo\/robots-txt\/","title":{"rendered":"Guide to the robots.txt file: what it is and why it is so important"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">In this article, we will explore the role of robots.txt, a small file that can make the difference between scoring a high ranking and languishing in the lowest depths of the SERP.<\/span><\/p>\n<h2>What is robots.txt<\/h2>\n<p><span style=\"font-weight: 400;\">Robots.txt&#8217;s role is to tell the crawler which pages it can request from your site. <\/span><b>Beware: the spider can still see them<\/b><span style=\"font-weight: 400;\">. It just does not crawl them. If you want to hide a page, you should rely on noindex instructions, as specified by <\/span><a href=\"https:\/\/support.google.com\/webmasters\/answer\/6062608\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">Google&#8217;s Search Console Guide<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><b>So, why do you need a robots.txt file?<\/b><span style=\"font-weight: 400;\"> Because <\/span><b>you can make crawling faster and smoother<\/b><span style=\"font-weight: 400;\">, saving your server from too many crawler requests.<\/span><b> You can also exclude from crawling duplicate or non-essential pages that could hurt your ranking.<\/b><\/p>\n<h2>Where to put robots.txt<\/h2>\n<p><span style=\"font-weight: 400;\">You have to put the robots.txt file <\/span><b>inside your website&#8217;s main directory<\/b><span style=\"font-weight: 400;\"> so that its URL is <\/span><i><span style=\"font-weight: 400;\">http:\/\/www.mywebsite.com\/robots.txt<\/span><\/i><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Do 
not put it elsewhere, or it won&#8217;t work.<\/span><\/p>\n<h2>How to create a robots.txt file<\/h2>\n<p><b>Create a .txt file in your website&#8217;s main directory and call it &#8220;robots&#8221;<\/b><span style=\"font-weight: 400;\">. Remember that you can have only one robots.txt file per site.<\/span><\/p>\n<p><b>Create a group<\/b><\/p>\n<p><span style=\"font-weight: 400;\">Create your <\/span><b>first group<\/b><span style=\"font-weight: 400;\">. A robots.txt file can have one or more groups.<\/span><\/p>\n<p><b>Each group has one or more instructions <\/b><span style=\"font-weight: 400;\">(also called rules). Remember to <\/span><b>use only one instruction per line.<\/b><\/p>\n<p><b>Robots.txt instructions<\/b><\/p>\n<p><span style=\"font-weight: 400;\">Instructions can be of three types:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><b>user-agent:<\/b><span style=\"font-weight: 400;\"> the crawler to which the rule applies.<\/span><\/li>\n<li style=\"font-weight: 400;\"><b>allow:<\/b><span style=\"font-weight: 400;\"> all <\/span><b>the files or directories<\/b><span style=\"font-weight: 400;\"> that the user-agent <\/span><b>can access.<\/b><\/li>\n<li style=\"font-weight: 400;\"><b>disallow: <\/b><span style=\"font-weight: 400;\">all the files or directories that the user-agent <\/span><b>cannot access.<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">A rule must include one or more (or all!) user agents, and at least one allow or disallow instruction (or both).<\/span><\/p>\n<h2>Robots.txt examples<\/h2>\n<p><span style=\"font-weight: 400;\">For example, to prevent Googlebot from crawling your entire website, write something like this in your robots.txt file:<\/span><\/p>\n<p><em><span style=\"font-weight: 400;\">#Prevent Googlebot from crawling. (this is a comment. 
You can write what you want)<\/span><\/em><\/p>\n<p><em><span style=\"font-weight: 400;\">User-agent: googlebot<\/span> <\/em><br \/>\n<span style=\"font-weight: 400;\"><em>Disallow: \/<\/em><\/span><\/p>\n<p><span style=\"font-weight: 400;\">If instead you want to exclude more than one directory for all crawlers:<\/span><\/p>\n<p><em><span style=\"font-weight: 400;\">User-agent: *<\/span><\/em><br \/>\n<em><span style=\"font-weight: 400;\">Disallow: \/directory1<\/span><\/em><br \/>\n<em><span style=\"font-weight: 400;\">Disallow: \/directory2<\/span><\/em><\/p>\n<p><span style=\"font-weight: 400;\">(the asterisk means &#8220;all&#8221;)<\/span><br \/>\n<span style=\"font-weight: 400;\">Or exclude all directories but one for a specific crawler:<\/span><\/p>\n<p><em><span style=\"font-weight: 400;\">User-agent: specific-crawler<\/span><\/em><br \/>\n<em><span style=\"font-weight: 400;\">Disallow: \/<\/span><\/em><br \/>\n<em><span style=\"font-weight: 400;\">Allow: \/directory1<\/span><\/em><\/p>\n<p><em><span style=\"font-weight: 400;\">User-agent: *<\/span><\/em><br \/>\n<em><span style=\"font-weight: 400;\">Allow: \/<\/span><\/em><\/p>\n<p><span style=\"font-weight: 400;\">The most specific matching rule wins, so the allow on <\/span><i><span style=\"font-weight: 400;\">\/directory1<\/span><\/i><span style=\"font-weight: 400;\"> overrides the general disallow for that crawler. In this way, you&#8217;re also stating that every other crawler can access the entire website.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Finally, we can prevent the crawling of a specific file format, for example, jpg images.<\/span><\/p>\n<p><em><span style=\"font-weight: 400;\">User-agent: *<br \/>\n<\/span><\/em><em><span style=\"font-weight: 400;\">Disallow: \/*.jpg$<\/span><\/em><\/p>\n<p><span style=\"font-weight: 400;\">The $ character establishes a rule valid for all URLs that end with <\/span><i><span style=\"font-weight: 400;\">.jpg<\/span><\/i><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To see more examples, visit the <\/span><a href=\"https:\/\/support.google.com\/webmasters\/answer\/6062596?hl=it&amp;ref_topic=6061961\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">Google Search Console 
guide<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<h2>Learn more about Technical SEO<\/h2>\n<p>Technical SEO is not easy, but it&#8217;s fundamental to doing SEO the right way.<\/p>\n<p>Learn it by reading our <a href=\"https:\/\/www.seotesteronline.com\/blog\/technical-seo\">Guide for Beginners to Technical SEO<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>How do you create a robots.txt file, and where do you put it? Find out in this short article!<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[135],"tags":[],"acf":[],"lang":"en","translations":{"en":2104,"it":2102},"pll_sync_post":[],"_links":{"self":[{"href":"https:\/\/www.seotesteronline.com\/wp-json\/wp\/v2\/posts\/2104"}],"collection":[{"href":"https:\/\/www.seotesteronline.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.seotesteronline.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.seotesteronline.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.seotesteronline.com\/wp-json\/wp\/v2\/comments?post=2104"}],"version-history":[{"count":9,"href":"https:\/\/www.seotesteronline.com\/wp-json\/wp\/v2\/posts\/2104\/revisions"}],"predecessor-version":[{"id":4580,"href":"https:\/\/www.seotesteronline.com\/wp-json\/wp\/v2\/posts\/2104\/revisions\/4580"}],"wp:attachment":[{"href":"https:\/\/www.seotesteronline.com\/wp-json\/wp\/v2\/media?parent=2104"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.seotesteronline.com\/wp-json\/wp\/v2\/categories?post=2104"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.seotesteronline.com\/wp-json\/wp\/v2\/tags?post=2104"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}