SEO Expert? Here are 3 Robots.txt examples that you will interpret wrong, guaranteed.

Test your TechSEO skills.

We asked SEO experts across many channels about these examples, and the results were quite surprising: not every SEO expert is as proficient at interpreting robots.txt files as we thought.

What about yourself?

Give us your opinion in the form at the bottom of this page, and get a chance to win 1 Year of URLinspector.

Q1: Will Googlebot crawl the Secret Folder?

Given this robots.txt for a restricted area:

User-agent: *
Disallow: /secret/*

User-agent: Googlebot
Allow: /coolgooglestuff/*

User-agent: Spacebot
Allow: /*

Q1: Will Googlebot crawl the /secret/ folder? YES or NO?

Bonus: Explain why.
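
If you'd rather not eyeball it, you can ask Googlebot's own code. Here is a minimal sketch using Google's open-source parser (https://github.com/google/robotstxt); the googlebot::RobotsMatcher API follows that repo's README, and the URL is a made-up example:

// Sketch: check Q1 against Google's open-source robots.txt parser.
// Build against https://github.com/google/robotstxt; the matcher API
// below follows that repo's README.
#include <iostream>
#include <string>

#include "robots.h"  // googlebot::RobotsMatcher

int main() {
  const std::string robots_txt =
      "User-agent: *\n"
      "Disallow: /secret/*\n"
      "\n"
      "User-agent: Googlebot\n"
      "Allow: /coolgooglestuff/*\n"
      "\n"
      "User-agent: Spacebot\n"
      "Allow: /*\n";

  // Hypothetical URL inside the /secret/ folder.
  const std::string url = "https://example.com/secret/page.html";

  googlebot::RobotsMatcher matcher;
  const bool allowed =
      matcher.OneAgentAllowedByRobots(robots_txt, "Googlebot", url);
  std::cout << "Googlebot -> " << (allowed ? "ALLOWED" : "DISALLOWED") << "\n";
  return 0;
}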

Q2: Will Googlebot crawl the Userfiles Folder?

Given this robots.txt:

User-agent: *
Disallow: /
Allow: /style/
Allow: /userfiles/

Is Googlebot allowed to crawl the /userfiles/ folder? YES or NO?

Bonus: Explain why.
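
Same approach works for Q2. This sketch (same assumed google/robotstxt API as in Q1) probes a few hypothetical paths so you can see which rule wins for each:

// Sketch: check Q2 with the same google/robotstxt matcher as in Q1.
#include <iostream>
#include <string>

#include "robots.h"  // googlebot::RobotsMatcher

int main() {
  const std::string robots_txt =
      "User-agent: *\n"
      "Disallow: /\n"
      "Allow: /style/\n"
      "Allow: /userfiles/\n";

  // Hypothetical URLs; only the path part matters to the matcher.
  for (const std::string url : {"https://example.com/",
                                "https://example.com/style/main.css",
                                "https://example.com/userfiles/cv.pdf"}) {
    googlebot::RobotsMatcher matcher;  // fresh matcher per check
    std::cout << url << " -> "
              << (matcher.OneAgentAllowedByRobots(robots_txt, "Googlebot", url)
                      ? "ALLOWED"
                      : "DISALLOWED")
              << "\n";
  }
  return 0;
}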

Q3: BLOCK THIS BOT

Given this robots.txt file:

# block this bot
User-agent: Somebot
Disallow: /

# don't block this bot, but slow him down
User-agent: Googlebot
Crawl-Delay: 1800

# block this bot
User-agent: Someotherbot
Disallow: /

Q3: Is Googlebot…

  • A) Blocked?
  • B) Not blocked, but slowed down?
  • C) Not blocked, and not slowed down?

Bonus: Explain why.
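
How a parser treats non-rule lines like Crawl-Delay is exactly the kind of detail implementations disagree on, so Q3 is worth checking against the real thing, too. Same assumed API as above; this sketch compares Googlebot with one of the bots the file means to block:

// Sketch: check Q3 with the same google/robotstxt matcher.
#include <iostream>
#include <string>

#include "robots.h"  // googlebot::RobotsMatcher

int main() {
  const std::string robots_txt =
      "# block this bot\n"
      "User-agent: Somebot\n"
      "Disallow: /\n"
      "\n"
      "# don't block this bot, but slow him down\n"
      "User-agent: Googlebot\n"
      "Crawl-Delay: 1800\n"
      "\n"
      "# block this bot\n"
      "User-agent: Someotherbot\n"
      "Disallow: /\n";

  const std::string url = "https://example.com/";  // hypothetical URL
  for (const std::string agent : {"Googlebot", "Somebot"}) {
    googlebot::RobotsMatcher matcher;  // fresh matcher per check
    std::cout << agent << " -> "
              << (matcher.OneAgentAllowedByRobots(robots_txt, agent, url)
                      ? "ALLOWED"
                      : "DISALLOWED")
              << "\n";
  }
  return 0;
}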

Robots.txt is not trivial, and there's plenty of wrong code out there

Working with robots.txt seems trivial at first glance. But it’s not. There are many pitfalls and traps.

And even seasoned experts fail.

In URLinspector we use the original robots.txt library published by Google, the same code that Googlebot uses.

If you’re using software with some homemade robots.txt parser, you’re not doing yourself a favor.

Did you know? There are currently 142 robots.txt parsers on GitHub, and those are only the open-source ones.

Guess how many more are hidden in private repos, written by developers suffering from "not invented here" syndrome?

See, even Moz has its own "modern robots.txt parser", whatever that means. No thanks, folks; we'd rather go with the original by Google.

Why? Because in some cases, the original Googlebot robots.txt library behaves differently from the many other robots.txt libraries out there.

Also, no JS, PHP, or Node "interpretation" is needed.
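
And the integration pattern is small if you want it in your own tooling. This is roughly the usage shown in the google/robotstxt README (the wrapper names here are our own); the matcher also accepts several user-agent tokens at once:

// Roughly the usage pattern from the google/robotstxt README; the
// wrapper function names are our own.
#include <string>
#include <vector>

#include "robots.h"  // googlebot::RobotsMatcher

bool IsUserAgentAllowed(const std::string& robots_txt,
                        const std::string& user_agent,
                        const std::string& url) {
  googlebot::RobotsMatcher matcher;
  return matcher.OneAgentAllowedByRobots(robots_txt, user_agent, url);
}

// Multi-agent variant: AllowedByRobots takes a pointer to a vector of
// user-agent tokens and reports whether any of them may fetch the URL.
bool AnyAgentAllowed(const std::string& robots_txt,
                     const std::vector<std::string>& user_agents,
                     const std::string& url) {
  googlebot::RobotsMatcher matcher;
  return matcher.AllowedByRobots(robots_txt, &user_agents, url);
}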


Conclusion

Of course, interpreting robots.txt by visual inspection alone is a problem and will lead you to wrong conclusions.

But using software to "test robots.txt" can also go wrong, simply because there's so much faulty code out there.

Don't miss the chance to win an account for a full year of URLinspector Bronze.

Let us know what you think the correct answers are in the form below.

How URLinspector may help

URLinspector uses the original robots.txt library published by Google.

That's the same code Googlebot runs to crawl your website.

One contributor there is Gary Illyes, whom you may know.

Why settle for less?

Do you want results that vary from what Google would do? We don’t think so.

Why not give URLinspector a try?

You can set up a free 14-day trial of URLinspector and see how it works.

Start with a 14-day free trial; no credit card required.
