Manage Crawler Robots with Bissetii in Hugo

Bissetii strives to be SEO-compatible and friendly to crawler robots by default. This is where Bissetii stands out from other Hugo themes: it supplies a full interface for handling crawlers easily.

Customizing robots.txt

Bissetii supplies a method to customize Hugo’s robots.txt file. The customization method differs depending on the Bissetii version you use.

The robots.txt output is always located at the root of the website. For example, for this site, it is located at: https://bissetii.zoralab.com/robots.txt

Version v1.13.0 and Above

To customize robots.txt, you can create a robot TOML data file and place it inside the following directory:

file path:             data/bissetii/robots/
repo-docs:       docs/.data/bissetii/robots/

The filename (e.g. google from google.toml) shall be used as the User-Agent field. The only exception is all, which will be renamed to *, resulting in User-agent: *.
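As a sketch of that exception, a hypothetical all.toml placed in the robots data directory (the Disallow path below is purely illustrative):

```toml
# data/bissetii/robots/all.toml
# The filename "all" becomes "User-agent: *" in the output.
Disallow = [
	"/private/",
]
```

would render in robots.txt roughly as:

```text
User-agent: *
Disallow: /private/
```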

Data file Content

The robot TOML data file content is shown as follows:

Sitemap = "{{ .BaseURL }}/sitemap.xml"

Allow = [
	"/",
	"/en/",
]

Disallow = [
	"/en/internal/",
	"/zh-hans/internal/",
]

Crawl-delay = 5

Say the above data file is named GoogleBot.toml; it will be rendered as:

User-agent: GoogleBot
Allow: /
Allow: /en/
Crawl-delay: 5
Disallow: /en/internal/
Disallow: /zh-hans/internal/
Sitemap: https://www.example.com/en/sitemap.xml

Version v1.12.5 and Below

For these versions, Bissetii does not have any processing solution due to Hugo Bug #5160. The only way is to supply your own raw robots.txt via your static/ directory.

To override Bissetii’s default file, you can create your own robots.txt at the same path.
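For these older versions, a minimal hand-written file placed at static/robots.txt might look like the following (the paths and sitemap URL here are illustrative assumptions, not Bissetii defaults):

```text
# static/robots.txt — copied verbatim to the site root by Hugo
User-agent: *
Allow: /
Disallow: /en/internal/
Sitemap: https://www.example.com/sitemap.xml
```

Since the file is served as-is, remember to update the Sitemap URL and Disallow paths manually whenever your site structure changes.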

The enableRobotsTXT option in config/_default/config.toml is disabled due to Hugo’s multi-language bug. Hence, the robots.txt guide in Hugo’s main documentation does not apply to these Bissetii versions.

Page-Specific Crawler Instructions

Bissetii also supports page-specific meta tags for ad-hoc robot management. To add a robot rules tag, add each robot’s rule to the [robots] table inside the page’s Hugo front matter. Example:

+++
...
[robots]
[robots.googleBot]
name = "googleBot"
content = "noindex, nofollow"

[robots.twitterBot]
name = "twitterBot"
content = "noindex"

...
+++

The above will be rendered as:

<meta name="googleBot" content="noindex, nofollow" />
<meta name="twitterBot" content="noindex" />

Wrapping Up

That is all for managing crawler robots with Bissetii in Hugo. If you have any questions for us, please feel free to raise them at our GitLab Issues Section. We will be happy to assist you.