Manage Crawler Robots with Bissetii in Hugo
Bissetii strives to be SEO-compatible and friendly to crawler robots by default. This is where Bissetii stands out from other Hugo themes: it supplies a full interface for handling crawlers easily.
Customizing robots.txt
Bissetii supplies a method to customize Hugo's robots.txt file. Depending on the Bissetii version you use, the customization method differs.
The output of robots.txt is always at the root of the website. For example, for this site, it is located at:
https://bissetii.zoralab.com/robots.txt
Version v1.13.0 and Above
To customize robots.txt, you can create your robot TOML data file and place it inside the following directory:
file path: data/bissetii/robots/
repo-docs: docs/.data/bissetii/robots/
The filename (e.g. google from google.toml) is used as the User-Agent field. The only exception is all, which is renamed to *, resulting in User-agent: *.
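As an illustration, assuming two hypothetical data files named google.toml and all.toml, the mapping would be:
data/bissetii/robots/google.toml renders as User-agent: google
data/bissetii/robots/all.toml renders as User-agent: *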
Data File Content
The robot TOML data file content is shown as follows:
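The following is a minimal sketch of such a data file, using the fields described below; the sitemap path, URL lists, and delay value are purely illustrative:
Sitemap = "{{ .BaseURL }}/sitemap.xml"
Allow = [ "/" ]
Disallow = [ "/private/" ]
Crawl-delay = 10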
Sitemap
- COMPULSORY - The URL location of your root sitemap.
- The use of {{ .BaseURL }} is available for multi-facing websites, where Bissetii will replace it with your base URL.

Allow
- COMPULSORY - The allowed URL array list.

Disallow
- OPTIONAL - The disallowed URL array list.
- If the list is empty, avoid declaring Disallow to keep the rendering simple.

Crawl-delay
- OPTIONAL - Specifies the delay timing for the specific crawler.
Say the above data file is named GoogleBot.toml; it will be rendered as:
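A hedged sketch of the rendered output, assuming the illustrative values above and a base URL of https://example.com (the actual field ordering may differ):
User-agent: GoogleBot
Allow: /
Disallow: /private/
Crawl-delay: 10
Sitemap: https://example.com/sitemap.xml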
Version v1.12.5 and Below
Bissetii does not have any processing solution due to Hugo Bug #5160. The only way is to supply your raw robots.txt via your static/ directory.
To override Bissetii's default file, you can create the same robots.txt in the same path.
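A minimal sketch of such a raw static/robots.txt, with purely illustrative rules and sitemap URL:
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml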
The enableRobotsTXT setting in config/_default/config.toml is disabled due to Hugo's multi-language bug. Hence, the robots.txt guide in Hugo's main documentation does not apply to these Bissetii versions.
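For reference, this corresponds to the following line in config/_default/config.toml, sketched here using Hugo's standard option name:
enableRobotsTXT = false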
Page-Specific Crawler Instructions
Bissetii also supports page specific meta tag for ad-hoc robot management. To
add a robot rules tag, you need to add each robot’s rule into the [robot]
table inside the page’s Hugo front matter. Example:
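A hedged reconstruction of such a front matter, based on the field descriptions below and the rendered output shown at the end of this section (the TOML +++ delimiters are assumed):
+++
[robots]

[robots.googleBot]
name = "googleBot"
content = "noindex,nofollow"

[robots.twitterBot]
name = "twitterBot"
content = "noindex"
+++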
[robots]
- COMPULSORY - Denotes the following fields belong to the robots table.

[robots.NAME]
- COMPULSORY - Denotes the following fields belong to the robots.NAME table.
- Provide a TOML-compatible NAME. Otherwise, keep it to the robot name itself.

name
- COMPULSORY - Name of the robot. Will be used as the name= attribute inside the <meta> tag.

content
- COMPULSORY - Instructions for the robot. Will be used as the content= attribute inside the <meta> tag.
The above will be rendered as:
<meta name="googleBot" content="noindex,nofollow" />
<meta name="twitterBot" content="noindex" />
Wrapping Up
That is all for managing crawler robots with Bissetii in Hugo. If you have any questions to ask us directly, please feel free to raise them at our GitLab Issues Section. We will be happy to assist you.