How To Manage Crawler Robots As Hugo Theme
Like any SEO-enabled site, a Bissetii site should have default robots
management on each page and through the single "robots.txt" file that crawlers
read. Starting from version v1.12.0, Bissetii supports both the page-level meta
robot tag and the single "robots.txt".
Customizing robots.txt
Depending on the Bissetii version, Bissetii renders robots.txt differently.
Version v1.13.0 and Above
Since Hugo fixed its robots.txt placement in
https://github.com/gohugoio/hugo/issues/5160 (tested with Hugo version
v0.78.2), Bissetii can now safely revert to using the Hugo renderer to create
robots.txt as documented in Hugo.
To customize robots.txt, you can create your “User Agent” data file inside
your data/bissetii/robots data directory. The path, depending on your
configuration, is as follows:
filepath pattern: data/bissetii/robots/<User-Agent>.toml
repo-docs: docs/.data/bissetii/robots/<User-Agent>.toml
The filename serves as the value for the User-agent field. The only exception
is all, which is rendered as * (User-agent: *).
You can create as many <User-Agent>.toml files as you want (e.g. all.toml,
GoogleBot.toml, etc.). They will all be processed into a single robots.txt
file.
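The filename rule above can be illustrated with a short Python sketch. It is not Bissetii's actual implementation (the theme renders robots.txt through Hugo templates). The function names are hypothetical, and treating the match on all as case-sensitive is an assumption.

```python
from pathlib import PurePosixPath


def user_agent_from_filename(filename):
    """Derive the User-agent value from a data file name.

    Assumed rule, taken from the text above: the file name without
    its extension is the User-agent value, except "all", which
    becomes "*".
    """
    stem = PurePosixPath(filename).stem
    return "*" if stem == "all" else stem


def user_agent_lines(filenames):
    """Build one 'User-agent:' line per data file (hypothetical helper)."""
    return [f"User-agent: {user_agent_from_filename(name)}" for name in filenames]


if __name__ == "__main__":
    for line in user_agent_lines(["all.toml", "GoogleBot.toml"]):
        print(line)
```

Here all.toml and GoogleBot.toml would yield the User-agent values * and GoogleBot respectively.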
By default, Bissetii supplies all.toml to define the sitemap location, as
required for most search engine optimization.
Defining Rules for a User-Agent
You can define the rules for a particular crawler robot using TOML format.
For relative URL construction, Bissetii supplies {{ .BaseURL }} as a
placeholder to be replaced with the actual .Site.BaseURL.
|
|
As an example, if the filename is GoogleBot.toml, the above will be converted
into:
|
|
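The {{ .BaseURL }} substitution described in this section can be pictured with a minimal Python sketch. It assumes the placeholder is replaced verbatim wherever it appears in a rule value. The function name and the example URL are hypothetical and are not taken from Bissetii.

```python
# Placeholder string that, per the text above, is replaced with .Site.BaseURL.
PLACEHOLDER = "{{ .BaseURL }}"


def expand_base_url(value, base_url):
    """Replace every occurrence of the placeholder with the site base URL.

    Assumption: the substitution is a plain string replacement; no
    slashes are added or removed.
    """
    return value.replace(PLACEHOLDER, base_url)


if __name__ == "__main__":
    # "https://example.com" stands in for a real .Site.BaseURL value.
    print(expand_base_url("{{ .BaseURL }}/sitemap.xml", "https://example.com"))
```

Because the replacement is verbatim in this sketch, a trailing slash in the base URL would be carried into the result unchanged.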
Version v1.12.0 to v1.12.5
Bissetii supplies robots.txt as static/robots.txt, at the root of the static
directory. By default, Bissetii has the following content:
User-agent: *
To override Bissetii’s default file, you can create your own robots.txt at
the same path.
The enableRobotsTXT setting is disabled in config/_default/config.toml due to
Hugo’s multi-language bug. Hence, the robots.txt guide in Hugo’s main
documentation does not apply to these Bissetii versions.
Meta Robot Tag
Bissetii also supports page-specific meta tags for robot management. To add a
robot rules tag, add each robot’s rules to the [robots] front-matter TOML
map-table. Example:
|
|
This will render the HTML as:
<meta name="googleBot" content="noindex,nofollow" />
<meta name="twitterBot" content="noindex" />
You can create multiple robot tags for this specific page.
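To show how a set of per-robot rules maps onto the tags above, here is a minimal Python sketch. It assumes each robot is a simple name-to-content pair. The real shape of the [robots] front-matter table may differ, and the function name is hypothetical.

```python
from html import escape


def render_robot_meta_tags(robots):
    """Render one <meta> tag per robot entry.

    Assumption: `robots` maps a robot name to its content string.
    This mirrors the rendered HTML shown above, not necessarily the
    theme's actual front-matter schema.
    """
    return [
        f'<meta name="{escape(name, quote=True)}" '
        f'content="{escape(content, quote=True)}" />'
        for name, content in robots.items()
    ]


if __name__ == "__main__":
    rules = {"googleBot": "noindex,nofollow", "twitterBot": "noindex"}
    for tag in render_robot_meta_tags(rules):
        print(tag)
```

Each entry produces exactly one tag, so adding more robots for a page only means adding more entries.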
Epilogue
That’s all for how to manage robot crawlers for your website as a Bissetii Hugo Theme user. If you have any questions, please feel free to place your query in our Issues Section.