prevent bots from harvesting irrelevant content
Bots cause a lot of load on some grouprise instances (much more than human visitors).
The following URLs seem to be problematic:
- `/stadt/abuse`
- `/stadt/signup`
- `/stadt/login`
- `/stadt/tags`
An example `robots.txt` could look like this:
```
User-agent: *
Disallow: /stadt/abuse
Disallow: /stadt/login
Disallow: /stadt/signup
Disallow: /stadt/tags
```
The first three paths refer to content that is not relevant to bots.
The `/stadt/tags` path, however, contains content that should be crawled. Here the calendar links are the source of the problem (e.g. `/stadt/tags/frieda/?month=7&year=2021&page=15`). Maybe we can mark such links as irrelevant for bots.
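Independently of marking the links themselves (e.g. via `rel="nofollow"`), one option would be to exclude only the calendar query strings in `robots.txt`. This is just a sketch: the `*` wildcard is a de-facto extension honored by major crawlers such as Google and Bing, but it is not part of the original robots.txt standard, so other bots may ignore the rule.

```
User-agent: *
Disallow: /stadt/abuse
Disallow: /stadt/login
Disallow: /stadt/signup
# Block only tag URLs with query strings (calendar pagination),
# while keeping the plain tag pages crawlable:
Disallow: /stadt/tags/*?
```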
We should probably ship a robots.txt file with grouprise (or generate it dynamically).
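If we go the dynamic route, a minimal sketch of a Django view could look like the following (the view name, URL wiring, and path list are assumptions for illustration, not existing grouprise code):

```python
# Hypothetical sketch: serve robots.txt from a Django view instead of a static file.
from django.http import HttpResponse
from django.urls import path
from django.views.decorators.http import require_GET


@require_GET
def robots_txt(request):
    # Paths taken from the example above; adjust if the URL prefix differs.
    lines = [
        "User-agent: *",
        "Disallow: /stadt/abuse",
        "Disallow: /stadt/login",
        "Disallow: /stadt/signup",
        "Disallow: /stadt/tags",
    ]
    return HttpResponse("\n".join(lines) + "\n", content_type="text/plain")


urlpatterns = [
    path("robots.txt", robots_txt),
]
```

A static file shipped with the package would achieve the same with less code; generating it dynamically mainly helps if the disallowed paths depend on instance configuration (e.g. a URL prefix other than `/stadt`).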