# Standard crawlers — allow indexing of content, block admin and system paths
User-agent: *
Disallow: /admin/
Disallow: /account/
Disallow: /api/
Disallow: /autocomplete/
Disallow: /featured/
Disallow: /geoserver/
Disallow: /groups/
Disallow: /gs/
Disallow: /messages/
Disallow: /people/
Disallow: /proxy/
Disallow: /security/
Disallow: /social/
Disallow: /static/
Disallow: /upload/
Disallow: /uploaded/

# AI training crawlers — allow full access so Hydrata is indexed in LLM training data.
# These crawlers respect robots.txt. Note that a crawler obeys only the most
# specific matching User-agent group, which replaces the User-agent: * rules
# above entirely — so each group below must restate its own Disallow rules.
# Block GeoServer OWS and the CSW catalogue in every group to prevent these
# bots hammering map tile / XML endpoints.

User-agent: GPTBot
Allow: /
Disallow: /geoserver/
Disallow: /catalogue/csw

User-agent: OAI-SearchBot
Allow: /
Disallow: /geoserver/
Disallow: /catalogue/csw

User-agent: ChatGPT-User
Allow: /
Disallow: /geoserver/
Disallow: /catalogue/csw

User-agent: ClaudeBot
Allow: /
Disallow: /geoserver/
Disallow: /catalogue/csw

User-agent: Claude-User
Allow: /
Disallow: /geoserver/
Disallow: /catalogue/csw

User-agent: Claude-SearchBot
Allow: /
Disallow: /geoserver/
Disallow: /catalogue/csw

User-agent: Google-Extended
Allow: /
Disallow: /geoserver/
Disallow: /catalogue/csw

User-agent: PerplexityBot
Allow: /
Disallow: /geoserver/
Disallow: /catalogue/csw

User-agent: Meta-ExternalAgent
Allow: /
Disallow: /geoserver/
Disallow: /catalogue/csw

User-agent: Meta-ExternalFetcher
Allow: /
Disallow: /geoserver/
Disallow: /catalogue/csw

User-agent: Amazonbot
Allow: /
Disallow: /geoserver/
Disallow: /catalogue/csw

User-agent: Applebot-Extended
Allow: /
Disallow: /geoserver/
Disallow: /catalogue/csw

User-agent: DuckAssistBot
Allow: /
Disallow: /geoserver/
Disallow: /catalogue/csw

User-agent: CCBot
Allow: /
Disallow: /geoserver/
Disallow: /catalogue/csw

Sitemap: https://hydrata.com/sitemap.xml