TXT Files

The TXT Files section allows you to view text-based configuration files located in the root directory of the website.

These files typically include configurations such as:

Search engine crawling rules
Security policies
Bot access rules

WebPixie automatically detects these files and allows you to view their contents directly.
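Because these files live at the root of the site, their URLs can be derived from any page address on the same host. The following is a minimal sketch of that derivation, not WebPixie's actual detection logic; the function name `root_txt_urls` and the file list are illustrative assumptions based only on the files named on this page.

```python
from urllib.parse import urljoin

# File names taken from this page; the full list WebPixie checks is an assumption.
ROOT_TXT_FILES = ["robots.txt", "llms.txt"]


def root_txt_urls(site_url: str) -> list[str]:
    """Return the root-level URL of each text file, ignoring any path in site_url."""
    # A leading "/" makes urljoin resolve against the host root, not the current path.
    return [urljoin(site_url, "/" + name) for name in ROOT_TXT_FILES]


print(root_txt_urls("https://example.com/blog/post"))
```

Resolving against the root matters because crawlers only look for robots.txt at the top level of a host; a copy placed in a subdirectory is ignored.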

Robots.txt

The robots.txt file contains rules that determine how search engine bots should crawl your website.

With this file, site owners can control:

which pages can be crawled
which pages cannot be crawled

Common Directives in robots.txt

Disallow

Specifies the URL paths that search engine crawlers should not access.

Example:

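User-agent: *
Disallow: /admin
Disallow: /login

These rules ask all crawlers to skip /admin, /login, and any URL whose path begins with either prefix. robots.txt is advisory: well-behaved crawlers follow it, but it does not enforce access control.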

You can use multiple Disallow directives to block different sections of your site from search engines.
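To see how a crawler would interpret Disallow rules, you can test them locally with Python's standard `urllib.robotparser` module. The sketch below is only an illustration: the helper `is_allowed`, the bot name `ExampleBot`, and the `example.com` URLs are assumptions, and the rules are wrapped in a `User-agent: *` group so that they apply to every crawler.

```python
from urllib.robotparser import RobotFileParser

# Sample rules; "User-agent: *" applies the group to every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin
Disallow: /login
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network request
    return parser.can_fetch(user_agent, url)


for path in ("/admin", "/admin/users", "/login", "/blog"):
    print(path, is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com" + path))
```

Matching is by path prefix, so `/admin/users` is blocked along with `/admin`, while `/blog` stays crawlable.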

WebPixie validates the directives in robots.txt line by line and shows any errors it finds. It also checks whether a Sitemap: directive is present. That check is presence-only and does not replace the full sitemap crawl performed by Sitemap Monitoring.
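A line-by-line check of this kind could look roughly like the sketch below. It is not WebPixie's implementation: the function `validate_robots`, the set of accepted directives, and the error wording are all assumptions chosen for illustration.

```python
# Directives this sketch accepts; the set a real validator recognizes may differ.
KNOWN_DIRECTIVES = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def validate_robots(text: str) -> tuple[list[str], bool]:
    """Return (errors, has_sitemap) for the contents of a robots.txt file."""
    errors = []
    has_sitemap = False
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            errors.append(f"line {number}: missing ':' separator")
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field not in KNOWN_DIRECTIVES:
            errors.append(f"line {number}: unknown directive '{field}'")
        elif field == "user-agent":
            seen_user_agent = True
        elif field in ("allow", "disallow") and not seen_user_agent:
            errors.append(f"line {number}: '{field}' appears before any User-agent line")
        elif field == "sitemap":
            has_sitemap = True  # presence only; the sitemap itself is not fetched
    return errors, has_sitemap


sample = "User-agent: *\nDisalow: /login\nSitemap: https://example.com/sitemap.xml\n"
print(validate_robots(sample))
```

Reporting the line number with each error makes it easy to locate a typo such as `Disalow` in a long file.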

llms.txt

The llms.txt file is a proposed standard for giving AI/LLM crawlers a structured summary of a site's content, similar in spirit to robots.txt for search engines.

WebPixie detects an llms.txt file if present and validates it line by line, alongside robots.txt.
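As a rough illustration of what such validation might involve, the sketch below assumes the layout described in the llms.txt proposal: a Markdown file that opens with an H1 title, followed by optional H2 sections whose list items link to further resources. The function `check_llms_txt` and its messages are hypothetical, and the checks WebPixie actually runs may differ.

```python
import re

# Matches a Markdown link such as [Guide](https://example.com/guide.md).
LINK_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def check_llms_txt(text: str) -> list[str]:
    """Return a list of problems found in the contents of an llms.txt file."""
    non_blank = [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if line.strip()
    ]
    if not non_blank:
        return ["file is empty"]

    problems = []
    first_number, first_line = non_blank[0]
    if not first_line.startswith("# "):
        problems.append(f"line {first_number}: expected an H1 title such as '# Site name'")

    in_section = False  # becomes True after the first H2 heading
    for number, line in non_blank:
        if line.startswith("## "):
            in_section = True
        elif in_section and line.startswith("- ") and not LINK_PATTERN.search(line):
            problems.append(f"line {number}: list item has no [title](url) link")
    return problems


print(check_llms_txt("Example\n\n## Docs\n- Guide\n"))
```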

Using WebPixie with robots.txt

WebPixie automatically detects the robots.txt file and displays its contents in the analysis screen.

This allows users to:

quickly review crawling rules
spot misconfigurations
identify critical SEO issues

Reviewing your robots.txt regularly helps ensure that search engines can crawl the pages you want indexed and that important pages are not blocked by accident.
