Sitemap.xml checker
Check if your sitemap is accessible, contains valid URLs, and has no errors
Check results
This check only covers sitemap.xml. For a full picture of your page, run a page audit.
For issues across your whole site — duplicate titles, orphan pages, broken internal links — run a site audit.
Want us to fix what we found? Our team can help.
What is sitemap.xml and why it matters
A sitemap is a file that lists the URLs on your site so search engines can discover them without crawling the entire site link by link. It's particularly important for large sites, new sites with few inbound links, sites with deep pages not linked from the homepage, and content that updates frequently. Google and Bing both use the sitemap to decide what to crawl and how often. The spec comes from sitemaps.org; supported formats are XML (<urlset> and <sitemapindex>), RSS 2.0, and Atom.
What this tool checks
- Sitemap discovery — all Sitemap: directives in robots.txt, plus the default /sitemap.xml fallback
- Multiple Sitemap directives — robots.txt can list several; all are parsed
- Sitemap reference in robots.txt — missed opportunity if present at /sitemap.xml but not referenced
- Format validation — recognized as urlset, sitemapindex, RSS, or Atom
- Sitemap Index handling — nested sitemaps checked for accessibility (first 10)
- URL count — sitemaps.org spec caps a single file at 50,000 URLs
- lastmod presence — helps search engines prioritize recrawl
- Future dates in lastmod — usually clock-skew or timezone bug
- All-same-date lastmod — likely CMS stamping generation time rather than edit time
- Foreign-domain URLs — Google rejects sitemaps referencing URLs outside the site
- HTTP URLs on HTTPS site — mixed protocol causes duplicate-content issues
- Current page membership — is the URL you're auditing included in the sitemap
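The discovery step above can be sketched in a few lines. This is an illustrative sketch, not the checker's actual implementation; the function name and fallback logic are our assumptions based on the behavior described.

```python
# Hypothetical sketch of sitemap discovery: collect every "Sitemap:"
# directive from robots.txt, falling back to /sitemap.xml if none exist.
from urllib.parse import urljoin

def discover_sitemaps(robots_txt: str, site_url: str) -> list[str]:
    sitemaps = []
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        # The Sitemap directive is case-insensitive per common practice
        if key.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    # No directives found: try the conventional default location
    return sitemaps or [urljoin(site_url, "/sitemap.xml")]

robots = """User-agent: *
Sitemap: https://example.com/posts-sitemap.xml
Sitemap: https://example.com/products-sitemap.xml"""
print(discover_sitemaps(robots, "https://example.com/"))
```

Note that `partition(":")` splits on the first colon only, so the `https://` in the sitemap URL survives intact.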
Good vs bad examples
Good — minimal valid XML sitemap:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://example.com/</loc>
<lastmod>2026-04-10</lastmod>
</url>
</urlset>
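If you generate your sitemap programmatically, the standard library is enough. A minimal sketch using Python's `xml.etree.ElementTree`, following the sitemaps.org 0.9 schema (the values are the example's, not real data):

```python
# Build the minimal sitemap above with the standard library; namespacing
# the elements yields the required xmlns attribute on <urlset>.
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", NS)  # emit a default xmlns, not a prefix

urlset = ET.Element(f"{{{NS}}}urlset")
url = ET.SubElement(urlset, f"{{{NS}}}url")
ET.SubElement(url, f"{{{NS}}}loc").text = "https://example.com/"
ET.SubElement(url, f"{{{NS}}}lastmod").text = "2026-04-10"

xml = ET.tostring(urlset, encoding="unicode", xml_declaration=True)
print(xml)
```

Using the library (rather than string concatenation) also escapes special characters in URLs for free.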
Good — Sitemap Index for large sites (split by content type, each file under the 50K URL cap):
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap><loc>https://example.com/posts-sitemap.xml</loc></sitemap>
<sitemap><loc>https://example.com/products-sitemap.xml</loc></sitemap>
</sitemapindex>
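The splitting logic behind a Sitemap Index is simple chunking. A sketch under the spec's 50,000-URL cap; the `sitemap-N.xml` naming is hypothetical:

```python
# Split a large URL list into files under the 50,000-URL cap and
# produce the list of child-sitemap URLs for the index file.
def split_into_sitemaps(urls, base="https://example.com", cap=50_000):
    chunks = [urls[i:i + cap] for i in range(0, len(urls), cap)]
    index_entries = [f"{base}/sitemap-{n}.xml"
                     for n in range(1, len(chunks) + 1)]
    return chunks, index_entries

urls = [f"https://example.com/p/{i}" for i in range(120_000)]
chunks, index = split_into_sitemaps(urls)
print(len(chunks), index)  # 120,000 URLs fit in 3 files
```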
Good — multiple Sitemap directives in robots.txt (supported by all major search engines):
User-agent: *
Disallow:
Sitemap: https://example.com/posts-sitemap.xml
Sitemap: https://example.com/products-sitemap.xml
Sitemap: https://example.com/categories-sitemap.xml
Bad — sitemap with URLs on a different domain (Google rejects):
<url><loc>https://other-domain.com/page</loc></url>
Bad — mixed HTTP and HTTPS URLs:
<url><loc>https://example.com/page1</loc></url>
<url><loc>http://example.com/page2</loc></url> <!-- should be https -->
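Both bad cases above reduce to comparing each `<loc>` against the site's own scheme and host. A sketch of such a check (function name is ours, not the checker's):

```python
# Flag <loc> URLs on a foreign domain, or plain-http URLs on an
# https site, by comparing parsed components against the site URL.
from urllib.parse import urlparse

def audit_locs(locs: list[str], site_url: str) -> list[str]:
    site = urlparse(site_url)
    issues = []
    for loc in locs:
        u = urlparse(loc)
        if u.netloc != site.netloc:
            issues.append(f"foreign domain: {loc}")
        elif site.scheme == "https" and u.scheme == "http":
            issues.append(f"http on https site: {loc}")
    return issues

issues = audit_locs(
    ["https://example.com/page1",
     "http://example.com/page2",
     "https://other-domain.com/page"],
    "https://example.com/",
)
print(issues)
```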
Common mistakes
- No Sitemap directive in robots.txt — even if /sitemap.xml works, non-Google crawlers (Bing, Yandex, DuckDuckGo) rely on robots.txt for discovery
- Cross-domain URLs in a single sitemap — Google rejects. For multi-domain content, use per-domain sitemaps
- Mixing HTTP and HTTPS URLs — search engines treat them as different URLs, causing duplicate content
- Stale or all-same lastmod values — if every URL has the same <lastmod> value (CMS stamps generation time), search engines downweight the signal
- Missing lastmod entirely — crawlers can't tell which pages changed recently
- Single sitemap over 50,000 URLs or over 50 MB uncompressed — split it via a Sitemap Index, or gzip it (also spec-supported, though our checker currently fetches the uncompressed file)
- Invalid XML — most often an unescaped & in a URL; write it as &amp;
- Including noindex or non-canonical URLs — sitemaps should list the canonical version you want indexed, not every variant
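The invalid-XML case is easy to avoid in code. The standard library's `xml.sax.saxutils.escape` handles `&`, `<`, and `>` (the example URL is illustrative):

```python
# Escape an ampersand in a query-string URL before embedding it in XML.
from xml.sax.saxutils import escape

raw = "https://example.com/search?q=a&b=c"
print(f"<loc>{escape(raw)}</loc>")
# <loc>https://example.com/search?q=a&amp;b=c</loc>
```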
Frequently asked questions
Can a site have more than one sitemap?
Yes — we parse every Sitemap: line in robots.txt. Plugins like Yoast for WordPress commonly generate several segmented sitemaps (post-sitemap.xml, page-sitemap.xml, category-sitemap.xml) and list each in robots.txt. All are fetched and processed.