nextjs-seo-indexing
How to Install
Claude Code:
git clone https://github.com/self && cp -r skills/nextjs-seo-indexing ~/.claude/skills/
Cursor:
Copy SKILL.md into your .cursorrules file
Next.js SEO Indexing & Crawl Budget Skill
Fix Google Search Console coverage issues, canonical problems, sitemap errors, and crawl budget waste in Next.js apps.
When to Use
- Use when a Next.js site has Google Search Console coverage issues such as duplicate canonicals, accidental noindex, crawl waste, or discovered-but-not-indexed URLs.
- Use when auditing sitemap, robots.txt, redirect, internal-linking, or static-rendering problems before an SEO release.
- Use when you need framework-specific examples for Next.js App Router metadata, generateMetadata, robots.js, and sitemap routes.
Understanding Search Console Coverage States
| Status | Meaning | Fix |
|---|---|---|
| Crawled - currently not indexed | Google crawled the page but chose not to index it | Improve content quality + canonical + internal links |
| Duplicate without user-selected canonical | Multiple URLs serve the same content and none declares a canonical | Add an explicit canonical pointing to the preferred URL |
| Excluded by 'noindex' tag | A noindex directive is present | Remove noindex if the page should be indexed |
| Duplicate, Google chose different canonical than user | Google prefers a different URL than the one you specified | Either adopt Google's chosen URL as the canonical, or point internal links, sitemap entries, and redirects consistently at your preferred URL |
| Alternate page with proper canonical tag | Correct: a non-preferred duplicate pointing to its canonical | Expected behavior, not a problem |
| Not found (404) | Page deleted or URL changed | Add a redirect or restore the page |
| Discovered - currently not indexed | Google knows the URL exists but hasn't crawled it yet | Improve internal linking + reduce crawl budget waste |
| Page with redirect | The URL redirects, possibly through a chain or to the wrong target | Shorten the redirect chain, verify the destination |
Step 1 — Canonical Audit
Next.js App Router (metadata export)
// app/blog/my-post/page.js
export const metadata = {
  title: 'My Post Title',
  alternates: {
    canonical: 'https://www.yourdomain.com/blog/my-post',
  },
};
Next.js App Router (generateMetadata)
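// app/blog/[slug]/page.js
export async function generateMetadata({ params }) {
  // In Next.js 15+ params is a Promise; awaiting it is also harmless on the plain object used by earlier versions
  const { slug } = await params;
  return {
    alternates: {
      canonical: `https://www.yourdomain.com/blog/${slug}`,
    },
  };
}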
Common canonical mistakes to fix:
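// ❌ WRONG: relative URL (it only resolves to an absolute URL if metadataBase is set in a layout)
canonical: '/blog/my-post'
// ❌ WRONG: trailing slash used inconsistently (/blog/my-post on some pages, /blog/my-post/ on others)
// (pick one form and stick with it sitewide)
// ✓ CORRECT: absolute URL, consistent scheme + subdomain
canonical: 'https://www.yourdomain.com/blog/my-post'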
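One way to avoid these mistakes is to build every canonical through a single helper. The sketch below is illustrative only: `SITE_ORIGIN` and `canonicalFor` are hypothetical names, and it assumes a sitewide policy of no trailing slash and no query strings in canonicals. Adjust it if your site uses the opposite convention.

```javascript
// Hypothetical helper; SITE_ORIGIN and canonicalFor are illustrative names, not Next.js APIs.
const SITE_ORIGIN = 'https://www.yourdomain.com';

function canonicalFor(pathOrUrl) {
  // Resolve relative paths against the preferred origin; absolute URLs parse as-is.
  const { pathname } = new URL(pathOrUrl, SITE_ORIGIN);
  // Assumed policy: strip trailing slashes, but keep "/" for the homepage.
  const trimmed = pathname.replace(/\/+$/, '') || '/';
  // Always emit the preferred scheme and host; query string and fragment are dropped.
  return SITE_ORIGIN + trimmed;
}
```

For example, `canonicalFor('/blog/my-post/')` and `canonicalFor('http://yourdomain.com/blog/my-post?utm_source=x')` both return `https://www.yourdomain.com/blog/my-post`, so the value can be passed straight to `alternates.canonical`.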
Step 2 — Noindex Audit
Find pages that are accidentally noindexed:
# Search for noindex in metadata (the App Router expresses it as robots: { index: false })
rg -n --glob '*.{js,ts,jsx,tsx}' 'noindex|index:\s*false' app pages
# Check layout.js — a noindex here affects ALL pages
grep -n "robots" app/layout.js
In Next.js App Router, robots in the root layout applies globally. Only set it there if you want the whole site affected.
// app/layout.js: only set robots here if you need sitewide control
export const metadata = {
  // ✓ Allow indexing
  robots: { index: true, follow: true },
  // ❌ This would noindex the entire site:
  // robots: { index: false }
};
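Source searches can miss a noindex that only appears in the rendered output, so it can help to check the HTML and response headers of a deployed page as well. The following is a minimal sketch, not a full HTML parser: `findNoindexSignals` is a hypothetical name, and the regex matching assumes conventional `<meta name="robots" content="...">` markup with quoted attribute values.

```javascript
// Hypothetical helper: scans rendered HTML and response headers for noindex directives.
function findNoindexSignals(html, headers = {}) {
  const signals = [];
  // Look at every <meta> tag and keep those named "robots" or "googlebot".
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    const name = /\bname\s*=\s*["']?(robots|googlebot)["']?/i.exec(tag);
    const content = /\bcontent\s*=\s*["']([^"']*)["']/i.exec(tag);
    if (name && content && /\bnoindex\b/i.test(content[1])) {
      signals.push(`meta ${name[1].toLowerCase()}: ${content[1]}`);
    }
  }
  // The X-Robots-Tag response header can also carry noindex; header names are case-insensitive.
  for (const [key, value] of Object.entries(headers)) {
    if (key.toLowerCase() === 'x-robots-tag' && /\bnoindex\b/i.test(String(value))) {
      signals.push(`header x-robots-tag: ${value}`);
    }
  }
  return signals;
}
```

An empty array means neither signal was found; any entry names the tag or header that is blocking indexing.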
Step 3 — Sitemap Health
Verify sitemap routes return 200 + valid XML
curl -sI https://www.yourdomain.com/sitemap.xml | grep -iE "^HTTP|^content-type"
curl -s https://www.yourdomain.com/sitemap.xml | head -20
Next.js App Router sitemap (recommended pattern)
// app/sitemap.js
export default async function sitemap() {
  const baseUrl = 'https://www.yourdomain.com';
  // Static pages
  const staticPages = [
    { url: baseUrl, lastModified: new Date(), changeFrequency: 'daily', priority: 1.0 },
    { url: `${baseUrl}/about`, lastModified: new Date(), changeFrequency: 'monthly', priority: 0.8 },
  ];
  // Dynamic pages (fetch from DB or CMS)
  const posts = await getPosts(); // your data fetch
  const dynamicPages = posts.map(post => ({
    url: `${baseUrl}/blog/${post.slug}`,
    lastModified: new Date(post.updatedAt),
    changeFrequency: 'weekly',
    priority: 0.7,
  }));
  return [...staticPages, ...dynamicPages];
}
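Before returning the array, it may be worth sanity-checking the entries, since a missing `updatedAt` or a stray relative URL produces an invalid sitemap without any build error. This is a rough sketch under stated assumptions: `validateSitemapEntries` is a hypothetical name, and the checks (absolute URL, same origin, no duplicates, parseable date) are a reasonable minimum rather than a complete validator.

```javascript
// Hypothetical validator for the entry objects returned by app/sitemap.js.
function validateSitemapEntries(entries, baseUrl) {
  const problems = [];
  const seen = new Set();
  const origin = new URL(baseUrl).origin;
  for (const entry of entries) {
    let parsed;
    try {
      parsed = new URL(entry.url); // throws when the URL is relative or malformed
    } catch {
      problems.push(`${entry.url}: not an absolute URL`);
      continue;
    }
    if (parsed.origin !== origin) problems.push(`${entry.url}: origin differs from ${origin}`);
    if (seen.has(parsed.href)) problems.push(`${entry.url}: duplicate entry`);
    seen.add(parsed.href);
    // new Date(undefined) yields an Invalid Date, which would serialize badly.
    if (entry.lastModified !== undefined && Number.isNaN(new Date(entry.lastModified).getTime())) {
      problems.push(`${entry.url}: invalid lastModified`);
    }
  }
  return problems;
}
```

Calling it on `[...staticPages, ...dynamicPages]` and logging or throwing on a non-empty result surfaces these problems at build time rather than in Search Console.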
Multiple sitemaps (sitemap index)
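// app/sitemap-tools/sitemap.js → served at /sitemap-tools/sitemap.xml
// app/sitemap-blog/sitemap.js → served at /sitemap-blog/sitemap.xml
// Each returns an array of URL entries; list every sitemap URL in robots.js or submit each one in Search Console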
Step 4 — Static Rendering Verification
Pages must be statically generated (or SSR with metadata in HTML) for Google to see SEO tags.
# Check build output: pages should show ○ (static) or ● (SSG), not λ / ƒ (dynamic)
# Recent Next.js versions print ƒ where older ones print λ
npm run build 2>&1 | grep -E "○|●|λ|ƒ|/blog|/tools"
○ /about (static)
● /blog/[slug] (SSG) ← good
λ /api/data (serverless) ← expected for APIs
If important pages are λ or ƒ (fully dynamic with no static generation), add:
// app/blog/[slug]/page.js
export async function generateStaticParams() {
  const posts = await getPosts();
  return posts.map(post => ({ slug: post.slug }));
}
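The build-output check can also be automated, for instance in CI. The sketch below assumes the route lines look like the sample above (a symbol followed by a path); `findDynamicRoutes` is a hypothetical name, and the exact output format varies between Next.js versions, so treat the regex as a starting point.

```javascript
// Hypothetical helper: lists important routes that the build marked as dynamic.
function findDynamicRoutes(buildOutput, importantPrefixes) {
  const dynamic = [];
  for (const line of buildOutput.split('\n')) {
    // Route lines contain a render symbol followed by a path, e.g. "● /blog/[slug]".
    // Legend lines such as "○ (Static)" have no path and are skipped.
    const match = /([○●λƒ])\s+(\/\S*)/u.exec(line);
    if (!match) continue;
    const [, symbol, route] = match;
    const isDynamic = symbol === 'λ' || symbol === 'ƒ';
    if (isDynamic && importantPrefixes.some((prefix) => route.startsWith(prefix))) {
      dynamic.push(route);
    }
  }
  return dynamic;
}
```

Passing the captured output of `npm run build` with prefixes such as `['/blog', '/tools']` returns the routes that still need `generateStaticParams` or another static strategy; API routes outside those prefixes are ignored.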
Step 5 — Internal Linking Audit
Pages with zero internal links are rarely indexed. Every important page should be reachable from:
1. Homepage or navigation
2. A sitemap
3. At least one other content page
# Find pages that have no inbound links from other pages
# (manual check: search for the slug across all source files)
rg -n --glob '*.{js,ts,jsx,tsx,md}' '/blog/my-orphan-post' . | grep -v "sitemap\|the-page-itself"
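If you can collect each page's outbound internal links (from a crawl export or by parsing your content), orphan detection becomes a small graph problem. The sketch below assumes a hypothetical `linkGraph` object that maps every known page path to the array of internal paths it links to; how you build that map is left open.

```javascript
// Hypothetical helper: returns pages that no other page links to.
// linkGraph maps each known page path to the internal paths it links out to.
function findOrphanPages(linkGraph) {
  const linked = new Set();
  for (const [page, outbound] of Object.entries(linkGraph)) {
    for (const target of outbound) {
      if (target !== page) linked.add(target); // self-links do not count as inbound
    }
  }
  // The homepage is the crawl entry point, so it is never reported as an orphan.
  return Object.keys(linkGraph).filter((page) => page !== '/' && !linked.has(page));
}
```

Any path returned here fails the "at least one other content page" requirement and needs a link from navigation or related content.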
Step 6 — Redirect Audit
# Find all redirects in Next.js config
grep -A 3 "redirects" next.config.js
# Check for redirect chains (A → B → C should be A → C)
# Test a suspected chain: -L follows every hop, so more than one Location line means a chain
curl -sIL https://www.yourdomain.com/old-url | grep -iE "^HTTP|^location"
// next.config.js: keep redirects flat (no chains)
module.exports = {
  async redirects() {
    return [
      {
        source: '/old-url',
        destination: '/new-url', // Must NOT itself redirect
        permanent: true, // 308 permanent redirect, treated like a 301 for SEO
      },
    ];
  },
};
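Chains inside the config itself can be caught before deployment by checking whether any destination is also a source. The sketch below is a simplified illustration: `findRedirectChains` is a hypothetical name, and it compares `source` and `destination` as exact strings, so pattern-based sources such as `/old/:slug` are not resolved.

```javascript
// Hypothetical helper: finds redirect chains and loops in an array of { source, destination } rules.
function findRedirectChains(redirects) {
  const bySource = new Map(redirects.map((rule) => [rule.source, rule.destination]));
  const chains = [];
  for (const { source } of redirects) {
    const hops = [source];
    let current = source;
    // Follow each destination while it is itself the source of another rule.
    while (bySource.has(current)) {
      current = bySource.get(current);
      const isLoop = hops.includes(current);
      hops.push(current);
      if (isLoop) break; // stop once a path repeats
    }
    // A flat redirect has exactly two hops: source and final destination.
    if (hops.length > 2) chains.push(hops);
  }
  return chains;
}
```

Each returned array lists the hops in order, so `['/a', '/b', '/c']` means the `/a` rule should point directly at `/c`.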
Step 7 — robots.txt Check
curl -s https://www.yourdomain.com/robots.txt
# ✓ Good
User-agent: *
Allow: /
Sitemap: https://www.yourdomain.com/sitemap.xml
# ❌ Bad — disallows crawling of important content
Disallow: /blog/
Disallow: /tools/
// app/robots.js (Next.js App Router)
export default function robots() {
  return {
    rules: { userAgent: '*', allow: '/' },
    sitemap: 'https://www.yourdomain.com/sitemap.xml',
  };
}
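To confirm that specific important paths are not blocked, the fetched robots.txt can be checked programmatically. The following is a deliberately simplified sketch: `isAllowedByRobots` is a hypothetical name, it only reads the `User-agent: *` group, and it uses plain prefix matching with the longest rule winning. Wildcards (`*`, `$`) and crawler-specific groups are not handled, so use a dedicated parser for anything beyond a quick audit.

```javascript
// Hypothetical, simplified robots.txt check for the "User-agent: *" group only.
function isAllowedByRobots(robotsTxt, path) {
  const rules = [];
  let inStarGroup = false;
  let lastWasAgent = false;
  for (const raw of robotsTxt.split('\n')) {
    const line = raw.replace(/#.*/, '').trim(); // drop comments and whitespace
    const sep = line.indexOf(':');
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === 'user-agent') {
      // Consecutive User-agent lines belong to the same group.
      inStarGroup = lastWasAgent ? inStarGroup || value === '*' : value === '*';
      lastWasAgent = true;
    } else if (field === 'allow' || field === 'disallow') {
      lastWasAgent = false;
      // An empty Disallow value means "nothing is disallowed", so it adds no rule.
      if (inStarGroup && value !== '') rules.push({ allow: field === 'allow', prefix: value });
    }
  }
  // The longest matching prefix wins; on a tie, Allow wins.
  let best = null;
  for (const rule of rules) {
    if (!path.startsWith(rule.prefix)) continue;
    const longer = !best || rule.prefix.length > best.prefix.length;
    const tieAllow = best && rule.prefix.length === best.prefix.length && rule.allow;
    if (longer || tieAllow) best = rule;
  }
  return best ? best.allow : true;
}
```

Running it over the paths of your most important pages against the output of the `curl` command above gives a quick pass/fail list.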
Indexing Checklist
- [ ] All important pages have absolute canonical URLs
- [ ] No important pages accidentally noindexed
- [ ] Sitemap routes return 200 with valid XML
- [ ] Sitemap submitted to Google Search Console
- [ ] Important pages statically generated (○ or ●) in build output
- [ ] No redirect chains (A→B→C should be A→C)
- [ ] robots.txt allows important content
- [ ] Every important page has ≥1 internal inbound link
- [ ] generateStaticParams added for dynamic routes with known slugs
Limitations
- Does not guarantee Google will index a page; final indexing decisions remain with the search engine.
- Requires access to the codebase, deployed URLs, and ideally Google Search Console data for confident diagnosis.
- Treat recommendations that change URL structure, redirects, or canonical policy as production-impacting and review them before deployment.
Details
| Field | Value |
|---|---|
| Category | Content → SEO |
| Source | self |
| Stars | N/A |
| Risk Level | Safe |