name: seo-geo-audit
description: "Get a prioritized SEO + GEO action plan for any website. Input: a URL (and optional competitors). Output: ranked fixes across technical health, content, keywords, e-commerce, local, backlinks, and AI visibility (ChatGPT, Perplexity, Google AI Overviews), or a plain-language summary. Use to audit or review a site, find why it does not rank or is not cited by AI, compare against competitors, or set a baseline before a bigger SEO project."
license: MIT
metadata:
  author: "Sorank (https://sorank.com)"
  version: "1.0.0"
Full SEO + GEO site audit
Audit a website the way a senior practitioner does: collect verifiable facts first, judge them against a field-tested checklist, benchmark against the competitors that actually rank, and deliver a short list of fixes ordered by impact. The methodology was distilled from 115+ real agency audit calls and updated with sourced 2025-2026 evidence on AI search.
Company knowledge first (Obsidian)
If the working environment contains an Obsidian vault or any local knowledge base (a folder of .md notes, often with a .obsidian directory), read the relevant notes before acting: brand and product facts, target keywords, competitors, and the SEO action log of what was already tried. Ground every recommendation in that context instead of asking the user for facts the vault already holds. At the end of the session, append the actions taken to the vault's SEO action log so the next session starts informed. For the vault structure and the read-first and write-back protocols, see the obsidian-brain skill.
Why this audit is different
- Facts before judgment. Every claim in the final report must trace back to something measured (by the bundled script, a browser check, or data the owner provided). Never guess a title length or assume a sitemap exists.
- Competitors over abstract thresholds. "Your page has 400 words" means nothing alone. "The three sites outranking you average 1200 words on this query" is a finding. When competitors are known, benchmark page versus page.
- Two search worlds at once. Every category is checked twice: does it help Google rankings, and does it help the site get retrieved and cited by ChatGPT, Perplexity and Google AI Overviews. These overlap but do not coincide (AI Mode answers show only about 32 percent URL overlap with the top 10 organic results).
- The audit is a pricing tool for agencies. The counted error volume doubles as a quote: "500 errors on this site, that is roughly 3000 euros of work" scopes the project the prospect is buying. The score is a ratio of errors to pages: one template-level error repeats on every page using that template, and a single fix corrects it everywhere, which is why a low score is rarely a catastrophe. Frame both points up front so the owner reads the report as a plan, not a verdict (field heuristic from 115+ agency audits).
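The competitor benchmark described above can be sketched in a few lines. This is a minimal illustration, not part of the bundled script: the function name `word_count_gap` and the sample numbers are assumptions chosen to mirror the "400 words versus a 1200-word average" example.

```python
from statistics import mean


def word_count_gap(own_words: int, competitor_words: list[int]) -> dict:
    """Compare one page's visible word count with the competitor average."""
    if not competitor_words:
        raise ValueError("at least one competitor page is needed for a benchmark")
    avg = round(mean(competitor_words))
    return {
        "own": own_words,
        "competitor_avg": avg,
        "gap": avg - own_words,                  # words missing versus the average
        "ratio": round(own_words / avg, 2),      # share of the competitor average
    }


# Illustrative numbers only: a 400-word page against three competitor pages.
result = word_count_gap(400, [1100, 1250, 1250])
print(result)
```

The gap, not the raw count, is what goes into the report; the same pattern applies to any countable fact the collector returns for both the audited page and the competitor pages.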
When to use
- The user provides a URL and wants improvement points, an audit, a review, or a "check".
- A site does not rank, lost traffic, or is invisible in AI assistant answers.
- The user wants a comparison against one or more competitors.
- A baseline is needed before content production, a migration, or a redesign.
Inputs to gather
Ask only for what is missing; proceed with what exists:
| Input | Required | Note |
|---|---|---|
| Homepage URL | yes | The script also discovers robots.txt, sitemap, llms.txt |
| 1-2 key pages | recommended | Top service, product or article page |
| Competitor URLs | recommended | Enables page-versus-page benchmarking |
| Goals and market | helpful | Local vs national, language, business model |
| GSC / GA4 access | optional | Indexation ratio and traffic facts; otherwise ask the owner for screenshots |
Phase 1: collect the facts
Run the bundled collector on the homepage plus key pages, and on each competitor page cited later in the report:
python3 scripts/seo_audit.py https://example.com /services /blog/top-article
It returns, per page: HTTP status, HTTPS, platform fingerprint, title (with generic-title flag), meta description length, full heading hierarchy with level jumps, image alt coverage and weight sample, internal versus external link counts, Open Graph, canonical, JSON-LD types, visible word count, meta robots, and a likely_js_rendered flag. Site-wide: robots.txt rules for AI search bots and AI training bots (separately), sitemap declaration and URL count, llms.txt presence.
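To make the "heading hierarchy with level jumps" check concrete, here is a minimal sketch using only the Python standard library. It is an assumption about how such a check could work, not the actual implementation inside scripts/seo_audit.py; the names `HeadingCollector` and `heading_level_jumps` are hypothetical.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect heading levels (1-6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "h1".."h6" is enough.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_level_jumps(html: str) -> list[tuple[int, int]]:
    """Return (previous_level, level) pairs where the hierarchy skips a level."""
    collector = HeadingCollector()
    collector.feed(html)
    levels = collector.levels
    return [(a, b) for a, b in zip(levels, levels[1:]) if b > a + 1]


# Illustrative markup: H1 -> H3 and H2 -> H4 both skip a level.
sample = (
    "<h1>Plumber in Lyon</h1><h3>Emergency repairs</h3>"
    "<h2>Prices</h2><h4>Weekend rates</h4>"
)
jumps = heading_level_jumps(sample)
print(jumps)  # [(1, 3), (2, 4)]
```

Only downward skips are flagged; moving back up from an H3 to an H2 is a normal way to close a subsection.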
Known limits, and what to do about each:
| Limit | Fallback |
|---|---|
| likely_js_rendered: true (client-side site) | Open the page in a browser to read the real content, and report that AI crawlers cannot see it (they do not execute JavaScript, see seo-technical) |
| Images loaded as CSS backgrounds | Invisible to the parser (total may read 0); verify visually |
| Real-world speed | Ask the user to run https://pagespeed.web.dev (free, no key) on mobile and desktop and share scores |
| Backlink profile | Use GSC Links report or Bing Webmaster Tools if available; paid indexes (Ahrefs, Semrush, Moz) only as an option |
| GSC indexation ratio, Google Business Profile | Ask the owner; never invent these numbers |
Phase 2: analyze against the checklist
Read references/audit-checklist.md now. It contains the full 14-category checklist with thresholds, detection methods, and the reason behind each rule. Work through it in this order, skipping categories that do not apply to the site type:
| # | Category | Applies to |
|---|---|---|
| 1 | Method and framing | all |
| 2 | Tags (title, meta, headings, slugs) | all |
| 3 | Images | all |
| 4 | Performance and indexation | all |
| 5 | Architecture (one intent = one page) | all |
| 6 | Content volume | all |
| 7 | Keywords and intent | all |
| 8 | Blog and articles | sites with a blog (or that need one) |
| 9 | GEO (AI visibility) | all |
| 10 | Conversion signals | all |
| 11 | E-commerce (products, collections) | stores |
| 12 | Backlinks and mentions | all |
| 13 | Local (Google Business Profile) | local businesses |
| 14 | Migration risks | sites about to change domain, CMS or structure |
While analyzing:
- Classify each finding as OK, Fix, or Blocking. A Blocking finding (noindex on the whole site, client-rendered content, hacked pages) invalidates work on everything downstream; say so plainly.
- Count what can be counted ("23 of 31 images have no alt text"), because numbers make the report credible and actionable.
- Identify the 3 highest-impact fixes. Resist listing 40 equal-weight items: the owner will do nothing with that. The philosophy is that SEO compounds many small gains (roughly "1 percent per action", a field heuristic), but the report must still rank them.
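The classification and ranking rules above can be sketched as follows. This is a hedged illustration: the `Finding` structure and the 1-10 `impact` estimate are assumptions introduced for the example, and the checklist does not prescribe a numeric scale.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    category: str
    summary: str
    status: str   # "OK", "Fix" or "Blocking"
    impact: int   # hypothetical 1-10 estimate assigned by the auditor


def top_priorities(findings: list[Finding], limit: int = 3) -> list[Finding]:
    """Blocking findings first, then Fix findings by descending impact; OK is excluded."""
    actionable = [f for f in findings if f.status in ("Blocking", "Fix")]
    actionable.sort(key=lambda f: (f.status != "Blocking", -f.impact))
    return actionable[:limit]


# Illustrative findings only.
findings = [
    Finding("Images", "23 of 31 images have no alt text", "Fix", 4),
    Finding("Tags", "Title present and specific", "OK", 0),
    Finding("Performance and indexation", "noindex on the whole site", "Blocking", 10),
    Finding("Content volume", "Service page far shorter than competitors", "Fix", 7),
    Finding("Tags", "Heading hierarchy skips levels", "Fix", 3),
]
priorities = top_priorities(findings)
for f in priorities:
    print(f"[{f.status}] {f.category}: {f.summary}")
```

Sorting Blocking findings ahead of everything else reflects the rule that they invalidate downstream work, whatever impact the other items carry.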
Phase 3: the GEO layer
For the same pages, evaluate AI visibility specifically:
- Crawler access: from the script output, report blocked AI search bots (these remove the site from AI answers) separately from blocked training bots (a brand-visibility tradeoff, see the canonical table in seo-technical references/ai-crawlers.md).
- Rendering: if content only exists after JavaScript runs, ChatGPT, Claude and Perplexity cannot read it. This single finding outranks everything else in the GEO section.
- Citability: score the key pages against the 5-pillar rubric in geo-visibility, which covers answer-first passages, stats and quotes with sources, tables and lists, question headings, E-E-A-T signals, structured data, and descriptive slugs.
- Entity: search the brand name; check that name, one-line description and key facts match across the site, LinkedIn, review platforms and directories.
- Measurement baseline: note whether the site can even see its AI traffic today (GA4 channel setup), and point to geo-tracking.
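For the crawler-access check, a minimal sketch with the standard library's `urllib.robotparser` shows how search bots and training bots can be reported separately. The two bot lists are an assumed grouping for illustration only; the canonical table in seo-technical takes precedence. The function name `blocked_bots` is hypothetical, and the bundled script may work differently.

```python
from urllib.robotparser import RobotFileParser

# Assumed grouping for illustration; defer to the canonical table in seo-technical.
AI_SEARCH_BOTS = ["OAI-SearchBot", "ChatGPT-User", "PerplexityBot"]
AI_TRAINING_BOTS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended"]


def blocked_bots(robots_txt: str, site: str) -> dict:
    """List which AI bots may not fetch the homepage, split by bot type."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    homepage = site.rstrip("/") + "/"

    def blocked(bots):
        return [bot for bot in bots if not parser.can_fetch(bot, homepage)]

    return {"search": blocked(AI_SEARCH_BOTS), "training": blocked(AI_TRAINING_BOTS)}


# Illustrative robots.txt: one training bot and one search bot are blocked.
sample_robots = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""
report = blocked_bots(sample_robots, "https://example.com")
print(report)
```

In the report, a non-empty "search" list is the finding that removes the site from AI answers; a non-empty "training" list is the tradeoff to discuss with the owner.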
Phase 4: deliver
Read references/output-templates.md and pick the format:
- Template A, full audit report: for practitioners; verdict line, what is good, top 3 priorities, findings by category with OK / Fix / Blocking status, ordered action plan.
- Template B, plain-language email: for non-technical owners; no acronyms, every term explained in everyday words, progress acknowledged, honest verdict.
Rules for both: write in the language of the site, open with what is genuinely good (credibility, and most sites do several things right), keep the verdict honest even when it is "rebuild before investing in content", and state explicitly what was not verified and why.
Reassure before you criticize: lead with honest, numeric reference points ("this is already better than 90 percent of the sites I see", "I am very demanding on the score, above 80 percent is very good") so the owner trusts the findings instead of bracing for them. Explain the score as a ratio of errors to pages, where a template-level error repeats across all pages built on that template and one fix corrects it everywhere. A low number is therefore no reason to panic, and it should not be compared directly with another tool's score (field heuristic from 115+ agency audits).
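The template effect is easier to explain with numbers. The sketch below is an assumption-laden illustration: the exact scoring formula is not specified here, so it only shows the errors-per-page ratio and how many distinct fixes sit behind the raw error count. The `summarize` function and the sample data are hypothetical.

```python
from collections import Counter


def summarize(errors: list[dict], page_count: int) -> dict:
    """Relate the raw error count to pages and to distinct template-level fixes."""
    fixes = Counter((e["template"], e["issue"]) for e in errors)
    largest = fixes.most_common(1)
    return {
        "errors": len(errors),
        "errors_per_page": round(len(errors) / page_count, 2),
        "distinct_fixes": len(fixes),
        "largest_fix": largest[0] if largest else None,
    }


# Illustrative crawl: the same product-template error on 40 pages, plus two one-offs.
errors = [
    {"page": f"/products/{i}", "template": "product", "issue": "empty meta description"}
    for i in range(40)
]
errors += [
    {"page": "/", "template": "home", "issue": "generic title"},
    {"page": "/about", "template": "page", "issue": "missing H1"},
]
summary = summarize(errors, page_count=50)
print(summary)
```

Here 42 counted errors reduce to 3 distinct fixes, and a single change to the product template removes 40 of them, which is the point to make when the owner reacts to a low score.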
Handoffs after the audit
| Finding | Skill to apply the fix |
|---|---|
| Crawl, speed, indexation, JS rendering, AI bot access | seo-technical |
| Wrong or missing keywords, cannibalization | seo-keyword-research |
| Weak or thin articles | seo-content-blog |
| Product page issues | seo-content-product-page |
| Service pages missing or merged | seo-content-service-page |
| Category pages with no content | seo-content-collection-page |
| Orphan pages, weak internal links | seo-internal-linking |
| Missing or invalid structured data | seo-schema-markup |
| Weak link profile, no brand mentions | seo-backlinks |
| Local visibility, reviews, GBP | seo-local |
| Low AI citation rate | geo-visibility |
| No AI traffic measurement | geo-tracking |
Common mistakes
- Auditing only the homepage. Money pages and one article reveal more than the homepage alone.
- Reporting thresholds without competitors. The gap is the finding, not the absolute number.
- Treating GEO as a separate mystical discipline. Most GEO failures found in audits are SEO failures (not indexed, client-rendered, thin content) wearing a new name.
- Inventing data. If speed, backlinks or indexation were not measured, write "not verified" instead of guessing.
- Burying the verdict. The first line of the deliverable answers: is this site ready to grow, does it need a short fix list, or does it need a rebuild first.
Sources
- AI Mode versus top 10 organic overlap (32 percent URL overlap): https://www.semrush.com/blog/ai-mode-comparison-study/
- AI crawlers do not execute JavaScript (500M+ fetch analysis): https://vercel.com/blog/the-rise-of-the-ai-crawler
- Why ChatGPT cites pages (1.4M prompt study): https://ahrefs.com/blog/why-chatgpt-cites-pages/
- Generative Engine Optimization (controlled study, KDD 2024): https://arxiv.org/abs/2311.09735
- Google guidance on AI features and structured data: https://developers.google.com/search/docs/appearance/ai-features