How to Do Technical SEO With AI: My Step-by-Step Workflow (2026)
The exact way I run technical SEO with AI now — feeding crawl data to ChatGPT and Claude for prioritization, generating and validating schema, fixing robots.txt and redirects, and making a site AI-crawler ready. With copy-paste prompts and real screenshots.
Here's how I actually use AI for technical SEO in 2026 — not 'AI will replace SEO,' but a concrete, repeatable loop. (1) Crawl the site with a real crawler, then have AI summarize what's broken. (2) Feed the crawl export to ChatGPT or Claude and ask it to rank every issue by impact × effort so you fix the right things first. (3) Use AI to draft the tedious stuff — JSON-LD schema, robots.txt rules, redirect regex, hreflang — in seconds. (4) Validate every single output (Rich Results Test for schema, a redirect tester for rules); AI hallucinates, and early AI audit setups throw ~15% false positives, so nothing ships unchecked. (5) Make the site AI-crawler ready: almost no AI bot (GPTBot, PerplexityBot, ClaudeBot) renders JavaScript — Google's is the exception — so server-render anything you want cited, and allow the AI user-agents in robots.txt. (6) Re-crawl on a schedule so regressions surface fast. The post includes the exact copy-paste prompts I use at each step, an AI-vs-human responsibility split, an AI-crawler readiness table, and an impact × effort prioritization matrix. The headline: AI removes 80% of the manual grind in a technical audit, but you stay the editor — it reads, sorts, and drafts; you decide and verify.
This workflow leans on one thing: real crawl data for AI to reason over. That's the half we automated: CrawlRaven runs a 200-point technical crawl and hands you a scored, prioritized issue list, so you can spend your AI time on judgment instead of spreadsheet wrangling. Starts at $9/month, with a 14-day free trial.
A year ago, a technical SEO audit meant me, a giant Screaming Frog export, three coffees, and most of a day squinting at spreadsheets. Today I do the same work in about an hour — and it's more thorough. The difference isn't a magic tool; it's a workflow where AI does the reading, sorting, and drafting, and I do the deciding and verifying.
I want to be clear about what this post is and isn't. It is not “paste your URL into ChatGPT and let it run your SEO.” That's how you ship hallucinated schema and a robots.txt that blocks Googlebot. This is the actual step-by-step loop I run, with the exact prompts I use, where AI genuinely saves hours — and the guardrails that stop it from quietly breaking your site. Here's the whole thing at a glance:
My 6-step AI technical SEO loop
- Step 1, Crawl: Pull a full technical crawl, then let AI summarize what's broken.
- Step 2, Prioritize: Feed the crawl export in and get an impact × effort ranking.
- Step 3, Draft: Schema, robots.txt, redirect regex, hreflang, all drafted in seconds.
- Step 4, Validate: Never ship AI output unchecked. Test schema, diff redirects.
- Step 5, AI-crawler readiness: Allow AI bots, server-render content, confirm it's actually readable.
- Step 6, Loop: Re-crawl on a schedule so regressions surface the week they happen.
First, get the expectations right
Every disaster I've seen with “AI SEO” comes from the same mistake: treating the model as an oracle instead of an analyst. AI is phenomenal at the boring, high-volume parts of technical SEO and genuinely bad at the parts that decide whether your work matters. Before any prompts, internalize this split:
What AI does well vs. what still needs you
AI does well:
- Summarizing 10k-row crawl exports into plain English
- Drafting JSON-LD schema, robots.txt, and redirect regex
- Spotting patterns across huge log files
- Ranking issues by impact × effort in seconds
- Explaining technical findings to non-SEO stakeholders
Still needs you:
- Deciding which fixes actually move the business
- Catching AI hallucinations before they ship
- Judging intent, relevance, and brand context
- Verifying schema, redirects, and crawl rules really work
- Owning the call when AI and the data disagree
Rule of thumb: let AI do the reading, sorting, and drafting — then verify every output yourself. Early AI audit setups throw roughly 15% false positives; treating AI as a tireless junior analyst, not an oracle, is what keeps that from biting you.
Modern AI-assisted audits catch a far higher share of issues than a manual sampling pass ever did — but the flip side is noise. Early setups routinely surface false positives, so the model's job is to hand you a shortlist, and your job is to confirm each item is real before anyone touches code. Keep that frame and everything below works. Lose it and AI will confidently waste your week.
The stack I actually use
You don't need a new platform for this. My entire AI technical SEO setup is four things:
- CrawlRaven's MCP, connected to your LLM — this is the engine. With the CrawlRaven MCP wired into your AI client, your model can run a full 200-point audit and read the structured results itself, right in the chat. (A standalone crawler like Screaming Frog works too — but then you're back to manual CSV exports.)
- A reasoning model — ChatGPT (GPT-5-class) or Claude — for analysis, prioritization, and drafting. I use both and cross-check on anything important.
- Google Search Console for real index and performance data the crawler can't see.
- Validators — Google's Rich Results Test, a redirect checker, a robots.txt tester — because nothing AI writes ships unverified.
The whole workflow below runs on one piece of setup: connect CrawlRaven's MCP server to your LLM — Claude, Cursor, ChatGPT, or any MCP-capable client. Once it's connected, you simply tell your AI “run a CrawlRaven audit on example.com” and the model pulls the real, structured crawl data straight into the chat — no exports, no copy-paste. That is exactly how I generated the signwith.co audit in Step 1: I asked, and the MCP delivered it. MCP access is included on CrawlRaven's Pro and Max plans — set up CrawlRaven, connect the MCP to your LLM, and you can replicate every step in this guide.
Step 1: Crawl first, then let AI read the crawl
AI can't audit what it can't see, and — this trips up a lot of people — chat models don't actually crawl your whole site when you paste a URL. So I start with real data: a full crawl that gives me status codes, titles, meta, canonicals, indexability, word counts, and response times for every URL. To make this concrete, I asked my LLM to run a CrawlRaven audit — through its MCP — on a live site, signwith.co. It pulled the data back into the chat in seconds, and that real audit is the ground truth the rest of this workflow runs on:
The audit returned a score of 92/A with zero critical issues and seven warnings, a very typical “healthy but improvable” result: weak internal links, missing schema types, render-blocking resources, and soft E-E-A-T signals. That is exactly the kind of nuanced list that's perfect to hand to AI. Export it to CSV (or copy the issue table).
Then I hand the export to the model and ask for a plain-English read before I do anything clever with it. This first pass turns 8,000 rows into a paragraph I can actually reason about:
You are a senior technical SEO. I'm attaching a CSV export from a site crawl (columns: URL, Status Code, Indexability, Title, Meta Description, Canonical, Word Count, Response Time).
Summarize the technical health of this site in plain English:
1. The 5 most serious issues, by how many URLs each affects.
2. Any patterns (e.g. a section returning 404s, canonical mismatches, thin pages).
3. Anything that looks like it could deindex pages or waste crawl budget.
Be specific and cite example URLs. Don't suggest fixes yet; just diagnose.
Notice I tell it not to jump to fixes. I want a clean diagnosis first; prioritization and solutions come next, deliberately separated so the model doesn't bury a critical noindex under twelve cosmetic alt-text nits.
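Before the export goes anywhere near a model, I find it useful to have a deterministic tally to compare the AI's summary against. Here's a minimal Python sketch, assuming the CSV uses the column names from the prompt above and Screaming Frog-style "Indexable" / "Non-Indexable" values. The 200-word thin-page threshold and the issue labels are illustrative assumptions, not anything a particular crawler defines.

```python
import csv
import io
from collections import Counter

THIN_WORDS = 200  # hypothetical threshold for a "thin" page


def summarize_crawl(csv_text: str) -> Counter:
    """Count how many URLs fall under each basic issue type."""
    issues = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        status = row["Status Code"].strip()
        if status != "200":
            # Redirects and errors are counted by status code only.
            issues[f"HTTP {status}"] += 1
            continue
        if row["Indexability"].strip().lower() == "non-indexable":
            issues["Non-indexable 200 page"] += 1
        if not row["Title"].strip():
            issues["Missing title"] += 1
        canonical = row["Canonical"].strip()
        if canonical and canonical != row["URL"].strip():
            issues["Canonical points elsewhere"] += 1
        words = row["Word Count"].strip()
        if words.isdigit() and int(words) < THIN_WORDS:
            issues["Thin page"] += 1
    return issues
```

If the model's "5 most serious issues" disagree badly with these counts, that is a cue to re-check the export or the prompt before trusting the diagnosis.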
Step 2: Make AI prioritize by impact × effort
This is where AI earns its keep. The hardest part of any audit was never finding issues — it was deciding which 20 of 500 issues to fix this sprint. I make the model do that scoring, then sanity-check it. Same export, new prompt:
Using the same crawl data, build a prioritized action plan. For every distinct issue type, give me a table with:
- Issue
- # of URLs affected
- Impact on rankings/indexing (1–5, with a one-line reason)
- Implementation effort (1–5)
- Quadrant: Quick Win / Major Project / Nice-to-have / Deprioritize
Sort so the highest-impact, lowest-effort items are at the top. Flag anything that could remove pages from Google's index as CRITICAL, regardless of effort.

The output maps straight onto the matrix I work from. I start top-left and move right, and I refuse to let anyone touch the bottom-right box until everything else is done:
The impact × effort matrix AI fills in for you
Quick Wins (high impact, low effort):
- Fix noindex tags on pages that should be indexed
- Unblock AI bots in robots.txt
- Add missing canonical tags
Major Projects (high impact, high effort):
- Server-render JS content
- Rebuild internal link structure
- Migrate to clean URL architecture
Nice-to-have (low impact, low effort):
- Tidy alt text
- Compress a few images
- Minor meta-description rewrites
Deprioritize (low impact, high effort):
- Chasing a perfect Lighthouse 100
- Reworking pages with no traffic
- Cosmetic schema on dead pages
How I use it: I paste my crawl export and ask the model to score every issue on impact (1–5) and effort (1–5), then drop each into one of these four boxes. Start top-left and work right. The bottom-right box is where SEO time goes to die.
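The quadrant logic itself is simple enough to write down, which makes it easy to sanity-check the model's table. A rough sketch follows. The `Issue` class, the cutoffs (impact of 3 or more counts as high, effort below 3 counts as low), and the sort order are my assumptions for illustration, so tune them to your own scale.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    name: str
    urls_affected: int
    impact: int   # 1-5, higher means more ranking/indexing impact
    effort: int   # 1-5, higher means more work to implement
    can_deindex: bool = False  # anything that can drop pages from the index


def quadrant(issue: Issue) -> str:
    """Map an issue to one of the four boxes (assumed cutoffs)."""
    high_impact = issue.impact >= 3
    low_effort = issue.effort < 3
    if high_impact and low_effort:
        return "Quick Win"
    if high_impact:
        return "Major Project"
    if low_effort:
        return "Nice-to-have"
    return "Deprioritize"


def prioritize(issues: list) -> list:
    """Critical items first, then highest impact, then lowest effort."""
    ordered = sorted(issues, key=lambda i: (not i.can_deindex, -i.impact, i.effort))
    return [(i.name, quadrant(i), i.can_deindex) for i in ordered]
```

Running the model's scores through something like this shows quickly whether its quadrant labels are consistent with its own numbers.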
Step 3: Let AI draft the tedious fixes
Schema markup, robots.txt rules, redirect regex, hreflang clusters — this is finicky, error-prone, copy-paste work that AI is genuinely great at drafting. I'll spend 30 seconds on a prompt instead of 30 minutes hand-writing JSON-LD. Here's the schema one I reach for constantly:
Generate valid schema.org JSON-LD for this page. I'll paste the content below.
Requirements:
- Use the most appropriate type(s) (e.g. Article, Product, FAQPage, BreadcrumbList).
- Only include properties you can fill from the content I give you. Never invent ratings, prices, or dates.
- Output a single <script type="application/ld+json"> block, ready to paste.
- After the code, list any properties I should add manually and why.
Page content:
"""
[paste the page's visible content, headings, author, publish date here]
"""
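For a sense of what a correct answer looks like, here is a small Python sketch that builds an Article block the same way the prompt asks the model to: only from values you actually have. The function name and parameters are hypothetical, and the property set is deliberately minimal; it is a reference shape to compare AI output against, not a complete schema generator.

```python
import json


def article_jsonld(headline, author=None, date_published=None, image=None):
    """Build an Article JSON-LD script block from known values only."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author} if author else None,
        "datePublished": date_published,
        "image": image,
    }
    # Drop anything we were not given instead of inventing a value.
    data = {key: value for key, value in data.items() if value is not None}
    body = json.dumps(data, indent=2)
    # Escape "</" so page text can never close the script tag early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(article_jsonld("How to Do Technical SEO With AI", author="Ayush"))
```

Note that no `datePublished` appears in the output when none is passed in, which is exactly the behavior the "never invent" constraint is meant to get from the model.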

The same pattern handles the other grunt work. A couple I use weekly — note how specific the constraints are, because vague prompts are where AI invents things:
Two tasks, be precise:
1. Here's my current robots.txt: [paste]. I want to (a) keep Googlebot and Bingbot fully allowed, (b) allow the AI search bots OAI-SearchBot, PerplexityBot and Google-Extended, and (c) block /cart/, /checkout/, and any URL with a "?sort=" parameter. Rewrite it and explain each line.
2. Write an Nginx 301 redirect rule that maps every URL under /old-blog/<slug> to /blog/<slug>, preserving the slug. Then give me 3 example before/after URLs so I can test it.
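To check the redirect half of that prompt before anything touches a server, I like to mirror the mapping in a few lines of Python and run the before/after URLs through it. This is a sketch under assumptions: the pattern below is my reading of "/old-blog/<slug> to /blog/<slug>" (single-segment slugs, optional trailing slash), and the Nginx line in the comment is an untested equivalent, not output from any tool.

```python
import re
from typing import Optional

# Roughly equivalent Nginx rule (assumption, verify before deploying):
#   rewrite ^/old-blog/([^/]+)/?$ /blog/$1 permanent;
OLD_BLOG = re.compile(r"^/old-blog/(?P<slug>[^/?#]+)/?$")


def redirect_target(path: str) -> Optional[str]:
    """Return the new path for an old blog URL, or None if no rule applies."""
    match = OLD_BLOG.match(path)
    if match is None:
        return None
    return f"/blog/{match.group('slug')}"


for before in ["/old-blog/technical-seo", "/old-blog/ai-crawlers/", "/blog/already-new"]:
    print(before, "->", redirect_target(before))
```

If the model's three example URLs don't map the same way here, either its regex or its examples are wrong, and both are worth knowing before deploy.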
Step 4: Validate everything (non-negotiable)
This is the step people skip, and it's the step that separates “AI saved me hours” from “AI deindexed my blog.” I treat every AI output as a draft from a fast, confident intern who occasionally makes things up. Concretely, before anything ships:
- Schema → paste into Google's Rich Results Test and the Schema Markup Validator. If it doesn't validate, it doesn't go live.
- Redirects → run the example URLs through a redirect checker and confirm single-hop 301s, no loops.
- robots.txt → test key URLs in a robots tester so you didn't just block a section you meant to keep.
- Any factual claim the model made about your site → confirm it against the crawl or GSC. AI will occasionally “remember” a problem that isn't there.
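The robots.txt check in particular is easy to script. Below is a minimal sketch using Python's standard-library `urllib.robotparser`. The sample file and URLs are made up for illustration, and one caveat matters: this parser does plain prefix matching, so wildcard rules (such as a "?sort=" pattern) need a different tester.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical AI-drafted robots.txt with an accidental blanket block.
ROBOTS = """\
User-agent: *
Disallow: /cart/
Disallow: /checkout/

User-agent: GPTBot
Disallow: /
"""


def blocked(robots_txt: str, user_agent: str, urls: list) -> list:
    """Return the URLs this user-agent is NOT allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


key_urls = ["https://example.com/blog/post", "https://example.com/cart/"]
print("Googlebot blocked:", blocked(ROBOTS, "Googlebot", key_urls))
print("GPTBot blocked:", blocked(ROBOTS, "GPTBot", key_urls))
```

Here Googlebot is only kept out of the cart, but GPTBot is shut out of the blog too, which is the kind of mistake this step exists to catch.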

Step 5: Make the site AI-crawler ready
In 2026, technical SEO isn't just for Googlebot; it's also for the AI engines that increasingly answer users' queries directly. And they behave differently from Google in one way that wrecks a lot of sites: almost none of them render JavaScript. If your content, links, or schema only appear after JS runs, most AI crawlers see a blank page.
Which AI bots render JavaScript?
| User-agent | Owner | What it does | Renders JS? | Obeys robots.txt? |
|---|---|---|---|---|
| GPTBot | OpenAI | Trains ChatGPT models | ✗ No | ✓ Yes |
| OAI-SearchBot | OpenAI | ChatGPT Search index | ✗ No | ✓ Yes |
| ChatGPT-User | OpenAI | Live fetch on user prompt | ✗ No | ✓ Yes |
| PerplexityBot | Perplexity | Perplexity answers index | ✗ No | ✓ Yes |
| ClaudeBot | Anthropic | Trains / fetches for Claude | ✗ No | ✓ Yes |
| Google-Extended | Google | Gemini / AI Overviews | ✓ Yes | ✓ Yes |
The big takeaway: almost no AI crawler executes JavaScript — Google's is the notable exception, because it rides on Googlebot's rendering. If your content or schema only appears after JS runs, most AI engines simply never see it. Server-render anything you want cited.
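A quick way to approximate what a non-rendering crawler sees is to take the raw HTML response (view source, not the inspected DOM) and look for the phrases you care about outside of script tags. The sketch below uses only the standard library; the function name and sample markup are illustrative, and it checks visible text only, so JSON-LD inside script tags would need a separate check.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside script or style tags."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def missing_from_raw_html(raw_html: str, phrases: list) -> list:
    """Return the phrases that do not appear in the server-sent text."""
    extractor = _TextExtractor()
    extractor.feed(raw_html)
    extractor.close()
    text = " ".join(" ".join(extractor.chunks).split()).lower()
    return [phrase for phrase in phrases if phrase.lower() not in text]
```

Any phrase this returns is content that only exists after JavaScript runs, so most of the bots in the table above would never read it.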
So my AI-readiness pass is three checks: (1) confirm your important content is in the server-rendered HTML, not injected client-side — view source, don't just inspect; (2) make sure robots.txt actually allows the AI user-agents you want citing you; and (3) read your server logs to see which AI bots are really hitting you and what they get. That last one is a perfect AI task — logs are huge and pattern-dense:
I'm pasting a sample of my server access logs. Analyze AI/search crawler behavior:
1. List every bot user-agent you see (focus on GPTBot, OAI-SearchBot, ChatGPT-User, PerplexityBot, ClaudeBot, Google-Extended, Googlebot, Bingbot) and how many requests each made.
2. Which status codes are these bots receiving? Flag any non-200s they hit.
3. Are any of them being served redirects or errors on key pages?
4. Summarize: is anything blocking these bots from reaching my main content?
Logs:
"""
[paste a few hundred log lines]
"""
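Since a few hundred pasted lines are only a sample, I also like a local tally over the full log to check the model's counts against. A rough sketch, assuming the common "combined" access log format. The regex and bot list are my own. Google-Extended is left out on purpose: as far as I understand Google's crawler documentation, it is a robots.txt control token rather than a user-agent string that shows up in requests, so treat that omission as an assumption to verify.

```python
import re
from collections import Counter

BOTS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User", "PerplexityBot",
        "ClaudeBot", "Googlebot", "Bingbot")

# Matches: "METHOD /path HTTP/x" status bytes "referer" "user-agent"
LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def tally_bots(log_lines) -> Counter:
    """Count requests per (bot, status code) for the known crawlers."""
    counts = Counter()
    for line in log_lines:
        match = LINE.search(line)
        if match is None:
            continue  # not in the assumed log format
        user_agent = match.group("ua").lower()
        for bot in BOTS:
            if bot.lower() in user_agent:
                counts[(bot, match.group("status"))] += 1
                break
    return counts
```

Any `(bot, status)` pair with a non-200 status is the first thing I'd look at, because it usually means a firewall rule, a redirect, or an error page is standing between that crawler and the content.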
Server logs are the only place you see the AI crawlers by name; Search Console doesn't break them out yet. What GSC does give you is a clean view of Googlebot's crawl activity, which is still worth watching (and is the closest most people get without log access).

Step 6: Put it on a loop
A one-time audit is a snapshot; sites rot continuously. The real unlock with AI is that re-running this loop is cheap, so I schedule it instead of waiting for a quarterly panic. A fresh crawl on a cadence, the same Step 1–2 prompts pointed at the new export, and a quick “what changed since last time” diff catches regressions the week they happen — a botched deploy that adds noindex, a new redirect chain, schema that broke in a template update.
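The "what changed since last time" diff doesn't need AI at all, and doing it in code gives the model a much smaller, cleaner input. A minimal sketch, assuming each crawl has been reduced to a mapping of URL to status code and indexability. That data shape and the function name are hypothetical, not a CrawlRaven or Screaming Frog format.

```python
def crawl_regressions(previous: dict, current: dict) -> list:
    """Compare two crawls and list URLs that got worse.

    Each crawl maps URL -> {"status": int, "indexable": bool}.
    """
    findings = []
    for url, before in previous.items():
        after = current.get(url)
        if after is None:
            findings.append((url, "missing from new crawl"))
        elif before["status"] == 200 and after["status"] != 200:
            findings.append((url, f"status 200 -> {after['status']}"))
        elif before["indexable"] and not after["indexable"]:
            findings.append((url, "became non-indexable"))
    return findings
```

Feeding only these findings into the Step 1 and Step 2 prompts keeps the model focused on regressions instead of re-summarizing the whole site every run.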
This is honestly why we built CrawlRaven the way we did: continuous 200-point crawls with the prioritization already done, so the “crawl and rank the issues” half of this workflow runs on autopilot and I can spend my AI time on the judgment calls. However you do it, the principle holds — technical SEO is a loop now, not a project.

The guardrails that keep this safe
If you take one thing from this post, make it this list. These are the rules I never break, learned partly from watching them get broken:
- Real data in, or garbage out. Always feed AI an actual crawl, GSC export, or logs — never let it guess about your site from the URL.
- Diagnose, prioritize, and fix in separate prompts. Mixing them lets critical issues hide behind cosmetic ones.
- Validate every output. Schema, redirects, robots — test before deploy, every time.
- Constrain the prompt. “Never invent ratings or dates” and “only use data I gave you” prevent most hallucinations.
- You make the final call. AI ranks and drafts; you decide what matters to the business and own the result.
Key Takeaways
- AI is the analyst, not the boss: It reads crawls, ranks issues, and drafts fixes brilliantly. It does not decide what matters or whether output is correct. You do.
- Always start from a real crawl: Chat models don't crawl your whole site from a URL. Feed them an actual export, GSC data, or logs for ground truth.
- Separate diagnose / prioritize / fix: Three distinct prompts. Mixing them lets a critical noindex hide behind cosmetic alt-text nits.
- Validate everything before deploy: Rich Results Test for schema, a redirect checker for rules, a robots tester for blocks. AI hallucinates, and early AI audit setups throw roughly 15% false positives.
- Most AI bots don't render JS: GPTBot, PerplexityBot, ClaudeBot read raw HTML. Server-render anything you want cited, and allow the AI user-agents in robots.txt.
- Make it a loop: Re-running the workflow is cheap with AI. Schedule it so regressions surface the week they happen, not next quarter.
Frequently asked questions
Can AI do a technical SEO audit on its own?
Not reliably on its own. Chat models don't fully crawl your site from a URL, and they hallucinate issues and fixes. The workflow that works is: run a real crawler (or an audit tool) to get ground-truth data, then use AI to summarize it, prioritize issues by impact and effort, and draft fixes like schema and redirects. You still validate every output and make the final calls. AI removes most of the manual grunt work; it does not replace the SEO.
What AI tools are best for technical SEO?
You need three categories, not one tool: a real crawler for data (Screaming Frog, or an audit platform like CrawlRaven for 200+ checks with prioritization built in), a reasoning model for analysis and drafting (ChatGPT or Claude — using both and cross-checking is ideal), and validators (Google's Rich Results Test, a redirect checker, a robots.txt tester). Google Search Console rounds it out with real index and performance data.
Is it safe to use AI-generated schema markup?
Yes, if you validate it. AI is excellent at drafting JSON-LD quickly, but it can invent properties like ratings, prices, or dates that aren't true, and markup that misrepresents the page can earn a structured-data manual action from Google. Always constrain the prompt ('only use data I give you, never invent values'), then paste the output into Google's Rich Results Test and the Schema Markup Validator before it goes live. If it doesn't validate, it doesn't ship.
Do AI crawlers like GPTBot render JavaScript?
Mostly no. GPTBot, OAI-SearchBot, PerplexityBot, and ClaudeBot read the raw server-rendered HTML and generally do not execute JavaScript. Google's crawler is the main exception because it uses Googlebot's rendering service. The practical implication: if your content, internal links, or schema only appear after JavaScript runs, most AI engines never see them. Server-render anything you want AI search to read and cite.
What are the risks of using AI for technical SEO?
The main risks are hallucinated fixes (invented schema values, wrong redirect rules), false-positive issues (early AI audit setups flag a meaningful share of non-problems), and over-trust that leads to deploying changes that block crawlers or deindex pages. Every risk is mitigated by the same discipline: feed AI real data, keep diagnosis/prioritization/fixing in separate prompts, validate all output before deploy, and keep a human making the final decision.
How often should I run an AI technical SEO audit?
Treat it as a loop, not a one-off. Because re-running the workflow is cheap with AI, schedule a fresh crawl monthly (or after any significant deploy or migration) and re-run the summarize-and-prioritize prompts on the new export. A quick 'what changed since last time' diff catches regressions — accidental noindex tags, new redirect chains, broken schema — the week they happen instead of the next quarter.
Co-founder, CrawlRaven · 6+ years building SaaS content & SEO products
Ayush has 6+ years of experience building SaaS products and content strategies in the SEO space. As co-founder of CrawlRaven, he writes from hands-on experience building deep-crawl audit tools and solving the technical SEO problems agencies actually face.