Technical SEO | 22 min read

How to Perform a Technical SEO Audit in 2026 (Step-by-Step Guide)

The complete 8-step guide to performing a technical SEO audit. Covers crawlability, indexation, Core Web Vitals, structured data, security, and how to prioritize fixes by ranking impact.

Aditi Chaturvedi | January 8, 2026 | Updated April 5, 2026

TL;DR

A technical SEO audit follows 8 steps: crawl your site, check crawlability and indexation, analyze on-page elements, assess Core Web Vitals, audit links, validate structured data, audit security and HTTPS, and prioritize fixes by impact. CrawlRaven automates all 200+ checks and prioritizes issues automatically.

This guide walks through every step of a technical SEO audit. If you want to automate the entire process, CrawlRaven runs all 200+ checks in minutes and prioritizes fixes by ranking impact.

Technical SEO Audit: 8-Step Flow

Follow sequentially for a comprehensive audit

1. Crawl Your Website: Discover every page, status code, and error with a full-site crawl.
2. Check Crawlability: Verify robots.txt, sitemaps, canonicals, and noindex directives.
3. Analyze On-Page Elements: Audit title tags, meta descriptions, H1s, and content quality.
4. Assess Core Web Vitals: Target LCP ≤ 2.5s, CLS ≤ 0.1, INP ≤ 200ms for every page.
5. Audit Links: Find broken links, redirect chains, and orphan pages.
6. Validate Structured Data: Check JSON-LD schemas for rich result eligibility.
7. Audit Security and HTTPS: Check certificates, mixed content, redirects, and security headers.
8. Prioritize & Act: Sort by impact and effort, and fix critical issues first.

CrawlRaven automates steps 1–7 and prioritizes step 8 by estimated SEO impact.

I run technical SEO audits almost weekly — for client sites, for our own properties, and sometimes just to stress-test a new crawler. The process boils down to checking seven things: can search engines actually reach your pages, are they choosing to index them, do the on-page signals make sense, is performance solid, are links healthy, is structured data valid, and is HTTPS configured right. That's it. Seven areas, done properly, and you've covered what matters.

Manually, a full audit takes me somewhere between 2 and 8 hours depending on site size. With an automated crawler like CrawlRaven, I can get through the bulk of it in under 10 minutes — though I still spot-check the output by hand.

What follows is the exact process I use. Every step includes specific pass/fail thresholds, not vague advice. The same methodology works whether you're auditing a 50-page brochure site or a 100,000+ page e-commerce catalog.

Step 1: Crawl your website — discover every page and error

Everything starts with the crawl. You need a complete map of your site — every URL, every status code, every redirect and metadata tag — before you can diagnose anything. Think of it like an X-ray: you can't fix what you can't see.

Fire up CrawlRaven, Screaming Frog, or Sitebulb, point it at your homepage, and let it follow internal links. That's how Googlebot works too — Google's crawler documentation is pretty clear that link-following is the primary discovery mechanism. Your audit crawler should behave the same way.

Here's what I look for in the crawl output:

HTTP status codes — 200 (OK), 301/302 (redirects), 404 (not found), 500 (server error). Any non-200 page needs investigation.
Title tags and meta descriptions — check for missing, duplicate, or truncated values across every page.
H1 tags and header hierarchy — verify each page has exactly one H1 and uses H2–H4 in logical order.
Canonical URLs — ensure every page declares the correct canonical to avoid duplicate content signals.
Noindex/nofollow directives — identify pages accidentally blocked from indexation.
Internal link graph — map which pages link to which, and find orphan pages with zero inbound links.
Page load time and response size — flag pages with TTFB over 600ms or response bodies over 3MB.
Image data — file sizes, missing alt text, uncompressed formats (BMP, TIFF).
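To make the pass/fail idea concrete, here's a minimal sketch of how I'd flag rows from a crawl export. The CrawlRow fields and the flag_row helper are hypothetical names for illustration, not any crawler's real export format; the 600ms and 3MB limits come from the checklist above.

```python
from dataclasses import dataclass

# Thresholds from the checklist above; the field names are hypothetical.
TTFB_LIMIT_MS = 600
SIZE_LIMIT_BYTES = 3 * 1024 * 1024  # 3MB


@dataclass
class CrawlRow:
    url: str
    status: int
    ttfb_ms: float
    size_bytes: int


def flag_row(row: CrawlRow) -> list[str]:
    """Return the audit flags raised by one crawled URL."""
    flags = []
    if row.status != 200:
        flags.append(f"status {row.status}")
    if row.ttfb_ms > TTFB_LIMIT_MS:
        flags.append("slow TTFB")
    if row.size_bytes > SIZE_LIMIT_BYTES:
        flags.append("large response")
    return flags


rows = [
    CrawlRow("https://example.com/", 200, 180, 90_000),
    CrawlRow("https://example.com/old", 301, 120, 0),
    CrawlRow("https://example.com/catalog", 200, 950, 4_500_000),
]

# Keep only the URLs that raised at least one flag.
report = {row.url: flag_row(row) for row in rows if flag_row(row)}
```

Anything that lands in report goes on the investigation list; clean URLs drop out.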

One thing that trips people up: if your site uses a JavaScript framework (React, Angular, Vue, Next.js), you absolutely need to enable JS rendering in the crawler. I've seen audits miss half a site's pages because the crawler only looked at raw HTML and never executed the client-side code. All that dynamically loaded navigation and content? Invisible without rendering.

Crawl configuration tips

Set a reasonable crawl rate — 3–5 URLs per second for most sites. Higher rates can trigger rate limiting or server strain.
Respect robots.txt during the crawl — this shows you what Googlebot actually sees, not what your site theoretically contains.
Don't forget subdomains: if your site uses blog.example.com, app.example.com, or cdn.example.com, run a separate crawl for each so they're covered by the audit.
Set a crawl depth limit — for initial audits, 10–15 levels deep is usually sufficient. Pages deeper than 5 clicks from the homepage rarely rank well.

Step 2: Check crawlability and indexation — unblock your important pages

Two different questions here, and people conflate them all the time. Crawlability: can Google physically reach the page? Indexation: does Google choose to put it in search results? You can have a perfectly crawlable page that never gets indexed, and you can have pages Google wants to index but can't crawl. I've seen a single misplaced robots.txt rule quietly nuke an entire product catalog from the index — nobody noticed for three weeks.

Robots.txt audit

The robots.txt file is the first thing Googlebot reads before crawling anything on your domain — the Google robots.txt specification makes this explicit. So if there's something wrong with it, everything downstream is affected.

Verify the file exists at yourdomain.com/robots.txt and returns a 200 status code.
Check for overly broad Disallow rules — a common mistake is Disallow: / on staging environments that accidentally reaches production.
Ensure CSS and JavaScript files are not blocked — Google needs these to render your pages correctly.
Verify the Sitemap: directive points to your current XML sitemap URL.
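The checks above can be roughed out with Python's standard-library robots.txt parser. This is a sketch under assumptions: the robots.txt content and the must_be_crawlable list are made up, and urllib.robotparser does simple prefix matching that may not mirror Google's handling of wildcards, so treat the result as a first pass rather than the final word.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /assets/js/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# URLs that should stay reachable: a key page and a script needed for rendering.
must_be_crawlable = [
    "https://example.com/products/widget",
    "https://example.com/assets/js/app.js",
]

blocked = [u for u in must_be_crawlable if not parser.can_fetch("Googlebot", u)]
sitemaps = parser.site_maps()  # Sitemap: directives, or None if absent
```

Here the JavaScript file ends up in blocked, which is exactly the kind of rendering problem the third check is meant to catch.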

XML sitemap audit

Your XML sitemap is basically a priority list you hand to search engines — "here are my important pages, and here's when I last touched them." The problem is most sitemaps are generated once and then forgotten.

Confirm the sitemap is submitted in Google Search Console and returns a 200 status code.
Check that every important page is included — compare sitemap URLs against your crawl data.
Remove non-indexable pages from the sitemap (redirects, 404s, noindex pages, canonicalized pages).
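Verify lastmod dates are accurate. Google treats them as a crawl scheduling hint only when they are consistently trustworthy, so a sitemap that stamps every URL with today's date is likely to have its dates ignored.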
For large sites, use a sitemap index file with sub-sitemaps of no more than 50,000 URLs each.
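Comparing the sitemap against crawl data is easy to script. The sketch below assumes a tiny inline sitemap and a hypothetical crawl dictionary mapping URL to status code; in practice both would come from your own exports.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS_PER_SITEMAP = 50_000

# Made-up sitemap for illustration.
SITEMAP_XML = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2026-03-01</lastmod></url>
  <url><loc>https://example.com/old-page</loc></url>
</urlset>"""


def sitemap_urls(xml_text: str) -> set[str]:
    """Extract every <loc> value from a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)}


# Hypothetical crawl results: URL -> HTTP status code.
crawl = {
    "https://example.com/": 200,
    "https://example.com/old-page": 301,
    "https://example.com/pricing": 200,
}

in_sitemap = sitemap_urls(SITEMAP_XML)
# Sitemap entries that did not return 200 in the crawl (or were never crawled).
non_200_in_sitemap = sorted(u for u in in_sitemap if crawl.get(u) != 200)
# Healthy crawled pages the sitemap forgot.
missing_from_sitemap = sorted(
    u for u, status in crawl.items() if status == 200 and u not in in_sitemap
)
over_limit = len(in_sitemap) > MAX_URLS_PER_SITEMAP
```

The same comparison extends to noindex and canonicalized pages if your crawl export carries those columns.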

Canonical and noindex audit

Self-referencing canonicals: Every indexable page should have a canonical tag pointing to itself. Missing self-referencing canonicals leave Google guessing.
Cross-domain canonicals: If you syndicate content, verify cross-domain canonicals point to the original version.
Conflicting signals: A page with both a noindex tag and a canonical pointing to it sends mixed signals. Resolve the conflict — either index it or remove the canonical reference.
Pagination canonicals: Paginated pages (page 2, page 3, etc.) should canonical to themselves, not to page 1 — unless you're using a view-all page.
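Here's a rough sketch of how the canonical and noindex checks could be automated with the standard-library HTML parser. The HeadSignals class and audit_canonicals function are names I made up, the URL comparison is a naive string match (no trailing-slash or case normalization), and it only looks at meta robots tags, not X-Robots-Tag headers.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect the canonical URL and noindex flag from one page's HTML."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = a.get("href")
        if tag == "meta" and (a.get("name") or "").lower() == "robots":
            self.noindex = "noindex" in (a.get("content") or "").lower()


def audit_canonicals(pages: dict[str, str]) -> dict[str, str]:
    """Classify each page; pages maps URL -> raw HTML."""
    signals = {}
    for url, html in pages.items():
        parser = HeadSignals()
        parser.feed(html)
        signals[url] = parser

    findings = {}
    for url, s in signals.items():
        if s.canonical is None:
            findings[url] = "missing canonical"
        elif s.canonical == url:
            findings[url] = "self-referencing"
        else:
            findings[url] = f"canonicalized to {s.canonical}"
        # Mixed signal: the canonical target is itself marked noindex.
        target = signals.get(s.canonical)
        if target is not None and target.noindex:
            findings[url] = f"conflict: canonical target {s.canonical} is noindex"
    return findings


pages = {
    "https://example.com/a": '<link rel="canonical" href="https://example.com/a">',
    "https://example.com/b": "<title>No canonical here</title>",
    "https://example.com/c": '<link rel="canonical" href="https://example.com/d">',
    "https://example.com/d": '<meta name="robots" content="noindex, follow">'
    '<link rel="canonical" href="https://example.com/d">',
}
findings = audit_canonicals(pages)
```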

Index coverage check

This is where things get interesting. Pull up Google Search Console's Page indexing report (formerly Index Coverage) and compare it against your crawl data. The gaps between what you think is indexed and what Google actually indexed are often revealing:

Pages you want indexed but aren't: Check for noindex tags, canonical issues, or crawl blocks.
Pages indexed that shouldn't be: Filter pages, search result pages, and admin URLs that waste crawl budget.
Coverage errors: Server errors (5xx), redirect errors, and submitted-but-not-indexed pages.
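The first two gaps are just set differences. A minimal sketch, assuming you've already built two URL sets: want_indexed from your crawl and sitemap, and indexed from a Search Console export (both names and the sample URLs are hypothetical).

```python
def coverage_gaps(want_indexed: set[str], indexed: set[str]) -> dict[str, list[str]]:
    """Compare the pages you want indexed with the pages Google reports as indexed."""
    return {
        "not indexed": sorted(want_indexed - indexed),
        "unexpectedly indexed": sorted(indexed - want_indexed),
    }


want_indexed = {
    "https://example.com/",
    "https://example.com/pricing",
    "https://example.com/blog/audit-guide",
}
indexed = {
    "https://example.com/",
    "https://example.com/pricing",
    "https://example.com/search?q=widgets",
}

gaps = coverage_gaps(want_indexed, indexed)
```

Each URL under "not indexed" then gets the noindex, canonical, and crawl-block checks from earlier in this step.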

Step 3: Analyze on-page technical elements — fix the content signals

Now we're into the HTML-level stuff — title tags, meta descriptions, heading structure. These are the signals that tell Google what each page is actually about, and I'm consistently surprised by how often even well-built sites get them wrong. Duplicate titles across 40 product pages, meta descriptions copy-pasted from a template, H1 tags used for styling instead of semantics. It all adds up to confused ranking signals.

Title tag audit

Pull your title tag data from the crawl and check it against Google's title link guidelines. Here's what I flag:

Unique per page: No two pages should share the same title tag. Duplicates signal thin or redundant content.
50–60 characters: Titles longer than about 60 characters tend to get truncated in search results (Google cuts by pixel width, not character count). Keep the primary keyword within the first 50 characters.
Contains primary keyword: The target keyword should appear naturally — not stuffed or repeated.
Compelling copy: The title is your first impression in search results. Include a benefit or value proposition beyond just the keyword.
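The mechanical parts of the title audit (missing, duplicate, too long) are simple to script; the keyword and copy checks still need a human. A minimal sketch, assuming a hypothetical titles dictionary of URL path to title text and using the 60-character limit above as a character-count proxy:

```python
from collections import defaultdict

MAX_TITLE_CHARS = 60  # character-count proxy for the truncation limit


def audit_titles(titles: dict[str, str]) -> dict[str, list[str]]:
    """Return per-URL title issues: missing, too long, or duplicate."""
    # Group URLs by normalized title to find duplicates.
    by_title = defaultdict(list)
    for url, title in titles.items():
        by_title[title.strip().lower()].append(url)

    issues = defaultdict(list)
    for url, title in titles.items():
        text = title.strip()
        if not text:
            issues[url].append("missing")
            continue
        if len(text) > MAX_TITLE_CHARS:
            issues[url].append("too long")
        if len(by_title[text.lower()]) > 1:
            issues[url].append("duplicate")
    return dict(issues)


titles = {
    "/a": "Blue Widgets | Example",
    "/b": "Blue Widgets | Example",
    "/c": "",
    "/d": "x" * 61,
    "/e": "Red Widgets | Example",
}
title_issues = audit_titles(titles)
```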

Meta description audit

Unique per page: Missing or duplicate meta descriptions mean Google generates its own — which may not highlight your key selling points.
120–160 characters: Longer descriptions get truncated. Include the primary keyword and a clear call-to-action.
Not a direct ranking factor: Meta descriptions don't directly affect rankings, but a well-written one can lift click-through rate, which means more traffic from the same position.

Heading structure audit

One H1 per page: The H1 should match the page's primary topic and contain the target keyword.
Logical hierarchy: H2 → H3 → H4 in sequence. Skipping levels (H1 → H3) or using headings for styling breaks semantic structure.
No empty headings: Headings with no text content or only whitespace create accessibility and SEO issues.
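All three heading rules can be checked from raw HTML. This is a sketch using the standard-library parser; HeadingCollector and audit_headings are hypothetical names, and it assumes headings are not nested inside one another.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) for every h1-h6 in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._buffer).strip()))
            self._level = None


def audit_headings(html: str) -> list[str]:
    collector = HeadingCollector()
    collector.feed(html)
    issues = []
    h1_count = sum(1 for level, _ in collector.headings if level == 1)
    if h1_count != 1:
        issues.append(f"expected one H1, found {h1_count}")
    previous = 0
    for level, text in collector.headings:
        if not text:
            issues.append(f"empty H{level}")
        # Going deeper by more than one level means a level was skipped.
        if previous and level > previous + 1:
            issues.append(f"skipped level: H{previous} to H{level}")
        previous = level
    return issues


heading_issues = audit_headings("<h1>Guide</h1><h3>Details</h3><h2> </h2>")
```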

Content quality checks

Thin content: Pages with fewer than 300 words of unique text rarely rank for competitive queries. Identify and either expand, consolidate, or noindex thin pages.
Duplicate content: Use your crawl data to find pages with 85%+ content similarity. Consolidate with canonicals or 301 redirects.
Keyword cannibalization: Multiple pages targeting the same keyword compete with each other. Map one primary keyword per page and differentiate intent.
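For the duplicate content check, here's one way to put a number on "similarity". It's only a sketch: I'm assuming Jaccard overlap of three-word shingles as the measure, which is my choice for illustration and not necessarily how any particular crawler computes its percentage. The 0.85 threshold is the 85% figure above.

```python
from itertools import combinations

SIMILARITY_THRESHOLD = 0.85


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Split text into overlapping word n-grams."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(max(len(words) - size + 1, 1))}


def jaccard(a: str, b: str) -> float:
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


def near_duplicates(pages: dict[str, str]) -> list[tuple[str, str]]:
    """Return URL pairs whose body text meets the similarity threshold."""
    return [
        (url_a, url_b)
        for (url_a, text_a), (url_b, text_b) in combinations(sorted(pages.items()), 2)
        if jaccard(text_a, text_b) >= SIMILARITY_THRESHOLD
    ]


pages_text = {
    "/a": "blue widgets ship free worldwide today",
    "/b": "blue widgets ship free worldwide today",
    "/c": "our return policy covers thirty days",
}
duplicate_pairs = near_duplicates(pages_text)
```

Pairwise comparison is quadratic, so on a large site you'd want to bucket pages by template or section first.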

Image optimization

Images are a sneaky performance killer. I've audited sites where uncompressed hero images added 4+ seconds to page load. Google has a solid Images best practices guide, but here's the practical version:

Alt text: Every informational image needs descriptive alt text. Decorative images should use alt="".
File size: Compress images to under 200KB where possible. Use WebP or AVIF formats for better compression.
Lazy loading: Images below the fold should use loading="lazy" — but never lazy-load the LCP (Largest Contentful Paint) image.
Responsive sizing: Use srcset and sizes attributes so browsers load appropriately sized images for each device.
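A few of these image checks can be run straight off the HTML. In this sketch the ImageAudit class is a hypothetical helper, and the lcp_src argument assumes you already know which image is the LCP element (from PageSpeed Insights, say); the parser cannot work that out by itself.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Flag <img> tags that break the alt, lazy-loading, or srcset rules."""

    def __init__(self, lcp_src=None):
        super().__init__()
        self.lcp_src = lcp_src  # src of the known LCP image, if any
        self.issues = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        a = dict(attrs)
        src = a.get("src", "")
        # alt="" is fine for decorative images; a missing attribute is not.
        if "alt" not in a:
            self.issues.append((src, "missing alt attribute"))
        if src == self.lcp_src and a.get("loading") == "lazy":
            self.issues.append((src, "LCP image is lazy-loaded"))
        if "srcset" not in a:
            self.issues.append((src, "no srcset"))


html = (
    '<img src="/hero.webp" alt="Team photo" loading="lazy" srcset="/hero-2x.webp 2x">'
    '<img src="/divider.png" alt="" srcset="/divider.png 1x">'
    '<img src="/chart.png" srcset="/chart.png 1x">'
)
audit = ImageAudit(lcp_src="/hero.webp")
audit.feed(html)
image_issues = audit.issues
```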

Step 4: Assess Core Web Vitals — fix performance before Google does it for you

Core Web Vitals are Google's way of saying "we care about user experience, and we're going to measure it." In practice, sites that fail CWV thresholds lose ground to faster competitors — and the gap is most brutal on mobile, where slow sites get hammered.

I check CWV three ways: Google Search Console's Core Web Vitals report for field data, PageSpeed Insights for per-page diagnostics, and CrawlRaven's built-in performance analysis for site-wide patterns. Here are the thresholds you're targeting, straight from web.dev:

Metric	Good	Needs improvement	Poor
LCP (Largest Contentful Paint)	≤ 2.5s	2.5s – 4.0s	> 4.0s
CLS (Cumulative Layout Shift)	≤ 0.1	0.1 – 0.25	> 0.25
INP (Interaction to Next Paint)	≤ 200ms	200ms – 500ms	> 500ms
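If you're bucketing exported field data yourself, the table translates into a small lookup. A minimal sketch; the rate function name is mine, LCP is in seconds, INP in milliseconds, and CLS is unitless.

```python
# (good upper bound, poor lower bound) per metric, taken from the table above.
THRESHOLDS = {
    "LCP": (2.5, 4.0),    # seconds
    "CLS": (0.1, 0.25),   # unitless
    "INP": (200, 500),    # milliseconds
}


def rate(metric: str, value: float) -> str:
    """Classify a metric value as good, needs improvement, or poor."""
    good, poor = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= poor:
        return "needs improvement"
    return "poor"


ratings = {
    "LCP": rate("LCP", 3.1),
    "CLS": rate("CLS", 0.05),
    "INP": rate("INP", 620),
}
```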

Common LCP fixes

Optimize the hero image: Preload the LCP image with <link rel="preload">, serve it in WebP/AVIF, and ensure it is not lazy-loaded.
Reduce server response time: Target TTFB under 200ms. Use a CDN, enable server-side caching, and optimize database queries.
Eliminate render-blocking resources: Defer non-critical CSS and JavaScript. Inline critical CSS for above-the-fold content.
Reduce DOM size: Pages with over 1,500 DOM nodes slow down rendering. Simplify complex layouts and remove unnecessary wrapper elements.

Common CLS fixes

Set explicit dimensions on images and videos: Always include width and height attributes so the browser reserves space before the asset loads.
Avoid injecting content above existing content: Banners, cookie notices, and late-loading ads that push content down are the primary CLS offenders.
Use CSS aspect-ratio for responsive media: This prevents layout shifts when images scale across breakpoints.

Common INP fixes

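Break up long tasks: JavaScript tasks over 50ms block the main thread. Split the work into smaller chunks and yield between them with setTimeout, or scheduler.yield() in browsers that support it. requestIdleCallback is better suited to deferrable background work than to interaction handling.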
Reduce third-party script impact: Analytics, chat widgets, and ad scripts are the most common INP offenders. Load them asynchronously or defer them.
Optimize event handlers: Debounce scroll and resize handlers. Avoid expensive DOM operations in click handlers.

Step 5: Audit links — eliminate broken paths and orphan pages

Internal links are how search engines navigate your site — they're the hallways and staircases of your architecture. Google's link best practices are blunt about this: if a page isn't reachable through crawlable links, it basically doesn't exist. I always find broken links, redirect chains, and orphan pages in audits. Always. Even on sites that look well-maintained.
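Reachability is easy to compute once you have the internal link graph. Here's a sketch using a breadth-first walk from the homepage; the links dictionary is made-up data, and I'm treating any known URL the walk never reaches as an orphan, which covers pages with zero inbound links as well as pages linked only from other orphans.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], start: str) -> dict[str, int]:
    """Breadth-first walk: minimum number of clicks from start to each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link graph: page -> pages it links to.
links = {
    "/": ["/products", "/blog"],
    "/products": ["/products/widget"],
    "/blog": [],
    "/products/widget": ["/"],
    "/landing-2019": ["/products"],
}

known_urls = set(links)  # in practice: crawl data merged with sitemap URLs
depths = click_depths(links, "/")
orphans = sorted(known_urls - set(depths))
too_deep = sorted(url for url, depth in depths.items() if depth > 3)
```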

Internal link audit

Broken internal links (404s): Every link pointing to a non-existent page wastes crawl budget and creates a dead end. Fix by updating the link target or implementing a 301 redirect.
Redirect chains: A link that redirects to another redirect (A → B → C) adds latency and dilutes link equity. Update the original link to point directly to the final destination.
Redirect loops: Page A redirects to B, which redirects back to A. These are critical errors that prevent both users and crawlers from reaching the content.
Orphan pages: Pages with zero inbound internal links cannot be discovered through crawling. Either add internal links to them or consider whether they should exist at all.
Crawl depth: Important pages should be within 3 clicks of the homepage. Pages buried 5+ levels deep are deprioritized by search engines.
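Redirect chains and loops fall out of a simple walk over the redirect map. This sketch assumes a hypothetical redirects dictionary (source URL to target URL) built from your crawl data; the resolve function and its max_hops cap are my own naming.

```python
def resolve(url: str, redirects: dict[str, str], max_hops: int = 10):
    """Follow redirects from url and classify the result.

    Returns (path, status) where status is one of:
    "ok", "redirect", "chain", "loop", or "too many hops".
    """
    path = [url]
    while path[-1] in redirects:
        target = redirects[path[-1]]
        if target in path:
            return path + [target], "loop"
        path.append(target)
        if len(path) - 1 > max_hops:
            return path, "too many hops"
    hops = len(path) - 1
    if hops == 0:
        return path, "ok"
    return path, "redirect" if hops == 1 else "chain"


redirects = {
    "/a": "/b",
    "/b": "/c",
    "/x": "/y",
    "/y": "/x",
    "/old": "/new",
}

chain = resolve("/a", redirects)
loop = resolve("/x", redirects)
single = resolve("/old", redirects)
```

For a chain, the last element of the path is the URL the original link should point to directly.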

External link audit

Broken outbound links: Links to external pages that return 404 or 5xx errors hurt user experience and can signal neglect to search engines.
Links to redirected URLs: Update outbound links that redirect — link directly to the final URL.
Nofollow usage: Review which outbound links use rel="nofollow". Affiliate links and sponsored content should carry rel="sponsored" (or rel="nofollow"); editorial references generally should not.

Step 6: Validate structured data — claim your rich results

Schema markup is one of those things that feels optional until you see the difference it makes. Star ratings, price and availability snippets, breadcrumb trails: these rich results consistently pull 20–30% higher click-through rates in my experience. That's free traffic you're leaving on the table if your structured data is missing or broken.

I run every page through both Google's Rich Results Test and Schema.org's validator. They catch different things, so you want both.

Schema types to implement by page type

All pages: BreadcrumbList for navigation breadcrumbs, Organization on the homepage.
Blog posts: Article or BlogPosting with author, date, and headline.
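How-to guides: HowTo schema with step-by-step instructions. Google stopped showing how-to rich results in 2023, so treat this as semantic markup rather than a rich result play.
FAQ pages: FAQPage with question-answer pairs. Google now limits FAQ rich results to well-known government and health sites, so most sites won't see the dropdowns.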
Product pages: Product with price, availability, and reviews.
Service pages: Service or SoftwareApplication with features and pricing.
Local business pages: LocalBusiness with address, hours, and contact info.

Common schema mistakes

Invalid JSON-LD syntax: Missing commas, unclosed brackets, or invalid characters. Always validate after changes.
Mismatched data: Schema data that doesn't match visible page content violates Google's guidelines and can result in a manual action.
Missing required fields: Each schema type has required properties — omitting them means no rich result eligibility.
Duplicate schema: Multiple conflicting schema blocks on the same page confuse parsers. Use one JSON-LD block per schema type.
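The syntax and missing-field mistakes are the easiest to catch before you even open a validator. A minimal sketch: the REQUIRED map below is a made-up example, not Google's actual requirements, so fill it in from the documentation for each rich result type you target.

```python
import json

# Illustrative only: NOT Google's real required-property list.
REQUIRED = {
    "Product": {"name", "offers"},
    "BreadcrumbList": {"itemListElement"},
}


def check_jsonld(raw: str) -> list[str]:
    """Return syntax or missing-property problems for one JSON-LD block."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    items = data if isinstance(data, list) else [data]
    problems = []
    for item in items:
        if not isinstance(item, dict):
            continue
        schema_type = item.get("@type")
        if not isinstance(schema_type, str):
            continue  # multi-typed or untyped nodes are skipped in this sketch
        missing = REQUIRED.get(schema_type, set()) - set(item)
        problems += [f"{schema_type}: missing {name}" for name in sorted(missing)]
    return problems


incomplete = check_jsonld(
    '{"@context": "https://schema.org", "@type": "Product", "name": "Widget"}'
)
broken = check_jsonld('{"@type": "Product" "name": "Widget"}')
```

This won't catch mismatched data, since that needs a comparison against the visible page content.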

Step 7: Audit security and HTTPS — protect rankings and user trust

HTTPS has been a confirmed ranking signal for years now, but I still find sites with mixed content warnings, expired certificates, and missing redirects. The SEO impact is real, but honestly the bigger problem is what happens to user trust when Chrome slaps a "Not Secure" warning on your checkout page. Bounce rates go through the roof.

SSL certificate validity: Check that your certificate is valid, not expired, and covers all subdomains you use (including www and non-www).
Mixed content: HTTP resources (images, scripts, stylesheets) loaded on HTTPS pages trigger browser warnings. Find and update all HTTP references to HTTPS.
HTTP to HTTPS redirects: Every HTTP URL should 301 redirect to its HTTPS equivalent. Check both www and non-www variants.
Security headers: Implement Strict-Transport-Security (HSTS), X-Content-Type-Options, and Content-Security-Policy headers.
HTTPS-only cookies: Cookies should be set with the Secure flag to prevent transmission over unencrypted connections.
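The header and cookie checks can be run against any captured response. This is a sketch that works on plain data rather than making requests: headers is a dictionary of response headers and set_cookies is a list of raw Set-Cookie values, both hypothetical inputs you'd collect with your own HTTP client.

```python
REQUIRED_HEADERS = [
    "strict-transport-security",
    "x-content-type-options",
    "content-security-policy",
]


def audit_response(headers: dict[str, str], set_cookies: list[str]) -> list[str]:
    """Flag missing security headers and cookies set without the Secure flag."""
    present = {name.lower() for name in headers}  # header names are case-insensitive
    issues = [f"missing header: {h}" for h in REQUIRED_HEADERS if h not in present]
    for cookie in set_cookies:
        # Everything after the first ";" is an attribute such as Secure or Path.
        attributes = [part.strip().lower() for part in cookie.split(";")[1:]]
        if "secure" not in attributes:
            name = cookie.split("=", 1)[0].strip()
            issues.append(f"cookie without Secure: {name}")
    return issues


headers = {
    "Strict-Transport-Security": "max-age=31536000",
    "X-Content-Type-Options": "nosniff",
}
set_cookies = [
    "session=abc; Path=/; HttpOnly",
    "prefs=dark; Secure; SameSite=Lax",
]
security_issues = audit_response(headers, set_cookies)
```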

Step 8: Prioritize and create an action plan — fix high-impact issues first

Here's where most audits fall apart. You've got a spreadsheet with 200 issues, the dev team has limited bandwidth, and someone has to decide what gets fixed first. I've watched teams spend three sprints on alt text while a robots.txt rule was blocking their entire blog from indexation. Prioritization isn't optional — it's the difference between an audit that moves rankings and one that collects dust.

Priority framework

Priority	Issue type	Examples
Critical	Crawl blocks and indexation failures	robots.txt blocking key pages, noindex on revenue pages, server errors (5xx)
High	Broken user paths and redirect issues	Broken internal links, redirect chains/loops, orphan pages with traffic
High	Core Web Vitals failures	LCP > 4s, CLS > 0.25, INP > 500ms on key landing pages
Medium	On-page signal gaps	Missing/duplicate title tags, missing H1s, thin content pages
Medium	Structured data issues	Invalid schema, missing required fields, FAQ markup without content
Low	Quick wins and hygiene	Missing alt text, broken outbound links, suboptimal meta descriptions

If you don't want to build this matrix manually, CrawlRaven does it for you — it scores every issue by estimated SEO impact and spits out a prioritized fix list. Saves a lot of spreadsheet wrangling.

For the content side of things (keyword density, SERP-level content gaps, topical coverage), Surfer SEO is a solid complement to technical crawlers. I usually pair both — Surfer for content optimization, CrawlRaven for the infrastructure layer.

Creating your action plan

Export your audit findings to a spreadsheet or project management tool. Group issues by category (crawlability, performance, on-page, etc.).
Tag each issue with effort level — quick fix (under 1 hour), moderate (1–4 hours), or complex (requires developer sprint).
Start with critical + quick fix items. These deliver the highest ROI: high-impact issues that can be resolved immediately.
Schedule complex fixes into your development sprints. Include the expected SEO impact to help engineering teams prioritize.
Set up monitoring. Run automated crawls weekly or monthly to catch new issues before they affect rankings. CrawlRaven supports scheduled audits with alert notifications.
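If you do build the matrix by hand, the ordering rule is just a two-key sort: priority first, effort second. A minimal sketch, with the Issue fields and rank tables as my own hypothetical encoding of the framework above:

```python
from dataclasses import dataclass

PRIORITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}
EFFORT_RANK = {"quick": 0, "moderate": 1, "complex": 2}


@dataclass
class Issue:
    title: str
    priority: str  # critical, high, medium, low
    effort: str    # quick, moderate, complex


def plan(issues: list[Issue]) -> list[Issue]:
    """Order issues by priority, then by effort within each priority."""
    return sorted(
        issues,
        key=lambda issue: (PRIORITY_RANK[issue.priority], EFFORT_RANK[issue.effort]),
    )


issues = [
    Issue("Missing alt text", "low", "quick"),
    Issue("robots.txt blocks /blog/", "critical", "quick"),
    Issue("LCP > 4s on landing pages", "high", "complex"),
    Issue("Redirect chains", "high", "moderate"),
]
ordered_titles = [issue.title for issue in plan(issues)]
```

Critical quick fixes float to the top, and the alt text ticket lands where it belongs: last.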

Want this as a reference doc you can hand to your team? Grab the full SEO audit checklist with 200+ checks.

Which audit tool should you use? Quick comparison

I get asked this constantly, and the honest answer is: it depends on what you're working with. Site size, budget, whether you need scheduled monitoring or just a one-off crawl — all of it matters. I've used all three of these extensively, so here's how they stack up on the things that actually affect your workflow:

Feature	CrawlRaven	Screaming Frog	Sitebulb
Deployment	Cloud-based	Desktop app	Desktop + Cloud
Max pages per crawl	100,000+	Unlimited (RAM-limited)	500,000
Prioritized fix list	✓ Auto-generated	— (raw data export)	✓ Hints system
Core Web Vitals	✓ Built-in per page	Via API integration	✓ Built-in
White-label reports	✓	—	✓ (Cloud only)
JavaScript rendering	✓	✓ (Chrome)	✓ (Chrome)
Scheduled crawls	✓ Daily/weekly/monthly	— (manual only)	✓ (Cloud only)
Accessibility auditing	—	—	✓ WCAG checks
Starting price	$9/mo	£199/yr	$13.50/mo (Desktop)

If you want to see the automated version of this entire process, CrawlRaven runs all 8 steps with 200+ checks and gives you the prioritized fix list out of the box. And when it's time to present findings to stakeholders who don't speak SEO, the SEO audit report template is a good starting point.

Frequently asked questions

What is a technical SEO audit?

A technical SEO audit is a systematic evaluation of every infrastructure factor that affects how search engines crawl, render, index, and rank your website. It covers crawlability, indexation, on-page elements, Core Web Vitals, link health, structured data, and security. A thorough audit examines 200+ individual checks across these categories.

How long does a technical SEO audit take?

A thorough manual technical SEO audit takes 2–8 hours depending on site size and complexity. Automated tools like CrawlRaven can complete the crawl and analysis in under 10 minutes for a 1,000-page site. However, reviewing findings and creating a prioritized action plan adds 1–3 hours regardless of the tool used.

What tools do I need for a technical SEO audit?

At minimum, you need a site crawler (CrawlRaven, Screaming Frog, or Sitebulb), Google Search Console for index coverage data, and PageSpeed Insights or Chrome DevTools for performance testing. CrawlRaven combines crawling, Core Web Vitals analysis, and structured data validation into one cloud-based platform.

What are the most critical issues to fix first?

Prioritize crawl-blocking issues first — broken robots.txt rules, accidental noindex tags on revenue pages, and server errors (5xx). Next, fix broken internal links and redirect chains that waste crawl budget. Then address Core Web Vitals failures on key landing pages, followed by missing structured data and on-page signal gaps.

How often should you run a technical SEO audit?

Run a full technical SEO audit quarterly and monitor key metrics monthly. After major site changes — redesigns, CMS migrations, URL restructures, or large content updates — run an immediate audit. Automated tools like CrawlRaven can run scheduled weekly or monthly crawls to catch new issues before they affect rankings.

What is the difference between a technical SEO audit and a full SEO audit?

A technical SEO audit focuses on infrastructure: crawlability, indexation, page speed, structured data, and security. A full SEO audit adds content quality analysis, keyword optimization, backlink profile review, and competitive benchmarking. Technical audits are the foundation — content and backlink audits build on top of solid technical health.

Can I do a technical SEO audit myself?

Yes. This guide covers every step of a DIY technical SEO audit. You need basic familiarity with HTML, HTTP status codes, and Google Search Console. For sites under 1,000 pages, a manual audit using free tools is practical. For larger sites, automated crawlers like CrawlRaven save significant time and reduce the risk of missing issues.

What is crawl budget and why does it matter?

Crawl budget is the number of pages Google will crawl on your site within a given timeframe. For small sites (under 10,000 pages), crawl budget is rarely a concern. For larger sites, wasting crawl budget on redirect chains, duplicate content, or low-value pages means Google may not crawl your important pages frequently enough to keep them fresh in search results.

How do I check if Google can crawl my site?

Use Google Search Console's URL Inspection tool to check individual pages, or review the Page indexing report (formerly Index Coverage) for site-wide data. The robots.txt report in Search Console, which replaced the standalone robots.txt Tester, shows which robots.txt files Google fetched and any parsing problems. A site crawler like CrawlRaven or Screaming Frog shows you every page it can discover, mimicking how Googlebot navigates your site.

What are Core Web Vitals thresholds for 2026?

The current Core Web Vitals thresholds are: LCP (Largest Contentful Paint) at or under 2.5 seconds, CLS (Cumulative Layout Shift) at or under 0.1, and INP (Interaction to Next Paint) at or under 200 milliseconds, each assessed at the 75th percentile of page loads. Pages meeting all three thresholds are classified as having a 'good' user experience, which feeds into Google's page experience signals.

About the Author

Aditi Chaturvedi

15+ years of growing SaaS websites through SEO | Author, 200-Point Audit Checklist

Aditi has spent 15+ years helping SaaS companies scale organic traffic through technical SEO and content strategy. She is the author of the CrawlRaven 200-Point Audit checklist used by agencies and in-house teams to systematically improve search performance.

Tags: how to perform a technical seo audit, technical seo audit, technical seo audit guide, technical seo audit checklist, site audit tutorial, seo audit steps