Shopify migration crawl and indexing QA

Commercial disclosure: this page may mention Shopify. Recommendations should be weighed against the stated testing status and native Shopify alternatives. See the affiliate disclosure.

Desk Researched. Last reviewed 2026-05-02.

A crawl is migration evidence, not a vanity audit

Crawl checks prove whether important old URLs, new Shopify destinations, redirects, canonicals, indexability and internal links behave the way the migration plan says they should.

Run separate crawls for old, staging and live

The old site crawl captures what must be protected. The staging crawl catches Shopify template and indexability mistakes. The live crawl confirms redirects, canonicals, robots and sitemap output after launch.

Crawl data needs commercial context

A crawler will not know which URLs earn revenue or links. Merge crawl data with Search Console, analytics and backlink evidence before deciding what matters.

A Shopify migration can look successful in the browser and still be unclear to search engines.

The old URLs may redirect, but to weak destinations. The new sitemap may exist, but not include the pages that matter. Important collections may be live, but canonical signals, internal links or noindex rules may stop them being treated as primary pages.

The critical checks sit between “the store launched” and “search engines can understand the new store”.

The work is not to crawl every possible URL. It is to prove that important old value has a crawlable, indexable and relevant new home.

Start with old-site evidence

Before judging the new Shopify store, collect evidence from the old store.

You need:

old crawl export
old sitemap URLs
Search Console landing page data
top organic pages
backlink target URLs
old canonical targets
old noindex rules
old redirect paths
old parameter/filter patterns

Without the old evidence, you cannot tell whether Shopify has simplified the site cleanly or accidentally removed important search paths.

Build a crawl sample set

Do not crawl only the new homepage and sitemap.

Build a sample set with:

top old organic landing pages
top old product URLs
top old category URLs
important blog/guide URLs
old filtered URLs
old tag/archive URLs
backlink targets
discontinued product URLs
newly created Shopify collections
newly created Shopify products
resource/download pages if relevant

The sample set should expose risk. If it only includes clean pages, the crawl will look better than the migration really is.

Check old URLs first

For each important old URL, confirm:

does it still resolve?
does it redirect?
is the redirect permanent?
does it redirect in one hop?
is the destination relevant?
does the destination return 200?
is the destination indexable?
does the destination canonicalise to itself or a sensible parent?

Redirecting an important category or product to the homepage rarely preserves its search value, because the homepage does not answer the intent the old page served.

The useful question is not only “does it redirect?”.

Ask what the redirect proves:

Crawl result	What it usually means	First response
Old URL returns 404	The redirect is missing or was never mapped	Check the redirect sheet and old URL priority
Old URL redirects to homepage	The old intent may not be preserved	Find or build a closer destination
Old URL redirects in several hops	Old rules were probably carried over	Flatten the chain to the final Shopify URL
Destination is noindex	The redirect points into a page that cannot rank	Fix indexability or choose another destination
Destination canonicalises elsewhere	The signal may be diluted or confused	Check whether the canonical target is intentional

This is where crawl data becomes migration judgement.
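
The table above can be turned into a repeatable check. The following is a minimal Python sketch, not a definitive implementation. It assumes you have already recorded every response in the redirect chain for each old URL (for example from a crawler's redirect export). The names Hop and classify_old_url, and the label strings, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlsplit


@dataclass
class Hop:
    """One response in a redirect chain (hypothetical structure)."""
    url: str
    status: int


def classify_old_url(
    hops: List[Hop],
    destination_noindex: bool = False,
    destination_canonical: Optional[str] = None,
) -> str:
    """Label an old URL from its recorded chain.

    hops starts with the old URL's response and ends with the final response.
    """
    first, final = hops[0], hops[-1]
    redirects = hops[:-1]  # every response before the final one

    if first.status == 404:
        return "missing redirect"
    if final.status != 200:
        return "destination not 200"
    if any(h.status not in (301, 308) for h in redirects):
        return "temporary redirect"
    if len(redirects) > 1:
        return "redirect chain"
    if redirects and urlsplit(final.url).path in ("", "/"):
        return "redirects to homepage"
    if destination_noindex:
        return "destination noindex"
    if destination_canonical and destination_canonical != final.url:
        return "canonicalises elsewhere"
    return "ok"
```

The function only labels the technical pattern. Whether a destination is relevant still needs a human decision, so treat an "ok" label as "passed the mechanical checks", not as sign-off.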

Check Shopify sitemap coverage

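Shopify automatically generates a sitemap index at /sitemap.xml that links to child sitemaps for products, collections, pages and blog posts, but the presence of a sitemap is not the same as good coverage.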

Check whether important Shopify pages are included:

collections
products
blogs/posts
pages
resources if they are public

Then compare the sitemap to your priority list.

If an important collection is live but absent from the sitemap or not internally linked, investigate before assuming it will be discovered quickly.
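
One way to compare the sitemap to the priority list is sketched below in Python. It is a rough illustration under stated assumptions: the sitemap XML has already been downloaded and is passed in as text, it uses the standard sitemaps.org namespace, and trailing slashes are the only URL difference worth normalising. The function names sitemap_locs and missing_from_sitemap are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_locs(xml_text):
    """Return every <loc> value found in one sitemap or sitemap index."""
    root = ET.fromstring(xml_text)
    return {el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text}


def missing_from_sitemap(priority_urls, locs):
    """Return priority URLs that do not appear in the sitemap URLs."""
    normalised = {u.rstrip("/") for u in locs}
    return [u for u in priority_urls if u.rstrip("/") not in normalised]
```

Because the root sitemap is an index, run sitemap_locs on the index first to collect the child sitemap URLs, then on each child file to collect page URLs.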

Check robots and noindex rules

Look for mistakes that block the wrong pages.

Check:

robots.txt output
noindex tags
X-Robots-Tag headers if used
password protection
app-injected directives
staging-domain remnants
template-level rules

Most crawl/indexing disasters are not subtle. They are usually simple blocking rules applied too broadly.
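
A quick way to catch over-broad blocking is to test the priority URLs against the robots.txt output. The sketch below uses Python's standard urllib.robotparser and assumes the robots.txt text has already been fetched. Note the limitation: the standard-library parser matches simple path prefixes and does not interpret wildcard patterns such as * or $ inside rules, so treat it as a first-pass check only. The function name blocked_priority_urls is hypothetical.

```python
from urllib.robotparser import RobotFileParser


def blocked_priority_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]
```

Any priority collection or product URL returned by this function deserves an immediate look, because a robots.txt block stops crawling regardless of what the page's own tags say.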

Check canonical signals

For priority pages, check:

canonical URL
status code of canonical target
whether canonical target is indexable
whether internal links point to the canonical version
whether collection-scoped product paths (/collections/{collection}/products/{product}) are linked internally instead of the canonical product URL
whether filters or parameters canonicalise sensibly

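Canonical tags are hints rather than directives, so search engines weigh them against redirects, internal links and sitemap entries. Inconsistent hints create avoidable doubt about which URL is the primary page.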
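
The canonical and noindex checks can be spot-tested on saved HTML. The following Python sketch uses only the standard html.parser module and assumes the page source has already been fetched. It reads the static HTML only, so directives added by JavaScript or sent as HTTP headers are out of scope. HeadSignals and page_signals are hypothetical names.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect the canonical link and meta robots content from a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = ""

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = a.get("href")
        elif tag == "meta" and (a.get("name") or "").lower() == "robots":
            self.robots = (a.get("content") or "").lower()


def page_signals(html, page_url):
    """Summarise canonical and noindex signals for one page."""
    parser = HeadSignals()
    parser.feed(html)
    return {
        "canonical": parser.canonical,
        "self_canonical": parser.canonical == page_url,
        "noindex": "noindex" in parser.robots,
    }
```

A page where self_canonical is false is not automatically wrong. It is a prompt to confirm that the canonical target is deliberate, returns 200 and is itself indexable.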

Check collection indexing quality

Important collections should be easy to crawl, index and understand.

For each priority collection, check:

200 status
no noindex directive (meta robots or X-Robots-Tag)
self-referencing canonical or deliberate canonical target
title and H1
product relevance
internal links from navigation/content/products
no accidental filter URL selected as the main page
crawl depth

If a collection is commercially important but buried, thin or internally unsupported, indexing may not be the only problem.

Check product indexing quality

For important products, check:

product URL status
canonical
indexability
image/media output
structured data
collection membership
internal links from collections
discontinued/out-of-stock handling
redirect from old product URL

Do not index every product blindly if the catalogue has duplicates, variants or discontinued items. Decide what should remain discoverable.

Check filtered and parameter URLs

Faceted navigation can create migration noise.

Look for URLs with:

filter parameters
sort parameters
tag paths
vendor/type patterns
search URLs
app-generated filter URLs

For each pattern, decide:

should this be crawlable?
should it be indexable?
should it canonicalise to the base collection?
should a high-demand filter become a dedicated collection instead?

Do not let filters become accidental landing pages because they happened to exist after launch.

Example:

An old WooCommerce URL ending in ?filter_size=wide may have earned search demand because shoppers wanted a specific product group. If Shopify turns that into a crawlable filter URL with no stable collection page, the migration may preserve access but weaken the landing page.

In that case, the better fix may be a proper collection, not just a canonical tag.
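
To review patterns rather than individual URLs, it helps to bucket crawled URLs by type first. The Python sketch below is illustrative only. The parameter names it looks for (filter_ and filter. prefixes, sort_by, orderby) and the path rules are assumptions based on common WooCommerce and Shopify URL shapes, so adjust them to what the old and new crawls actually contain. The function name classify_url is hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def classify_url(url):
    """Bucket a URL as search, filter, sort, tag path or clean."""
    parts = urlsplit(url)
    keys = [k for k, _ in parse_qsl(parts.query, keep_blank_values=True)]
    segments = [s for s in parts.path.split("/") if s]

    if segments[:1] == ["search"]:
        return "search"
    if any(k.startswith(("filter.", "filter_")) for k in keys):
        return "filter"
    if "sort_by" in keys or "orderby" in keys:
        return "sort"
    # Assumed shape: /collections/<collection>/<tag>
    if len(segments) == 3 and segments[0] == "collections":
        return "tag path"
    return "clean"
```

Counting URLs per bucket in the old and live crawls shows quickly whether a pattern has grown, vanished or started attracting internal links after launch.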

Use Search Console carefully after launch

Search Console will not update instantly, but it will show patterns.

Monitor:

indexing status for priority URLs
submitted vs indexed sitemap URLs
crawl errors
soft 404s
pages with redirects
excluded by noindex
duplicate/canonical reports
clicks and impressions by page type

Do not panic at every early warning. Look for repeated patterns across important URL groups.

First crawl after launch

Run the first post-launch crawl with:

sitemap crawl
internal crawl
redirect list crawl
priority URL list crawl
rendered HTML where useful

Compare outputs rather than looking at one report.

A sitemap crawl tells you what Shopify is submitting. An internal crawl tells you what the site actually links to. A redirect crawl tells you whether old value has a new home.
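
Comparing outputs comes down to set differences between the URL lists. A minimal Python sketch follows, assuming each crawl has been exported as a list of final URLs normalised the same way. The function name compare_crawls and the result keys are hypothetical.

```python
def compare_crawls(sitemap_urls, internal_urls, priority_urls):
    """Compare the sitemap crawl, internal crawl and priority list."""
    sitemap = set(sitemap_urls)
    internal = set(internal_urls)
    priority = set(priority_urls)
    return {
        # Submitted in the sitemap but never linked: orphan candidates.
        "sitemap_only": sorted(sitemap - internal),
        # Linked on the site but not submitted in the sitemap.
        "linked_not_in_sitemap": sorted(internal - sitemap),
        # Priority pages that neither crawl found.
        "priority_missing_everywhere": sorted(priority - sitemap - internal),
    }
```

Priority URLs that appear in neither crawl are usually the first to investigate, followed by sitemap-only pages that matter commercially.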

Common crawl/indexing mistakes

Watch for:

old URLs redirecting to irrelevant pages
important collections missing from navigation
accidental noindex on templates
staging URLs left in links
canonical targets pointing to the wrong version
filter URLs being crawled heavily
Shopify product URLs linked inconsistently
blog content not linked from commercial pages
sitemap submitted but not compared against priority URLs

These are fixable, but they are expensive to discover late.

Minimum crawl and indexing sheet

Use these columns:

URL
old URL
page type
priority
status code
indexability
canonical
sitemap presence
internal link count
redirect source
destination relevance
Search Console status
issue
severity
owner
action

This turns crawl data into decisions.
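
If the sheet is built from scripts rather than by hand, fixing the column order in one place keeps exports consistent between the old, staging and live passes. The Python sketch below writes the columns listed above as CSV using the standard csv module. The function name sheet_csv is hypothetical, and leaving missing values blank is an assumption about how you want gaps to appear.

```python
import csv
import io

# Column order taken from the list above.
COLUMNS = [
    "URL", "old URL", "page type", "priority", "status code",
    "indexability", "canonical", "sitemap presence", "internal link count",
    "redirect source", "destination relevance", "Search Console status",
    "issue", "severity", "owner", "action",
]


def sheet_csv(rows):
    """Return CSV text for a list of row dicts; missing columns stay blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

DictWriter raises a ValueError when a row contains a key that is not in COLUMNS, which helps catch misspelt column names before they reach the shared sheet.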

If the redirect review is the weak point, pair this sheet with the Migration Redirect Risk Review.

What to do next

If redirects are failing, use the Shopify redirect mapping guide.

If launch QA is still underway, use the Shopify migration QA checklist.

If traffic has already dropped, use the Shopify SEO traffic drop after migration runbook.

For broader live-store technical checks, use the Shopify technical SEO checklist.

Quick answer

Run crawl and indexing checks before and after a Shopify migration so the team can prove which old URLs existed, which new URLs replaced them, and which pages are crawlable, indexable and internally linked after launch.

What you will do

Save old-site crawl evidence before migration work changes the source site.
Catch staging noindex, canonical, robots, sitemap and template problems before launch.
Use live crawl evidence to fix redirect chains, 404s and indexation gaps quickly.

What to check first

Screaming Frog, Sitebulb or an equivalent crawler.
Google Search Console page, query and indexing exports.
GA4 or Shopify reports for landing-page value.
Backlink export for URLs that may not appear in the current crawl.
Shopify sitemap, robots.txt and URL redirect controls.

Work through it in this order

Crawl the old site and export indexable URLs, status codes, titles, meta descriptions, canonicals, H1s and inlinks.
Merge the crawl with Search Console, analytics and backlink exports so commercial URLs are not treated like low-value crawl noise.
Crawl the Shopify staging store and check product, collection, blog and page templates for indexability, schema, links and password/noindex leftovers.
Prepare an old-URL test list from the top organic, revenue and backlinked pages.
After launch, crawl the live domain, sitemap URLs and old high-priority URL list.
Fix redirects that fail the one-hop test, unexpected 404s, noindex/canonical mistakes and sitemap-only orphan pages before lower-value warnings.
Keep the old, staging and live crawl exports in the migration evidence folder.
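
The merge and test-list steps above can be sketched in a few lines of Python. This is an illustration under assumptions: each export has already been reduced to a mapping keyed by URL, and sorting by revenue, then clicks, then backlinks is only one possible ranking. The function name build_priority_list and the field names are hypothetical.

```python
def build_priority_list(crawl_urls, clicks, revenue, backlinks):
    """Merge crawl URLs with Search Console, analytics and backlink data."""
    crawled = set(crawl_urls)
    # Include URLs that only appear in the other exports, such as
    # backlinked pages that the crawl never reached.
    all_urls = crawled | set(clicks) | set(revenue) | set(backlinks)
    rows = [
        {
            "url": url,
            "in_crawl": url in crawled,
            "clicks": clicks.get(url, 0),
            "revenue": revenue.get(url, 0),
            "backlinks": backlinks.get(url, 0),
        }
        for url in all_urls
    ]
    rows.sort(
        key=lambda r: (r["revenue"], r["clicks"], r["backlinks"]),
        reverse=True,
    )
    return rows
```

Rows where in_crawl is false are worth keeping: they are old URLs with evidence of value that the crawler could not discover through links.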

Real-world notes

Old category pages often vanish because the new Shopify collection structure was built from product imports rather than search demand.
Staging crawls regularly catch noindex tags, password remnants and app schema conflicts before anyone notices in Search Console.
A post-launch crawl can reveal that internal links still point through old redirected URLs even when the redirect map itself works.

Final checks

Old site crawl saved before migration changes.
Search Console and analytics data merged into URL list.
Staging crawl reviewed for noindex, canonical, robots and schema issues.
Top old URLs tested after launch.
Redirect chains and loops reviewed.
Live sitemap URLs crawled.
404s prioritised by traffic, links and revenue.
Crawl exports stored with launch date.

Watch-outs

If stock sync unpublishes products after launch, crawl data may show sudden 404s that are actually inventory process problems.
If an app creates filter or search-result pages, the crawl may expose index bloat that the original migration plan never considered.
If the old WordPress site already had redirects, importing them into Shopify unchanged can create chains. Map each old URL directly to its final Shopify destination.

Next action

Run this crawl pass before final redirect QA, then use the traffic-drop guide if Search Console movement looks abnormal.

Field questions

Should I crawl the old WooCommerce site before moving to Shopify?

Yes. Crawl the old site before URLs, navigation, content or plugins change. Keep the export because it becomes the source list for redirects, metadata checks and post-launch 404 monitoring.

What should I crawl after Shopify launch?

Crawl the final domain, high-priority old URLs, sitemap URLs, top organic landing pages and a sample of product, collection, blog and page templates.

Can Search Console replace a migration crawl?

No. Search Console is essential, but it lags and does not replace a controlled crawl of old URLs, staging URLs and live redirects.

Commercial disclosure

Partner links mentioned on this page

Some links may earn a commission, but recommendations still start with the store problem, the evidence, and the simplest workable next step.

Shopify

Affiliate link active

Platform decisions, Shopify comparisons and migration pages.

