

Crawl Comparison – Compare crawl data to see changes in issues and opportunities to track technical SEO progress. Compare site structure, detect changes in key elements and metrics, and use URL mapping to compare staging against production sites.
Spelling & Grammar – Spell & grammar check your website in over 25 different languages.
Structured Data & Validation – Extract & validate structured data against specifications and Google search features.
Visualisations – Analyse the internal linking and URL structure of the website, using the crawl and directory tree force-directed diagrams and tree graphs.
XML Sitemap Analysis – Crawl an XML Sitemap independently or as part of a crawl, to find missing, non-indexable and orphan pages.
AMP Crawling & Validation – Crawl AMP URLs and validate them, using the official integrated AMP Validator.
Store & View HTML & Rendered HTML – Essential for analysing the DOM.
Rendered Screen Shots – Fetch, view and analyse the rendered pages crawled.
Custom robots.txt – Download, edit and test a site’s robots.txt using the new custom robots.txt.
XML Sitemap Generation – Create an XML sitemap and an image sitemap using the SEO spider.
External Link Metrics – Pull external link metrics from Majestic, Ahrefs and Moz APIs into a crawl to perform content audits or profile links.
PageSpeed Insights Integration – Connect to the PSI API for Lighthouse metrics, speed opportunities, diagnostics and Chrome User Experience Report (CrUX) data at scale.
Google Search Console Integration – Connect to the Google Search Analytics and URL Inspection APIs and collect performance and index status data in bulk.
Google Analytics Integration – Connect to the Google Analytics API and pull in user and conversion data directly during a crawl.
Custom Extraction – Scrape any data from the HTML of a URL using XPath, CSS Path selectors or regex.
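
To make the XML Sitemap Analysis idea concrete, here is a minimal sketch in Python of comparing a sitemap against a list of crawled URLs. It is not how the SEO Spider works internally; the function names, the result keys and the example.com URLs are all hypothetical, and it assumes a plain `urlset` sitemap rather than a sitemap index.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def compare_sitemap_to_crawl(xml_text, crawled_urls):
    """Split URLs into those only in the sitemap and those only in the crawl."""
    listed = sitemap_urls(xml_text)
    crawled = set(crawled_urls)
    return {
        # Listed in the sitemap but never reached by following links.
        "orphan_candidates": sorted(listed - crawled),
        # Reached by the crawl but absent from the sitemap.
        "missing_from_sitemap": sorted(crawled - listed),
    }


# Hypothetical sitemap content used only for illustration.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/old-landing-page</loc></url>
</urlset>"""

if __name__ == "__main__":
    crawl = ["https://example.com/", "https://example.com/blog"]
    print(compare_sitemap_to_crawl(EXAMPLE_SITEMAP, crawl))
```

A real audit would also need to check indexability (status codes, canonical and noindex signals), which this sketch leaves out.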

#Screaming frog seo spider linux code

H2 – Missing, duplicate, long, short or multiple headings.
H1 – Missing, duplicate, long, short or multiple headings.
Word Count – Analyse the number of words on every page.
Crawl Depth – View how deep a URL is within a website’s architecture.
Last-Modified Header – View the last modified date in the HTTP header.
Response Time – View how long pages take to respond to requests.
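
The heading and word count checks above can be approximated for a single page with Python's standard `html.parser`. This is only a rough sketch: the class and function names are hypothetical, the 70-character limit is an assumed threshold rather than a documented one, and duplicate detection across pages is not covered.

```python
from html.parser import HTMLParser

# Tags whose text should not count towards the word count.
SKIPPED_TAGS = ("script", "style", "title")


class HeadingAudit(HTMLParser):
    """Collect h1/h2 text and a rough word count for one HTML page."""

    def __init__(self):
        super().__init__()
        self.headings = {"h1": [], "h2": []}
        self.words = 0
        self._open = None  # heading tag currently being read, if any
        self._skip = 0     # nesting depth inside skipped tags

    def handle_starttag(self, tag, attrs):
        if tag in self.headings:
            self._open = tag
            self.headings[tag].append("")
        elif tag in SKIPPED_TAGS:
            self._skip += 1

    def handle_endtag(self, tag):
        if tag == self._open:
            self._open = None
        elif tag in SKIPPED_TAGS and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._skip:
            return
        self.words += len(data.split())
        if self._open:
            self.headings[self._open][-1] += data


def audit_page(html, max_heading_len=70):
    """Return word count, heading texts and a list of simple issues."""
    parser = HeadingAudit()
    parser.feed(html)
    parser.close()
    issues = []
    for tag, texts in parser.headings.items():
        texts = [t.strip() for t in texts]
        if not texts:
            issues.append(f"{tag}: missing")
        if len(texts) > 1:
            issues.append(f"{tag}: multiple")
        if any(len(t) > max_heading_len for t in texts):
            issues.append(f"{tag}: long")
    return {"word_count": parser.words, "headings": parser.headings, "issues": issues}


if __name__ == "__main__":
    page = "<html><body><h1>First</h1><h1>Second heading</h1><p>Some body text here.</p></body></html>"
    print(audit_page(page))
```

Splitting on whitespace is a crude definition of a word; a crawler may count differently, for example by excluding navigation or footer text.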

Meta Keywords – Mainly for reference or regional search engines, as they are not used by Google, Bing or Yahoo.
Meta Description – Missing, duplicate, long, short or multiple descriptions.
Page Titles – Missing, duplicate, long, short or multiple title elements.
Duplicate Pages – Discover exact and near duplicate pages using advanced algorithmic checks.
URL Issues – Non-ASCII characters, underscores, uppercase characters, parameters, or long URLs.
Security – Discover insecure pages, mixed content, insecure forms, missing security headers and more.
External Links – View all external links, their status codes and source pages.
Blocked Resources – View & audit blocked resources in rendering mode.
Blocked URLs – View & audit URLs disallowed by the robots.txt protocol.
Redirects – Permanent, temporary, JavaScript redirects & meta refreshes.
Errors – Client errors such as broken links & server errors (No responses, 4XX client & 5XX server errors).
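
As an illustration of the Blocked URLs check, the sketch below uses Python's standard `urllib.robotparser` to list which URLs a robots.txt file disallows. The `blocked_urls` helper, the sample rules and the example.com URLs are hypothetical, and Python's parser may not interpret every directive the same way a given search engine or crawler does.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    # parse() accepts the file content as an iterable of lines,
    # so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt rules used only for illustration.
EXAMPLE_ROBOTS = """User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    candidates = [
        "https://example.com/",
        "https://example.com/private/report.html",
    ]
    for url in blocked_urls(EXAMPLE_ROBOTS, candidates):
        print("blocked:", url)
```

Passing the rules in as text makes it easy to test an edited robots.txt before deploying it, which is the same idea as the custom robots.txt feature listed earlier.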
