gasilwb.blogg.se - Screaming frog seo spider linux

#Screaming frog seo spider linux code

Crawl Comparison – Compare crawl data to see changes in issues and opportunities to track technical SEO progress. Compare site structure, detect changes in key elements and metrics and use URL mapping to compare staging against production sites.

Spelling & Grammar – Spell & grammar check your website in over 25 different languages.

Structured Data & Validation – Extract & validate structured data against specifications and Google search features.

Visualisations – Analyse the internal linking and URL structure of the website, using the crawl and directory tree force-directed diagrams and tree graphs.

XML Sitemap Analysis – Crawl an XML Sitemap independently or part of a crawl, to find missing, non-indexable and orphan pages.

AMP Crawling & Validation – Crawl AMP URLs and validate them, using the official integrated AMP Validator.

Store & View HTML & Rendered HTML – Essential for analysing the DOM.

Rendered Screen Shots – Fetch, view and analyse the rendered pages crawled.

Custom robots.txt – Download, edit and test a site’s robots.txt using the new custom robots.txt.

XML Sitemap Generation – Create an XML sitemap and an image sitemap using the SEO spider.

External Link Metrics – Pull external link metrics from Majestic, Ahrefs and Moz APIs into a crawl to perform content audits or profile links.

PageSpeed Insights Integration – Connect to the PSI API for Lighthouse metrics, speed opportunities, diagnostics and Chrome User Experience Report (CrUX) data at scale.

Google Search Console Integration – Connect to the Google Search Analytics and URL Inspection APIs and collect performance and index status data in bulk.

Google Analytics Integration – Connect to the Google Analytics API and pull in user and conversion data directly during a crawl.

Custom Extraction – Scrape any data from the HTML of a URL using XPath, CSS Path selectors or regex.
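To give a feel for what regex-based custom extraction does, here is a minimal Python sketch using only the standard library. It is not how the SEO Spider is implemented; the HTML snippet, the `extract_all` helper and both patterns are made up for illustration.

```python
import re

# Hypothetical HTML standing in for the source of a crawled page.
HTML = """
<html><head><title>Example</title>
<script>gtag('config', 'G-ABC123XYZ9');</script>
</head><body><span class="price">19.99</span></body></html>
"""

def extract_all(pattern: str, html: str) -> list[str]:
    """Return the first capture group of every match of `pattern` in `html`."""
    return [m.group(1) for m in re.finditer(pattern, html)]

# Pull out an analytics ID and a price, each with its own pattern.
ga_ids = extract_all(r"'(G-[A-Z0-9]+)'", HTML)
prices = extract_all(r'<span class="price">([^<]+)</span>', HTML)

print(ga_ids)
print(prices)
```

Regex is fine for small, predictable fragments like these; for anything that depends on document structure, XPath or CSS selectors are usually the more robust choice.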

Custom Source Code Search – Find anything you want in the source code of a website! Whether that’s Google Analytics code, specific text, or code etc.

Custom HTTP Headers – Supply any header value in a request, from Accept-Language to cookie.

User-Agent Switcher – Crawl as Googlebot, Bingbot, Yahoo! Slurp, mobile user-agents or your own custom UA.
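As a rough illustration of custom headers and user-agent switching, the sketch below builds a request with Python's standard `urllib.request`. The `build_request` helper, the `MyCrawler/1.0` user-agent string and the example URL are assumptions for demonstration, not values taken from the SEO Spider.

```python
import urllib.request

def build_request(url, user_agent, extra_headers=None):
    """Build (but do not send) a request with a custom UA and extra headers."""
    headers = {"User-Agent": user_agent}
    headers.update(extra_headers or {})
    return urllib.request.Request(url, headers=headers)

# Hypothetical crawler identity plus an Accept-Language header.
req = build_request(
    "https://example.com/",
    "MyCrawler/1.0",
    {"Accept-Language": "de-DE"},
)

# urllib stores header names capitalised as "User-agent", "Accept-language".
print(req.get_header("User-agent"))
print(req.get_header("Accept-language"))
```

Passing `req` to `urllib.request.urlopen` would send the request with these headers, which is how you could check whether a site serves different content per language or user-agent.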

Images – All URLs with the image link & all images from a given page. Images over 100 KB, missing alt text, alt text over 100 characters.

AJAX – Select to obey Google’s now deprecated AJAX Crawling Scheme.

Rendering – Crawl JavaScript frameworks like AngularJS and React, by crawling the rendered HTML after JavaScript has executed.

Outlinks – View all pages a URL links out to, as well as resources.

Inlinks – View all pages linking to a URL, the anchor text and whether the link is follow or nofollow.

hreflang Attributes – Audit missing confirmation links, inconsistent & incorrect language codes, non-canonical hreflang and more.
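A "missing confirmation link" means page A lists page B as an hreflang alternate, but B does not list A back. The check can be sketched in a few lines of Python; the `missing_return_links` function and the sample URLs are hypothetical, and the input is assumed to be a mapping already collected from a crawl.

```python
def missing_return_links(hreflang):
    """Return (page, alternate) pairs where the alternate does not link back.

    `hreflang` maps each page URL to the set of URLs it declares as alternates.
    """
    missing = []
    for page, alternates in hreflang.items():
        for alt in alternates:
            if alt != page and page not in hreflang.get(alt, set()):
                missing.append((page, alt))
    return sorted(missing)

# Assumed sample data: /fr/ never confirms the link from /en/.
annotations = {
    "/en/": {"/en/", "/de/", "/fr/"},
    "/de/": {"/de/", "/en/"},
    "/fr/": {"/fr/"},
}
print(missing_return_links(annotations))
```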

Redirect Chains – Discover redirect chains and loops.
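Detecting chains and loops boils down to following each redirect until it stops or revisits a URL. The Python sketch below works on an assumed in-memory map of source URL to redirect target; `trace_redirects`, the `max_hops` limit and the sample paths are illustrative only.

```python
def trace_redirects(start, redirects, max_hops=10):
    """Follow `start` through the `redirects` map.

    Returns (path, status) where status is "ok", "loop" or "too_long".
    """
    path = [start]
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        path.append(current)
        if current in seen:
            return path, "loop"      # we came back to a URL already visited
        seen.add(current)
        if len(path) - 1 > max_hops:
            return path, "too_long"  # give up on very long chains
    return path, "ok"

# Assumed sample data: /a -> /b -> /c is a chain, /x <-> /y is a loop.
redirect_map = {"/a": "/b", "/b": "/c", "/x": "/y", "/y": "/x"}
print(trace_redirects("/a", redirect_map))
print(trace_redirects("/x", redirect_map))
```

Any path with more than two entries and status "ok" is a chain worth collapsing into a single redirect.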

Follow & Nofollow – View meta nofollow, and nofollow link attributes.

Pagination – View rel="next" and rel="prev" attributes.

X-Robots-Tag – See directives issued via the HTTP header.
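For a sense of what reading X-Robots-Tag directives involves, here is a simplified Python sketch. It assumes a plain comma-separated header value and a dict with the exact key `X-Robots-Tag`; user-agent prefixes and `unavailable_after` dates are deliberately not handled, and both function names are made up.

```python
def parse_x_robots_tag(value):
    """Split a simple X-Robots-Tag value into a set of lowercase directives."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}

def is_indexable(headers):
    """Return False if the header carries noindex (or none, which implies it)."""
    directives = parse_x_robots_tag(headers.get("X-Robots-Tag", ""))
    return not ({"noindex", "none"} & directives)

# Assumed sample response headers.
print(parse_x_robots_tag("NoIndex, nofollow"))
print(is_indexable({"X-Robots-Tag": "noindex, nofollow"}))
print(is_indexable({"Content-Type": "text/html"}))
```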

Canonicals – Link elements & canonical HTTP headers.

Meta Refresh – Including target page and time delay.

Meta Robots – Index, noindex, follow, nofollow, noarchive, nosnippet etc.

H2 – Missing, duplicate, long, short or multiple headings.

H1 – Missing, duplicate, long, short or multiple headings.

Word Count – Analyse the number of words on every page.

Crawl Depth – View how deep a URL is within a website’s architecture.

Last-Modified Header – View the last modified date in the HTTP header.

Response Time – View how long pages take to respond to requests.

Meta Keywords – Mainly for reference or regional search engines, as they are not used by Google, Bing or Yahoo.

Meta Description – Missing, duplicate, long, short or multiple descriptions.

Page Titles – Missing, duplicate, long, short or multiple title elements.

Duplicate Pages – Discover exact and near duplicate pages using advanced algorithmic checks.

URL Issues – Non ASCII characters, underscores, uppercase characters, parameters, or long URLs.

Security – Discover insecure pages, mixed content, insecure forms, missing security headers and more.

External Links – View all external links, their status codes and source pages.

Blocked Resources – View & audit blocked resources in rendering mode.

Blocked URLs – View & audit URLs disallowed by the robots.txt protocol.

Redirects – Permanent, temporary, JavaScript redirects & meta refreshes.

Errors – Client errors such as broken links & server errors (No responses, 4XX client & 5XX server errors).
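The Blocked URLs check listed above can be approximated with Python's standard `urllib.robotparser`. This is only a sketch: the robots.txt content, the `MyCrawler` user-agent and the URLs are invented, and Python's parser may not match every search engine's matching rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed from memory rather than fetched.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Assumed list of crawled URLs to test against the rules.
urls = [
    "https://example.com/",
    "https://example.com/private/report.html",
]
blocked = [u for u in urls if not parser.can_fetch("MyCrawler", u)]
print(blocked)
```

Swapping in an edited `ROBOTS_TXT` string is a quick way to test a rule change before it goes live.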