Documentation
How SiteInspect works, what we check, and how to interpret your results.
Getting started
Enter any page URL on the homepage. We fetch the page and run checks for SEO, accessibility, compliance, security, and performance. No account or install is required for a free single-URL scan.
- Use a full URL (e.g. `https://example.com/page`).
- Free scan: we analyze that one URL and return a readiness score plus the top 3–5 critical findings.
- For multi-page scans and the full report (prioritized list, screenshots, PDF), use a paid plan via the dashboard.
What we scan
Findings are grouped into categories. Depending on scan mode (fetch-only vs browser), you may see different checks.
SEO
Title and length, meta description and length, canonical URL, Open Graph (title, image, description), Twitter card tags, html lang, meta robots noindex, multiple H1s, structured data (JSON-LD), favicon, robots.txt, sitemap, and AI crawler hints in robots.txt.
Accessibility (WCAG, ADA, Section 508, EAA)
Findings align with WCAG 2.1, ADA, Section 508, and EAA expectations. Checks include main landmark presence, images missing alt text, inputs missing labels, missing or skipped heading levels, focus visibility, a skip link, and contrast hints. For form validation and required fields, we check that required form controls have accessible names and (in browser scans) that invalid controls expose aria-invalid and link error messages via aria-describedby, without submitting forms. Reports map accessibility findings to WCAG success criteria with “Learn more” links. Browser scans add LCP/CLS, tap targets, and the validation checks above.
Compliance
Privacy policy link, terms of service link, and cookie-consent presence. Copy in findings may reference GDPR/CCPA expectations; we do not provide legal advice.
Security
HSTS, X-Content-Type-Options, CSP, HTTPS redirect, mixed content, X-Frame-Options, Referrer-Policy, Permissions-Policy, staging/localhost exposure, possible key/secret exposure, stack/version disclosure, and SSL certificate validity. We detect what is serving the site (backend) from Server and X-Powered-By headers and from page content (e.g. .php URLs, Django CSRF tokens). When PHP, Python/Django, or Go (or gunicorn, nginx, etc.) are detected with a version, we look up known CVEs via OSV and report them so you can patch or upgrade.
Reliability & performance
HTTP reachability (2xx/3xx), 4xx/5xx handling, broken links, viewport meta tag, preconnect hints. Browser scans may add LCP, CLS, render-blocking resources, and third-party impact.
PWA & UX / Mobile
Web app manifest (name, start URL, icons), tap targets, cookie banner presence and accept/reject options. Platform-specific checks (e.g. Shopify, WordPress) when detected.
Scrapeability
How easy the site is to scrape with simple HTTP or browser automation. We check robots.txt (restrictive vs permissive for User-agent: *), rate limits (e.g. 429), and challenge/block pages (e.g. Cloudflare “Just a moment”, CAPTCHA, “enable JavaScript”) via both fetch and headless browser. Each scan gets a scrapeability score (0–100; higher = easier to scrape) shown in the report and on the scans list. These checks run once per origin, so multi-page scans don’t repeat them.
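As a rough illustration of the fetch-side half of this check, the sketch below flags a response that looks rate-limited or challenge-walled. The marker strings and the 429 check are illustrative examples only, not SiteInspect’s exact heuristics.

```python
# Illustrative sketch: detecting a challenge/block page from a plain HTTP
# fetch. Marker strings and status handling are examples, not the exact
# heuristics SiteInspect uses.

CHALLENGE_MARKERS = (
    "just a moment",        # Cloudflare interstitial title
    "captcha",              # generic CAPTCHA widgets
    "enable javascript",    # JS-wall pages
)

def looks_blocked(status_code: int, body: str) -> bool:
    """Return True if the response resembles a rate limit or challenge page."""
    if status_code == 429:  # explicit rate limiting
        return True
    lowered = body.lower()
    return any(marker in lowered for marker in CHALLENGE_MARKERS)
```

A site that trips checks like these would score lower on scrapeability; a permissive robots.txt and clean 200 responses would score higher.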
Website auditing
Paid scans include same-site broken link checks. We follow same-origin links from scanned pages and report any that return 4xx or 5xx. The full report has a “Website auditing — broken links” section listing each broken URL so you can fix or remove them and improve trust and crawlability.
For multi-page scans, we optionally discover URLs from your sitemap (when listed in robots.txt) and merge them with homepage links, so you get broader coverage without manually listing every path.
Authenticated scans (pages behind login)
You can scan URLs that require a logged-in session by providing session cookies when you start a scan. In the dashboard, open New scan, expand “Use authenticated session (optional)”, and paste a JSON array of cookies (name, value, domain, and optionally path). Export them from your browser (e.g. DevTools → Application → Cookies) after logging in to your site.
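The cookie array the scan form expects can be sketched like this; the field names (name, value, domain, optional path) come from the form, while the cookie names, values, and domain below are placeholders for your own session cookies.

```python
import json

# Example of the cookie array shape for "Use authenticated session":
# a JSON array of objects with name, value, domain, and optionally path.
# The cookie names, values, and domain are placeholders.
cookies = [
    {"name": "sessionid", "value": "abc123", "domain": "app.example.com", "path": "/"},
    {"name": "csrftoken", "value": "xyz789", "domain": "app.example.com"},
]

payload = json.dumps(cookies, indent=2)
print(payload)  # paste this JSON into the scan form
```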
We never store cookies permanently. They are held only for the duration of that scan and are deleted as soon as the scan completes or fails. They are not saved to our database, logs, or any persistent store.
Do not use an account with elevated or admin privileges. Use a limited or test account when exporting cookies. If the session were ever exposed, a low-privilege account limits the impact.
Free vs paid
| Feature | Free | Paid |
|---|---|---|
| URLs per scan | Single URL | Multiple pages (same domain) |
| Output | Readiness score + top 3–5 critical findings | Full prioritized report + readiness & scrapeability scores |
| Screenshots / PDF | — | Screenshots per page (and PDF export) |
| Rate limit | One scan per 12 hours | Subscription or scan credits |
| Account | Not required | Required (dashboard + organization) |
Billing & credits
Paid scans use either subscription allowance (monthly/yearly plan with a scan limit per period) or scan credits (one-time purchases). One scan deducts one allowance or one credit when the scan starts, not when it completes.
- Refund on failure: If a paid scan fails (error) or times out, we automatically refund the scan so your credit or subscription slot is restored.
- Timeouts: Free scans time out after 15 minutes; paid scans after 45 minutes. After timeout the scan is marked failed and (for paid) refunded.
- Dashboard, CI, and MCP all use the same scan pool per organization.
Domain verification
Domain verification is optional. You can run paid scans on any URL without verifying a domain. If you verify a domain in Settings → Domains, only your organization can scan that domain; other organizations cannot. Unverified domains can be scanned by any paid customer.
Understanding the report
While a scan is running, the dashboard shows real-time progress: how many pages have been scanned (e.g. 5 / 12), a progress bar, and a list of completed pages (URL and device: desktop/mobile). The page auto-updates every few seconds; when the scan finishes, you’re redirected to the full report.
The full report (paid) groups findings by category and severity. Each finding includes:
- Category — e.g. SEO, Accessibility, Compliance, Security, Scrapeability.
- Code — unique identifier (e.g. `MISSING_META_DESCRIPTION`, `A11Y_IMAGES_MISSING_ALT`).
- Title & summary — what’s wrong and why it matters.
- Fix guidance — recommended action (e.g. add a meta description, add alt text).
- Pages — which URLs the issue was found on; long lists can be expanded via “and X more” so you can see every page.
The report includes an Executive summary with two scores: Readiness score (0–100, from findings across pages) and Scrapeability score (0–100; how easy the site is to scrape: robots.txt, challenge/block detection, rate limits). You also get:
- Top issues, deduplicated by type, with page counts.
- A dedicated Accessibility & compliance (WCAG, ADA, Section 508, EAA) section with WCAG success criterion IDs and “Learn more” links.
- A Scrapeability section.
- A Website auditing — broken links table.
- AI crawler & AEO status (allowed/blocked/unspecified).
Paid reports include screenshots per page (when the browser scan succeeds or recovers a partial screenshot). Use the report to prioritize fixes; for formal compliance claims, consider a full audit and legal advice.
CI / Pipeline
You can run SiteInspect in GitHub Actions or GitLab CI: trigger a scan against a preview or staging URL, then use the findings to fail the build, comment on the PR with the report link, or apply safe auto-fixes and open a new PR.
API for CI
- `POST /v1/scans` — body `{"url": "https://..."}`; returns `id`, `status`.
- `GET /v1/scans/:id` — poll until `status === "done"`.
- `GET /v1/scans/:id/findings` — returns all findings with `page_url`, `code`, `title`, `fix_json`.

No auth is required for public scans.
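The poll-and-gate flow can be sketched in Python. The base URL is a placeholder, and the numeric `severity` field on findings (1 = critical, 2 = high, as in the alerting section) is an assumption for illustration; adjust to the actual response shape.

```python
import json
import time
import urllib.request

API = "https://siteinspect.example/v1"  # placeholder base URL


def get_json(url: str) -> dict:
    """Fetch a URL and decode the JSON response."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_scan(scan_id: str, fetch=get_json, interval: float = 5.0) -> dict:
    """Poll GET /v1/scans/:id until status == "done"; return the scan object."""
    while True:
        scan = fetch(f"{API}/scans/{scan_id}")
        if scan["status"] == "done":
            return scan
        time.sleep(interval)


def blocking_findings(findings: list[dict], max_severity: int = 2) -> list[dict]:
    """Findings severe enough (assumed 1 = critical, 2 = high) to fail a build."""
    return [f for f in findings if f.get("severity", 99) <= max_severity]
```

In CI you would create the scan with `POST /v1/scans`, call `wait_for_scan`, fetch findings, and exit non-zero if `blocking_findings` returns anything.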
CLI and workflows
The siteinspect-ci package provides a small CLI: run a scan, poll until complete, and write findings to a JSON file. Optionally run a fix step that applies safe edits (e.g. viewport meta, html lang) to source files using a urlToSource config, then commit and open a PR from your pipeline. See the siteinspect-ci README in the repo for GitHub Action and GitLab CI examples.
Cursor (MCP)
The siteinspect-mcp package is an MCP server for Cursor. Add it in Cursor Settings → MCP (stdio), then you can ask the AI to run a SiteInspect scan on a URL and apply fixes to your current codebase. See the siteinspect-mcp README for setup.
Alerts on high-severity findings
When a scan completes and has at least one finding with severity 1 (critical) or 2 (high), the backend can send alerts to:
- PagerDuty — trigger an incident (Events API v2; use routing key from a service integration).
- Slack — post a message to a channel via an Incoming Webhook URL.
- JIRA — create an issue in a project (base URL, email, API token, project key).
- ClickUp — create a task in a list (API token, list ID).
Each destination is optional. Configure via environment variables (e.g. PAGERDUTY_ROUTING_KEY, SLACK_ALERT_WEBHOOK_URL, JIRA_*, CLICKUP_*). See docs/ALERTING.md in the repo for the full list and security notes.
Pentesting
SiteInspect readiness scans are automated: they check security headers, exposure risks, stack disclosure, and common misconfigurations as part of every scan. They are not a substitute for a full security assessment.
We offer a narrowly scoped automated pentest (targeted checks that apply to most sites and APIs: headers, TLS, CORS, misconfigurations) and hands-on penetration testing: our team runs exploit testing (e.g. IDOR, auth bypass, session issues, business logic), tests authentication and authorization, and produces a detailed report. There are three hands-on tiers — Basic, Professional, Enterprise — plus the automated option; see Pricing. Use readiness scans for ongoing checks; consider a pentest for pre-launch, compliance, or periodic deep dives.
Load testing
We offer two ways to test how your app or API performs under load: automated (one run per credit; you create a load test in the dashboard with a target URL and we run it and deliver a report) and hands-on (our team runs the tests and delivers a detailed report with throughput, latency, and recommendations).
Automated load test
Purchase load test credits in Settings → Billing (per organization). In the dashboard, open Load testing, choose your organization, enter the target URL, and click Create load test. One credit is used when the test is created. When the test completes, the report appears in the same page; you can view or download it.
Hands-on load testing
Three tiers: Basic (1–2 days, single app or API), Standard (3–5 days, multiple scenarios and a performance baseline), and Enterprise (full assessment, SLA, ongoing support). Purchase from Settings → Billing; after payment we create an engagement and our team will coordinate with you to run the tests and deliver the report. Reports are available in the Load testing section when the engagement is complete.
See Pricing for current prices.
Glossary
- Readiness score
- A single number (0–100) reflecting how well the scanned pages meet our checks; higher is better. Derived from severity and count of findings, averaged across pages.
- Scrapeability score
- A 0–100 score for how easy the site is to scrape (HTTP or browser automation). Based on robots.txt restrictiveness, challenge/block page detection, and rate limits. Shown in the report and on the scans list for paid scans.
- WCAG / ADA / Section 508 / EAA
- Web Content Accessibility Guidelines, Americans with Disabilities Act, Section 508 (US federal), and European Accessibility Act. Our accessibility findings map to WCAG 2.1 success criteria; we do not certify compliance.
- Canonical URL
- The preferred URL for a page, set via `rel="canonical"` to avoid duplicate content in search.
- Landmark
- ARIA or HTML5 regions (e.g. `main`, `nav`) that help assistive technologies navigate the page.
- Fetch vs browser scan
- Fetch-only scans use the HTML and headers we get from a request. Browser scans run in a headless browser and can detect runtime issues (e.g. LCP, focus, contrast).
- Scan timeout
- Free scans are marked failed after 15 minutes; paid after 45 minutes. Paid scans that fail or time out are automatically refunded (credit or subscription slot).