Extract every JSON-LD block from your HTML and fail CI when a Product, Article, or Organization is missing the fields Google requires for rich results.
One missing field in your structured data — a Product with no offers, an Article with no datePublished — quietly removes a page from rich-result eligibility. On a site with thousands of pages, nobody notices until the traffic graph dips weeks later. The existing checkers are web apps you paste into by hand or paid monitors that ping you after the fact. schema-guard runs at build time, reads your actual output HTML, and exits non-zero before the broken markup ships.
- Zero runtime dependencies. It reads your package-lock.json and leaves it alone. The JSON-LD extraction is hand-rolled.
- Per-type rules, not a generic JSON schema. It knows that a Product needs price information and an Article needs a publish date, because that is what Google's rich-result docs require.
- Baseline mode. Adopt it on a codebase that already has issues, record them once, and fail CI only on errors you introduce after that.
- Errors vs. warnings. A missing required field is an error and fails the build. A missing recommended field is a warning and does not.
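To make "hand-rolled extraction" concrete, here is a minimal sketch of pulling JSON-LD blocks out of an HTML string with no dependencies. It is an illustration, not schema-guard's actual extractor: the function name `extractJsonLd` and the shape of the returned objects are assumptions made for this example.

```javascript
// Hypothetical helper, not schema-guard's real extractor.
// Finds every <script type="application/ld+json"> block and tries to parse it.
function extractJsonLd(html) {
  const pattern =
    /<script\b[^>]*\btype\s*=\s*["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script\s*>/gi;
  const blocks = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    // 1-based line of the opening <script> tag, useful for error reports.
    const line = html.slice(0, match.index).split('\n').length;
    try {
      blocks.push({ index: blocks.length, line, data: JSON.parse(match[1]), error: null });
    } catch (err) {
      // Invalid JSON is recorded rather than thrown, so one bad block
      // does not hide the others.
      blocks.push({ index: blocks.length, line, data: null, error: err.message });
    }
  }
  return blocks;
}

const sample = [
  '<html><head>',
  '<script type="application/ld+json">',
  '{"@context":"https://schema.org","@type":"Product","name":"Mug"}',
  '</script>',
  '<script type="application/ld+json">{ not json }</script>',
  '</head></html>',
].join('\n');

const found = extractJsonLd(sample);
console.log(found.map((block) => ({ line: block.line, valid: block.error === null })));
```

A JSON string containing an escaped `<\/script>` does not end the block early, because the pattern only stops at a literal `</script>`.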
Run it directly with npx:
npx schema-guard "dist/**/*.html"
Or install it as a dev dependency:
npm install --save-dev schema-guard
Requires Node 18 or newer. No other dependencies are installed.
Point it at built HTML files (globs work) or a live URL:
schema-guard "dist/**/*.html" # validate your build output
schema-guard index.html about.html # specific files
schema-guard --url https://your.site/ # fetch and validate a live page
schema-guard "dist/**/*.html" --json # machine-readable output
This repo ships two example pages that each contain one deliberate mistake, so you can see what a real catch looks like. Run the tool against them:
schema-guard examples/product-page.html examples/article-page.html
Output:
examples/product-page.html
ERROR L15 block#0 <Product> [offers] Product has no "offers", "review", or "aggregateRating". At least one is required for a product rich result. Add an Offer with price and priceCurrency.
WARN L35 block#1 <Organization> [contactPoint] Organization has no "contactPoint". Recommended for customer service / contact knowledge-panel features.
examples/article-page.html
ERROR L16 block#0 <BlogPosting> [datePublished] Article is missing "datePublished". Required for freshness and Top Stories eligibility; use ISO 8601, e.g. "2026-06-01T08:00:00+00:00".
WARN L16 block#0 <BlogPosting> [author] Article has no "author". Recommended; use a Person or Organization with a "name".
schema-guard: 2 error(s), 2 warning(s).
The Product example has a name, an image, and a brand, but no offers — so it would not show a product rich result, and schema-guard catches it. The article has a headline and image but no publish date. The process exits with code 1 because there are errors, which is what stops a CI job.
The complete Organization block in the same product page is not flagged for anything required — it only gets a recommended-field warning. That is the point: it tells you what is actually broken, not everything that is theoretically missing.
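The split between errors and warnings in the report above can be sketched as a small data-driven check. This is an assumption-heavy illustration: the rule-table shape (`required`, `recommended`, `test`) and the function `applyRules` are invented for this example and may not match the real files under src/rules/.

```javascript
// Hypothetical rule table for Product; the real rule files may look different.
const productRules = {
  required: [
    { field: 'name', test: (node) => Boolean(node.name) },
    {
      field: 'offers',
      // Any one of offers, review, or aggregateRating satisfies this rule.
      test: (node) => ['offers', 'review', 'aggregateRating'].some((key) => node[key] != null),
    },
  ],
  recommended: [
    { field: 'image', test: (node) => Boolean(node.image) },
    { field: 'description', test: (node) => Boolean(node.description) },
  ],
};

// A failed required rule becomes an error; a failed recommended rule a warning.
function applyRules(node, rules) {
  const findings = [];
  for (const rule of rules.required) {
    if (!rule.test(node)) findings.push({ severity: 'error', field: rule.field });
  }
  for (const rule of rules.recommended) {
    if (!rule.test(node)) findings.push({ severity: 'warning', field: rule.field });
  }
  return findings;
}

// Mirrors the example page: name, image, and brand, but no offers.
const product = { '@type': 'Product', name: 'Mug', image: 'mug.jpg', brand: 'Acme' };
const productFindings = applyRules(product, productRules);
console.log(productFindings);
```

Keeping the rules as data and the evaluator generic is what lets a new type be added without touching validation logic.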
If you adopt schema-guard on a site that already has structured-data errors, you do not want CI red on day one. Record the current errors as a baseline, then gate only on new ones.
# 1. Snapshot today's errors (run once, commit nothing — the file is yours)
schema-guard "dist/**/*.html" --update-baseline .schema-guard-baseline.json
# 2. In CI, fail only when a NEW error appears
schema-guard "dist/**/*.html" --baseline .schema-guard-baseline.json
With a baseline in place, pre-existing errors are marked (known) and the run passes. The moment a pull request adds a new one, it is marked (NEW) and the run fails:
examples/product-page.html
ERROR L15 block#0 <Product> [offers] ... (known)
ERROR L35 block#1 <Organization> [url] Organization is missing "url". Required so Search can associate the markup with your site. (NEW)
schema-guard: 3 error(s), 2 warning(s). 1 NEW error(s) not in baseline.
Exit code is 1 because of the one new error, even though the two known errors are tolerated. Fix the new one, the build goes green, and the old debt stays visible until you get to it.
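Baseline diffing of this kind can be sketched as a set lookup on a stable fingerprint per error. The fingerprint used here (file, type, and field) is an assumption made for illustration; the actual format of the baseline file is not described in this README, and `fingerprint` and `markAgainstBaseline` are hypothetical names.

```javascript
// Hypothetical fingerprint. Line numbers are left out on purpose so that
// unrelated edits to a page do not turn a known error into a new one.
function fingerprint(error) {
  return [error.file, error.type, error.field].join('|');
}

// Tags each current error as 'known' (already in the baseline) or 'NEW'.
function markAgainstBaseline(errors, baselineFingerprints) {
  const known = new Set(baselineFingerprints);
  return errors.map((error) => ({
    ...error,
    status: known.has(fingerprint(error)) ? 'known' : 'NEW',
  }));
}

// The two errors recorded when the baseline was written.
const baseline = [
  'examples/product-page.html|Product|offers',
  'examples/article-page.html|BlogPosting|datePublished',
];

// A later run: the same two errors plus one introduced by a pull request.
const currentErrors = [
  { file: 'examples/product-page.html', type: 'Product', field: 'offers' },
  { file: 'examples/product-page.html', type: 'Organization', field: 'url' },
  { file: 'examples/article-page.html', type: 'BlogPosting', field: 'datePublished' },
];

const marked = markAgainstBaseline(currentErrors, baseline);
const newCount = marked.filter((error) => error.status === 'NEW').length;
console.log(marked.map((error) => `${error.type} [${error.field}] (${error.status})`));
console.log(`${newCount} NEW error(s) not in baseline.`);
```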
| Code | Meaning |
|---|---|
| 0 | No failing findings. |
| 1 | Failing findings present (errors, or new errors in baseline mode). |
| 2 | Usage or runtime error (bad arguments, file not found, fetch failed). |
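The exit-code rules can be read as one small decision function. The sketch below is an interpretation, not the tool's source: `exitCodeFor` and its parameter names are hypothetical, and the way the --strict option combines with baseline mode is an assumption, since the README does not spell it out.

```javascript
// Hypothetical helper mirroring the exit-code table; not schema-guard's own code.
function exitCodeFor({
  errors = 0,
  warnings = 0,
  newErrors = 0,
  baseline = false,
  strict = false,
  updateBaseline = false,
  runtimeError = false,
} = {}) {
  // Code 2: the run itself failed (bad arguments, missing file, failed fetch).
  if (runtimeError) return 2;
  // Writing a baseline records the current errors and exits 0.
  if (updateBaseline) return 0;
  // In baseline mode only errors absent from the baseline count as failures.
  const failingErrors = baseline ? newErrors : errors;
  // Assumption: strict mode counts every warning as a failure, baseline or not.
  const failingWarnings = strict ? warnings : 0;
  return failingErrors + failingWarnings > 0 ? 1 : 0;
}

console.log(exitCodeFor({ errors: 2, warnings: 2 }));
console.log(exitCodeFor({ errors: 2, warnings: 2, baseline: true, newErrors: 0 }));
console.log(exitCodeFor({ errors: 3, warnings: 2, baseline: true, newErrors: 1 }));
console.log(exitCodeFor({ errors: 0, warnings: 2, strict: true }));
```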
| Option | Effect |
|---|---|
| --url <u> | Fetch and validate a live URL instead of files. |
| --json | Output machine-readable JSON. |
| --baseline <file> | Fail only on errors not present in the baseline file. |
| --update-baseline <file> | Write current errors to the baseline and exit 0. |
| --strict | Treat warnings as failures too. |
| --no-warn | Hide warnings from the report (still counted in JSON). |
| -h, --help | Show help. |
schema-guard validates the types that drive the most common rich results:
| Type | Required (error if absent) | Recommended (warning) |
|---|---|---|
| Product (+ subtypes) | name; at least one of offers / review / aggregateRating; a price needs priceCurrency | image, description, offers.availability |
| Article / NewsArticle / BlogPosting | headline, image, datePublished | author, dateModified, publisher |
| Organization (+ LocalBusiness, Corporation) | name, url, logo | sameAs, contactPoint |
Every node is also checked for valid JSON, a present @context, and a present @type. A type schema-guard does not cover is reported as a warning and skipped — it does not pretend to validate something it has no rules for.
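A rough sketch of these node-level checks follows, including how nodes inside an @graph might be flattened first. The names `flattenNodes`, `checkNode`, and `COVERED_TYPES` are hypothetical, the covered-type list is simplified (no subtype lookup), and treating a missing @context or @type as an error rather than a warning is an assumption.

```javascript
// Simplified, hypothetical list of covered types (real subtype handling omitted).
const COVERED_TYPES = new Set([
  'Product', 'Article', 'NewsArticle', 'BlogPosting',
  'Organization', 'LocalBusiness', 'Corporation',
]);

// Turns a parsed JSON-LD block into a flat list of nodes. Nodes inside
// an @graph inherit the @context of the object that wraps them.
function flattenNodes(data) {
  const top = Array.isArray(data) ? data : [data];
  return top
    .flatMap((item) =>
      item && Array.isArray(item['@graph'])
        ? item['@graph'].map((node) => ({ '@context': item['@context'], ...node }))
        : [item]
    )
    .filter((node) => node !== null && typeof node === 'object');
}

function checkNode(node) {
  const findings = [];
  // Assumption: both of these are treated as errors.
  if (!node['@context']) findings.push({ severity: 'error', field: '@context' });
  if (!node['@type']) {
    findings.push({ severity: 'error', field: '@type' });
    return findings; // without a type there are no per-type rules to apply
  }
  // @type may be a single string or an array of strings.
  const types = [].concat(node['@type']);
  if (!types.some((type) => COVERED_TYPES.has(type))) {
    findings.push({ severity: 'warning', field: '@type', note: 'no rules for this type; skipped' });
  }
  return findings;
}

const graphBlock = {
  '@context': 'https://schema.org',
  '@graph': [
    { '@type': 'Product', name: 'Mug' },
    { '@type': 'Event', name: 'Launch party' },
    { name: 'Untyped node' },
  ],
};

const nodeReport = flattenNodes(graphBlock).map(checkNode);
console.log(JSON.stringify(nodeReport));
```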
The requirement tables live in plain data files under src/rules/. To extend coverage, add a src/rules/<type>.js that exports a rule set, register it in src/rules/index.js, and the validator picks it up — no logic to change. The field rules trace to Google Search Central's structured-data documentation; treat the severities as a starting point and adjust them in those files for your own use case.
The validator is exported, so you can run it inside your own scripts:
import { validateHtml, summarize } from 'schema-guard';
const findings = validateHtml(htmlString, 'page.html');
const { errors, warnings } = summarize(findings);
if (errors > 0) process.exit(1);
GitHub Actions, against your build output:
- run: npm run build
- run: npx schema-guard "dist/**/*.html" --baseline .schema-guard-baseline.json
npm test # node --test, no test runner to install
The tests cover extraction (multiple blocks, escaped </script>, @graph, invalid JSON) and validation (every required/recommended field per type, baseline diffing, and the planted errors in the example files).
MIT. See LICENSE.
Built by Velkina — https://velkina.com