diff --git a/.editorconfig b/.editorconfig
new file mode 100644
index 0000000..e1db26e
--- /dev/null
+++ b/.editorconfig
@@ -0,0 +1,14 @@
+# EditorConfig — consistent whitespace across editors. https://editorconfig.org
+root = true
+
+[*]
+indent_style = space
+indent_size = 2
+end_of_line = lf
+charset = utf-8
+trim_trailing_whitespace = true
+insert_final_newline = true
+
+# Two trailing spaces are a hard line break in Markdown — don't strip them.
+[*.md]
+trim_trailing_whitespace = false
diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..022ef39
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,23 @@
+# Normalize line endings on commit. Force LF on everything that runs on Linux CI
+# runners or in a shell, so a checkout on another OS can never introduce CRLF
+# breakage in the scripts or the test corpus.
+* text=auto
+
+*.sh text eol=lf
+*.mjs text eol=lf
+*.js text eol=lf
+*.ts text eol=lf
+*.json text eol=lf
+*.yml text eol=lf
+*.yaml text eol=lf
+*.md text eol=lf
+*.txt text eol=lf
+*.cff text eol=lf
+
+# Binary assets — never normalize or diff as text.
+*.png binary
+*.jpg binary
+*.jpeg binary
+*.gif binary
+*.webp binary
+*.pdf binary
diff --git a/.github/ISSUE_TEMPLATE/boundary_or_gold.yml b/.github/ISSUE_TEMPLATE/boundary_or_gold.yml
new file mode 100644
index 0000000..27cb50a
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/boundary_or_gold.yml
@@ -0,0 +1,52 @@
+name: Boundary failure / gold case
+description: The engine answered when it should have declined, refused when it had grounded evidence, or claimed "supported" while citing only hints.
+labels: ["gold-case"]
+body:
+  - type: markdown
+    attributes:
+      value: |
+        This is the most valuable thing you can file. The rule that comes with
+        it (the one the repo holds itself to): the fix goes into the corpus, the
+        scoring, or the prompt — **never** a special case for this one question.
+        See [CONTRIBUTING](https://github.com/lukefwalton/answer-engine/blob/main/CONTRIBUTING.md)
+        and [eval/README.md](https://github.com/lukefwalton/answer-engine/blob/main/eval/README.md).
+
+        ⚠️ **A private-note leak is not a gold case — it's a security issue.** If
+        private text actually reached an answer, that's the no-leak boundary
+        failing: report it privately via
+        [SECURITY.md](https://github.com/lukefwalton/answer-engine/blob/main/SECURITY.md),
+        not here. This form is for **non-sensitive** grounding/decline
+        regressions — please describe your corpus minimally and don't paste
+        private content.
+  - type: dropdown
+    id: failure
+    attributes:
+      label: What broke?
+      options:
+        - Answered when it should have declined
+        - Claimed "supported" while citing only hints (ungrounded)
+        - Refused when it had grounded evidence to answer
+        - Something else (non-sensitive — leaks go to SECURITY.md)
+    validations:
+      required: true
+  - type: textarea
+    id: query
+    attributes:
+      label: The question
+      description: The exact query — but only if it's non-sensitive. If it involves private text, paraphrase or redact it and say that you did (issues are public).
+    validations:
+      required: true
+  - type: textarea
+    id: corpus
+    attributes:
+      label: Corpus shape
+      description: What the engine was pointed at (your own corpus or the bundled example). Describe the shape — don't paste private content. If private text leaked into an answer, stop and use SECURITY.md instead of this form.
+    validations:
+      required: true
+  - type: textarea
+    id: expected
+    attributes:
+      label: What should have happened?
+      description: The behavior it should have had — the gold answer (cite / decline / route).
+    validations:
+      required: true
diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml
new file mode 100644
index 0000000..1d26ce4
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/bug_report.yml
@@ -0,0 +1,55 @@
+name: Bug report
+description: Something in the engine or CLI is broken.
+labels: ["bug"]
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thanks for reporting a bug. For behavior where the engine answered when
+        it should have declined, refused with grounded evidence, or cited
+        ungrounded, use the **Boundary failure / gold case** template instead —
+        that's the highest-signal report here.
+
+        Found a **security issue** — a boundary bypass, a **leaked private
+        note**, or a leaked key? Don't file it here — see
+        [SECURITY.md](https://github.com/lukefwalton/answer-engine/blob/main/SECURITY.md).
+        A private-note leak is the no-leak invariant failing, so it's a security
+        disclosure, not a public report.
+  - type: textarea
+    id: what-happened
+    attributes:
+      label: What happened?
+      description: What went wrong, and what did you expect instead?
+    validations:
+      required: true
+  - type: textarea
+    id: repro
+    attributes:
+      label: Steps to reproduce
+      description: If a repro query is sensitive, paraphrase or redact it — issues are public.
+      placeholder: |
+        1. npm run index …
+        2. npm run ask -- "…"
+        3. See …
+    validations:
+      required: true
+  - type: dropdown
+    id: area
+    attributes:
+      label: Where does it happen?
+      options:
+        - Indexing (npm run index)
+        - Asking (npm run ask)
+        - Eval (npm run eval)
+        - Demo
+        - Typecheck / build / tests
+        - Somewhere else / not sure
+    validations:
+      required: true
+  - type: input
+    id: environment
+    attributes:
+      label: Node version & OS
+      placeholder: "e.g. Node 24, macOS 15"
+    validations:
+      required: false
diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml
new file mode 100644
index 0000000..df925c4
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,8 @@
+blank_issues_enabled: false
+contact_links:
+  - name: The contribution bar (read first)
+    url: https://github.com/lukefwalton/answer-engine/blob/main/.github/STANDARDS.md
+    about: Most additions are correctly out of scope. This is the rubric every PR is read against.
+  - name: Report a security issue
+    url: https://github.com/lukefwalton/answer-engine/blob/main/SECURITY.md
+    about: Boundary bypasses and key leaks — report privately, not as a public issue.
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
new file mode 100644
index 0000000..170da90
--- /dev/null
+++ b/.github/pull_request_template.md
@@ -0,0 +1,22 @@
+## What & why
+
+
+
+## Checklist
+
+- [ ] `npm test` passes **without an API key** (the offline CI gate) — no hidden
+      dependency on live calls
+- [ ] `npm run typecheck` passes
+- [ ] No change that makes the eval pass by special-casing a question (fixes go
+      into the corpus, scoring, or prompt)
+- [ ] A boundary stays structural (a type, not a checker someone must remember),
+      if this touches the no-leak path
+- [ ] Touched the prompt / retrieval / validation / repair? Ran the relevant
+      `npm run eval` subset
+- [ ] Read against [`.github/STANDARDS.md`](.github/STANDARDS.md) — the rubric a PR is judged by
diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md
new file mode 100644
index 0000000..239e21a
--- /dev/null
+++ b/CODE_OF_CONDUCT.md
@@ -0,0 +1,132 @@
+# Contributor Covenant Code of Conduct
+
+## Our Pledge
+
+We as members, contributors, and leaders pledge to make participation in our
+community a harassment-free experience for everyone, regardless of age, body
+size, visible or invisible disability, ethnicity, sex characteristics, gender
+identity and expression, level of experience, education, socio-economic status,
+nationality, personal appearance, race, caste, color, religion, or sexual
+identity and orientation.
+
+We pledge to act and interact in ways that contribute to an open, welcoming,
+diverse, inclusive, and healthy community.
+
+## Our Standards
+
+Examples of behavior that contributes to a positive environment for our
+community include:
+
+- Demonstrating empathy and kindness toward other people
+- Being respectful of differing opinions, viewpoints, and experiences
+- Giving and gracefully accepting constructive feedback
+- Accepting responsibility and apologizing to those affected by our mistakes,
+  and learning from the experience
+- Focusing on what is best not just for us as individuals, but for the overall
+  community
+
+Examples of unacceptable behavior include:
+
+- The use of sexualized language or imagery, and sexual attention or advances of
+  any kind
+- Trolling, insulting or derogatory comments, and personal or political attacks
+- Public or private harassment
+- Publishing others' private information, such as a physical or email address,
+  without their explicit permission
+- Other conduct which could reasonably be considered inappropriate in a
+  professional setting
+
+## Enforcement Responsibilities
+
+Community leaders are responsible for clarifying and enforcing our standards of
+acceptable behavior and will take appropriate and fair corrective action in
+response to any behavior that they deem inappropriate, threatening, offensive,
+or harmful.
+
+Community leaders have the right and responsibility to remove, edit, or reject
+comments, commits, code, wiki edits, issues, and other contributions that are
+not aligned to this Code of Conduct, and will communicate reasons for moderation
+decisions when appropriate.
+
+## Scope
+
+This Code of Conduct applies within all community spaces, and also applies when
+an individual is officially representing the community in public spaces.
+Examples of representing our community include using an official email address,
+posting via an official social media account, or acting as an appointed
+representative at an online or offline event.
+
+## Enforcement
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be
+reported to the community leaders responsible for enforcement at
+[luke@lukefwalton.com](mailto:luke@lukefwalton.com). All complaints will be
+reviewed and investigated promptly and fairly.
+
+All community leaders are obligated to respect the privacy and security of the
+reporter of any incident.
+
+## Enforcement Guidelines
+
+Community leaders will follow these Community Impact Guidelines in determining
+the consequences for any action they deem in violation of this Code of Conduct:
+
+### 1. Correction
+
+**Community Impact**: Use of inappropriate language or other behavior deemed
+unprofessional or unwelcome in the community.
+
+**Consequence**: A private, written warning from community leaders, providing
+clarity around the nature of the violation and an explanation of why the
+behavior was inappropriate. A public apology may be requested.
+
+### 2. Warning
+
+**Community Impact**: A violation through a single incident or series of
+actions.
+
+**Consequence**: A warning with consequences for continued behavior. No
+interaction with the people involved, including unsolicited interaction with
+those enforcing the Code of Conduct, for a specified period of time. This
+includes avoiding interactions in community spaces as well as external channels
+like social media. Violating these terms may lead to a temporary or permanent
+ban.
+
+### 3. Temporary Ban
+
+**Community Impact**: A serious violation of community standards, including
+sustained inappropriate behavior.
+
+**Consequence**: A temporary ban from any sort of interaction or public
+communication with the community for a specified period of time. No public or
+private interaction with the people involved, including unsolicited interaction
+with those enforcing the Code of Conduct, is allowed during this period.
+Violating these terms may lead to a permanent ban.
+
+### 4. Permanent Ban
+
+**Community Impact**: Demonstrating a pattern of violation of community
+standards, including sustained inappropriate behavior, harassment of an
+individual, or aggression toward or disparagement of classes of individuals.
+
+**Consequence**: A permanent ban from any sort of public interaction within the
+community.
+
+## Attribution
+
+This Code of Conduct is adapted from the [Contributor Covenant][homepage],
+version 2.1, available at
+[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
+
+Community Impact Guidelines were inspired by
+[Mozilla's code of conduct enforcement ladder][mozilla].
+
+For answers to common questions about this code of conduct, see the FAQ at
+[https://www.contributor-covenant.org/faq][faq]. Translations are available at
+[https://www.contributor-covenant.org/translations][translations].
+
+[homepage]: https://www.contributor-covenant.org
+[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
+[mozilla]: https://github.com/mozilla/diversity
+[faq]: https://www.contributor-covenant.org/faq
+[translations]: https://www.contributor-covenant.org/translations
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 026bd51..ddfa5d9 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -96,3 +96,7 @@ while iterating, never the whole gold set on every pass (see
 - **License.** Contributions are accepted under the repo's
   [Apache-2.0](./LICENSE) license; by opening a PR you agree your contribution is
   licensed under it.
+- **Conduct.** Issues and PRs are governed by the
+  [Code of Conduct](./CODE_OF_CONDUCT.md).
+- **Security.** Found a way past the no-leak boundary, or a leaked key? Report it
+  privately — see [SECURITY.md](./SECURITY.md), not a public issue.
diff --git a/README.md b/README.md
index 2fdc489..ad5d337 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,6 @@
 # Answer Engine: An AI that Says "I Don't Know"
 
+[![Tests](https://github.com/lukefwalton/answer-engine/actions/workflows/test.yml/badge.svg)](https://github.com/lukefwalton/answer-engine/actions/workflows/test.yml)
 [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.20676773.svg)](https://doi.org/10.5281/zenodo.20676773)
 [![License](https://img.shields.io/github/license/lukefwalton/answer-engine)](LICENSE)
 [![Release](https://img.shields.io/github/v/release/lukefwalton/answer-engine)](https://github.com/lukefwalton/answer-engine/releases)
diff --git a/SECURITY.md b/SECURITY.md
new file mode 100644
index 0000000..cb564b1
--- /dev/null
+++ b/SECURITY.md
@@ -0,0 +1,47 @@
+# Security Policy
+
+Answer Engine is a teaching-sized, locally-run example: you clone it, point it at
+a corpus, and invoke CLI scripts (`npm run index`, `npm run ask`, `npm run eval`).
+There is no hosted endpoint and no server to attack. But the whole point of the
+repo is a **security-shaped invariant** — the no-leak boundary — so vulnerability
+reports against that boundary are exactly what's most valuable here.
+
+The boundary: private/unauthored text **cannot reach the model's prompt** along
+the typed path (`src/no-leak.ts` makes the prohibited move structurally
+inexpressible — a type with no field for private prose), and every answer either
+cites retrieved evidence or refuses. A "vulnerability," for this repo, is a way
+to break that.
+
+## Reporting a vulnerability
+
+Please **do not open a public issue** for a security problem. Instead:
+
+1. Email **[luke@lukefwalton.com](mailto:luke@lukefwalton.com)** with a
+   description of the issue.
+2. Include the corpus shape, the query, and the boundary that broke.
+3. You'll get an acknowledgement within a few days. Please allow a reasonable
+   window to ship a fix before disclosing publicly.
+
+## In scope
+
+- **Boundary bypass:** any path that gets private prose or other unauthored text
+  into the prompt despite `src/no-leak.ts`.
+- **Fabricated grounding:** any path that makes the engine claim an answer is
+  `supported` while citing only hints, or that leaks a private note's contents
+  rather than routing to it.
+- **Prompt injection** through corpus documents or the query that subverts the
+  answer contract (refuse-or-cite).
+- **Secret handling:** leaking the LLM API key read from `.env`, or any script
+  that writes it somewhere it shouldn't.
+
+## Not a security issue
+
+- An answer you think is wrong but that *is* grounded in a citation, or a refusal
+  you disagree with. That's eval/quality — the most useful response is a failing
+  **gold case** (see [`CONTRIBUTING.md`](CONTRIBUTING.md) and
+  [`eval/README.md`](eval/README.md)), not a security report.
+
+## Supported versions
+
+Fixes target the `main` branch (and the latest tagged release / archived
+artifact). This is a reference implementation, not a deployed service.
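
Reviewer note on the SECURITY.md hunk: the phrase "a type with no field for private prose" may be easier to review against with a concrete shape in mind. The sketch below is one possible reading, not the contents of `src/no-leak.ts`. Every name in it (`RawNote`, `PromptEvidence`, `toEvidence`, `buildPrompt`) is hypothetical, and it assumes a private note's id and title are safe to show while its body is not.

```typescript
// Hypothetical sketch of a structural no-leak boundary.
// None of these names are taken from src/no-leak.ts.

/** A corpus note as loaded from disk. It may carry private prose in `body`. */
interface RawNote {
  id: string;
  title: string;
  visibility: "public" | "private";
  body: string;
}

/**
 * The only shape the prompt builder accepts.
 * The "route" variant has no text field, so private prose has nowhere to go.
 */
type PromptEvidence =
  | { kind: "citable"; id: string; title: string; text: string }
  | { kind: "route"; id: string; title: string };

/** Convert a raw note into prompt-safe evidence; a private body is dropped here. */
function toEvidence(note: RawNote): PromptEvidence {
  if (note.visibility === "public") {
    return { kind: "citable", id: note.id, title: note.title, text: note.body };
  }
  // Private notes become a pointer only (assumption: id and title are not sensitive).
  return { kind: "route", id: note.id, title: note.title };
}

/** Build the prompt from evidence only; RawNote is not an accepted input type. */
function buildPrompt(question: string, evidence: PromptEvidence[]): string {
  const lines = evidence.map((e) =>
    e.kind === "citable"
      ? `[${e.id}] ${e.title}: ${e.text}`
      : `[${e.id}] ${e.title} (private note: route the reader here, contents withheld)`
  );
  return [`Question: ${question}`, "Evidence:", ...lines].join("\n");
}
```

Under this reading, a "boundary bypass" report would show a path that reaches the prompt without passing through a function like `toEvidence`, which is why the PR checklist asks that the boundary stay a type and not a checker someone must remember to call.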