7 Commits

Author SHA1 Message Date
Alexandre Teixeira b27b6fc0b6 test: report cwd in order-sensitivity runner 2026-06-12 10:22:33 +02:00
Alexandre Teixeira 3372539e74 test: add report-only order-sensitivity runner 2026-06-12 10:22:33 +02:00
catalini82 9d7a3d66c0 test(memory): cover owner isolation for memory search
Co-authored-by: Cata <cata@bigjohn.local>
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 22:21:30 +01:00
Rolly Calma 20cf94f53d fix(platform): read proc version with utf-8
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 21:58:22 +01:00
muhamed hamed 3b3c0d6254 fix: detect HuggingFace token when downloading cookbook models (#3459)
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 21:53:16 +01:00
Mazen Tamer Salah f5c1eb4b9d fix(settings): degrade load_features to defaults on PermissionError
load_settings() already catches PermissionError, but load_features() caught only
FileNotFoundError/JSONDecodeError/ValueError. An existing-but-unreadable
data/features.json (e.g. root-owned after a deploy) therefore raised instead of
falling back to DEFAULT_FEATURES, taking down GET /api/auth/features and anything
that reads feature flags. Add PermissionError to the except tuple to match
load_settings().

Adds tests/test_load_features_permission_error.py.

Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 21:20:10 +01:00
nopoz 93825a505c ci: security scanning suite and governance (consolidates #305-310) (#1314)
* ci: add security scanning suite and governance

Consolidates the security CI work into one reviewable change. Adds, as
separate workflow files under .github/workflows/:

- secret-scan.yml      gitleaks (pinned + checksum-verified), full history
- workflow-security.yml actionlint + zizmor, audits the workflows themselves
- dependency-review.yml PR dependency gate + advisory pip-audit
- container-scan.yml    hadolint (blocking) + Trivy image scan (advisory)
- codeql.yml            CodeQL for Python and JS, main + weekly

Plus .github/dependabot.yml (pip/npm/actions/docker), .github/CODEOWNERS,
and docs/security-ci.md explaining each check and the one-time settings.

All additive: no existing files are modified. Actions are pinned to commit
SHAs, tokens default-deny (permissions: {}), advisory scans never block,
and SARIF upload is gated to push so fork PRs do not fail on a read-only
token. Composes with the correctness CI in #1015.

* ci(security): isolate Trivy from the Dockerfile lint gate

Address review on #1314 (points 2 and 3).

container-scan.yml now runs only hadolint (the blocking Dockerfile lint)
and keeps the broad pull_request + push:[main] trigger so the required
check always reports and never hangs a PR.

The advisory image scan moves to container-trivy.yml, split by event:
  - pull_request / workflow_dispatch: build and scan under contents:read
    only, no SARIF upload. The image build runs PR-supplied Dockerfile
    instructions, so this path holds no write scope.
  - push to main: build, scan, and upload SARIF with security-events:write.
    Only this trusted path is granted write.
This stops PR jobs from requesting security-events:write they never use,
and a paths-ignore (matching docker-publish.yml) skips the image rebuild
on docs-only changes.

docs/security-ci.md: correct the trigger description to "every pull
request and every push to main", matching the workflows and the existing
ci.yml convention.

Verified locally: zizmor --offline --min-severity=low and actionlint are
clean on the changed and new workflow files.

---------

Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 20:51:11 +01:00
22 changed files with 1187 additions and 17 deletions
+8
@@ -0,0 +1,8 @@
# Code owners.
#
# Every file is owned by the maintainer, so that when branch protection has
# "Require review from Code Owners" turned on, no pull request can be merged
# without the maintainer's review. This is the human gate that backs up the
# automated security checks. See docs/security-ci.md for how to turn it on.
* @pewdiepie-archdaemon
+48
@@ -0,0 +1,48 @@
# Dependabot keeps dependencies and pinned action versions current.
#
# Why this matters for security: every workflow in this repo pins its GitHub
# Actions to an exact commit (a SHA), which is safe but freezes them in time.
# Dependabot opens a small, reviewable pull request whenever a newer version
# exists -- for Python packages, npm packages, the Docker base image, and the
# pinned Actions themselves -- so staying patched does not require manual work.
# Updates are grouped so a week's bumps arrive as one PR per ecosystem, not a
# flood of separate ones.
version: 2
updates:
  # Python dependencies (requirements.txt + requirements-optional.txt).
  - package-ecosystem: pip
    directory: "/"
    schedule:
      interval: weekly
    open-pull-requests-limit: 5
    groups:
      python:
        patterns: ["*"]

  # Frontend / tooling npm packages (package.json).
  - package-ecosystem: npm
    directory: "/"
    schedule:
      interval: weekly
    open-pull-requests-limit: 5
    groups:
      npm:
        patterns: ["*"]

  # The pinned action SHAs used across .github/workflows.
  - package-ecosystem: github-actions
    directory: "/"
    schedule:
      interval: weekly
    open-pull-requests-limit: 5
    groups:
      actions:
        patterns: ["*"]

  # The Docker base image in the Dockerfile.
  - package-ecosystem: docker
    directory: "/"
    schedule:
      interval: weekly
    open-pull-requests-limit: 5
+61
@@ -0,0 +1,61 @@
# CodeQL code scanning
#
# Purpose: GitHub's own static analysis engine reads the application source
# (Python backend + the JavaScript frontend) and looks for real
# vulnerabilities -- SQL/command injection, path traversal, auth mistakes,
# unsafe deserialization. Findings appear in the repo's Security tab. This is
# the deepest check in the suite and the most valuable for a high-profile
# target.
#
# It runs on every push to main and on a weekly schedule (to catch newly
# disclosed query patterns against unchanged code). It deliberately does NOT
# run on pull requests: most PRs here come from forks, whose read-only token
# cannot publish results, which would produce confusing failures. To scan pull
# requests too, a maintainer can instead enable CodeQL "default setup" in
# Settings -> Security -> Code scanning (one toggle, no file needed) -- see
# docs/security-ci.md.
name: CodeQL

on:
  push:
    branches: [main]
  schedule:
    # Weekly, Monday 06:00 UTC.
    - cron: '0 6 * * 1'
  workflow_dispatch:

permissions: {}

concurrency:
  group: codeql-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  analyze:
    name: Analyze (${{ matrix.language }})
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write # publish results to the Security tab
    strategy:
      fail-fast: false
      matrix:
        # Both are interpreted, so CodeQL needs no build step (build-mode none).
        language: [python, javascript-typescript]
    steps:
      - name: Checkout repository
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: false

      - name: Initialize CodeQL
        uses: github/codeql-action/init@03e4368ac7daa2bd82b3e85262f3bf87ee112f57 # v3.36.0
        with:
          languages: ${{ matrix.language }}
          build-mode: none

      - name: Perform CodeQL analysis
        uses: github/codeql-action/analyze@03e4368ac7daa2bd82b3e85262f3bf87ee112f57 # v3.36.0
        with:
          category: "/language:${{ matrix.language }}"
+52
@@ -0,0 +1,52 @@
# Container security: Dockerfile lint
#
# Purpose: the Docker image is how most people run Odysseus, so it is part of
# the attack surface. hadolint lints the Dockerfile for mistakes and insecure
# patterns (running as root longer than needed, unpinned base image, bad apt
# usage). Blocking.
#
# The image vulnerability scan (Trivy, advisory) lives in its own file,
# container-trivy.yml. Keeping it separate lets that advisory scan be
# path-filtered and held to a read-only token on pull requests without
# weakening this blocking gate, which must always report so a required check
# never hangs.
#
# Note: a separate open PR (#120) proposes a local `scripts/scan_image.py`.
# This job is complementary -- it is a CI gate, not a script a contributor has
# to remember to run.
name: Container scan

on:
  pull_request:
  push:
    branches: [main]
  workflow_dispatch:

permissions: {}

concurrency:
  group: container-scan-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  hadolint:
    name: hadolint (Dockerfile lint)
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: false

      - name: Lint Dockerfile
        uses: hadolint/hadolint-action@2332a7b74a6de0dda2e2221d575162eba76ba5e5 # v3.3.0
        with:
          dockerfile: Dockerfile
          # DL3008: pinning apt package versions is impractical on a -slim base
          # image. Debian purges old package versions from its repos, so a
          # pinned version breaks future rebuilds. The base image itself is
          # what should be pinned (tracked by Dependabot's docker ecosystem).
          ignore: DL3008
+125
@@ -0,0 +1,125 @@
# Container image vulnerability scan (advisory)
#
# Trivy builds the application image and scans it for known-vulnerable OS and
# Python packages. Advisory only -- it reports findings to the repo's Security
# tab without blocking a merge, because the image inevitably contains
# already-known CVEs in upstream packages that are not this project's bug.
#
# Split from the Dockerfile lint (container-scan.yml) for two reasons:
#
# - Least privilege. The image build runs Dockerfile instructions, which on a
# pull request are attacker-influenceable. That path (the `scan` job) is
# held to a read-only token and never publishes results. Only `publish`,
# which runs on push to main (curated, fast-forwarded from reviewed dev),
# gets security-events:write to upload SARIF.
# - Cost. Docs-only changes do not rebuild the image (paths-ignore below),
# matching docker-publish.yml. hadolint stays on the broad trigger in
# container-scan.yml so the blocking gate always reports.
name: Container scan (Trivy)

on:
  pull_request:
    paths-ignore:
      - '**.md'
      - 'docs/**'
      - '.github/ISSUE_TEMPLATE/**'
  push:
    branches: [main]
    paths-ignore:
      - '**.md'
      - 'docs/**'
      - '.github/ISSUE_TEMPLATE/**'
  workflow_dispatch:

permissions: {}

concurrency:
  group: container-trivy-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  # Pull requests and manual runs: build and scan under a read-only token.
  # The build executes PR-supplied Dockerfile instructions, so this job must
  # not hold any write scope, and it does not upload to the Security tab.
  scan:
    name: Trivy (image scan, advisory)
    if: github.event_name != 'push'
    runs-on: ubuntu-latest
    # Advisory: a CVE in an upstream package must not block a PR.
    continue-on-error: true
    permissions:
      contents: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: false

      - name: Set up Buildx
        uses: docker/setup-buildx-action@d7f5e7f509e45cec5c76c4d5afdd7de93d0b3df5 # v4.1.0

      # Build without pushing so a broken Dockerfile is caught here, and the
      # exact image we ship is what gets scanned.
      - name: Build image
        uses: docker/build-push-action@f9f3042f7e2789586610d6e8b85c8f03e5195baf # v7.2.0
        with:
          context: .
          push: false
          load: true
          tags: odysseus:ci

      - name: Scan image with Trivy
        uses: aquasecurity/trivy-action@ed142fd0673e97e23eac54620cfb913e5ce36c25 # v0.36.0
        with:
          image-ref: odysseus:ci
          format: table
          ignore-unfixed: true
        env:
          # Pin the vuln DB source to GHCR to avoid rate-limited Docker Hub
          # mirrors that flake on shared runners.
          TRIVY_DB_REPOSITORY: ghcr.io/aquasecurity/trivy-db:2

  # Push to main only: build, scan, and publish SARIF to the Security tab.
  # This is the only path that runs trusted code, so it is the only one granted
  # security-events:write.
  publish:
    name: Trivy (image scan + SARIF upload)
    if: github.event_name == 'push'
    runs-on: ubuntu-latest
    continue-on-error: true
    permissions:
      contents: read
      security-events: write # upload SARIF to the Security tab
    steps:
      - name: Checkout repository
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: false

      - name: Set up Buildx
        uses: docker/setup-buildx-action@d7f5e7f509e45cec5c76c4d5afdd7de93d0b3df5 # v4.1.0

      - name: Build image
        uses: docker/build-push-action@f9f3042f7e2789586610d6e8b85c8f03e5195baf # v7.2.0
        with:
          context: .
          push: false
          load: true
          tags: odysseus:ci

      - name: Scan image with Trivy
        uses: aquasecurity/trivy-action@ed142fd0673e97e23eac54620cfb913e5ce36c25 # v0.36.0
        with:
          image-ref: odysseus:ci
          format: sarif
          output: trivy-results.sarif
          ignore-unfixed: true
        env:
          TRIVY_DB_REPOSITORY: ghcr.io/aquasecurity/trivy-db:2

      - name: Upload Trivy results
        uses: github/codeql-action/upload-sarif@03e4368ac7daa2bd82b3e85262f3bf87ee112f57 # v3.36.0
        with:
          sarif_file: trivy-results.sarif
          category: trivy-image
+71
@@ -0,0 +1,71 @@
# Supply-chain review
#
# Purpose: defend against "side-chain" / supply-chain attacks -- a pull request
# that adds (or bumps) a dependency to a version with a known vulnerability or a
# disallowed license. Two layers:
#
# - dependency-review: runs ONLY on pull requests. It compares the
# dependencies before and after the PR and blocks the merge if the change
# pulls in a package with a known security advisory. This is the gate.
# - pip-audit: scans the project's current Python requirements against the
# advisory database. Advisory only (it never blocks a merge), because it can
# flag a pre-existing issue in an already-shipped dependency.
name: Dependency review

on:
  pull_request:
  push:
    branches: [main]
  workflow_dispatch:

# Default-deny token; jobs grant only read access.
permissions: {}

concurrency:
  group: dependency-review-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  dependency-review:
    name: dependency-review (PR gate)
    # Only meaningful on a pull request -- it needs a base..head diff to review.
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: false

      - name: Review dependency changes
        uses: actions/dependency-review-action@a1d282b36b6f3519aa1f3fc636f609c47dddb294 # v5.0.0
        with:
          # Fail the PR on any newly introduced moderate-or-worse advisory.
          fail-on-severity: moderate

  pip-audit:
    name: pip-audit (advisory)
    runs-on: ubuntu-latest
    # Advisory: report known-vulnerable Python deps without blocking the merge.
    continue-on-error: true
    permissions:
      contents: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: false

      - name: Set up Python
        uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
        with:
          python-version: '3.12'

      - name: Run pip-audit on requirements
        run: |
          set -euo pipefail
          pip install pip-audit==2.10.0
          pip-audit -r requirements.txt -r requirements-optional.txt --strict
+60
@@ -0,0 +1,60 @@
# Secret scanning
#
# Purpose: stop credentials (API keys, tokens, passwords, private keys) from
# ever living in the Git history. Odysseus deliberately keeps real secrets in
# files that are gitignored (.env, data/), but a slip in a future commit -- or a
# malicious pull request that sneaks one in -- would otherwise go unnoticed.
# This job reads the repository and the full commit history and fails if it
# finds anything that looks like a secret.
#
# It runs the official gitleaks BINARY directly (pinned to an exact version and
# verified against the project's published SHA-256 checksum) rather than the
# gitleaks GitHub Action, because the Action asks for a paid license on
# organization-owned repos. The binary is free and behaves identically.
name: Secret scan

on:
  pull_request:
  push:
    branches: [main]
  workflow_dispatch:

# Start with zero permissions; the single job opts back in to read-only.
permissions: {}

concurrency:
  group: secret-scan-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  gitleaks:
    name: gitleaks
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          # Full history so a secret committed in an earlier commit (and later
          # deleted) is still caught -- deletion does not remove it from Git.
          fetch-depth: 0
          persist-credentials: false

      # Pinned version + checksum so a tampered release binary cannot run here.
      # Bump VERSION/SHA256 together; the checksum comes from the matching
      # gitleaks_<version>_checksums.txt on the GitHub release.
      - name: Run gitleaks (pinned, checksum-verified)
        env:
          GITLEAKS_VERSION: 8.30.1
          GITLEAKS_SHA256: 551f6fc83ea457d62a0d98237cbad105af8d557003051f41f3e7ca7b3f2470eb
        run: |
          set -euo pipefail
          TARBALL="gitleaks_${GITLEAKS_VERSION}_linux_x64.tar.gz"
          curl -fsSL -o "${TARBALL}" \
            "https://github.com/gitleaks/gitleaks/releases/download/v${GITLEAKS_VERSION}/${TARBALL}"
          echo "${GITLEAKS_SHA256} ${TARBALL}" | sha256sum -c -
          tar -xzf "${TARBALL}" gitleaks
          # Scan the whole history. Findings print to the log and fail the job.
          ./gitleaks git --no-banner --redact --verbose .
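The pin-and-verify pattern used by this step (download a release archive, compare its SHA-256 against a pinned value, and only then unpack it) can be sketched in Python. This is an illustration only, not part of the workflow: `verify_sha256` is a hypothetical helper name, and the sketch assumes the archive is already on disk.

```python
import hashlib
import hmac
from pathlib import Path


def verify_sha256(path: Path, expected_hex: str) -> bool:
    """Return True when the file's SHA-256 digest equals the pinned hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so a large release archive is never loaded whole.
        for chunk in iter(lambda: fh.read(64 * 1024), b""):
            digest.update(chunk)
    # Constant-time comparison; the pinned value is normalized to lowercase hex.
    return hmac.compare_digest(digest.hexdigest(), expected_hex.strip().lower())
```

A caller would refuse to extract or execute the archive when this returns `False`, which is the same outcome as `sha256sum -c -` failing under `set -euo pipefail`.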
+80
@@ -0,0 +1,80 @@
# Workflow security (CI that audits the CI)
#
# Purpose: the GitHub Actions workflows themselves are an attack surface. A
# poorly written workflow can leak the repository token, run attacker-supplied
# code from a pull request, or pull in a tampered third-party action. These two
# tools check every workflow file in this repo for those mistakes:
#
# - actionlint: catches workflow syntax errors and shell-script bugs inside
# `run:` steps before they reach main.
# - zizmor: a security linter for Actions. Flags template-injection holes,
# unpinned actions, credential persistence, and over-broad token
# permissions -- exactly the patterns the rest of this CI is built to avoid.
#
# Add this early: it then audits every workflow added after it.
name: Workflow security

on:
  pull_request:
  push:
    branches: [main]
  workflow_dispatch:

# Default-deny token; each job grants only read access to the code.
permissions: {}

concurrency:
  group: workflow-security-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  actionlint:
    name: actionlint
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: false

      # Pinned version + checksum so a tampered binary cannot run here.
      - name: Run actionlint (pinned, checksum-verified)
        env:
          ACTIONLINT_VERSION: 1.7.12
          ACTIONLINT_SHA256: 8aca8db96f1b94770f1b0d72b6dddcb1ebb8123cb3712530b08cc387b349a3d8
        run: |
          set -euo pipefail
          TARBALL="actionlint_${ACTIONLINT_VERSION}_linux_amd64.tar.gz"
          curl -fsSL -o "${TARBALL}" \
            "https://github.com/rhysd/actionlint/releases/download/v${ACTIONLINT_VERSION}/${TARBALL}"
          echo "${ACTIONLINT_SHA256} ${TARBALL}" | sha256sum -c -
          tar -xzf "${TARBALL}" actionlint
          ./actionlint -color

  zizmor:
    name: zizmor (Actions SAST)
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: false

      - name: Set up Python
        uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
        with:
          python-version: '3.12'

      # Pinned zizmor release. --offline keeps the audit hermetic (no network
      # calls about the actions it inspects); --min-severity=low surfaces
      # everything so nothing slips through under the gate.
      - name: Run zizmor
        run: |
          set -euo pipefail
          pip install zizmor==1.25.2
          zizmor --offline --min-severity=low .github/workflows/
+1 -1
@@ -300,7 +300,7 @@ def is_wsl() -> bool:
     import sys
     if sys.platform.startswith("linux") or os.name == "posix":
         try:
-            with open("/proc/version", "r") as f:
+            with open("/proc/version", "r", encoding="utf-8", errors="ignore") as f:
                 if "microsoft" in f.read().lower():
                     return True
         except Exception:
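The effect of the one-line change can be seen in a standalone sketch. Passing `encoding="utf-8", errors="ignore"` makes the read independent of the locale's default encoding and drops any undecodable bytes instead of raising. The function name `looks_like_wsl` and its `version_file` parameter are hypothetical, introduced here only so the behavior can be exercised against an ordinary file; they are not the project's `is_wsl()`.

```python
def looks_like_wsl(version_file: str = "/proc/version") -> bool:
    """Return True when the kernel version string mentions Microsoft (WSL)."""
    try:
        # Explicit UTF-8 with errors="ignore": stray non-UTF-8 bytes are
        # skipped rather than turning the read into a decode error.
        with open(version_file, "r", encoding="utf-8", errors="ignore") as f:
            return "microsoft" in f.read().lower()
    except OSError:
        # Missing or unreadable file: treat as "not WSL".
        return False
```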
+102
@@ -0,0 +1,102 @@
# Security CI guide

This project runs a set of automated security checks on every pull request and
on every push to `main`. This page explains what each one does, whether it can
block a merge, and the few one-time settings you should turn on to get the full
benefit.

## What runs, and why

Each check lives in its own file under `.github/workflows/`. They run
automatically; you do not start them.

| Check | What it protects against | Blocks a merge? |
|---|---|---|
| **Secret scan** (gitleaks) | An API key, token, or password being committed by mistake or on purpose | Yes |
| **Workflow security** (actionlint + zizmor) | A broken or insecure automation file that could leak the repo's access token | Yes |
| **Dependency review** | A pull request that adds a software library with a known security hole | Yes |
| **pip-audit** | Known security holes in the Python libraries already used | No (advisory) |
| **Container scan: hadolint** | Mistakes and insecure patterns in the `Dockerfile` | Yes |
| **Container scan: Trivy** | Known security holes in the Docker image | No (advisory) |
| **CodeQL** | Real bugs in the app's own code: injection, auth mistakes, path traversal | No (advisory) |

"Blocks a merge" means a red X appears on the pull request and, once you enable
the setting below, the **Merge** button is disabled until it is fixed.

"Advisory" means it reports problems into the repository's **Security** tab so
you can review them on your own schedule, but it never stops a merge. These are
advisory on purpose: they often flag long-standing issues in other people's
libraries, not something a given pull request introduced.

## Where results appear

- **Checks tab of a pull request**: the pass/fail of each check. A green tick is
  good; a red X needs attention.
- **Security tab of the repository**: detailed findings from the advisory
  scanners (Trivy and CodeQL). This is your dashboard.

## If a check fails

- **Secret scan failed**: a real credential may have been committed. Treat it as
  leaked: rotate (regenerate) that key or token immediately, then remove it from
  the file. Do not just delete the commit; assume it was seen.
- **Dependency review failed**: the pull request adds a library with a known
  vulnerability. Ask the contributor to use a patched version, or decline the
  change.
- **hadolint / workflow security failed**: the contributor changed the
  `Dockerfile` or an automation file in a way the linter rejects. Ask them to
  address the message shown in the failed check.

## One-time settings to turn on

These two settings unlock the full value. You only do them once.

### 1. Require the blocking checks before merging

This makes the **Merge** button refuse to work until the gating checks pass.

1. Go to the repository on GitHub.
2. Click **Settings** (top right of the repo).
3. In the left sidebar, click **Branches**.
4. Under **Branch protection rules**, click **Add branch ruleset** (or **Add
   rule**), and set the branch name pattern to `dev` (this is the branch all
   pull requests target; `main` is fast-forwarded at releases).
5. Enable **Require status checks to pass before merging**.
6. In the search box that appears, add these checks by name:
   - `Python syntax (compileall)`
   - `JS syntax (node --check)`
   - `gitleaks`
   - `actionlint`
   - `zizmor (Actions SAST)`
   - `hadolint (Dockerfile lint)`
   - `dependency-review (PR gate)`

   The first two come from the correctness CI (`ci.yml`); the rest are this
   security suite. Leave pytest, pip-audit, Trivy, and CodeQL unchecked so they
   stay advisory.
7. Also enable **Require a pull request before merging** and **Require review
   from Code Owners** (this uses the `.github/CODEOWNERS` file so every change
   needs your sign-off).
8. Click **Create** / **Save changes**.

Note: a check name only appears in the list after it has run at least once, so
let the workflows run on one pull request first, then add them here.

### 2. Turn on the Security tab features

1. **Settings -> Code security** (or **Code security and analysis**).
2. Turn on **Dependency graph** (usually on by default for public repos) -- this
   powers Dependency review and Dependabot.
3. Turn on **Dependabot alerts** and **Dependabot security updates**.
4. Under **Code scanning**, you have two ways to scan the app code with CodeQL:
   - The included `codeql.yml` workflow already scans `main` and runs weekly.
   - To also scan **pull requests** (recommended, since most contributions come
     from forks), click **Set up -> Default** under Code scanning. GitHub then
     runs CodeQL on pull requests for you, with no token limitations.

## Keeping it current

`.github/dependabot.yml` opens small weekly pull requests to update Python and
npm packages, the Docker base image, and the pinned automation actions
themselves. Review and merge those like any other pull request; they keep the
project patched without manual tracking.
+20
@@ -1,12 +1,14 @@
 """cookbook_helpers.py — validators + small helpers shared by the cookbook routes.

 Extracted from cookbook_routes.py; the routes module imports the symbols it needs."""
+import json
 import logging
 import ntpath
 import os
 import posixpath
 import re
 import shlex
+from pathlib import Path

 from fastapi import HTTPException
 from pydantic import BaseModel
@@ -90,6 +92,24 @@ def _validate_token(v: str | None) -> str | None:
     return v


+def load_stored_hf_token(*, state_path: Path | str | None = None) -> str:
+    """Return the decrypted HF token from cookbook_state.json, else env fallback."""
+    path = Path(state_path) if state_path else Path(os.environ.get("DATA_DIR", "data")) / "cookbook_state.json"
+    token = ""
+    if path.exists():
+        try:
+            state = json.loads(path.read_text(encoding="utf-8"))
+            env = state.get("env") if isinstance(state, dict) else {}
+            if isinstance(env, dict) and env.get("hfToken"):
+                from src.secret_storage import decrypt
+                token = decrypt(env.get("hfToken") or "")
+        except Exception:
+            token = ""
+    if not token:
+        token = (os.environ.get("HF_TOKEN") or os.environ.get("HUGGING_FACE_HUB_TOKEN") or "").strip()
+    return token
+
+
 def _validate_local_dir(v: str | None) -> str | None:
     if v is None or v == "":
         return None
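The lookup order implemented by `load_stored_hf_token` (stored and decrypted token first, then `HF_TOKEN`, then `HUGGING_FACE_HUB_TOKEN`, else an empty string) can be exercised in isolation with the sketch below. It is an assumption-laden illustration rather than the project's code: `resolve_hf_token` is a hypothetical name, and the `decrypt` and `environ` parameters are injected here so the example runs without `src.secret_storage`.

```python
import json
import os
from pathlib import Path
from typing import Callable, Mapping


def resolve_hf_token(
    state_path: Path,
    decrypt: Callable[[str], str],  # hypothetical stand-in for the real decrypt
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Stored token wins; otherwise fall back to the two environment variables."""
    token = ""
    if state_path.exists():
        try:
            state = json.loads(state_path.read_text(encoding="utf-8"))
            env = state.get("env") if isinstance(state, dict) else {}
            if isinstance(env, dict) and env.get("hfToken"):
                token = decrypt(env["hfToken"])
        except Exception:
            # Unreadable or corrupt state must not block the env fallback.
            token = ""
    if not token:
        token = (environ.get("HF_TOKEN") or environ.get("HUGGING_FACE_HUB_TOKEN") or "").strip()
    return token
```

Injecting `environ` keeps the fallback testable without mutating the real process environment.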
+5 -8
@@ -40,6 +40,10 @@ from routes.cookbook_helpers import (
     _ps_squote, _bash_squote, _validate_serve_cmd, _parse_serve_phase,
     _safe_env_prefix, _local_tooling_path_export, _append_serve_preflight_exit_lines,
     _append_serve_exit_code_lines, _append_llama_cpp_linux_accel_build_lines, _cached_model_scan_script,
+    load_stored_hf_token,
+    _append_vllm_linux_preflight_lines, _ollama_bind_from_cmd, _pip_install_fallback_chain,
+    _pip_install_no_cache, _user_shell_path_bootstrap, _venv_safe_local_pip_install_cmd,
+    _diagnose_serve_output, run_ssh_command_async,
     _ollama_bind_from_cmd, _pip_install_fallback_chain, _pip_install_no_cache,
     _user_shell_path_bootstrap, _venv_safe_local_pip_install_cmd,
     ModelDownloadRequest, ServeRequest,
@@ -234,14 +238,7 @@ def setup_cookbook_routes() -> APIRouter:
         return state

     def _load_stored_hf_token() -> str:
-        if not _cookbook_state_path.exists():
-            return ""
-        try:
-            state = json.loads(_cookbook_state_path.read_text(encoding="utf-8"))
-            env = state.get("env") if isinstance(state, dict) else {}
-            return _decrypt_secret(env.get("hfToken") if isinstance(env, dict) else "")
-        except Exception:
-            return ""
+        return load_stored_hf_token(state_path=_cookbook_state_path)

     def _cookbook_ssh_dir() -> Path:
         # The Docker image keeps cookbook keys under /app/.ssh; that path only
+1 -1
@@ -283,7 +283,7 @@ def load_features() -> dict:
         if not isinstance(saved, dict):
             raise ValueError("features must be an object")
         merged = {**DEFAULT_FEATURES, **saved}
-    except (FileNotFoundError, json.JSONDecodeError, ValueError):
+    except (FileNotFoundError, PermissionError, json.JSONDecodeError, ValueError):
         merged = dict(DEFAULT_FEATURES)
     _features_cache = (now, merged)
     return merged
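The fallback behavior this hunk restores can be sketched as a self-contained function: any of the listed errors, now including `PermissionError`, degrades to the defaults instead of propagating. The names `load_features_from` and the sample `DEFAULT_FEATURES` contents are hypothetical, and the caching shown in the real `load_features()` is left out for brevity.

```python
import json
from pathlib import Path

# Hypothetical defaults for illustration; the project's real flags are not shown.
DEFAULT_FEATURES = {"example_flag": False}


def load_features_from(path: Path) -> dict:
    """Merge saved flags over the defaults, or return the defaults on any load error."""
    try:
        saved = json.loads(path.read_text(encoding="utf-8"))
        if not isinstance(saved, dict):
            raise ValueError("features must be an object")
        return {**DEFAULT_FEATURES, **saved}
    except (FileNotFoundError, PermissionError, json.JSONDecodeError, ValueError):
        # An existing-but-unreadable file (e.g. root-owned) lands here too.
        return dict(DEFAULT_FEATURES)
```

Returning `dict(DEFAULT_FEATURES)` rather than the module-level object keeps callers from mutating the shared defaults.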
+2 -1
@@ -2054,13 +2054,14 @@ async def _cookbook_env_for_host(host: str) -> Dict[str, Any]:
     else:
         env_prefix = f'eval "$(conda shell.bash hook)" && conda activate {env_path}'

+    from routes.cookbook_helpers import load_stored_hf_token
     return {
         "env_prefix": env_prefix,
         "env_type": env_kind,
         "env_path": env_path,
         "gpus": env_root.get("gpus") or "",
         "platform": platform,
-        "hf_token": env_root.get("hfToken") or "",
+        "hf_token": load_stored_hf_token(),
         "ssh_port": ssh_port,
     }
+4 -6
@@ -1506,12 +1506,10 @@ export function _hwfitInit() {
       clearTimeout(_hwfitDebounce);
       _hwfitDebounce = setTimeout(() => _hwfitFetch(), 400);
     });
-  // HF Token
-  const hfToken = document.getElementById('hwfit-hftoken');
-  if (hfToken) {
-    hfToken.addEventListener('change', () => { _envState.hfToken = hfToken.value.trim(); _persistEnvState(); });
-    hfToken.addEventListener('input', () => { _envState.hfToken = hfToken.value.trim(); });
-  }
+  // HF token save is owned by cookbook.js (_wireTabEvents) — do not wire a
+  // second change/input handler here. The old duplicate ran after cookbook.js
+  // cleared the input on save and overwrote _envState.hfToken with "", so the
+  // debounced state sync never persisted the token to cookbook_state.json.

   // Rebuild all server select dropdowns with current servers
   function _rebuildServerSelect() {
+54
@@ -83,6 +83,60 @@ python3 -m pytest tests/test_auth_config_lock_concurrency.py
python3 -m pytest -m slow
```
## Order-sensitivity reporting (report-only)
`tests/run_order_report.py` runs pytest with the collected test items shuffled
by a seeded RNG, to surface order-sensitive tests (hidden coupling through
shared import state, module caches, databases, etc.). It is report-only: it is
not wired into CI, adds no gate, and changes no normal pytest collection or
ordering - the shuffle exists only inside this runner. The seed is always
printed, and pytest targets/options go after a literal `--`:
```bash
python3 tests/run_order_report.py --seed 123 -- tests/cli/ -q
python3 tests/run_order_report.py -- tests/cli/ -q # generates and prints a seed
```
The same seed reproduces the same order when the reported working directory,
pytest target arguments, and test environment are also the same. The runner
prints all command arguments with shell-safe POSIX quoting and uses the
invoking Python interpreter.
A generated-seed run starts with output like:
```text
[order-report] working directory: /path/to/odysseus
[order-report] shuffling test order with seed 284734921
[order-report] reproduce from this working directory with the same test environment:
[order-report] reproduce with: /path/to/odysseus/.venv/bin/python /path/to/odysseus/tests/run_order_report.py --seed 284734921 -- tests/cli/ -q
```
Run the printed command from the reported working directory to reproduce the
same fixed-seed order:
```text
[order-report] working directory: /path/to/odysseus
[order-report] shuffling test order with seed 284734921
[order-report] reproduce from this working directory with the same test environment:
[order-report] reproduce with: /path/to/odysseus/.venv/bin/python /path/to/odysseus/tests/run_order_report.py --seed 284734921 -- tests/cli/ -q
```
Pytest output remains visible between the report header and footer. A failing
run ends with pytest's normal failure report followed by:
```text
FAILED tests/example_test.py::test_example - AssertionError
[order-report] seed 284734921: pytest exit code 1 (report-only; fix order-sensitive failures in separate scoped PRs)
```
Failures discovered this way are real isolation bugs: fix them in separate
scoped PRs - do not silence them with `skip`/`xfail`, and do not "fix" them by
depending on a particular order.
The runner propagates pytest's exit code, so it composes with normal local
workflows; "report-only" means it is not a CI gate, not that failures are
swallowed.
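As a rough illustration of why the seed alone is not enough to reproduce an order, the sketch below shuffles a list of made-up test ids with `random.Random(seed)`, the same standard-library mechanism the runner's `shuffle_items` helper uses. The names `TEST_IDS` and `shuffled_order` are hypothetical and exist only for this example; they are not part of the runner.

```python
import random

# Hypothetical test ids, for illustration only.
TEST_IDS = ["test_alpha", "test_bravo", "test_charlie", "test_delta", "test_echo"]


def shuffled_order(ids, seed):
    """Return a shuffled copy of ``ids`` without mutating the input."""
    order = list(ids)
    random.Random(seed).shuffle(order)
    return order


first = shuffled_order(TEST_IDS, seed=123)
second = shuffled_order(TEST_IDS, seed=123)

# Same seed and same input order: identical result, so a failure can be replayed.
assert first == second

# The shuffle only reorders; no test id is dropped or duplicated.
assert sorted(first) == sorted(TEST_IDS)

# Same seed but a different input order (for example, different pytest targets
# or a different working directory changing collection) gives a different result.
assert shuffled_order(list(reversed(TEST_IDS)), seed=123) != first

print(first)
```

The last assertion is the reason the report header prints the working directory and the full pytest arguments next to the seed: the seed fixes the permutation, but the collected items it is applied to must also match.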
## Core principles
- Keep PRs small and homogeneous: one kind of change per PR.
+156
@@ -0,0 +1,156 @@
#!/usr/bin/env python3
"""Report-only randomized test-order runner (issue #3973).

Runs pytest with the collected test items shuffled by a seeded RNG so
order-sensitive tests (hidden coupling through shared import state, module
caches, databases, etc.) surface locally. The seed is always printed, so any
failing order is reproducible with ``--seed``.

This runner is report-only: it is not wired into CI, adds no gate, and does
not change normal pytest collection or ordering. Failures it discovers should
be fixed in separate scoped PRs, not silenced here.

Examples:
    python3 tests/run_order_report.py --seed 123 -- tests/cli/ -q
    python3 tests/run_order_report.py -- tests/cli/ -q   # generates and prints a seed

The shuffle is applied through a local ``pytest_collection_modifyitems`` hook
passed to ``pytest.main`` as an in-process plugin; no conftest or global
plugin is involved. Reproduction requires the reported working directory,
seed, pytest arguments, and test environment. The exit code is pytest's own.
"""

from __future__ import annotations

import argparse
import random
import shlex
import sys
from collections.abc import Callable, Sequence
from pathlib import Path

# Seeds are kept in the non-negative 32-bit range so they stay short enough to
# copy from a report line into a reproduction command.
SEED_MAX = 2**32 - 1


def shuffle_items(items: list, seed: int) -> None:
    """Deterministically shuffle ``items`` in place using ``seed``."""
    random.Random(seed).shuffle(items)


class OrderShuffle:
    """Local pytest plugin that shuffles collected items with a fixed seed."""

    def __init__(self, seed: int):
        self.seed = seed

    def pytest_collection_modifyitems(self, items: list) -> None:
        shuffle_items(items, self.seed)


def generate_seed() -> int:
    """Generate a fresh seed for a run that did not pass ``--seed``."""
    return random.SystemRandom().randint(0, SEED_MAX)


def seed_type(value: str) -> int:
    """argparse type: a seed in ``[0, SEED_MAX]``."""
    number = int(value)
    if not 0 <= number <= SEED_MAX:
        raise argparse.ArgumentTypeError(
            f"seed must be between 0 and {SEED_MAX}, got {value!r}"
        )
    return number


def build_parser() -> argparse.ArgumentParser:
    """Build the argument parser for the order-sensitivity runner."""
    parser = argparse.ArgumentParser(
        prog="run_order_report.py",
        description=(
            "Run pytest with randomized test order to surface order-sensitive "
            "tests. Report-only: prints the seed used and propagates pytest's "
            "exit code; it changes no normal pytest behavior."
        ),
        epilog=(
            "Pass pytest targets and options after a literal -- separator, "
            "e.g.: run_order_report.py --seed 123 -- tests/cli/ -q"
        ),
    )
    parser.add_argument(
        "--seed",
        type=seed_type,
        help="shuffle seed; omitted: a seed is generated and printed",
    )
    parser.add_argument(
        "pytest_args",
        nargs="*",
        metavar="-- PYTEST_ARGS",
        help="pytest targets/options forwarded after a literal --",
    )
    return parser


def runner_path() -> str:
    """Return an absolute path for copy-pasteable reproduction commands."""
    return str(Path(__file__).resolve())


def print_report_header(seed: int, pytest_args: Sequence[str]) -> None:
    """Print the seed and an exact reproduction command before running."""
    repro = [
        sys.executable,
        runner_path(),
        "--seed",
        str(seed),
        "--",
        *pytest_args,
    ]
    print(f"[order-report] working directory: {Path.cwd()}")
    print(f"[order-report] shuffling test order with seed {seed}")
    print(
        "[order-report] reproduce from this working directory with the same "
        "test environment:"
    )
    print(f"[order-report] reproduce with: {shlex.join(repro)}")


def print_report_footer(seed: int, exit_code: int) -> None:
    """Print the outcome with the seed again, after possibly long pytest output."""
    outcome = "no failures" if exit_code == 0 else f"pytest exit code {exit_code}"
    print(
        f"[order-report] seed {seed}: {outcome} "
        "(report-only; fix order-sensitive failures in separate scoped PRs)"
    )


def run(
    argv: Sequence[str] | None = None,
    pytest_main: Callable[..., int] | None = None,
) -> int:
    """Parse ``argv``, run pytest with shuffled item order, and report the seed.

    ``pytest_main`` is injected so tests can assert on the forwarded arguments
    and plugin without running a nested pytest. It must match ``pytest.main``:
    accept ``(args, plugins=...)`` and return an exit code.
    """
    namespace = build_parser().parse_args(argv)
    seed = namespace.seed if namespace.seed is not None else generate_seed()
    pytest_args = list(namespace.pytest_args)

    print_report_header(seed, pytest_args)

    if pytest_main is None:
        import pytest

        pytest_main = pytest.main

    exit_code = int(pytest_main(pytest_args, plugins=[OrderShuffle(seed)]))
    print_report_footer(seed, exit_code)
    return exit_code


def main() -> int:
    """Console entry point."""
    return run(sys.argv[1:])


if __name__ == "__main__":
    raise SystemExit(main())
+37
@@ -0,0 +1,37 @@
"""Cookbook HF token persistence and lookup."""
import json
import os

import pytest

from routes.cookbook_helpers import load_stored_hf_token
from src.secret_storage import encrypt


def test_load_stored_hf_token_reads_encrypted_state(tmp_path, monkeypatch):
    monkeypatch.setenv("DATA_DIR", str(tmp_path))
    state_path = tmp_path / "cookbook_state.json"
    state_path.write_text(
        json.dumps({"env": {"hfToken": encrypt("hf_test_token_12345")}}),
        encoding="utf-8",
    )

    assert load_stored_hf_token() == "hf_test_token_12345"
    assert load_stored_hf_token(state_path=state_path) == "hf_test_token_12345"


def test_load_stored_hf_token_falls_back_to_env_when_state_missing(tmp_path, monkeypatch):
    monkeypatch.setenv("DATA_DIR", str(tmp_path))
    monkeypatch.setenv("HF_TOKEN", "hf_from_env")

    assert load_stored_hf_token() == "hf_from_env"


def test_load_stored_hf_token_prefers_state_over_env(tmp_path, monkeypatch):
    monkeypatch.setenv("DATA_DIR", str(tmp_path))
    monkeypatch.setenv("HF_TOKEN", "hf_from_env")
    state_path = tmp_path / "cookbook_state.json"
    state_path.write_text(
        json.dumps({"env": {"hfToken": encrypt("hf_from_state")}}),
        encoding="utf-8",
    )

    assert load_stored_hf_token() == "hf_from_state"
@@ -0,0 +1,26 @@
"""load_features() must degrade to defaults if features.json is unreadable.
load_settings() already catches PermissionError, but load_features() did not, so
an unreadable data/features.json (e.g. root-owned after a deploy) raised instead
of falling back to DEFAULT_FEATURES, taking down GET /api/auth/features.
"""
import builtins

import src.settings as settings


def test_load_features_degrades_on_permission_error(monkeypatch):
    # Ensure the cache does not short-circuit the read.
    monkeypatch.setattr(settings, "_features_cache", None, raising=False)

    real_open = builtins.open

    def deny(path, *args, **kwargs):
        if str(path) == str(settings.FEATURES_FILE):
            raise PermissionError("denied")
        return real_open(path, *args, **kwargs)

    monkeypatch.setattr(builtins, "open", deny)

    result = settings.load_features()
    assert result == dict(settings.DEFAULT_FEATURES)
+28
@@ -0,0 +1,28 @@
from unittest.mock import MagicMock

import routes.memory_routes as memory_routes
from src.memory import MemoryManager


def test_memory_search_returns_only_callers_memories(monkeypatch, tmp_path):
    manager = MemoryManager(str(tmp_path))
    alice_memory = manager.add_entry("Project codename is Odyssey", owner="alice")
    bob_memory = manager.add_entry("Project codename is Odyssey", owner="bob")
    manager.save([alice_memory, bob_memory])

    monkeypatch.setattr(memory_routes, "get_current_user", lambda request: "bob")
    router = memory_routes.setup_memory_routes(manager, MagicMock())
    search = next(
        route.endpoint
        for route in router.routes
        if route.path == "/api/memory/search" and "POST" in route.methods
    )

    result = search(
        request=None,
        query="Project codename is Odyssey",
        session_id=None,
        category=None,
    )

    assert [memory["id"] for memory in result["memories"]] == [bob_memory["id"]]
+1
@@ -83,6 +83,7 @@ def test_is_wsl_true_when_proc_version_mentions_microsoft(monkeypatch):
     def fake_open(path, mode="r", *args, **kwargs):
         assert path == "/proc/version"
         assert mode == "r"
+        assert kwargs == {"encoding": "utf-8", "errors": "ignore"}
         return io.StringIO("Linux version 6.6.0 microsoft standard")

     monkeypatch.setattr("builtins.open", fake_open)
+245
@@ -0,0 +1,245 @@
"""Direct tests for the order-sensitivity report runner (tests/run_order_report.py).
The shuffle and argument plumbing are tested without spawning pytest: the
shuffle helpers are asserted directly and ``run`` is exercised with an
injected fake ``pytest.main``. A small subprocess test then proves the seed is
applied end to end (reproducible, seed visible) against a throwaway test file,
never the real suite.
"""
from __future__ import annotations

import shlex
import subprocess
import sys
from pathlib import Path

import pytest

from tests.run_order_report import (
    SEED_MAX,
    OrderShuffle,
    generate_seed,
    run,
    shuffle_items,
)

REPO_ROOT = Path(__file__).resolve().parents[1]
RUNNER = REPO_ROOT / "tests" / "run_order_report.py"


class _FakePytestMain:
    """Records forwarded args and plugins and returns a fixed exit code."""

    def __init__(self, returncode: int = 0):
        self.returncode = returncode
        self.calls: list[tuple[list[str], list]] = []

    def __call__(self, args: list[str], plugins: list) -> int:
        self.calls.append((list(args), list(plugins)))
        return self.returncode


# --- shuffle determinism -----------------------------------------------------


def test_same_seed_shuffles_identically():
    first = list(range(20))
    second = list(range(20))
    shuffle_items(first, seed=123)
    shuffle_items(second, seed=123)
    assert first == second


def test_different_seeds_shuffle_differently():
    first = list(range(20))
    second = list(range(20))
    shuffle_items(first, seed=123)
    shuffle_items(second, seed=321)
    assert first != second


def test_shuffle_preserves_items():
    items = list(range(20))
    shuffle_items(items, seed=123)
    assert sorted(items) == list(range(20))


def test_plugin_hook_matches_shuffle_items():
    hooked = list(range(20))
    expected = list(range(20))
    OrderShuffle(seed=7).pytest_collection_modifyitems(hooked)
    shuffle_items(expected, seed=7)
    assert hooked == expected


# --- argument parsing and pytest invocation ----------------------------------


def test_pytest_args_after_separator_are_forwarded():
    fake = _FakePytestMain()
    run(["--seed", "123", "--", "tests/cli/", "-q"], pytest_main=fake)
    (args, plugins), = fake.calls
    assert args == ["tests/cli/", "-q"]
    assert [type(p) for p in plugins] == [OrderShuffle]


def test_explicit_seed_reaches_plugin():
    fake = _FakePytestMain()
    run(["--seed", "123", "--", "-q"], pytest_main=fake)
    (_, plugins), = fake.calls
    assert plugins[0].seed == 123


def test_pytest_exit_code_is_propagated():
    fake = _FakePytestMain(returncode=3)
    assert run(["--seed", "123", "--", "-q"], pytest_main=fake) == 3


@pytest.mark.parametrize("value", ["abc", "-1", str(SEED_MAX + 1)])
def test_invalid_seed_is_rejected_before_pytest(value):
    fake = _FakePytestMain()
    with pytest.raises(SystemExit) as excinfo:
        run(["--seed", value, "--", "-q"], pytest_main=fake)
    assert excinfo.value.code == 2
    assert fake.calls == []


# --- seed reporting -----------------------------------------------------------


def test_explicit_seed_is_printed_with_repro_command(capsys):
    run(["--seed", "123", "--", "tests/cli/", "-q"], pytest_main=_FakePytestMain())
    out = capsys.readouterr().out
    assert "[order-report] shuffling test order with seed 123" in out
    repro = shlex.join(
        [
            sys.executable,
            str(RUNNER),
            "--seed",
            "123",
            "--",
            "tests/cli/",
            "-q",
        ]
    )
    assert f"reproduce with: {repro}" in out


def test_working_directory_is_reported(capsys, monkeypatch, tmp_path):
    monkeypatch.chdir(tmp_path)
    run(["--seed", "123", "--", "-q"], pytest_main=_FakePytestMain())
    out = capsys.readouterr().out
    assert f"[order-report] working directory: {tmp_path}" in out


def test_footer_repeats_seed_and_outcome(capsys):
    run(["--seed", "123", "--", "-q"], pytest_main=_FakePytestMain(returncode=1))
    out = capsys.readouterr().out
    assert "[order-report] seed 123: pytest exit code 1" in out


def test_generated_seed_is_printed_and_used(capsys):
    fake = _FakePytestMain()
    run(["--", "-q"], pytest_main=fake)
    out = capsys.readouterr().out
    seed_line = next(line for line in out.splitlines() if "with seed" in line)
    seed = int(seed_line.rsplit("seed ", 1)[1])
    assert 0 <= seed <= SEED_MAX
    (_, plugins), = fake.calls
    assert plugins[0].seed == seed


def test_generate_seed_is_within_range():
    assert all(0 <= generate_seed() <= SEED_MAX for _ in range(5))


# --- end-to-end: the seed really drives collection order (real subprocess) ---

_SAMPLE_TESTS = "".join(
    f"def test_{name}():\n pass\n\n"
    for name in ("alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf", "hotel")
)


@pytest.fixture(scope="module")
def sample_suite(tmp_path_factory) -> Path:
    """A throwaway directory with eight trivial tests, outside the repo rootdir."""
    suite = tmp_path_factory.mktemp("order_report_suite")
    (suite / "test_sample.py").write_text(_SAMPLE_TESTS, encoding="utf-8")
    return suite


def _collect_order(sample_suite: Path, seed: int) -> tuple[list[str], str]:
    """Run the runner with ``--collect-only`` and return (test ids, stdout)."""
    result = subprocess.run(
        [
            sys.executable,
            str(RUNNER),
            "--seed",
            str(seed),
            "--",
            "--collect-only",
            "-q",
            "-p",
            "no:cacheprovider",
            "test_sample.py",
        ],
        cwd=sample_suite,
        capture_output=True,
        text=True,
    )
    assert result.returncode == 0, result.stderr or result.stdout
    ids = [line for line in result.stdout.splitlines() if "::" in line]
    assert len(ids) == 8, result.stdout
    return ids, result.stdout


def test_subprocess_same_seed_is_reproducible(sample_suite):
    first, out = _collect_order(sample_suite, seed=123)
    second, _ = _collect_order(sample_suite, seed=123)
    assert first == second
    assert "[order-report] shuffling test order with seed 123" in out


def test_subprocess_different_seeds_change_order(sample_suite):
    first, _ = _collect_order(sample_suite, seed=123)
    second, _ = _collect_order(sample_suite, seed=321)
    assert first != second


def test_subprocess_failure_exit_code_and_footer(tmp_path):
    """A real failing pytest run keeps pytest's exit code and reports the seed."""
    (tmp_path / "test_failure.py").write_text(
        "def test_failure():\n assert False\n",
        encoding="utf-8",
    )
    result = subprocess.run(
        [
            sys.executable,
            str(RUNNER),
            "--seed",
            "123",
            "--",
            "test_failure.py",
            "-q",
        ],
        cwd=tmp_path,
        capture_output=True,
        text=True,
    )
    assert result.returncode == 1
    repro = shlex.join(
        [
            sys.executable,
            str(RUNNER),
            "--seed",
            "123",
            "--",
            "test_failure.py",
            "-q",
        ]
    )
    assert f"reproduce with: {repro}" in result.stdout
    assert "[order-report] seed 123: pytest exit code 1" in result.stdout