mirror of
https://github.com/pewdiepie-archdaemon/odysseus.git
synced 2026-06-17 10:15:27 -04:00
docs: add backup/restore guide for odysseus-backup (#2587)

The scripts/odysseus-backup snapshot/restore CLI was undocumented in README.md and docs/. Add docs/backup-restore.md covering the snapshot, list, verify, and restore subcommands, default include/skip behavior (deep_research and mail-attachments skipped unless flagged), the destructive-restore warning and its data.before-restore-* stash, a cron example, and Docker-vs-native data/ paths (including the ChromaDB named volume caveat). Link it from the README Data section.

Addresses the "Backup/restore guide and helper flow for data/" item in ROADMAP.md. Docs only; no change to the tool.

Fixes #2583

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -467,6 +467,9 @@ docs/ landing page (index.html) + preview clips
All user data lives in `data/` (gitignored): `app.db` (sessions, messages, documents),
`memory.json`, `presets.json`, `uploads/`, `personal_docs/`, `chroma/`, `settings.json`.

To back up or restore everything in `data/`, see the
[Backup & Restore guide](docs/backup-restore.md).

## Star History

<a href="https://www.star-history.com/?repos=pewdiepie-archdaemon%2Fodysseus&type=date&legend=top-left">
@@ -0,0 +1,129 @@
# Backup & Restore

Odysseus keeps all of your state in the `data/` directory: the SQLite database
(`app.db`), the Fernet encryption key (`data/.app_key`), the vault, memory, RAG
indexes, personal documents, and uploads. The `scripts/odysseus-backup` tool
snapshots that directory into a single gzip tarball and restores it later.

Snapshots are safe to take while the app is running: SQLite databases are copied
through SQLite's own `.backup` API rather than a raw file copy, so an in-flight
write can't corrupt the snapshot.
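
To illustrate why an online backup is safer than a raw file copy, here is a minimal sketch using the `Connection.backup()` method from Python's standard `sqlite3` module. The function name `copy_sqlite_online` is hypothetical, and this is not necessarily how `odysseus-backup` is written internally; it only shows the technique the paragraph above describes.

```python
import sqlite3
from pathlib import Path


def copy_sqlite_online(src: Path, dest: Path) -> None:
    """Copy a live SQLite database through the online backup API."""
    source = sqlite3.connect(src)
    target = sqlite3.connect(dest)
    try:
        # Connection.backup() copies pages under SQLite's own locking, so the
        # destination is a consistent database even while writers are active.
        source.backup(target)
    finally:
        target.close()
        source.close()
```

A plain `shutil.copy` of `app.db`, by contrast, could capture the file halfway through a write.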

> **A snapshot contains your secrets.** The tarball includes the Fernet
> encryption key (`data/.app_key`), the vault, sessions, and any stored
> provider/API tokens, so treat it like a password. Store backups somewhere
> private, never commit them to Git, and prefer an encrypted destination when
> copying them offsite.

## Quick start

Run the tool from the repository root:

```bash
# Create a snapshot → backups/odysseus-backup-<YYYYMMDD-HHMMSS>.tar.gz
./scripts/odysseus-backup snapshot

# List existing snapshots (most recent first)
./scripts/odysseus-backup list

# Check a tarball's integrity without extracting it
./scripts/odysseus-backup verify backups/odysseus-backup-20260101-120000.tar.gz

# Restore (destructive; see the warning below)
./scripts/odysseus-backup restore backups/odysseus-backup-20260101-120000.tar.gz --yes
```

The script depends only on the Python standard library, so any `python3` on your
`PATH` will run it; you don't need the app's virtualenv active.

Every command prints a JSON result. Add `--pretty` for indented output.
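
Because the output is JSON, the tool is easy to drive from other scripts. The sketch below is a generic helper (the name `run_json_command` is hypothetical) that runs a command and decodes its standard output. It assumes the JSON is the only thing written to stdout and that failures produce a non-zero exit code; neither is stated in this guide, and the shape of the result object is not documented here, so no field names are assumed.

```python
import json
import subprocess
from typing import Any, Sequence


def run_json_command(cmd: Sequence[str]) -> Any:
    """Run a command and decode its standard output as JSON."""
    # check=True raises CalledProcessError on a non-zero exit status.
    completed = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return json.loads(completed.stdout)
```

Called from the repository root as `run_json_command(["./scripts/odysseus-backup", "list"])`, this would return the decoded listing for further processing.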

## Commands

### `snapshot`

Writes a gzip tarball of `data/` to `backups/odysseus-backup-<YYYYMMDD-HHMMSS>.tar.gz`.

| Flag | Effect |
| --- | --- |
| `--out PATH` | Write to a specific path instead of the default `backups/` location. Must be **outside** `data/`. |
| `--include-research` | Include `data/deep_research/` (skipped by default because research runs are large). |
| `--include-attachments` | Include `data/mail-attachments/` (skipped by default because it holds cached IMAP extractions that can be re-derived). |

By default the snapshot includes everything under `data/` **except**
`deep_research/` and `mail-attachments/`. Personal uploads and documents are
included.
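
The include/skip behavior can be pictured with the standard `tarfile` module, whose `add()` method accepts a `filter` callback that drops a member (and everything beneath it) when it returns `None`. This is a simplified sketch, not the tool's code: `make_snapshot` and `DEFAULT_SKIPS` are hypothetical names, and unlike the real tool it copies SQLite files raw instead of going through the backup API.

```python
import tarfile
from pathlib import Path
from typing import Optional, Tuple

# Directory names taken from this guide; the real tool's internals may differ.
DEFAULT_SKIPS = ("deep_research", "mail-attachments")


def make_snapshot(data_dir: Path, out_path: Path,
                  skip: Tuple[str, ...] = DEFAULT_SKIPS) -> None:
    """Write data_dir into a gzip tarball, omitting top-level dirs in skip."""

    def exclude(member: tarfile.TarInfo) -> Optional[tarfile.TarInfo]:
        # Member names look like "data/<top-level>/..."; drop skipped subtrees.
        parts = Path(member.name).parts
        if len(parts) > 1 and parts[1] in skip:
            return None
        return member

    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(data_dir, arcname="data", filter=exclude)
```

Passing `skip=()` corresponds to using both `--include-research` and `--include-attachments`.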

```bash
# Snapshot straight to a mounted NAS path
./scripts/odysseus-backup snapshot --out /mnt/nas/odysseus-$(date +%F).tar.gz

# Full snapshot including research runs and mail attachments
./scripts/odysseus-backup snapshot --include-research --include-attachments
```
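
The rule that `--out` must be outside `data/` presumably exists so that a snapshot never ends up archiving itself. If you generate `--out` paths from a script, a pre-flight check along these lines can catch the mistake early. The helper name `is_outside_data` is hypothetical and the tool performs its own validation; this is only a sketch of the idea.

```python
from pathlib import Path


def is_outside_data(out_path: Path, data_dir: Path) -> bool:
    """Return True when out_path does not resolve to a location under data_dir."""
    # resolve() normalizes ".." segments and symlinks before comparing.
    out_resolved = out_path.resolve()
    data_resolved = data_dir.resolve()
    return (out_resolved != data_resolved
            and data_resolved not in out_resolved.parents)
```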

### `list`

Lists the tarballs in `backups/`, most recent first, with size and modification
time.

### `verify PATH`

Opens the tarball read-only and walks every member to confirm it is intact and
safe to restore. Nothing is extracted. Use this before relying on an old backup
or after copying one across machines.
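
As a rough picture of what walking every member without extracting can look like, the sketch below reads each entry of a gzip tarball to the end and writes nothing to disk. The function name `count_readable_members` is hypothetical, and the real `verify` subcommand may check more than this (for example, the path-safety rules applied on restore).

```python
import tarfile
from pathlib import Path


def count_readable_members(archive: Path) -> int:
    """Read every member of a gzip tarball without writing anything to disk."""
    count = 0
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar:
            if member.isfile():
                handle = tar.extractfile(member)
                # Reading to the end forces decompression of the member's data.
                while handle.read(1 << 20):
                    pass
            count += 1
    return count
```

A truncated or corrupted archive would typically surface here as an exception such as `tarfile.ReadError` or `EOFError` instead of a count.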

### `restore PATH --yes`

Overwrites `data/` from a tarball.

> **Restore is destructive.** It replaces the current `data/` directory. `--yes`
> is required so a mistyped command can't wipe your live state.

Restore is not a blind delete: before extracting, the tool **renames your current
`data/` to `data.before-restore-<timestamp>`** in the repository root. If a
restore turns out to be wrong, your previous state is still there; to roll back,
delete the restored `data/` and rename the stashed directory back. The archive is
also validated entry by entry: archives containing absolute paths, `..` segments,
symlinks, or anything outside `data/` are rejected.
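
The entry-by-entry rules above can be expressed as a small predicate over `tarfile.TarInfo` objects. This is a sketch of the listed rules, not the tool's actual validator: `is_safe_member` is a hypothetical name, and rejecting hard links alongside symlinks is an extra assumption not stated in this guide.

```python
import tarfile
from pathlib import PurePosixPath


def is_safe_member(member: tarfile.TarInfo, root: str = "data") -> bool:
    """Apply the rejection rules from the guide to a single archive entry."""
    path = PurePosixPath(member.name)
    # Reject absolute paths and any ".." segment.
    if path.is_absolute() or ".." in path.parts:
        return False
    # Reject symlinks (and, as an assumption here, hard links too).
    if member.issym() or member.islnk():
        return False
    # Everything must live under the top-level root directory.
    return bool(path.parts) and path.parts[0] == root
```

A restore routine built on this would refuse the whole archive as soon as one member fails the check, before touching the current `data/`.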

## Scheduling offsite backups

The tarball output composes cleanly with cron and any copy tool. For example, a
nightly snapshot written to a mounted NAS path:

```cron
0 3 * * * cd /path/to/odysseus && ./scripts/odysseus-backup snapshot --out "/mnt/nas/odysseus-$(date +\%F).tar.gz"
```

To push the snapshot to remote storage instead, follow the `snapshot` command
with `scp`, `rclone`, `s3cmd`, or a similar copy tool.
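
In a crontab, `%` has to be escaped as `\%`, which is easy to get wrong. One way around that is to have cron call a small wrapper script that builds the dated path itself. The sketch below shows only the command-building step; `snapshot_command` is a hypothetical helper, and the output naming simply mirrors the cron example above.

```python
from datetime import date
from pathlib import Path
from typing import List


def snapshot_command(dest_dir: Path, day: date) -> List[str]:
    """Build the snapshot invocation from the cron example for a given day."""
    # date.isoformat() yields YYYY-MM-DD, the same shape as `date +%F`.
    out = dest_dir / f"odysseus-{day.isoformat()}.tar.gz"
    return ["./scripts/odysseus-backup", "snapshot", "--out", str(out)]
```

The resulting list can be passed to `subprocess.run(..., cwd=<repository root>, check=True)`, after which any copy tool can ship the file offsite.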

## Docker vs native installs

The tool reads `data/` and writes `backups/` relative to the repository root, so
where you run it matters:

- **Native installs**: run it from the repo root as shown above. `data/` and
  `backups/` are both in the repo directory.
- **Docker**: `docker-compose.yml` bind-mounts the host's `./data` to
  `/app/data`, so the live data is also present on the host. **Run the tool on
  the host** from the repo root; the snapshot reads the bind-mounted `./data` and
  writes to `./backups` on the host. Running it *inside* the container is not
  recommended, because `backups/` is not a mounted volume and the tarball would
  be lost when the container is recreated.

> **ChromaDB caveat (Docker only).** In the Docker setup, ChromaDB stores its
> vectors in a separate Compose-managed volume (declared as `chromadb-data`),
> **not** under `./data`. `odysseus-backup` therefore does not capture the Docker
> ChromaDB store. Back it up separately if you need it. Compose prefixes the
> volume with the project name, so find the real name first
> (`docker volume ls | grep chromadb`), then archive it. For example:
>
> ```bash
> docker run --rm -v <project>_chromadb-data:/data -v "$PWD":/backup \
>   alpine tar czf /backup/chromadb.tar.gz -C /data .
> ```
>
> On native installs ChromaDB lives at `data/chroma/` and is included in the
> snapshot normally.
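
If you script the ChromaDB backup, the volume lookup can be automated. The sketch below assumes the default Compose naming of `<project>_chromadb-data` (a `name:` override in the Compose file would break that assumption), and both function names are hypothetical. It relies only on the Docker CLI's `docker volume ls --format` option.

```python
import subprocess
from typing import List


def matching_volumes(volume_names: List[str],
                     suffix: str = "chromadb-data") -> List[str]:
    """Return the volume names that end with the Compose volume key."""
    return [name for name in volume_names if name.endswith(suffix)]


def list_docker_volumes() -> List[str]:
    """Ask the Docker CLI for all volume names, one per line."""
    completed = subprocess.run(
        ["docker", "volume", "ls", "--format", "{{.Name}}"],
        check=True, capture_output=True, text=True,
    )
    return completed.stdout.split()
```

`matching_volumes(list_docker_volumes())` then yields the candidate names to substitute for `<project>_chromadb-data` in the `docker run` command above.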