Metadata-Version: 2.4
Name: locram
Version: 1.3.11
Summary: Local Zettelkasten for LLM agents — SQLite + Obsidian vault + MCP
Project-URL: Homepage, https://github.com/nocrun/locram
Project-URL: Repository, https://github.com/nocrun/locram
Project-URL: Bug Tracker, https://github.com/nocrun/locram/issues
Author: nocrun
License-Expression: MIT
Keywords: knowledge-base,llm,mcp,obsidian,zettelkasten
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Text Processing :: General
Requires-Python: >=3.11
Requires-Dist: click>=8.1.0
Requires-Dist: mcp[cli]>=1.3.0
Requires-Dist: python-frontmatter>=1.1.0
Requires-Dist: python-ulid>=3.0.0
Provides-Extra: embeddings
Requires-Dist: sqlite-vec>=0.1.6; extra == 'embeddings'
Description-Content-Type: text/markdown

# locram

Local Zettelkasten knowledge base for LLM agents and humans.

SQLite is the source of truth. An Obsidian-compatible Markdown vault is the human-facing mirror. MCP and CLI give agents and scripts a precise API over the same corpus.

Everything lives locally under `~/.locram/` by default.

---

## What locram is

`locram` is a local knowledge base with two deliberate constraints:

1. SQLite is authoritative.
   Notes, links, metadata, review state, embeddings, and graph structure live in one DB.

2. The vault is a managed mirror.
   Every managed note also exists as Markdown in the vault so humans can read and edit it in Obsidian.

This gives you:

- fast local search via FTS5
- typed graph relationships
- MCP tools for agents
- Markdown notes for humans
- no cloud service and no hosted backend

---

## Current surface

- 30 MCP tools
- CLI for setup, sync, CRUD-adjacent operations, embeddings, diagnostics, and maintenance
- bidirectional DB <-> vault workflow
- Obsidian auto-configuration
- optional semantic and hybrid retrieval via `sqlite-vec` + Ollama
- soft delete with trash, explicit restore, and permanent purge
- integrity diagnostics via `locram check`
- semantic invariant diagnostics via MCP `validate_topology()`

---

## Data layout

Default root: `~/.locram/`

```text
~/.locram/
├── locram.db
├── trash/
│   └── <page_id>.md
├── embed.pending
├── embed.lock
├── embed-runner.log
├── embed-runner.err
├── watcher.log
├── watcher.err
└── vault/
    ├── .obsidian/
    ├── attachments/
    ├── templates/
    └── *.md
```

Override with:

```bash
export LOCRAM_HOME=/path/to/your/locram-home
```
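
For scripts that need to find the same files, the layout above can be resolved in a few lines of Python. This is a minimal sketch, not locram's own code: the `resolve_layout` helper and the fallback on an empty `LOCRAM_HOME` are assumptions.

```python
import os
from pathlib import Path


def resolve_layout(env=None):
    """Return the key paths under the locram data root."""
    env = os.environ if env is None else env
    # LOCRAM_HOME overrides the default root. Treating an empty value as
    # "unset" is an assumption of this sketch.
    root = Path(env.get("LOCRAM_HOME") or "~/.locram").expanduser()
    return {
        "root": root,
        "db": root / "locram.db",
        "trash": root / "trash",
        "vault": root / "vault",
        "attachments": root / "vault" / "attachments",
    }
```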

---

## Install

### Base setup

Use this if you want:

- MCP tools
- vault sync
- typed links and graph tools
- FTS5 search
- Obsidian integration

No Ollama or `sqlite-vec` is required.

```bash
uv tool install locram
# or: pipx install locram
locram init
```

### Embedding-enabled setup

Use this if you also want:

- `semantic_search`
- `hybrid_search`
- embedding storage
- `locram embed`
- optional auto-embed on macOS

Global install with `uv tool`:

```bash
uv tool install --force --refresh --with sqlite-vec locram
```

Regular Python environment:

```bash
pip install "locram[embeddings]"
```

Then configure:

```bash
export LOCRAM_EMBED_PROVIDER=ollama
export LOCRAM_EMBED_MODEL=bge-m3
# optional if non-default:
# export LOCRAM_EMBED_URL=http://localhost:11434
```

Then pull the model and embed existing notes:

```bash
ollama pull bge-m3
locram embed
```

---

## Quick start

```bash
locram init
locram create --title "Hello locram" --content "First note"
locram search "Hello"
locram check
```

macOS watcher:

```bash
locram watch-install
```

Optional embeddings:

```bash
locram embed
locram embed-auto-install
```

---

## Obsidian workflow

Open `~/.locram/vault` as an Obsidian vault.

`locram init` configures:

- `.obsidian/app.json`
- `.obsidian/core-plugins.json`
- `.obsidian/graph.json`
- `.obsidian/templates.json`
- `.obsidian/types.json`
- `vault/templates/locram note.md`
- `vault/attachments/`

The default note template includes an empty `id:` marker:

```yaml
---
type: fleeting
subject: []
tags: []
status: active
created:
updated:
review in: 7
parent:
id:
---
```

The empty `id:` key matters because vault ownership is explicit: `locram` only considers files that carry an `id` key.

### Important: managed vs unmanaged Markdown

`locram` does not import arbitrary Markdown.

Vault sync behavior:

- no `id` key in frontmatter: file is ignored
- empty `id:`: explicit claim marker; `locram sync` imports the file only after the note has an H1 and a non-empty body
- valid `id` with matching DB row: reconcile vault-writable fields
- valid `id` with no DB row: treated as orphan file, not imported
- invalid `id`: reported as an error

Practical implication:

- if you create notes in Obsidian and want `locram` to manage them, create them from the locram template or manually add empty `id:`
- plain `Cmd+N` in Obsidian creates unmanaged Markdown unless you insert the template
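
The sync rules above amount to a small decision table. The sketch below restates them in Python for illustration only. The function name, the outcome labels, and the ULID format check are assumptions, not locram's internals.

```python
import re

# Assumption: ids are ULIDs (26 Crockford base32 characters).
ULID_RE = re.compile(r"[0-9A-HJKMNP-TV-Z]{26}")


def classify_vault_file(meta, has_h1_and_body, db_ids):
    """Map one file's frontmatter to the sync outcome described above."""
    if "id" not in meta:
        return "ignored"  # unmanaged Markdown
    raw = meta["id"]
    if raw is None or str(raw).strip() == "":
        # Empty id: is a claim marker, honored once H1 and body exist.
        return "claim" if has_h1_and_body else "claim_pending"
    page_id = str(raw).strip()
    if not ULID_RE.fullmatch(page_id):
        return "invalid_id"
    return "reconcile" if page_id in db_ids else "orphan_file"
```

Here `meta` is the parsed frontmatter as a dict and `db_ids` is the set of ids present in the DB.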

---

## MCP configuration

Base setup:

```json
{
  "mcpServers": {
    "locram": {
      "command": "locram",
      "args": ["serve"]
    }
  }
}
```

Embedding-enabled setup:

```json
{
  "mcpServers": {
    "locram": {
      "command": "locram",
      "args": ["serve"],
      "env": {
        "LOCRAM_EMBED_PROVIDER": "ollama",
        "LOCRAM_EMBED_MODEL": "bge-m3",
        "LOCRAM_EMBED_URL": "http://localhost:11434"
      }
    }
  }
}
```

Custom data root:

```json
{
  "mcpServers": {
    "locram": {
      "command": "locram",
      "args": ["serve"],
      "env": {
        "LOCRAM_HOME": "/Users/you/my-locram",
        "LOCRAM_EMBED_PROVIDER": "ollama",
        "LOCRAM_EMBED_MODEL": "bge-m3",
        "LOCRAM_EMBED_URL": "http://localhost:11434"
      }
    }
  }
}
```

Notes:

- GUI apps usually do not inherit your shell env; set `LOCRAM_HOME` and `LOCRAM_EMBED_*` in the MCP config if needed.
- In Claude Desktop or similar apps, using the full binary path from `which locram` is often safer.
- From source, a good dev config is:

```json
{
  "mcpServers": {
    "locram": {
      "command": "uv",
      "args": ["--directory", "/absolute/path/to/locram", "run", "locram", "serve"]
    }
  }
}
```
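
If you prefer to generate the config rather than write it by hand, the following sketch builds the same structure and fills in the absolute binary path when it can find one. The `mcp_config` helper is hypothetical; only the JSON shape comes from the examples above.

```python
import json
import shutil


def mcp_config(env=None, binary="locram"):
    """Build an mcpServers entry, preferring the absolute binary path."""
    # shutil.which mirrors `which locram`; fall back to the bare name.
    command = shutil.which(binary) or binary
    server = {"command": command, "args": ["serve"]}
    if env:
        server["env"] = dict(env)
    return {"mcpServers": {"locram": server}}


print(json.dumps(mcp_config({"LOCRAM_HOME": "/Users/you/my-locram"}), indent=2))
```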

---

## MCP tools

Current MCP surface: 30 tools.

### Notes

| Tool | Description |
|------|-------------|
| `create_page` | Create a note |
| `get_page` | Fetch by page id or exact title |
| `update_page` | Partial update |
| `delete_page` | Soft delete by default; `hard=true` removes immediately |
| `restore_page` | Restore a `to_delete` page |
| `purge_page` | Permanently remove a `to_delete` page |
| `list_pages` | List with filters |
| `search` | FTS5 BM25 over title and content |
| `replace_in_page` | Exact find-and-replace |
| `update_section` | Replace a markdown section |
| `promote_page` | Maturity step: `fleeting -> note-taking -> permanent` |
| `mark_reviewed` | Mark reviewed without changing content |
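
To show what "FTS5 BM25 over title and content" means in practice, here is a standalone sketch using Python's `sqlite3` module. The table and column names are hypothetical and locram's real schema may differ. It also assumes your SQLite build includes FTS5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names, for illustration only.
conn.execute("CREATE VIRTUAL TABLE pages_fts USING fts5(title, content)")
conn.executemany(
    "INSERT INTO pages_fts (title, content) VALUES (?, ?)",
    [
        ("Hello locram", "First note about local search"),
        ("Graph ideas", "Typed links between notes"),
    ],
)


def search(query, limit=10):
    """Return matching titles, best BM25 match first."""
    rows = conn.execute(
        "SELECT title FROM pages_fts WHERE pages_fts MATCH ? "
        "ORDER BY bm25(pages_fts) LIMIT ?",
        (query, limit),
    )
    return [title for (title,) in rows]
```

`bm25()` returns lower values for better matches, so ascending order puts the most relevant note first.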

### Links and hierarchy

| Tool | Description |
|------|-------------|
| `link_pages` | Create typed link |
| `unlink_pages` | Remove typed links |
| `set_parent` | Set or clear `parent_id` |
| `batch_link` | Create many links at once |

### Graph and diagnostics

| Tool | Description |
|------|-------------|
| `find_orphaned_notes` | Notes with no links and no parent |
| `find_central_notes` | High-degree notes |
| `find_due_for_review` | Review queue |
| `find_similar_notes` | Semantic nearest neighbors when possible, lexical fallback otherwise |
| `get_graph_summary` | Graph stats and topology summary |
| `validate_topology` | Semantic invariant report for agents |

### Attachments and Mermaid

| Tool | Description |
|------|-------------|
| `get_attachment` | Read from `vault/attachments/` as base64 |
| `save_attachment` | Save attachment into `vault/attachments/` |
| `render_mermaid` | Render Mermaid to `svg` or `png` |

### Embeddings and retrieval

| Tool | Description |
|------|-------------|
| `store_embedding` | Store vector for a page |
| `find_unembedded` | Queue of pages missing embeddings |
| `find_stale_embeddings` | Queue of stale embeddings for a model |
| `semantic_search` | Vector-only retrieval |
| `hybrid_search` | FTS5 + vector retrieval |

---

## CLI

```bash
locram init
locram serve
locram sync [--file PATH]
locram vault-rebuild
locram watch-install
locram watch-uninstall
locram embed-auto-install
locram embed-auto-uninstall

locram create --title "..." [--content "..."] [--type ...] [--subject ...] [--tag ...]
locram get <page_id>
locram search <query> [--limit N]
locram list [--type ...] [--status ...] [--limit N]
locram restore <page_id>
locram purge <page_id>
locram check
locram link <source_id> <target_id> [--type ...]

locram embed [--limit N] [--force]
locram embed-auto-run

locram reset
```

Data-returning commands emit JSON:

- `create`
- `get`
- `search`
- `list`
- `restore`
- `purge`
- `check`
- `embed`
- `embed-auto-run`

Setup commands print human-readable status.
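
Because these commands emit JSON, they are easy to drive from a script. The sketch below wraps `locram create` using only the flags from the synopsis above. The helper names are hypothetical, and repeating `--subject` and `--tag` once per value is an assumption.

```python
import json
import subprocess


def build_create_argv(title, content=None, note_type=None, subjects=(), tags=()):
    """Build the argv for `locram create` from the CLI synopsis."""
    argv = ["locram", "create", "--title", title]
    if content is not None:
        argv += ["--content", content]
    if note_type is not None:
        argv += ["--type", note_type]
    # Assumption: --subject and --tag may be repeated, once per value.
    for subject in subjects:
        argv += ["--subject", subject]
    for tag in tags:
        argv += ["--tag", tag]
    return argv


def run_json(argv):
    """Run a data-returning locram command and parse its stdout as JSON."""
    proc = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```

`check=True` raises on any non-zero exit, so it does not suit `locram check`, which exits `1` when it finds violations.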

---

## Note model

### Note types

| Type | Meaning |
|------|---------|
| `fleeting` | Quick raw capture |
| `note-taking` | Structured note under refinement |
| `permanent` | Finished atomic idea |
| `structure` | Map-of-content style note |
| `hub` | Domain entry point |

Maturity pipeline:

```text
fleeting -> note-taking -> permanent
```

`structure` and `hub` are classification types, not maturity steps.

Important invariant:

- hubs must be root-level; `parent_id` is rejected for `type=hub`
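
The pipeline and the hub invariant can be sketched as two small checks. This illustrates the rules rather than locram's implementation. In particular, raising an error when a `permanent` note is promoted again is an assumption.

```python
MATURITY = ("fleeting", "note-taking", "permanent")


def next_maturity(note_type):
    """Return the next maturity step, or raise if there is none."""
    if note_type not in MATURITY:
        # structure and hub are classification types, not maturity steps
        raise ValueError(f"{note_type!r} is not part of the maturity pipeline")
    index = MATURITY.index(note_type)
    if index == len(MATURITY) - 1:
        raise ValueError("'permanent' is already the final step")
    return MATURITY[index + 1]


def check_parent(note_type, parent_id):
    """Reject a parent on hubs, which must stay root-level."""
    if note_type == "hub" and parent_id is not None:
        raise ValueError("parent_id is rejected for type=hub")
```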

### Link types

| Forward | Inverse |
|---------|---------|
| `related` | `related` |
| `extends` | `extended_by` |
| `supports` | `supported_by` |
| `contradicts` | `contradicted_by` |
| `refines` | `refined_by` |
| `questions` | `questioned_by` |
| `reference` | none |

Three relationship channels coexist:

- vertical: `parent_id`
- horizontal: typed links
- inline: `[[wikilinks]]` inside content, resolved at read time only
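
The forward/inverse table for typed links maps directly onto a lookup. The sketch below lists the edges implied by one link. Whether locram stores the inverse edge or derives it on read is not stated here, so treat `edges_for` as a hypothetical illustration.

```python
INVERSE = {
    "related": "related",
    "extends": "extended_by",
    "supports": "supported_by",
    "contradicts": "contradicted_by",
    "refines": "refined_by",
    "questions": "questioned_by",
    "reference": None,  # no inverse
}


def edges_for(source_id, target_id, link_type):
    """Return the (source, target, type) edges implied by one typed link."""
    if link_type not in INVERSE:
        raise ValueError(f"unknown link type: {link_type!r}")
    edges = [(source_id, target_id, link_type)]
    inverse = INVERSE[link_type]
    if inverse is not None:
        edges.append((target_id, source_id, inverse))
    return edges
```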

---

## Sync model

### DB -> vault

Every successful MCP or CLI mutation writes SQLite first, then rewrites the canonical vault file.

### Vault -> DB

Use:

- `locram sync` for full-vault reconcile
- `locram sync --file PATH` for one-file sync without deletion detection
- `locram watch-install` on macOS for launchd-based automatic full sync

### How title and file path behave

Current shipped behavior:

- note content is normalized to start with `# {title}`
- managed notes have a canonical `vault_relpath` stored in DB
- renaming `title` through API updates the H1, but does not rename the file
- during explicit claim of a brand-new unmanaged file, `sync` may rename an `Untitled.md`-style filename to match the first H1
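
As an illustration of the first rule, this sketch normalizes content so that it starts with `# {title}`. Replacing an existing leading H1 instead of keeping it is an assumption, as is the helper name.

```python
def normalize_content(title, content):
    """Return content whose first line is '# {title}'."""
    lines = content.lstrip("\n").split("\n")
    # Assumption: an existing leading H1 is replaced rather than kept.
    if lines and lines[0].startswith("# "):
        lines = lines[1:]
    body = "\n".join(lines).lstrip("\n")
    return f"# {title}\n\n{body}" if body else f"# {title}\n"
```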

### What sync reads from the vault

Vault-writable fields:

- `content`
- `type`
- `status`
- `subject`
- `tags`
- `review in`
- `parent`

DB-owned fields:

- `id`
- `created`
- `updated`
- `reviewed`
- typed links
- embeddings
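
The split between vault-writable and DB-owned fields can be expressed as a filter over parsed frontmatter. This is a hedged sketch with a hypothetical `vault_changes` helper. `content` is also vault-writable, but it lives in the note body rather than the frontmatter, so it is left out here.

```python
VAULT_WRITABLE = {"type", "status", "subject", "tags", "review in", "parent"}
DB_OWNED = {"id", "created", "updated", "reviewed"}


def vault_changes(frontmatter, db_row):
    """Return the vault-writable frontmatter fields that differ from the DB row."""
    return {
        key: value
        for key, value in frontmatter.items()
        # Edits to DB-owned keys are never picked up from the vault.
        if key in VAULT_WRITABLE and db_row.get(key) != value
    }
```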

### Full-vault sync invariants

- unmanaged Markdown without `id` key is ignored
- empty `id:` claims a note into DB only after H1 + body exist
- `to_delete` rows are not reactivated by vault files
- stale managed files with valid `id` but no DB row are reported as orphan files
- duplicate ids across live vault files are reported
- `path_mismatch` and `id_mismatch` are reported instead of silently reassigning ownership

---

## Deletion lifecycle

The deletion lifecycle has three distinct operations:

### Soft delete

Triggered by:

- `delete_page(page_id)` via MCP
- deleting a managed vault file and then running full sync

The CLI has no `delete` command; its lifecycle commands start at `restore` and `purge`.

Effects:

- DB row stays
- `status` becomes `to_delete`
- live vault file is moved to `~/.locram/trash/<page_id>.md`
- canonical `vault_relpath` is preserved

### Restore

`restore_page` / `locram restore <page_id>`:

- sets status back to `active`
- moves the file back from trash to its canonical `vault_relpath`
- if the trash file is missing, rewrites the vault file from DB content

### Purge

`purge_page` / `locram purge <page_id>`:

- requires current status `to_delete`
- deletes the DB row
- removes live and trash artifacts
- links and embeddings are deleted by FK cascade
- child notes keep their rows but lose `parent_id`

Inspect soft-deleted notes with:

```bash
locram list --status to_delete
```
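
The lifecycle is easier to follow as a toy model. The class below imitates the documented effects with a dict standing in for the DB and a temporary directory standing in for `~/.locram/`. It is a sketch for illustration, not locram's code, and it ignores links, embeddings, and child notes.

```python
import tempfile
from pathlib import Path


class Lifecycle:
    """Toy model of soft delete, restore, and purge under one root."""

    def __init__(self, root):
        self.vault, self.trash = Path(root) / "vault", Path(root) / "trash"
        self.vault.mkdir(parents=True)
        self.trash.mkdir()
        self.rows = {}  # page_id -> status, vault_relpath, content

    def create(self, page_id, relpath, content):
        self.rows[page_id] = {"status": "active", "vault_relpath": relpath, "content": content}
        (self.vault / relpath).write_text(content, encoding="utf-8")

    def soft_delete(self, page_id):
        row = self.rows[page_id]
        row["status"] = "to_delete"  # row stays, vault_relpath is preserved
        (self.vault / row["vault_relpath"]).rename(self.trash / f"{page_id}.md")

    def restore(self, page_id):
        row, trashed = self.rows[page_id], self.trash / f"{page_id}.md"
        live = self.vault / row["vault_relpath"]
        if trashed.exists():
            trashed.rename(live)
        else:
            live.write_text(row["content"], encoding="utf-8")  # rewrite from DB content
        row["status"] = "active"

    def purge(self, page_id):
        if self.rows[page_id]["status"] != "to_delete":
            raise ValueError("purge requires status to_delete")
        row = self.rows.pop(page_id)
        for path in (self.vault / row["vault_relpath"], self.trash / f"{page_id}.md"):
            path.unlink(missing_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    kb = Lifecycle(tmp)
    kb.create("p1", "note.md", "# Note\n\nBody")
    kb.soft_delete("p1")
    kb.purge("p1")
```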

---

## Diagnostics

### `locram check`

User-facing filesystem integrity command.

Checks for:

- `missing_file`
- `orphan_file`
- `path_mismatch`
- `id_mismatch`
- `duplicate_id`

Behavior:

- exits `0` when clean
- exits `1` when violations exist
- prints structured JSON
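
In CI or a pre-commit hook, the exit code and the JSON output can be consumed together. The sketch below assumes nothing about the JSON shape beyond it being valid JSON. Treating exit codes other than `0` and `1` as a failure of the command itself is also an assumption.

```python
import json
import subprocess


def interpret_check(returncode, stdout):
    """Turn a `locram check` result into (clean, report)."""
    # Exit 0 means clean and exit 1 means violations exist. Anything else
    # is treated here as the command itself failing.
    if returncode not in (0, 1):
        raise RuntimeError(f"locram check failed with exit code {returncode}")
    return returncode == 0, json.loads(stdout)


def run_check():
    """Run `locram check` without raising on the documented exit code 1."""
    proc = subprocess.run(["locram", "check"], capture_output=True, text=True)
    return interpret_check(proc.returncode, proc.stdout)
```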

### `validate_topology()`

Agent-facing MCP diagnostic for semantic and graph invariants.

Checks for:

- `hub_with_parent`
- `missing_h1`
- `h1_mismatch`
- `to_delete_live_file`
- `orphan_file`
- `duplicate_id`

Returns a structured report:

```json
{
  "ok": false,
  "violation_count": 2,
  "counts": {
    "missing_h1": 1,
    "hub_with_parent": 1
  },
  "violations": [
    {
      "type": "missing_h1",
      "page_id": "..."
    },
    {
      "type": "hub_with_parent",
      "page_id": "..."
    }
  ]
}
```
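
An agent or script will usually want violations grouped by type before acting on them. The helper below is a hypothetical sketch over the report shape shown above. The page ids are placeholders.

```python
from collections import defaultdict


def group_violations(report):
    """Group page ids by violation type from a validate_topology report."""
    grouped = defaultdict(list)
    for violation in report.get("violations", []):
        grouped[violation["type"]].append(violation.get("page_id"))
    return dict(grouped)


report = {
    "ok": False,
    "violation_count": 2,
    "counts": {"missing_h1": 1, "hub_with_parent": 1},
    "violations": [
        {"type": "missing_h1", "page_id": "page-a"},  # placeholder id
        {"type": "hub_with_parent", "page_id": "page-b"},  # placeholder id
    ],
}
print(group_violations(report))
```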

---

## Frontmatter reference

Managed notes are written like this:

```yaml
---
type: permanent
subject: [ai]
tags: [research]
status: active
created: 2026-03-21T10:30:00Z
updated: 2026-03-21T14:15:00Z
review in: 7
reviewed:
parent: "[[Parent Title]]"
id: 01KMXXXXXXXXXXXXXXXXXXXXXX
---
```

Field semantics:

| Field | Meaning |
|-------|---------|
| `type`, `subject`, `tags`, `status`, `review in`, `parent` | Vault-writable and syncable |
| `created` | DB-owned |
| `updated` | DB-owned; advanced on real DB mutations and vault->DB sync updates |
| `reviewed` | DB-owned; use `mark_reviewed` |
| `id` | Stable identity; never edit manually |

---

## Embeddings and semantic retrieval

This section applies only when embeddings are enabled.

Environment variables:

| Variable | Default | Meaning |
|----------|---------|---------|
| `LOCRAM_HOME` | `~/.locram` | Root data directory |
| `LOCRAM_EMBED_PROVIDER` | `""` | Set to `ollama` to enable embeddings |
| `LOCRAM_EMBED_MODEL` | `""` | Model name, for example `bge-m3` |
| `LOCRAM_EMBED_URL` | `http://localhost:11434` | Ollama base URL |
| `LOCRAM_EMBED_API_KEY` | `""` | Reserved for future providers |
| `LOCRAM_EMBED_AUTO` | `0` | Enable auto-embed runner on macOS |
| `LOCRAM_EMBED_QUIET_SECONDS` | `60` | Quiet period after sync |
| `LOCRAM_EMBED_BATCH_SIZE` | `50` | Batch size per auto-embed pass |

Commands:

```bash
locram embed
locram embed --limit 100
locram embed --force
locram embed-auto-run
```

`semantic_search` and `hybrid_search` return coverage metadata:

```json
{
  "coverage": {
    "searched": 150,
    "total_active": 500
  }
}
```
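
A caller can use the coverage block to judge how much of the corpus a semantic result reflects. This sketch computes the covered fraction. Reading `searched` as the number of active notes with usable embeddings is an assumption.

```python
def coverage_ratio(result):
    """Fraction of active notes covered by the vector search."""
    coverage = result["coverage"]
    total = coverage["total_active"]
    # Guard against an empty corpus.
    return coverage["searched"] / total if total else 0.0
```

With the example above, 150 of 500 active notes gives a ratio of 0.3, which suggests running `locram embed` before relying on semantic results.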

If the DB contains embeddings from a different model, retrieval results may include a `warning`; run `locram embed --force` to re-embed.

### Auto-embed on macOS

`locram` can run a periodic launchd embed worker.

Typical setup:

```bash
export LOCRAM_EMBED_AUTO=1
export LOCRAM_EMBED_QUIET_SECONDS=60
export LOCRAM_EMBED_BATCH_SIZE=50

locram watch-install
locram embed-auto-install
```

Behavior:

1. sync writes `embed.pending` when notes were inserted or updated
2. `embed-auto-run` executes every 30 seconds
3. it waits until the quiet period is satisfied
4. it embeds stale notes in batches
5. it clears `embed.pending` when the queue is drained
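
Steps 1 to 3 describe a marker file plus a quiet-period gate. The sketch below shows one way such a gate could work. Measuring the quiet period from the modification time of `embed.pending` is an assumption, and the function names are hypothetical.

```python
import os
import time
from pathlib import Path


def quiet_period_elapsed(pending_mtime, quiet_seconds, now):
    """True once at least quiet_seconds have passed since the last sync write."""
    return now - pending_mtime >= quiet_seconds


def should_embed(root, quiet_seconds=None, now=None):
    """Decide whether one auto-embed pass should do any work."""
    pending = Path(root) / "embed.pending"
    if not pending.exists():
        return False  # sync has queued nothing
    if quiet_seconds is None:
        quiet_seconds = int(os.environ.get("LOCRAM_EMBED_QUIET_SECONDS", "60"))
    now = time.time() if now is None else now
    # Assumption: the marker's mtime records the most recent sync activity.
    return quiet_period_elapsed(pending.stat().st_mtime, quiet_seconds, now)
```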

---

## Requirements

- Python 3.11+
- macOS for launchd watcher and auto-embed runner
- Linux works for manual sync and the rest of the local workflow

---

## License

MIT
