Metadata-Version: 2.4
Name: locram
Version: 1.3.8
Summary: Local Zettelkasten for LLM agents — SQLite + Obsidian vault + MCP
Project-URL: Homepage, https://github.com/nocrun/locram
Project-URL: Repository, https://github.com/nocrun/locram
Project-URL: Bug Tracker, https://github.com/nocrun/locram/issues
Author: nocrun
License-Expression: MIT
Keywords: knowledge-base,llm,mcp,obsidian,zettelkasten
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Text Processing :: General
Requires-Python: >=3.11
Requires-Dist: click>=8.1.0
Requires-Dist: mcp[cli]>=1.3.0
Requires-Dist: python-frontmatter>=1.1.0
Requires-Dist: python-ulid>=3.0.0
Provides-Extra: embeddings
Requires-Dist: sqlite-vec>=0.1.6; extra == 'embeddings'
Description-Content-Type: text/markdown

# locram

**Local Zettelkasten knowledge base for LLM agents and humans.**

SQLite as the source of truth. Obsidian-compatible Markdown vault as the human-readable mirror. MCP interface for any AI agent.

Write notes from Cursor, Claude Desktop, OpenAI Codex, or your own agent. Read them back in Obsidian. No server, no cloud, no account. Everything lives in `~/.locram/`.

---

## What is this?

locram is a personal knowledge base built on two principles:

1. **SQLite is the source of truth.** All notes, links, and metadata live in a single local database. Fast FTS5 full-text search, graph analytics, typed relationships — all queryable in microseconds.

2. **Vault is the human interface.** Every note is simultaneously a Markdown file in `~/.locram/vault/`. Edit in Obsidian. Changes sync back to the DB automatically through the macOS launchd watcher, or manually with `locram sync`.

The result: AI agents get a precise, fast, queryable API. Humans get a beautiful Obsidian graph. Both work on the same data.
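
To make the first principle concrete, here is a minimal sketch of FTS5 search with BM25 ranking using Python's built-in `sqlite3` module. The `pages_fts` table and its columns are hypothetical names chosen for illustration; locram's real schema may differ, and the sketch assumes your SQLite build includes FTS5.

```python
import sqlite3


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical FTS5 table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE pages_fts USING fts5(title, content)")
    conn.executemany(
        "INSERT INTO pages_fts (title, content) VALUES (?, ?)",
        [
            ("Hello", "a quick fleeting capture"),
            ("SQLite notes", "SQLite is the source of truth"),
            ("Obsidian mirror", "the vault is the human interface"),
        ],
    )
    return conn


def search(conn: sqlite3.Connection, query: str, limit: int = 20) -> list[str]:
    """Return matching titles, best BM25 match first.

    bm25() returns lower values for better matches, so an ascending
    ORDER BY puts the most relevant rows at the top.
    """
    rows = conn.execute(
        "SELECT title FROM pages_fts "
        "WHERE pages_fts MATCH ? "
        "ORDER BY bm25(pages_fts) LIMIT ?",
        (query, limit),
    ).fetchall()
    return [title for (title,) in rows]
```

Calling `search(open_demo_db(), "sqlite")` matches only the note whose title and content mention SQLite.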

---

## Key Features

- **MCP server** — 27 tools for LLM agents: create, read, update, delete, link, search, semantic search, Mermaid rendering
- **Typed links** — 12 semantic link types with auto-inverse (e.g. `extends` ↔ `extended_by`)
- **Bidirectional sync** — DB → vault on every write; vault → DB via `locram sync`
- **Obsidian-native** — auto-configures `.obsidian/` with graph settings, property types, templates
- **FTS5 full-text search** — BM25 ranking over title + content
- **Semantic & hybrid search** — vector search via `sqlite-vec` + FTS5+vector RRF merge (`semantic_search`, `hybrid_search`)
- **Graph analytics** — orphaned notes, central hubs, review queue, semantic-aware note neighbors
- **Embedding storage** — store vectors from local Ollama (BGE-M3); manual refresh via `locram embed`
- **Optional auto-embed** — periodic launchd runner can refresh stale embeddings after a quiet period
- **CLI** — most operations available from the terminal, with JSON output (`promote_page` and `mark_reviewed` are MCP-only)
- **Zero cloud dependencies** — base setup runs fully local; embedding mode uses local Ollama

---

## Data Directory

```
~/.locram/
├── locram.db
└── vault/
    ├── .obsidian/     # created/updated by locram init
    ├── templates/
    ├── attachments/
    └── *.md
```

Override: `export LOCRAM_HOME=/path`
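
If you script against the data directory, the layout above can be resolved as in the following sketch. The function name `locram_paths` is hypothetical, and the fallback logic is an assumption based on the documented default; it is not taken from locram's own `config.py`.

```python
import os
from pathlib import Path


def locram_paths(env=None) -> dict[str, Path]:
    """Resolve the documented directory layout from LOCRAM_HOME.

    Assumption: an unset or empty LOCRAM_HOME falls back to ~/.locram.
    """
    env = os.environ if env is None else env
    home = Path(env.get("LOCRAM_HOME") or "~/.locram").expanduser()
    vault = home / "vault"
    return {
        "home": home,
        "db": home / "locram.db",
        "vault": vault,
        "attachments": vault / "attachments",
    }
```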

---

## Install

Choose one of these two setups.

### Option A — Base setup

Use this if you want:

1. MCP tools
2. vault sync
3. FTS5 search
4. graph tools
5. Obsidian integration

No Ollama, `sqlite-vec`, or `LOCRAM_EMBED_*` variables are required.

```bash
uv tool install locram
# or: pipx install locram
locram init
```

### Option B — Embedding-enabled setup

Use this if you also want:

1. `semantic_search`
2. `hybrid_search`
3. `locram embed`
4. optional auto-embed after quiet period

This setup requires:

1. `sqlite-vec`
2. a running [Ollama](https://ollama.com) instance
3. an embedding model such as `bge-m3`
4. `LOCRAM_EMBED_*` environment variables

If you install locram globally with `uv tool`, install `sqlite-vec` into the same tool environment:

```bash
uv tool install --force --refresh --with sqlite-vec locram
```

Use the same command later to refresh or repair the global tool environment with embeddings enabled.

If you install locram into a normal Python environment, use:

```bash
pip install "locram[embeddings]"
```

Then add to your shell profile (`~/.zshrc` or equivalent):

```bash
export LOCRAM_EMBED_PROVIDER=ollama
export LOCRAM_EMBED_MODEL=bge-m3
# LOCRAM_EMBED_URL defaults to http://localhost:11434; set it only if Ollama runs on a different host or port
```

Then:

```bash
ollama pull bge-m3
locram embed
```

macOS system Python may block global `pip install` (PEP 668); prefer `uv tool` / `pipx`.

---

## Quick start

Base setup:

```bash
locram init
locram watch-install    # macOS: auto sync after vault edits
locram create --title "Hello" --type fleeting
locram search "Hello"
```

Embedding-enabled setup:

```bash
locram embed
locram embed-auto-install   # macOS: optional quiet-period auto-embed runner
```

### Obsidian

1. Install [Obsidian](https://obsidian.md/).
2. **Open folder as vault** and choose the locram vault directory:
   - default: **`~/.locram/vault`**
   - or: **`$LOCRAM_HOME/vault`** if you set `LOCRAM_HOME`.
3. Notes and the **graph** use that folder; `.obsidian/` is already configured by `locram init`.

Opening the folder in Finder (`open ~/.locram/vault`) does not open Obsidian; select the same path inside Obsidian via **Open folder as vault**.

---

## MCP config

Use the same MCP config shape for Cursor, Claude Desktop, Codex, and other MCP-compatible clients.

Base setup:

```json
{
  "mcpServers": {
    "locram": {
      "command": "locram",
      "args": ["serve"]
    }
  }
}
```

Embedding-enabled setup:

```json
{
  "mcpServers": {
    "locram": {
      "command": "locram",
      "args": ["serve"],
      "env": {
        "LOCRAM_EMBED_PROVIDER": "ollama",
        "LOCRAM_EMBED_MODEL": "bge-m3",
        "LOCRAM_EMBED_URL": "http://localhost:11434"
      }
    }
  }
}
```

Custom data directory:

```json
{
  "mcpServers": {
    "locram": {
      "command": "locram",
      "args": ["serve"],
      "env": {
        "LOCRAM_HOME": "/Users/you/notes",
        "LOCRAM_EMBED_PROVIDER": "ollama",
        "LOCRAM_EMBED_MODEL": "bge-m3",
        "LOCRAM_EMBED_URL": "http://localhost:11434"
      }
    }
  }
}
```
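
If you prefer to generate these files from a script, the three variants above can be produced by one small helper. This is a sketch only: `build_mcp_config` is a hypothetical name, and where the resulting JSON must be saved depends on your MCP client.

```python
import json
import shutil


def build_mcp_config(command=None, home=None, embed_model=None,
                     embed_url="http://localhost:11434") -> dict:
    """Build the mcpServers entry shown in the JSON examples above."""
    # Prefer an absolute path: GUI clients may not share your shell PATH.
    command = command or shutil.which("locram") or "locram"
    server = {"command": command, "args": ["serve"]}
    env = {}
    if home:
        env["LOCRAM_HOME"] = home
    if embed_model:
        # Only include LOCRAM_EMBED_* for the embedding-enabled setup.
        env["LOCRAM_EMBED_PROVIDER"] = "ollama"
        env["LOCRAM_EMBED_MODEL"] = embed_model
        env["LOCRAM_EMBED_URL"] = embed_url
    if env:
        server["env"] = env
    return {"mcpServers": {"locram": server}}


if __name__ == "__main__":
    print(json.dumps(build_mcp_config(embed_model="bge-m3"), indent=2))
```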

Notes:

1. Only include `LOCRAM_EMBED_*` if you are using the embedding-enabled setup.
2. GUI apps do not reliably inherit `~/.zshrc`, so MCP clients should set embedding env vars in the config itself.
3. For Claude Desktop, use the full binary path from `which locram` or the full path to `uv`.
4. From source, use:
   - `"command": "uv", "args": ["--directory", "/absolute/path/to/locram", "run", "locram", "serve"]`

> If Cursor already has locram connected as an MCP server, all 27 tools are immediately available in that workspace.

---

## MCP tools

### CRUD

| Tool | Description |
|------|-------------|
| `create_page` | `title`, `content`, `type`, `subject[]`, `tags[]`, `parent_id`, `review_interval_days` |
| `get_page` | `identifier` = page id or **exact** title; ambiguous title → error, use id |
| `update_page` | Partial fields only |
| `delete_page` | Soft default; `hard=true` removes row |
| `list_pages` | `status`, `type`, `subject`, `tag`, `since_days`, `limit`, `offset` |
| `search` | FTS5 BM25 on title + content |

### Partial edit

| Tool | Description |
|------|-------------|
| `replace_in_page` | `page_id`, `old_text`, `new_text` |
| `update_section` | `page_id`, `heading`, `new_content` |

### Links

| Tool | Description |
|------|-------------|
| `link_pages` | Typed link; inverses created except `reference` |
| `unlink_pages` | `link_type=null` removes all links between the pair |
| `set_parent` | `child_id`, `parent_id` / null |
| `batch_link` | `[{source_id, target_id, link_type}]` → `{created, skipped, errors}` |

### Graph

| Tool | Description |
|------|-------------|
| `find_orphaned_notes` | No links and no parent |
| `find_central_notes` | By link degree |
| `find_similar_notes` | Semantic nearest neighbors when the note has a current-model embedding; otherwise lexical fallback using the note’s title |
| `find_due_for_review` | Past review interval |
| `get_graph_summary` | Stats + topology; `limit` default **500** (pages/links capped) |

### Lifecycle

| Tool | Description |
|------|-------------|
| `promote_page` | **MCP only** (not a CLI command). Moves **maturity** one step: `fleeting` → `note-taking` → `permanent`. Call when a note is ready for the next stage. Cannot demote; cannot “promote” to `structure` / `hub` — use `update_page(type=…)` for those **classification** types. |
| `mark_reviewed` | **MCP only.** Sets `reviewed_at` without changing body text |

### Attachments

| Tool | Description |
|------|-------------|
| `get_attachment` | `filename` under `vault/attachments/` → base64 |
| `save_attachment` | `filename`, `data_b64` |
| `render_mermaid` | `name` (stem, no extension), `diagram` (Mermaid text, no fences), `output_format` (`svg` \| `png`, default `svg`). Renders via `mmdc` → `vault/attachments/<name>.<format>`. Returns `{rendered, path, filename, embed}`. Requires `mmdc` on PATH: `npm install -g @mermaid-js/mermaid-cli` |

### Semantic search

| Tool | Description |
|------|-------------|
| `semantic_search` | `query`, `limit=20`. Vector-only cosine retrieval. Returns `{results, coverage, warning?, error?}`. Requires `LOCRAM_EMBED_PROVIDER=ollama` + `LOCRAM_EMBED_MODEL` (and `sqlite-vec` + running Ollama). |
| `hybrid_search` | `query`, `limit=20`. FTS5 + vector merge via RRF. Returns `{results[sources], coverage, warning?, error?}`. Falls back to lexical-only if semantic not configured or on embed/store errors. |

`find_similar_notes(page_id)` is also embedding-aware: if the note has a `content` embedding for the current active model, the tool returns semantic nearest neighbors; otherwise it falls back to lexical title-based FTS similarity. Returned hits are annotated with `sources=["semantic"]` or `sources=["fts"]`.
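
For readers unfamiliar with RRF (reciprocal rank fusion), the merge behind `hybrid_search` can be sketched as follows. This illustrates the general technique, not locram's code: the constant `k=60` is the value commonly used in the literature and is an assumption here, as are the function name and the result shape.

```python
def rrf_merge(fts_ids, semantic_ids, k=60, limit=20):
    """Fuse two ranked id lists with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) per id, with rank starting at 1.
    Ids found by both retrievers accumulate both contributions.
    """
    scores = {}
    sources = {}
    for name, ranking in (("fts", fts_ids), ("semantic", semantic_ids)):
        for rank, page_id in enumerate(ranking, start=1):
            scores[page_id] = scores.get(page_id, 0.0) + 1.0 / (k + rank)
            sources.setdefault(page_id, []).append(name)
    # Highest fused score first; ties broken by id for a stable order.
    ordered = sorted(scores, key=lambda pid: (-scores[pid], pid))
    return [
        {"id": pid, "score": scores[pid], "sources": sources[pid]}
        for pid in ordered[:limit]
    ]
```

A note ranked by both retrievers ends up ahead of notes that only one of them found, which is the point of the fusion step.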

### Embeddings

| Tool | Description |
|------|-------------|
| `store_embedding` | `page_id`, `embedding: list[float]`, `model`, `field='content'`. Little-endian float32; UPSERT |
| `find_unembedded` | `limit`, optional `model`. With `model` omitted and `LOCRAM_EMBED_MODEL` unset: ids with no `content` embedding at all. With `model` given (or omitted while `LOCRAM_EMBED_MODEL` is set): ids missing or stale for that model (same queue as `locram embed`) |
| `find_stale_embeddings` | `model`, `limit`. Explicit stale/missing queue for a given embedding model |
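
The `store_embedding` row mentions little-endian float32. As a hedged illustration of that encoding, the sketch below packs a Python list into such a byte string and back using the standard `struct` module. The helper names are hypothetical, and how locram lays out the stored value internally is not confirmed here.

```python
import struct


def pack_embedding(vector) -> bytes:
    """Encode floats as consecutive little-endian float32 values."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_embedding(blob: bytes) -> list[float]:
    """Decode a little-endian float32 byte string back into floats."""
    if len(blob) % 4 != 0:
        raise ValueError("blob length must be a multiple of 4 bytes")
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))
```

Each dimension takes 4 bytes, so a 1024-dimensional vector occupies 4096 bytes.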

---

## CLI

```bash
locram init                              # Create DB, vault, Obsidian config
locram serve                             # Start MCP stdio server
locram sync [--file PATH]                # Sync vault → DB (full or single file)
locram vault-rebuild                     # Rebuild all .md files from DB
locram watch-install                     # Install macOS launchd watcher
locram watch-uninstall                   # Remove watcher
locram embed-auto-install                # Install periodic macOS auto-embed runner
locram embed-auto-uninstall              # Remove periodic auto-embed runner

locram create --title "..." [options]    # Create note (JSON output)
locram get <id>                          # Get note (JSON output)
locram search <query> [--limit N]        # FTS5 search (JSON output)
locram list [--type ...] [--status ...]  # List notes (JSON output)
locram link <src_id> <dst_id> [--type ...]  # Create typed link

locram embed [--limit N] [--force]       # Generate / refresh embeddings via Ollama
locram embed-auto-run                    # One quiet-period-gated auto-embed pass

locram reset                             # DELETE ALL DATA, recreate from scratch
```

All commands print **JSON**.
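
Because the output is JSON, the CLI is easy to drive from a script. A minimal sketch, assuming `locram` is on your `PATH`; the helper name `locram_json` and the injectable `run` parameter are illustrative choices, and the shape of each command's JSON is not specified here.

```python
import json
import subprocess


def locram_json(*args, run=subprocess.run):
    """Run a locram subcommand and parse its stdout as JSON.

    `run` is injectable so the helper can be exercised without the CLI.
    """
    proc = run(["locram", *args], capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```

For example, `locram_json("search", "Hello", "--limit", "5")` runs the documented search command and returns the parsed result.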

---

## Semantic search setup

This section applies only to the **embedding-enabled setup**.

Configuration rules:

1. For the CLI: set runtime variables in your shell profile (`~/.zshrc` or equivalent).
2. For MCP clients: put `LOCRAM_EMBED_*` in the MCP config `env` block. Do not rely on GUI apps inheriting `~/.zshrc`.
3. For `watch-install` / `embed-auto-install`: the current `LOCRAM_HOME` and `LOCRAM_EMBED_*` values are written into the launchd plist `EnvironmentVariables`. After changing any of those values, rerun the install commands.

### Environment variables

These are the actual environment variables read by `config.py` at startup:

| Variable | Default | Description |
|----------|---------|-------------|
| `LOCRAM_HOME` | `~/.locram` | Root data directory. Contains DB, vault, logs, lock/pending files. |
| `LOCRAM_EMBED_PROVIDER` | `""` | Set to `ollama` to enable. Empty = semantic disabled. |
| `LOCRAM_EMBED_MODEL` | `""` | Ollama model name, e.g. `bge-m3`. Empty = semantic disabled. |
| `LOCRAM_EMBED_URL` | `http://localhost:11434` | Ollama base URL. Change only if Ollama runs on a non-default host or port. |
| `LOCRAM_EMBED_API_KEY` | `""` | Reserved for future providers. Not used with Ollama. |
| `LOCRAM_EMBED_AUTO` | `0` | Set to `1` to enable auto-embed after sync (opt-in). |
| `LOCRAM_EMBED_QUIET_SECONDS` | `60` | Quiet period before auto-embed runs after sync. |
| `LOCRAM_EMBED_BATCH_SIZE` | `50` | Max notes per auto-embed pass. |

### Embedding commands

Assumes you already completed the embedding-enabled setup above.

```bash
locram embed              # embed new and stale notes
locram embed --limit 100  # limit batch size (default 200)
locram embed --force      # re-embed all active notes (up to 100k) — use after model switch
```

`locram embed` output:

```json
{ "embedded": 42, "skipped": 0, "failed": 0, "model": "bge-m3" }
```

`semantic_search` and `hybrid_search` always include `coverage` so you know what fraction of the active corpus was actually searched:

```json
{ "coverage": { "searched": 150, "total_active": 500 } }
```

A `warning` field appears when the DB contains embeddings from a different model — run `locram embed --force` to resolve.

### Optional auto-embed after vault edits (macOS)

Auto-embed is **opt-in** and macOS-only (uses launchd). Default: disabled.

Set the variables below in your shell, then install or reinstall the watcher/runner so the values are written into the launchd plist:

```bash
# ~/.zshrc
export LOCRAM_EMBED_AUTO=1
export LOCRAM_EMBED_QUIET_SECONDS=60   # wait N seconds after sync before embedding
export LOCRAM_EMBED_BATCH_SIZE=50      # notes per pass

locram watch-install        # refreshes sync watcher env snapshot
locram embed-auto-install   # registers a launchd agent that runs every 30s
```

How it works:

1. vault edits trigger `locram sync` (via the existing `watch-install` watcher)
2. if sync inserted or updated any notes, a `~/.locram/embed.pending` marker is written
3. the embed runner (`locram embed-auto-run`) fires every 30 s via launchd
4. it skips if: auto disabled / provider not configured / no pending marker / quiet period not elapsed / another run is active (lock file)
5. once the quiet period passes, it embeds stale notes in batches and clears the pending marker when the queue is drained

This is **not** a true per-save event — it is a quiet-period approximation over the vault-synced corpus.
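
The skip conditions in step 4 can be read as a single gate function. The sketch below is an interpretation, not locram's implementation: it assumes the quiet period is measured from the modification time of `embed.pending` and that a lock older than 10 minutes counts as stale. All names are hypothetical.

```python
from typing import Optional


def skip_reason(*, auto_enabled: bool, provider: str, model: str,
                pending_mtime: Optional[float], lock_mtime: Optional[float],
                now: float, quiet_seconds: float = 60.0,
                stale_lock_seconds: float = 600.0) -> Optional[str]:
    """Return why an auto-embed pass should be skipped, or None to run.

    pending_mtime / lock_mtime are file modification times in seconds,
    or None when the corresponding file does not exist.
    """
    if not auto_enabled:
        return "auto disabled"
    if not (provider and model):
        return "provider not configured"
    if pending_mtime is None:
        return "no pending marker"
    if now - pending_mtime < quiet_seconds:
        return "quiet period not elapsed"
    # A fresh lock means another run is active; a stale one is ignored.
    if lock_mtime is not None and now - lock_mtime <= stale_lock_seconds:
        return "another run is active"
    return None
```

Keeping the gate free of file access makes each condition easy to check in isolation; a caller would read the two modification times from `~/.locram/` before invoking it.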

Runtime files (all under `~/.locram/`):

| File | Purpose |
|------|---------|
| `embed.pending` | Set by sync when notes changed; cleared by runner when queue drained |
| `embed.lock` | Prevents concurrent runs; auto-cleared if stale (>10 min) |
| `embed-runner.log` | stdout of every `embed-auto-run` invocation |
| `embed-runner.err` | stderr of every `embed-auto-run` invocation |

Run one pass manually at any time:

```bash
locram embed-auto-run
```

---

## Note types

| Type | Handle | Description |
|------|--------|-------------|
| Fleeting | `fleeting` | Quick raw capture — default type |
| Note-taking | `note-taking` | Structured note being refined |
| Permanent | `permanent` | Finished atomic idea — one idea per note |
| Structure | `structure` | Map of Content — organizes a topic via links |
| Hub | `hub` | Domain entry point — links to structure notes |

**Maturity** (pipeline): `fleeting` → `note-taking` → `permanent` — advance with the MCP tool **`promote_page`** when the note earns the next stage. **Classification** types `structure` and `hub` are not maturity steps; set them with **`update_page(type=…)`** (or `create_page`), not `promote_page`.
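
The maturity rule can be sketched as a one-step state machine. The function below is illustrative only; in particular, raising an error when a note is already `permanent` is an assumption, since the behaviour at the end of the pipeline is not documented here.

```python
MATURITY = ["fleeting", "note-taking", "permanent"]


def next_maturity(note_type: str) -> str:
    """Return the next maturity stage for a note type.

    Classification types (structure, hub) are rejected because they are
    not maturity steps.
    """
    if note_type not in MATURITY:
        raise ValueError(f"{note_type!r} is not a maturity stage")
    index = MATURITY.index(note_type)
    if index == len(MATURITY) - 1:
        # Assumption: there is nothing beyond `permanent` to promote to.
        raise ValueError(f"{note_type!r} is already the final stage")
    return MATURITY[index + 1]
```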

---

## Link types

12 link types: five forward/inverse pairs, the symmetric `related`, and the one-way `reference`, which has no inverse. Pair examples: `extends` / `extended_by`, `supports` / `supported_by`, …

| Forward | Inverse | Relationship |
|---------|---------|--------------|
| `related` | `related` | General connection (symmetric) |
| `extends` | `extended_by` | A builds upon B |
| `supports` | `supported_by` | A provides evidence for B |
| `contradicts` | `contradicted_by` | A challenges B |
| `refines` | `refined_by` | A is a more precise version of B |
| `questions` | `questioned_by` | A raises a question about B |
| `reference` | *(none)* | A cites B as a source (unidirectional) |

- **Vertical:** `parent_id` / `set_parent` (sub-items).
- **Horizontal:** typed links (`link_pages`).
- **Inline:** `[[wikilinks]]` in body — resolved in `get_page`, not stored as link rows.
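
The auto-inverse behaviour in the table above can be sketched as a lookup table. Representing links as `(source, target, type)` tuples is an assumption made for illustration; the actual storage format may differ.

```python
PAIRS = [
    ("extends", "extended_by"),
    ("supports", "supported_by"),
    ("contradicts", "contradicted_by"),
    ("refines", "refined_by"),
    ("questions", "questioned_by"),
]

# `related` is its own inverse; `reference` has none.
INVERSE = {"related": "related", "reference": None}
for forward, inverse in PAIRS:
    INVERSE[forward] = inverse
    INVERSE[inverse] = forward


def link_rows(source_id: str, target_id: str, link_type: str):
    """Return the link tuples to create for one typed link."""
    if link_type not in INVERSE:
        raise ValueError(f"unknown link type: {link_type!r}")
    rows = [(source_id, target_id, link_type)]
    inverse = INVERSE[link_type]
    if inverse is not None:
        rows.append((target_id, source_id, inverse))
    return rows
```

Linking A to B with `extends` yields a second tuple from B to A with `extended_by`, while `reference` yields only the forward tuple.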

---

## Sync

**DB → vault:** every MCP/CLI mutation updates SQLite, then overwrites the `.md` file.

**Vault → DB:** `locram sync` (full vault) or `locram sync --file PATH` (one file, no delete detection). The macOS watcher runs a debounced full sync. Logs: `~/.locram/watcher.log`.

**Optional semantic maintenance:** if auto-embed is enabled, the periodic embed runner checks for pending embedding work after a quiet period and runs batched `locram embed` logic in the background. This never blocks sync itself.

**Compared for “did the file change?”:** body hash, title (from first `#` H1, else filename stem), `type`, `status`, `subject`, `tags`, `review in`, `parent`.
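
As a rough illustration of this comparison, the sketch below derives the title and builds a fingerprint from the compared fields. SHA-256 is an assumed hash choice, and the function names are hypothetical; only the list of compared fields and the title rule come from the description above.

```python
import hashlib
import re
from pathlib import Path

COMPARED_FIELDS = ("type", "status", "subject", "tags", "review in", "parent")


def derive_title(body: str, filename: str) -> str:
    """Use the first `# ` H1 as the title, else the filename stem."""
    match = re.search(r"^#[ \t]+(.+?)[ \t]*$", body, flags=re.MULTILINE)
    return match.group(1) if match else Path(filename).stem


def fingerprint(meta: dict, body: str, filename: str) -> dict:
    """Collect everything that is compared to detect a change."""
    result = {
        "body_hash": hashlib.sha256(body.encode("utf-8")).hexdigest(),
        "title": derive_title(body, filename),
    }
    for field in COMPARED_FIELDS:
        result[field] = meta.get(field)
    return result


def file_changed(stored: dict, meta: dict, body: str, filename: str) -> bool:
    """True when the file differs from the stored fingerprint."""
    return fingerprint(meta, body, filename) != stored
```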

**From vault into DB:** `title`, `content`, `type`, `status`, `subject`, `tags`, `review_interval_days`, and `parent` → `parent_id` — when they differ from the DB, the row is updated and **`updated_at` in the DB is set to now**. So editing a note in Obsidian and syncing **advances `updated`**; you do not set `updated` by hand in YAML.

**Not read from vault YAML into DB:** `id`, `created_at`, **`updated`**, `reviewed` / `reviewed_at`, persisted typed links, embeddings. Those stay DB-driven; the file is rewritten from DB after a successful sync or MCP write.

**Hierarchy note:** `parent` is vault-writable. Use a blank value to clear it, or `[[Exact Parent Title]]` to set it. If the parent reference is unknown, ambiguous, or self-referential, sync keeps the DB value and rewrites the file back to the canonical parent.

**Inline wikilinks:** `[[Title]]` references inside note content are preserved in `content` and resolved at read time as `inline_mentions`; they are **not** persisted into the typed `links` graph automatically.
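
Read-time resolution of inline wikilinks can be approximated with a regular expression. This is a sketch under assumptions: it treats Obsidian-style `[[Title|alias]]` and `[[Title#heading]]` as mentions of `Title`, which may or may not match how locram populates `inline_mentions`.

```python
import re

WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:[#|][^\[\]]*)?\]\]")


def inline_mentions(content: str) -> list[str]:
    """Return unique wikilink targets in order of first appearance."""
    seen = []
    for match in WIKILINK.finditer(content):
        title = match.group(1).strip()
        if title and title not in seen:
            seen.append(title)
    return seen
```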

**Content change from vault:** if the synced update includes new **`content`**, **`reviewed_at` is cleared** in the DB (then reflected in YAML on the next write).

---

## Frontmatter reference

| Field | Role |
|-------|------|
| `type`, `subject`, `tags`, `status`, `review in`, `parent` | Editable in Obsidian; **vault → DB** when sync detects changes |
| `created` | **DB only** — set at creation; YAML is a copy; manual edits are not applied from vault |
| `updated` | **DB only** — mirrors `updated_at`; advances on MCP/CLI updates and on vault→DB sync when something **actually changed**. **Not** bumped by **`mark_reviewed`** alone. **Do not hand-edit YAML** |
| `reviewed` | **DB only** — use MCP **`mark_reviewed`**; vault→DB sync **does not** read `reviewed` from YAML. **Clearing:** updating **content** via MCP or sync clears `reviewed_at` |
| `id` | ULID; **never change** |

Example block (shape only):

```yaml
---
type: permanent
subject: [ai]
tags: [research]
status: active
created: 2026-03-21T10:30
updated: 2026-03-21T14:15
review in: 7
reviewed: 2026-03-21T18:00:00Z
parent: "[[Parent Title]]"
id: 01KMXXXXXXXXXXXXXXXXXXXXXX
---
```

---

## Requirements

Python 3.11+. The launchd watcher and auto-embed runner are macOS-only; `locram sync` works on Linux too.

---

## License

MIT
