Metadata-Version: 2.4
Name: chatmem
Version: 0.1.9
Summary: Drop-in context memory layer for OpenAI-compatible LLM sessions
License: MIT
Project-URL: Homepage, https://github.com/yourname/chatmem
Project-URL: Repository, https://github.com/yourname/chatmem
Keywords: llm,memory,openai,context,chatbot
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: openai>=1.0.0
Requires-Dist: tiktoken>=0.7.0
Provides-Extra: mcp
Requires-Dist: mcp>=1.0.0; extra == "mcp"
Provides-Extra: serve
Requires-Dist: fastapi>=0.115.0; extra == "serve"
Requires-Dist: uvicorn>=0.29.0; extra == "serve"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"

# chatmem

A drop-in context memory layer for long-running LLM sessions.

Keeps conversations alive across process restarts, compresses old turns automatically, and builds persistent user memory — without changing how you call the API.

Works with any OpenAI-compatible provider (OpenAI, DeepSeek, Moonshot, local models via Ollama, etc.).

---

## How it works

```
┌─────────────────────────────────────────────────────────┐
│  Every chat() call assembles context from 3 layers:     │
│                                                         │
│  [system + core memory]  ← persistent user traits      │
│  [recent sessions]       ← summaries of past sessions  │
│  [current history]       ← verbatim last 20 turns       │
│         ↓ old turns auto-compressed when > 50% full     │
└─────────────────────────────────────────────────────────┘
```

- **Session persistence** — history is saved to SQLite after every turn; process restarts pick up where they left off
- **Lazy compression** — old turns are replaced with structured summaries when context hits 50% of the model's window
- **Core memory** — a background judge extracts long-term facts (preferences, goals, identity) into a persistent key-value store with decay
- **Recent memory** — session summaries are carried into future sessions so the model remembers past conversations
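
The layering and the 50% trigger described above can be sketched in plain Python. This is an illustration under stated assumptions, not chatmem's actual implementation: `SketchContext`, `estimate_tokens`, and the `summarize` callback are hypothetical names, the token count is a crude character-based estimate rather than a real tokenizer, and the verbatim window is counted in messages for simplicity.

```python
from dataclasses import dataclass, field


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: roughly 4 characters per token."""
    return max(1, len(text) // 4)


@dataclass
class SketchContext:
    """Hypothetical model of the three layers; not the chatmem API."""
    context_window: int
    compress_ratio: float = 0.5   # compress once usage exceeds 50% of the window
    keep_verbatim: int = 20       # newest messages that are kept word for word
    core_memory: dict = field(default_factory=dict)
    recent_summaries: list = field(default_factory=list)
    history: list = field(default_factory=list)

    def assemble(self, system_prompt: str) -> list:
        """Build the message list: system + core memory, summaries, history."""
        facts = "\n".join(f"- {k}: {v}" for k, v in self.core_memory.items())
        system = system_prompt + ("\n\nKnown about the user:\n" + facts if facts else "")
        messages = [{"role": "system", "content": system}]
        for summary in self.recent_summaries:
            messages.append({"role": "system", "content": f"Earlier session: {summary}"})
        return messages + self.history

    def needs_compression(self, system_prompt: str) -> bool:
        """True when the assembled context exceeds the configured share of the window."""
        used = sum(estimate_tokens(m["content"]) for m in self.assemble(system_prompt))
        return used > self.compress_ratio * self.context_window

    def compress(self, summarize) -> None:
        """Replace everything but the newest messages with one summary message."""
        old = self.history[:-self.keep_verbatim]
        recent = self.history[-self.keep_verbatim:]
        if old:
            self.history = [{"role": "system", "content": summarize(old)}] + recent


# Tiny window so the trigger fires after a handful of messages.
ctx = SketchContext(context_window=200, keep_verbatim=2)
ctx.core_memory["preferred_language"] = "Python"
for i in range(6):
    ctx.history.append({"role": "user", "content": f"message {i} " + "x" * 100})

if ctx.needs_compression("You are a helpful assistant."):
    ctx.compress(lambda old: f"Summary of {len(old)} earlier messages")
```

In a real setup the `summarize` callback would be an LLM call (the config's `compress_model` field suggests a separate model can be used for it); here it is a lambda so the sketch runs offline.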

---

## Install

```bash
pip install chatmem
```

---

## Quick start

```python
from chatmem import ContextManager, ContextConfig, LLMConfig

config = ContextConfig(llm=LLMConfig(
    model="gpt-4o-mini",
    api_key_env="OPENAI_API_KEY",   # reads from environment variable
    context_window=128_000,
))

cm = ContextManager.create(config=config)

# Drop-in replacement for client.chat.completions.create
response = cm.chat([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user",   "content": "Hello!"},
])
print(response.choices[0].message.content)
```

---

## Configuration

### Generate a config file

```bash
chatmem init            # creates ./chatmem.json  (project-level)
chatmem init --global   # creates ~/.chatmem/config.json  (shared across projects)
```

Edit the generated file, then verify:

```bash
chatmem config          # shows which config file is active and its contents
```

### Config file format

```json
{
  "llm": {
    "model": "gpt-4o-mini",
    "base_url": null,
    "context_window": 128000,
    "compress_model": null,
    "extra_body": null,
    "api_key": null,
    "api_key_env": "OPENAI_API_KEY"
  },
  "dimensions": {
    "identity":      { "label": "Who I am",          "stability": 50.0 },
    "values":        { "label": "What I believe",    "stability": 40.0 },
    "goals":         { "label": "What I want",       "stability": 30.0 },
    "preferences":   { "label": "Likes / dislikes",  "stability": 25.0 },
    "capabilities":  { "label": "What I can do",     "stability": 30.0 },
    "emotional":     { "label": "How I react",       "stability": 20.0 },
    "autobiography": { "label": "Key experiences",   "stability": 35.0 }
  }
}
```

### Loading config in code

`from_json()` with no arguments auto-discovers the config file:

```python
# Auto-discover: checks $CHATMEM_CONFIG → ./chatmem.json → ~/.chatmem/config.json
config = ContextConfig.from_json()
cm = ContextManager.create(config=config)

# Or pass an explicit path
config = ContextConfig.from_json("/path/to/myconfig.json")

# Or set an environment variable (useful for containers / CI)
# CHATMEM_CONFIG=/etc/myapp/chatmem.json python app.py
```
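
The discovery order in the comment above can be pictured as a first-match search. The sketch below is an assumption about how such a lookup might work, not the library's code: `discover_config` is a hypothetical helper, and whether a `CHATMEM_CONFIG` value that points at a missing file falls through or raises an error is not documented here (this sketch falls through).

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def discover_config(cwd: Optional[Path] = None,
                    home: Optional[Path] = None,
                    env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the first existing config file in the documented order, else None."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else cwd
    home = Path.home() if home is None else home

    candidates = []
    if env.get("CHATMEM_CONFIG"):                          # 1. explicit override
        candidates.append(Path(env["CHATMEM_CONFIG"]))
    candidates.append(cwd / "chatmem.json")                # 2. project-level file
    candidates.append(home / ".chatmem" / "config.json")   # 3. global file

    for path in candidates:
        if path.is_file():
            return path
    return None


# Demonstration in a throwaway directory: only the project-level file exists.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project"
    project.mkdir()
    (project / "chatmem.json").write_text("{}")
    found = discover_config(cwd=project, home=Path(tmp) / "home", env={})
    found_name = found.name if found else None
```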

### API key resolution order

1. `api_key=` passed directly to `ContextManager.create()`
2. `api_key` field in config
3. Environment variable named by `api_key_env` in config ← recommended for production
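
As a rough sketch, the resolution order above amounts to a three-step fallback. `resolve_api_key` is a hypothetical helper written for illustration; it takes the `llm` block of the config as a plain dict and is not part of the chatmem API.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(explicit: Optional[str],
                    llm_config: Mapping[str, Optional[str]],
                    env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Apply the documented precedence: argument, config field, environment."""
    env = os.environ if env is None else env
    if explicit:                               # 1. api_key= passed directly
        return explicit
    if llm_config.get("api_key"):              # 2. api_key field in config
        return llm_config["api_key"]
    env_name = llm_config.get("api_key_env")   # 3. variable named by api_key_env
    if env_name:
        return env.get(env_name)
    return None


llm = {"api_key": None, "api_key_env": "OPENAI_API_KEY"}
fake_env = {"OPENAI_API_KEY": "sk-from-env"}   # placeholder value, not a real key

from_env = resolve_api_key(None, llm, fake_env)
from_arg = resolve_api_key("sk-explicit", llm, fake_env)
```

Keeping `api_key` as `null` in the file and relying on `api_key_env` means the secret never has to be committed alongside the config.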

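### Provider examples

Each snippet shows only the `llm` fields that differ per provider. The `//` comments are labels for this README and are not valid in a real JSON config file.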

```json
// OpenAI (default)
{ "model": "gpt-4o-mini", "base_url": null }

// DeepSeek
{ "model": "deepseek-chat", "base_url": "https://api.deepseek.com/v1" }

// Moonshot / Kimi
{ "model": "kimi-k2.5", "base_url": "https://api.moonshot.cn/v1",
  "extra_body": { "thinking": { "type": "disabled" } } }

// Local (Ollama)
{ "model": "llama3", "base_url": "http://localhost:11434/v1" }
```

---

## Core memory dimensions

Dimensions define what the model remembers long-term about a user. Customize them to match your domain:

```python
from chatmem import ContextConfig, LLMConfig, DimensionConfig

config = ContextConfig(
    llm=LLMConfig(model="gpt-4o-mini", api_key_env="OPENAI_API_KEY"),
    dimensions={
        "domain":      DimensionConfig(label="Expertise area",  stability=40.0),
        "preferences": DimensionConfig(label="Preferences",     stability=25.0),
        "style":       DimensionConfig(label="Communication style", stability=30.0),
    },
)
```

`stability` controls how slowly an entry decays — higher values persist longer.
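
The exact decay formula is not stated here. Purely to build intuition, the sketch below assumes an exponential forgetting curve in which `stability` acts as a time constant measured in days; both the formula and the unit are assumptions, and `retention` is a hypothetical function rather than part of chatmem.

```python
import math


def retention(age_days: float, stability: float) -> float:
    """Assumed model: strength falls as exp(-age / stability), from 1.0 toward 0."""
    return math.exp(-age_days / stability)


# Compare two of the default dimensions after 30 days under this assumed model.
identity_after_30 = retention(30, stability=50.0)    # slow decay
emotional_after_30 = retention(30, stability=20.0)   # fast decay
```

Under this model an `identity` entry (stability 50.0) retains more strength after the same interval than an `emotional` entry (stability 20.0), which matches the stated behavior that higher values persist longer.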

---

## Session lifecycle

```python
# Non-streaming (default)
response = cm.chat(messages)
print(response.choices[0].message.content)

# Streaming — pass any kwarg supported by the OpenAI SDK
for chunk in cm.chat(messages, stream=True):
    print(chunk.choices[0].delta.content or "", end="", flush=True)

# Async (call from inside an async function)
response = await cm.achat(messages)

# Explicitly end a session (saves summary to recent memory)
cm.end_session()

# Manually store something in core memory
cm.remember("preferred_language", "Python", dimension="preferences")
```

All extra kwargs (`temperature`, `max_tokens`, `stream`, etc.) are forwarded directly to the underlying API — chatmem does not need to declare them.
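
Forwarding of this kind is usually done with `**kwargs`. The sketch below shows the pattern with a stand-in backend so it runs without network access; `ForwardingChat` and `fake_create` are hypothetical names, and chatmem's real `chat()` does more than this (context assembly, persistence).

```python
from typing import Any, Callable


class ForwardingChat:
    """Hypothetical wrapper that passes unknown keyword arguments straight through."""

    def __init__(self, create: Callable[..., Any], model: str) -> None:
        self._create = create   # e.g. client.chat.completions.create
        self._model = model

    def chat(self, messages: list, **kwargs: Any) -> Any:
        # The wrapper supplies the model; everything else is forwarded untouched.
        return self._create(model=self._model, messages=messages, **kwargs)


def fake_create(**params: Any) -> dict:
    """Stand-in for the API call: returns the parameters it received."""
    return params


wrapper = ForwardingChat(fake_create, model="gpt-4o-mini")
sent = wrapper.chat(
    [{"role": "user", "content": "Hello!"}],
    temperature=0.2,
    max_tokens=64,
)
```

Because the wrapper never names `temperature` or `max_tokens`, parameters added to the upstream API later pass through without a wrapper update.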

---

## MCP integration (OpenClaw, etc.)

Install with MCP support:

```bash
pip install "chatmem[mcp]"
```

Start the MCP server:

```bash
chatmem-mcp                        # auto-discovers chatmem.json / ~/.chatmem/config.json
chatmem-mcp --config path/to.json  # explicit config
chatmem-mcp --db /data/chatmem.db  # explicit DB path
```

Add to your agent's MCP config:

```json
{
  "mcpServers": {
    "chatmem": {
      "command": "chatmem-mcp",
      "args": []
    }
  }
}
```

### Tools exposed

| Tool | Args | Description |
|------|------|-------------|
| `chat` | `message`, `system?` | Send user message, get reply with persistent memory |
| `remember` | `key`, `value`, `dimension?` | Manually write a fact to core memory |
| `end_session` | — | Save session summary and reset history |
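
MCP clients normally build tool invocations for you, but it can help to see what a call to one of these tools looks like on the wire. The sketch below constructs a JSON-RPC 2.0 `tools/call` request for the `remember` tool using the argument names from the table; `tool_call` is a hypothetical helper and the request `id` is arbitrary.

```python
import json


def tool_call(request_id: int, name: str, arguments: dict) -> dict:
    """Build an MCP tool invocation as a JSON-RPC 2.0 request object."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


remember_request = tool_call(
    1,
    "remember",
    {"key": "preferred_language", "value": "Python", "dimension": "preferences"},
)
wire_message = json.dumps(remember_request)   # what the client would send
```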

### Design: only user-facing conversation goes through chatmem

The agent's internal reasoning and tool calls are unaffected.
Only the user ↔ assistant exchange is routed through chatmem:

```
Agent internal reasoning → agent's own LLM  (unchanged)
Agent tool calls         → agent's tools    (unchanged)
User-facing reply        → chatmem.chat()   (memory managed here)
```

---

## Proxy server (recommended for agent integration)

The proxy is an OpenAI-compatible HTTP server. Any agent that supports a custom `base_url` works with zero code changes.

Install:

```bash
pip install "chatmem[serve]"
```

Start:

```bash
chatmem serve                        # default: 127.0.0.1:12434
chatmem serve --port 8080
chatmem serve --host 0.0.0.0         # expose to network
chatmem serve --config path/to.json --db /data/chatmem.db
```

Configure your agent:

```python
from openai import OpenAI

# Any OpenAI-compatible client
client = OpenAI(base_url="http://127.0.0.1:12434/v1", api_key="any")
```

```bash
# Environment variable
export OPENAI_BASE_URL=http://127.0.0.1:12434/v1
```

All LLM calls are automatically enriched with memory — no agent code changes required.
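
For agents that do not use the OpenAI SDK, a plain HTTP request should work as well. The sketch below builds such a request with the standard library, assuming the proxy serves the usual OpenAI-style `/chat/completions` route under the `/v1` base URL shown above; `build_chat_request` is a hypothetical helper, and the request is constructed but not sent so the example runs without a server.

```python
import json
import urllib.request


def build_chat_request(base_url: str, messages: list,
                       model: str = "gpt-4o-mini",
                       api_key: str = "any") -> urllib.request.Request:
    """Prepare a POST to the assumed OpenAI-compatible chat completions route."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_chat_request(
    "http://127.0.0.1:12434/v1",
    [{"role": "user", "content": "Hello!"}],
)

# With `chatmem serve` running, the request could be sent like this:
# with urllib.request.urlopen(request) as resp:
#     reply = json.load(resp)
```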

---

## Integration via code

`ContextManager.chat()` is a drop-in replacement for `client.chat.completions.create()`:

```python
# Before
response = client.chat.completions.create(model=..., messages=messages)

# After (the model comes from the chatmem config)
response = cm.chat(messages)
```

---

## Requirements

- Python 3.11+
- `openai >= 1.0.0`
- `tiktoken >= 0.7.0`
