Metadata-Version: 2.4
Name: hermes-px
Version: 0.0.4
Summary: Secure AI inference proxy library with Tor routing — Drop-in OpenAI SDK replacement
Author: EGen Labs
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: requests<3.0.0,>=2.31.0
Requires-Dist: requests[socks]<3.0.0,>=2.31.0
Requires-Dist: python-dotenv<2.0.0,>=1.0.0
Provides-Extra: rag
Requires-Dist: chromadb<1.0.0,>=0.4.0; extra == "rag"
Requires-Dist: watchdog<5.0.0,>=3.0.0; extra == "rag"
Requires-Dist: tiktoken<1.0.0,>=0.5.0; extra == "rag"
Requires-Dist: sentence-transformers<4.0.0,>=2.2.0; extra == "rag"
Requires-Dist: PyMuPDF<2.0.0,>=1.23.0; extra == "rag"
Requires-Dist: python-docx<2.0.0,>=1.0.0; extra == "rag"
Requires-Dist: Pillow<12.0.0,>=10.0.0; extra == "rag"
Requires-Dist: pytesseract<1.0.0,>=0.3.10; extra == "rag"
Dynamic: author
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# 🏛️ Hermes — Secure AI Inference Proxy

[![PyPI version](https://badge.fury.io/py/hermes-px.svg)](https://badge.fury.io/py/hermes-px)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

**Hermes** is a production-grade Python SDK that provides a **1:1 drop-in replacement** for the OpenAI Python library. 

Instead of calling centralized AI servers directly, Hermes routes all inference traffic through configurable **5-layer Tor proxy chains**, ensuring complete anonymity while maintaining full API compatibility. Built for security researchers, privacy advocates, and developers who need robust, anonymized AI inference.

> *Named after the Greek god of messengers, travelers, and thieves — Hermes carries your payloads through hidden paths between realms.*

---

## 📑 Table of Contents
1. [System Requirements](#-system-requirements)
2. [Installation](#-installation)
3. [Quick Start: The 2-Minute Guide](#-quick-start-the-2-minute-guide)
4. [Migrating from OpenAI SDK](#-migrating-from-openai-sdk)
5. [Tutorial: Core Features](#-tutorial-core-features)
   - [Basic Inference](#1-basic-inference)
   - [Multi-Turn Conversations](#2-multi-turn-conversations)
   - [Understanding the Response](#3-understanding-the-response)
6. [Advanced Feature: Local RAG Pipeline](#-advanced-feature-local-rag-pipeline)
7. [Error Handling & Stability](#-error-handling--stability)
8. [Configuration & Environment](#-configuration--environment)
9. [Interactive Learning CLI](#-interactive-learning-cli)

---

## 💻 System Requirements

Hermes requires an active Tor daemon to route traffic. **It will fail to connect if Tor is not running.**

By default, Hermes looks for Tor on `127.0.0.1:9050`.

**Ubuntu/Debian:**
```bash
sudo apt update
sudo apt install tor
sudo systemctl start tor
```

**macOS:**
```bash
brew install tor
brew services start tor
```

**Windows:**
```powershell
# Using Chocolatey (Run as Administrator)
choco install -y tor
net start tor
```
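
Before creating a client, it can help to confirm that something is listening on the SOCKS port. The following is a minimal standard-library sketch; `tor_port_open` is a hypothetical helper, not part of Hermes. It only checks that the TCP port accepts connections, not that the listener is actually Tor.

```python
import socket


def tor_port_open(host: str = "127.0.0.1", port: int = 9050, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection performs the TCP handshake and raises OSError on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `tor_port_open()` with no arguments probes the default `127.0.0.1:9050` address mentioned above.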

---

## 📦 Installation

Hermes is published on PyPI as `hermes-px` (to avoid naming conflicts). 

However, **in your code, you will always `import hermes`**.

### Standard Installation (Core API Routing)
```bash
pip install hermes-px
```

### Installation with RAG Support
If you want to use the completely offline, local Retrieval-Augmented Generation (RAG) capabilities:
```bash
pip install "hermes-px[rag]"
```
*(Note: this installs heavy dependencies such as ChromaDB, PyMuPDF, and sentence-transformers.)*

---

## ⚡ Quick Start: The 2-Minute Guide

The recommended way to use Hermes is via a context manager. This ensures the HTTP session and proxy connections are safely closed after use.

```python
from hermes import Hermes

# 1. Initialize the client (automatically uses local Tor proxy)
with Hermes() as client:
    
    # 2. Check if Tor is reachable
    if not client.ping():
        # SystemExit with a message prints it to stderr and exits with status 1
        raise SystemExit("Tor is not running!")
        
    # 3. Create a completion exactly like the OpenAI SDK
    response = client.chat.completions.create(
        model="OLYMPUS-1", # Hermes default proxy identifier
        messages=[
            {"role": "system", "content": "You are a concise engineering assistant."},
            {"role": "user", "content": "Explain binary search in one sentence."}
        ],
        timeout=30  # Tor circuits can be slow, so generous timeouts are recommended
    )
    
    # 4. Access the content identically to OpenAI
    print(response.choices[0].message.content)
```

---

## 🔄 Migrating from OpenAI SDK

If you have an existing codebase using `openai`, migrating to `hermes` means changing the import, the client constructor, and the model ID. **You do not need to rewrite your request logic or response parsing.**

**Before (OpenAI):**
```python
from openai import OpenAI
client = OpenAI(api_key="sk-YOUR-API-KEY")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

**After (Hermes):**
```python
from hermes import Hermes
client = Hermes() # No API keys required!

response = client.chat.completions.create(
    model="OLYMPUS-1", # Use Hermes model IDs
    messages=[{"role": "user", "content": "Hello!"}]
)
```

---

## 📘 Tutorial: Core Features

### 1. Basic Inference
Hermes handles request construction, system schema injection, and network tuning under the hood. Responses are not streamed: only the complete `content` of each reply is returned to your code.

```python
from hermes import Hermes

with Hermes() as client:
    # Any parameters like 'temperature' or 'top_p' are gracefully absorbed 
    # to maintain structural compatibility with the OpenAI SDK
    res = client.chat.completions.create(
        model="OLYMPUS-1",
        messages=[{"role": "user", "content": "Write a python script to reverse a string."}],
        temperature=0.7 
    )
    print(res.choices[0].message.content)
```

### 2. Multi-Turn Conversations
Hermes is stateless, just like the underlying LLM APIs. To hold a conversation, append each assistant reply and each new user message to the `messages` list before the next call.

```python
from hermes import Hermes

messages = [
    {"role": "system", "content": "You are a helpful math tutor."},
    {"role": "user", "content": "My favorite number is 42."}
]

with Hermes() as client:
    # Turn 1
    r1 = client.chat.completions.create(model="OLYMPUS-1", messages=messages)
    reply = r1.choices[0].message.content
    print("Hermes:", reply)
    
    # Append the assistant's reply to memory
    messages.append({"role": "assistant", "content": reply})
    
    # Turn 2
    messages.append({"role": "user", "content": "What number did I say earlier?"})
    r2 = client.chat.completions.create(model="OLYMPUS-1", messages=messages)
    print("Hermes:", r2.choices[0].message.content)
```
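
The append-and-resend pattern above can be wrapped in a small helper. This is only a sketch: `ChatSession` is a hypothetical class, not part of Hermes, and it assumes nothing about the client beyond the `chat.completions.create(...)` call and response shape shown in this README.

```python
class ChatSession:
    """Keeps the running message list for a multi-turn conversation."""

    def __init__(self, client, model, system_prompt=None):
        self.client = client
        self.model = model
        self.messages = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def ask(self, text, **kwargs):
        """Send a user message and record both sides of the exchange."""
        # Build the candidate history first so a failed request
        # leaves self.messages unchanged.
        pending = self.messages + [{"role": "user", "content": text}]
        response = self.client.chat.completions.create(
            model=self.model, messages=pending, **kwargs
        )
        reply = response.choices[0].message.content
        self.messages = pending + [{"role": "assistant", "content": reply}]
        return reply
```

Inside a `with Hermes() as client:` block, `session = ChatSession(client, "OLYMPUS-1", system_prompt="You are a helpful math tutor.")` followed by repeated `session.ask(...)` calls reproduces the two-turn exchange above without managing the list by hand.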

### 3. Understanding the Response
Hermes returns structured, immutable (frozen) dataclasses that mirror the OpenAI response schema.

```python
from hermes import Hermes

with Hermes() as client:
    response = client.chat.completions.create(
        model="OLYMPUS-1", messages=[{"role": "user", "content": "Hi"}]
    )

print(response.id)                          # e.g. "chatcmpl-qwerty123456"
print(response.model)                       # e.g. "OLYMPUS-1"
print(response.choices[0].finish_reason)    # e.g. "stop"
print(response.usage.total_tokens)          # Token usage for the request

# Serialize the entire response to a dict for logging or APIs
response_dict = response.to_dict()
```
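
To illustrate what "frozen" means in practice, here is a minimal standard-library sketch. The `Usage` class below is a hypothetical stand-in, not Hermes's actual class; it only demonstrates how frozen dataclasses behave in general.

```python
from dataclasses import FrozenInstanceError, asdict, dataclass, replace


@dataclass(frozen=True)
class Usage:
    """Hypothetical stand-in for a token-usage record."""
    prompt_tokens: int
    completion_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


usage = Usage(prompt_tokens=12, completion_tokens=30)

try:
    usage.prompt_tokens = 0  # Assignment on a frozen dataclass raises.
except FrozenInstanceError:
    print("fields cannot be reassigned")

# replace() builds a modified copy instead of mutating in place.
updated = replace(usage, completion_tokens=8)

# asdict() converts the dataclass fields to a plain dictionary.
as_plain_dict = asdict(usage)
```

The practical consequence is that code receiving a response can read and copy it, but cannot accidentally modify it.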

---

## 📚 Advanced Feature: Local RAG Pipeline

Hermes includes a built-in Retrieval-Augmented Generation (RAG) pipeline for querying your local documents.

**Security First:** The RAG pipeline generates vector embeddings **100% locally** using `sentence-transformers`. Your document text is *never* sent over the network during ingestion. 

### Supported Formats
PDF, DOCX, TXT, MD, JSON, JSONL, CSV, and Images (PNG/JPG/GIF via OCR).

### How to use RAG

1. Create a `Data` directory in your project root and drop some documents in it.
2. Ensure you installed the RAG dependencies (`pip install "hermes-px[rag]"`).

```python
import time
from hermes.rag import pipeline

# 1. Start the background watchdog. It will monitor ./Data 
# and automatically chunk, embed, and store documents in ChromaDB
pipeline.start()

# Give it a moment to ingest your files
time.sleep(5) 

# 2. Query your documents
print(f"Total chunks stored: {pipeline.document_count}")

results = pipeline.query(
    query_texts=["How do I configure the server?"], 
    n_results=3
)

# 3. View the retrieved context
for idx, doc in enumerate(results["documents"]):
    source = results["metadatas"][idx]["source_id"]
    print(f"\n--- From: {source} ---")
    print(doc)

# 4. Stop the background watcher when done
pipeline.stop()
```
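
A fixed `time.sleep(5)` may be too short for large files or longer than necessary for small ones. One alternative is to poll until documents appear. The helper below is a generic sketch; `wait_until` is a hypothetical name, and using it with the pipeline assumes that `pipeline.document_count` grows as files are ingested, which this README does not state explicitly.

```python
import time


def wait_until(predicate, timeout=30.0, interval=0.5):
    """Poll predicate() until it returns True or the timeout elapses.

    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Under that assumption, `wait_until(lambda: pipeline.document_count > 0)` could replace the fixed sleep after `pipeline.start()`.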

---

## 🛡️ Error Handling & Stability

Because Tor routes traffic through random global relays, network latency and dropouts are a reality. Hermes provides a granular exception hierarchy so you can build resilient retry logic.

```python
from hermes import (
    Hermes, 
    HermesConnectionError, 
    HermesTimeoutError, 
    HermesResponseError
)

with Hermes() as client:
    try:
        res = client.chat.completions.create(
            model="OLYMPUS-1",
            messages=[{"role": "user", "content": "Hi"}],
            timeout=15 # Aggressive timeout
        )
    except HermesConnectionError:
        print("CRITICAL: Tor proxy is down or local network offline.")
        # Action: Alert admin, restart Tor
    except HermesTimeoutError:
        print("WARNING: Tor circuit was too slow.")
        # Action: Retry the request (a new Tor circuit will likely be used)
    except HermesResponseError as e:
        print(f"WARNING: Upstream returned malformed data: {e}")
```
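
The "retry the request" action noted above can be sketched as a small helper with exponential backoff. `retry_with_backoff` is a hypothetical function, not part of Hermes; the attempt count and delays are illustrative values only.

```python
import random
import time


def retry_with_backoff(call, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run call(); on an exception listed in retry_on, wait and try again.

    The delay doubles after each failure, with random jitter added.
    The final failure is re-raised to the caller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))


# Possible usage with the exceptions shown above:
# retry_with_backoff(
#     lambda: client.chat.completions.create(model="OLYMPUS-1", messages=msgs, timeout=15),
#     retry_on=(HermesTimeoutError,),
# )
```

Retrying only on `HermesTimeoutError` keeps a down proxy (`HermesConnectionError`) failing fast, since repeating the call is unlikely to help until Tor is restarted.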

---

## ⚙️ Configuration & Environment

### Client Initialization
You can override the default Tor proxy location if your daemon runs elsewhere.
```python
from hermes import Hermes

client = Hermes(tor_proxy="socks5h://10.0.0.1:9150")
```

### Environment Variables
Hermes reads sensible defaults from your environment or `.env` file.

| Variable | Default | Description |
|----------|---------|-------------|
| `HERMES_TELEMETRY` | `1` | Set to `0` to completely disable fire-and-forget Supabase tracking. |
| `RAG_DATA_DIR` | `./Data` | Directory watched for RAG documents. Evaluated securely against path traversals. |
| `RAG_DB_DIR` | `./chroma_db` | Persistent vector store location. Locked to `0o700` permissions. |
| `HERMES_LOG_CONSOLE` | `0` | Set to `1` to pipe detailed debug logs to standard output. |
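
To see which values are in effect for a given process, the table can be mirrored in a few lines of Python. This is a sketch only: `read_hermes_env` is a hypothetical helper, and how Hermes itself parses these variables is not documented here, so the boolean handling below is an assumption based on the `0`/`1` values in the table.

```python
import os


def read_hermes_env(env=None):
    """Return the four documented settings, falling back to the table's defaults."""
    env = os.environ if env is None else env
    return {
        # Assumed: only the literal "0" disables telemetry.
        "telemetry": env.get("HERMES_TELEMETRY", "1") != "0",
        "rag_data_dir": env.get("RAG_DATA_DIR", "./Data"),
        "rag_db_dir": env.get("RAG_DB_DIR", "./chroma_db"),
        # Assumed: only the literal "1" enables console logging.
        "log_console": env.get("HERMES_LOG_CONSOLE", "0") == "1",
    }
```

Printing `read_hermes_env()` before creating a client is a quick way to confirm that a `HERMES_TELEMETRY=0` export actually reached the process environment.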

---

## 🎓 Interactive Learning CLI

To quickly get a feel for the API, Hermes ships with a beautiful interactive CLI tutorial.

After installing Hermes, run the learning CLI directly from the terminal. 

```bash
# If you cloned the repository:
python demo/hermes_learn.py

# If you installed via pip, this downloads the script from GitHub and executes it locally:
python -c "import urllib.request; exec(urllib.request.urlopen('https://raw.githubusercontent.com/EGenLabs/hermes/main/demo/hermes_learn.py').read())"
```

The CLI features:
- Interactive guided lessons on API components.
- A **Free Chat Sandbox** to test routing and response times live.
- Real-time proxy status checking.

---

### *Disclaimer & License*
*Hermes is released under the MIT License. EGen Labs assumes no liability for the content transmitted through the proxy network. You are responsible for ensuring that your usage complies with local laws and regulations.*
