Metadata-Version: 2.4
Name: tinillm
Version: 1.7.0
Summary: Hardware LLM capability scanner — know what runs on your machine
Requires-Python: >=3.11
Requires-Dist: click>=8.1
Requires-Dist: ddgs>=1.0
Requires-Dist: ollama>=0.4
Requires-Dist: psutil>=6.0
Requires-Dist: rich>=13.0
Provides-Extra: dev
Requires-Dist: pytest>=9.0; extra == 'dev'
Provides-Extra: rag
Requires-Dist: beautifulsoup4>=4.12; extra == 'rag'
Requires-Dist: bm25s>=0.1.10; extra == 'rag'
Requires-Dist: flagembedding>=1.2; extra == 'rag'
Requires-Dist: httpx>=0.27; extra == 'rag'
Requires-Dist: lxml>=5.2; extra == 'rag'
Requires-Dist: marker-pdf>=0.3; extra == 'rag'
Requires-Dist: openpyxl>=3.1; extra == 'rag'
Requires-Dist: pymupdf4llm>=0.0.17; extra == 'rag'
Requires-Dist: pymupdf>=1.24; extra == 'rag'
Requires-Dist: pytesseract>=0.3.10; extra == 'rag'
Requires-Dist: python-docx>=1.1; extra == 'rag'
Requires-Dist: python-pptx>=0.6.23; extra == 'rag'
Requires-Dist: qdrant-client>=1.9; extra == 'rag'
Requires-Dist: rank-bm25>=0.2.2; extra == 'rag'
Requires-Dist: tree-sitter-c>=0.23; extra == 'rag'
Requires-Dist: tree-sitter-cpp>=0.23; extra == 'rag'
Requires-Dist: tree-sitter-javascript>=0.23; extra == 'rag'
Requires-Dist: tree-sitter-python>=0.23; extra == 'rag'
Requires-Dist: tree-sitter-rust>=0.23; extra == 'rag'
Requires-Dist: tree-sitter-typescript>=0.23; extra == 'rag'
Requires-Dist: tree-sitter>=0.23; extra == 'rag'
Description-Content-Type: text/markdown

# tinillm

Know what LLMs your hardware can run — locally, instantly.

```bash
pip install tinillm
tinillm scan
```

---

## What it does

`tinillm scan` inspects your CPU, RAM, and GPU, then tells you which model
sizes can run on your machine, at what quality level, and how fast.

```
╭──────────────────────────────────────────────────────╮
│   tinillm scan — Hardware Report                     │
├──────────────────────────────────────────────────────┤
│  CPU    Intel Core i9-13900K   24c / 32t   5.8 GHz   │
│  RAM    32.0 GB total  ·  24.2 GB free               │
│  GPU    NVIDIA GeForce RTX 4090   24.0 GB   CUDA     │
│  OS     Linux                                        │
╰──────────────────────────────────────────────────────╯

  LLM Capability Matrix

  Model    Fit        Best Quant   Mem Needed   Tokens/sec
  ~1B      Perfect    Q8_0          1.9 GB        580 t/s
  ~3B      Perfect    Q8_0          3.8 GB        195 t/s
  ~7B      Perfect    Q6_K          6.2 GB         88 t/s
  ~13B     Perfect    Q5_K_M       10.1 GB         47 t/s
  ~34B     Good       Q4_K_M       21.8 GB         18 t/s

  1 model size(s) too large — use --verbose to show
  Perfect  ·  Good  ·  Marginal  ·  TooTight
```

**Works on Linux, macOS, and Windows.** No GPU required — CPU-only machines
are supported too.

---

## Install

```bash
pip install tinillm
```

Requires Python 3.11+. No other tools needed.

---

## Usage

```bash
tinillm scan                # hardware + LLM capability report (default)
tinillm scan --verbose      # show all model sizes including ones that don't fit
tinillm scan --json         # machine-readable JSON (for scripts / CI)
tinillm scan --no-color     # plain text (safe to pipe to grep / awk / log files)
```

### Scripting example

```bash
# Find all models that run perfectly on this machine
tinillm scan --json | python3 -c "
import json, sys
data = json.load(sys.stdin)
for fit in data['fits']:
    if fit['fit_level'] == 'Perfect':
        print(fit['model'], fit['best_quant'], fit['tokens_per_sec'], 't/s')
"
```
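
The same filter can live in a Python script, which is easier to extend than a shell one-liner. The sketch below assumes the JSON layout used in the example above (a top-level `fits` list whose entries carry `model`, `fit_level`, `best_quant`, and `tokens_per_sec`). The helper names `models_at_level` and `load_report` are illustrative and are not part of tinillm.

```python
import json
import subprocess


def models_at_level(report: dict, level: str = "Perfect") -> list[dict]:
    """Return the entries of report['fits'] whose fit_level equals `level`."""
    return [fit for fit in report.get("fits", []) if fit.get("fit_level") == level]


def load_report() -> dict:
    """Run `tinillm scan --json` and parse its stdout (needs tinillm on PATH)."""
    result = subprocess.run(
        ["tinillm", "scan", "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise if the scan exits with a non-zero status
    )
    return json.loads(result.stdout)
```

Calling `models_at_level(load_report())` returns the entries rated Perfect; pass `"Good"` or `"Marginal"` as the second argument to select another level.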

---

## GPU support

| Vendor | Detection method |
|--------|-----------------|
| NVIDIA | `nvidia-smi` → sysfs fallback |
| AMD | `rocm-smi` → sysfs fallback |
| Apple Silicon | `system_profiler` (unified memory) |
| Intel Arc | sysfs + `lspci` |
| Windows (all) | PowerShell WMI |
| Any | `vulkaninfo` last-resort fallback |
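
The table describes a fallback chain: try the vendor tool first, then fall back to the next method when it is missing or fails. A minimal sketch of that pattern for the NVIDIA row is shown below. It is not tinillm's actual implementation; the function names and the returned dictionary shape are assumptions made for illustration. Only the `nvidia-smi` query flags are taken from the real tool.

```python
import shutil
import subprocess


def parse_nvidia_smi(csv_text: str) -> list[dict]:
    """Parse `name, memory.total` CSV rows (MiB, no header, no units)."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, _, mem_mib = line.rpartition(",")
        try:
            vram_gb = round(int(mem_mib) / 1024, 1)
        except ValueError:
            continue  # skip rows without a numeric memory value
        if name.strip():
            gpus.append({"name": name.strip(), "vram_gb": vram_gb})
    return gpus


def detect_nvidia() -> list[dict]:
    """Query nvidia-smi if present; return [] so the caller can fall back."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True, timeout=10,
        ).stdout
    except (subprocess.SubprocessError, OSError):
        return []
    return parse_nvidia_smi(out)


def detect_gpus(detectors) -> list[dict]:
    """Try each detector in order and return the first non-empty result."""
    for detect in detectors:
        gpus = detect()
        if gpus:
            return gpus
    return []
```

A caller would pass the detectors in priority order, for example `detect_gpus([detect_nvidia, detect_from_sysfs])`, where `detect_from_sysfs` is a hypothetical second-stage detector.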

---

## Fit levels explained

| Level | Meaning |
|-------|---------|
| **Perfect** | Fits comfortably at Q4_K_M or better with ≥20% VRAM headroom |
| **Good** | Fits but tightly |
| **Marginal** | Fits only at heavy compression / reduced context, or CPU-only |
| **TooTight** | Won't fit under any quantisation |

---

## Versioning

`tinillm` grows one feature at a time:

| Version | Feature |
|---------|---------|
| **1.1** | `scan` — hardware LLM capability scanner ← *current* |
| 1.2 | *(next feature)* |

---

## Part of the tini* family

| Tool | What it does |
|------|-------------|
| [tiniRAG](https://github.com/TiniLLM/tiniRAG) | Privacy-first RAG CLI |
| **tinillm** | Hardware LLM capability scanner |
