Metadata-Version: 2.4
Name: tinillm
Version: 1.8.0
Summary: Hardware LLM capability scanner — know what runs on your machine
Requires-Python: >=3.11
Requires-Dist: click>=8.1
Requires-Dist: ddgs>=1.0
Requires-Dist: ollama>=0.4
Requires-Dist: prompt-toolkit>=3.0
Requires-Dist: psutil>=6.0
Requires-Dist: rich>=13.0
Provides-Extra: dev
Requires-Dist: pytest>=9.0; extra == 'dev'
Provides-Extra: rag
Requires-Dist: beautifulsoup4>=4.12; extra == 'rag'
Requires-Dist: bm25s>=0.1.10; extra == 'rag'
Requires-Dist: flagembedding>=1.2; extra == 'rag'
Requires-Dist: httpx>=0.27; extra == 'rag'
Requires-Dist: lxml>=5.2; extra == 'rag'
Requires-Dist: marker-pdf>=0.3; extra == 'rag'
Requires-Dist: openpyxl>=3.1; extra == 'rag'
Requires-Dist: pymupdf4llm>=0.0.17; extra == 'rag'
Requires-Dist: pymupdf>=1.24; extra == 'rag'
Requires-Dist: pytesseract>=0.3.10; extra == 'rag'
Requires-Dist: python-docx>=1.1; extra == 'rag'
Requires-Dist: python-pptx>=0.6.23; extra == 'rag'
Requires-Dist: qdrant-client>=1.9; extra == 'rag'
Requires-Dist: rank-bm25>=0.2.2; extra == 'rag'
Requires-Dist: tree-sitter-c>=0.23; extra == 'rag'
Requires-Dist: tree-sitter-cpp>=0.23; extra == 'rag'
Requires-Dist: tree-sitter-javascript>=0.23; extra == 'rag'
Requires-Dist: tree-sitter-python>=0.23; extra == 'rag'
Requires-Dist: tree-sitter-rust>=0.23; extra == 'rag'
Requires-Dist: tree-sitter-typescript>=0.23; extra == 'rag'
Requires-Dist: tree-sitter>=0.23; extra == 'rag'
Description-Content-Type: text/markdown

# tinillm

Know what LLMs your hardware can run — locally, instantly.

```bash
pipx install tinillm
tinillm
```

---

## What it does

`tinillm` is an interactive tool. Type `tinillm` in your terminal to open
the welcome screen. From there, every feature is a slash command: `/scan`
inspects your hardware, `/models` browses real LLMs, `/run` launches one in
Ollama, and so on.

```
╭─── tinillm v1.8.0 ───────────────────────────────────────────────────╮
│  ████████╗                                                           │
│     ██╔══╝    Welcome back, Harish!      Tips for getting started    │
│     ██║                                  ────────────────────────    │
│     ██║       v1.8.0 · tinillm             Run /scan    to detect    │
│     ╚═╝       ~/tinillm_CLI                Run /models  to browse    │
│               ollama ● running             Run /run     to launch    │
│                                            Run /doctor  health chk   │
│                                            Run /help    list all     │
│                                                                      │
│                                          Recent activity             │
│                                          ────────────────────────    │
│                                            /scan                     │
│                                            /run llama3.2:3b          │
╰──────────────────────────────────────────────────────────────────────╯
  Type /help for commands · /exit or Ctrl+D to quit

tinillm> /scan

  LLM Capability Matrix

  Model    Fit        Best Quant   Mem Needed   Tokens/sec
  ~1B      Perfect    Q8_0          1.9 GB        580 t/s
  ~3B      Perfect    Q8_0          3.8 GB        195 t/s
  ~7B      Perfect    Q6_K          6.2 GB         88 t/s
  ~13B     Perfect    Q5_K_M       10.1 GB         47 t/s
  ~34B     Good       Q4_K_M       21.8 GB         18 t/s

tinillm>
```

**Works on Linux, macOS, and Windows.** No GPU required — CPU-only machines
are supported too.

---

## Install

```bash
pipx install tinillm     # recommended: isolated per-tool environment
# or
pip install tinillm
```

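Requires Python 3.11+. The hardware scan needs no other tools. `/run` talks
to a local Ollama install, and the RAG dependencies ship in the optional
`rag` extra (`pip install "tinillm[rag]"`).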

---

## Usage

Launch the tool with a single command:

```bash
tinillm
```

Inside the REPL, every feature is a slash command:

| Command | What it does |
|---------|-------------|
| `/scan`    | Scan hardware and show which LLM sizes fit |
| `/scan --verbose` | Include model sizes that don't fit |
| `/scan --json` | Machine-readable JSON output |
| `/models`  | Browse real LLM models and see which fit |
| `/models --fits-only` | Hide models that don't fit |
| `/models --ollama` | Show which models are installed in local Ollama |
| `/run`     | Pick a compatible model interactively and run it |
| `/run llama3.2:3b` | Launch a specific model directly |
| `/suggest --use-case coding` | Personalised model recommendation |
| `/index ./docs` | Index local documents for RAG |
| `/ask "..."` | Ask a question against your indexed corpus |
| `/doctor`  | System health check (hardware + Ollama + RAG) |
| `/rag info` | Show RAG index statistics |
| `/help`    | List every command |
| `/clear`   | Clear the terminal |
| `/exit`    | Quit (or Ctrl+D) |

Tab-completion works on slash commands, subcommands, and flags.

---

## First launch

The first time you run `tinillm`, it runs `/scan` automatically so you see
your hardware capabilities immediately. On later launches, only the welcome
panel appears.

---

## GPU support

| Vendor | Detection method |
|--------|-----------------|
| NVIDIA | `nvidia-smi` → sysfs fallback |
| AMD | `rocm-smi` → sysfs fallback |
| Apple Silicon | `system_profiler` (unified memory) |
| Intel Arc | sysfs + `lspci` |
| Windows (all) | PowerShell WMI |
| Any | `vulkaninfo` last-resort fallback |
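
The "primary tool → fallback" pattern in the table can be sketched roughly as follows. This is an illustration only, not tinillm's actual code: the function names (`parse_nvidia_smi`, `detect_nvidia`, `detect_gpus`) are hypothetical, and only the NVIDIA step is shown. The `nvidia-smi` query flags are the tool's standard CSV options.

```python
import shutil
import subprocess


def parse_nvidia_smi(csv_text: str) -> list[dict]:
    """Parse `name, memory.total` CSV rows (MiB, no header, no units)."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split from the right so a comma inside the GPU name cannot break parsing.
        name, mem_mib = line.rsplit(",", 1)
        gpus.append({"name": name.strip(), "vram_gb": round(int(mem_mib) / 1024, 1)})
    return gpus


def detect_nvidia() -> list[dict]:
    """Return NVIDIA GPUs, or an empty list when nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, timeout=5, check=True,
        )
        return parse_nvidia_smi(result.stdout)
    except (subprocess.SubprocessError, OSError, ValueError):
        # Non-zero exit, timeout, or unparsable output: let the next detector try.
        return []


def detect_gpus(detectors) -> list[dict]:
    """Try each detector in order and return the first non-empty result."""
    for detect in detectors:
        gpus = detect()
        if gpus:
            return gpus
    return []  # nothing found: treat the machine as CPU-only


if __name__ == "__main__":
    # A fuller chain would append sysfs, rocm-smi, and vulkaninfo detectors here.
    print(detect_gpus([detect_nvidia]))
```

Keeping the parser separate from the subprocess call makes each step testable without the vendor tool installed, and an empty result from every detector maps naturally onto the CPU-only case.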

---

## Fit levels explained

| Level | Meaning |
|-------|---------|
| **Perfect** | Fits comfortably at Q4_K_M or better with ≥20% VRAM headroom |
| **Good** | Fits but tightly |
| **Marginal** | Fits only at heavy compression / reduced context, or CPU-only |
| **TooTight** | Won't fit under any quantisation |

---

## Versioning

| Version | Feature |
|---------|---------|
| **1.8** | Interactive REPL (single entry-point, slash commands) ← *current* |
| 1.7 | Added RAG (`/index`, `/ask`, `/rag`) |
| 1.1 | First feature — hardware scanner |

---

## Part of the tini* family

| Tool | What it does |
|------|-------------|
| [tiniRAG](https://github.com/TiniLLM/tiniRAG) | Privacy-first RAG CLI |
| **tinillm** | Interactive LLM + hardware tool |
