Metadata-Version: 2.1
Name: quarterbit
Version: 20.5.2
Summary: Memory-efficient LLM training. AXIOM enables training a 70B model on a single H100, or a 7B model on consumer GPUs.
Home-page: https://quarterbit.dev
Author: Clouthier Simulation Labs
Author-email: Clouthier Simulation Labs <info@quarterbit.dev>
Project-URL: Homepage, https://quarterbit.dev
Project-URL: Documentation, https://quarterbit.dev/docs
Keywords: optimizer,adam,deep-learning,pytorch,gpu,memory-efficient,compression,axiom
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown

# QuarterBit AXIOM

**Train large models on a single GPU. 15-17x state compression, roughly 8x lower end-to-end training memory.**

AXIOM brings memory-efficient training to any PyTorch model: train a 70B model on a single H100, or a 7B model on your RTX 4070.

## Installation

```bash
pip install quarterbit
```

**Requirements:** Python 3.11+, PyTorch 2.0+, NVIDIA GPU

## Quick Start

```python
import torch

from quarterbit import load_model, TrainingStats

# Load any model with memory compression
model = load_model("meta-llama/Llama-2-7b-hf")  # HuggingFace
# OR
model = load_model(my_pytorch_model)             # Custom PyTorch

# Training - no optimizer needed!
stats = TrainingStats(log_interval=100)
for step, batch in enumerate(train_loader):  # train_loader: your DataLoader
    loss = model(**batch).loss
    loss.backward()  # Weights update automatically
    stats.log(step, loss.item())

# Save with standard PyTorch
torch.save(model.state_dict(), "checkpoint.pt")
```

Output:
```
AXIOM STREAMING LOADER
  Mode: Sub-byte compression

VLA LOADING COMPLETE
  GPU memory: 5.2 GB
  Trainable: 100%

Step   100 | Loss: 3.24 | PPL: 25.67 | Peak: 5.2GB
Step   200 | Loss: 2.89 | PPL: 18.05 | Peak: 5.2GB
```
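The `PPL` column in the log above is perplexity, i.e. the exponential of the (natural-log) cross-entropy loss. A quick sanity check that the printed columns agree, allowing for the two-decimal display rounding:

```python
import math

# PPL = exp(cross-entropy loss); reverse the printed PPL to recover
# the loss and compare with the printed (rounded) loss column.
for shown_loss, shown_ppl in [(3.24, 25.67), (2.89, 18.05)]:
    implied_loss = math.log(shown_ppl)
    assert abs(implied_loss - shown_loss) < 0.01, (shown_loss, implied_loss)

print("PPL column matches exp(loss) up to display rounding")
```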

## Memory Comparison

| Model | Standard (FP16+AdamW) | AXIOM | Savings |
|-------|----------------------|-------|---------|
| 7B | 42 GB | **5 GB** | 8.4x |
| 13B | 78 GB | **9 GB** | 8.7x |
| 70B | 420 GB | **53 GB** | 7.9x |
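The baseline column works out to exactly 6 bytes per parameter, and the AXIOM column to under 1 byte per parameter (matching the "< 1 byte/param" claim below). A quick consistency check on the table; the 6 bytes/parameter figure is inferred from the numbers, not stated by the project:

```python
# Reproduce the table arithmetic: "Standard" = 6 bytes/param,
# AXIOM = under 1 byte/param, savings = the ratio of the two columns.
GB = 1e9  # the table uses decimal gigabytes

for params, standard_gb, axiom_gb in [(7e9, 42, 5), (13e9, 78, 9), (70e9, 420, 53)]:
    assert standard_gb * GB / params == 6.0   # baseline: 6 bytes/param
    assert axiom_gb * GB / params < 1.0       # AXIOM: < 1 byte/param
    print(f"{params / 1e9:.0f}B: {standard_gb / axiom_gb:.1f}x savings")
```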

## How It Works

```
load_model("llama-7b")
    │
    ▼
┌─────────────────────────────────┐
│ 1. Streaming Load               │  Never exceeds final size
│ 2. Proprietary Compression      │  < 1 byte/param
│ 3. CUDA Kernels                 │  10-50x faster
│ 4. Auto-update in backward()    │  No optimizer needed
└─────────────────────────────────┘
    │
    ▼
loss.backward()  →  Weights update automatically
```
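The "< 1 byte/param" figure in the box above is consistent with the Quick Start log, which reports a 5.2 GB peak for a 7B model:

```python
# Peak GPU memory from the Quick Start log vs. the "< 1 byte/param" claim.
params = 7e9          # 7B model
peak_bytes = 5.2e9    # 5.2 GB peak reported in the log

bytes_per_param = peak_bytes / params
assert bytes_per_param < 1.0
print(f"{bytes_per_param:.2f} bytes/param")  # roughly 0.74
```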

## Supported Models

Works with **any PyTorch model**:
- LLaMA, Mistral, Mixtral, Qwen, Yi, Phi, Gemma
- ViT, CLIP, Whisper, LLaVA
- Custom PyTorch models

## CLI

```bash
quarterbit login      # Login via browser
quarterbit status     # Show license info
```

## License

**A free account is required.** Sign up at [quarterbit.dev](https://quarterbit.dev).

| Tier | Price | GPU Hours |
|------|-------|-----------|
| Free | $0 | 5/month |
| Pro | $49/mo | Unlimited |

---

**Clouthier Simulation Labs** | 2026
