Metadata-Version: 2.1
Name: quarterbit
Version: 20.3.0
Summary: Memory-efficient LLM training. AXIOM enables 70B on single H100, 7B on consumer GPUs.
Home-page: https://quarterbit.dev
Author: Clouthier Simulation Labs
Author-email: Clouthier Simulation Labs <info@quarterbit.dev>
Project-URL: Homepage, https://quarterbit.dev
Project-URL: Documentation, https://quarterbit.dev/docs
Keywords: optimizer,adam,deep-learning,pytorch,gpu,memory-efficient,compression,axiom
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown

# QuarterBit AXIOM

**Train large models on a single GPU. 15-17x memory compression.**

AXIOM enables memory-efficient training for any PyTorch model: train a 70B-parameter model on a single H100, or a 13B model on a free Kaggle T4.

## Installation

```bash
pip install quarterbit
```

**Requirements:** Python 3.11+, PyTorch 2.0+, NVIDIA GPU

## Quick Start

```python
from quarterbit import axiom, TrainingStats
from transformers import AutoModelForCausalLM
import torch

# Load any model
model = AutoModelForCausalLM.from_pretrained("model-name", torch_dtype=torch.float16)

# Enable memory-efficient training
model = axiom(model)
model = model.cuda()

# Training loop with stats
stats = TrainingStats(log_interval=100)
for step, batch in enumerate(train_loader):
    loss = model(**batch).loss
    loss.backward()
    stats.log(step, loss.item())

    # Validation (optional)
    if step % 500 == 0 and step > 0:
        val_loss = evaluate(model, val_loader)
        stats.log_val(step, val_loss)

stats.summary()

# Save/resume with standard PyTorch
torch.save(model.state_dict(), "checkpoint.pt")
```
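The loop above calls an `evaluate` helper that the snippet does not define. A minimal sketch of what such a helper could look like follows; the name and signature are taken from the Quick Start, but the body is an assumption, not part of the quarterbit API. In real PyTorch code you would also call `model.eval()` and wrap the loop in `torch.no_grad()` to skip gradient bookkeeping.

```python
def evaluate(model, val_loader):
    """Average validation loss over a loader (hypothetical helper)."""
    total_loss = 0.0
    num_batches = 0
    for batch in val_loader:
        # Each batch is a dict of tensors, matching the training loop's
        # model(**batch) call; .loss is the scalar language-model loss.
        output = model(**batch)
        total_loss += float(output.loss)
        num_batches += 1
    return total_loss / max(num_batches, 1)
```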

Output:
```
Step   100 | Loss: 3.2451 | PPL: 25.67 | 1250 tok/s | Peak: 5.2GB
Step   200 | Loss: 2.8934 | PPL: 18.05 | 1312 tok/s | Peak: 5.2GB
...
Step   500 | Loss: 2.5123 | PPL: 12.33 | 1285 tok/s | Peak: 5.2GB
         >>> Val Loss: 2.6841 | Val PPL: 14.64
...
==================================================
Training Complete
  Train Loss: 3.8521 -> 2.1234
  Best Loss:  2.0891
  Val PPL:    18.92 -> 12.45 (-34.2%)
  Best Val:   12.31
  Steps:      2000
  Time:       45.2 min
  Throughput: 1285 tok/s avg
==================================================
```
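The PPL column in the log is just the exponential of the cross-entropy loss (in nats), so the two values on each line are consistency checks of each other:

```python
import math

def perplexity(loss: float) -> float:
    # Perplexity = exp(cross-entropy loss in nats).
    return math.exp(loss)

# Loss 3.2451 from the Step 100 line gives PPL ~25.66, matching the
# logged 25.67 up to rounding of the printed loss.
print(f"{perplexity(3.2451):.2f}")
```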

## Memory Savings

| Model | Standard (FP16+AdamW) | AXIOM | Compression |
|-------|----------------------|-------|-------------|
| 7B | 84 GB | 5.5 GB | 15x |
| 13B | 156 GB | 9 GB | 17x |
| 70B | 840 GB | 53 GB | 16x |
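The "Standard" column follows the usual mixed-precision accounting: 2 bytes of FP16 weights, 2 bytes of FP16 gradients, and two FP32 AdamW moment buffers at 4 bytes each, i.e. 12 bytes per parameter (setups that also keep an FP32 master copy would add 4 more; the table appears to omit it). A quick back-of-envelope check, treating 1 GB as 10^9 bytes to match the table:

```python
def standard_training_bytes(num_params: int) -> int:
    # FP16 weights (2) + FP16 grads (2) + FP32 Adam m and v (4 + 4)
    bytes_per_param = 2 + 2 + 4 + 4
    return num_params * bytes_per_param

for billions in (7, 13, 70):
    gb = standard_training_bytes(billions * 10**9) / 1e9
    print(f"{billions}B -> {gb:.0f} GB")  # 84, 156, 840 GB, as in the table
```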

## Supported Models

Works with **any PyTorch model**:

- All HuggingFace Transformers (NLP, Vision, Audio, Multimodal)
- LLaMA, Mistral, Mixtral, Qwen, Yi, Phi, Gemma
- ViT, CLIP, Whisper, LLaVA
- Custom PyTorch models

## Extensions

```python
from quarterbit import AXIOM_CHECKPOINT, AXIOM_DDP

# Activation checkpointing (additional memory savings)
actcp = AXIOM_CHECKPOINT(max_slots=32, max_n=1024*1024)

# Distributed training (128x bandwidth reduction)
compressor = AXIOM_DDP(n=total_params, top_k_percent=6.25)
```
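The `top_k_percent=6.25` argument suggests classic top-k gradient sparsification: each worker transmits only the largest-magnitude 6.25% of gradient entries, a 16x reduction in values sent (the quoted 128x presumably also counts quantization of the surviving values; that is an inference, not something the source states). A dependency-free sketch of the selection step:

```python
def top_k_sparsify(grad, top_k_percent):
    """Keep only the largest-magnitude top_k_percent of entries.

    Returns (indices, values), the pairs a gradient compressor would
    actually put on the wire. Pure-Python sketch, not quarterbit code.
    """
    k = max(1, int(len(grad) * top_k_percent / 100))
    # Rank indices by |gradient| descending and keep the first k.
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    kept = sorted(order[:k])
    return kept, [grad[i] for i in kept]

indices, values = top_k_sparsify([0.1, -2.0, 0.05, 3.0], top_k_percent=50)
print(indices, values)  # [1, 3] [-2.0, 3.0]
```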

## CLI

```bash
quarterbit login      # Login via browser (recommended)
quarterbit status     # Show license and usage
quarterbit activate   # Activate with license key
```

## License

**Free account required.** Sign up at [quarterbit.dev](https://quarterbit.dev)

| Tier | Price | GPU Hours |
|------|-------|-----------|
| Free | $0 | 5/month |
| Academic | $0 | 10/month |
| Pro | $49/mo | Unlimited |
| Team | $299/mo | Unlimited |

## Links

- **Website**: [quarterbit.dev](https://quarterbit.dev)
- **Documentation**: [quarterbit.dev/docs](https://quarterbit.dev/docs)

---

**Clouthier Simulation Labs** | Copyright 2026
