Metadata-Version: 2.1
Name: quarterbit
Version: 20.4.0
Summary: Memory-efficient LLM training. AXIOM enables 70B on single H100, 7B on consumer GPUs.
Home-page: https://quarterbit.dev
Author: Clouthier Simulation Labs
Author-email: Clouthier Simulation Labs <info@quarterbit.dev>
Project-URL: Homepage, https://quarterbit.dev
Project-URL: Documentation, https://quarterbit.dev/docs
Keywords: optimizer,adam,deep-learning,pytorch,gpu,memory-efficient,compression,axiom
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown

# QuarterBit AXIOM

**Train large models on single GPUs. 15-17x memory compression.**

AXIOM enables memory-efficient training for any PyTorch model. Train a 70B model on a single H100, or a 13B model on a free Kaggle T4.

## Installation

```bash
pip install quarterbit
```

**Requirements:** Python 3.11+, PyTorch 2.0+, NVIDIA GPU

## Quick Start

```python
from quarterbit import axiom, TrainingStats
from transformers import AutoModelForCausalLM
import torch

# Load any model
model = AutoModelForCausalLM.from_pretrained("model-name", torch_dtype=torch.float16)

# Enable memory-efficient training
model = axiom(model)
model = model.cuda()

# Standard optimizer
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Training loop with stats (from_pretrained returns the model in eval mode,
# so switch to train mode before fine-tuning)
model.train()
stats = TrainingStats(log_interval=100)
for step, batch in enumerate(train_loader):
    loss = model(**batch).loss

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    stats.log(step, loss.item())

stats.summary()

# Save/resume with standard PyTorch
torch.save(model.state_dict(), "checkpoint.pt")
```

Output:
```
VLA TRAINABLE: 99 layers converted
  Compression: 6.4x vs FP32

Step   100 | Loss: 3.2451 | PPL: 25.67 | Peak: 5.2GB
Step   200 | Loss: 2.8934 | PPL: 18.05 | Peak: 5.2GB
...

==================================================
Training Complete
  Train Loss: 3.85 -> 2.12
  Steps:      1000
==================================================
```
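
Resuming from that checkpoint is standard PyTorch as well. A minimal sketch, assuming `axiom()` is re-applied before loading so the converted layers line up (the model name and checkpoint file reuse the Quick Start example):

```python
import torch
from quarterbit import axiom
from transformers import AutoModelForCausalLM

# Rebuild the model and re-enable AXIOM the same way as during training,
# then restore the saved weights through the usual state_dict path.
model = AutoModelForCausalLM.from_pretrained("model-name", torch_dtype=torch.float16)
model = axiom(model)
model.load_state_dict(torch.load("checkpoint.pt"))
model = model.cuda()
```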

## Memory Savings

| Model | Standard (FP16+AdamW) | AXIOM | Compression |
|-------|----------------------|-------|-------------|
| 7B | 84 GB | 5.5 GB | 15x |
| 13B | 156 GB | 9 GB | 17x |
| 70B | 840 GB | 53 GB | 16x |
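
The Standard figures work out to roughly 12 bytes per parameter (2 for FP16 weights, 2 for FP16 gradients, 8 for FP32 Adam moments), e.g. 7B × 12 B ≈ 84 GB. To compare the AXIOM side against your own run, PyTorch's built-in memory counters are enough; a minimal sketch (plain PyTorch, not part of the quarterbit API):

```python
import torch

# Reset the counter before the training loop, then read the peak
# allocation after a few hundred steps and compare with the table above.
torch.cuda.reset_peak_memory_stats()

# ... run training steps here ...

peak_gb = torch.cuda.max_memory_allocated() / 1024**3
print(f"Peak GPU memory: {peak_gb:.1f} GB")
```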

## Supported Models

Works with **any PyTorch model**:

- All HuggingFace Transformers (NLP, Vision, Audio, Multimodal)
- LLaMA, Mistral, Mixtral, Qwen, Yi, Phi, Gemma
- ViT, CLIP, Whisper, LLaVA
- Custom PyTorch models
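
Since `axiom()` wraps the model object itself, non-Transformers modules go through the same call as in the Quick Start. A minimal sketch with a hand-written module (`TinyMLP`, its sizes, and the dummy loss are illustrative, not part of quarterbit):

```python
import torch
import torch.nn as nn
from quarterbit import axiom

class TinyMLP(nn.Module):
    def __init__(self, dim=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )

    def forward(self, x):
        return self.net(x)

# Same pattern as the Quick Start: convert, move to GPU, train normally.
model = axiom(TinyMLP().half()).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

x = torch.randn(8, 1024, dtype=torch.float16, device="cuda")
loss = model(x).float().pow(2).mean()

optimizer.zero_grad()
loss.backward()
optimizer.step()
```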

## CLI

```bash
quarterbit login      # Login via browser (recommended)
quarterbit status     # Show license and usage
quarterbit activate   # Activate with license key
```

## License

**A free account is required.** Sign up at [quarterbit.dev](https://quarterbit.dev)

| Tier | Price | GPU Hours |
|------|-------|-----------|
| Free | $0 | 5/month |
| Academic | $0 | 10/month |
| Pro | $49/mo | Unlimited |
| Team | $299/mo | Unlimited |

## Links

- **Website**: [quarterbit.dev](https://quarterbit.dev)
- **Documentation**: [quarterbit.dev/docs](https://quarterbit.dev/docs)

---

**Clouthier Simulation Labs** | Copyright 2026
