Metadata-Version: 2.1
Name: quarterbit
Version: 30.1.9
Summary: Memory-efficient optimizer for full-parameter LLM training.
Home-page: https://quarterbit.dev
Author: Clouthier Simulation Labs
Author-email: Clouthier Simulation Labs <info@quarterbit.dev>
License: Free during beta. See quarterbit.dev for terms.
Project-URL: Homepage, https://quarterbit.dev
Project-URL: Documentation, https://quarterbit.dev/docs
Keywords: optimizer,adam,deep-learning,pytorch,gpu,memory-efficient,compression,axiom
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown

# QuarterBit AXIOM

**Memory-efficient optimizer for full-parameter LLM training.**

[![PyPI](https://img.shields.io/pypi/v/quarterbit)](https://pypi.org/project/quarterbit/)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![Windows](https://img.shields.io/badge/Windows-0078D6?logo=windows&logoColor=white)](https://pypi.org/project/quarterbit/)
[![Linux](https://img.shields.io/badge/Linux-FCC624?logo=linux&logoColor=black)](https://pypi.org/project/quarterbit/)

## What is AXIOM?

AXIOM is a drop-in optimizer that compresses gradients and optimizer state during training, cutting their memory footprint by roughly 140x. Unlike LoRA or adapters, it updates every parameter, so you get full-parameter training at a fraction of the usual memory cost.

**Memory comparison:**

| Component | AdamW | AXIOM | Savings |
|-----------|-------|-------|---------|
| Gradients | 4 bytes/param | 0.004 bytes/param | 960x |
| Optimizer State | 8 bytes/param | 0.08 bytes/param | 100x |
| **Total** | **12 bytes/param** | **<0.1 bytes/param** | **140x** |
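
To put these figures in context, the sketch below does the arithmetic for GPT-J 6B (about 6 billion parameters), the model in the Kaggle demo further down. It uses only the bytes/param numbers from the table above; FP16 weights (2 bytes/param, about 11.3 GB for GPT-J) and activations come on top in both cases.

```python
# Back-of-the-envelope memory math for GPT-J 6B, using the bytes/param
# figures from the table above. Gradients + optimizer state only: FP16
# weights (2 bytes/param) and activations are extra in both cases.
params = 6.05e9  # approximate GPT-J 6B parameter count

adamw_gb = params * 12 / 1024**3     # 4 B/param gradients + 8 B/param state
axiom_gb = params * 0.084 / 1024**3  # 0.004 B/param gradients + 0.08 B/param state

print(f"AdamW gradients + state: {adamw_gb:.1f} GB")  # ~67.6 GB
print(f"AXIOM gradients + state: {axiom_gb:.2f} GB")  # ~0.47 GB
```

Adding the roughly 11.3 GB of FP16 weights to AXIOM's ~0.5 GB of training state is what makes full-parameter training of a 6B model fit on a 16GB T4.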

**Comparison with other methods:**

| Method | Memory | Full Training | Convergence |
|--------|--------|---------------|-------------|
| AdamW | 12 bytes/param | Yes | Baseline |
| 8-bit Adam | 6 bytes/param | Yes | Matches AdamW |
| GaLore | 4 bytes/param | Yes | Slightly slower |
| LoRA/QLoRA | N/A | No (subset) | Varies |
| **AXIOM** | **<0.1 bytes/param** | **Yes** | **Matches AdamW** |

## Installation

```bash
pip install quarterbit
```

Requires Python 3.11+, PyTorch 2.0+, and an NVIDIA CUDA GPU (Windows or Linux).
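
A quick sanity check before training (this sketch uses only standard Python and PyTorch APIs, nothing from quarterbit's internals):

```python
# Sanity-check the installation and GPU before training.
import importlib.metadata
import torch

print("quarterbit:", importlib.metadata.version("quarterbit"))
print("torch:", torch.__version__)

# AXIOM needs a CUDA GPU; fail early if PyTorch cannot see one.
assert torch.cuda.is_available(), "No CUDA GPU visible to PyTorch"
print("GPU:", torch.cuda.get_device_name(0))
```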

## Quick Start

```python
from quarterbit import AXIOM_Trainer
from transformers import AutoModelForCausalLM
import torch

# Load model
model = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype=torch.float16)

# Train with AXIOM. train_loader / val_loader are ordinary PyTorch
# DataLoaders of tokenized batches; a sketch of building them follows.
trainer = AXIOM_Trainer(model, train_loader, val_loader)
results = trainer.fit(steps=1000)

# Results include: initial_loss, final_loss, improvement_pct, peak_vram_gb
print(results)
```
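
The loaders above are placeholders. Here is one hypothetical way to build them from WikiText with Hugging Face `datasets`; the batch format `AXIOM_Trainer` actually expects is defined by the library, so treat this as a sketch and see `examples/train_gpt2.py` for the canonical pipeline.

```python
# Hypothetical data pipeline for the Quick Start above. Assumes the trainer
# accepts batches of token-ID tensors from a Hugging Face tokenizer; see
# examples/train_gpt2.py for the format the library actually uses.
from datasets import load_dataset
from torch.utils.data import DataLoader
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=512)

ds = load_dataset("wikitext", "wikitext-2-raw-v1")
ds = ds.filter(lambda row: row["text"].strip() != "")  # drop blank lines
ds = ds.map(tokenize, batched=True, remove_columns=["text"])
ds.set_format("torch", columns=["input_ids", "attention_mask"])

train_loader = DataLoader(ds["train"], batch_size=8, shuffle=True)
val_loader = DataLoader(ds["validation"], batch_size=8)
```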

## Try It

Run on free Kaggle T4:

[![Kaggle](https://kaggle.com/static/images/open-in-kaggle.svg)](https://www.kaggle.com/code/kyleclouthier/axiom-train-gpt-j-6b-on-free-kaggle-t4)

Full-parameter training of GPT-J 6B on a 16GB GPU.

## How It Works

AXIOM compresses gradients and optimizer state on the fly during training. Key properties:

- **Full-parameter training** - every parameter is updated (not LoRA/adapters)
- **Drop-in replacement** - works with existing training code
- **Lossless convergence** - matches FP16 + AdamW accuracy
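
The memory savings are easy to verify independently: wrap a short run in PyTorch's own memory counters and compare against the trainer's report. A minimal sketch, assuming the `trainer` object from the Quick Start and that `results` behaves like a dict:

```python
# Cross-check AXIOM's reported peak VRAM against PyTorch's counters.
# Assumes `trainer` from the Quick Start; `results` is treated as a dict.
import torch

torch.cuda.reset_peak_memory_stats()
results = trainer.fit(steps=100)

peak_gb = torch.cuda.max_memory_allocated() / 1024**3
print(f"torch peak allocated: {peak_gb:.2f} GB")
print(f"trainer reported:     {results['peak_vram_gb']} GB")
```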

## Examples

See [examples/](examples/):
- `train_gpt2.py` - GPT-2 on WikiText
- `train_qwen.py` - Qwen2.5 models
- `kaggle_gptj_6b.ipynb` - GPT-J 6B on Kaggle T4

## Support

Free during beta. If you find AXIOM useful, consider supporting development:

[![Ko-fi](https://img.shields.io/badge/Ko--fi-Support-ff5e5b?logo=ko-fi)](https://ko-fi.com/kyleclouthier)

## Links

- [quarterbit.dev](https://quarterbit.dev)
- [PyPI](https://pypi.org/project/quarterbit/)
- [Kaggle Demo](https://www.kaggle.com/code/kyleclouthier/axiom-train-gpt-j-6b-on-free-kaggle-t4)

## License

Free during beta. See [quarterbit.dev](https://quarterbit.dev) for terms.

---

Built by Clouthier Simulation Labs, Ontario, Canada.
