Metadata-Version: 2.4
Name: Synth-CLI
Version: 1.0.0
Summary: AI Content Authenticator & OCR Engine — detect AI-generated text and AI-generated images with deep learning.
Author-email: Khushaal <khushaal@example.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/khushaal/synth-cli
Project-URL: Repository, https://github.com/khushaal/synth-cli
Project-URL: Issues, https://github.com/khushaal/synth-cli/issues
Project-URL: Changelog, https://github.com/khushaal/synth-cli/blob/main/CHANGELOG.md
Keywords: ai,authenticator,ocr,cli,nlp,content-detection,deepfake,image-forensics,midjourney,stable-diffusion,vision-transformer
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: typer<1,>=0.12
Requires-Dist: rich<16,>=13
Requires-Dist: easyocr<2,>=1.7
Requires-Dist: opencv-python-headless<5,>=4.9
Requires-Dist: pypdfium2<5,>=4.25
Requires-Dist: torch<3,>=2.2
Requires-Dist: transformers<6,>=4.40
Requires-Dist: Pillow>=10.0
Requires-Dist: python-dotenv<2,>=1.0
Requires-Dist: httpx<1,>=0.27
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-cov>=5.0; extra == "dev"
Requires-Dist: ruff>=0.4; extra == "dev"
Requires-Dist: mypy>=1.10; extra == "dev"
Requires-Dist: pre-commit>=3.7; extra == "dev"
Requires-Dist: build>=1.2; extra == "dev"
Requires-Dist: twine>=6.0; extra == "dev"
Dynamic: license-file

<p align="center">
  <pre>
  ███████╗██╗   ██╗███╗   ██╗████████╗██╗  ██╗
  ██╔════╝╚██╗ ██╔╝████╗  ██║╚══██╔══╝██║  ██║
  ███████╗ ╚████╔╝ ██╔██╗ ██║   ██║   ███████║
  ╚════██║  ╚██╔╝  ██║╚██╗██║   ██║   ██╔══██║
  ███████║   ██║   ██║ ╚████║   ██║   ██║  ██║
  ╚══════╝   ╚═╝   ╚═╝  ╚═══╝   ╚═╝   ╚═╝  ╚═╝
  </pre>
</p>

<p align="center">
  <a href="https://pypi.org/project/Synth-CLI/"><img src="https://img.shields.io/pypi/v/Synth-CLI.svg" alt="PyPI version"></a>
  <a href="https://pypi.org/project/Synth-CLI/"><img src="https://img.shields.io/pypi/pyversions/Synth-CLI.svg" alt="Python versions"></a>
  <a href="https://github.com/khushalv21/SYNTH/blob/main/LICENSE"><img src="https://img.shields.io/github/license/khushalv21/SYNTH.svg" alt="License"></a>
</p>

<p align="center">
  <strong>AI Content Authenticator & OCR Engine</strong><br>
  <em>Detect AI-generated text and AI-generated images with a single command.</em>
</p>

<p align="center">
  <a href="#installation">Installation</a> •
  <a href="#quick-start">Quick Start</a> •
  <a href="#cli-reference">CLI Reference</a> •
  <a href="#configuration">Configuration</a> •
  <a href="docs/ARCHITECTURE.md">Architecture</a> •
  <a href="docs/UNIVERSAL_API_GUIDE.md">API Guide</a>
</p>

---

## What is Synth?

**Synth** is a command-line tool that checks where content came from: either the text extracted from images and PDFs, or the images themselves. It detects AI-generated text (GPT, Claude, etc.) and AI-generated images (Midjourney, Stable Diffusion, DALL·E). Everything runs in your terminal, with no web server, browser, or GUI.

### Features

- 🔍 **OCR Pipeline** — OpenCV preprocessing (grayscale → adaptive threshold → denoise) + EasyOCR extraction
- 🤖 **AI Text Detection** — Local HuggingFace models or any remote API (OpenAI, Anthropic, Ollama, custom)
- 🖼️ **AI Image Forensics** — Vision Transformer (ViT) model detects Midjourney, Stable Diffusion, and DALL·E output (auto-detected)
- 📄 **PDF Ingestion** — Feed multi-page PDFs directly into the pipeline — zero-friction, no poppler or system dependencies required
- 🔌 **Strategy Pattern** — Swap detection backends without changing a line of code
- 🍎 **Hardware Agnostic** — Auto-detects CUDA (NVIDIA), MPS (Apple Silicon), or CPU
- 📁 **Batch Processing** — Scan entire directories with progress bars
- 🎨 **Rich TUI** — Colour-coded verdicts, AI-pattern highlighting, styled tables
- 📦 **One-Line Install** — `pip install Synth-CLI` and you're ready to go
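
The hardware auto-detection mentioned in the feature list could be sketched roughly as follows. This is a minimal illustration, not Synth's actual `core/device.py`: the function names `pick_device` and `detect_device` are hypothetical, and the priority order (CUDA, then MPS, then CPU) is assumed from the order in which the feature list names them.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name: CUDA first, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe PyTorch for available backends; fall back to CPU if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older torch builds may lack the MPS backend module, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)
```

Keeping the decision in a pure function (`pick_device`) makes the priority order easy to test without any GPU present.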

---

## Installation

```bash
pip install Synth-CLI
```

That's it. pip installs all Python dependencies (OCR, ML frameworks, PDF support) automatically.

### Verify Installation

```bash
synth --version
# synth v1.0.0

synth
# Shows system info, compute device, available strategies
```

> **Note**: On first run, HuggingFace models are downloaded and cached automatically (~500 MB for text detection, ~350 MB for image forensics).

### Development Install

For contributors who want to work on the source:

```bash
git clone https://github.com/khushaal/synth-cli.git
cd synth-cli
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
```

---

## Quick Start

### Analyse Any File (Auto-Detection)

Synth automatically detects whether to run text analysis or image forensics — no flags needed.

```bash
synth photo.png
```

```bash
synth report.pdf
```

Multi-page PDFs are fully supported — each page is extracted, OCR'd, and analysed individually.

### Batch Scan a Directory

```bash
synth ./documents/
```

Synth auto-classifies each file and routes it to the correct pipeline.

### Use a Remote API

```bash
# Set up your .env file first (see Configuration)
synth document.jpg --engine api --agent gpt-4o
```

### Specify OCR Languages

```bash
synth scan.jpg --lang en,fr,de
```

### Skip Text Preview

```bash
synth photo.png --no-text
```

### System Dashboard

```bash
synth
```

### Command Reference

```bash
synth help
```

---

## CLI Reference

### Commands

| Command | Description |
|---|---|
| `synth` | System dashboard — version, device, strategies |
| `synth <file>` | Auto-detect and analyse a file |
| `synth <folder>/` | Batch-analyse a directory |
| `synth help` | Show the command reference menu |
| `synth -V` | Print version and exit |

### Options

| Option | Default | Description |
|---|---|---|
| `--engine`, `-e` | `local` | Detection strategy: `local` (HuggingFace) or `api` (remote HTTP) |
| `--agent`, `-a` | *(auto)* | Model name (local) or model ID (API) |
| `--show-text` / `--no-text` | `--show-text` | Toggle extracted text panel with AI-pattern highlighting |
| `--lang`, `-l` | `en` | Comma-separated OCR language codes |
| `--verbose` | `false` | Enable debug logging |
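
The `--engine` option selects one of two detection strategies. A strategy-pattern layout of that kind might look like the sketch below. All names here (`Verdict`, `DetectionStrategy`, `LocalStrategy`, `ApiStrategy`, `make_strategy`) are hypothetical, and both strategies return a placeholder score instead of calling a model or an HTTP endpoint.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Verdict:
    ai_probability: float  # 0.0 = likely human, 1.0 = likely AI


class DetectionStrategy(Protocol):
    """Interface every backend is assumed to satisfy."""

    def detect(self, text: str) -> Verdict: ...


class LocalStrategy:
    """Stand-in for a local HuggingFace-backed detector."""

    def detect(self, text: str) -> Verdict:
        return Verdict(ai_probability=0.0)  # placeholder: no model is loaded


class ApiStrategy:
    """Stand-in for a remote HTTP detector."""

    def detect(self, text: str) -> Verdict:
        return Verdict(ai_probability=0.0)  # placeholder: no request is sent


# Maps the value passed to --engine onto a strategy class.
STRATEGIES = {"local": LocalStrategy, "api": ApiStrategy}


def make_strategy(engine: str = "local") -> DetectionStrategy:
    """Instantiate the strategy registered under the given engine name."""
    try:
        return STRATEGIES[engine]()
    except KeyError:
        raise ValueError(f"unknown engine: {engine!r}") from None
```

Callers depend only on `detect()`, so adding a backend means registering one more class rather than touching the calling code.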

**Exit codes:**
- `0` — All files verified as human-created
- `1` — AI-generated content detected (useful in CI/CD pipelines)
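
A CI step can act on these exit codes with a small wrapper. The sketch below assumes `synth` is on `PATH`. The helper names `run_synth` and `describe_exit_code` are hypothetical, and any code other than `0` or `1` is treated as unexpected because the list above documents only those two.

```python
import subprocess


def describe_exit_code(code: int) -> str:
    """Translate a synth exit code into a short human-readable message."""
    if code == 0:
        return "all files verified as human-created"
    if code == 1:
        return "AI-generated content detected"
    return f"unexpected exit code {code}"


def run_synth(path: str) -> int:
    """Run `synth <path>` and return its exit code without raising on failure."""
    completed = subprocess.run(["synth", path], check=False)
    return completed.returncode
```

A pipeline script could then call `run_synth("./documents/")` and fail the build when the result is non-zero.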

**Supported formats:**
- **Images:** `.png`, `.jpg`, `.jpeg`, `.tiff`, `.tif`, `.bmp`, `.webp`
- **Documents:** `.pdf` (multi-page)
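
To check ahead of a batch scan which files in a folder fall under the supported formats, a pre-filter along these lines can help. It is an illustration only: the names `classify` and `supported_files` are hypothetical, and the case-insensitive extension matching and non-recursive listing are assumptions, not documented Synth behaviour.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the supported-formats list above.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tiff", ".tif", ".bmp", ".webp"}
DOCUMENT_EXTENSIONS = {".pdf"}


def classify(path: Path) -> Optional[str]:
    """Return "image", "document", or None for an unsupported extension."""
    suffix = path.suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix in DOCUMENT_EXTENSIONS:
        return "document"
    return None


def supported_files(folder: Path) -> list[tuple[Path, str]]:
    """List supported files directly inside folder, sorted by name."""
    found = []
    for entry in sorted(folder.iterdir()):
        kind = classify(entry) if entry.is_file() else None
        if kind is not None:
            found.append((entry, kind))
    return found
```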

---

## Configuration

### Local Engine (Default)

No configuration required. Synth downloads and caches the HuggingFace model on first use.

```bash
synth image.png --engine local
```

To use a different HuggingFace model:

```bash
synth image.png --engine local --agent "Hello-SimpleAI/chatgpt-detector-roberta"
```

### API Engine

Create a `.env` file in your project root (see `.env.example`):

```env
SYNTH_API_BASE_URL=https://api.openai.com/v1/chat/completions
SYNTH_API_KEY=sk-your-key-here
SYNTH_API_MODEL=gpt-4o
```

Then run:

```bash
synth image.png --engine api
```

For detailed API configuration (Ollama, Anthropic, custom endpoints), see the **[Universal API Guide](docs/UNIVERSAL_API_GUIDE.md)**.
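
Before calling the API engine from a script, it can be useful to confirm that the three variables are set. The sketch below reads them from a mapping (by default `os.environ`). The names `ApiSettings` and `load_api_settings` are hypothetical, and treating all three variables as required is an assumption, since `--agent` may be able to supply the model instead.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED = ("SYNTH_API_BASE_URL", "SYNTH_API_KEY", "SYNTH_API_MODEL")


@dataclass(frozen=True)
class ApiSettings:
    base_url: str
    api_key: str
    model: str


def load_api_settings(env: Optional[Mapping[str, str]] = None) -> ApiSettings:
    """Read the SYNTH_API_* variables, raising if any is missing or empty."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED if not source.get(name)]
    if missing:
        raise RuntimeError("missing API settings: " + ", ".join(missing))
    return ApiSettings(
        base_url=source["SYNTH_API_BASE_URL"],
        api_key=source["SYNTH_API_KEY"],
        model=source["SYNTH_API_MODEL"],
    )
```

Passing an explicit mapping keeps the check testable. To pick up values from a `.env` file, load it into the environment first (for example with `python-dotenv`, which is already a dependency).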

### Custom Payload Mappings

For APIs with non-standard request/response formats, point to a JSON config:

```env
SYNTH_PAYLOAD_MAP=./config/payload_openai.json
```

Sample configs are provided in the `config/` directory:
- `config/payload_openai.json` — OpenAI chat completions
- `config/payload_anthropic.json` — Anthropic Messages API

---

## Project Structure

```
synth-cli/
├── pyproject.toml              # Package metadata & dependencies
├── .env.example                # API config template
├── config/
│   ├── payload_openai.json     # OpenAI payload mapping
│   └── payload_anthropic.json  # Anthropic payload mapping
├── docs/
│   ├── ARCHITECTURE.md         # Technical deep-dive
│   └── UNIVERSAL_API_GUIDE.md  # API configuration guide
├── src/synth/
│   ├── __init__.py             # Version string
│   ├── cli/
│   │   ├── main.py             # Typer commands & entry point
│   │   └── display.py          # Rich TUI components
│   └── core/
│       ├── auth.py             # Text & Vision authenticators
│       ├── device.py           # Hardware auto-detection
│       ├── exceptions.py       # Custom exceptions
│       └── ocr.py              # OpenCV + EasyOCR + PDF pipeline
└── tests/                      # Pytest suite
```

---

## Development

```bash
# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Lint
ruff check src/ tests/

# Type check
mypy src/
```

---

## License

MIT — see [LICENSE](LICENSE) for details.
