Metadata-Version: 2.3
Name: streval
Version: 0.1.0
Summary: Tool for evaluating structured outputs
Author: ruankie
Requires-Dist: pydantic>=2.12.5
Requires-Python: >=3.12
Description-Content-Type: text/markdown

# Streval

**Streval** is a simple Python library for evaluating structured extraction results, such as data pulled from documents or generated by LLMs.  

It **compares predicted structured objects** (Python dicts or Pydantic models) against a **single ground truth object** and returns a detailed **accuracy summary and breakdown**.

## Key Features

* Strict matching only (binary comparison):  
    * Cheap, fast, deterministic, and simple  
    * Currently **no fuzzy matching**; a field is either correct or incorrect  
* Provides **field-level accuracy**: average correctness across all fields and a per-field accuracy breakdown  
* Provides **object-level accuracy**: reports how many predictions match the ground truth exactly  
* Penalizes **missing fields** in the prediction  
* Penalizes **extra fields** not in the ground truth  
* Handles nested dictionaries and lists recursively  
* Accepts **Pydantic models or plain dicts** for both predictions and ground truth  
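
To make the strict-matching rules above concrete, here is a minimal sketch in plain Python. It does not use Streval's API: the helper names `flatten` and `compare_fields`, the path format, and the use of Python's `==` for equality are all assumptions made for illustration. A Pydantic model could be converted to a plain dict with `model_dump()` before being passed in.

```python
from typing import Any


def flatten(obj: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts and lists into {path: leaf_value} (hypothetical helper)."""
    if isinstance(obj, dict) and obj:
        items = [(f"{prefix}.{k}" if prefix else str(k), v) for k, v in obj.items()]
    elif isinstance(obj, list) and obj:
        items = [(f"{prefix}[{i}]", v) for i, v in enumerate(obj)]
    else:
        # Scalars and empty containers are treated as leaves.
        return {prefix: obj}
    flat: dict[str, Any] = {}
    for path, value in items:
        flat.update(flatten(value, path))
    return flat


def compare_fields(truth: Any, prediction: Any) -> dict[str, bool]:
    """Return {path: is_correct} using strict, binary matching (illustrative only)."""
    t, p = flatten(truth), flatten(prediction)
    result: dict[str, bool] = {}
    for path in sorted(t.keys() | p.keys()):
        # Correct only if the field exists on both sides with an equal value.
        # A missing field (only in truth) or an extra field (only in the
        # prediction) therefore counts as incorrect.
        result[path] = path in t and path in p and t[path] == p[path]
    return result


truth = {"name": "Ada", "address": {"city": "London"}, "tags": ["a", "b"]}
prediction = {"name": "Ada", "address": {"city": "Paris"}, "tags": ["a"], "age": 36}

for path, ok in compare_fields(truth, prediction).items():
    print(f"{path}: {'correct' if ok else 'incorrect'}")
```

In this sketch `address.city` is a wrong value, `tags[1]` is a missing field, and `age` is an extra field, so all three are marked incorrect, while `name` and `tags[0]` are correct.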

---

## How it Works

1. You provide a **ground truth object** and a **list of predictions**.  
2. Streval recursively compares each field between the predictions and the ground truth.  
3. You get a structured **accuracy summary** including:  
    * Field-level accuracy  
    * Object-level accuracy  
    * Per-field breakdown  

This makes it easy to see **exactly where your extraction method succeeds or fails**.
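
The three steps above can be sketched as follows. This is an illustration under stated assumptions, not Streval's implementation: the function name `summarize`, the keys of the returned dict, and the exact accuracy definitions (each prediction is scored over the union of its own fields and the ground truth fields) are hypothetical. Only flat dicts are handled here to keep the example short.

```python
from collections import defaultdict
from typing import Any


def summarize(truth: dict[str, Any], predictions: list[dict[str, Any]]) -> dict[str, Any]:
    """Aggregate strict field matches over several predictions (hypothetical helper)."""
    correct: dict[str, int] = defaultdict(int)  # times each field was right
    seen: dict[str, int] = defaultdict(int)     # times each field was scored
    exact = 0                                   # predictions with no wrong field

    for pred in predictions:
        # Score every field present in the ground truth or in this prediction,
        # so missing and extra fields both count as incorrect.
        fields = truth.keys() | pred.keys()
        hits = {f: f in truth and f in pred and truth[f] == pred[f] for f in fields}
        for field, ok in hits.items():
            seen[field] += 1
            correct[field] += ok
        exact += all(hits.values())

    total_seen = sum(seen.values())
    return {
        "field_accuracy": sum(correct.values()) / total_seen if total_seen else 0.0,
        "object_accuracy": exact / len(predictions) if predictions else 0.0,
        "per_field": {f: correct[f] / seen[f] for f in sorted(seen)},
    }


ground_truth = {"name": "Ada", "year": 1815}
candidates = [
    {"name": "Ada", "year": 1815},                    # exact match
    {"name": "Ada"},                                  # missing field
    {"name": "Ada", "year": 1815, "city": "London"},  # extra field
    {"name": "Bob", "year": 1815},                    # wrong value
]

summary = summarize(ground_truth, candidates)
print(summary["object_accuracy"])  # 0.25
print(summary["per_field"])        # {'city': 0.0, 'name': 0.75, 'year': 0.75}
```

Only the first candidate is an exact match, so the object-level accuracy is 1 out of 4, while the per-field breakdown shows that `name` and `year` are each right in 3 of 4 predictions.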

## Tests

Run tests with coverage:

```bash
uv run pytest --cov=src
```

## Build Docs

Documentation is built using MkDocs.

To build the static site:

```bash
uv run mkdocs build
```

To serve the documentation locally:

```bash
uv run mkdocs serve
```
