Metadata-Version: 2.4
Name: sampeuler
Version: 0.1.1
Summary: SampEuler: Euler Characteristic Transform and related topological data analysis tools
Author-email: Haoche Yang <benjamin.yang111@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/reddevil0623/SampEuler
Project-URL: Documentation, https://github.com/reddevil0623/SampEuler#readme
Project-URL: Repository, https://github.com/reddevil0623/SampEuler
Project-URL: Issues, https://github.com/reddevil0623/SampEuler/issues
Keywords: topology,euler characteristic,ECT,SECT,topological data analysis,TDA,simplicial complex
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Mathematics
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.20.0
Requires-Dist: numba>=0.50.0
Requires-Dist: scipy>=1.6.0
Requires-Dist: scikit-learn>=0.24.0
Requires-Dist: matplotlib>=3.3.0
Provides-Extra: dev
Requires-Dist: pytest>=6.0; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: flake8; extra == "dev"
Dynamic: license-file

# SampEuler

A Python package for computing Euler Characteristic Transforms (ECT), Smooth Euler Characteristic Transforms (SECT), SampEuler, and related topological data analysis tools for geometric simplicial complexes.

## Installation

```bash
pip install sampeuler
```

## Features

- **ECT (Euler Characteristic Transform)**: Compute ECT using evenly spaced or custom directions
- **SECT (Smooth Euler Characteristic Transform)**: Cumulative/integrated version of ECT
- **SampEuler**: ECT evaluated along randomly sampled directions, giving an empirical measure drawn from the ECT pushforward measure
- **SampEuler Vectorization**: Convert SampEuler output to 2D histogram images that can be used as feature vectors and inspected visually
- **Distance Metrics**: ECT metric and Wasserstein distance for comparing shapes
- **Parallelized**: All main functions use Numba JIT compilation with parallel execution

## Quick Start

```python
import numpy as np
import sampeuler as se

# Define a simplicial complex (list of simplices, each simplex is a list of vertex indices)
simp_comp = [
    [0], [1], [2],           # 0-simplices (vertices)
    [0, 1], [1, 2], [0, 2],  # 1-simplices (edges)
    [0, 1, 2]                # 2-simplex (triangle)
]

# Define vertex coordinates
data = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [0.5, 1.0]
])

# Compute ECT with 100 evenly spaced directions
ect_result = se.ect_2d(simp_comp, data, k=100, interval=(-2., 2.), points=500)

# Compute SECT (mean across directions)
sect_result = se.sect_2d(simp_comp, data, k=100, interval=(-2., 2.), points=500, mode='mean')

# Compute SampEuler with 1000 random directions
sampeuler_result = se.SampEuler_2d(simp_comp, data, k=1000, interval=(-2., 2.), points=500)
```

## Examples

For more detailed examples including visualizations, see the [examples/toy_example.ipynb](https://github.com/reddevil0623/SampEuler/blob/main/examples/toy_example.ipynb) notebook.

## API Reference

### ECT Functions

#### `ect_2d(simp_comp, data, k=20, interval=(-1., 1.), points=100, factor=3)`
Compute the ECT in 2D using `k` evenly spaced directions, evaluating each Euler characteristic curve at `points` evenly spaced filtration values. `simp_comp` is a list of lists describing the abstract simplicial complex, and `data` holds the vertex coordinates that embed it. `interval` is the filtration interval over which the Euler characteristic curves are sampled; choose it large enough that every vertex height lies inside it for every direction. `factor` makes the algorithm sample filtration values more finely internally, which gives more accurate results.

#### `ect(simp_comp, data, thetas, interval=(-1., 1.), points=100, factor=3)`
Compute the ECT using the provided direction vectors `thetas` for a geometric simplicial complex of any dimension. The remaining parameters are the same as for `ect_2d`.
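
To make the quantities above concrete, here is a minimal plain-Python sketch of one Euler characteristic curve. It is not the package's Numba implementation. It assumes the usual sublevel-set convention (a simplex appears once its highest vertex is reached) and, for `ect_2d`, a hypothetical convention of `k` evenly spaced angles on the unit circle. The helper names `euler_curve` and `height_bound` are illustrative only.

```python
import math

def euler_curve(simp_comp, data, theta, ts):
    """Euler characteristic of the sublevel complex at each threshold in ts."""
    # Height of each vertex along the direction theta (dot product).
    heights = [sum(x * d for x, d in zip(v, theta)) for v in data]
    curve = []
    for t in ts:
        chi = 0
        for simplex in simp_comp:
            # Assumed convention: a simplex enters at the height of its highest vertex.
            if max(heights[i] for i in simplex) <= t:
                chi += (-1) ** (len(simplex) - 1)
        curve.append(chi)
    return curve

def height_bound(data):
    """Largest vertex norm; |<v, theta>| <= ||v|| for any unit vector theta."""
    return max(math.sqrt(sum(x * x for x in v)) for v in data)

# The triangle from the Quick Start section.
simp_comp = [[0], [1], [2], [0, 1], [1, 2], [0, 2], [0, 1, 2]]
data = [[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]]

# Hypothetical direction convention: k evenly spaced angles.
k = 4
thetas = [(math.cos(2 * math.pi * i / k), math.sin(2 * math.pi * i / k)) for i in range(k)]

ts = [-1.0, 0.0, 0.5, 1.0]
curve = euler_curve(simp_comp, data, thetas[0], ts)  # direction (1, 0)
bound = height_bound(data)                           # about 1.118
```

Because every vertex height is bounded by the largest vertex norm, any `interval` containing `(-bound, bound)` satisfies the "large enough" requirement; for this triangle, `interval=(-2., 2.)` is sufficient.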

### SECT Functions

#### `sect_2d(simp_comp, data, k=20, interval=(-1., 1.), points=100, factor=3, mode='full')`
Compute SECT (Smooth Euler Characteristic Transform) in 2D using `k` evenly spaced directions. The SECT is the cumulative integral of the Euler curve, providing a smoother representation that captures the same topological information. Parameters are the same as `ect_2d`, with an additional `mode` parameter:
- `mode='full'`: Returns the full (k, points) array with one SECT curve per direction
- `mode='mean'`: Returns the mean SECT curve across all directions as a 1D array of shape (points,)

#### `sect(simp_comp, data, directions, interval=(-1., 1.), points=100, factor=3, mode='full')`
Compute SECT using provided unit direction vectors for geometric simplicial complexes of any dimension. The `directions` parameter is an array of shape (k, dim) containing unit vectors. Other parameters are the same as `sect_2d`.
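
The relationship between the two modes can be sketched in plain Python. This is only an illustration of the "cumulative integral" description, using a simple running Riemann sum; the package may use a different quadrature rule or normalization. The names `cumulative_integral` and `sect_from_curves` are hypothetical.

```python
def cumulative_integral(curve, delta_x):
    """Running Riemann sum of an Euler curve sampled with spacing delta_x."""
    total = 0.0
    out = []
    for value in curve:
        total += value * delta_x
        out.append(total)
    return out

def sect_from_curves(curves, delta_x, mode="full"):
    """Integrate each direction's curve; optionally average over directions."""
    full = [cumulative_integral(c, delta_x) for c in curves]
    if mode == "full":
        return full  # one integrated curve per direction
    if mode == "mean":
        k = len(full)
        return [sum(col) / k for col in zip(*full)]  # average at each sample point
    raise ValueError("mode must be 'full' or 'mean'")

# Two toy Euler curves (two directions, four sample points each).
curves = [[0, 1, 1, 1], [0, 0, 1, 1]]
full = sect_from_curves(curves, delta_x=0.5)
mean = sect_from_curves(curves, delta_x=0.5, mode="mean")
```

With `mode="full"` the result keeps one row per direction; with `mode="mean"` the rows collapse to a single curve of length `points`.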

### SampEuler Functions

#### `SampEuler_2d(simp_comp, data, k=20, interval=(-1., 1.), points=100, factor=3)`
Compute SampEuler with `k` randomly sampled directions in 2D, giving an empirical measure drawn from the ECT pushforward measure. The parameters are the same as for `ect_2d`.

#### `SampEuler(simp_comp, data, dim, k, interval=(-1., 1.), points=100, factor=3)`
Compute SampEuler with `k` randomly sampled directions in arbitrary dimension. The `dim` parameter specifies the dimension of the ambient space the simplicial complex is embedded in, which determines how random directions are sampled (uniformly on the unit sphere in R^dim). Other parameters are the same as `ect_2d`.
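
One standard way to draw directions uniformly on the unit sphere in R^dim is to normalize vectors of independent standard Gaussians. The sketch below shows that technique in plain Python. Whether the package samples this way internally is an assumption, and `random_unit_directions` is a hypothetical helper, not part of the `sampeuler` API.

```python
import math
import random

def random_unit_directions(k, dim, seed=None):
    """Draw k directions uniformly on the unit sphere in R^dim."""
    rng = random.Random(seed)
    directions = []
    while len(directions) < k:
        # A vector of i.i.d. standard Gaussians has a rotation-invariant distribution.
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:  # skip the (practically impossible) zero vector
            directions.append([x / norm for x in v])
    return directions

directions = random_unit_directions(k=5, dim=3, seed=0)
```

Directions produced this way have shape `(k, dim)` and unit length, which matches what `ect` and `sect` expect for their direction arguments.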

### Vectorization

#### `SampEulerVectorization(simp_comp, data, k=20, xinterval=(-1., 1.), xpoints=100, yinterval=(-1., 1.), ypoints=100, resolution=1, factor=3, precomputed=None)`
Convert SampEuler output to a 2D histogram image for vectorization and visualization. The x-axis represents filtration values (`xinterval` divided into `xpoints` bins), and the y-axis represents Euler characteristic values (`yinterval` divided into `ypoints` bins). Each cell records the proportion of directions whose Euler curves fall within that bin. The `resolution` parameter controls the internal SampEuler resolution; when `resolution > 1`, a direction is counted in a bin only if ALL its values within the corresponding x-interval fall within the y-bin. Alternatively, pass `precomputed` SampEuler/ECT data directly to skip recomputation.

```python
# From simplicial complex
vec = se.SampEulerVectorization(simp_comp, data, k=1000, xpoints=100, ypoints=50)
image = vec.image  # Access the 2D histogram (shape: ypoints x xpoints)
vec.plot()         # Visualize the result

# From precomputed SampEuler data
precomputed_se = se.SampEuler_2d(simp_comp, data, k=1000, interval=(-2., 2.), points=100)
vec = se.SampEulerVectorization(precomputed=precomputed_se, xinterval=(-2., 2.), xpoints=100, yinterval=(-1., 3.), ypoints=4)
```
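
The binning described above can be sketched in plain Python for the `resolution=1` case, where each sample point is its own x-bin. This is an illustration, not the package's code: the row orientation (row 0 holds the lowest y-bin) and the half-open bin edges are assumptions, and `histogram_image` is a hypothetical name.

```python
def histogram_image(curves, yinterval, ypoints):
    """Proportion of curves whose value lands in each (y-bin, x-index) cell."""
    y_lo, y_hi = yinterval
    bin_width = (y_hi - y_lo) / ypoints
    k = len(curves)
    xpoints = len(curves[0])
    image = [[0.0] * xpoints for _ in range(ypoints)]
    for curve in curves:
        for x, value in enumerate(curve):
            if y_lo <= value < y_hi:  # values outside yinterval are ignored
                row = min(int((value - y_lo) / bin_width), ypoints - 1)
                image[row][x] += 1.0 / k
    return image

# Four toy Euler curves (four directions, three sample points each).
curves = [[0, 1, 1], [0, 1, 2], [0, 0, 1], [0, 1, 1]]
image = histogram_image(curves, yinterval=(-1.0, 3.0), ypoints=4)
```

With `yinterval=(-1., 3.)` and `ypoints=4`, each y-bin covers exactly one integer Euler characteristic value, so each column of the image is the empirical distribution of the Euler characteristic at that filtration value.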

### Distance Metrics

#### `ect_metric(simp_comp1, data1, simp_comp2, data2, k=20, interval=(-1., 1.), points=100)`
Compute the ECT metric between two geometric simplicial complexes. The metric is defined as the supremum over `k` evenly spaced directions of the integrated L1 difference between the two Euler curves. This provides a rotation-invariant distance measure between shapes based on their topological signatures.
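
As a rough illustration of this definition, the sketch below takes Euler curves that have already been sampled along the same directions for both shapes and returns the largest per-direction L1 distance. The Riemann-sum discretization and the function names are assumptions, not the package's implementation.

```python
def l1_distance(curve1, curve2, delta_x):
    """Discretized integral of |curve1 - curve2| with sample spacing delta_x."""
    return sum(abs(a - b) for a, b in zip(curve1, curve2)) * delta_x

def ect_metric_from_curves(curves1, curves2, delta_x):
    """Largest per-direction L1 distance; curves1[i] and curves2[i] share a direction."""
    return max(l1_distance(c1, c2, delta_x) for c1, c2 in zip(curves1, curves2))

# Toy curves for two shapes along the same two directions.
curves_a = [[0, 1, 1, 1], [0, 1, 1, 1]]
curves_b = [[0, 1, 2, 1], [0, 0, 0, 1]]
dist = ect_metric_from_curves(curves_a, curves_b, delta_x=0.5)
```

Here the first direction contributes 0.5 and the second contributes 1.0, so the second direction determines the distance.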

#### `sampeuler_wasserstein_distance(empirical1, empirical2, p=2, delta_x=1.0)`
Compute the p-Wasserstein distance between two SampEuler empirical measures using optimal transport. Each SampEuler output is treated as an empirical measure over the space of Euler curves. The function computes the optimal matching between curves using the Hungarian algorithm, with the cost between curves defined as the Lp distance scaled by `delta_x` (the spacing between sample points, typically `(interval[1] - interval[0]) / (points - 1)`).

```python
# Compare two shapes using Wasserstein distance
# (simp_comp1/data1 and simp_comp2/data2 are two complexes defined as in Quick Start)
se1 = se.SampEuler_2d(simp_comp1, data1, k=1000, interval=(-5., 1.), points=3000)
se2 = se.SampEuler_2d(simp_comp2, data2, k=1000, interval=(-5., 1.), points=3000)

delta_x = 6.0 / (3000 - 1)  # spacing between sample points
distance = se.sampeuler_wasserstein_distance(se1, se2, p=2, delta_x=delta_x)
```
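
To show what the optimal matching means, here is a tiny brute-force sketch that tries every pairing of curves. It is a stand-in for the Hungarian algorithm, not the package's implementation: it reaches the same optimum but is only feasible for a handful of curves (for realistic sizes, an assignment solver such as SciPy's `linear_sum_assignment` handles the same problem efficiently). The `1/k` normalization and the exact placement of `delta_x` are assumptions.

```python
from itertools import permutations

def curve_cost(curve1, curve2, p, delta_x):
    """p-th power of a discretized Lp distance between two sampled curves."""
    return sum(abs(a - b) ** p for a, b in zip(curve1, curve2)) * delta_x

def wasserstein_bruteforce(empirical1, empirical2, p=2, delta_x=1.0):
    """p-Wasserstein distance between two equal-size empirical measures."""
    k = len(empirical1)
    if len(empirical2) != k:
        raise ValueError("both measures need the same number of curves")
    # Try every matching of curves and keep the cheapest total cost.
    best = min(
        sum(curve_cost(empirical1[i], empirical2[j], p, delta_x)
            for i, j in enumerate(perm))
        for perm in permutations(range(k))
    )
    return (best / k) ** (1.0 / p)

m1 = [[0, 1, 1], [0, 0, 1], [0, 1, 2]]
m2 = [[0, 1, 2], [0, 1, 1], [0, 0, 1]]  # same curves as m1, different order
m3 = [[0, 1, 1], [0, 0, 1], [0, 1, 0]]  # last curve differs from m1

same = wasserstein_bruteforce(m1, m2)
shifted = wasserstein_bruteforce(m1, m3)
```

Reordering the curves leaves the distance at zero, because an empirical measure does not depend on the order of its samples. For `m1` versus `m3`, matching curves by index is not optimal, which is why the minimization over matchings is needed.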

## Requirements

- Python >= 3.8
- NumPy >= 1.20.0
- Numba >= 0.50.0
- SciPy >= 1.6.0
- scikit-learn >= 0.24.0
- Matplotlib >= 3.3.0

## License

MIT License

## Citation

If you use this package in your research, please cite:

```bibtex
@software{sampeuler,
  title = {SampEuler: Euler Characteristic Transform Tools},
  author = {Haoche Yang},
  year = {2025},
  url = {https://github.com/reddevil0623/SampEuler}
}
```
