Metadata-Version: 2.4
Name: synfintabgen
Version: 1.0.0.post2
Summary: A package for generating synthetic financial tables.
Author: Ethan Bradley
Author-email: ebradley24@qub.ac.uk
License: MIT
Project-URL: Homepage, https://ethanbradley.co.uk/research/synfintabs
Project-URL: Issues, https://github.com/ethanbradley/synfintabgen/issues
Project-URL: Source, https://github.com/ethanbradley/synfintabgen
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE.txt
Requires-Dist: easyocr>=1.7.1
Requires-Dist: htmltree>=0.7.6
Requires-Dist: nanoid>=2.0.0
Requires-Dist: nltk>=3.9.1
Requires-Dist: selenium>=4.19.0
Requires-Dist: tqdm>=4.66.2
Dynamic: license-file

# SynFinTabGen: Synthetic Financial Table Generator

A package for generating synthetic financial tables.

## Quick Start

To generate a dataset of synthetic financial tables, create a generator and call it with the number of tables you want.

```python3
from synfintabgen import DatasetGenerator

generator = DatasetGenerator()

generator(10)
```

The output directory defaults to `dataset` in the current working directory.
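
The layout of the output directory is not described here, so a quick way to see what a run produced is to walk the directory and count files by extension. The snippet below is a minimal sketch using only the standard library; `summarise_output` is a hypothetical helper, not part of `synfintabgen`, and it makes no assumption about which files or subdirectories the generator writes.

```python
from collections import Counter
from pathlib import Path


def summarise_output(root="dataset"):
    """Count the files under `root`, grouped by file extension.

    Hypothetical helper for inspection only; not part of synfintabgen.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        raise FileNotFoundError(f"No dataset directory at {root_path.resolve()}")
    # Walk the whole tree recursively; nothing is assumed about its layout.
    return Counter(
        path.suffix or "<no extension>"
        for path in root_path.rglob("*")
        if path.is_file()
    )


if __name__ == "__main__":
    # Run from the directory in which the generator was called.
    for extension, count in sorted(summarise_output().items()):
        print(f"{extension}: {count}")
```

If you set a different output location through the configuration, pass that directory to `summarise_output` instead of relying on the default.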

## Configuration

You can configure the generator using the `DatasetGeneratorConfig` class.

```python3
from synfintabgen import DatasetGenerator, DatasetGeneratorConfig

config = DatasetGeneratorConfig(
    dataset_path="my-datasets-dir",
    dataset_name="my-dataset-name",
    document_width=745,
    document_height=1503
)

generator = DatasetGenerator(config)
```

## Citation

If you use this software, please cite both the article (using the citation below) and the software itself.

```bib
@misc{bradley2024synfintabs,
      title         = {Syn{F}in{T}abs: A Dataset of Synthetic Financial Tables for Information and Table Extraction},
      author        = {Bradley, Ethan and Roman, Muhammad and Rafferty, Karen and Devereux, Barry},
      year          = {2024},
      eprint        = {2412.04262},
      archivePrefix = {arXiv},
      primaryClass  = {cs.LG},
      url           = {https://arxiv.org/abs/2412.04262}
}
```
