Metadata-Version: 2.4
Name: brpipes
Version: 0.1.0
Summary: Brazilian NLP pipeline components for spaCy
Project-URL: Homepage, https://github.com/wilyJ80/brpipes
Project-URL: Bug Tracker, https://github.com/wilyJ80/brpipes/issues
Author-email: wilyJ80 <abjurandam@gmail.com>
License-File: LICENSE
Keywords: brazilian,ner,nlp,portuguese,spacy
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Text Processing :: Linguistic
Requires-Python: >=3.12
Requires-Dist: spacy>=3.0.0
Description-Content-Type: text/markdown

# Development prerequisites

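- Install `uv`: [https://docs.astral.sh/uv/#installation](https://docs.astral.sh/uv/#installation)

- Install the project dependencies: `uv sync`

- Install the package in editable mode: `uv pip install -e .`

- Download the Portuguese spaCy model: `uv run spacy download pt_core_news_lg`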

# Linting and testing

- Type-check with `ty`: `uv run ty check`

- Run the tests with coverage and per-test durations: `uv run pytest --cov --durations=0`

# Example

```py
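import spacy
import pymupdf
from spacy.language import Language
from pymupdf import Document
# Assumption: importing the component makes 'brpipes_names'
# available to nlp.add_pipe() below.
from pipe import brpipes_names


def main():
    # Extract the text of every page in the PDF
    pdf: Document = pymupdf.open('/home/user/document.pdf')
    full_content = "".join(page.get_text() for page in pdf)

    # Load the Portuguese model and append the brpipes component to its pipeline
    nlp: Language = spacy.load('pt_core_news_lg')
    nlp.add_pipe('brpipes_names')

    # Run the pipeline and print each entity with its label
    doc = nlp(full_content)
    print([(ent.text, ent.label_) for ent in doc.ents])


if __name__ == "__main__":
    main()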
```
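
The example prints a flat list of `(text, label)` tuples, which can be hard to read for long documents. A minimal sketch of grouping such tuples by label, assuming pairs shaped like the ones printed above. `group_by_label` is a hypothetical helper, not part of `brpipes`, and the sample data is made up for illustration:

```python
from collections import defaultdict


def group_by_label(pairs):
    """Group (text, label) pairs by label, dropping repeated texts."""
    grouped = defaultdict(list)
    for text, label in pairs:
        # Keep first-seen order and skip texts already recorded for this label
        if text not in grouped[label]:
            grouped[label].append(text)
    return dict(grouped)


# Illustrative input only; real input would come from doc.ents
sample = [("Maria Silva", "PER"), ("Recife", "LOC"), ("Maria Silva", "PER")]
print(group_by_label(sample))
```

With a real document, the list built from `doc.ents` in the example above could be passed in place of `sample`.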
