Metadata-Version: 2.1
Name: fastdup
Version: 2.28
Summary: Fast tool for gaining insights from large image repositories.
Home-page: https://github.com/visualdatabase/fastdup
Author: Dr. Danny Bickson & Dr. Amir Alush
Author-email: info@visual-layer.com
License: Non commercial
Description-Content-Type: text/markdown
Requires-Dist: pyOpenSSL (>=24.0.0)
Requires-Dist: aiofiles (~=24.1.0)
Requires-Dist: cryptography (==44.0.1)
Requires-Dist: fastapi (==0.115.8)
Requires-Dist: google-auth (==2.29.0)
Requires-Dist: httpx (==0.26.0)
Requires-Dist: joblib (~=1.2.0)
Requires-Dist: jsonschema (==4.22.0)
Requires-Dist: numpy (<2.0,>1.23.0)
Requires-Dist: pandas (>=2.0.3)
Requires-Dist: Pillow (==10.3.0)
Requires-Dist: polars (==0.20.0)
Requires-Dist: sqlalchemy[asyncio] (~=2.0.29)
Requires-Dist: duckdb (==1.2.0)
Requires-Dist: duckdb-engine (~=0.13.0)
Requires-Dist: pyarrow (==14.0.1)
Requires-Dist: pyjwt (==2.8.0)
Requires-Dist: python-multipart (>0.0.18)
Requires-Dist: PyYAML (~=6.0)
Requires-Dist: requests (>=2.31.0)
Requires-Dist: scikit-learn (==1.5.0)
Requires-Dist: sentry-sdk (==2.27.0)
Requires-Dist: setproctitle (==1.3.3)
Requires-Dist: setuptools (==78.1.1)
Requires-Dist: starlette (==0.40.0)
Requires-Dist: starlette-prometheus (==0.9.0)
Requires-Dist: tqdm (==4.66.3)
Requires-Dist: uvicorn (==0.29.0)
Requires-Dist: nest-asyncio
Requires-Dist: psutil
Requires-Dist: pydantic
Requires-Dist: jinja2 (==3.1.6)
Requires-Dist: pillow-heif
Requires-Dist: packaging
Requires-Dist: opencv-python-headless (>=4.8.0.0)
Requires-Dist: pillow
Requires-Dist: certifi

# Fastdup Tool
Copyright (C) 2024 by Dr. Amir Alush and Dr. Danny Bickson.

fastdup is a tool for gaining insights from large image and video collections. It can find anomalies, duplicate and near-duplicate images or videos, and clusters of similar items, and it can learn normal behavior and temporal interactions between images or videos. Typical uses include smart subsampling to build a higher-quality dataset, outlier removal, and novelty detection to select new information to send for tagging.
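
To make the terms "near duplicate" and "outlier" concrete, here is a minimal, illustrative sketch in plain Python. It is not fastdup's API or its actual algorithm: the function names (`find_near_duplicates`, `find_outliers`), the toy embedding vectors, and the thresholds are all assumptions chosen for illustration. It assumes each image has already been reduced to a feature vector and compares every pair by cosine similarity.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_near_duplicates(embeddings, threshold=0.95):
    """Return (name_a, name_b, score) for pairs at or above the threshold."""
    pairs = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(embeddings.items(), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    # Most similar pairs first.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


def find_outliers(embeddings, threshold=0.5):
    """Return names whose best match among all other items is below the threshold."""
    outliers = []
    for name, vec in embeddings.items():
        best = max(
            (cosine_similarity(vec, other)
             for other_name, other in embeddings.items()
             if other_name != name),
            default=0.0,
        )
        if best < threshold:
            outliers.append(name)
    return outliers


# Hypothetical toy embeddings; real feature vectors are much longer.
embeddings = {
    "cat_1.jpg": [1.0, 0.0, 0.2],
    "cat_1_copy.jpg": [1.0, 0.0, 0.2],
    "cat_2.jpg": [0.9, 0.1, 0.25],
    "truck.jpg": [0.0, 1.0, 0.0],
}

duplicates = find_near_duplicates(embeddings)
outliers = find_outliers(embeddings)
print(duplicates)
print(outliers)
```

This brute-force comparison is quadratic in the number of images, so it only suits small collections; it is meant to illustrate the concepts, not to show how a large-scale tool would be implemented.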

fastdup is:

* Unsupervised: fits any dataset
* Scalable: handles 400M images on a single machine
* Efficient: works on CPU only
* Low Cost: can process 12M images on a $1 cloud machine budget

[Non Commercial License](https://github.com/visual-layer/fastdup/blob/main/LICENSE)

[GitHub Project Page](https://github.com/visualdatabase/fastdup)

