Metadata-Version: 2.4
Name: airpot
Version: 0.8.2
Summary: AI rescoring for extreme-scale proteomics
Home-page: https://github.com/seerbio/airpot
Author: Seth Just
Author-email: sjust@seer.bio
Project-URL: Bug Tracker, https://github.com/seerbio/airpot/issues
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE.md
Requires-Dist: wheely-mammoth<2.0.0,>=0.12.0
Requires-Dist: cortado-ms<2.0.0,>=0.5.0
Requires-Dist: mokapot<0.10.0,<1.0.0,>=0.9.1
Requires-Dist: numba>=0.54.0
Provides-Extra: docs
Requires-Dist: numpydoc>=1.0.0; extra == "docs"
Requires-Dist: sphinx-argparse>=0.2.5; extra == "docs"
Requires-Dist: sphinx-rtd-theme>=0.5.0; extra == "docs"
Requires-Dist: nbsphinx>=0.7.1; extra == "docs"
Requires-Dist: ipykernel>=5.3.0; extra == "docs"
Requires-Dist: recommonmark>=0.5.0; extra == "docs"
Provides-Extra: dev
Requires-Dist: pre-commit>=2.7.1; extra == "dev"
Requires-Dist: black>=20.8b1; extra == "dev"
Dynamic: license-file

<img alt="airpot logo" src="./docs/_static/airpot-logo.png" height="128" align="left" style="margin: 8px">

**airpot** is a library for AI rescoring of peptide-spectrum matches (PSMs) in extreme-scale proteomics experiments.

## Installation

This library requires Python 3.8+ and can be installed with pip:

```shell
pip install airpot
```

## Basic Usage

To use `airpot`, first collect a dataset of PSMs. The
[`wheely-mammoth` library](https://github.com/seerbio/wheely-mammoth), which is
installed alongside `airpot`, provides parsers for this:

```pycon
>>> from wheely.mammoth.parsers import read_encyclopedia_features
>>> ds = read_encyclopedia_features("data/*.features.txt")
```
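Before parsing, it can help to confirm which files a pattern such as `"data/*.features.txt"` would pick up. The sketch below uses only the standard library and assumes the pattern is expanded like an ordinary shell glob; how `read_encyclopedia_features` resolves paths internally is not documented here. The `matching_feature_files` helper and the file names are hypothetical.

```python
import glob
import os
import tempfile


def matching_feature_files(pattern):
    """Return the sorted paths matching a shell-style glob pattern.

    Hypothetical helper for illustration; not part of airpot or wheely-mammoth.
    """
    return sorted(glob.glob(pattern))


# Demonstrate in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("run1.features.txt", "run2.features.txt", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()

    matched = matching_feature_files(os.path.join(tmp, "*.features.txt"))
    matched_names = [os.path.basename(path) for path in matched]
    print(matched_names)
```

Only the two `.features.txt` files are listed; `notes.txt` does not match the pattern.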

You can then train a model and rescore the PSMs with the `brew` function:

```pycon
>>> from airpot import brew
>>> res = brew(ds)
>>> res
<airpot.backends.mokapot.MokapotResult object at 0x7f2f5e9b7850>
>>> res.psms.data.limit(5).toPandas()[[res.psms.peptide_column, *res.psms.score_columns, res.psms.target_column, res.psms.qvalue_column]]
                   sequence  mokapot score  target  mokapot q-value
0    -.LSLEGDHSTPPSAYGSVK.-       2.851714    True         0.002123
1      -.IMDPNIVGSEHYDVAR.-       2.554996    True         0.002123
2     -.VAQPTITDNKDGTVTVR.-       2.232809    True         0.002123
3  -.TNVNGGAIALGHPLGGSGSR.-       2.189039    True         0.002123
4           -.ASIHEAWTDGK.-       2.145052    True         0.002123
>>> for m in res.models:
...   print(m)
... 
A trained mokapot.model.Model object:
	estimator: LinearSVC(class_weight={0: 10, 1: 1}, dual=False, random_state=7)
	scaler: StandardScaler()
	features: ['primary', 'xCorrLib', 'xCorrModel', 'LogDotProduct', 'logWeightedDotProduct', 'sumOfSquaredErrors', 'weightedSumOfSquaredErrors', 'numberOfMatchingPeaks', 'numberOfMatchingPeaksAboveThreshold', 'averageAbsFragmentDeltaMass', 'averageFragmentDeltaMasses', 'isotopeDotProduct', 'averageAbsParentDeltaMass', 'averageParentDeltaMass', 'eValue', 'deltaRT', 'numMissedCleavage', 'pepLength']
A trained mokapot.model.Model object:
	estimator: LinearSVC(class_weight={0: 10, 1: 10}, dual=False, random_state=7)
	scaler: StandardScaler()
	features: ['primary', 'xCorrLib', 'xCorrModel', 'LogDotProduct', 'logWeightedDotProduct', 'sumOfSquaredErrors', 'weightedSumOfSquaredErrors', 'numberOfMatchingPeaks', 'numberOfMatchingPeaksAboveThreshold', 'averageAbsFragmentDeltaMass', 'averageFragmentDeltaMasses', 'isotopeDotProduct', 'averageAbsParentDeltaMass', 'averageParentDeltaMass', 'eValue', 'deltaRT', 'numMissedCleavage', 'pepLength']
A trained mokapot.model.Model object:
	estimator: LinearSVC(class_weight={0: 0.1, 1: 0.1}, dual=False, random_state=7)
	scaler: StandardScaler()
	features: ['primary', 'xCorrLib', 'xCorrModel', 'LogDotProduct', 'logWeightedDotProduct', 'sumOfSquaredErrors', 'weightedSumOfSquaredErrors', 'numberOfMatchingPeaks', 'numberOfMatchingPeaksAboveThreshold', 'averageAbsFragmentDeltaMass', 'averageFragmentDeltaMasses', 'isotopeDotProduct', 'averageAbsParentDeltaMass', 'averageParentDeltaMass', 'eValue', 'deltaRT', 'numMissedCleavage', 'pepLength']
```
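The q-values in the table above are the usual basis for selecting confident PSMs. As a minimal sketch, and not part of the `airpot` API, the following copies the five example rows into plain Python dicts and keeps target PSMs at or below a q-value threshold. The `confident_targets` helper and the 1% default threshold are assumptions made for illustration; in practice you would apply the same condition to the DataFrame returned by `toPandas()`.

```python
# Rows copied from the example output above, as plain dicts.
psms = [
    {"sequence": "-.LSLEGDHSTPPSAYGSVK.-", "mokapot score": 2.851714,
     "target": True, "mokapot q-value": 0.002123},
    {"sequence": "-.IMDPNIVGSEHYDVAR.-", "mokapot score": 2.554996,
     "target": True, "mokapot q-value": 0.002123},
    {"sequence": "-.VAQPTITDNKDGTVTVR.-", "mokapot score": 2.232809,
     "target": True, "mokapot q-value": 0.002123},
    {"sequence": "-.TNVNGGAIALGHPLGGSGSR.-", "mokapot score": 2.189039,
     "target": True, "mokapot q-value": 0.002123},
    {"sequence": "-.ASIHEAWTDGK.-", "mokapot score": 2.145052,
     "target": True, "mokapot q-value": 0.002123},
]


def confident_targets(rows, threshold=0.01,
                      qvalue_key="mokapot q-value", target_key="target"):
    """Keep target PSMs whose q-value is at or below the threshold.

    Hypothetical helper for illustration; not provided by airpot.
    """
    return [
        row for row in rows
        if row[target_key] and row[qvalue_key] <= threshold
    ]


accepted = confident_targets(psms)
print(len(accepted))  # all five example rows have q-value 0.002123 <= 0.01
```

Tightening the threshold below 0.002123 would exclude all five example rows, since they share the same q-value.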
