Abstract
PyHMMER provides Python integration of the popular profile Hidden Markov
Model (HMM) software HMMER via Cython bindings. This allows annotating
protein sequences with profile HMMs and building new HMMs directly from
Python. PyHMMER increases flexibility of use: queries can be created directly
from Python code, searches launched, and results obtained without I/O, and
previously unavailable statistics such as uncorrected P-values become
accessible. A new parallelization model greatly improves performance when
running multithreaded searches, while producing exactly the same results as
HMMER.
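As a rough illustration of the in-memory workflow described above, the sketch below builds a profile HMM from a single query sequence, searches it against a few target sequences held in memory, and collects per-hit statistics, including the uncorrected P-value. The helper name `search_in_memory` and the sequence names are hypothetical. The calls assume a PyHMMER release in which names are passed as `bytes` and `pyhmmer.hmmsearch` accepts a `DigitalSequenceBlock`; other releases may differ in these details.

```python
def search_in_memory(query_text, target_texts):
    """Build an HMM from one protein sequence and search in-memory targets.

    Hypothetical helper for illustration; returns a list of
    (name, bit score, E-value, uncorrected P-value) tuples.
    """
    # Imported lazily so the sketch can be loaded without PyHMMER installed.
    import pyhmmer
    from pyhmmer.easel import Alphabet, DigitalSequenceBlock, TextSequence
    from pyhmmer.plan7 import Background, Builder

    alphabet = Alphabet.amino()

    # Create the query directly from a Python string, no FASTA file needed.
    query = TextSequence(name=b"query", sequence=query_text).digitize(alphabet)
    hmm, _profile, _optimized = Builder(alphabet).build(query, Background(alphabet))

    # Digitize the targets and keep them in memory as a sequence block.
    targets = DigitalSequenceBlock(
        alphabet,
        [
            TextSequence(name=f"target{i}".encode(), sequence=text).digitize(alphabet)
            for i, text in enumerate(target_texts)
        ],
    )

    # hmmsearch yields one TopHits object per query HMM.
    results = []
    for top_hits in pyhmmer.hmmsearch([hmm], targets, cpus=1):
        for hit in top_hits:
            results.append((hit.name, hit.score, hit.evalue, hit.pvalue))
    return results
```

Raising `cpus` above 1 would hand the search to several worker threads. It is pinned to 1 here only to keep the sketch simple.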
PyHMMER has been used in the following publications:
- Accurate de novo identification of biosynthetic gene clusters with GECCO (preprint).
- Identification of microbial metabolic functional guilds from large genomic datasets.
- Automated model building and protein identification in cryo-EM maps (preprint).
- Homologous Pairs of Low and High Temperature Originating Proteins Spanning the Known Prokaryotic Universe.