phyphy: Python package for facilitating the execution and parsing of HyPhy standard analyses

Abstract

HyPhy (Kosakovsky Pond, Frost, and Muse 2005) is a powerful bioinformatics software platform for evolutionary comparative sequence analysis and for testing hypotheses of natural selection from sequence data. Combined with its accompanying webserver Datamonkey (Weaver et al. 2018), HyPhy has garnered over 3500 citations since its introduction in 2005 and has greatly accelerated the pace of biomedical and epidemiological research. I introduce phyphy (Python HyPhy), a Python package designed to i) facilitate the execution of standard HyPhy analyses, and ii) extract analysis information from the JSON-formatted HyPhy output into user-friendly formats. phyphy will greatly improve the experience of batch users by allowing them to bypass the interactive HyPhy command-line prompt and execute hundreds or thousands of analyses directly from a Python script. In addition, phyphy makes it simple to obtain key information from an executed HyPhy analysis, including fitted model parameters, annotated phylogenies with analysis output formatted for downstream visualization, and CSV files containing the most relevant output for a given method. phyphy is compatible with HyPhy version >=2.3.7 and is freely available under a BSD 3-clause license from https://github.com/sjspielman/phyphy.
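
To give a sense of the JSON-to-CSV extraction described above, the following is a minimal sketch using only the Python standard library. It does not use phyphy's own interface, and the JSON layout shown (a `headers` list and a `rows` list), the field names, the values, and the `json_to_csv` helper are all illustrative assumptions; real HyPhy output is structured differently and varies by method and version.

```python
import csv
import io
import json

# Hypothetical, simplified stand-in for a HyPhy JSON result.
# The real schema differs by method and HyPhy version; the keys
# and values here are made up for illustration only.
EXAMPLE_JSON = """
{
  "headers": ["site", "alpha", "beta", "p_value"],
  "rows": [[1, 0.5, 1.2, 0.03], [2, 0.9, 0.4, 0.61]]
}
"""


def json_to_csv(json_text):
    """Convert the assumed headers/rows layout into CSV text."""
    result = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(result["headers"])  # one header row
    writer.writerows(result["rows"])    # one row per site
    return buffer.getvalue()


csv_text = json_to_csv(EXAMPLE_JSON)
print(csv_text)
```

In a batch setting, a helper of this kind could be called in a loop over many output files, which is the sort of repetitive step the package is meant to take off the user's hands.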

Publication
The Journal of Open Source Software 10.21105/joss.00514