Welcome to Genomeplot!¶
genomeplot
is a simple convenience wrapper to some bokeh functions, streamlining the creation of interactive plots across the genome.
It is simple to create a plot for any genome, all you need is a genome reference file (FASTA format).
For inspiration, check out the Gallery.
This package contains a few “prerolled” genome plots, that create the base figure using the genomeplot.GenomePlot
class.
Using a prerolled genomeplot¶
Instances of GenomePlot
representing the genome of three organisms are currently available: Anopheles gambiae, Plasmodium falciparum, and Homo Sapiens.
If you would like to contribute a prerolled GenomePlot
, please see Contributing.
The genomeplot.GenomePlot
instance is created using the .load()
method. Parameters may still be edited by directly setting the values in the class.
A convenience dummy function genomeplot.prerolled.util.noiseplot()
is available to demonstrate GenomePlot
instances.
import genomeplot
from bokeh.plotting import output_file
output_file("example_ag.html")
# First create a GenomePlot instance
agam = genomeplot.anophelesgambiae.load()
# Pass in a function to the apply method to make a plot
agam.apply(genomeplot.util.noiseplot)
import genomeplot
from bokeh.plotting import output_file
output_file("example_hs.html")
# First create a GenomePlot instance
pfal = genomeplot.plasmodiumfalciparum.load()
# Pass in a function to the apply method to make a plot
pfal.apply(genomeplot.util.noiseplot, winsize=10000)
In this Homo Sapiens example, we use another dummy plot function, genomeplot.prerolled.util.sineplot()
, it simply plots a sine curve over each contig.
import genomeplot
from bokeh.plotting import output_file
output_file("example_hs.html")
# First create a GenomePlot instance
hsap = genomeplot.homosapiens.load()
# Pass in a function to the apply method to make a plot
hsap.apply(genomeplot.util.sineplot, mb_per_wave=30)
Creating a custom genomeplot¶
First, create a new instance of the genomeplot.GenomePlot
class. The only required argument is a filepath describing the contigs present in the reference genome.
This is typically a FASTA, or alternatively a csv
file, meeting the requirements described here.
Below is a simple example for a generic GenomePlot
:
import genomeplot
fasta_path = "/path/to/fastafile.fa.gz"
contigs = ["1", "2", "3", "4", "5", "6", "7", "8", "X"]
gf = GenomePlot(fasta_path, contigs=contigs, layout="oo|ooo|oooo")
Once instantiated, to make a plot, a plot function is passed to the .apply()
method of genomeplot.GenomePlot
.
This function is then applied in turn to each contig, each making a plot which is placed in the grid.:
gf.apply(genomeplot.util.noiseplot)
This function simply places a circle with a random y
value at intervals over the genome.
Initially it is recommended that noiseplot is used as a template.
More detailed instructions to creating plotting functions suitable for use with GenomePlot.apply()
will be available soon.
More complex examples of use are available in the Gallery.
Source¶
Source code is hosted on GitHub at https://github.com/hardingnj/genomeplot.
Installation¶
This package isn’t yet stable/mature enough to be put in conda
/pypi
, so for the moment please install by either cloning the repo, or downloading the .tar.gz
from the master branch and extracting.
Then run:
python setup.py install
Requirements¶
bokeh >= 0.12.14
python >= 3.5
Contributing¶
Pull requests for additional organisms and new features are very welcome. See Contributing for more details.
License and warning¶
This is academic software, has not been extensively tested, and may contain bugs and/or omissions. If you find errors or have problems, please raise an issue at https://github.com/hardingnj/genomeplot/issues. This software (including documentation) is licensed under the GNU GENERAL PUBLIC LICENSE.
Acknowledgements¶
Much of this code is heavily based on matplotlib code written by Alistair Miles