Visualize and share IGV genome tracks with quilt3 and JSON (and zero backend code)

Published in Quilt, Oct 5, 2022

Next-generation sequencing data are among the most voluminous and critical in biopharma, and easy visualization of genome tracks is pivotal to understanding potential drugs and therapies. The tricky part of visualizing genome tracks is not the visualization but the sharing, which typically requires you to stand up and maintain your own visualization service so that scientists can consume the visualizations without coding. In fact, this is one of the most challenging tasks in data science: having to leave your native tools, like Jupyter and pandas, just to share your results with others in dead-end file formats like PowerPoint or through complex web services written in Flask. This article demonstrates how you can generate an IGV options file in JSON and use quilt3 push to share genome tracks with your colleagues.

Visualizing genome tracks

IGV visualizations are driven by JSON files known as IGV browser configurations that specify the reference genome, tracks, and view locus, as well as the user interface.

IGV browser configurations

Below are two simple IGV browser configuration files, igv-options-interact.json and igv-options.json. Copy both files to a clean directory on your machine. Note that each options file references publicly available genome sequence data from the Broad Institute, hosted in Amazon S3:

Three-track interactive genome sequences in BEDPE and bigWig formats.

Single track interactive genome sequence annotation.
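
As a rough sketch of the general shape of such a file, the Python below builds a single-track configuration with the three parts described above (reference genome, view locus, and tracks) and writes it as JSON. The genome identifier, locus, track name, and URL are placeholder assumptions chosen for illustration, not the values from the files referenced in this article, and the helper names are hypothetical.

```python
import json
from pathlib import Path


def build_igv_options(genome, locus, tracks):
    """Return a dict shaped like an IGV browser configuration."""
    return {"genome": genome, "locus": locus, "tracks": list(tracks)}


def write_igv_options(path, options):
    """Serialize the configuration to a JSON file and return its path."""
    target = Path(path)
    target.write_text(json.dumps(options, indent=2) + "\n", encoding="utf-8")
    return target


# Placeholder values for illustration only; substitute your own genome,
# locus, and track URL.
EXAMPLE_OPTIONS = build_igv_options(
    genome="hg19",
    locus="chr1:155,000,000-155,500,000",
    tracks=[
        {
            "name": "Example annotations",
            "type": "annotation",
            "format": "bed",
            "url": "https://example.com/path/to/annotations.bed",
        }
    ],
)

if __name__ == "__main__":
    write_igv_options("igv-options.json", EXAMPLE_OPTIONS)
```

Generating the file from Python rather than editing it by hand keeps the options close to the notebook or script that produced the underlying data.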

Note that Quilt supports the following three conventions to reference URLs in an IGV options file:

HTTPS links (https://)
S3 links (s3://)
Package-relative links (./your/path/to/data/)
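
To make the three conventions concrete, here is a small sketch with one track entry per style. All bucket names, hosts, and paths are invented placeholders, and url_convention is a local helper written for this example, not part of Quilt or IGV.

```python
# The three URL styles listed above, as string prefixes.
SUPPORTED_PREFIXES = ("https://", "s3://", "./")


def url_convention(url):
    """Return the prefix a track URL uses, or raise if it matches none."""
    for prefix in SUPPORTED_PREFIXES:
        if url.startswith(prefix):
            return prefix
    raise ValueError(f"unsupported track URL: {url!r}")


# Hypothetical tracks, one per convention (all locations are placeholders).
EXAMPLE_TRACKS = [
    {"name": "Over HTTPS", "url": "https://example.com/data/sample.bed"},
    {"name": "From S3", "url": "s3://example-bucket/data/sample.bigwig"},
    {"name": "Inside this package", "url": "./data/sample.bedpe"},
]

if __name__ == "__main__":
    for track in EXAMPLE_TRACKS:
        print(track["name"], "uses", url_convention(track["url"]))
```

Package-relative links are convenient when the track data travels with the options file in the same package, since the reference does not depend on a particular bucket.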

quilt_summarize.json

Now for the major time-saving step. Create a third file on your machine, quilt_summarize.json, with the following contents:

quilt_summarize.json for IGV genome tracks

quilt_summarize.json specifies how the contents of your package display in the Quilt web catalog. Notice the "types": ["igv"] line that tells the Quilt catalog to render the corresponding JSON files as genome tracks.
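
A minimal sketch of producing such a file from Python follows. It assumes one entry per options file, each with a "path" and the "types": ["igv"] hint described above; the exact contents of the file used in this article may differ, and the helper names are hypothetical.

```python
import json
from pathlib import Path


def igv_summary_entries(options_files):
    """Build one summary entry per IGV options file, tagged for IGV rendering."""
    return [{"path": name, "types": ["igv"]} for name in options_files]


def write_quilt_summarize(directory, options_files):
    """Write quilt_summarize.json into the package directory and return its path."""
    target = Path(directory) / "quilt_summarize.json"
    entries = igv_summary_entries(options_files)
    target.write_text(json.dumps(entries, indent=2) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    # The two options files named earlier in this article.
    write_quilt_summarize(".", ["igv-options-interact.json", "igv-options.json"])
```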

We now have the three files above in a directory on our local machine:

You can now quilt3 push your working directory to S3 as follows:

Single quilt3 push command to upload a new IGV data package. Line continuations (\) added for readability.
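
For readers who drive the upload from a script, the sketch below assembles an equivalent quilt3 push invocation as an argument list. The bucket, package name, and commit message are placeholders, and the flags shown (--dir, --registry, --message) are assumed from the quilt3 command-line interface; check quilt3 push --help for your installed version before relying on them.

```python
import shlex


def quilt3_push_command(directory, registry, package_name, message):
    """Assemble the argument list for a quilt3 push of a local directory."""
    return [
        "quilt3", "push",
        "--dir", directory,
        "--registry", registry,
        "--message", message,
        package_name,
    ]


if __name__ == "__main__":
    # Placeholder bucket and package name; replace with your own.
    command = quilt3_push_command(
        directory=".",
        registry="s3://example-bucket",
        package_name="examples/igv-tracks",
        message="Add IGV genome tracks",
    )
    # Print the command for review. To run it, pass the list to
    # subprocess.run(command, check=True) with AWS credentials configured.
    print(shlex.join(command))
```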

If your company runs its own Quilt web catalog, you will see IGV visualizations that resemble this example of the above package with genome tracks on open.quiltdata.com. (If you’re not yet running the Quilt web catalog, we plan to support IGV in future versions of the open source quilt3 catalog that you can use to browse and preview data in S3.)

You can explore genome tracks and the JSON files as shown below. The first screencast demonstrates three sequence tracks (variant, alignment, and annotation):

Interactive exploration of a genome sequence

You can use the “View as” menu, upper right, to view the data as JSON or an IGV visualization.

Screencast of an interactive exploration of a BAM file (.bam)

The final screencast demonstrates multiple loci displayed side-by-side in split panes within the IGV visualization plugin:

Screencast of an interactive exploration of a multi-locus IGV visualization

Learn more

Post questions in the comments or visit https://quiltdata.com/ to learn more about integrating wet and dry science datasets with Quilt data packages.

Appendix: Visualization in the web catalog

The Quilt Data web catalog facilitates dynamic reporting, currently with native support for the following libraries for visualization, exploration, and presentation:

Vega
Vega-Lite
ECharts
Voila (Developer preview)
Perspective

Previewing genome sequence files

The Quilt web catalog currently supports visual preview of large FASTQ, SAM, and VCF files.