Genomic data in VDJbase

Genomic data is organised into ‘genomic sets’: typically a genomic set will contain data relating to a single locus of a species. At the moment, VDJbase holds genomic data for Rhesus Macaque but not for Human. When entering the Genomic pages, be sure to select Rhesus Macaque as the species at the top left-hand corner of the page, otherwise the listing will be blank.

The Records page lists all assemblies and contigs for which VDJbase has annotations. Click on a record name for details. Records are currently annotated by an in-house tool called Digger: to see the features annotated in a particular record, click on the folder in the Reports column, and then click on Features.

The Genes page lists genes discovered in the genomic set. The number of assemblies or contigs in which the gene is found is shown in the Appearances column. To see those records, click on the number: this will take you to a filtered view of the Records page.

Macaque sequence naming follows Bernat et al., 2021. This is essentially the naming scheme of Cirelli et al., 2019, but displayed in an IMGT-like format. Current IMGT names for sequences listed at IMGT are also shown. Macaque AIRR-seq data on VDJbase uses the same naming, hence genomic and AIRR-seq samples can be directly compared.

As with AIRR-seq data, you can use the filters at the top of the Records and Genes pages to refine your selection. When a selection is made in the Genes page, the Records page will display a box ‘Only records with selected genes’. Clicking this box will restrict the view to show only records containing the selected genes. The same process can be used in reverse to show only genes listed in selected records. Click ‘clear filters’ to clear selections.
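For readers who think in code, the relationship between the Appearances column and the two-way filtering can be sketched roughly as follows. This is only an illustration and not VDJbase's implementation: the record and gene names are invented, the function names are hypothetical, and it assumes that a record matches when it contains at least one of the selected genes.

```python
# Hypothetical data: record name -> set of gene names annotated in it.
# These names are invented for illustration and are not VDJbase data.
records = {
    "assembly_A": {"GENE1", "GENE2"},
    "assembly_B": {"GENE2", "GENE3"},
    "contig_C": {"GENE3"},
}


def appearances(records):
    """Count, for each gene, the number of records in which it is found."""
    counts = {}
    for genes in records.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    return counts


def records_with_selected_genes(records, selected_genes):
    """Return the sorted names of records containing any selected gene.

    Assumption: a record matches if it holds at least one selected gene.
    """
    selected = set(selected_genes)
    return sorted(name for name, genes in records.items() if genes & selected)


def genes_in_selected_records(records, selected_records):
    """Return the set of genes listed in any of the selected records."""
    found = set()
    for name in selected_records:
        found |= records.get(name, set())
    return found


if __name__ == "__main__":
    print(appearances(records))
    print(records_with_selected_genes(records, {"GENE1"}))
    print(genes_in_selected_records(records, ["contig_C"]))
```

In this sketch, clicking a number in the Appearances column corresponds to calling records_with_selected_genes with a single gene.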

The gene browser uses the Broad Institute’s IGV Web App to provide a view of a single assembly or contig, which you select using the boxes at the top of the screen. You can zoom in to the view by double-clicking on an annotated element (shown in blue). To move to a detailed view of a particular gene, type its name, or a part of its name, into the search box to the left of the magnifying glass, and then click the glass or press Enter.

When an annotated gene is shown in sufficient detail, the Samples track shows an alignment of the sequence determined by the annotation tool from the assembly. Usually this will be in grey, indicating that the annotated sequence agrees with the assembly. The Refs track shows all alleles of the gene found in the reference assembly for this species.
