Creating a select set

A select set is a set of reads taken from a single repertoire that directly support a specific inference.

In outline, the process to create the set is as follows:

  • For a paired-end-read dataset, merge all paired-end reads
  • Align reads to the inferred allele reference
  • Filter the alignments, keeping only reads with at least 96% identity to the reference
  • Filter the remaining reads, keeping only those that carry the novel allele’s SNPs
  • Create a select set matching the filtered reads (for a paired-end-read dataset, this should consist of the original, unmerged reads rather than the merged sequences)
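
The two filtering steps above can be sketched as follows. This is a minimal illustration, not the script referred to in this section. It assumes each read has already been aligned to the inferred allele reference and is available as a pair of equal-length gapped strings (read and reference, with "-" for gaps). The function names, the `snps` mapping (1-based reference position to the base expected in the novel allele), and the way identity is calculated (matching columns divided by aligned columns) are all assumptions made for the example; your aligner may define identity differently.

```python
from typing import Dict, Iterable, List, Tuple


def percent_identity(aligned_read: str, aligned_ref: str) -> float:
    """Percentage of aligned columns in which the read matches the reference.

    Assumption: both strings come from a pairwise alignment, have equal
    length, and use '-' for gaps. Gap columns count as mismatches.
    """
    if len(aligned_read) != len(aligned_ref):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (r, f)
        for r, f in zip(aligned_read.upper(), aligned_ref.upper())
        if not (r == "-" and f == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for r, f in columns if r == f and r != "-")
    return 100.0 * matches / len(columns)


def carries_snps(aligned_read: str, aligned_ref: str, snps: Dict[int, str]) -> bool:
    """True if the read has the expected base at every SNP position.

    `snps` maps a 1-based, ungapped reference position to the base
    expected in the novel allele (a hypothetical input format).
    """
    observed: Dict[int, str] = {}
    ref_pos = 0
    for r, f in zip(aligned_read.upper(), aligned_ref.upper()):
        if f == "-":
            continue  # insertion in the read: no reference position consumed
        ref_pos += 1
        if ref_pos in snps:
            observed[ref_pos] = r
    return all(observed.get(pos) == base.upper() for pos, base in snps.items())


def select_read_ids(
    alignments: Iterable[Tuple[str, str, str]],
    snps: Dict[int, str],
    min_identity: float = 96.0,
) -> List[str]:
    """Return the IDs of reads that pass both the identity and SNP filters.

    Each alignment is a (read_id, aligned_read, aligned_ref) tuple.
    """
    return [
        read_id
        for read_id, aligned_read, aligned_ref in alignments
        if percent_identity(aligned_read, aligned_ref) >= min_identity
        and carries_snps(aligned_read, aligned_ref, snps)
    ]
```

The returned IDs identify the reads to include in the select set. For a paired-end-read dataset, they would be used to pull the corresponding original reads from the input files.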

A script and an associated Docker image have been created to perform these steps. You can use the script directly, adapt it, or follow your own procedure if you prefer.