The JSON format in which germline sets are distributed has been updated (germline sets are also available from OGRDB in FASTA format, but the JSON format provides much richer information). The revised format is compliant with the latest development version of the AIRR schema, which is expected to be released as an update in the next few weeks.
The key changes are as follows:
– In AlleleDescription, coding_sequence is no longer IMGT-gapped. This is intended to make AlleleDescription neutral to specific delineations.
– OGRDB will always provide a SequenceDelineationV for the IMGT delineation. SequenceDelineationV now includes both an unaligned_sequence and an aligned_sequence. The IMGT gapped sequence can therefore be found in AlleleDescription.SequenceDelineationV.aligned_sequence
– The co-ordinates in SequenceDelineationV now refer to unaligned_sequence rather than