OGRDB – Depositing records in ENA repositories

At the end of this deposition process, ENA will contain two types of record for use in your submission:

  • A sequence record for each inferred sequence you intend to submit.
  • One or more select sets for each inferred sequence.

The deposition process for each type is detailed below. As sequence records refer to select sets, you should create the select sets first.

General introduction to ENA submission

For general help, please refer to the ENA documentation and the training module for submission.

Creating the study

If you or your lab submitted the repertoire(s) on which the inferences are based, log on using the account that was used to submit them. You will then be able to attach the select sets to that study.

Otherwise, create a new study to hold the select sets, referencing the accession number of the study holding the repertoires in its title, for example ‘Novel alleles inferred from Study xxx, and associated reads’.

Submitting the select sets

Upload the FASTQ files for all the select sets to ENA. We recommend FileZilla for the transfer. Your Webin user name is shown in the top right-hand corner of the ENA browser after you log in.
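For scripted uploads, the same transfer can be sketched in Python with the standard library's ftplib module. This is a minimal sketch, not an ENA-provided tool: the host name webin2.ebi.ac.uk, the use of FTP over TLS, and the function names are assumptions that should be checked against the current ENA documentation. It also records an MD5 checksum for each file, which can be useful for confirming later that a file was not corrupted in transit.

```python
import hashlib
from ftplib import FTP_TLS
from pathlib import Path

# Assumed host name; confirm it in the current ENA documentation.
ENA_FTP_HOST = "webin2.ebi.ac.uk"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def upload_fastq_files(paths, webin_user, password, host=ENA_FTP_HOST):
    """Upload each file to the upload area and return {file name: md5}."""
    checksums = {}
    ftp = FTP_TLS(host)
    try:
        ftp.login(webin_user, password)
        ftp.prot_p()  # encrypt the data channel as well as the control channel
        for path in map(Path, paths):
            checksums[path.name] = md5_of_file(path)
            with open(path, "rb") as handle:
                ftp.storbinary(f"STOR {path.name}", handle)
    finally:
        ftp.close()
    return checksums
```

A call such as upload_fastq_files(["select_set_1.fastq.gz"], "Webin-XXXXX", password) would transfer one file; the file name and user name here are placeholders.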

Log in to the ENA browser. Select ‘Submit to ENA’ from the Submit dropdown. Click the ‘submit to ENA interactively’ button. Click the New Submission tab and select ‘Submit sequence reads and experiments’.

Select the associated study (either the study holding the repertoire sequences, or the new one you created) and click ‘next’.

Click ‘skip’ on the next screen.

Either complete the table interactively (one row per select set), or download the template, fill it in and upload it. In the Sample reference field, enter the accession number of the associated sample from the repertoire study.
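If you fill in the template offline, a short script can help keep the sheet to one row per select set. The sketch below assumes the template is tab-separated with a single header row. The function name and the column names in the usage note are placeholders, so substitute the headers from the template you actually downloaded.

```python
import csv
import io


def build_select_set_table(header, select_sets):
    """Return tab-separated text with one row per select set.

    header: column names copied from the downloaded template.
    select_sets: list of dicts mapping column name to value.
    """
    # Reject any key that is not a column of the template.
    unknown = {key for row in select_sets for key in row} - set(header)
    if unknown:
        raise ValueError(f"columns not in template: {sorted(unknown)}")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=header, delimiter="\t",
                            lineterminator="\n", restval="")
    writer.writeheader()
    for row in select_sets:
        writer.writerow(row)  # columns missing from a row are left empty
    return buffer.getvalue()
```

For example, build_select_set_table(["sample", "file_name"], rows) returns the text to save as the completed sheet, where each dict in rows describes one select set and "sample" and "file_name" stand in for the real template headers.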

Click ‘submit’. The select set records will be created. An example record can be seen under accession ERR6590597.

Entering the sequence records

The sequence records can be uploaded in bulk, using a sheet that you create. The sheet must be in tab-separated format and must contain the following columns. The INFERACC column must hold the accession number of the associated select set created in the previous step.
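Before uploading, it can help to check the sheet mechanically. The sketch below verifies only what is stated above: that the sheet is tab-separated, that it has an INFERACC column, and that every row carries a plausible value there. The run-accession pattern (ERR, SRR or DRR followed by digits) is an assumption modelled on the example accession ERR6590597, and skipping lines that start with '#' is likewise an assumption.

```python
import csv
import io
import re

# Assumed shape of a run accession, e.g. ERR6590597.
RUN_ACCESSION = re.compile(r"^[EDS]RR\d{6,}$")


def check_sequence_sheet(text):
    """Return a list of problems found in a tab-separated sequence sheet."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    # Drop blank lines and (assumed) comment lines starting with '#'.
    rows = [row for row in reader if row and not row[0].startswith("#")]
    if not rows:
        return ["sheet is empty"]
    header = rows[0]
    if "INFERACC" not in header:
        return ["missing INFERACC column"]
    column = header.index("INFERACC")
    problems = []
    # Row numbers count the retained lines, with the header as row 1.
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {number}: expected {len(header)} fields, found {len(row)}")
        elif not RUN_ACCESSION.match(row[column].strip()):
            problems.append(
                f"row {number}: INFERACC '{row[column]}' is not a run accession")
    return problems
```

An empty list means no problems were found by these checks. It does not mean that ENA will accept the sheet, since the remaining required columns are not examined.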

In the ENA browser, click the New Submission tab and select ‘submit other assembled and annotated sequences’. Click ‘next’.

Select the study as before, and click ‘next’.

Upload the sheet you created and click ‘submit completed spreadsheet’.

After submission you will see the following screen. The final accession numbers will be sent to you by email. An example sequence definition record can be seen under accession OU596101 (view it in EMBL format to see the full detail, including the /inference link to the select set).
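To check the /inference link without reading the flat file by eye, the qualifier can be extracted from the EMBL-format text. This is a rough sketch under stated assumptions: feature-table lines begin with the line code FT, and the qualifier value is enclosed in double quotes. The function name is illustrative.

```python
import re


def inference_values(embl_text):
    """Return the values of all /inference qualifiers in EMBL flat-file text."""
    # Keep feature-table lines only, dropping the 'FT' code and padding.
    feature_lines = [line[2:].strip() for line in embl_text.splitlines()
                     if line.startswith("FT")]
    # Join the lines so a value wrapped over several lines becomes one string.
    # Wrapped values gain a space at each line break, which may need removing.
    joined = " ".join(feature_lines)
    return re.findall(r'/inference="([^"]*)"', joined)
```

Calling inference_values(text) on the EMBL-format view of a record, saved from the browser, should list the select set accession(s) the record refers to.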