You can only upload sequences if you have created an account and are part of a group.
Before starting the upload process, ensure your data is correctly formatted. Every sequence must have a unique ID that can be used to link it with its metadata entry. Please note that terminal Ns will be automatically removed during sequence preprocessing and will not be included in the submitted sequences.
The expected data format is as follows:
Sequences must be in fasta format with a unique fasta ID per sequence. The fasta ID is the start of the header up to, but excluding, the first whitespace character. For example, the fasta header >seq_12 has fasta ID seq_12.
Metadata can be provided as a tsv file; xlsx files are also accepted. Metadata and sequences will be matched using the id column in the metadata (i.e. the sequence with fasta ID seq_12 will be joined with the metadata entry whose id is seq_12). You can also provide an additional metadata field called fastaIds containing a space-separated list of fasta IDs to link multiple sequences to a single submission, e.g. seq_12_A seq_12_B. This can be used, for example, when submitting multi-segmented pathogens.
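Before uploading, it can help to check locally that every fasta ID has a matching metadata row. The following is a minimal sketch, not part of Pathoplexus: the function names are hypothetical, it assumes a tsv metadata file whose header contains an id column, and it only covers the simple case where the fastaIds field is not used.

```shell
# Print the fasta ID of every record: the header text after '>' up to,
# but excluding, the first whitespace character.
fasta_ids() {
  awk '/^>/ { sub(/^>/, ""); print $1 }' "$1"
}

# Print the values of the "id" column of a tab-separated metadata file.
metadata_ids() {
  awk -F '\t' 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "id") col = i; next }
               col { print $col }' "$1"
}

# Print fasta IDs that have no metadata row with the same id.
# Arguments: FASTA file, metadata tsv file.
fasta_ids_without_metadata() {
  ids_file=$(mktemp)
  metadata_ids "$2" > "$ids_file"
  # grep exits non-zero when nothing is printed, which is the good case here
  fasta_ids "$1" | grep -Fxv -f "$ids_file" || true
  rm -f "$ids_file"
}
```

Running `fasta_ids_without_metadata sequences.fasta metadata.tsv` prints nothing when every sequence can be matched; any ID it prints needs a metadata row (or a fix to the fasta header) before upload.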

The files can also be compressed: accepted formats are .zst, .gz, .zip and .xz.
You can try out uploading sequences to our Demo Instance - it works just like the ‘real’ Pathoplexus, but is wiped regularly and no data is sent onward to INSDC. We also have some example data you can upload to the Demo Instance.
Multi-segmented pathogens must have one unique id per isolate (i.e. one per pathogen sample containing all segments). Each segment will be a unique entry in the FASTA file with its own FASTA ID. Metadata is uploaded per isolate, meaning there will be a single metadata row per id. This row should include a fastaIds field listing all segment fasta IDs, separated by spaces.
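As an illustration of this layout, the sketch below writes a tiny example for one isolate with two segments and checks that every ID listed in fastaIds exists in the FASTA file. The isolate name, segment suffixes, sequences, and helper function names are all made up for the example, and a real metadata file would need further columns.

```shell
# Hypothetical two-segment isolate: one metadata row, two FASTA records.
demo_dir=$(mktemp -d)
printf 'id\tfastaIds\n' > "$demo_dir/metadata.tsv"
printf 'isolate_1\tisolate_1_L isolate_1_S\n' >> "$demo_dir/metadata.tsv"
printf '>isolate_1_L\nACGTACGT\n>isolate_1_S\nTTGGCCAA\n' > "$demo_dir/sequences.fasta"

# Print every fasta ID named in the fastaIds column, one per line.
listed_fasta_ids() {
  awk -F '\t' 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "fastaIds") col = i; next }
               col { n = split($col, ids, " "); for (j = 1; j <= n; j++) print ids[j] }' "$1"
}

# Print the fasta IDs actually present in the FASTA file.
present_fasta_ids() {
  awk '/^>/ { sub(/^>/, ""); print $1 }' "$1"
}

# Report fasta IDs that the metadata lists but the FASTA file lacks.
listed_fasta_ids "$demo_dir/metadata.tsv" | while read -r fasta_id; do
  present_fasta_ids "$demo_dir/sequences.fasta" | grep -Fxq "$fasta_id" ||
    echo "missing from FASTA: $fasta_id"
done
```

For the consistent example files above, the final loop prints nothing.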
Uploading sequences via the website is an easy way to submit sequences without having to worry about any code.
Drag and drop a fasta file with the sequences and a metadata file with the associated metadata into the box on the website, or click the ‘Upload a file’ link within the boxes to open a file-selection box.
The data will now be processed, and you will have to approve your submission before it is finalized. You can see how to do this here.
Pathoplexus currently only accepts consensus sequence submissions. If you wish to upload raw reads, you can do so directly through the INSDC submission portal.
To ensure your raw reads are linked to your consensus sequence in the INSDC, both should be associated with the same BioSample and BioProject at the time of submission. We suggest you submit consensus sequences first to ensure metadata consistency.
Submitting the Consensus Sequence First (via Pathoplexus): After submitting your consensus sequence to Pathoplexus, use the biosample and bioproject accessions we provide (e.g., Bioproject Accession: PRJEB80643, Biosample Accession: SAMEA116354847) when submitting your raw reads to the INSDC.
Submitting Raw Reads First (via INSDC): If you submit raw reads to the INSDC first, create a biosample and bioproject during the upload process. Then, provide the raw reads accession in the metadata.tsv (e.g., insdcRawReadsAccession=SRR27477368) when submitting your consensus sequence to Pathoplexus. This allows us to link your consensus sequence to the raw reads in the INSDC.
Please contact us at submission@pathoplexus.org if you have any questions about submitting raw reads.
To use the demo instance instead of the main instance, please replace backend.pathoplexus.org with backend-demo.pathoplexus.org.
By using our API you agree to our Data Use Terms.
It is currently possible to upload sequences through an HTTP API. We also plan to release a command-line interface.
To upload sequences through the HTTP API you will need to:
To upload sequences with the open use terms: https://backend.pathoplexus.org/<organism>/submit?groupId=<group id>&dataUseTermsType=OPEN
To upload sequences with the restricted use terms: https://backend.pathoplexus.org/<organism>/submit?groupId=<group id>&dataUseTermsType=RESTRICTED&restrictedUntil=<restricted-until-date>
API upload is available for all pathogens on Pathoplexus. You can find the correct term to use in place of <organism> in the URL shown when you browse sequences for that pathogen. For example, for West Nile Virus, the URL is https://pathoplexus.org/west-nile/search? and thus <organism> is west-nile.
The restricted-until date must be provided in the ISO format (e.g., 2024-08-27).
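The two URL forms can be assembled with a small helper. This is only a sketch: the function name `submit_url` and the `PATHOPLEXUS_BACKEND` environment variable are assumptions made for the example, and the date check only tests the shape of the string, not whether it is a valid calendar date.

```shell
# Build the submission URL.
# Arguments: organism, group ID, optional restricted-until date (YYYY-MM-DD).
# Without a date the open use terms are selected.
# PATHOPLEXUS_BACKEND (hypothetical) can point at the demo backend instead.
submit_url() {
  base="${PATHOPLEXUS_BACKEND:-https://backend.pathoplexus.org}"
  if [ -z "${3:-}" ]; then
    printf '%s/%s/submit?groupId=%s&dataUseTermsType=OPEN\n' "$base" "$1" "$2"
    return 0
  fi
  case "$3" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
    *) echo "restricted-until date must look like 2024-08-27" >&2; return 1 ;;
  esac
  printf '%s/%s/submit?groupId=%s&dataUseTermsType=RESTRICTED&restrictedUntil=%s\n' \
    "$base" "$1" "$2" "$3"
}
```

For example, `submit_url west-nile 1 2024-08-27` prints the restricted-use URL for a hypothetical group with ID 1, and setting `PATHOPLEXUS_BACKEND=https://backend-demo.pathoplexus.org` switches the output to the demo instance.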
The header should contain:
Authorization: Bearer <authentication-token>
Content-Type: multipart/form-data
The request body should contain the FASTA and metadata TSV files with the keys sequenceFile and metadataFile.
With cURL, the corresponding command for sending the POST request can be:
curl -X 'POST' \
'https://backend.pathoplexus.org/<organism>/submit?groupId=<group id>&dataUseTermsType=OPEN' \
-H 'accept: application/json' \
-H 'Authorization: Bearer <authentication token>' \
-H 'Content-Type: multipart/form-data' \
-F 'metadataFile=@<metadata file name>' \
-F 'sequenceFile=@<fasta file name>'
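For repeated uploads, the command above could be wrapped in a small shell function. The sketch below is one possible way to do this, not an official tool: the function name `submit_sequences` and the `PATHOPLEXUS_TOKEN` environment variable are assumptions. The explicit Content-Type header is left out here because cURL sets a multipart content type on its own when `-F` is used, and `--fail` makes cURL return a non-zero exit code on HTTP errors.

```shell
# Hypothetical wrapper around the cURL call for open-use submissions.
# Arguments: organism, group ID, metadata file, FASTA file.
# The token is read from PATHOPLEXUS_TOKEN (an assumed variable name);
# the shell stops with an error message if it is unset.
submit_sequences() {
  curl --fail -X POST \
    "https://backend.pathoplexus.org/$1/submit?groupId=$2&dataUseTermsType=OPEN" \
    -H 'accept: application/json' \
    -H "Authorization: Bearer ${PATHOPLEXUS_TOKEN:?set PATHOPLEXUS_TOKEN first}" \
    -F "metadataFile=@$3" \
    -F "sequenceFile=@$4"
}
```

A call would then look like `submit_sequences west-nile <group id> metadata.tsv sequences.fasta`. Keeping the token in an environment variable avoids typing it into each command.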
Further information can be found in our Swagger API documentation.
As with the website, data will now be processed, and you will have to approve your submission before it is finalized. You can see how to do this here.