Format Overview For Developers
Python implementation
LinkML generates a datamodel file with all classes: https://github.com/PlasmoGenEpi/portable-micorhaplotype-object/blob/main/src/plasmo_tar_amp_schema/datamodel/plasmo_tar_amp_schema.py
Tools
Development has started on a tool called pmotools. The Python implementation can be found here: https://github.com/PlasmoGenEpi/pmotools-python/tree/develop
Format overview
Below is an overview of the entire format, which is under active development and optimization. Please send questions to info@plasmogenepi.org
PortableMicrohaplotypeObject
https://plasmogenepi.github.io/portable-microhaplotype-object/PortableMicrohaplotypeObject/
Required
- library_sample_info (type=list of LibrarySampleInfo)
- a list of all the sequencing/amplification libraries of the specimens within this PMO file
- specimen_info (type=list of SpecimenInfo)
- a list of all the specimens within this PMO file
- panel_info (type=list of PanelInfo)
- a list of info on the panels
- target_info (type=list of TargetInfo)
- a list of info on the targets
- representative_microhaplotypes (type=RepresentativeMicrohaplotypes)
- a list of the information on the representative microhaplotypes
- detected_microhaplotypes (type=list of DetectedMicrohaplotypes)
- the microhaplotypes detected in this project
- pmo_header (type=PmoHeader)
- the PMO information for this file including version etc
Optional
- bioinformatics_methods_info (type=list of BioinformaticsMethodInfo)
- the bioinformatics pipeline/methods used to generate the microhaplotype analysis for this project
- bioinformatics_run_info (type=list of BioinformaticsRunInfo)
- the runtime info for the bioinformatics pipeline used to generate the microhaplotype analysis for this project
- project_info (type=list of ProjectInfo)
- the information about the projects stored in this PMO
- read_counts_by_stage (type=list of ReadCountsByStage)
- the read counts for library_samples for different stages of the pipeline
- sequencing_info (type=list of SequencingInfo)
- a list of sequencing infos for this PMO file
- targeted_genomes (type=list of GenomeInfo)
- a list of genomes that any genomic location information refers to
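To illustrate the Required/Optional split above, here is a minimal Python sketch that checks a loaded PMO dictionary for the required top-level fields. The helper name `missing_required_fields` is hypothetical and is not part of pmotools; the field names are copied from the Required list above.

```python
import json

# Required top-level fields, copied from the Required list above.
REQUIRED_FIELDS = (
    "library_sample_info",
    "specimen_info",
    "panel_info",
    "target_info",
    "representative_microhaplotypes",
    "detected_microhaplotypes",
    "pmo_header",
)


def missing_required_fields(pmo: dict) -> list:
    """Return the required top-level fields that are absent from a PMO dict."""
    return [field for field in REQUIRED_FIELDS if field not in pmo]


# A skeleton PMO that leaves out pmo_header on purpose.
skeleton = json.loads("""
{
  "target_info": [],
  "panel_info": [],
  "specimen_info": [],
  "library_sample_info": [],
  "representative_microhaplotypes": {},
  "detected_microhaplotypes": []
}
""")

print(missing_required_fields(skeleton))  # prints ['pmo_header']
```

This only checks that the keys are present; it does not validate the contents of each field against the schema.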
Example
Code
{
  "pmo_header": {},
  "targeted_genomes": [],
  "target_info": [],
  "panel_info": [],
  "sequencing_info": [],
  "project_info": [],
  "specimen_info": [],
  "library_sample_info": [],
  "bioinformatics_methods_info": [],
  "bioinformatics_run_info": [],
  "representative_microhaplotypes": {},
  "detected_microhaplotypes": [],
  "read_counts_by_stage": []
}
Code
{
  "pmo_header": {},
  "target_info": [],
  "panel_info": [],
  "specimen_info": [],
  "library_sample_info": [],
  "representative_microhaplotypes": {},
  "detected_microhaplotypes": []
}
Code
{
  "pmo_header": {
    "pmo_version": "1.1.0",
    "creation_date": "2026-05-15",
    "generation_method": {
      "program_name": "pmotools-python",
      "program_version": "1.1.0"
    }
  },
  "library_sample_info": [
    {
      "library_sample_name": "SRR30825770",
      "specimen_id": 0,
      "panel_id": 0
    },
    ....
  ],
  "specimen_info": [
    {
      "specimen_name": "SRR30825770"
    },
    ....
  ],
  "panel_info": [
    {
      "panel_name": "staph_aureus_Furstenau2025",
      "reactions": [
        {
          "reaction_name": "full",
          "panel_targets": [
            0,
            1,
            ....
          ]
        }
      ]
    }
  ],
  "target_info": [
    {
      "target_name": "SA_131432",
      "forward_primer": {
        "seq": "GTCCAGGTAGCATGATT"
      },
      "reverse_primer": {
        "seq": "TGTCATACCAGTTAGGAATCACA"
      }
    },
    ....
  ],
  "representative_microhaplotypes": {
    "targets": [
      {
        "microhaplotypes": [
          {
            "seq": "CAATATAATAACCTAATAAAATGTTTAGGTCAACCTAAATTTATTTTAATTTTTTTAAAAGTATGAATTATTATGTTGAACGTTCGATTTAATATTAAGAAAAA"
          }
        ],
        "target_id": 8
      },
      ....
    ]
  },
  "detected_microhaplotypes": [
    {
      "library_samples": [
        {
          "library_sample_id": 1,
          "target_results": [
            {
              "mhaps_target_id": 0,
              "mhaps": [
                {
                  "mhap_id": 6,
                  "reads": 428
                }
              ]
            },
            ....
          ]
        },
        ....
      ]
    },
    ....
  ]
}
BioMethod
https://plasmogenepi.github.io/portable-microhaplotype-object/BioMethod/
Required
- program_version (type=string)
- the version of the program, should be in the format v[MAJOR].[MINOR].[PATCH]
- program (type=string)
- name of the program used for this portion of the pipeline
Optional
- additional_argument (type=list of string)
- any additional arguments that differ from the default arguments
- program_description (type=string)
- a short description of what this method does
- program_url (type=string)
- a URL pointing to the code base of the program, e.g. a GitHub link
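The v[MAJOR].[MINOR].[PATCH] convention for program_version can be checked with a short regular expression. The sketch below is illustrative only: the helper name `parse_program_version` is hypothetical, and treating the leading "v" as optional is an assumption made because the examples in this document show versions both with it ("v2.6.5") and without it ("1.1.0").

```python
import re

# Pattern for [MAJOR].[MINOR].[PATCH]; the leading "v" is optional here
# (an assumption, since the examples in this document show both styles).
VERSION_PATTERN = re.compile(r"v?(\d+)\.(\d+)\.(\d+)")


def parse_program_version(version: str) -> tuple:
    """Split a version string into (major, minor, patch) integers."""
    match = VERSION_PATTERN.fullmatch(version)
    if match is None:
        raise ValueError(f"not in v[MAJOR].[MINOR].[PATCH] form: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


print(parse_program_version("v2.6.5"))  # prints (2, 6, 5)
```

Returning integers rather than strings makes versions comparable as tuples, e.g. (2, 10, 0) sorts after (2, 6, 5).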
Example
Code
{
  "program": "SeekDeep extractorPairedEnd",
  "program_description": "Takes raw paired-end reads and demultiplexes on primers and does QC filtering",
  "program_version": "v2.6.5"
}
BioinformaticsMethodInfo
https://plasmogenepi.github.io/portable-microhaplotype-object/BioinformaticsMethodInfo/
Required
- methods (type=list of BioMethod)
- methodology used to generate the microhaplotype data stored in this PMO, e.g. a demultiplexing method, a denoising method, or a pipeline method that ties all the steps together
Example
Code
{
  "methods": [
    {
      "program": "SeekDeep extractorPairedEnd",
      "program_description": "Takes raw paired-end reads and demultiplexes on primers and does QC filtering",
      "program_version": "v2.6.5"
    },
    {
      "additional_argument": [
        "--illumina",
        "--qualThres 25,20"
      ],
      "program": "SeekDeep qluster",
      "program_description": "Takes sequences per sample per target and clusters them",
      "program_version": "v2.6.5"
    },
    {
      "additional_argument": [
        "--strictErrors",
        "--illumina",
        "--removeOneSampOnlyOneOffHaps",
        "--excludeCommonlyLowFreqHaplotypes",
        "--excludeLowFreqOneOffs",
        "--rescueExcludedOneOffLowFreqHaplotypes"
      ],
      "program": "SeekDeep processClusters",
      "program_description": "Compare across samples for each target to create population level identifiers and do post artifact cleanup",
      "program_version": "v2.6.5"
    }
  ]
}
BioinformaticsRunInfo
https://plasmogenepi.github.io/portable-microhaplotype-object/BioinformaticsRunInfo/
Required
- bioinformatics_methods_id (type=integer)
- the index into the bioinformatics_methods_info list
- bioinformatics_run_name (type=string)
- a name for this run, must be unique to each run
Optional
- run_date (type=string)
- the date when the run was done, should be YYYY-MM-DD
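Because bioinformatics_methods_id is a list index rather than a name, reading a run means following that index into bioinformatics_methods_info. The sketch below shows one way this could be done; the function `summarize_run`, the run name, and the run date are hypothetical values chosen for illustration, while the program names come from the BioinformaticsMethodInfo example.

```python
from datetime import datetime


def summarize_run(pmo: dict, run_index: int) -> dict:
    """Resolve a run's bioinformatics_methods_id and check its run_date."""
    run = pmo["bioinformatics_run_info"][run_index]
    # bioinformatics_methods_id is a list index, not a name.
    methods = pmo["bioinformatics_methods_info"][run["bioinformatics_methods_id"]]
    run_date = run.get("run_date")
    if run_date is not None:
        # Raises ValueError if the string is not a YYYY-MM-DD calendar date.
        datetime.strptime(run_date, "%Y-%m-%d")
    return {
        "run_name": run["bioinformatics_run_name"],
        "run_date": run_date,
        "programs": [method["program"] for method in methods["methods"]],
    }


example_pmo = {
    "bioinformatics_methods_info": [
        {
            "methods": [
                {"program": "SeekDeep extractorPairedEnd", "program_version": "v2.6.5"},
                {"program": "SeekDeep qluster", "program_version": "v2.6.5"},
            ]
        }
    ],
    "bioinformatics_run_info": [
        {
            "bioinformatics_methods_id": 0,
            "bioinformatics_run_name": "example_run",  # hypothetical name
            "run_date": "2026-05-15",  # hypothetical date
        }
    ],
}

print(summarize_run(example_pmo, 0))
```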
Example
Code
DetectedMicrohaplotypes
https://plasmogenepi.github.io/portable-microhaplotype-object/DetectedMicrohaplotypes/
Required
- library_samples (type=list of DetectedMicrohaplotypesForSample)
- a list of the microhaplotypes detected for all samples with a list for each target
Optional
- bioinformatics_run_id (type=integer)
- the index into the bioinformatics_run_info list
Example
Code
DetectedMicrohaplotypesForSample
https://plasmogenepi.github.io/portable-microhaplotype-object/DetectedMicrohaplotypesForSample/
Required
- library_sample_id (type=integer)
- the index into the library_sample_info list
- target_results (type=list of DetectedMicrohaplotypesForTarget)
- a list of the microhaplotypes detected for a list of targets
Example
Code
{
  "library_sample_id": 1,
  "target_results": [
    {
      "mhaps": [
        {
          "mhap_id": 1,
          "reads": 2227
        },
        {
          "mhap_id": 2,
          "reads": 51
        }
      ],
      "mhaps_target_id": 98
    },
    ...
  ]
}
DetectedMicrohaplotypesForTarget
https://plasmogenepi.github.io/portable-microhaplotype-object/DetectedMicrohaplotypesForTarget/
Required
- mhaps_target_id (type=integer)
- the index for a target in the representative_microhaplotypes list
- mhaps (type=list of MicrohaplotypeForTarget)
- a list of the microhaplotypes detected for this target
Example
Code
GenomeInfo
https://plasmogenepi.github.io/portable-microhaplotype-object/GenomeInfo/
Required
- name (type=string)
- name of the genome
- genome_version (type=string)
- the genome version
- taxon_id (type=list of integer)
- the NCBI taxonomy number, can be a list of values if it’s a genome file that has been created by combining genomes from different species
- url (type=string)
- a link to where this genome file can be downloaded
Optional
- chromosomes (type=list of string)
- a list of the chromosomes/contigs found within this genome
- gff_url (type=string)
- a link to where this genome’s annotation file can be downloaded
Example
Code
{
  "genome_version": "GCF_000013425.1",
  "gff_url": "https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/013/425/GCF_000013425.1_ASM1342v1/GCF_000013425.1_ASM1342v1_genomic.gff.gz",
  "name": "NCTC8325",
  "taxon_id": [
    1280
  ],
  "url": "https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/013/425/GCF_000013425.1_ASM1342v1/GCF_000013425.1_ASM1342v1_genomic.fna.gz"
}
GenomicLocation
https://plasmogenepi.github.io/portable-microhaplotype-object/GenomicLocation/
Required
- genome_id (type=integer)
- the index to the genome in the targeted_genomes list that this location refers to
- chrom (type=string)
- the chromosome name
- start (type=integer)
- the start of the location, 0-based positioning
- end (type=integer)
- the end of the location, 0-based positioning
Optional
- alt_seq (type=string)
- a possible alternative sequence of this genomic location
- ref_seq (type=string)
- the reference sequence of this genomic location
- strand (type=string)
- which strand the location is on, either + for the plus strand or - for the minus strand
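The field descriptions say that start and end are 0-based but do not say whether end is inclusive. In the PrimerInfo example in this document, end minus start (849631 - 849607 = 24) equals the length of the 24-base primer sequence, which is consistent with a half-open [start, end) interval. The sketch below assumes that convention; the helper names and the toy chromosome string are hypothetical.

```python
# Complement table used to reverse-complement minus-strand locations.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def location_length(location: dict) -> int:
    """Length of a location, assuming a half-open [start, end) interval."""
    return location["end"] - location["start"]


def extract_sequence(chrom_seq: str, location: dict) -> str:
    """Slice a chromosome string at a location, honoring the strand field."""
    segment = chrom_seq[location["start"]:location["end"]]
    if location.get("strand") == "-":
        # Minus strand: complement each base, then reverse the result.
        segment = segment.translate(_COMPLEMENT)[::-1]
    return segment


# Location taken from the PrimerInfo example in this document.
primer_location = {
    "chrom": "LS483365.1",
    "end": 849631,
    "genome_id": 0,
    "start": 849607,
    "strand": "+",
}
print(location_length(primer_location))  # prints 24

toy_chromosome = "AACCGGTT"  # hypothetical sequence for illustration
print(extract_sequence(toy_chromosome, {"start": 2, "end": 5, "strand": "+"}))  # prints CCG
print(extract_sequence(toy_chromosome, {"start": 2, "end": 5, "strand": "-"}))  # prints CGG
```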
Example
Code
{
  "chrom": "LS483365.1",
  "end": 849631,
  "genome_id": 0,
  "start": 849607,
  "strand": "+"
}
LibrarySampleInfo
https://plasmogenepi.github.io/portable-microhaplotype-object/LibrarySampleInfo/
Required
- specimen_id (type=integer)
- the index into the specimen_info list
- panel_id (type=integer)
- the index into the panel_info list
- library_sample_name (type=string)
- a unique identifier for this sequencing/amplification run
Optional
- alternate_identifiers (type=list of string)
- a list of alternative names
- experiment_accession (type=string)
- ERA/SRA experiment accession number for the sample if it was submitted
- fastqs_loc (type=string)
- the location (URL or file path) of the fastqs for a library run
- library_prep_plate_info (type=PlateInfo)
- the plate location where the library was prepared for sequencing
- qpcr_parasite_density_info (type=list of ParasiteDensity)
- qPCR parasite density measurement for this extracted sample
- run_accession (type=string)
- ERA/SRA run accession number for the sample if it was submitted
- sequencing_info_id (type=integer)
- the index into the sequencing_info list
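Several LibrarySampleInfo fields are indexes into other top-level lists, so a library sample is usually read together with the specimen and panel it points to. The following sketch shows one possible way to resolve those indexes. The function name `describe_library_sample` and the fastq file names are hypothetical, and splitting fastqs_loc on ";" is an assumption based on the example in this section, which joins two paths with a semicolon.

```python
def describe_library_sample(pmo: dict, library_sample_id: int) -> dict:
    """Follow specimen_id and panel_id from one library sample entry."""
    library = pmo["library_sample_info"][library_sample_id]
    specimen = pmo["specimen_info"][library["specimen_id"]]
    panel = pmo["panel_info"][library["panel_id"]]
    # Assumption: multiple fastq locations are joined with ";".
    fastqs = [path for path in library.get("fastqs_loc", "").split(";") if path]
    return {
        "library_sample_name": library["library_sample_name"],
        "specimen_name": specimen["specimen_name"],
        "panel_name": panel["panel_name"],
        "fastqs": fastqs,
    }


example_pmo = {
    "specimen_info": [{"specimen_name": "SRR30825770"}],
    "panel_info": [{"panel_name": "staph_aureus_Furstenau2025", "reactions": []}],
    "library_sample_info": [
        {
            "library_sample_name": "SRR30825770",
            "specimen_id": 0,
            "panel_id": 0,
            # Hypothetical file names for illustration.
            "fastqs_loc": "reads_1.fastq.gz;reads_2.fastq.gz",
        }
    ],
}

print(describe_library_sample(example_pmo, 0))
```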
Example
Code
{
  "alternate_identifiers": [
    "85b498-Wk22-Nasal"
  ],
  "experiment_accession": "SRX26225188",
  "fastqs_loc": "ftp.sra.ebi.ac.uk/vol1/fastq/SRR308/078/SRR30825778/SRR30825778_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/SRR308/078/SRR30825778/SRR30825778_2.fastq.gz",
  "library_sample_name": "SRR30825778",
  "panel_id": 0,
  "run_accession": "SRR30825778",
  "sequencing_info_id": 0,
  "specimen_id": 0
}
MarkerOfInterest
https://plasmogenepi.github.io/portable-microhaplotype-object/MarkerOfInterest/
Required
- marker_location (type=GenomicLocation)
- the genomic location
Optional
- associations (type=list of string)
- a list of associations with this marker, e.g. SP resistance, etc
Example
Code
MaskingInfo
https://plasmogenepi.github.io/portable-microhaplotype-object/MaskingInfo/
Required
- seq_start (type=integer)
- the start of the masking
- seq_segment_size (type=integer)
- the size of the masking
- replacement_size (type=integer)
- the size of replacement mask
Optional
- masking_generation_description (type=string)
- a description of how the masking information was generated
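One plausible reading of these three fields is that the seq_segment_size bases starting at seq_start are replaced by replacement_size mask characters. The sketch below implements that reading only as an illustration: the 0-based seq_start, the "N" mask character, and the function name `apply_masking` are all assumptions that the field descriptions above do not confirm.

```python
def apply_masking(seq: str, masks: list, mask_char: str = "N") -> str:
    """Replace each masked segment with replacement_size mask characters.

    Assumes seq_start is 0-based and that masked segments do not overlap.
    """
    # Apply masks right to left so earlier coordinates stay valid even
    # when a replacement is shorter or longer than the segment it replaces.
    for mask in sorted(masks, key=lambda m: m["seq_start"], reverse=True):
        start = mask["seq_start"]
        stop = start + mask["seq_segment_size"]
        seq = seq[:start] + mask_char * mask["replacement_size"] + seq[stop:]
    return seq


# Hypothetical sequence: a six-base homopolymer run masked down to three Ns.
masked = apply_masking(
    "ACGTTTTTTTACGT",
    [{"seq_start": 4, "seq_segment_size": 6, "replacement_size": 3}],
)
print(masked)  # prints ACGTNNNACGT
```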
Example
Code
MicrohaplotypeForTarget
https://plasmogenepi.github.io/portable-microhaplotype-object/MicrohaplotypeForTarget/
Required
- mhap_id (type=integer)
- the index for a microhaplotype for a target in the representative_microhaplotypes list, e.g. representative_microhaplotypes[mhaps_target_id][mhap_id]
- reads (type=integer)
- the read count for this microhaplotype
Optional
- umis (type=integer)
- the unique molecular identifier (umi) count for this microhaplotype
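The lookup described for mhap_id can be written out against the nesting shown in the full PMO example, where representative_microhaplotypes holds a targets list and each entry holds a microhaplotypes list. The sketch below resolves each detected microhaplotype to its sequence and its within-target read fraction. The function name `resolve_target_result`, the short sequences, and the read counts are hypothetical.

```python
def resolve_target_result(pmo: dict, target_result: dict) -> list:
    """Return (seq, reads, read_fraction) for each detected microhaplotype."""
    # mhaps_target_id indexes representative_microhaplotypes["targets"].
    rep_target = pmo["representative_microhaplotypes"]["targets"][
        target_result["mhaps_target_id"]
    ]
    total_reads = sum(mhap["reads"] for mhap in target_result["mhaps"])
    rows = []
    for mhap in target_result["mhaps"]:
        # mhap_id indexes the microhaplotypes list of that target.
        seq = rep_target["microhaplotypes"][mhap["mhap_id"]]["seq"]
        rows.append((seq, mhap["reads"], mhap["reads"] / total_reads))
    return rows


example_pmo = {
    "representative_microhaplotypes": {
        "targets": [
            {
                "target_id": 0,
                # Hypothetical short sequences for illustration.
                "microhaplotypes": [{"seq": "ACGT"}, {"seq": "ACGA"}, {"seq": "ACGG"}],
            }
        ]
    }
}
example_result = {
    "mhaps_target_id": 0,
    "mhaps": [{"mhap_id": 1, "reads": 75}, {"mhap_id": 2, "reads": 25}],
}

for row in resolve_target_result(example_pmo, example_result):
    print(row)
```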
Example
Code
PanelInfo
https://plasmogenepi.github.io/portable-microhaplotype-object/PanelInfo/
Required
- reactions (type=list of ReactionInfo)
- a list of 1 or more reactions that this panel contains; each reaction lists the targets that were amplified in that reaction, e.g. pool1, pool2
- panel_name (type=string)
- a name for the panel
Example
Code
ParasiteDensity
https://plasmogenepi.github.io/portable-microhaplotype-object/ParasiteDensity/
Required
- parasite_density_method (type=string)
- the method of how this density was obtained
- parasite_density (type=number)
- the density in microliters
Optional
- date_measured (type=string)
- the date the qPCR was performed, can be YYYY, YYYY-MM, or YYYY-MM-DD
- density_method_comments (type=string)
- additional comments about how the density was measured
Example
Code
PlateInfo
https://plasmogenepi.github.io/portable-microhaplotype-object/PlateInfo/
Required
- plate_name (type=string)
- a name for the plate
- plate_row (type=string)
- the row position
- plate_col (type=integer)
- the column position
Example
Code
PmoGenerationMethod
https://plasmogenepi.github.io/portable-microhaplotype-object/PmoGenerationMethod/
Required
- program_version (type=string)
- the version of the program, should be in the format v[MAJOR].[MINOR].[PATCH]
- program_name (type=string)
- the name of the program
Example
Code
PmoHeader
https://plasmogenepi.github.io/portable-microhaplotype-object/PmoHeader/
Required
- pmo_version (type=string)
- the version of the PMO file, should be in the format v[MAJOR].[MINOR].[PATCH]
Optional
- creation_date (type=string)
- the date when the PMO file was created or modified, should be YYYY-MM-DD
- generation_method (type=PmoGenerationMethod)
- the method used to generate this PMO
Example
Code
{
  "pmo_version": "1.0.0",
  "creation_date": "2026-05-13",
  "generation_method": {
    "program_name": "pmotools-python.PMOReader.combine_multiple_pmos",
    "program_version": "1.0.0"
  }
}
PrimerInfo
https://plasmogenepi.github.io/portable-microhaplotype-object/PrimerInfo/
Required
- seq (type=string)
- the sequence
Optional
- location (type=GenomicLocation)
- the intended genomic location of the primer
Example
Code
{
  "location": {
    "chrom": "LS483365.1",
    "end": 849631,
    "genome_id": 0,
    "start": 849607,
    "strand": "+"
  },
  "seq": "GGAGTTATCATGCCAACAGTTATA"
}
ProjectInfo
https://plasmogenepi.github.io/portable-microhaplotype-object/ProjectInfo/
Required
- project_name (type=string)
- a name for the project, should be unique if multiple projects are listed
- project_description (type=string)
- a short description of the project
Optional
- BioProject_accession (type=string)
- an SRA BioProject accession, e.g. PRJNA33823
- project_collector_chief_scientist (type=string)
- can be a collection of names separated by semicolons if multiple people are involved, or just the name of the primary person managing the specimen
- project_contributors (type=list of string)
- a list of collaborators who contributed to this project
- project_type (type=string)
- the type of project conducted, e.g. TES vs surveillance vs transmission
Example
Code
{
  "BioProject_accession": "PRJNA1166327",
  "project_description": "Amplicon sequencing of Staphylococcus aureus from oral and nasal samples",
  "project_name": "PRJNA1166327"
}
ProteinVariant
https://plasmogenepi.github.io/portable-microhaplotype-object/ProteinVariant/
Required
- protein_location (type=GenomicLocation)
- the position within the protein, the chromosome in this case would be the transcript name
Optional
- alternative_gene_name (type=string)
- an alternative gene name
- codon_genomic_location (type=GenomicLocation)
- the position within the genomic sequence of the codon
- gene_name (type=string)
- an identifier of the gene, if any, that is covered by this target
Example
Code
Pseudocigar
https://plasmogenepi.github.io/portable-microhaplotype-object/Pseudocigar/
Required
- pseudocigar_seq (type=string)
- the pseudocigar itself
- ref_loc (type=GenomicLocation)
- the genomic location the pseudocigar is in reference to
Optional
- pseudocigar_generation_description (type=string)
- a description of how the pseudocigar information was generated
Example
Code
ReactionInfo
https://plasmogenepi.github.io/portable-microhaplotype-object/ReactionInfo/
Required
- panel_targets (type=list of integer)
- a list of the target indexes in the target_info list
- reaction_name (type=string)
- a name for this reaction
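Since panel_targets stores indexes rather than names, listing the targets in each reaction means looking each index up in target_info. A minimal sketch of that lookup follows; the function name `targets_by_reaction`, the pool names, and the target names are hypothetical.

```python
def targets_by_reaction(pmo: dict, panel_id: int) -> dict:
    """Map each reaction name in a panel to the names of its targets."""
    panel = pmo["panel_info"][panel_id]
    return {
        reaction["reaction_name"]: [
            # Each entry of panel_targets is an index into target_info.
            pmo["target_info"][target_index]["target_name"]
            for target_index in reaction["panel_targets"]
        ]
        for reaction in panel["reactions"]
    }


example_pmo = {
    # Hypothetical target and pool names for illustration.
    "target_info": [
        {"target_name": "target_a"},
        {"target_name": "target_b"},
        {"target_name": "target_c"},
    ],
    "panel_info": [
        {
            "panel_name": "example_panel",
            "reactions": [
                {"reaction_name": "pool1", "panel_targets": [0, 2]},
                {"reaction_name": "pool2", "panel_targets": [1]},
            ],
        }
    ],
}

print(targets_by_reaction(example_pmo, 0))
```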
Example
Code
ReadCountsByStage
https://plasmogenepi.github.io/portable-microhaplotype-object/ReadCountsByStage/
Required
- read_counts_by_library_sample_by_stage (type=list of ReadCountsByStageForLibrarySample)
- a list by library_sample for the counts at each stage
Optional
- bioinformatics_run_id (type=integer)
- the index into the bioinformatics_run_info list
Example
Code
ReadCountsByStageForLibrarySample
https://plasmogenepi.github.io/portable-microhaplotype-object/ReadCountsByStageForLibrarySample/
Required
- library_sample_id (type=integer)
- the index into the library_sample_info list
- total_raw_count (type=integer)
- the raw counts off the sequencing machine that a sample began with
Optional
- read_counts_for_targets (type=list of ReadCountsByStageForTarget)
- a list of counts by stage for a target
Example
Code
ReadCountsByStageForTarget
https://plasmogenepi.github.io/portable-microhaplotype-object/ReadCountsByStageForTarget/
Required
- target_id (type=integer)
- the index into the target_info list
- stages (type=list of StageReadCounts)
- the read counts by each stage
Example
Code
RepresentativeMicrohaplotype
https://plasmogenepi.github.io/portable-microhaplotype-object/RepresentativeMicrohaplotype/
Required
- seq (type=string)
- the sequence
Optional
- alt_annotations (type=list of string)
- a list of additional annotations associated with this microhaplotype, e.g. wildtype
- associated_protein_variants (type=list of ProteinVariant)
- a list of protein variants for this haplotype, e.g. amino acid changes/INDELS
- associated_seq_variants (type=list of GenomicLocation)
- a list of sequence variants for this haplotype, e.g. SNPs, indels
- masking (type=list of MaskingInfo)
- masking info for the sequence
- microhaplotype_name (type=string)
- an optional name for this microhaplotype
- pseudocigar (type=Pseudocigar)
- the pseudocigar of the haplotype
- quality (type=string)
- the ASCII-encoded FASTQ per-base quality scores for this sequence; optional, but must be the same length as the sequence
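The length constraint on quality is easy to check when reading a microhaplotype. The sketch below also decodes the characters into integer scores; the Phred+33 offset is an assumption (it is the usual encoding in current FASTQ files, but the field description above does not state it), and the function name `decode_quality` and the sample strings are hypothetical.

```python
def decode_quality(microhaplotype: dict, offset: int = 33):
    """Return per-base quality scores, or None when quality is absent.

    Assumes Phred+33 ASCII encoding unless another offset is given.
    """
    quality = microhaplotype.get("quality")
    if quality is None:
        return None
    seq = microhaplotype["seq"]
    if len(quality) != len(seq):
        raise ValueError(
            f"quality has {len(quality)} characters but seq has {len(seq)} bases"
        )
    return [ord(char) - offset for char in quality]


# Hypothetical four-base microhaplotype with a quality string.
print(decode_quality({"seq": "ACGT", "quality": "II5!"}))  # prints [40, 40, 20, 0]
```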
Example
Code
RepresentativeMicrohaplotypes
https://plasmogenepi.github.io/portable-microhaplotype-object/RepresentativeMicrohaplotypes/
Required
- targets (type=list of RepresentativeMicrohaplotypesForTarget)
- a list of the microhaplotypes for each target
Example
Code
{
  "microhaplotypes": [
    {
      "seq": "CAATATAATAACCTAATAAAATGTTTAGGTCAACCTTTATTTTAATTTTTTTAAAAGTATGAATTATTATGTTGAACGTTCGATTTAATATTAAGAAAAA"
    },
    {
      "seq": "CAATATAATAACCTAATAAAACGTTTAGGTCAACCTAAATTTATTTTAATTTTTTTAAAAGTATGAATTATTATGTTGAACGTTCAATTTAATGGTAAGTAAAA"
    },
    {
      "seq": "CAATATAATAACCTAATAAAATGTTTAGGTCAACCTAAATTTATTTTAATTTTTTTAAAAGTATGAATTATTATGTTGAACGTTCGATTTAATATTAAGAAAAA"
    },
    {
      "seq": "CAATATAATAACCTAATAAAACGTTTAGGTCAACCTAAATTTATTTTAATTTTTTAAAAGTATGAATTATTATGTTGAACGTTCGATTTAATATTAAGAAAAA"
    },
    {
      "seq": "CAATATAATAACCTAATAAAATGTTTAGGTCAACCTAAATTTATTTTAATTTTTTTTAAAAGCCTGAATTATTATGTTGAACGTTCGATTTAATATTAAGAAAAA"
    },
    {
      "seq": "CAATATAATAACCTAATAAAATGTTTATGTCAACCTAAATTTATTTTAATTTTTTTAAAAGTATGAATTATTATGTTGAACGTTCAATTTAATGGTAAGTAAAA"
    },
    {
      "seq": "CAATATAATAACCTAATAAAACGTTTAGGTCAACCTAAATTTATTTTATTTTTTTAAAAGTATGAATTATTATTTTGAATGTTCGATTTAATGGTAAGAAAAA"
    },
    {
      "seq": "CAATATAATAACCTAATAAAATGTTTAGGTCAACCTAAATTTATTTTAATTTTTTTAAAAGCCTGAATTATTATGTTGAACGTTCGATTTAATATTAAGAAAAA"
    },
    {
      "seq": "CAATATAATAACCTAATAAAACATTTAGGTCAACCTAAATTTATTTTAATTTTTTTAAAAGTATGAATTATTATGTTGAACGTTCAATTTAATGGTAAGTAAAA"
    },
    {
      "seq": "CAATATAATAACCTAATAAAACGTTTAGGTCAACCTAAATTTATTTTAATTTTTTTAAAAGTATGAATTATTATGTTGAACGTTCGATTTAATATTAAGAAAAA"
    },
    {
      "seq": "CAATATAATAACCTAATAAAATGCTTAGGTCAACCTAAATTTATTTTAATTTTTTTAAAAGCCTGAATTATTATGTTGAACGTTCGATTTAATATTAAGAAAAA"
    }
  ],
  "target_id": 25
}
RepresentativeMicrohaplotypesForTarget
https://plasmogenepi.github.io/portable-microhaplotype-object/RepresentativeMicrohaplotypesForTarget/
Required
- target_id (type=integer)
- the index into the target_info list
- microhaplotypes (type=list of RepresentativeMicrohaplotype)
- a list of all the microhaplotypes for a target
Optional
- mhap_location (type=GenomicLocation)
- the genomic location that was analyzed for this target; this allows listing a location that differs from the full target location (e.g. 1 trimmed off the full length)
Example
Code
{
  "mhap_location": {
    "chrom": "Pf3D7_01_v3",
    "end": 145621,
    "genome_id": 0,
    "start": 145448,
    "strand": "+"
  },
  "microhaplotypes": [
    {
      "seq": "AACTTTTTTTATTTTTTTTGTCAATAGATAAATGATCAATATTTTCTATATTTAATCTATCAAGTATTTTTATATATCTATTATTTCTTTCTTCGATGGATAAATTATATGAATCAATATCCTTTCTTTCATCAACAAACTTTTTTATTGTTAACTCCATTTTTTTATTTA"
    },
    {
      "seq": "AACTTTTTTTATTTTTTTTGTCAATAGATAAATGATCAATATTTTCTATATTTAATCTATCAAGTATTTTTATATATCTATTATTTCTTTCTTCGATGGATAAATTATAAGAATCAATATCCTTTCTTTCATCAACAAACTTTTTTATTGTTAACTCCATTTTTTTATTTA"
    },
    {
      "seq": "AACTTTTTTTATTTTTTTTGTCAATAGATAAATGATCAATATTTTCTATATTTAATCTATCAAGAATTTTTATATATCTATTATTTCTTTCTTCGATGGATAAATTATATGAATCAATATCCTTTCTTTCATCAACAAACTTTTTTATTGTTAACTCCATTTTTTTATTTA"
    },
    {
      "seq": "AACTTTTTTTATTTTTTTTGTCAATAGATAAATGATCAATATTTTCTATATTTAATCTATCAAGAATTTTTATATATCTATTATTTCTTTCTTCGATGGATAAATTATAAGAATCAATATCCTTTCTTTCATCAACAAACTTTTTTATTGTTAACTCCATTTTTTTATTTA"
    }
  ],
  "target_id": 79
}
SequencingInfo
https://plasmogenepi.github.io/portable-microhaplotype-object/SequencingInfo/
Required
- sequencing_info_name (type=string)
- a name for a specific sequencing run, e.g. batch1
- seq_platform (type=string)
- the sequencing technology used to sequence the run, e.g. ILLUMINA, NANOPORE, PACBIO
- seq_instrument_model (type=string)
- the sequencing instrument model used to sequence the run, e.g. NextSeq 2000, MinION, Revio
- library_layout (type=string)
- the configuration of reads, e.g. paired-end, single
- library_strategy (type=string)
- the nucleic acid sequencing/amplification strategy (common names are AMPLICON, WGS)
- library_source (type=string)
- the source of amplification material, e.g. DNA (GENOMIC) or RNA (TRANSCRIPTOMIC)
- library_selection (type=string)
- how amplification was done (common values are PCR = source material was selected by designed primers, RANDOM = random selection by shearing or other method)
Optional
- library_kit (type=string)
- Name, version, and applicable cell or cycle numbers for the kit used to prepare libraries and load cells or chips for sequencing. If possible, include a part number, e.g. MiSeq Reagent Kit v3 (150-cycle), MS-102-3001
- library_screen (type=string)
- Describe enrichment, screening, or normalization methods applied during amplification or library preparation, e.g. size selection 390bp, diluted to 1 ng DNA/sample
- nucl_acid_amp (type=string)
- Link to a reference or kit that describes the enzymatic amplification of nucleic acids
- nucl_acid_amp_date (type=string)
- the date of the nucleic acid amplification
- nucl_acid_ext (type=string)
- Link to a reference or kit that describes the recovery of nucleic acids from the sample
- nucl_acid_ext_date (type=string)
- the date of the nucleic acid extraction
- pcr_cond (type=string)
- the method/conditions for PCR; list the PCR cycles used to amplify the target
- seq_center (type=string)
- Name of facility where sequencing was performed (lab, core facility, or company)
- seq_date (type=string)
- the date of sequencing, should be YYYY-MM or YYYY-MM-DD
Example
Code
{
  "library_layout": "PAIRED",
  "library_selection": "PCR",
  "library_source": "GENOMIC",
  "library_strategy": "AMPLICON",
  "seq_instrument_model": "Illumina MiSeq",
  "seq_platform": "ILLUMINA",
  "sequencing_info_name": "seq_info"
}
SpecimenInfo
https://plasmogenepi.github.io/portable-microhaplotype-object/SpecimenInfo/
Required
- specimen_name (type=string)
- an identifier for the specimen, should be unique within this sample set
Optional
- alternate_identifiers (type=list of string)
- a list of alternative names
- a list of alternative names
- blood_meal (type=boolean)
- whether host specimen has had a recent blood meal
- whether host specimen has had a recent blood meal
- collection_country (type=string)
- the name of country collected in, would be the same as admin level 0
- the name of country collected in, would be the same as admin level 0
- collection_date (type=string)
- the date of the specimen collection, can be YYYY, YYYY-MM, or YYYY-MM-DD
- the date of the specimen collection, can be YYYY, YYYY-MM, or YYYY-MM-DD
- drug_usage (type=list of string)
- Any drug used by subject and the frequency of usage; can include multiple drugs used
- Any drug used by subject and the frequency of usage; can include multiple drugs used
- env_broad_scale (type=string)
- the broad environment from which the specimen was collected, e.g. highlands, lowlands, mountainous region
- the broad environment from which the specimen was collected, e.g. highlands, lowlands, mountainous region
- env_local_scale (type=string)
- the local environment from which the specimen was collected, e.g. jungle, urban, rural
- the local environment from which the specimen was collected, e.g. jungle, urban, rural
- env_medium (type=string)
- the environment medium from which the specimen was collected from
- the environment medium from which the specimen was collected from
- geo_admin1 (type=string)
- geographical admin level 1, the secondary large demarcation of a nation (nation = admin level 0)
- geographical admin level 1, the secondary large demarcation of a nation (nation = admin level 0)
- geo_admin2 (type=string)
- geographical admin level 2, the third large demarcation of a nation (nation = admin level 0)
- geographical admin level 2, the third large demarcation of a nation (nation = admin level 0)
- geo_admin3 (type=string)
- geographical admin level 3, the third large demarcation of a nation (nation = admin level 0)
- geographical admin level 3, the third large demarcation of a nation (nation = admin level 0)
- gravid (type=boolean)
- whether host specimen is currently pregnant
- whether host specimen is currently pregnant
- gravidity (type=integer)
- the gravidity of the specimen host (number of previous pregnancies)
- has_travel_out_six_month (type=boolean)
- whether the host has travelled out of the local region in the last six months
- host_age (type=number)
- if the specimen is from a person, the person's age in years; can be a float, e.g. 0.25 for a 3-month-old
- host_sex (type=string)
- if specimen is collected from a host with a sex, the sex listed for that host
- host_subject_name (type=string)
- an identifier for the individual/person/patient a specimen was collected from
- host_taxon_id (type=integer)
- the NCBI taxonomy number of the host that the specimen was collected from
- lat_lon (type=string)
- the latitude and longitude of a specific site
- parasite_density_info (type=list of ParasiteDensity)
- one or more parasite densities, per microliter, for this specimen
- project_id (type=integer)
- the index into the project_info list
- specimen_accession (type=string)
- if specimen is deposited in a database, what accession is it associated with
- specimen_collect_device (type=string)
- the way the specimen was collected, e.g. whole blood, dried blood spot
- specimen_comments (type=list of string)
- any additional comments about the specimen
- specimen_store_loc (type=string)
- the specimen store site, address or facility name
- specimen_taxon_id (type=list of integer)
- the NCBI taxonomy number of the organism(s) in the specimen, can list multiple if a mixed sample
- specimen_type (type=string)
- what type of specimen this is, e.g. negative_control, positive_control, field_sample
- storage_plate_info (type=PlateInfo)
- plate location of where specimen is stored if stored in a plate
- travel_out_six_month (type=list of TravelInfo)
- Specification of the countries travelled to in the last six months; can include multiple travels
- treatment_status (type=list of string)
- If person has been treated with drugs, what was the treatment outcome
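The collection_date and host_age descriptions above imply simple format rules that can be checked before a record is written. The following is a minimal sketch, not part of pmotools: the function names `is_valid_collection_date` and `age_in_years` are hypothetical, and the regular expression is an assumed reading of "YYYY, YYYY-MM, or YYYY-MM-DD". It checks the shape of the string only, not whether the day exists in that month.

```python
import re

# Assumed pattern for the three accepted forms: YYYY, YYYY-MM, YYYY-MM-DD.
# Month is limited to 01-12 and day to 01-31; calendar validity is not checked.
_COLLECTION_DATE_RE = re.compile(
    r"[0-9]{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12][0-9]|3[01]))?)?"
)


def is_valid_collection_date(value: str) -> bool:
    """Return True if value looks like YYYY, YYYY-MM, or YYYY-MM-DD."""
    return _COLLECTION_DATE_RE.fullmatch(value) is not None


def age_in_years(months: float) -> float:
    """Convert an age in months to the fractional years that host_age expects."""
    return months / 12
```

For example, `is_valid_collection_date("2022-03")` accepts a month-level date, while a value such as "21-03-2022" is rejected.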
Example
Code
StageReadCounts
https://plasmogenepi.github.io/portable-microhaplotype-object/StageReadCounts/
Required
- reads (type=integer)
- the read counts for this stage
- stage (type=string)
- the stage of the pipeline, e.g. demultiplexed, denoised, etc
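One use of a list of StageReadCounts records is to see how many reads survive each pipeline stage. The sketch below is illustrative only: the helper `read_retention` is hypothetical, the counts are made up, and it assumes the records are ordered from the earliest stage to the latest, which the schema text does not state.

```python
from typing import Dict, List


def read_retention(stages: List[Dict]) -> List[float]:
    """Fraction of reads remaining at each stage, relative to the first stage."""
    if not stages:
        return []
    first = stages[0]["reads"]
    # Guard against a zero count in the first stage to avoid division by zero.
    return [s["reads"] / first if first else 0.0 for s in stages]


# Made-up counts using the two required fields, stage and reads.
stages = [
    {"stage": "demultiplexed", "reads": 1000},
    {"stage": "denoised", "reads": 800},
]
retention = read_retention(stages)
```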
Example
Code
{
  "collection_country": "USA",
  "collection_date": "2022-03-21",
  "geo_admin1": "Phoenix",
  "host_taxon_id": 9606,
  "project_id": 0,
  "specimen_accession": "SAMN43965049",
  "specimen_name": "85b498-Wk22-Nasal",
  "specimen_taxon_id": [
    1280
  ]
}
TargetInfo
https://plasmogenepi.github.io/portable-microhaplotype-object/TargetInfo/
Required
- target_name (type=string)
- a name for this target
- forward_primer (type=PrimerInfo)
- the forward primer for this target
- reverse_primer (type=PrimerInfo)
- the reverse primer for this target
Optional
- gene_name (type=string)
- an identifier of the gene, if any, that is covered by this target
- insert_location (type=GenomicLocation)
- the intended genomic location of the insert of the amplicon (the location between the end of the forward primer and the beginning of the reverse primer)
- markers_of_interest (type=list of MarkerOfInterest)
- a list of markers of interest that are covered by this target
- target_attributes (type=list of string)
- a list of classification types for this target
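A quick consistency check for a TargetInfo record is that a primer's genomic location spans as many bases as its sequence. This sketch assumes half-open coordinates (end minus start equals the sequence length). That assumption is inferred from the forward primer in this section's TargetInfo example, where start 1956096 and end 1956129 bracket a 33-base sequence. It is not a documented rule, and `location_matches_seq` is a hypothetical helper.

```python
def location_matches_seq(location: dict, seq: str) -> bool:
    """True if end - start equals len(seq), assuming half-open coordinates."""
    return location["end"] - location["start"] == len(seq)


# Forward primer values copied from the TargetInfo example in this section.
forward_primer = {
    "location": {
        "chrom": "Pf3D7_14_v3",
        "start": 1956096,
        "end": 1956129,
        "strand": "+",
    },
    "seq": "TTTTTCTCCACTTTGTAATTTTTATTGTTGAAT",
}
forward_ok = location_matches_seq(forward_primer["location"], forward_primer["seq"])
```

The same check could be applied to insert_location and its ref_seq when both are present.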
Example
Code
{
  "forward_primer": {
    "location": {
      "chrom": "Pf3D7_14_v3",
      "end": 1956129,
      "genome_id": 0,
      "start": 1956096,
      "strand": "+"
    },
    "seq": "TTTTTCTCCACTTTGTAATTTTTATTGTTGAAT"
  },
  "gene_name": "PF3D7_1447900",
  "insert_location": {
    "chrom": "Pf3D7_14_v3",
    "end": 1956286,
    "genome_id": 0,
    "ref_seq": "ATATATATAAAGTTAAACCTATAAATAATACACTACCTAATAAACTATTCTTATATTTAAAAATAAATATAATACATGTTATTAATCCTTCTATTGTTGCCGGAATAATATACATTAAAACAGAACTCATCAAATTATTAGCACTCTCGGTACCTCT",
    "start": 1956129,
    "strand": "+"
  },
  "reverse_primer": {
    "location": {
      "chrom": "Pf3D7_14_v3",
      "end": 1956129,
      "genome_id": 0,
      "start": 1956096,
      "strand": "+"
    },
    "seq": "CGGGTGGTATCATGAGAATAGTTGAT"
  },
  "target_attributes": [
    "Included",
    "DrugResistance"
  ],
  "target_name": "t96"
}
Code
{
  "forward_primer": {
    "seq": "TTTTTCTCCACTTTGTAATTTTTATTGTTGAAT"
  },
  "reverse_primer": {
    "seq": "CGGGTGGTATCATGAGAATAGTTGAT"
  },
  "target_name": "t96"
}
TravelInfo
https://plasmogenepi.github.io/portable-microhaplotype-object/TravelInfo/
Required
- travel_country (type=string)
- the name of the country; the same as admin level 0
- travel_start_date (type=string)
- the date of the start of travel, can be approximate, should be YYYY-MM or YYYY-MM-DD (preferred)
- travel_end_date (type=string)
- the date of the end of travel, can be approximate, should be YYYY-MM or YYYY-MM-DD (preferred)
Optional
- bed_net_usage (type=number)
- approximate usage of bed net while traveling, 1 = 100% nights with bed net, 0 = 0% no bed net usage
- geo_admin1 (type=string)
- geographical admin level 1, the secondary large demarcation of a nation (nation = admin level 0)
- geo_admin2 (type=string)
- geographical admin level 2, the third large demarcation of a nation (nation = admin level 0)
- geo_admin3 (type=string)
- geographical admin level 3, the fourth large demarcation of a nation (nation = admin level 0)
- lat_lon (type=string)
- the latitude and longitude of a specific site
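The TravelInfo constraints listed above (three required fields, bed_net_usage between 0 and 1) can be checked with a few lines of plain Python. This is a sketch under stated assumptions rather than the pmotools implementation. The helper `travel_info_problems` is hypothetical. The rule that travel cannot end before it starts is an added assumption, compared at the precision of the shorter date string. The record values are made up.

```python
REQUIRED_FIELDS = ("travel_country", "travel_start_date", "travel_end_date")


def travel_info_problems(record: dict) -> list:
    """Return a list of human-readable problems found in a TravelInfo record."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in record
    ]
    usage = record.get("bed_net_usage")
    if usage is not None and not 0 <= usage <= 1:
        problems.append("bed_net_usage must be between 0 and 1")
    start = record.get("travel_start_date")
    end = record.get("travel_end_date")
    if start and end:
        # YYYY-MM and YYYY-MM-DD strings sort chronologically, so compare
        # the two dates at the precision of the shorter one.
        n = min(len(start), len(end))
        if start[:n] > end[:n]:
            problems.append("travel_start_date is after travel_end_date")
    return problems


# Made-up record for illustration.
record = {
    "travel_country": "Kenya",
    "travel_start_date": "2023-01",
    "travel_end_date": "2023-02-15",
    "bed_net_usage": 0.5,
}
problems = travel_info_problems(record)
```

An empty list means the record passed these checks; it does not replace validation against the LinkML schema itself.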
Example
Code