pmotools.pmo_engine.pmo_exporter module
- class pmotools.pmo_engine.pmo_exporter.BedLoc(chrom: str, start: int, end: int, name: str, score: float, strand: str, ref_seq: str, extra_info: str)[source]
Bases: NamedTuple
A single BED-format genomic location.
Used when extracting target / panel insert locations out of a PMO so they can be written to a BED file.
- Variables:
chrom – chromosome / contig name
start – 0-based start position
end – end position (exclusive)
name – target name
score – BED score column; here the insert length (end - start)
strand – + or -
ref_seq – reference sequence for the insert, empty string if not loaded
extra_info – free-text key/value annotation, e.g. genome name/version
- chrom: str
Alias for field number 0
- end: int
Alias for field number 2
- extra_info: str
Alias for field number 7
- name: str
Alias for field number 3
- ref_seq: str
Alias for field number 6
- score: float
Alias for field number 4
- start: int
Alias for field number 1
- strand: str
Alias for field number 5
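As a rough illustration of the field layout above, the sketch below defines a local stand-in named `BedLocSketch` (hypothetical, not the pmotools class) with the same eight fields, plus a hypothetical helper `make_bed_loc` that fills `score` with the insert length. The coordinates and names are made-up example values.

```python
from typing import NamedTuple


class BedLocSketch(NamedTuple):
    """Local stand-in mirroring the documented BedLoc fields (not the real class)."""

    chrom: str
    start: int
    end: int
    name: str
    score: float
    strand: str
    ref_seq: str
    extra_info: str


def make_bed_loc(chrom, start, end, name, strand="+", ref_seq="", extra_info=""):
    """Hypothetical helper: build a record whose score is the insert length."""
    # start is 0-based and end is exclusive, so the length is end - start
    return BedLocSketch(
        chrom, start, end, name, float(end - start), strand, ref_seq, extra_info
    )


loc = make_bed_loc("chr1", 100, 250, "target_a")
```

Because the record is a named tuple, each field can be read by name (`loc.score`) or by position (`loc[4]`), which is what the "Alias for field number" entries describe.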
- class pmotools.pmo_engine.pmo_exporter.PMOExporter[source]
Bases: object
A collection of functions to export information out of a PMO
- class SheetConfig(sheet_name: str, df: DataFrame, max_row_check: int | None = None, specific_cols: list[str] | None = None)[source]
Bases: object
Configuration for writing a DataFrame to an Excel sheet.
- df: DataFrame
- max_row_check: int | None = None
- sheet_name: str
- specific_cols: list[str] | None = None
- static export_bioinformatics_methods_info_meta_table(pmodata, separator: str = ',') DataFrame[source]
Export the bioinformatics_methods_info meta information of a PMO to a dataframe
- Parameters:
pmodata – the PMO to export the information from
separator – the separator to use for list values
- Returns:
a pandas dataframe of the bioinformatics_methods_info metadata
- static export_bioinformatics_run_info_meta_table(pmodata, separator: str = ',') DataFrame[source]
Export the bioinformatics_run_info meta information of a PMO to a dataframe
- Parameters:
pmodata – the PMO to export the information from
separator – the separator to use for list values
- Returns:
a pandas dataframe of the bioinformatics_run_info metadata
- static export_library_sample_meta_table(pmodata, separator: str = ',') DataFrame[source]
Export the library_sample meta information of a PMO to a dataframe
- Parameters:
pmodata – the PMO to export the information from
separator – the separator to use for list values
- Returns:
a pandas dataframe of the library_sample metadata
- static export_panel_info_meta_table(pmodata, separator: str = ',') DataFrame[source]
Export the panel meta information of a PMO to a dataframe
- Parameters:
pmodata – the PMO to export the information from
separator – the separator to use for list values
- Returns:
a pandas dataframe of the panel metadata
- static export_pmo_header_table(pmodata, separator: str = ',') DataFrame[source]
Export the pmo header meta information of a PMO to a dataframe
- Parameters:
pmodata – the PMO to export the information from
separator – the separator to use for list values
- Returns:
a pandas dataframe of the pmo header metadata
- static export_project_info_meta_table(pmodata, separator: str = ',') DataFrame[source]
Export the project_info meta information of a PMO to a dataframe
- Parameters:
pmodata – the PMO to export the information from
separator – the separator to use for list values
- Returns:
a pandas dataframe of the project_info metadata
- static export_sequencing_info_meta_table(pmodata, separator: str = ',') DataFrame[source]
Export the sequencing_info meta information of a PMO to a dataframe
- Parameters:
pmodata – the PMO to export the information from
separator – the separator to use for list values
- Returns:
a pandas dataframe of the sequencing_info metadata
- static export_specimen_meta_table(pmodata, separator: str = ',') DataFrame[source]
Export the specimen meta information of a PMO to a dataframe. Values of complex object types such as TravelInfo or parasite densities are currently not exported; such values are best exported in their own tables.
- Parameters:
pmodata – the PMO to export the information from
separator – the separator to use for list values
- Returns:
a pandas dataframe of the specimen metadata
- static export_specimen_travel_meta_table(pmodata, separator: str = ',') DataFrame[source]
Export the specimen travel meta information of a PMO to a dataframe
- Parameters:
pmodata – the PMO to export the information from
separator – the separator to use for list values
- Returns:
a pandas dataframe of the specimen travel metadata
- static export_target_info_meta_table(pmodata, separator: str = ',') DataFrame[source]
Export the target meta information of a PMO to a dataframe
- Parameters:
pmodata – the PMO to export the information from
separator – the separator to use for list values
- Returns:
a pandas dataframe of the target metadata
- static export_targeted_genomes_meta_table(pmodata, separator: str = ',') DataFrame[source]
Export the targeted genomes meta information of a PMO to a dataframe
- Parameters:
pmodata – the PMO to export the information from
separator – the separator to use for list values
- Returns:
a pandas dataframe of the genomes metadata
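The meta-table exporters above all take a `separator` used for list values. The sketch below shows one plausible reading of that parameter: list-valued fields are joined into a single string so each field fits one table cell. The function `flatten_record` and the example record are hypothetical, and the actual pmotools behavior may differ in detail.

```python
def flatten_record(record: dict, separator: str = ",") -> dict:
    """Join list values into one string so every field fits a single cell."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, list):
            # Convert each element to text before joining with the separator
            flat[key] = separator.join(str(item) for item in value)
        else:
            flat[key] = value
    return flat


# Example record with one scalar field and one list field (made-up values)
row = flatten_record(
    {"panel_name": "example_panel", "target_ids": [0, 1, 2]}, separator=";"
)
```

Choosing a separator that cannot occur inside the values themselves (for example `;` when values may contain commas) keeps the joined cell unambiguous to split later.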
- static export_to_excel(pmo, output_path: str) None[source]
Export a PMO object to a multi-sheet Excel file.
- Parameters:
pmo – The PMO object to export.
output_path – The path to write the Excel file to.
- static extract_alleles_per_sample_table(pmodata, additional_specimen_info_fields: list[str] = None, additional_library_sample_info_fields: list[str] = None, additional_microhap_fields: list[str] = None, additional_representative_info_fields: list[str] = None, default_base_col_names: list[str] = ['library_sample_name', 'target_name', 'seq'], jsonschema_fnp='/home/runner/work/pmotools-python/pmotools-python/src/schemas/portable_microhaplotype_object_v1.1.0.schema.json', validate_pmo: bool = False) DataFrame[source]
Create a pandas DataFrame of sample, target, and allele. Any other additional fields can optionally be added.
- Parameters:
pmodata – the data to write from
additional_specimen_info_fields – any additional fields to write from the specimen_info object
additional_library_sample_info_fields – any additional fields to write from the library_samples object
additional_microhap_fields – any additional fields to write from the microhap object
additional_representative_info_fields – any additional fields to write from the representative_microhaplotype_sequences object
default_base_col_names – the default column names for library_sample_name, target_name, and seq
jsonschema_fnp – path to the jsonschema schema file to validate the PMO against
validate_pmo – whether to validate the PMO with a jsonschema
- Returns:
pandas dataframe
- static extract_panels_insert_bed_loc(pmodata, select_panel_ids: list[int] = None, sort_output: bool = True)[source]
Extract the insert locations for panels from a PMO; the reference sequence is added if it is loaded into the PMO
- Parameters:
pmodata – the PMO to extract from
select_panel_ids – a list of panel ids to select, if None will select all panels
sort_output – whether to sort output by genomic location
- Returns:
a list of target inserts as named tuples with fields: chrom, start, end, name, score, strand, ref_seq, extra_info
- static extract_targets_insert_bed_loc(pmodata, select_target_ids: list[int] = None, sort_output: bool = True)[source]
Extract the insert locations for targets from a PMO; the reference sequence is added if it is loaded into the PMO
- Parameters:
pmodata – the PMO to extract from
select_target_ids – a list of target ids to select, if None will select all targets
sort_output – whether to sort output by genomic location
- Returns:
a list of target inserts as named tuples with fields: chrom, start, end, name, score, strand, ref_seq, extra_info
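The `sort_output` option orders the returned inserts by genomic location. A minimal sketch of such an ordering follows, assuming the sort key is (chrom, start, end); the exact key used by pmotools is not stated here. `Loc` and `sort_by_location` are hypothetical stand-ins, and the records are made-up values.

```python
from typing import NamedTuple


class Loc(NamedTuple):
    """Reduced stand-in holding only the fields needed for sorting."""

    chrom: str
    start: int
    end: int
    name: str


def sort_by_location(locs):
    """Order records by chromosome name, then start, then end."""
    return sorted(locs, key=lambda loc: (loc.chrom, loc.start, loc.end))


locs = [
    Loc("chr2", 50, 150, "t3"),
    Loc("chr1", 300, 400, "t2"),
    Loc("chr1", 10, 110, "t1"),
]
sorted_names = [loc.name for loc in sort_by_location(locs)]
```

Note that this plain string comparison places `chr10` before `chr2`; a natural-order key would be needed if numeric chromosome order matters.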
- static list_library_sample_names_per_specimen_name(pmodata, select_specimen_ids: list[int] = None, select_specimen_names: list[str] = None) DataFrame[source]
List all the library_sample_names per specimen_name
- Parameters:
pmodata – the PMO
select_specimen_ids – a list of specimen_ids to select, if None, all specimen_ids are used
select_specimen_names – a list of specimen_names to select, if None, all specimen_names are used
- Returns:
a pandas dataframe with 3 columns: specimen_id, library_sample_id, and library_sample_id_count (the number of library_sample_ids per specimen_id)
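The shape of the returned table can be sketched with the standard library alone: one row per (specimen, library sample) pair, with the count of library samples for that specimen repeated on each row. The function `library_samples_per_specimen` and its input of plain pairs are assumptions made for illustration; the real method reads from a PMO and returns a pandas dataframe.

```python
from collections import defaultdict


def library_samples_per_specimen(pairs):
    """Build (specimen, library_sample, count) rows from (specimen, library_sample) pairs."""
    grouped = defaultdict(list)
    for specimen, library_sample in pairs:
        grouped[specimen].append(library_sample)

    rows = []
    for specimen in sorted(grouped):
        count = len(grouped[specimen])  # library samples for this specimen
        for library_sample in grouped[specimen]:
            rows.append((specimen, library_sample, count))
    return rows


# Made-up example: spec_a was sequenced twice, spec_b once
rows = library_samples_per_specimen(
    [("spec_a", "lib_1"), ("spec_a", "lib_2"), ("spec_b", "lib_3")]
)
```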
- static write_bed_locs(bed_locs: list[pmotools.pmo_engine.pmo_exporter.BedLoc], fnp, add_header: bool = False)[source]
Write a list of BedLoc to a file; an existing file is overwritten
- Parameters:
bed_locs – a list of BedLoc
fnp – output file path, will be overwritten if it exists
add_header – add a header line of #chrom, start, end, name, score, strand, ref_seq, extra_info; it starts with # so tools will treat it as a comment line
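The writing step can be sketched as below, assuming tab-separated columns as is conventional for BED files; the delimiter actually used by `write_bed_locs` is not confirmed here. `BedLocSketch` and `write_bed_locs_sketch` are hypothetical stand-ins, and the sketch writes to an in-memory buffer rather than a path so that it has no side effects.

```python
import io
from typing import NamedTuple


class BedLocSketch(NamedTuple):
    """Local stand-in mirroring the documented BedLoc fields (not the real class)."""

    chrom: str
    start: int
    end: int
    name: str
    score: float
    strand: str
    ref_seq: str
    extra_info: str


def write_bed_locs_sketch(bed_locs, handle, add_header=False):
    """Write one tab-separated line per record to an open text handle."""
    if add_header:
        # Leading '#' marks the header as a comment line
        handle.write("#" + "\t".join(BedLocSketch._fields) + "\n")
    for loc in bed_locs:
        handle.write("\t".join(str(value) for value in loc) + "\n")


buf = io.StringIO()
write_bed_locs_sketch(
    [BedLocSketch("chr1", 100, 250, "target_a", 150.0, "+", "", "genome=example")],
    buf,
    add_header=True,
)
lines = buf.getvalue().splitlines()
```

To write to disk instead, the same function can be given a handle from `open(path, "w")`, which truncates any existing file in the same way the documented method overwrites its output.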