pmotools-python
Most of the basic panel info functions can be found under extract_panel_info_from_pmo
pmotools-python v0.1.0 - A suite of tools for interacting with Portable Microhaplotype Object (PMO) file format
Available functions organized by groups are
convertors_to_json
text_meta_to_json_meta - Convert text file meta to JSON Meta
excel_meta_to_json_meta - Convert Excel file meta to JSON Meta
microhaplotype_table_to_json_file - Convert microhaplotype table to a JSON file
terra_amp_output_to_json - Convert Terra output to JSON sequence table
extractors_from_pmo
extract_pmo_with_selected_meta - Extract samples + haplotypes using selected meta
extract_pmo_with_select_specimen_names - Extract specific samples from the specimens table
extract_pmo_with_select_library_sample_names - Extract experiment sample names from experiment_info table
extract_pmo_with_select_targets - Extract specific targets
extract_pmo_with_read_filter - Extract with a read filter
extract_allele_table - Extract allele tables for tools like dcifer or moire
extract_insert_of_panels - Extract inserts of panels from a PMO
extract_refseq_of_inserts_of_panels - Extract ref_seq of panel inserts from a PMO
working_with_multiple_pmos
combine_pmos - Combine multiple PMOs of the same panel
extract_basic_info_from_pmo
list_library_sample_names_per_specimen_name - List experiment_sample_ids per specimen_id
list_specimen_meta_fields - List specimen meta fields in the specimen_info section
list_bioinformatics_run_names - List all tar_amp_bioinformatics_info_ids in a PMO
count_specimen_meta - Count values of selected specimen meta fields
count_targets_per_library_sample - Count number of targets per sample
count_library_samples_per_target - Count number of samples per target
validation
validate_pmo - Validate a PMO file against a JSON Schema
Getting files for examples
This extracts the insert locations of the panel targets from a PMO and writes them out as a BED file.
usage: pmotools-python extract_insert_of_panels [-h] --file FILE
                                                [--output OUTPUT]
                                                [--overwrite] [--add_ref_seqs]

options:
  -h, --help       show this help message and exit
  --file FILE      PMO file
  --output OUTPUT  output file
  --overwrite      If output file exists, overwrite it
  --add_ref_seqs   add ref seqs to the output as ref_seq
The Python code for the extract_insert_of_panels script is below.
pmotools-python/src/pmotools/scripts/extract_info_from_pmo/extract_insert_of_panels.py
#!/usr/bin/env python3
import argparse

from pmotools.pmo_engine.pmo_processor import PMOProcessor
from pmotools.pmo_engine.pmo_reader import PMOReader
from pmotools.utils.small_utils import Utils


def parse_args_extract_insert_of_panels():
    parser = argparse.ArgumentParser()
    parser.add_argument("--file", type=str, required=True, help="PMO file")
    parser.add_argument(
        "--output", type=str, default="STDOUT", required=False, help="output file"
    )
    parser.add_argument(
        "--overwrite", action="store_true", help="If output file exists, overwrite it"
    )
    parser.add_argument(
        "--add_ref_seqs",
        action="store_true",
        help="add ref seqs to the output as ref_seq",
    )
    return parser.parse_args()


def extract_insert_of_panels():
    args = parse_args_extract_insert_of_panels()

    # check files
    Utils.inputOutputFileCheck(args.file, args.output, args.overwrite)

    # read in PMO
    pmo = PMOReader.read_in_pmo(args.file)

    # get panel insert locations
    panel_bed_locs = PMOProcessor.extract_panels_insert_bed_loc(pmo)

    # write
    with Utils.smart_open_write(args.output) as f:
        f.write(
            "\t".join(
                [
                    "#chrom",
                    "start",
                    "end",
                    "target_id",
                    "length",
                    "strand",
                    "extra_info",
                ]
            )
        )
        if args.add_ref_seqs:
            f.write("\tref_seq")
        f.write("\n")
        for loc in panel_bed_locs:
            f.write(
                "\t".join(
                    [
                        loc.chrom,
                        str(loc.start),
                        str(loc.end),
                        loc.name,
                        str(loc.score),
                        loc.strand,
                        loc.extra_info,
                    ]
                )
            )
            if args.add_ref_seqs:
                f.write("\t" + str(loc.ref_seq))
            f.write("\n")


if __name__ == "__main__":
    extract_insert_of_panels()
#chrom start end target_id length strand extra_info
Pf3D7_01_v3 145449 145622 t1 173 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 179903 180115 t2 212 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 181557 181673 t3 116 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 495971 496143 t4 172 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 512199 512388 t5 189 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 531682 531900 t6 218 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 532690 532844 t7 154 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 534215 534368 t8 153 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 534941 535110 t9 169 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_02_v3 109807 109982 t10 175 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_02_v3 278165 278336 t11 171 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_02_v3 470492 470676 t12 184 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_02_v3 805822 805942 t13 120 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_03_v3 85440 85646 t14 206 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_03_v3 141963 142181 t15 218 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_03_v3 221363 221495 t16 132 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_03_v3 618396 618581 t17 185 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_03_v3 654002 654175 t18 173 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_03_v3 850816 850989 t19 173 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 109912 110087 t20 175 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 133491 133701 t21 210 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 141778 141945 t22 167 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 415653 415826 t23 173 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 544718 544861 t24 143 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 748230 748436 t25 206 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 748533 748696 t26 163 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 802525 802713 t27 188 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 1037634 1037844 t28 210 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 1100656 1100831 t29 175 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 1102389 1102578 t30 189 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 1113450 1113604 t31 154 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 1128489 1128673 t32 184 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_05_v3 329378 329550 t33 172 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_05_v3 958059 958221 t34 162 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_05_v3 958389 958506 t35 117 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_05_v3 1042162 1042281 t36 119 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_05_v3 1309609 1309744 t37 135 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_06_v3 145343 145501 t38 158 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_06_v3 532195 532378 t39 183 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_07_v3 165235 165422 t40 187 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_07_v3 166035 166167 t41 132 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_07_v3 298886 299005 t42 119 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_07_v3 729975 730088 t43 113 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_07_v3 1149415 1149585 t44 170 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_07_v3 1358694 1358911 t45 217 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 102326 102500 t46 174 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 336468 336647 t47 179 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 339168 339357 t48 189 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 549993 550218 t49 225 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 933023 933143 t50 120 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 1269320 1269456 t51 136 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 1344686 1344819 t52 133 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 1362891 1363087 t53 196 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_09_v3 516928 517092 t54 164 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_09_v3 596133 596334 t55 201 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_09_v3 685601 685792 t56 191 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_09_v3 1178894 1179078 t57 184 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_09_v3 1406405 1406541 t58 136 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_09_v3 1437114 1437303 t59 189 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_10_v3 377095 377209 t60 114 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_10_v3 992371 992544 t61 173 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_10_v3 1386700 1386869 t62 169 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_10_v3 1399544 1399711 t63 167 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_10_v3 1436479 1436682 t64 203 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_11_v3 119486 119693 t65 207 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_11_v3 1009856 1010038 t66 182 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_11_v3 1018953 1019085 t67 132 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_11_v3 1376185 1376372 t68 187 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_11_v3 1552430 1552640 t69 210 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_11_v3 1750865 1751055 t70 190 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_11_v3 1816211 1816425 t71 214 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_12_v3 63166 63280 t72 114 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_12_v3 659891 660010 t73 119 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_12_v3 684088 684261 t74 173 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_12_v3 943258 943428 t75 170 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_12_v3 1237431 1237603 t76 172 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_12_v3 2050130 2050314 t77 184 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 103659 103879 t78 220 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 156566 156722 t79 156 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 1150303 1150493 t80 190 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 1419543 1419670 t81 127 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 1725365 1725570 t82 205 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 1876352 1876534 t83 182 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 2114975 2115142 t84 167 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 2124634 2124847 t85 213 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 2479086 2479246 t86 160 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 2481106 2481288 t87 182 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 2669135 2669307 t88 172 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 39953 40137 t89 184 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 120215 120351 t90 136 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 150105 150294 t91 189 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 279663 279786 t92 123 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 407379 407571 t93 192 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 564208 564377 t94 169 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 1038369 1038486 t95 117 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 1956129 1956286 t96 157 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 1992289 1992426 t97 137 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 2524962 2525089 t98 127 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 3124642 3124842 t99 200 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 3214351 3214478 t100 127 + [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
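The script writes this table tab-separated, with a header line starting with `#chrom`. As a minimal sketch of reading it back in Python, the snippet below parses two rows copied from the output above, splits the bracketed `extra_info` field into key/value pairs, and checks that `length` equals `end - start`. The helper names `parse_extra_info` and `read_insert_bed` are hypothetical and are not part of pmotools-python.

```python
import csv
import io

# Two rows copied from the output above; a real file written with
# --output could be read the same way via open(path, newline="").
EXAMPLE_BED = (
    "#chrom\tstart\tend\ttarget_id\tlength\tstrand\textra_info\n"
    "Pf3D7_01_v3\t145449\t145622\tt1\t173\t+\t"
    "[genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]\n"
    "Pf3D7_01_v3\t179903\t180115\tt2\t212\t+\t"
    "[genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]\n"
)


def parse_extra_info(field):
    """Turn '[key=value;key=value;]' into a dict (hypothetical helper)."""
    pairs = field.strip("[]").split(";")
    return dict(pair.split("=", 1) for pair in pairs if pair)


def read_insert_bed(handle):
    """Read the extract_insert_of_panels table into a list of dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        rows.append(
            {
                "chrom": row["#chrom"],
                "start": int(row["start"]),
                "end": int(row["end"]),
                "target_id": row["target_id"],
                "length": int(row["length"]),
                "strand": row["strand"],
                "extra_info": parse_extra_info(row["extra_info"]),
            }
        )
    return rows


inserts = read_insert_bed(io.StringIO(EXAMPLE_BED))
for ins in inserts:
    # In the rows shown above, length equals end - start
    assert ins["length"] == ins["end"] - ins["start"]
    print(ins["target_id"], ins["chrom"], ins["length"], ins["extra_info"]["panel"])
```

If the table was written with `--add_ref_seqs`, the extra `ref_seq` column is present in each `row` and can be picked up in the same loop.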
The reference sequence can be added with --add_ref_seqs if it is loaded in the PMO; if it is not loaded, the ref_seq column will be blank.
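As a rough sketch of how the `--add_ref_seqs` flag might be driven from Python, the snippet below assembles the command line from the options listed in the usage message and runs it with `subprocess` only when the `pmotools-python` executable and the input file are both available. The helper name `build_extract_insert_cmd` and the file names `example.pmo.json` and `inserts.bed` are placeholders chosen for illustration, not files shipped with pmotools-python.

```python
import os
import shutil
import subprocess


def build_extract_insert_cmd(pmo_file, output=None, overwrite=False, add_ref_seqs=False):
    """Assemble the argument list for extract_insert_of_panels (hypothetical helper)."""
    cmd = ["pmotools-python", "extract_insert_of_panels", "--file", pmo_file]
    if output is not None:
        cmd += ["--output", output]
    if overwrite:
        cmd.append("--overwrite")
    if add_ref_seqs:
        cmd.append("--add_ref_seqs")
    return cmd


# Placeholder file names; replace with a real PMO file and output path
cmd = build_extract_insert_cmd(
    "example.pmo.json", output="inserts.bed", overwrite=True, add_ref_seqs=True
)
print(" ".join(cmd))

# Only attempt the call when the CLI is installed and the input file exists
if shutil.which("pmotools-python") and os.path.exists("example.pmo.json"):
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.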
This extracts the reference sequences of the insert locations of the targets in the panel info of a PMO and writes them out as a table. The reference sequence is an optional field, so if no reference sequence is loaded, blanks are written instead.
usage: pmotools-python extract_refseq_of_inserts_of_panels [-h] --file FILE
                                                           [--output OUTPUT]
                                                           [--overwrite]

extract ref_seq of inserts of panels, but if no ref_seq is save in the PMO
will just be blank

options:
  -h, --help       show this help message and exit
  --file FILE      PMO file
  --output OUTPUT  output file
  --overwrite      If output file exists, overwrite it
The Python code for the extract_refseq_of_inserts_of_panels script is below.
pmotools-python/src/pmotools/scripts/extract_info_from_pmo/extract_refseq_of_inserts_of_panels.py
#!/usr/bin/env python3
import argparse

from pmotools.pmo_engine.pmo_processor import PMOProcessor
from pmotools.pmo_engine.pmo_reader import PMOReader
from pmotools.utils.small_utils import Utils


def parse_args_extract_refseq_of_inserts_of_panels():
    parser = argparse.ArgumentParser()
    parser.add_argument("--file", type=str, required=True, help="PMO file")
    parser.add_argument(
        "--output", type=str, default="STDOUT", required=False, help="output file"
    )
    parser.add_argument(
        "--overwrite", action="store_true", help="If output file exists, overwrite it"
    )
    parser.description = "extract ref_seq of inserts of panels, but if no ref_seq is save in the PMO will just be blank"
    return parser.parse_args()


def extract_refseq_of_inserts_of_panels():
    args = parse_args_extract_refseq_of_inserts_of_panels()

    # check files
    Utils.inputOutputFileCheck(args.file, args.output, args.overwrite)

    # read in PMO
    pmo = PMOReader.read_in_pmo(args.file)

    # get panel insert locations
    panel_bed_locs = PMOProcessor.extract_panels_insert_bed_loc(pmo)

    # write
    with Utils.smart_open_write(args.output) as f:
        f.write("\t".join(["target_id", "ref_seq"]) + "\n")
        for loc in panel_bed_locs:
            f.write("\t".join([loc.name, loc.ref_seq]) + "\n")


if __name__ == "__main__":
    extract_refseq_of_inserts_of_panels()
target_id ref_seq
t1 AAACTTTTTTTATTTTTTTTGTCAATAGATAAATGATCAATATTTTCTATATTTAATCTATCAAGTATTTTTATATATCTATTATTTCTTTCTTCGATGGATAAATTATAAGAATCAATATCCTTTCTTTCATCAACAAACTTTTTTATTGTTAACTCCATTTTTTTATTTAA
t2 CTTTCGATACAGGACATATAGATCATAATATAAACGAATATGAAAAACATTTTACAATTTTAAAAGAATCTTTTTTTCATAAGTCATTAAAATTTATGGATTATATATGGATTGTTATAATGAAACGAGAAAATAATACATTTTTAAATAGAATAAGAACTGAACAAGTCAAAAAATCGTTATTAATAACAGGTATTATAAACGAAAATATT
t3 TTCATTCTTTTTTTAACGAAAACTATTCATCTCAAAAATATAAGATATTTTATATGACGAATGCCATTGTATTTTTTGTTACGTAAAACCTGACTTCTTCAAGGAAAACACATGCG
t4 TGAAAGTAGCGAATACCCTGTAGAAATTGTTAGTAAACCTCTGGAATGGTTATTGTTTCATGATTTGACTAAGCCTGATGTTACTGCACTACCTGAAGAATTACCATTAACGAGCTATAAAGTAACACCTACTTCTATTAATGTATTGCATAAAGAGGGCCCCACTTTAAAA
t5 TTAAATAAATTAAGTGAAGATGATTATGAAAAAAATTGTAGACATTTAAATTATTTAATAGATAATATAAGAGAATTATTTTTTAGTTCTAAAAATATTAGAATTCCTGATGTAGCTCGTAAATATTTGTGGGATAATCAAATTGAAGGAAATCTCAAAAAATTAATATCTTCAGAAACAAAATATAAA
t6 AAAAGAAAGATGTTAAGAAAAAAAAGATCGATATGATAAATATATTACACATTCCTACAAATAATTATAGCATGCCAAATGAAAAAGATATAAACAGTTATGAATTTTTTGGATCTGAACCTATTGAATTATATGATGTAAAATCTAATAAGAATTGTGTTATAACTAGCCAATCTTATATTTGTATTGAAAACCCGCATGCTGCCATAGTATCCGAA
t7 ACTTTCTTAATTCTATGAATGAAGAATATACGCATATTAATTTAAATAACTTTATACATAATAATAACCATGATAATGACGAAAATTATACATTAGATCAGGTGGAAGGATATCCTATGACTAGTTATCAAAATAATATATACAAGGATTTTTT
t8 TGATAAAAAAAGAAAATGTAACAATCTACCTTGTGATTGTATCTTATGTAAAAAAAAGCACAATGTTTGTTATTGTAATATGTGTAAAAAGAAAGAAAAAAATATGAGAGAAGATTTATCGGTTGTAAAATATAATGAATATCCGATTAGTGT
t9 AATACTCCGTCAGATATTCAAAATGAATGTATTTGTTCAAAAATAAATGAAGATAGAAATAATGATATGATAAACATATCTGAAATATATTATCGTTTCATTAAATTTATAACAATGATTGAAATATATTTGTGTGTAATAGAGGAAATTAAAAGGGAAGAATGGGAAA
t10 AACTTATGTTCATGAGCTAATTTCCCACAAATACTCCATAACGAACTTTTCATTTTATTAAATTTATCTCTCAAAAGAGAATGACTATAATGCCATATTAAATACATATCTTTCCTTTCTAATTTTCCTGGTAATTCTATTATCATTCTTTCTAAATCTTCTTCTGTAACTTTTC
t11 GGAATTTTCTTTTTTATGACTTTCTTCTCCTTGTTCAGAAGCTTCTTTTTCATCCTTTTTTTCTGCTGCGTCAGATAAATTGGGGGAAGCACTTGAAGATTCATTTCCTCCAGGAGTATTACTAGTACTTACTCCGTCCACATTTGGTTTTTCTTCCCCTAGAATTCTCAT
t12 ATACACATAAGAAAAAAAAAATTTATTTATTCTTACAAAAAGAATATAAAAACAAAATTTTGGGATTTATAAATTTTTATAAACATATAACACACAAAATAAAAAAGAAACAAGAAAATGTTCATGATAAAATCACTTTTTTAAAATGTCTAAAGGAACTCTTTTTTGTCACACATACAAATAA
t13 TTAATTATGAAGACAGTCTCACGACTTCATGTTATATTGATGAAAACAAATCCGATTCATCCTATGAAACTGAAGAAAATGTAAACTATAATAATAAAATGGGTAAACGCAAAAATTTAG
t14 AACTTTTTAACACTATCATTATAATTATGTCTTTTATTTTCATATTTTTCTTTATAATAATTTATATCCTTTAATTTTTCTTTCATCAAATTTAACCATTTATCATTTAAATTCTCTTTTTCCACAGCTCCAGCATTTTTATTTATATCATCTACAACTACATCTTCCTTCACATAATTATTTATATAAAAATTATTATCATCTAA
t15 AGCATTTATAACAAAATATCAGAAGATGGAAAAACCAAAAATAGCCTACTTAGCGATATATCTAAATTATTTAAAATTGTAAAAGAAAGCAAATTAATATTTGTAACTGGATTTTTATTAACAACCTTGTCAGCAATTGTCGATTCATACATTCCAATTTTTCTATCCAAAACGGTATCTTTTGTAATGGAAAGAAAAAAATTTACATTTCTTAAAAT
t16 TTGAACTATTTACGACATTAAACACACTGGAACATTTTTCCATTTTACAAATTTTTTTTTCAATATCATTTGCATAATCTAATTCGTCTTTAGGTTTATTAGCAGAGCCAGGCTTTATTCTAACTTGAATAC
t17 AAAAAGAGATTATAGAAGTGGAAAGAAGACATATATAATACAAGCTCTACAATATGCATTAACATATTATAGCAAACTTTCAAATAGAAAGGAAGCACCTAAAGTAACCATGTTATTTACAGATGGAAATGATTCCTATGAATCAGAAAAGGGATTACAGGATATTGCATTATTATATAGAAAAG
t18 ATATTTTTATAATTTCCTTTCATCTTATTTTTTTCTTTATATTTTTTTTTTTCATATGTAAGATTTATATAATCACTTTCGCTTTTTCTAACCCTTTTGAACTTTTTAAATGTACTTCGTTCATTATTTCTAATTCTCGTAAACACAAGAATAAATATTTTGATATATCTTAT
t19 TCATATTCGTTTCAGCGTTTATAGAGCGAATATTATCGATTATGTCTATATTCCCTAAAATATGTACATAATAATCTCCATTTAATAATGTTACTTTCTTAAGAATATTTTTGTATATGACATCAAAGATATTACGAGTTAACAATACATCTACATTTCCGTCTATTATATAT
t20 AATATTTGTATATATTCTAGTAAACACTTTAGGTACACCTGCAAGTACCTCAACATCTGAATTATATAAATCTTTAGAAAAACAACTAATATCCTTACTCGATATATTTATTTTTATACCAAGAATAACAGCCATATAAACCAAAACTCTCTCAAATACATGTGATACAGGTAAA
t21 TTAGATTTTTTCCCTCCAGCAGGTGCACTAACTTTAGGTGTTTTAAATCTAGATGTATGTGGAACCCCATCTTTATTTGGTTTACCTCTATTTAATCTTTTACCAGCAGTAGAAACTACATTCTCTCGATTATCATTACCTACTTCAGTCTTTACAATTGTAGGAGGTGTTTTAACATGATTATCCCCTCCATGAGGAGTAACCCTTAAA
t22 CTTTCAAATTATATAAAGACAGAACTAAGAAATATAAACCTGCAAGAAATAAAAAACAATATAATAAAAATATTTAAAGAATTCAAATCTGCACACAAAGAAATTAAAAAAGAATCAGAACAAATTAATAAAGAATTTACCAAAATGGATGTCGTCATAAATCAATT
t23 ATGCTAGTTTTGCTGCTCATGAAAATAAAAGCTACTCATATGAAAGTCGTACATATAAAATGTATCCACCTGAATTTAATACATTAATGTTAAAAGCAGATTATTTTATAAGAGATATAAATACACGAGGATTTAGAGAAGTAAATATGGATTCATGTAAATCATATACAAAT
t24 ATATATTTTACATAATAACAATCCTTCATGTAATGATTATAATTTAAATAATCTTTCATTTAATATAAATAAATATATTAATGAAGAAAAAGGCAAAAATAAAAAAACAAATCAACATATATCAGAACAATTTTTATTTCCTA
t25 GAAATGTAATTCCCTAGATATGAAATATTTTTGTGCAGTTACAACATATGTGAATGAATCAAAATATGAAAAATTGAAATATAAGAGATGTAAATATTTAAACAAAGAAACTGTGGATAATGTAAATGATATGCCTAATTCTAAAAAATTACAAAATGTTGTAGTTATGGGAAGAACAAGCTGGGAAAGCATTCCAAAAAAATTTA
t26 AATAGTTTTACTTGGGAAATTAAATTACTATAAATGTTTTATTATAGGAGGTTCCGTTGTTTATCAAGAATTTTTAGAAAAGAAATTAATAAAAAAAATATATTTTACTAGAATAAATAGTACATATGAATGTGATGTATTTTTTCCAGAAATAAATGAAAAT
t27 TAAATTATTATAGAAAGATAGAAGTTATTTTATATGAATGGTTATATTTTCATTATAATAATGTATATAACACCAAAGTAAAAAAACAAAAATTTATATTTACCCAACAAAAAAAAGACATATCAAAACATAACAAGTTATATCTTCAGTATGATCAAAATAAAAGAAACTCTGAAATAGAACATACA
t28 ATCTTATAAAGTTAATGAAATAGCTCAACTTAATTTAACCATAGAAAGAGATTTAACAGATGATGCTGTGATTTTTGCACACTCACTTTATTTACCATTTGAAAAAGAAGAAATGTGGTGGATCGTTATCGGAATTAAAAAAATGAACTTACTTTTATCTATCAAAAAATTATCTTTATTGAAAAGTGTCAATAATATAAAAATTAATTT
t29 AATGTTCTAAGATCTGATGGAAAAATATCTGATCAAGGTTCTCAGAAAAGTCCTCCTAAGGAACTTTCTAATAAACAAATGACTCCTGCTCAACGTAAGAATGTGCCACATTTTGTTGAAAGAAGAGGCTATGGAAATAGTCATGTTAGGGGTAACGCACTTAAAAAAATTAGTA
t30 GAAGAAATAAATAAAATAATTTATACAAACGAATTTAATAATTATGAAGATAAAATATATGAAGATGTCAAATATATTAAAGAACAGGAAAATGAAATGTACTTGAGAGATGGAATTGAAGAGTTACATATGGATGAACCAAGTGGGGATGTATATTTTGATGATCAAGATGATTATATATTTTTAGAT
t31 AAATCATCTAAGAATAAACTTTTTTGTTTAAACCATTTATTGAACATTTCACTTAAATGTGATTCAATTTTTTTTTCTGAGCTTTCCATAATTTGATTACAACTATTCAATTTGTTTGTATAAGTTTTAGGTCTTATTATTCTTTTACGTTTTA
t32 ATTGAAAGAGTTAAAGGGAAAAATACAAAATTATTTAGATAATGATATTCAATTGAAAAATGGAAAACTCCTATATAAGGATACATGGGATAGAATTGTTTTGAAATTTTGTAGAACTGTAGCAATAGAAGAGGCAGAATACACTAGAAAATTTTATAGCTTAATTAATGATAAACATACAATT
t33 ATTGTGATGTATATAAATTCCCTTCTTCTTTATGTACCACATTATAAGAACCACGAACATATTCTTTTAAAAATGTTAACTTACTTCCAACAAATATATAATCAAAACAAAATTTTTTATTGTCGAAATGTTCTCGTTCAACCCCATTCATATTTCGGATATCATTATTTAA
t34 ATTTATATCATTTGTATGTGCTGTATTATCAGGAGGAACATTACCTTTTTTTATATCTGTGTTTGGTGTAATATTAAAGAACATGAATTTAGGTGATGATATTAATCCTATAATATTATCATTAGTATCTATAGGTTTAGTACAATTTATATTATCAATGAT
t35 TACGAAATTTATAACAATTTTTACATATGCCAGTTCCTTTTTAGGTTTATATATTTGGTCATTAATAAAAAATGCACGTTTGACTTTATGTATTACTTGCGTTTTTCCGTTAATTTA
t36 ATATTATTACGGTACCATTATATGATTCTTTAGGTCCTCAATCAAGTAGATTTATATTAGATCAAACACAGATGGAAACTATTGTATGTGACAAGACATGTGCTCGTAATTTATTTAAG
t37 TAATGAAGAAAATATGTCTGACAGACCAAACAGTTTATCTCATGATAAGGATCAACACCTCGATGAAACACATAATGAACAATATGGATTATACGTAAAAGAAATGGAATCTAAAGTTGAAAAATTAGCTGAAAA
t38 AATTTGAATAAAATTTATCATTCACAATATTATATTTTTGTTTCTTCTTTTTTCCCATAATATTATTATTTTGTTCACAATATATGTTCATGTGTCCCCTCTTTTTCTGTAAAATATTAAAATGTTTCTTACTATGCTTCTCTTCATTTTTAATAACA
t39 AGAACAATTGCTAAACACCAAATTGGGTGAAACAAAAAACCACCTGAACAGAACCCCATTTATACCTGAATCGGTCATACGAGAAAGGAAATTACGCCAAGAAAAAGCTCAATCCACAAACAACATGTTCGATTCAACAAACGCAGATAGTATTACGTCCCCATGTGATCCAACGAATGCCAC
t40 ATCATTATTATTATTATTATAATTATCATTGTTATTATTATTGTTGTTGTTGTTATTATTATTTCCTTTATTATTTAATATGCATTTTTTAATATCCATTTTATTATTTATCATATCAGTACTTTTATTTTTCTCCTTTTCGTTTATAGCTTTTTCCCTATGCACCCTGAATTTCCCATTTTTTTTT
t41 CTTTTATTTGAACTTTTTTATTTTCTTCATTATCAATATAAAAACAACAATCTTTAAATTTATGAAATAACACCCAACTTAATGTATAAAATATAATTTCTATAAAATCAAAATTAATGTAATAACCTAATT
t42 ACATTCAATATGGTTCTGAGCTTAAAGATGTTATCCTTATACATGTTTCTATTCTCATTCGTTATAACATCAATGACATATAAAACATATCGCTCAAATATTTTGTGAACAATATTATA
t43 TTTTATGTGAAATTTTAAATAAAGAGAAATTGTTTGTGCACACATCTATTTTGGGATATATATCTAATCGTTTATGTTACGATATAAAAAAATATAGATGTTCATTATTGAAG
t44 TAATATTTAATATTTTTGTTTAGGAAGTTTTCAAAATTGATGACATTGATAACCGTTTTTATTAAATCGCTTGGGTTGTTTGTGTGTTCCATTGTAGAAGACATAATTTTAAGAATGTGTTTTTGTGTTATTAAACAAAACTTTGTTACTATATGGAATCGTATGTTATA
t45 AAGAATCTCAACTTTTGCTTAAAAAAAATGATAACAAATATAATTCTAAATTTTGTAATGATTTGAAGAATAGTTTTTTAGATTATGGACATCTTGCTATGGGAAATGATATGGATTTTGGAGGTTATTCAACTAAGGCAGAAAACAAAATTCAAGAAGTTTTTAAAGGGGCTCATGGGGAAATAAGTGAACATAAAATTAAAAATTTTAGAAAAAA
t46 TTAATTTTCTCAAGTAACATATTTTTGTCTTGACTACTTTCATCTGTAAATGGAACGACTACTCTTTGCTTACCTGCAAAAAGGACAACTGACATATGAGCCTTATCCTTACTTACATTCGAATTAAAGACCATAGATTCTAAAAATGGTATAGTTCCTTTTAACCAATAATCC
t47 TAAATTTACAGAAAAATTTAGATTTAATCATATGGCTAAAAAAGAATATAAAAGAAGGAGAAGCTTTAATTTCTGATATACCCACATCAAGTTTTTTAAGGTGTACTACAAATTATAAATTTGTATTGCATCCACAATATGAAGATAGTAACTTGAGAAAGAGGGTTCAAGATTATTAT
t48 ATCATTACTATTCTTTATCCATATATATGTTGTTATATCAGTTGCACCTGAGGGTGGATTCAAACCTCTTGCTTGTATAATATATGCACGTACTACTAAATTACGTGTTAATCTATATTCTCTTACTAAATCATCACATTTCTTTATCATTTCTTTTTTTAAATACATTGTCTTTTCATCATACACATT
t49 AACTAACAAATTATGATAATCTAGTTTATGATATAAAAAATTATTTAGAACAAAGATTAAATTTTCTTGTATTAAATGGAATACCTCGTTATAGGATACTATTTGATATTGGATTAGGATTTGCGAAGAAACATGATCAATCTATTAAACTCTTACAAAATATACATGTATATGATGAGTATCCACTTTTTATTGGATATTCAAGAAAAAGATTTATTGCCCATT
t50 TTGGTTTTAATGAGGATGATTTAGATAAAGAGTTTTTTTTTGATTTGCCGTCGATTAGTGGTTTTTCAAGTAATGGTATGAAAAAATGTAATTTAAGAAATTTATTAAAAAGATTAGAAG
t51 GCCTTAGCATTAATAACACTTATAGGAGGTATTTATATGACTAAAGGAAGAAAAAGTTCCATATGGCATAATTCAGAAATAAATGATACACTTTATGATGTTGTTCTTGAAACAGTGAATCAAGGAAATACAGATG
t52 AAATATATCTTTTACTTTATCTGTAAGTTTCATGGAATCTATTATTTCATGTGCATATTTTATTTTTGCCTCGTTTATATATTTACCTGCTTTTAATTTACATAAACATTTTTCTATTAAGTTTTGATCATTT
t53 CGTAAAAAAAAAGGGTTTTCCTAAAAAACCATCGGTTCATGGTTTTACACAGATGGTGTTTTTGCAGCTTACTAATATAGTCATGTATGCCCTTTGTTTTATGAGCCTTTCACAAATATATGCTTATTTCGAAAACGTGAATTTTTATATTATAAGCAATTTTCGTTTCCTTGAGAGATATTTTAATATATTCAAT
t54 ATGTTTGTGTAATAAAGGATATTTCTATAATGTCAAGAAACAACAATGTGAAGAATGTCCCATAGGTTTTTATTGTCCAAAAATAAGCTCAAAAAATAATATAAAAAAACCTATCAAATGTCCTAAAAATTCCTCAAGTATAAAAAAAGGTATGTATATATCAA
t55 ATTATTTTCGTTTTTTTTTATATTATCATTATAATTATATGTATTTACAAAGAATTTATTATTATTTGTTTCTTTATAATTATATTCATTTTCTTTATTACTTAAGGAATTTATATTTGATAAAACTTGTTGTCCTCTTTCTTCTTTGTTGATGTGAATATTTGAGAAAATAATAATAAATGTAATGACTAGGGCTATGTT
t56 ACAAAGGAACAAGATAATTTTTTCCATTTTTATAAAAGTAATATATATGATATGGAAAATAGAACCTGTAATAATAATATATTACAATATTGTGTTATGAATAATAAATCATATTTATTTCATAATATATCATCCAAAAAAAATATATATATAAAAAAAAAGAATCATCAGTTTTCTTTATTCATATATTC
t57 ACAAATTATCCTTAGCCCCAGATATGGTAAAGACATATCACTGTTATAAATTAGGAAAGCAAGCAGCTGAATTATTAGAATCTATCATTTTAAAGAAAAAGTTCGTAAGATTTAGAGTTACTGATGCTATTGATGTATATGATTTCTTTTATATAAAGAAAGTTTTATCCAGTCGTATTAAGAA
t58 TAATTAATTTTAATGATGATGTTACTTTAGAAACACAATCCATGATAGCACACGGAGGGTCTTTGTCAGAAATAGAAGAAACTGGAGATTTATCTTCAGATGTTGATAGATTACATTCATCAATCGAAACTACTCC
t59 CTAAAACCTGACAGCTTTGTACAGTTATATTTACCCAATTAGAATGTTCAAACATATTTAAGTGTTGCCATTCTTCAAATAATTTTTTTTCATATTCAGTTTTTGGTTGGTTTTTAAATTCTTTTGATACTGTACCTTCTGCGGATGGTATAGGGAAGTTAAAAATTTTATTACGTACTTTGTTTGATT
t60 AACCAAAAAATGATACCTGTACGTTTTTCTCCAGGAGAAACAAAAAAATTAGAAAGGATTCTCAAAAAAGTAGACGACTTTTTCTGTGCCAATATTAAAGCTCAATATACTTGC
t61 TTTTTAAAAAAAAAAATTATGATATAAAATCTTCAATTAATAAAAGATCTATCCAGTTCTTTAAAGACACTAATATAGATCATTATATAGCATATGAATATTATAAGGAGGACCGTACGGAATTTATATTAACTATTATGAATGAAAAAAATATTACTCATCAAGAAACTCAA
t62 AGGGAACAGGTTAAATATATTGAAGACGTGGTTTGATAATCAGGTATTTCTTTAATTAAATATTCATGATATTGTTCGTTAGATACATAAATTAATTCGTCCTCCACATTTTGCTGTTCTGTTATAATTTGTGAAGAAGCTTGGTTTGTCTTATTTGGTAATTGCTCTT
t63 AACATCAGAACATAGTAAAGATTTAAATAATAATGATTCAAAAAATGAATCTAGTGATATTATTTCAGTAAATAATAAATCAAATAAAGTACAAAATCATTTTGAATCATTATCAGATTTAGAATTACTTGAAAATTCCTCACAAGATAATTTAGACAAAGATACAA
t64 CGAGAAGAAACACGTTTTATCTCATAATTCATATGAGAAAACTAAAAATAATGAAAATAATAAATTTTTCGATAAGGATAAAGAGTTAACGATGTCTAATGTAAAAAATGTGTCACAAACAAATTTCAAAAGTCTTTTAAGAAATCTTGGTGTTTCAGAGAATATATTCCTTAAAGAAAATAAATTAAATAAGGAAGGGAAAT
t65 TTTCATCCAAGAGTAATTATGTTTTTCTTGTTGATATTTTGCTTTATACAAACGAGCAAGATACTTTTTATACGATTCATGCTCTTTAATTTTCTTTTCAAAAGTTTTGGTTATAATGTGTTCAAGTGATTGTAAAGTATTTTTCTTTAATTTTCTCCATATTAATCTACATGCGATTATAAGAATTCTATATTCTTTGATATTCCC
t66 GAATTCTATCTATTTGTTTAATTGGTGTCCATGAATTTTTAGAGAAACACCACATATCAGCTAATAAGGAATTATTAACATCTAATCCTCCATGAATATATAAAAAATCATTAAAACTGAAAGAAGCATGCTTATATCTTGCACGTGGACAAATTTTATTTCCAATAAGTTCCCATTTCTTT
t67 CTAGTGACTGTACATTTTTCATCTAATGTTCTATAGGATAAAGAATCTGAAGGACAATGCATATCATTACAAAAATAGCATGCACCTTTAAAATATAAATAAGAATGTCTTAACACAACAAAGGAATCAAAA
t68 TTCTCATTCCTTTGAAAATATGAACTATTCAACTCATCAAGTTTTTCATATGTTATTTCATAACTGCCATCATCTGAATCCTCTTCATCACTACTACTTAAAGAATGCTCTTCTACAGTTATATCTTCATAATGATTATTTTTAGGAAAATATTCTTCAGATCTTTCTATAAATTCTGTTCCACTAC
t69 AATATCGACAATTATAAAGAAAAAATAGAAAATTTTAATGAAAGATATTATGAAGATGATTTAAGTTTCTTAAACTTTCCAGGAACCATAACAAGTTATGATAAAAACCATCAAAGAGATAAAAGAAAAAACTGTTTTGAAGAATATCTAGACAATAATAATTATTTTATTGATAGTAATCTAAAGGAAGTTATAGAAAATAATATATAT
t70 TTATTAAAAACAACAAATAAATTACCAAGGTCATGGAAAAAACTCTTGTTAAATCTATCCATTAGATTTCAAACAATACCTATAGGTGGTTCTATCGAAATCATATCAAAGGATGAGAAAAATATTTCTATGACACGATCCAGTGAATATAATGCAACAACATATGAGCATTACACATTTCCCAAATGGT
t71 ATTGACTTCTCCGGGTTGGTTACATTATCATTACAACCAATATTATTAAGGTTAATACTATTTTTATCATTATTATTGTTGTTTTCATTTGTGTGGTTATATAATAACAATGATGTGTCGTCTATTGGATCATTTTGCACAAGGTCAATATCATTAGAATTATGTTTCATATTATTTGTATACATTTTTTTGTAATTATCATTTTTTCTTACTT
t72 TATAATTTCACCTTTTGAATTACATATGTCTGTATTCAAAAATTTTATATCTCTACTCCATATATTTATCTTTACACCCAAAAACAAAGCAATGAAAAAAACAACCCTTTCATA
t73 GGGTCGACTTTTGGTGGTTTAACATTTTCAGCTACGAGATTTTCGAATGATGGCAATCCTACTTTGATTAATTCCTTTAACATAGATTTTGCGTGTTTTTTTACTTCTGATTTGATATC
t74 AATATCAAGATGGTAGTATTGTTTTAGATGTTAGTAGTACGAGAGATAAACTTTATGGAGAAGTTGAGTGGGAAGGATATAAATTTGAACTTGTGGATACAGGAGGTTTAGTTTTTGAACAAGAAAAGTTTTCAAAAGAAATAAAGGATCAAATTCTTATGGCTCTAAAAGAA
t75 AATCAACTATAATTAAAAATAGTGGAATTATTGAAAACTTTGGAGATATTGATTATATCTTTACGGATAAAACAGGAACACTAACAGAAAATGTCATGGTGCTTAAAGTTATACATATAGGTTTTGATGTTATACATGCAGAAAATGAAAAAAATTCCATACAAGGTAAT
t76 AGTATGAACACAAAACCGAAAGAGAGTTTAAAAAAAGATTCGATTTTAATGAAGGAAGAAATAACATAATAAATGTGTCAAATGTAAAGAAAACATGTAATATAGTTAAAAATGATTTAAAGGAATCTAACGAGATAAAAACTAATATAAACACAAGTTCTTATAATTTATA
t77 TTTTAAACTTTGAATCTGTTCAATTAATAACACTTCATCATTGTCTTCGCCTTTTATAATATCTCCATGATTATGTGCTCCTCTAATTAAGGTATCACTAACCTGTTGTTCATCTGTCGCATGAGAACTAGTAGATATATCTGGTTTATTTTTTTTCTTTCTTTTTGTTATTTCTAATTTTTTT
t78 AACTGATGTTAATGAAGGCCAAAATGGAAAACGTTTAACACAAGAAAAAACCCTAAAACAAATTCCTTCGACATCACCAAATAATTTAGAACAAAAATTACAACAAACACATATTTCTTCTTCCCAACAAAAACATTTAATAAAAACTACTCCTACTTCAGATTCAACTACAACTTCATCAAGCTCTTCTGTTCCTCTTACAAAACAATCCCCTCATTCT
t79 TATAGAGTTATTATGTTCGAATTAATGACCAGATTATTTCGATTTTATACAAAACCAATAGACCTTATTATTACTATGAATACACACAGATATGACACACCTATATTGTTAAAACATTCAAACAGCTTACAACTTATCTTATCAAATCAACAAAAT
t80 CTAAAAAAAGCAAACAAAAAGAAAAGAGAATGGAAGATTTAAGTATTGAGGAAAATTTGAAAGAGAAATTAATAAACCTAAATATTCAGAAACAGTATAGTGGTTTCCATATGAAAGAAATGGAGGAACAAAATATTCCTGAATTAAATATTCCACAACTTTCAGCTATTCATAATAGAGAAGATATGAA
t81 CTGGTTTTTGTACTTCTTGTCCTTGTGTTGATATCTCTTGTTCATTTTTAGATTGAGTATCTGTTTCACTTTGTGCCTTTACCTTTGATAATACGTTTTTACCGAATAATCCTAAATTTTGGAATAA
t82 CCCAATACAATAAATTCTACCATTTGACGTAACACCACAATTATTTCTTCTAGGTATATTTAAATTACTTGAAACATACCATACATCTCTTAAACGATCATACACCTCAGTTTCAAATAAAGCCTTATAATCATAGTTATTACCACCAAAAACGTATAAGAAATTATTCAATACAGCACTTCCAAAATAAGCTTTTTTGGTAGAC
t83 TTATATTTTGGTTCAGACAAATGTACGTTACATGAAATATGTGCACTATCATCAACTAAAGAAATATCTAAACTATCTGTAAAAGTATGTTTAGAACTAACATTTGAAGAGAAGTTACATCCGTGTATGACTTTTTTTTCATATTTTGGTTTTAAAACTATTTTATAATTATTATTTTTAAA
t84 ATCATTAAGTAAAAAATTTACGGTATCGACATTAACAGTGTAATAATTTCTTCCTAAATTTAAATGTTCTATATAATTCAAATTTAATACGGTATTTAACCTTTCTTCAAGTATTATCATTGCATTAATTTGAAACATAAGTAAAAAATTATTAGCACTTCTAAAAT
t85 TAACATTTTTTTAACATCTTTACCTTTTTGACTTGGTTCTTTATCATAATTCTGTTGTTCTGCAGAATCAGCATTTACTTCAGTTTCTTCTTTATTTTGAAAAGTGTTTGATTCTACATGAGAATTGGAAGATGAACGTCTATGTTTTACTTCTGTATAACTAGTACGTTCTCCAGTATGATGAGCCTTATGGTTTACATCTTCAGTTTCTTC
t86 TCTATGCTTTATCAAAACAAAGTAATCAAAATGTAATACATTATTCTTGTCGATCTATATGTGCTTTTACAAAAAACCCTAAGGGTTTAGAAGCGGTAAAGGATATCAAGGACTTTGCGAATATTATAAGTAAATCTGTTGGTGATTTGAGTAAAGAGAA
t87 GCAACTTACTTTAGAAGCCATGAATTCGGCCGATGATCCATTGGAAGCATTTAAAAATACTTTATTAATATTTGATTTTAATTTATCTGAATTTGATAAGGATCCATATATAAATGGTGTGCATGATTTAGCATCTAATATAAAGGATTGTTTAAGAAAAGGAGGTCACAGTAAAATATATT
t88 AAAAAAAATACATTTAAATCTATAATATTTGTAATAAATGATAAGAATATTAGTATTAATAGGGTACTACTATTTGTAAATACTTTTCTACATAATTTATCAATAGTAGGAAATATGATATTCTTTTTGTTTGGCTTTATTTTATCTGGGATTTTCCAATCATAAAGTTCGA
t89 ACATAGAAATATGTGCGAAAATTGGGAAAAAAATAAATAACAGGAATTGTTATATAAATTGAATCAAGAATGGCACAAAGAAAATAATACTGGTGACTTTCACACTAGTGATATCACACACAATAGTGGTATTACATACAGTAGTGGTAACATACCTAGTGAAAGTAATAGTAGTTATATCCAA
t90 CTTACAATGTTCTTCGCATTCGAAATTTTTTTCAGGATTACTTGAAAAGCCTTCTGGACAATTACAATATTCATATCCATGAGTATTCTTACAAACACCTTTTCCACAATTTAAAAAACATTTTTCTTCATTTAAA
t91 ATAATAATAGTAATATAAAGAATTATTCAGCTGTAGATATATTCTCACCCAAAAAAACTGGAAAGGAATGTATTAAATGTCTCCCAGATAATTTTTGTGAATGCGAATGTAGTTGTAAAAATAAAACAGGTTTTTCAATGAAATATAGACATGCCAGTAAGGGATCTAAAGGATATAGTAAAAAAATGA
t92 GAGTAATAGTGATTCATATAAGGTAAATTGTATTAATTTCTCTGAAGGATTTTGTTGCTGTCATCCAATAAATAATTTAGCACTATTATATGGAGAGTATCAACAAAATCAAGAATCAAAAAT
t93 TCTCCTTCCTTTTCATTCTTTACTTTATTTCTATTAATACACAACTCTCCTTGAGTTAAGAGATTTACGATATTATCAATATTGTTTTCTTTCATATACTTCAACTTTTTACTATTCATTTTTAAAAGATAATTAACCAAGTTACTCATGTTTGTATTAGATGGTATGTTCCCAATTTTTTCGTCTTTTCTT
t94 TTATTAAAACTTTTTTTTCTTTCTGTAAAGTTTGTACATTATGTTTTGATGAGTTTTGATTATCTTCATAAAACTTTATATATTTATAAAAATTATTTTGTATAAAATCATTTAATAAAGGTAACATAATCTTTTTAGCTTGATTCAATTCACTACATGAATGTATATT
t95 CATACAATTAATAAAGCTCTACGATAAATTTCTTTTTTATTTGATTTAGAAAAAGGAGAAAACCATTGAAACTTTTTGGCTAGATATTTATGGTCTTTATCAAAAATATATTCTGAA
t96 ATATATATAAAGTTAAACCTATAAATAATACACTACCTAATAAACTATTCTTATATTTAAAAATAAATATAATACATGTTATTAATCCTTCTATTGTTGCCGGAATAATATACATTAAAACAGAACTCATCAAATTATTAGCACTCTCGGTACCTCT
t97 ATAAACACCAGTACCATTTTTTTCTGATAAATTAATATTTTTTTGTATAACATCATATTTATCCCTTTTCGTGGTAAGTGCAGTATCCTGTTTTATTATTATATTATCGAATTCATCATGGTGTATATTTCTTTCAT
t98 AAATGAGGTATAATCATCCATTTCGTTGGGTCGATTTGATCTATTTTTAAGGTCCATATATTGAGATGAATCATTATGCATTTTATAATCATGAGGGATATTAGGTGCATGGTATTTAGGGTCTTTA
t99 TATGGAAAAATGGAATATGAAGTATTAAGTGATGATAACATAGTGTATGAAAATATACAACATGATTTATTAAAAACAATAGAAGATGATGAAGAAATGTTAAAAGGAACTGAAAGGAAGGATAATATAGATATACTGAGGACTCCTGGAAGGGGAGAATATAATATGTGGTCTACTTCTGGACTAGGGTTCTATGAATT
t100 ATCGTTTTGAATTGTTAGAATTTAAAATGACGGAGGATTGTTATACAAAAATGTGGTTTGATTTTATGAGTGATTTTGGAATAGCTACAATGAATGAAACCGAACATACTAGATCTTTTTATGGATT
---
title: Getting panel info out of PMO using pmotools-python
---
```{r setup, echo=F}
source("../common.R")
```
Most of the basic panel info functions can be found under `extract_panel_info_from_pmo`.
```{bash, eval = F}
pmotools-python
```
```{bash, echo = F}
pmotools-python | perl -pe 's/\e\[[0-9;]*m(?:\e\[K)?//g'
```
Download the files used in the examples:
```{bash, eval = F}
cd example
wget https://plasmogenepi.github.io/PMO_Docs/format/moz2018_PMO.json.gz
wget https://plasmogenepi.github.io/PMO_Docs/format/PathWeaverHeome1_PMO.json.gz
```
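The downloaded files are gzip-compressed JSON, so they can be inspected without any extra tooling. The block below is a minimal sketch using only the Python standard library, not part of pmotools-python; `load_pmo` and `top_level_sections` are hypothetical helper names, and it assumes the top level of the PMO is a JSON object.

```python
import gzip
import json


def load_pmo(path):
    """Read a gzip-compressed PMO JSON file and return the parsed object."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def top_level_sections(pmo):
    """Return the sorted top-level keys, or an empty list if the PMO
    is not a JSON object (assumption: a PMO is normally an object)."""
    return sorted(pmo) if isinstance(pmo, dict) else []
```

For example, `top_level_sections(load_pmo("moz2018_PMO.json.gz"))` would list the sections present in the downloaded file.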
# Extract insert locations of panels from PMO
This extracts the insert locations of the targets in the panel info of a PMO and writes them out as a BED file.
```{bash}
pmotools-python extract_insert_of_panels -h
```
The Python code for the `extract_insert_of_panels` script is below.
```{python}
#| echo: true
#| eval: false
#| code-fold: true
#| code-line-numbers: true
#| filename: pmotools-python/src/pmotools/scripts/extract_info_from_pmo/extract_insert_of_panels.py
#| file: ../pmotools-python/src/pmotools/scripts/extract_info_from_pmo/extract_insert_of_panels.py
```
```{bash}
cd example
pmotools-python extract_insert_of_panels --file ../../format/moz2018_PMO.json.gz
```
```{bash}
cd example
pmotools-python extract_insert_of_panels --file ../../format/moz2018_PMO.json.gz --output moz2018_PMO_panel_insert_locs.bed --overwrite
```
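As a rough sketch (not part of pmotools-python), the BED file written above could be read back with the Python standard library to check the insert lengths. It assumes only the standard BED layout, in which the first three tab-separated columns are the chromosome, a 0-based start, and an end-exclusive end; any further columns are ignored. `read_bed_intervals` and `insert_lengths` are hypothetical helper names.

```python
import csv


def read_bed_intervals(path):
    """Yield (chrom, start, end) tuples from a BED file.

    Blank lines, lines starting with '#', and rows whose start/end
    are not integers (e.g. an unmarked header row) are skipped.
    """
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row or row[0].startswith("#") or len(row) < 3:
                continue
            try:
                start, end = int(row[1]), int(row[2])
            except ValueError:
                continue
            yield row[0], start, end


def insert_lengths(path):
    """Return the length of each interval; BED ends are exclusive,
    so the length is simply end - start."""
    return [end - start for _, start, end in read_bed_intervals(path)]
```

Calling `insert_lengths("moz2018_PMO_panel_insert_locs.bed")` would then give one length per target insert.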
The reference sequence can be added if it is loaded in the PMO; if it is not loaded, the column will be blank.
```{bash}
cd example
pmotools-python extract_insert_of_panels --add_ref_seqs --file ../../format/moz2018_PMO.json.gz --output moz2018_PMO_panel_insert_locs.bed --overwrite
```
# Extract ref sequences of insert locations of panels from PMO
This extracts the reference sequence of the insert location of each target in the panel info of a PMO and writes it out as a table. The reference sequence is an optional field, so if no reference sequence is loaded, only blanks will be extracted.
```{bash}
pmotools-python extract_refseq_of_inserts_of_panels -h
```
The Python code for the `extract_refseq_of_inserts_of_panels` script is below.
```{python}
#| echo: true
#| eval: false
#| code-fold: true
#| code-line-numbers: true
#| filename: pmotools-python/src/pmotools/scripts/extract_info_from_pmo/extract_refseq_of_inserts_of_panels.py
#| file: ../pmotools-python/src/pmotools/scripts/extract_info_from_pmo/extract_refseq_of_inserts_of_panels.py
```
```{bash}
cd example
pmotools-python extract_refseq_of_inserts_of_panels --file ../../format/moz2018_PMO.json.gz
```
```{bash}
cd example
pmotools-python extract_refseq_of_inserts_of_panels --file ../../format/moz2018_PMO.json.gz --output moz2018_PMO_panel_ref_seqs.tsv --overwrite
```
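Because the reference sequence is optional, it can be useful to check how many rows of the written table actually carry one. The following is a minimal sketch with the Python standard library, not part of pmotools-python; it assumes the table is tab-separated with a header row, and the column name is passed in by the caller because the exact header used by the tool is not shown here. `count_filled_and_blank` is a hypothetical helper name.

```python
import csv


def count_filled_and_blank(path, column):
    """Return (filled, blank) row counts for `column` in a
    tab-separated table that has a header row (assumption)."""
    filled = blank = 0
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if column not in (reader.fieldnames or []):
            raise KeyError(f"column {column!r} not in {reader.fieldnames}")
        for row in reader:
            # Short rows give None for missing fields; treat as blank.
            if (row[column] or "").strip():
                filled += 1
            else:
                blank += 1
    return filled, blank
```

For example, `count_filled_and_blank("moz2018_PMO_panel_ref_seqs.tsv", "ref_seq")` would report the split, where `"ref_seq"` stands in for whatever the reference sequence column is actually called in the output.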