Portable Microhaplotype Object (PMO)
Contents

  • Extract insert locations of panels from PMO
  • Extract ref sequences of insert locations of panels from PMO

Getting panel info out of PMO using pmotools-python

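Most of the basic panel info commands can be found under extract_panel_info_from_pmo. In the v0.1.0 listing below, the two commands used on this page (extract_insert_of_panels and extract_refseq_of_inserts_of_panels) appear in the extractors_from_pmo group.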

Code
pmotools-python
pmotools-python v0.1.0 - A suite of tools for interacting with Portable Microhaplotype Object (PMO) file format

Available functions organized by groups are
convertors_to_json
    text_meta_to_json_meta - Convert text file meta to JSON Meta
    excel_meta_to_json_meta - Convert Excel file meta to JSON Meta
    microhaplotype_table_to_json_file - Convert microhaplotype table to a JSON file
    terra_amp_output_to_json - Convert Terra output to JSON sequence table

extractors_from_pmo
    extract_pmo_with_selected_meta - Extract samples + haplotypes using selected meta
    extract_pmo_with_select_specimen_names - Extract specific samples from the specimens table
    extract_pmo_with_select_library_sample_names - Extract experiment sample names from experiment_info table
    extract_pmo_with_select_targets - Extract specific targets
    extract_pmo_with_read_filter - Extract with a read filter
    extract_allele_table - Extract allele tables for tools like dcifer or moire
    extract_insert_of_panels - Extract inserts of panels from a PMO
    extract_refseq_of_inserts_of_panels - Extract ref_seq of panel inserts from a PMO

working_with_multiple_pmos
    combine_pmos - Combine multiple PMOs of the same panel

extract_basic_info_from_pmo
    list_library_sample_names_per_specimen_name - List experiment_sample_ids per specimen_id
    list_specimen_meta_fields - List specimen meta fields in the specimen_info section
    list_bioinformatics_run_names - List all tar_amp_bioinformatics_info_ids in a PMO
    count_specimen_meta - Count values of selected specimen meta fields
    count_targets_per_library_sample - Count number of targets per sample
    count_library_samples_per_target - Count number of samples per target

validation
    validate_pmo - Validate a PMO file against a JSON Schema

Getting files for examples

Code
cd example 

wget https://plasmogenepi.github.io/PMO_Docs/format/moz2018_PMO.json.gz
wget https://plasmogenepi.github.io/PMO_Docs/format/PathWeaverHeome1_PMO.json.gz
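The downloaded files are gzip-compressed JSON, so they can be inspected with the Python standard library before running any pmotools-python command. The following is a minimal sketch: the helper name `load_pmo_json` is hypothetical and not part of pmotools-python, and no particular top-level section names are assumed.

```python
import gzip
import json
from pathlib import Path


def load_pmo_json(path):
    """Load a PMO file (plain .json or gzip-compressed .json.gz).

    Hypothetical helper for illustration; not part of pmotools-python.
    """
    path = Path(path)
    # Pick the opener from the file extension
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    pmo_path = Path("moz2018_PMO.json.gz")
    if pmo_path.exists():
        pmo = load_pmo_json(pmo_path)
        if isinstance(pmo, dict):
            # List the top-level sections without assuming which ones exist
            for key in sorted(pmo):
                print(key)
```

Reading the file this way only shows its raw structure; the pmotools-python commands below remain the intended way to extract information from it.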

Extract insert locations of panels from PMO

This extracts the insert location of each target in the panel info of a PMO and writes the locations out as a BED file.

Code
pmotools-python extract_insert_of_panels -h 
usage: pmotools-python extract_insert_of_panels [-h] --file FILE
                                                [--output OUTPUT]
                                                [--overwrite] [--add_ref_seqs]

options:
  -h, --help       show this help message and exit
  --file FILE      PMO file
  --output OUTPUT  output file
  --overwrite      If output file exists, overwrite it
  --add_ref_seqs   add ref seqs to the output as ref_seq

The Python code for the extract_insert_of_panels script is below.

Code
pmotools-python/src/pmotools/scripts/extract_info_from_pmo/extract_insert_of_panels.py
#!/usr/bin/env python3
import argparse


from pmotools.pmo_engine.pmo_processor import PMOProcessor
from pmotools.pmo_engine.pmo_reader import PMOReader
from pmotools.utils.small_utils import Utils


def parse_args_extract_insert_of_panels():
    parser = argparse.ArgumentParser()
    parser.add_argument("--file", type=str, required=True, help="PMO file")
    parser.add_argument(
        "--output", type=str, default="STDOUT", required=False, help="output file"
    )
    parser.add_argument(
        "--overwrite", action="store_true", help="If output file exists, overwrite it"
    )
    parser.add_argument(
        "--add_ref_seqs",
        action="store_true",
        help="add ref seqs to the output as ref_seq",
    )

    return parser.parse_args()


def extract_insert_of_panels():
    args = parse_args_extract_insert_of_panels()

    # check files
    Utils.inputOutputFileCheck(args.file, args.output, args.overwrite)

    # read in PMO
    pmo = PMOReader.read_in_pmo(args.file)

    # get panel insert locations
    panel_bed_locs = PMOProcessor.extract_panels_insert_bed_loc(pmo)

    # write
    with Utils.smart_open_write(args.output) as f:
        f.write(
            "\t".join(
                [
                    "#chrom",
                    "start",
                    "end",
                    "target_id",
                    "length",
                    "strand",
                    "extra_info",
                ]
            )
        )
        if args.add_ref_seqs:
            f.write("\tref_seq")
        f.write("\n")
        for loc in panel_bed_locs:
            f.write(
                "\t".join(
                    [
                        loc.chrom,
                        str(loc.start),
                        str(loc.end),
                        loc.name,
                        str(loc.score),
                        loc.strand,
                        loc.extra_info,
                    ]
                )
            )
            if args.add_ref_seqs:
                f.write("\t" + str(loc.ref_seq))
            f.write("\n")


if __name__ == "__main__":
    extract_insert_of_panels()
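The same two calls the script relies on, `PMOReader.read_in_pmo` and `PMOProcessor.extract_panels_insert_bed_loc`, can presumably be used from your own Python code. The sketch below is an assumption-laden illustration rather than part of pmotools-python: `format_bed_row` and `panel_insert_rows` are hypothetical helper names, and the only attributes used on each location object are the ones the script above reads (`chrom`, `start`, `end`, `name`, `score`, `strand`, `extra_info`, `ref_seq`).

```python
from types import SimpleNamespace

BED_COLUMNS = [
    "#chrom", "start", "end", "target_id", "length", "strand", "extra_info",
]


def format_bed_row(loc, add_ref_seq=False):
    """Format one insert location as the script above writes it (hypothetical helper)."""
    fields = [
        loc.chrom,
        str(loc.start),
        str(loc.end),
        loc.name,
        str(loc.score),  # written under the "length" header by the script
        loc.strand,
        loc.extra_info,
    ]
    if add_ref_seq:
        fields.append(str(loc.ref_seq))
    return "\t".join(fields)


def panel_insert_rows(pmo_path, add_ref_seq=False):
    """Yield the header line, then one line per target (hypothetical helper)."""
    # Imported lazily so format_bed_row works without pmotools-python installed
    from pmotools.pmo_engine.pmo_processor import PMOProcessor
    from pmotools.pmo_engine.pmo_reader import PMOReader

    pmo = PMOReader.read_in_pmo(pmo_path)
    yield "\t".join(BED_COLUMNS + (["ref_seq"] if add_ref_seq else []))
    for loc in PMOProcessor.extract_panels_insert_bed_loc(pmo):
        yield format_bed_row(loc, add_ref_seq)


if __name__ == "__main__":
    # Stand-in object; values copied from the t1 row of the example output
    example = SimpleNamespace(
        chrom="Pf3D7_01_v3", start=145449, end=145622, name="t1", score=173,
        strand="+",
        extra_info="[genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]",
        ref_seq="",
    )
    print(format_bed_row(example))
```

Keeping the row formatting separate from the PMO loading makes it possible to filter or reorder locations before writing them.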
Code
cd example 
pmotools-python extract_insert_of_panels --file ../../format/moz2018_PMO.json.gz
#chrom  start   end target_id   length  strand  extra_info
Pf3D7_01_v3 145449  145622  t1  173 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 179903  180115  t2  212 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 181557  181673  t3  116 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 495971  496143  t4  172 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 512199  512388  t5  189 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 531682  531900  t6  218 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 532690  532844  t7  154 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 534215  534368  t8  153 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_01_v3 534941  535110  t9  169 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_02_v3 109807  109982  t10 175 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_02_v3 278165  278336  t11 171 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_02_v3 470492  470676  t12 184 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_02_v3 805822  805942  t13 120 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_03_v3 85440   85646   t14 206 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_03_v3 141963  142181  t15 218 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_03_v3 221363  221495  t16 132 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_03_v3 618396  618581  t17 185 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_03_v3 654002  654175  t18 173 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_03_v3 850816  850989  t19 173 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 109912  110087  t20 175 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 133491  133701  t21 210 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 141778  141945  t22 167 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 415653  415826  t23 173 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 544718  544861  t24 143 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 748230  748436  t25 206 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 748533  748696  t26 163 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 802525  802713  t27 188 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 1037634 1037844 t28 210 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 1100656 1100831 t29 175 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 1102389 1102578 t30 189 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 1113450 1113604 t31 154 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_04_v3 1128489 1128673 t32 184 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_05_v3 329378  329550  t33 172 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_05_v3 958059  958221  t34 162 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_05_v3 958389  958506  t35 117 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_05_v3 1042162 1042281 t36 119 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_05_v3 1309609 1309744 t37 135 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_06_v3 145343  145501  t38 158 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_06_v3 532195  532378  t39 183 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_07_v3 165235  165422  t40 187 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_07_v3 166035  166167  t41 132 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_07_v3 298886  299005  t42 119 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_07_v3 729975  730088  t43 113 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_07_v3 1149415 1149585 t44 170 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_07_v3 1358694 1358911 t45 217 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 102326  102500  t46 174 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 336468  336647  t47 179 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 339168  339357  t48 189 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 549993  550218  t49 225 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 933023  933143  t50 120 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 1269320 1269456 t51 136 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 1344686 1344819 t52 133 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_08_v3 1362891 1363087 t53 196 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_09_v3 516928  517092  t54 164 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_09_v3 596133  596334  t55 201 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_09_v3 685601  685792  t56 191 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_09_v3 1178894 1179078 t57 184 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_09_v3 1406405 1406541 t58 136 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_09_v3 1437114 1437303 t59 189 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_10_v3 377095  377209  t60 114 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_10_v3 992371  992544  t61 173 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_10_v3 1386700 1386869 t62 169 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_10_v3 1399544 1399711 t63 167 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_10_v3 1436479 1436682 t64 203 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_11_v3 119486  119693  t65 207 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_11_v3 1009856 1010038 t66 182 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_11_v3 1018953 1019085 t67 132 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_11_v3 1376185 1376372 t68 187 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_11_v3 1552430 1552640 t69 210 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_11_v3 1750865 1751055 t70 190 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_11_v3 1816211 1816425 t71 214 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_12_v3 63166   63280   t72 114 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_12_v3 659891  660010  t73 119 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_12_v3 684088  684261  t74 173 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_12_v3 943258  943428  t75 170 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_12_v3 1237431 1237603 t76 172 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_12_v3 2050130 2050314 t77 184 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 103659  103879  t78 220 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 156566  156722  t79 156 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 1150303 1150493 t80 190 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 1419543 1419670 t81 127 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 1725365 1725570 t82 205 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 1876352 1876534 t83 182 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 2114975 2115142 t84 167 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 2124634 2124847 t85 213 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 2479086 2479246 t86 160 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 2481106 2481288 t87 182 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_13_v3 2669135 2669307 t88 172 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 39953   40137   t89 184 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 120215  120351  t90 136 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 150105  150294  t91 189 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 279663  279786  t92 123 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 407379  407571  t93 192 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 564208  564377  t94 169 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 1038369 1038486 t95 117 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 1956129 1956286 t96 157 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 1992289 1992426 t97 137 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 2524962 2525089 t98 127 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 3124642 3124842 t99 200 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Pf3D7_14_v3 3214351 3214478 t100    127 +   [genome_name_version=3D7_2020-09-01;panel=heomev1;reaction=full;]
Code
cd example 
pmotools-python extract_insert_of_panels --file ../../format/moz2018_PMO.json.gz --output moz2018_PMO_panel_insert_locs.bed --overwrite

The reference sequence can be added as an extra ref_seq column with --add_ref_seqs if it is loaded in the PMO; if it is not loaded, the column will be blank.

Code
cd example 
pmotools-python extract_insert_of_panels --add_ref_seqs --file ../../format/moz2018_PMO.json.gz --output moz2018_PMO_panel_insert_locs.bed --overwrite
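Once the table has been written to a file, it can be read back as ordinary tab-separated text. The sketch below uses only the standard library; the helper names `read_insert_table`, `length_mismatches`, and `targets_missing_ref_seq` are hypothetical. It checks that the `length` column equals `end - start`, which holds for the rows shown in the example output, and lists targets whose `ref_seq` column is absent or blank.

```python
import csv
import io
from collections import Counter


def read_insert_table(handle):
    """Read the tab-separated table written by extract_insert_of_panels."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        # Coordinates and length are written as text; convert them to int
        for column in ("start", "end", "length"):
            row[column] = int(row[column])
        rows.append(row)
    return rows


def length_mismatches(rows):
    """Return target_ids whose length column differs from end - start."""
    return [r["target_id"] for r in rows if r["end"] - r["start"] != r["length"]]


def targets_missing_ref_seq(rows):
    """Return target_ids with no ref_seq column or a blank ref_seq value."""
    return [r["target_id"] for r in rows if not (r.get("ref_seq") or "").strip()]


if __name__ == "__main__":
    # Two rows copied from the example output above, without ref_seq
    sample = (
        "#chrom\tstart\tend\ttarget_id\tlength\tstrand\textra_info\n"
        "Pf3D7_01_v3\t145449\t145622\tt1\t173\t+\t[panel=heomev1;]\n"
        "Pf3D7_01_v3\t179903\t180115\tt2\t212\t+\t[panel=heomev1;]\n"
    )
    rows = read_insert_table(io.StringIO(sample))
    print(Counter(r["#chrom"] for r in rows))
    print(length_mismatches(rows))
    print(targets_missing_ref_seq(rows))
```

To run this against the file produced above, pass `open("moz2018_PMO_panel_insert_locs.bed", newline="")` to `read_insert_table` instead of the in-memory sample.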

Extract ref sequences of insert locations of panels from PMO

This extracts the reference sequence of the insert location of each target in the panel info of a PMO and writes it out as a table. The reference sequence is an optional field, so if no reference sequence is loaded, blanks are written instead.
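Because the table has just two tab-separated columns, `target_id` and `ref_seq`, it converts easily to FASTA for downstream tools. The sketch below is one possible way to do that with the standard library; `refseq_table_to_fasta` is a hypothetical helper, not a pmotools-python function, and the sequences in the demo are made up. Targets with a blank `ref_seq` are skipped and reported rather than written as empty records.

```python
import csv
import io


def refseq_table_to_fasta(table_handle, fasta_handle, line_width=60):
    """Write one FASTA record per target with a non-blank ref_seq.

    Returns the target_ids that were skipped because ref_seq was blank.
    """
    skipped = []
    for row in csv.DictReader(table_handle, delimiter="\t"):
        seq = (row.get("ref_seq") or "").strip()
        if not seq:
            skipped.append(row["target_id"])
            continue
        fasta_handle.write(f">{row['target_id']}\n")
        # Wrap the sequence at a fixed width
        for i in range(0, len(seq), line_width):
            fasta_handle.write(seq[i:i + line_width] + "\n")
    return skipped


if __name__ == "__main__":
    # Toy table with made-up sequences; t2 has no reference sequence
    table = io.StringIO("target_id\tref_seq\nt1\tACGTACGT\nt2\t\n")
    fasta = io.StringIO()
    skipped = refseq_table_to_fasta(table, fasta, line_width=4)
    print(fasta.getvalue(), end="")
    print("skipped:", skipped)
```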

Code
pmotools-python extract_refseq_of_inserts_of_panels -h 
usage: pmotools-python extract_refseq_of_inserts_of_panels [-h] --file FILE
                                                           [--output OUTPUT]
                                                           [--overwrite]

extract ref_seq of inserts of panels, but if no ref_seq is save in the PMO
will just be blank

options:
  -h, --help       show this help message and exit
  --file FILE      PMO file
  --output OUTPUT  output file
  --overwrite      If output file exists, overwrite it

The Python code for the extract_refseq_of_inserts_of_panels script is below.

Code
pmotools-python/src/pmotools/scripts/extract_info_from_pmo/extract_refseq_of_inserts_of_panels.py
#!/usr/bin/env python3
import argparse


from pmotools.pmo_engine.pmo_processor import PMOProcessor
from pmotools.pmo_engine.pmo_reader import PMOReader
from pmotools.utils.small_utils import Utils


def parse_args_extract_refseq_of_inserts_of_panels():
    parser = argparse.ArgumentParser()
    parser.add_argument("--file", type=str, required=True, help="PMO file")
    parser.add_argument(
        "--output", type=str, default="STDOUT", required=False, help="output file"
    )
    parser.add_argument(
        "--overwrite", action="store_true", help="If output file exists, overwrite it"
    )
    parser.description = "extract ref_seq of inserts of panels, but if no ref_seq is save in the PMO will just be blank"
    return parser.parse_args()


def extract_refseq_of_inserts_of_panels():
    args = parse_args_extract_refseq_of_inserts_of_panels()

    # check files
    Utils.inputOutputFileCheck(args.file, args.output, args.overwrite)

    # read in PMO
    pmo = PMOReader.read_in_pmo(args.file)

    # get panel insert locations
    panel_bed_locs = PMOProcessor.extract_panels_insert_bed_loc(pmo)

    # write
    with Utils.smart_open_write(args.output) as f:
        f.write("\t".join(["target_id", "ref_seq"]) + "\n")
        for loc in panel_bed_locs:
            f.write("\t".join([loc.name, loc.ref_seq]) + "\n")


if __name__ == "__main__":
    extract_refseq_of_inserts_of_panels()
Code
cd example 
pmotools-python extract_refseq_of_inserts_of_panels --file ../../format/moz2018_PMO.json.gz
target_id   ref_seq
t1  AAACTTTTTTTATTTTTTTTGTCAATAGATAAATGATCAATATTTTCTATATTTAATCTATCAAGTATTTTTATATATCTATTATTTCTTTCTTCGATGGATAAATTATAAGAATCAATATCCTTTCTTTCATCAACAAACTTTTTTATTGTTAACTCCATTTTTTTATTTAA
t2  CTTTCGATACAGGACATATAGATCATAATATAAACGAATATGAAAAACATTTTACAATTTTAAAAGAATCTTTTTTTCATAAGTCATTAAAATTTATGGATTATATATGGATTGTTATAATGAAACGAGAAAATAATACATTTTTAAATAGAATAAGAACTGAACAAGTCAAAAAATCGTTATTAATAACAGGTATTATAAACGAAAATATT
t3  TTCATTCTTTTTTTAACGAAAACTATTCATCTCAAAAATATAAGATATTTTATATGACGAATGCCATTGTATTTTTTGTTACGTAAAACCTGACTTCTTCAAGGAAAACACATGCG
t4  TGAAAGTAGCGAATACCCTGTAGAAATTGTTAGTAAACCTCTGGAATGGTTATTGTTTCATGATTTGACTAAGCCTGATGTTACTGCACTACCTGAAGAATTACCATTAACGAGCTATAAAGTAACACCTACTTCTATTAATGTATTGCATAAAGAGGGCCCCACTTTAAAA
t5  TTAAATAAATTAAGTGAAGATGATTATGAAAAAAATTGTAGACATTTAAATTATTTAATAGATAATATAAGAGAATTATTTTTTAGTTCTAAAAATATTAGAATTCCTGATGTAGCTCGTAAATATTTGTGGGATAATCAAATTGAAGGAAATCTCAAAAAATTAATATCTTCAGAAACAAAATATAAA
t6  AAAAGAAAGATGTTAAGAAAAAAAAGATCGATATGATAAATATATTACACATTCCTACAAATAATTATAGCATGCCAAATGAAAAAGATATAAACAGTTATGAATTTTTTGGATCTGAACCTATTGAATTATATGATGTAAAATCTAATAAGAATTGTGTTATAACTAGCCAATCTTATATTTGTATTGAAAACCCGCATGCTGCCATAGTATCCGAA
t7  ACTTTCTTAATTCTATGAATGAAGAATATACGCATATTAATTTAAATAACTTTATACATAATAATAACCATGATAATGACGAAAATTATACATTAGATCAGGTGGAAGGATATCCTATGACTAGTTATCAAAATAATATATACAAGGATTTTTT
t8  TGATAAAAAAAGAAAATGTAACAATCTACCTTGTGATTGTATCTTATGTAAAAAAAAGCACAATGTTTGTTATTGTAATATGTGTAAAAAGAAAGAAAAAAATATGAGAGAAGATTTATCGGTTGTAAAATATAATGAATATCCGATTAGTGT
t9  AATACTCCGTCAGATATTCAAAATGAATGTATTTGTTCAAAAATAAATGAAGATAGAAATAATGATATGATAAACATATCTGAAATATATTATCGTTTCATTAAATTTATAACAATGATTGAAATATATTTGTGTGTAATAGAGGAAATTAAAAGGGAAGAATGGGAAA
t10 AACTTATGTTCATGAGCTAATTTCCCACAAATACTCCATAACGAACTTTTCATTTTATTAAATTTATCTCTCAAAAGAGAATGACTATAATGCCATATTAAATACATATCTTTCCTTTCTAATTTTCCTGGTAATTCTATTATCATTCTTTCTAAATCTTCTTCTGTAACTTTTC
t11 GGAATTTTCTTTTTTATGACTTTCTTCTCCTTGTTCAGAAGCTTCTTTTTCATCCTTTTTTTCTGCTGCGTCAGATAAATTGGGGGAAGCACTTGAAGATTCATTTCCTCCAGGAGTATTACTAGTACTTACTCCGTCCACATTTGGTTTTTCTTCCCCTAGAATTCTCAT
t12 ATACACATAAGAAAAAAAAAATTTATTTATTCTTACAAAAAGAATATAAAAACAAAATTTTGGGATTTATAAATTTTTATAAACATATAACACACAAAATAAAAAAGAAACAAGAAAATGTTCATGATAAAATCACTTTTTTAAAATGTCTAAAGGAACTCTTTTTTGTCACACATACAAATAA
t13 TTAATTATGAAGACAGTCTCACGACTTCATGTTATATTGATGAAAACAAATCCGATTCATCCTATGAAACTGAAGAAAATGTAAACTATAATAATAAAATGGGTAAACGCAAAAATTTAG
t14 AACTTTTTAACACTATCATTATAATTATGTCTTTTATTTTCATATTTTTCTTTATAATAATTTATATCCTTTAATTTTTCTTTCATCAAATTTAACCATTTATCATTTAAATTCTCTTTTTCCACAGCTCCAGCATTTTTATTTATATCATCTACAACTACATCTTCCTTCACATAATTATTTATATAAAAATTATTATCATCTAA
t15 AGCATTTATAACAAAATATCAGAAGATGGAAAAACCAAAAATAGCCTACTTAGCGATATATCTAAATTATTTAAAATTGTAAAAGAAAGCAAATTAATATTTGTAACTGGATTTTTATTAACAACCTTGTCAGCAATTGTCGATTCATACATTCCAATTTTTCTATCCAAAACGGTATCTTTTGTAATGGAAAGAAAAAAATTTACATTTCTTAAAAT
t16 TTGAACTATTTACGACATTAAACACACTGGAACATTTTTCCATTTTACAAATTTTTTTTTCAATATCATTTGCATAATCTAATTCGTCTTTAGGTTTATTAGCAGAGCCAGGCTTTATTCTAACTTGAATAC
t17 AAAAAGAGATTATAGAAGTGGAAAGAAGACATATATAATACAAGCTCTACAATATGCATTAACATATTATAGCAAACTTTCAAATAGAAAGGAAGCACCTAAAGTAACCATGTTATTTACAGATGGAAATGATTCCTATGAATCAGAAAAGGGATTACAGGATATTGCATTATTATATAGAAAAG
t18 ATATTTTTATAATTTCCTTTCATCTTATTTTTTTCTTTATATTTTTTTTTTTCATATGTAAGATTTATATAATCACTTTCGCTTTTTCTAACCCTTTTGAACTTTTTAAATGTACTTCGTTCATTATTTCTAATTCTCGTAAACACAAGAATAAATATTTTGATATATCTTAT
t19 TCATATTCGTTTCAGCGTTTATAGAGCGAATATTATCGATTATGTCTATATTCCCTAAAATATGTACATAATAATCTCCATTTAATAATGTTACTTTCTTAAGAATATTTTTGTATATGACATCAAAGATATTACGAGTTAACAATACATCTACATTTCCGTCTATTATATAT
t20 AATATTTGTATATATTCTAGTAAACACTTTAGGTACACCTGCAAGTACCTCAACATCTGAATTATATAAATCTTTAGAAAAACAACTAATATCCTTACTCGATATATTTATTTTTATACCAAGAATAACAGCCATATAAACCAAAACTCTCTCAAATACATGTGATACAGGTAAA
t21 TTAGATTTTTTCCCTCCAGCAGGTGCACTAACTTTAGGTGTTTTAAATCTAGATGTATGTGGAACCCCATCTTTATTTGGTTTACCTCTATTTAATCTTTTACCAGCAGTAGAAACTACATTCTCTCGATTATCATTACCTACTTCAGTCTTTACAATTGTAGGAGGTGTTTTAACATGATTATCCCCTCCATGAGGAGTAACCCTTAAA
t22 CTTTCAAATTATATAAAGACAGAACTAAGAAATATAAACCTGCAAGAAATAAAAAACAATATAATAAAAATATTTAAAGAATTCAAATCTGCACACAAAGAAATTAAAAAAGAATCAGAACAAATTAATAAAGAATTTACCAAAATGGATGTCGTCATAAATCAATT
t23 ATGCTAGTTTTGCTGCTCATGAAAATAAAAGCTACTCATATGAAAGTCGTACATATAAAATGTATCCACCTGAATTTAATACATTAATGTTAAAAGCAGATTATTTTATAAGAGATATAAATACACGAGGATTTAGAGAAGTAAATATGGATTCATGTAAATCATATACAAAT
t24 ATATATTTTACATAATAACAATCCTTCATGTAATGATTATAATTTAAATAATCTTTCATTTAATATAAATAAATATATTAATGAAGAAAAAGGCAAAAATAAAAAAACAAATCAACATATATCAGAACAATTTTTATTTCCTA
t25 GAAATGTAATTCCCTAGATATGAAATATTTTTGTGCAGTTACAACATATGTGAATGAATCAAAATATGAAAAATTGAAATATAAGAGATGTAAATATTTAAACAAAGAAACTGTGGATAATGTAAATGATATGCCTAATTCTAAAAAATTACAAAATGTTGTAGTTATGGGAAGAACAAGCTGGGAAAGCATTCCAAAAAAATTTA
t26 AATAGTTTTACTTGGGAAATTAAATTACTATAAATGTTTTATTATAGGAGGTTCCGTTGTTTATCAAGAATTTTTAGAAAAGAAATTAATAAAAAAAATATATTTTACTAGAATAAATAGTACATATGAATGTGATGTATTTTTTCCAGAAATAAATGAAAAT
t27 TAAATTATTATAGAAAGATAGAAGTTATTTTATATGAATGGTTATATTTTCATTATAATAATGTATATAACACCAAAGTAAAAAAACAAAAATTTATATTTACCCAACAAAAAAAAGACATATCAAAACATAACAAGTTATATCTTCAGTATGATCAAAATAAAAGAAACTCTGAAATAGAACATACA
t28 ATCTTATAAAGTTAATGAAATAGCTCAACTTAATTTAACCATAGAAAGAGATTTAACAGATGATGCTGTGATTTTTGCACACTCACTTTATTTACCATTTGAAAAAGAAGAAATGTGGTGGATCGTTATCGGAATTAAAAAAATGAACTTACTTTTATCTATCAAAAAATTATCTTTATTGAAAAGTGTCAATAATATAAAAATTAATTT
t29 AATGTTCTAAGATCTGATGGAAAAATATCTGATCAAGGTTCTCAGAAAAGTCCTCCTAAGGAACTTTCTAATAAACAAATGACTCCTGCTCAACGTAAGAATGTGCCACATTTTGTTGAAAGAAGAGGCTATGGAAATAGTCATGTTAGGGGTAACGCACTTAAAAAAATTAGTA
t30 GAAGAAATAAATAAAATAATTTATACAAACGAATTTAATAATTATGAAGATAAAATATATGAAGATGTCAAATATATTAAAGAACAGGAAAATGAAATGTACTTGAGAGATGGAATTGAAGAGTTACATATGGATGAACCAAGTGGGGATGTATATTTTGATGATCAAGATGATTATATATTTTTAGAT
t31 AAATCATCTAAGAATAAACTTTTTTGTTTAAACCATTTATTGAACATTTCACTTAAATGTGATTCAATTTTTTTTTCTGAGCTTTCCATAATTTGATTACAACTATTCAATTTGTTTGTATAAGTTTTAGGTCTTATTATTCTTTTACGTTTTA
t32 ATTGAAAGAGTTAAAGGGAAAAATACAAAATTATTTAGATAATGATATTCAATTGAAAAATGGAAAACTCCTATATAAGGATACATGGGATAGAATTGTTTTGAAATTTTGTAGAACTGTAGCAATAGAAGAGGCAGAATACACTAGAAAATTTTATAGCTTAATTAATGATAAACATACAATT
t33 ATTGTGATGTATATAAATTCCCTTCTTCTTTATGTACCACATTATAAGAACCACGAACATATTCTTTTAAAAATGTTAACTTACTTCCAACAAATATATAATCAAAACAAAATTTTTTATTGTCGAAATGTTCTCGTTCAACCCCATTCATATTTCGGATATCATTATTTAA
t34 ATTTATATCATTTGTATGTGCTGTATTATCAGGAGGAACATTACCTTTTTTTATATCTGTGTTTGGTGTAATATTAAAGAACATGAATTTAGGTGATGATATTAATCCTATAATATTATCATTAGTATCTATAGGTTTAGTACAATTTATATTATCAATGAT
t35 TACGAAATTTATAACAATTTTTACATATGCCAGTTCCTTTTTAGGTTTATATATTTGGTCATTAATAAAAAATGCACGTTTGACTTTATGTATTACTTGCGTTTTTCCGTTAATTTA
t36 ATATTATTACGGTACCATTATATGATTCTTTAGGTCCTCAATCAAGTAGATTTATATTAGATCAAACACAGATGGAAACTATTGTATGTGACAAGACATGTGCTCGTAATTTATTTAAG
t37 TAATGAAGAAAATATGTCTGACAGACCAAACAGTTTATCTCATGATAAGGATCAACACCTCGATGAAACACATAATGAACAATATGGATTATACGTAAAAGAAATGGAATCTAAAGTTGAAAAATTAGCTGAAAA
t38 AATTTGAATAAAATTTATCATTCACAATATTATATTTTTGTTTCTTCTTTTTTCCCATAATATTATTATTTTGTTCACAATATATGTTCATGTGTCCCCTCTTTTTCTGTAAAATATTAAAATGTTTCTTACTATGCTTCTCTTCATTTTTAATAACA
t39 AGAACAATTGCTAAACACCAAATTGGGTGAAACAAAAAACCACCTGAACAGAACCCCATTTATACCTGAATCGGTCATACGAGAAAGGAAATTACGCCAAGAAAAAGCTCAATCCACAAACAACATGTTCGATTCAACAAACGCAGATAGTATTACGTCCCCATGTGATCCAACGAATGCCAC
t40 ATCATTATTATTATTATTATAATTATCATTGTTATTATTATTGTTGTTGTTGTTATTATTATTTCCTTTATTATTTAATATGCATTTTTTAATATCCATTTTATTATTTATCATATCAGTACTTTTATTTTTCTCCTTTTCGTTTATAGCTTTTTCCCTATGCACCCTGAATTTCCCATTTTTTTTT
t41 CTTTTATTTGAACTTTTTTATTTTCTTCATTATCAATATAAAAACAACAATCTTTAAATTTATGAAATAACACCCAACTTAATGTATAAAATATAATTTCTATAAAATCAAAATTAATGTAATAACCTAATT
t42 ACATTCAATATGGTTCTGAGCTTAAAGATGTTATCCTTATACATGTTTCTATTCTCATTCGTTATAACATCAATGACATATAAAACATATCGCTCAAATATTTTGTGAACAATATTATA
t43 TTTTATGTGAAATTTTAAATAAAGAGAAATTGTTTGTGCACACATCTATTTTGGGATATATATCTAATCGTTTATGTTACGATATAAAAAAATATAGATGTTCATTATTGAAG
t44 TAATATTTAATATTTTTGTTTAGGAAGTTTTCAAAATTGATGACATTGATAACCGTTTTTATTAAATCGCTTGGGTTGTTTGTGTGTTCCATTGTAGAAGACATAATTTTAAGAATGTGTTTTTGTGTTATTAAACAAAACTTTGTTACTATATGGAATCGTATGTTATA
t45 AAGAATCTCAACTTTTGCTTAAAAAAAATGATAACAAATATAATTCTAAATTTTGTAATGATTTGAAGAATAGTTTTTTAGATTATGGACATCTTGCTATGGGAAATGATATGGATTTTGGAGGTTATTCAACTAAGGCAGAAAACAAAATTCAAGAAGTTTTTAAAGGGGCTCATGGGGAAATAAGTGAACATAAAATTAAAAATTTTAGAAAAAA
t46 TTAATTTTCTCAAGTAACATATTTTTGTCTTGACTACTTTCATCTGTAAATGGAACGACTACTCTTTGCTTACCTGCAAAAAGGACAACTGACATATGAGCCTTATCCTTACTTACATTCGAATTAAAGACCATAGATTCTAAAAATGGTATAGTTCCTTTTAACCAATAATCC
t47 TAAATTTACAGAAAAATTTAGATTTAATCATATGGCTAAAAAAGAATATAAAAGAAGGAGAAGCTTTAATTTCTGATATACCCACATCAAGTTTTTTAAGGTGTACTACAAATTATAAATTTGTATTGCATCCACAATATGAAGATAGTAACTTGAGAAAGAGGGTTCAAGATTATTAT
t48 ATCATTACTATTCTTTATCCATATATATGTTGTTATATCAGTTGCACCTGAGGGTGGATTCAAACCTCTTGCTTGTATAATATATGCACGTACTACTAAATTACGTGTTAATCTATATTCTCTTACTAAATCATCACATTTCTTTATCATTTCTTTTTTTAAATACATTGTCTTTTCATCATACACATT
t49 AACTAACAAATTATGATAATCTAGTTTATGATATAAAAAATTATTTAGAACAAAGATTAAATTTTCTTGTATTAAATGGAATACCTCGTTATAGGATACTATTTGATATTGGATTAGGATTTGCGAAGAAACATGATCAATCTATTAAACTCTTACAAAATATACATGTATATGATGAGTATCCACTTTTTATTGGATATTCAAGAAAAAGATTTATTGCCCATT
t50 TTGGTTTTAATGAGGATGATTTAGATAAAGAGTTTTTTTTTGATTTGCCGTCGATTAGTGGTTTTTCAAGTAATGGTATGAAAAAATGTAATTTAAGAAATTTATTAAAAAGATTAGAAG
t51 GCCTTAGCATTAATAACACTTATAGGAGGTATTTATATGACTAAAGGAAGAAAAAGTTCCATATGGCATAATTCAGAAATAAATGATACACTTTATGATGTTGTTCTTGAAACAGTGAATCAAGGAAATACAGATG
t52 AAATATATCTTTTACTTTATCTGTAAGTTTCATGGAATCTATTATTTCATGTGCATATTTTATTTTTGCCTCGTTTATATATTTACCTGCTTTTAATTTACATAAACATTTTTCTATTAAGTTTTGATCATTT
t53 CGTAAAAAAAAAGGGTTTTCCTAAAAAACCATCGGTTCATGGTTTTACACAGATGGTGTTTTTGCAGCTTACTAATATAGTCATGTATGCCCTTTGTTTTATGAGCCTTTCACAAATATATGCTTATTTCGAAAACGTGAATTTTTATATTATAAGCAATTTTCGTTTCCTTGAGAGATATTTTAATATATTCAAT
t54 ATGTTTGTGTAATAAAGGATATTTCTATAATGTCAAGAAACAACAATGTGAAGAATGTCCCATAGGTTTTTATTGTCCAAAAATAAGCTCAAAAAATAATATAAAAAAACCTATCAAATGTCCTAAAAATTCCTCAAGTATAAAAAAAGGTATGTATATATCAA
t55 ATTATTTTCGTTTTTTTTTATATTATCATTATAATTATATGTATTTACAAAGAATTTATTATTATTTGTTTCTTTATAATTATATTCATTTTCTTTATTACTTAAGGAATTTATATTTGATAAAACTTGTTGTCCTCTTTCTTCTTTGTTGATGTGAATATTTGAGAAAATAATAATAAATGTAATGACTAGGGCTATGTT
t56 ACAAAGGAACAAGATAATTTTTTCCATTTTTATAAAAGTAATATATATGATATGGAAAATAGAACCTGTAATAATAATATATTACAATATTGTGTTATGAATAATAAATCATATTTATTTCATAATATATCATCCAAAAAAAATATATATATAAAAAAAAAGAATCATCAGTTTTCTTTATTCATATATTC
t57 ACAAATTATCCTTAGCCCCAGATATGGTAAAGACATATCACTGTTATAAATTAGGAAAGCAAGCAGCTGAATTATTAGAATCTATCATTTTAAAGAAAAAGTTCGTAAGATTTAGAGTTACTGATGCTATTGATGTATATGATTTCTTTTATATAAAGAAAGTTTTATCCAGTCGTATTAAGAA
t58 TAATTAATTTTAATGATGATGTTACTTTAGAAACACAATCCATGATAGCACACGGAGGGTCTTTGTCAGAAATAGAAGAAACTGGAGATTTATCTTCAGATGTTGATAGATTACATTCATCAATCGAAACTACTCC
t59 CTAAAACCTGACAGCTTTGTACAGTTATATTTACCCAATTAGAATGTTCAAACATATTTAAGTGTTGCCATTCTTCAAATAATTTTTTTTCATATTCAGTTTTTGGTTGGTTTTTAAATTCTTTTGATACTGTACCTTCTGCGGATGGTATAGGGAAGTTAAAAATTTTATTACGTACTTTGTTTGATT
t60 AACCAAAAAATGATACCTGTACGTTTTTCTCCAGGAGAAACAAAAAAATTAGAAAGGATTCTCAAAAAAGTAGACGACTTTTTCTGTGCCAATATTAAAGCTCAATATACTTGC
t61 TTTTTAAAAAAAAAAATTATGATATAAAATCTTCAATTAATAAAAGATCTATCCAGTTCTTTAAAGACACTAATATAGATCATTATATAGCATATGAATATTATAAGGAGGACCGTACGGAATTTATATTAACTATTATGAATGAAAAAAATATTACTCATCAAGAAACTCAA
t62 AGGGAACAGGTTAAATATATTGAAGACGTGGTTTGATAATCAGGTATTTCTTTAATTAAATATTCATGATATTGTTCGTTAGATACATAAATTAATTCGTCCTCCACATTTTGCTGTTCTGTTATAATTTGTGAAGAAGCTTGGTTTGTCTTATTTGGTAATTGCTCTT
t63 AACATCAGAACATAGTAAAGATTTAAATAATAATGATTCAAAAAATGAATCTAGTGATATTATTTCAGTAAATAATAAATCAAATAAAGTACAAAATCATTTTGAATCATTATCAGATTTAGAATTACTTGAAAATTCCTCACAAGATAATTTAGACAAAGATACAA
t64 CGAGAAGAAACACGTTTTATCTCATAATTCATATGAGAAAACTAAAAATAATGAAAATAATAAATTTTTCGATAAGGATAAAGAGTTAACGATGTCTAATGTAAAAAATGTGTCACAAACAAATTTCAAAAGTCTTTTAAGAAATCTTGGTGTTTCAGAGAATATATTCCTTAAAGAAAATAAATTAAATAAGGAAGGGAAAT
t65 TTTCATCCAAGAGTAATTATGTTTTTCTTGTTGATATTTTGCTTTATACAAACGAGCAAGATACTTTTTATACGATTCATGCTCTTTAATTTTCTTTTCAAAAGTTTTGGTTATAATGTGTTCAAGTGATTGTAAAGTATTTTTCTTTAATTTTCTCCATATTAATCTACATGCGATTATAAGAATTCTATATTCTTTGATATTCCC
t66 GAATTCTATCTATTTGTTTAATTGGTGTCCATGAATTTTTAGAGAAACACCACATATCAGCTAATAAGGAATTATTAACATCTAATCCTCCATGAATATATAAAAAATCATTAAAACTGAAAGAAGCATGCTTATATCTTGCACGTGGACAAATTTTATTTCCAATAAGTTCCCATTTCTTT
t67 CTAGTGACTGTACATTTTTCATCTAATGTTCTATAGGATAAAGAATCTGAAGGACAATGCATATCATTACAAAAATAGCATGCACCTTTAAAATATAAATAAGAATGTCTTAACACAACAAAGGAATCAAAA
t68 TTCTCATTCCTTTGAAAATATGAACTATTCAACTCATCAAGTTTTTCATATGTTATTTCATAACTGCCATCATCTGAATCCTCTTCATCACTACTACTTAAAGAATGCTCTTCTACAGTTATATCTTCATAATGATTATTTTTAGGAAAATATTCTTCAGATCTTTCTATAAATTCTGTTCCACTAC
t69 AATATCGACAATTATAAAGAAAAAATAGAAAATTTTAATGAAAGATATTATGAAGATGATTTAAGTTTCTTAAACTTTCCAGGAACCATAACAAGTTATGATAAAAACCATCAAAGAGATAAAAGAAAAAACTGTTTTGAAGAATATCTAGACAATAATAATTATTTTATTGATAGTAATCTAAAGGAAGTTATAGAAAATAATATATAT
t70 TTATTAAAAACAACAAATAAATTACCAAGGTCATGGAAAAAACTCTTGTTAAATCTATCCATTAGATTTCAAACAATACCTATAGGTGGTTCTATCGAAATCATATCAAAGGATGAGAAAAATATTTCTATGACACGATCCAGTGAATATAATGCAACAACATATGAGCATTACACATTTCCCAAATGGT
t71 ATTGACTTCTCCGGGTTGGTTACATTATCATTACAACCAATATTATTAAGGTTAATACTATTTTTATCATTATTATTGTTGTTTTCATTTGTGTGGTTATATAATAACAATGATGTGTCGTCTATTGGATCATTTTGCACAAGGTCAATATCATTAGAATTATGTTTCATATTATTTGTATACATTTTTTTGTAATTATCATTTTTTCTTACTT
t72 TATAATTTCACCTTTTGAATTACATATGTCTGTATTCAAAAATTTTATATCTCTACTCCATATATTTATCTTTACACCCAAAAACAAAGCAATGAAAAAAACAACCCTTTCATA
t73 GGGTCGACTTTTGGTGGTTTAACATTTTCAGCTACGAGATTTTCGAATGATGGCAATCCTACTTTGATTAATTCCTTTAACATAGATTTTGCGTGTTTTTTTACTTCTGATTTGATATC
t74 AATATCAAGATGGTAGTATTGTTTTAGATGTTAGTAGTACGAGAGATAAACTTTATGGAGAAGTTGAGTGGGAAGGATATAAATTTGAACTTGTGGATACAGGAGGTTTAGTTTTTGAACAAGAAAAGTTTTCAAAAGAAATAAAGGATCAAATTCTTATGGCTCTAAAAGAA
t75 AATCAACTATAATTAAAAATAGTGGAATTATTGAAAACTTTGGAGATATTGATTATATCTTTACGGATAAAACAGGAACACTAACAGAAAATGTCATGGTGCTTAAAGTTATACATATAGGTTTTGATGTTATACATGCAGAAAATGAAAAAAATTCCATACAAGGTAAT
t76 AGTATGAACACAAAACCGAAAGAGAGTTTAAAAAAAGATTCGATTTTAATGAAGGAAGAAATAACATAATAAATGTGTCAAATGTAAAGAAAACATGTAATATAGTTAAAAATGATTTAAAGGAATCTAACGAGATAAAAACTAATATAAACACAAGTTCTTATAATTTATA
t77 TTTTAAACTTTGAATCTGTTCAATTAATAACACTTCATCATTGTCTTCGCCTTTTATAATATCTCCATGATTATGTGCTCCTCTAATTAAGGTATCACTAACCTGTTGTTCATCTGTCGCATGAGAACTAGTAGATATATCTGGTTTATTTTTTTTCTTTCTTTTTGTTATTTCTAATTTTTTT
t78 AACTGATGTTAATGAAGGCCAAAATGGAAAACGTTTAACACAAGAAAAAACCCTAAAACAAATTCCTTCGACATCACCAAATAATTTAGAACAAAAATTACAACAAACACATATTTCTTCTTCCCAACAAAAACATTTAATAAAAACTACTCCTACTTCAGATTCAACTACAACTTCATCAAGCTCTTCTGTTCCTCTTACAAAACAATCCCCTCATTCT
t79 TATAGAGTTATTATGTTCGAATTAATGACCAGATTATTTCGATTTTATACAAAACCAATAGACCTTATTATTACTATGAATACACACAGATATGACACACCTATATTGTTAAAACATTCAAACAGCTTACAACTTATCTTATCAAATCAACAAAAT
t80 CTAAAAAAAGCAAACAAAAAGAAAAGAGAATGGAAGATTTAAGTATTGAGGAAAATTTGAAAGAGAAATTAATAAACCTAAATATTCAGAAACAGTATAGTGGTTTCCATATGAAAGAAATGGAGGAACAAAATATTCCTGAATTAAATATTCCACAACTTTCAGCTATTCATAATAGAGAAGATATGAA
t81 CTGGTTTTTGTACTTCTTGTCCTTGTGTTGATATCTCTTGTTCATTTTTAGATTGAGTATCTGTTTCACTTTGTGCCTTTACCTTTGATAATACGTTTTTACCGAATAATCCTAAATTTTGGAATAA
t82 CCCAATACAATAAATTCTACCATTTGACGTAACACCACAATTATTTCTTCTAGGTATATTTAAATTACTTGAAACATACCATACATCTCTTAAACGATCATACACCTCAGTTTCAAATAAAGCCTTATAATCATAGTTATTACCACCAAAAACGTATAAGAAATTATTCAATACAGCACTTCCAAAATAAGCTTTTTTGGTAGAC
t83 TTATATTTTGGTTCAGACAAATGTACGTTACATGAAATATGTGCACTATCATCAACTAAAGAAATATCTAAACTATCTGTAAAAGTATGTTTAGAACTAACATTTGAAGAGAAGTTACATCCGTGTATGACTTTTTTTTCATATTTTGGTTTTAAAACTATTTTATAATTATTATTTTTAAA
t84 ATCATTAAGTAAAAAATTTACGGTATCGACATTAACAGTGTAATAATTTCTTCCTAAATTTAAATGTTCTATATAATTCAAATTTAATACGGTATTTAACCTTTCTTCAAGTATTATCATTGCATTAATTTGAAACATAAGTAAAAAATTATTAGCACTTCTAAAAT
t85 TAACATTTTTTTAACATCTTTACCTTTTTGACTTGGTTCTTTATCATAATTCTGTTGTTCTGCAGAATCAGCATTTACTTCAGTTTCTTCTTTATTTTGAAAAGTGTTTGATTCTACATGAGAATTGGAAGATGAACGTCTATGTTTTACTTCTGTATAACTAGTACGTTCTCCAGTATGATGAGCCTTATGGTTTACATCTTCAGTTTCTTC
t86 TCTATGCTTTATCAAAACAAAGTAATCAAAATGTAATACATTATTCTTGTCGATCTATATGTGCTTTTACAAAAAACCCTAAGGGTTTAGAAGCGGTAAAGGATATCAAGGACTTTGCGAATATTATAAGTAAATCTGTTGGTGATTTGAGTAAAGAGAA
t87 GCAACTTACTTTAGAAGCCATGAATTCGGCCGATGATCCATTGGAAGCATTTAAAAATACTTTATTAATATTTGATTTTAATTTATCTGAATTTGATAAGGATCCATATATAAATGGTGTGCATGATTTAGCATCTAATATAAAGGATTGTTTAAGAAAAGGAGGTCACAGTAAAATATATT
t88 AAAAAAAATACATTTAAATCTATAATATTTGTAATAAATGATAAGAATATTAGTATTAATAGGGTACTACTATTTGTAAATACTTTTCTACATAATTTATCAATAGTAGGAAATATGATATTCTTTTTGTTTGGCTTTATTTTATCTGGGATTTTCCAATCATAAAGTTCGA
t89 ACATAGAAATATGTGCGAAAATTGGGAAAAAAATAAATAACAGGAATTGTTATATAAATTGAATCAAGAATGGCACAAAGAAAATAATACTGGTGACTTTCACACTAGTGATATCACACACAATAGTGGTATTACATACAGTAGTGGTAACATACCTAGTGAAAGTAATAGTAGTTATATCCAA
t90 CTTACAATGTTCTTCGCATTCGAAATTTTTTTCAGGATTACTTGAAAAGCCTTCTGGACAATTACAATATTCATATCCATGAGTATTCTTACAAACACCTTTTCCACAATTTAAAAAACATTTTTCTTCATTTAAA
t91 ATAATAATAGTAATATAAAGAATTATTCAGCTGTAGATATATTCTCACCCAAAAAAACTGGAAAGGAATGTATTAAATGTCTCCCAGATAATTTTTGTGAATGCGAATGTAGTTGTAAAAATAAAACAGGTTTTTCAATGAAATATAGACATGCCAGTAAGGGATCTAAAGGATATAGTAAAAAAATGA
t92 GAGTAATAGTGATTCATATAAGGTAAATTGTATTAATTTCTCTGAAGGATTTTGTTGCTGTCATCCAATAAATAATTTAGCACTATTATATGGAGAGTATCAACAAAATCAAGAATCAAAAAT
t93 TCTCCTTCCTTTTCATTCTTTACTTTATTTCTATTAATACACAACTCTCCTTGAGTTAAGAGATTTACGATATTATCAATATTGTTTTCTTTCATATACTTCAACTTTTTACTATTCATTTTTAAAAGATAATTAACCAAGTTACTCATGTTTGTATTAGATGGTATGTTCCCAATTTTTTCGTCTTTTCTT
t94 TTATTAAAACTTTTTTTTCTTTCTGTAAAGTTTGTACATTATGTTTTGATGAGTTTTGATTATCTTCATAAAACTTTATATATTTATAAAAATTATTTTGTATAAAATCATTTAATAAAGGTAACATAATCTTTTTAGCTTGATTCAATTCACTACATGAATGTATATT
t95 CATACAATTAATAAAGCTCTACGATAAATTTCTTTTTTATTTGATTTAGAAAAAGGAGAAAACCATTGAAACTTTTTGGCTAGATATTTATGGTCTTTATCAAAAATATATTCTGAA
t96 ATATATATAAAGTTAAACCTATAAATAATACACTACCTAATAAACTATTCTTATATTTAAAAATAAATATAATACATGTTATTAATCCTTCTATTGTTGCCGGAATAATATACATTAAAACAGAACTCATCAAATTATTAGCACTCTCGGTACCTCT
t97 ATAAACACCAGTACCATTTTTTTCTGATAAATTAATATTTTTTTGTATAACATCATATTTATCCCTTTTCGTGGTAAGTGCAGTATCCTGTTTTATTATTATATTATCGAATTCATCATGGTGTATATTTCTTTCAT
t98 AAATGAGGTATAATCATCCATTTCGTTGGGTCGATTTGATCTATTTTTAAGGTCCATATATTGAGATGAATCATTATGCATTTTATAATCATGAGGGATATTAGGTGCATGGTATTTAGGGTCTTTA
t99 TATGGAAAAATGGAATATGAAGTATTAAGTGATGATAACATAGTGTATGAAAATATACAACATGATTTATTAAAAACAATAGAAGATGATGAAGAAATGTTAAAAGGAACTGAAAGGAAGGATAATATAGATATACTGAGGACTCCTGGAAGGGGAGAATATAATATGTGGTCTACTTCTGGACTAGGGTTCTATGAATT
t100    ATCGTTTTGAATTGTTAGAATTTAAAATGACGGAGGATTGTTATACAAAAATGTGGTTTGATTTTATGAGTGATTTTGGAATAGCTACAATGAATGAAACCGAACATACTAGATCTTTTTATGGATT
Code
cd example 
pmotools-python extract_refseq_of_inserts_of_panels --file ../../format/moz2018_PMO.json.gz --output moz2018_PMO_panel_ref_seqs.tsv --overwrite
Source Code
---
title: Getting panel info out of PMO using pmotools-python
---

```{r setup, echo=F}
source("../common.R")
```

Most of the basic panel info functions can be found under `extract_panel_info_from_pmo`.

```{bash, eval = F}
pmotools-python
```

```{bash, echo = F}
pmotools-python | perl -pe 's/\e\[[0-9;]*m(?:\e\[K)?//g'
```


Getting files for examples 


```{bash, eval = F}
cd example 

wget https://plasmogenepi.github.io/PMO_Docs/format/moz2018_PMO.json.gz
wget https://plasmogenepi.github.io/PMO_Docs/format/PathWeaverHeome1_PMO.json.gz

```
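
As a rough sketch (not part of pmotools-python), the downloaded files can also be inspected with the Python standard library alone, since a PMO is gzipped JSON. The helper names `load_pmo` and `top_level_summary` are made up for illustration, and the sketch assumes the top level of the file is a JSON object; it makes no assumption about which keys are present.

```python
import gzip
import json


def load_pmo(path):
    """Load a PMO JSON file, gzipped (.gz) or plain, into a Python object."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def top_level_summary(pmo):
    """Map each top-level key to the type name of its value.

    Assumes the top level of the PMO is a JSON object (a dict in Python).
    """
    return {key: type(value).__name__ for key, value in pmo.items()}
```

For example, `top_level_summary(load_pmo("moz2018_PMO.json.gz"))` lists the top-level sections of the file without printing their full contents.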



# Extract insert locations of panels from PMO 

This extracts the insert locations of the targets in the panel info of a PMO and writes them out as a BED file.

```{bash}
pmotools-python extract_insert_of_panels -h 
```


The Python code for the `extract_insert_of_panels` script is below

```{python}
#| echo: true
#| eval: false
#| code-fold: true
#| code-line-numbers: true
#| filename: pmotools-python/src/pmotools/scripts/extract_info_from_pmo/extract_insert_of_panels.py
#| file: ../pmotools-python/src/pmotools/scripts/extract_info_from_pmo/extract_insert_of_panels.py
```

```{bash}
cd example 
pmotools-python extract_insert_of_panels --file ../../format/moz2018_PMO.json.gz

```

```{bash}
cd example 
pmotools-python extract_insert_of_panels --file ../../format/moz2018_PMO.json.gz --output moz2018_PMO_panel_insert_locs.bed --overwrite

```
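
The resulting file can be read back with a few lines of Python. This is a minimal sketch that assumes the output follows the standard BED layout (tab-separated; chromosome, 0-based start, exclusive end, then an optional name column); the exact columns written by `extract_insert_of_panels` are not checked here. The `Insert` class and `parse_bed` function are hypothetical names, not part of pmotools-python.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Insert:
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive (BED convention)
    name: str   # empty string if the file has no 4th column

    @property
    def length(self) -> int:
        return self.end - self.start


def parse_bed(lines: Iterable[str]) -> List[Insert]:
    """Parse BED-like lines, skipping blank, header-like, and malformed rows."""
    inserts = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Rows without numeric start/end (blank lines, headers) are skipped.
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        name = fields[3] if len(fields) > 3 else ""
        inserts.append(Insert(fields[0], int(fields[1]), int(fields[2]), name))
    return inserts
```

With the file written above, `parse_bed(open("moz2018_PMO_panel_insert_locs.bed"))` returns one `Insert` per row, and `insert.length` gives the insert size in bases.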

The reference sequence can be added with `--add_ref_seqs` if it is loaded in the PMO; if it is not loaded, the column will be blank.

```{bash}
cd example 
pmotools-python extract_insert_of_panels --add_ref_seqs --file ../../format/moz2018_PMO.json.gz --output moz2018_PMO_panel_insert_locs.bed --overwrite

```



# Extract ref sequences of insert locations of panels from PMO 


This extracts the reference sequence of the insert location of each target in the panel info of a PMO and writes it out as a table. The reference sequence is an optional field, so if no reference sequence is loaded, only blanks are extracted.

```{bash}
pmotools-python extract_refseq_of_inserts_of_panels -h 
```


The Python code for the `extract_refseq_of_inserts_of_panels` script is below

```{python}
#| echo: true
#| eval: false
#| code-fold: true
#| code-line-numbers: true
#| filename: pmotools-python/src/pmotools/scripts/extract_info_from_pmo/extract_refseq_of_inserts_of_panels.py
#| file: ../pmotools-python/src/pmotools/scripts/extract_info_from_pmo/extract_refseq_of_inserts_of_panels.py
```

```{bash}
cd example 
pmotools-python extract_refseq_of_inserts_of_panels --file ../../format/moz2018_PMO.json.gz

```

```{bash}
cd example 
pmotools-python extract_refseq_of_inserts_of_panels --file ../../format/moz2018_PMO.json.gz --output moz2018_PMO_panel_ref_seqs.tsv --overwrite

```
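
The table can then be loaded for quick checks such as sequence length or GC content. The sketch below assumes each row starts with the target name and ends with its reference sequence, separated by whitespace, as in the output shown above; whether the file has a header row is not assumed, so any row whose last field is not made of A/C/G/T/N is skipped. `parse_ref_seq_table` and `gc_fraction` are hypothetical helper names, not part of pmotools-python.

```python
from typing import Dict, Iterable

VALID_BASES = set("ACGTN")


def parse_ref_seq_table(lines: Iterable[str]) -> Dict[str, str]:
    """Map target name -> reference sequence (upper-cased)."""
    ref_seqs = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line, or a target with no reference sequence loaded
        target, seq = fields[0], fields[-1].upper()
        if not set(seq) <= VALID_BASES:
            continue  # not a nucleotide sequence, most likely a header row
        ref_seqs[target] = seq
    return ref_seqs


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq; 0.0 for an empty sequence."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)
```

For example, `parse_ref_seq_table(open("moz2018_PMO_panel_ref_seqs.tsv"))` gives a dictionary that can be passed target by target to `gc_fraction`.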