pmotools.pmo_builder.pmo_updater module

class pmotools.pmo_builder.pmo_updater.PMOUpdater

Bases: object

static merge_dicts_by_key(main_list: list[dict], update_list: list[dict], key_field: str, replace: bool = False, ignore_fields: list[str] | None = None) → list[dict]

Merge two lists of dicts by a shared key field.

The first list is treated as the main/base data source. The second list provides updates that are applied on top. Both input lists are left untouched (deep copies are used internally).

Parameters:
  • main_list – The primary list of dicts (source of truth).

  • update_list – The list of dicts whose values will be merged in.

  • key_field – The dict key used to match records across lists.

  • replace – If True, existing values in main_list are overwritten by values from update_list. If False, a conflict raises a ValueError.

  • ignore_fields – Optional list of field names to skip entirely during the merge (they are never read from update_list).

Returns:

A new list of dicts with updates applied.

Raises:
  • ValueError – If either list contains duplicate values for key_field.

  • KeyError – If any dict in either list is missing key_field.

  • KeyError – If update_list contains a key_field value that does not exist in main_list.

  • ValueError – If replace=False and an update would overwrite an existing field.
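The merge rules above can be illustrated with a small stand-alone sketch. This is not the pmotools implementation; it is a hypothetical re-implementation (merge_dicts_by_key_sketch) written only from the documented behavior. It assumes that a "conflict" means a field already present in main_list with a different value, and that output order follows main_list; the library may differ on both points. The field names host_age and note are made up for the example.

```python
from copy import deepcopy


def merge_dicts_by_key_sketch(main_list, update_list, key_field,
                              replace=False, ignore_fields=None):
    """Hypothetical sketch of the documented merge semantics."""
    ignore = set(ignore_fields or [])

    def index_by_key(records):
        # Map each key_field value to a deep copy of its record,
        # so neither input list is modified.
        indexed = {}
        for record in records:
            key = record[key_field]  # raises KeyError if key_field is missing
            if key in indexed:
                raise ValueError(f"duplicate {key_field!r} value: {key!r}")
            indexed[key] = deepcopy(record)
        return indexed

    merged = index_by_key(main_list)
    updates = index_by_key(update_list)

    for key, update in updates.items():
        if key not in merged:
            raise KeyError(f"{key!r} not found in main_list")
        target = merged[key]
        for field, value in update.items():
            if field == key_field or field in ignore:
                continue
            # Assumption: identical values are not treated as a conflict.
            if field in target and target[field] != value and not replace:
                raise ValueError(f"conflict on {field!r} for {key!r}")
            target[field] = value

    return list(merged.values())


main = [{"specimen_name": "s1", "host_age": 5}, {"specimen_name": "s2"}]
update = [{"specimen_name": "s2", "host_age": 7, "note": "x"}]

# "note" is skipped because it is listed in ignore_fields.
merged = merge_dicts_by_key_sketch(
    main, update, "specimen_name", ignore_fields=["note"]
)
```

With these inputs, only the s2 record gains host_age, and main is left unchanged. Updating host_age for s1 to a different value would raise a ValueError unless replace=True is passed.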

static update_specimen_meta_with_traveler_info(pmo, traveler_info: DataFrame, specimen_name_col: str = 'specimen_name', travel_country_col: str = 'travel_country', travel_start_col: str = 'travel_start_date', travel_end_col: str = 'travel_end_date', bed_net_usage_col: str = None, geo_admin1_col: str = None, geo_admin2_col: str = None, geo_admin3_col: str = None, lat_lon_col: str = None, replace_current_traveler_info: bool = False)

Update the specimen metadata of a PMO with traveler information.

Parameters:
  • pmo – The PMO to update. It is modified in place.

  • traveler_info – The DataFrame containing the traveler information.

  • specimen_name_col – The name of the column in traveler_info containing the specimen name.

  • travel_country_col – The name of the column containing the country traveled to.

  • travel_start_col – The name of the column containing the travel start date, formatted as YYYY-MM-DD or YYYY-MM.

  • travel_end_col – The name of the column containing the travel end date, formatted as YYYY-MM-DD or YYYY-MM.

  • bed_net_usage_col – (Optional) The name of the column containing a number between 0 and 1 giving the rough frequency of bed net usage while traveling.

  • geo_admin1_col – (Optional) The name of the column containing admin level 1 information for the country traveled to.

  • geo_admin2_col – (Optional) The name of the column containing admin level 2 information for the country traveled to.

  • geo_admin3_col – (Optional) The name of the column containing admin level 3 information for the country traveled to.

  • lat_lon_col – (Optional) The name of the column containing the latitude and longitude of the region traveled to.

  • replace_current_traveler_info – Whether to replace traveler information already present in the PMO.

Returns:

A reference to the updated PMO.
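Because the travel dates must be formatted as YYYY-MM-DD or YYYY-MM and bed net usage is described as a number between 0 and 1, it can help to check a traveler table before passing it in. The following is a minimal stand-alone sketch, not part of pmotools: is_valid_travel_date and check_traveler_rows are hypothetical helpers, rows are plain dicts rather than a DataFrame, the default column names are taken from the signature above, bed_net is a made-up column name, and the 0 to 1 range is assumed to be inclusive.

```python
import re
from datetime import datetime

# Accepts exactly YYYY-MM or YYYY-MM-DD (zero-padded ASCII digits).
_DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}(-[0-9]{2})?")


def is_valid_travel_date(value):
    """Return True if value is a real date in YYYY-MM-DD or YYYY-MM form."""
    if not isinstance(value, str) or not _DATE_PATTERN.fullmatch(value):
        return False
    fmt = "%Y-%m-%d" if len(value) == 10 else "%Y-%m"
    try:
        datetime.strptime(value, fmt)  # rejects e.g. month 13 or day 32
    except ValueError:
        return False
    return True


def check_traveler_rows(rows, travel_start_col="travel_start_date",
                        travel_end_col="travel_end_date",
                        bed_net_usage_col=None):
    """Return a list of (row_index, column_name) pairs that look invalid."""
    problems = []
    for i, row in enumerate(rows):
        for col in (travel_start_col, travel_end_col):
            if not is_valid_travel_date(row.get(col)):
                problems.append((i, col))
        if bed_net_usage_col is not None:
            usage = row.get(bed_net_usage_col)
            # Assumption: the documented 0 to 1 range is inclusive.
            if not isinstance(usage, (int, float)) or not 0 <= usage <= 1:
                problems.append((i, bed_net_usage_col))
    return problems


rows = [
    {"specimen_name": "s1", "travel_country": "Kenya",
     "travel_start_date": "2023-01-15", "travel_end_date": "2023-02",
     "bed_net": 0.5},
    {"specimen_name": "s2", "travel_country": "Uganda",
     "travel_start_date": "2023/03/01", "travel_end_date": "2023-13",
     "bed_net": 1.5},
]

problems = check_traveler_rows(rows, bed_net_usage_col="bed_net")
```

Here the first row passes every check, while the second row is flagged for both date columns (wrong separator, month out of range) and for the bed net value. How update_specimen_meta_with_traveler_info itself reacts to such values is not stated in this reference.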