API Documentation

ibd_dendrogram.make_distance_matrix module

ibd_dendrogram.make_distance_matrix.check_kwargs(args_dict: dict[str, Any]) → str | None

Function that checks that the necessary arguments are passed to the distance function

Parameters:

args_dict (Dict[str, Any]) – Dictionary that has the arguments as keys and the values for the distance function

ibd_dendrogram.make_distance_matrix.draw_dendrogram(clustering_results: ndarray[Any, dtype[ScalarType]], grids: List[str], output_name: Path | str, cases: List[str] | None = None, exclusions: List[str] = [], title: str | None = None, node_font_size: int = 10, save_fig: bool = False) → tuple[matplotlib.figure.Figure, matplotlib.axes._axes.Axes, Dict[str, Any]]

Function that will draw the dendrogram

Parameters:
  • clustering_results (npt.NDArray) – numpy array that has the results from running the generate_dendrogram function

  • grids (list[str]) – list of ids to use as labels

  • output_name (Path | str) – path object or a string that tells where the dendrogram will be saved to.

  • cases (list[str] | None) – list of case ids. If the user doesn’t provide this value, all of the labels on the dendrogram will be black. If the user provides a value, the case labels will be red. Defaults to None.

  • exclusions (List[str]) – list of individuals who are considered exclusions and are indicated as N/A or -1 in the phenotype file. Defaults to an empty list.

  • title (str | None) – Optional title for the plot. If this is not provided then the plot will have no title

  • node_font_size (int) – Size for the font of the dendrogram leaf nodes

  • save_fig (bool) – whether or not to save the figure. Defaults to False.

Returns:

returns a tuple with the matplotlib Figure, the matplotlib Axes object, and the dictionary returned by the sch.dendrogram call

Return type:

tuple[plt.Figure, plt.Axes, dict[str, Any]]
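The label colouring described above can be illustrated with a short sketch. This is not the module's implementation: it assumes that sch refers to scipy.cluster.hierarchy, the helper name draw_sketch and the three-leaf linkage matrix are hypothetical, and it does not handle exclusions or saving the figure.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from scipy.cluster import hierarchy as sch


def draw_sketch(clustering_results, grids, cases=None, title=None, node_font_size=10):
    """Hypothetical helper: plot a dendrogram and colour case labels red."""
    fig, ax = plt.subplots(figsize=(6, 4))
    info = sch.dendrogram(
        clustering_results, labels=grids, ax=ax, leaf_font_size=node_font_size
    )
    case_set = set(cases or [])
    # Leaf labels are the x tick labels for the default orientation.
    for tick in ax.get_xticklabels():
        tick.set_color("red" if tick.get_text() in case_set else "black")
    if title is not None:
        ax.set_title(title)
    return fig, ax, info


# Hypothetical linkage matrix for three leaves: (0, 1) merge first, then leaf 2 joins.
Z = np.array([[0.0, 1.0, 0.1, 2.0],
              [2.0, 3.0, 0.5, 3.0]])
fig, ax, info = draw_sketch(Z, ["id1", "id2", "id3"], cases=["id2"])
label_colors = {t.get_text(): t.get_color() for t in ax.get_xticklabels()}
```

The returned dictionary exposes the leaf order under the "ivl" key, which is useful when the labels need further styling after plotting.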

ibd_dendrogram.make_distance_matrix.generate_dendrogram(matrix: ndarray[Any, dtype[ScalarType]]) → ndarray[Any, dtype[ScalarType]]

Function that will perform the hierarchical clustering algorithm

Parameters:

matrix (Array) – distance matrix represented as a 2D numpy array. Distances should be calculated as 1/(IBD segment length)

Returns:

returns the results of the clustering as a numpy array

Return type:

Array
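For orientation, a hierarchical clustering step over a square distance matrix might look like the sketch below. It assumes SciPy's linkage function with average linkage; the linkage method actually used by generate_dendrogram is not stated here, and the name cluster_sketch and the example distances are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform


def cluster_sketch(matrix: np.ndarray) -> np.ndarray:
    """Hypothetical helper: cluster a symmetric distance matrix."""
    # linkage expects the condensed (upper-triangle) form of the distances.
    condensed = squareform(matrix)
    # Average linkage is an assumption made for this sketch.
    return linkage(condensed, method="average")


# Hypothetical distances for four individuals (symmetric, zero diagonal).
distances = np.array([
    [0.0, 0.1, 0.5, 0.5],
    [0.1, 0.0, 0.5, 0.5],
    [0.5, 0.5, 0.0, 0.2],
    [0.5, 0.5, 0.2, 0.0],
])
Z = cluster_sketch(distances)
```

Each row of the result describes one merge: the two cluster indices, the distance between them, and the number of individuals in the new cluster.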

ibd_dendrogram.make_distance_matrix.make_distance_matrix(pairs_df: DataFrame, min_cM: int, distance_function: Callable = <function _determine_distances>) → tuple[list[str] | None, ndarray[Any, dtype[ScalarType]]]

Function that will make the distance matrix

Parameters:
  • pairs_df (pd.DataFrame) – dataframe built from the pairs files. It should have at least three columns called ‘pair_1’, ‘pair_2’, and ‘length’

  • min_cM (float) – minimum centimorgan threshold. Half of this value is used as the IBD segment length for pairs that do not share a segment

Returns:

returns a tuple where the first object is the list of individual ids corresponding to each row of the matrix. The second object is the distance matrix

Return type:

tuple[list[str] | None, npt.NDArray]
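The distance rule described above (1/(segment length) for pairs that share a segment, and half of min_cM as the length for pairs that do not) can be sketched as follows. This is an illustration under assumptions, not the default _determine_distances function: the helper name, the sorted id order, the zero diagonal, and the handling of one row per pair are all choices made for this example.

```python
import numpy as np
import pandas as pd


def distance_matrix_sketch(pairs_df: pd.DataFrame, min_cM: float):
    """Hypothetical helper: build a symmetric distance matrix from pairs."""
    ids = sorted(set(pairs_df["pair_1"]) | set(pairs_df["pair_2"]))
    index = {iid: i for i, iid in enumerate(ids)}

    # Pairs with no shared segment get a length of min_cM / 2.
    default_distance = 1.0 / (min_cM / 2)
    matrix = np.full((len(ids), len(ids)), default_distance)
    np.fill_diagonal(matrix, 0.0)  # assumption: distance to self is zero

    for row in pairs_df.itertuples(index=False):
        i, j = index[row.pair_1], index[row.pair_2]
        matrix[i, j] = matrix[j, i] = 1.0 / row.length
    return ids, matrix


# Hypothetical input: A-B share 10 cM, A-C share 4 cM, B-C share nothing.
pairs_df = pd.DataFrame(
    {"pair_1": ["A", "A"], "pair_2": ["B", "C"], "length": [10.0, 4.0]}
)
ids, matrix = distance_matrix_sketch(pairs_df, min_cM=3)
```

Longer shared segments produce smaller distances, so closely related individuals end up near each other in the clustering.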

ibd_dendrogram.make_distance_matrix.record_matrix(output: Path | str, matrix, pair_list: List[str]) → None

Function that will write the distance matrix to a file

Parameters:
  • output (str) – filepath to write the output to

  • matrix (array) – array that has the distance matrix for each individual

  • pair_list (List[str]) – list of ids that represent each row of the matrix