Plotting

Loading example data

# Import the plotting module
import sctoolbox.plotting as pl

# Load example dataset
import numpy as np
np.random.seed(42)
import scanpy as sc

adata = sc.datasets.pbmc68k_reduced()
adata.obs["condition"] = np.random.choice(["C1", "C2", "C3"], size=adata.shape[0])

qc_filter

Functions for plotting QC-related figures e.g. number of cells per group and violins.

sctoolbox.plotting.qc_filter.plot_starsolo_quality(folder: str, measures: list[str] = ['Number of Reads', 'Reads Mapped to Genome: Unique', 'Reads Mapped to Gene: Unique Gene', 'Fraction of Unique Reads in Cells', 'Median Reads per Cell', 'Median Gene per Cell'], ncol: int = 3, order: list[str] | None = None, save: str | None = None, **kwargs: Any) ndarray[source]

Plot quality measures from starsolo as barplots per condition.

Parameters:
  • folder (str) – Path to a folder, e.g. “path/to/starsolo_output”, which contains folders “solorun1”, “solorun2”, etc.

  • measures (list[str], default ["Number of Reads", "Reads Mapped to Genome: Unique", "Reads Mapped to Gene: Unique Gene", "Fraction of Unique Reads in Cells", "Median Reads per Cell", "Median Gene per Cell"]) – List of measures to plot. Must be available in the solo summary table.

  • ncol (int, default 3) – Number of columns in the plot.

  • order (Optional[list[str]], default None) – Order of conditions in the plot. If None, the order is alphabetical.

  • save (Optional[str], default None) – Path to save the plot. If None, the plot is not saved.

  • **kwargs (Any) – Additional arguments passed to seaborn.barplot.

Returns:

axes – Array of axes objects containing the plot(s).

Return type:

np.ndarray

Raises:

KeyError – If a measure is not available in the solo summary table.

Examples

pl.plot_starsolo_quality("data/quant/")
../_images/plotting-2.png
sctoolbox.plotting.qc_filter.plot_starsolo_UMI(folder: str, ncol: int = 3, save: str | None = None) ndarray[source]

Plot UMI distribution for each condition in a folder.

Parameters:
  • folder (str) – Path to a folder, e.g. “path/to/starsolo_output”, which contains folders “solorun1”, “solorun2”, etc.

  • ncol (int, default 3) – Number of columns in the plot.

  • save (Optional[str], default None) – Path to save the plot. If None, the plot is not saved.

Returns:

axes – Array of axes objects containing the plot(s).

Return type:

np.ndarray

Raises:

ValueError – If no UMI files (‘UMIperCellSorted.txt’) are found in the folder.

Examples

pl.plot_starsolo_UMI("data/quant/", ncol=2)
../_images/plotting-3.png
sctoolbox.plotting.qc_filter.n_cells_barplot(adata: AnnData, x: str, groupby: str | None = None, stacked: bool = True, save: str | None = None, figsize: tuple[int | float, int | float] | None = None, add_labels: bool = False, **kwargs: Any) Iterable[Axes][source]

Plot number and percentage of cells per group in a barplot.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix object.

  • x (str) – Name of the column in adata.obs to group by on the x axis.

  • groupby (Optional[str], default None) – Name of the column in adata.obs used to create stacked bars on the y axis. If None, the bars are not split.

  • stacked (bool, default True) – Whether to stack the bars or not.

  • save (Optional[str], default None) – Path to save the plot. If None, the plot is not saved.

  • figsize (Optional[Tuple[int | float, int | float]], default None) – Size of figure, e.g. (4, 8). If None, size is determined automatically depending on whether groupby is None or not.

  • add_labels (bool, default False) – Whether to add labels to the bars giving the number/percentage of cells.

  • **kwargs (Any) – Additional arguments passed to pandas.DataFrame.plot.bar.

Returns:

axarr – Array of axes objects containing the plot(s).

Return type:

Iterable[matplotlib.axes.Axes]

Examples

pl.n_cells_barplot(adata, x="louvain")
../_images/plotting-4.png
pl.n_cells_barplot(adata, x="louvain", groupby="condition")
../_images/plotting-5.png
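The numbers behind such a barplot can be sketched in plain Python. This is only an illustration, not the library's implementation: the helper name `cells_per_group` and the toy labels are hypothetical, and it assumes percentages are taken within each x category.

```python
from collections import Counter


def cells_per_group(x_labels, group_labels=None):
    """Count cells per (x, group) pair and convert counts to percentages.

    Percentages are computed within each x category (an assumption
    about how the percentage panel is normalized).
    """
    if group_labels is None:
        # Without a second grouping, every cell falls into one pseudo-group
        group_labels = ["all"] * len(x_labels)
    counts = Counter(zip(x_labels, group_labels))
    totals = Counter(x_labels)
    percentages = {key: 100 * n / totals[key[0]] for key, n in counts.items()}
    return dict(counts), percentages


# Toy labels standing in for adata.obs["louvain"] and adata.obs["condition"]
clusters = ["0", "0", "0", "1", "1", "2"]
conditions = ["C1", "C2", "C1", "C1", "C1", "C3"]

counts, percentages = cells_per_group(clusters, conditions)
print(counts[("0", "C1")], round(percentages[("0", "C1")], 1))
```

With real data, the two label lists would come from the columns passed as x and groupby.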
sctoolbox.plotting.qc_filter.group_correlation(adata: AnnData, groupby: str, method: Literal['spearman', 'pearson', 'kendall'] | Callable = 'spearman', save: str | None = None, **kwargs: Any) ClusterGrid[source]

Plot correlation matrix between groups in groupby.

The function expects the count data in .X to be normalized across cells.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix object.

  • groupby (str) – Name of the column in adata.obs to group cells by.

  • method (Literal["spearman", "pearson", "kendall"] | Callable, default "spearman") – Correlation method to use. See pandas.DataFrame.corr for options.

  • save (Optional[str], default None) – Path to save the plot. If None, the plot is not saved.

  • **kwargs (Any) – Additional arguments passed to seaborn.clustermap.

Return type:

sns.matrix.ClusterGrid

Examples

import scanpy as sc
import sctoolbox.plotting as pl
adata = sc.datasets.pbmc68k_reduced()
pl.group_correlation(adata, "phase", method="spearman", save=None)
../_images/plotting-8.png
sctoolbox.plotting.qc_filter.plot_insertsize(adata: AnnData, barcodes: list[str] | None = None, **kwargs: Any) Axes[source]

Plot insertsize distribution for barcodes in adata. Requires adata.uns[“insertsize_distribution”] to be set.

Parameters:
  • adata (sc.AnnData) – AnnData object containing insertsize distribution in adata.uns[“insertsize_distribution”].

  • barcodes (Optional[list[str]], default None) – Subset of barcodes to plot information for. If None, all barcodes are used.

  • **kwargs (Any) – Additional arguments passed to seaborn.lineplot.

Returns:

ax – Axes object containing the plot.

Return type:

matplotlib.axes.Axes

Raises:

ValueError – If adata.uns[“insertsize_distribution”] is not set.

sctoolbox.plotting.qc_filter.quality_violin(adata: AnnData, columns: list[str], which: Literal['obs', 'var'] = 'obs', groupby: str | None = None, ncols: int = 2, header: list[str] | None = None, color_list: list[str | tuple[float | int, float | int, float | int]] | None = None, title: str | None = None, thresholds: dict[str, dict[str, dict[Literal['min', 'max'], int | float]] | dict[Literal['min', 'max'], int | float]] | None = None, global_threshold: bool = True, interactive: bool = True, save: str | None = None, **kwargs: Any) tuple[Any, dict[str, Any]][source]

Plot quality measurements for cells/features in an anndata object.

Notes

In a notebook, “%matplotlib widget” must be run before the call for the interactive sliders to work.

Parameters:
  • adata (sc.AnnData) – Anndata object containing quality measures in .obs/.var

  • columns (list[str]) – A list of columns in .obs/.var to show measures for.

  • which (Literal["obs", "var"], default "obs") – Which table to show quality for. Either “obs” / “var”.

  • groupby (Optional[str], default None) – A column in the table to group values by on the x-axis.

  • ncols (int, default 2) – Number of columns in the plot.

  • header (Optional[list[str]], default None) – A list of custom headers for each measure given in columns.

  • color_list (Optional[list[str]], default None) – A list of colors to use for violins. If None, colors are chosen automatically.

  • title (Optional[str], default None) – The title of the full plot.

  • thresholds (Optional[dict[str, dict[str, dict[Literal["min", "max"], int | float]] | dict[Literal["min", "max"], int | float]]], default None) – Dictionary containing initial min/max thresholds to show in plot.

  • global_threshold (bool, default True) – Whether to use global thresholding as the initial setting. If False, thresholds are set per group.

  • interactive (bool, default True) – Whether to show interactive sliders. If False, the static matplotlib plot is shown.

  • save (Optional[str], default None) – Path to save the figure. If None, the figure is not saved.

  • **kwargs (Any) – Additional arguments passed to seaborn.violinplot.

Returns:

Tuple[Union[matplotlib.figure.Figure, ipywidgets.HBox], Dict[str, Union[List[ipywidgets.FloatRangeSlider.observe], Dict[str, ipywidgets.FloatRangeSlider.observe]]]] – The first element contains the figure (static) or the figure and sliders (interactive). The second element is a nested dict of slider values that are continuously updated.

Return type:

Tuple[Any, Dict[str, Any]]

Raises:

ValueError – If ‘which’ is not ‘obs’ or ‘var’ or if columns are not in table.
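The two shapes accepted by the thresholds parameter (one min/max pair per column, or one pair per group within a column) can be illustrated with a small checker. The helper `threshold_scope` and the column and group names are hypothetical and only follow the type annotation above; they are not part of sctoolbox.

```python
def threshold_scope(thresholds):
    """Classify each column entry as 'global' or 'grouped'.

    global:  {"min": value, "max": value}
    grouped: {group_name: {"min": value, "max": value}}
    """
    allowed = {"min", "max"}
    scopes = {}
    for column, entry in thresholds.items():
        if set(entry) <= allowed:
            scopes[column] = "global"
        elif all(isinstance(v, dict) and set(v) <= allowed for v in entry.values()):
            scopes[column] = "grouped"
        else:
            raise ValueError(f"Unrecognized threshold format for '{column}'")
    return scopes


# Hypothetical column and group names, for illustration only
thresholds = {
    # One threshold pair applied to all cells
    "n_genes": {"min": 200, "max": 5000},
    # Separate threshold pairs per group
    "pct_mito": {"C1": {"max": 10}, "C2": {"min": 0, "max": 15}},
}

print(threshold_scope(thresholds))
```

A dictionary of this form would be passed as thresholds=... to set the initial slider positions.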

sctoolbox.plotting.qc_filter.get_slider_thresholds(slider_dict: dict) dict[source]

Get thresholds from sliders.

Parameters:

slider_dict (dict) – Dictionary of sliders in the format ‘slider_dict[column][group] = slider’ or ‘slider_dict[column] = slider’ if no grouping.

Returns:

dict in the format threshold_dict[column][group] = {“min”: <min_threshold>, “max”: <max_threshold>} or threshold_dict[column] = {“min”: <min_threshold>, “max”: <max_threshold>} if no grouping

Return type:

dict
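The conversion described above can be sketched as follows. This is not the library's code: `RangeSliderStub` is a hypothetical stand-in that models only the `value` attribute, which on an ipywidgets range slider is a (lower, upper) tuple, and `thresholds_from_sliders` is an illustrative name.

```python
from dataclasses import dataclass


@dataclass
class RangeSliderStub:
    """Stand-in for a range slider; only the (min, max) `value` tuple is modeled."""
    value: tuple


def thresholds_from_sliders(slider_dict):
    """Turn a (possibly nested) dict of sliders into a dict of min/max thresholds."""
    threshold_dict = {}
    for column, entry in slider_dict.items():
        if isinstance(entry, dict):
            # Grouped case: slider_dict[column][group] = slider
            threshold_dict[column] = {
                group: {"min": slider.value[0], "max": slider.value[1]}
                for group, slider in entry.items()
            }
        else:
            # Ungrouped case: slider_dict[column] = slider
            threshold_dict[column] = {"min": entry.value[0], "max": entry.value[1]}
    return threshold_dict


sliders = {
    "n_genes": RangeSliderStub(value=(200.0, 5000.0)),
    "pct_mito": {
        "C1": RangeSliderStub(value=(0.0, 10.0)),
        "C2": RangeSliderStub(value=(0.0, 15.0)),
    },
}
print(thresholds_from_sliders(sliders))
```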


highly_variable

Plots for highly variable genes, e.g. as a result of sc.pp.highly_variable_genes.

sctoolbox.plotting.highly_variable.violin_HVF_distribution(adata: AnnData, **kwargs: Any)[source]

Plot the distribution of the HVF as a violin plot.

Parameters:
  • adata (sc.AnnData) – AnnData object containing the columns [‘highly_variable’, ‘n_cells_by_counts’].

  • **kwargs (Any) – Keyword arguments to be passed to matplotlib.pyplot.violinplot.

sctoolbox.plotting.highly_variable.scatter_HVF_distribution(adata: AnnData, **kwargs: Any)[source]

Plot the distribution of the HVF as a scatter plot.

Parameters:
  • adata (sc.AnnData) – AnnData object containing the columns [‘variability_score’, ‘n_cells’].

  • **kwargs (Any) – Keyword arguments to be passed to matplotlib.pyplot.scatter.

embedding

Functions for plotting different single cell embeddings, e.g. UMAP, PCA, tSNE.

sctoolbox.plotting.embedding.sc_colormap() ListedColormap[source]

Get a colormap with 0-count cells colored grey (to use for embeddings).

Returns:

cmap – Colormap with 0-count cells colored grey.

Return type:

matplotlib.colors.ListedColormap

sctoolbox.plotting.embedding.grey_colormap() ListedColormap[source]

Get a colormap with grey-scale colors, but without white to still show cells.

Returns:

cmap – Grey-scale colormap.

Return type:

matplotlib.colors.ListedColormap

sctoolbox.plotting.embedding.flip_embedding(adata: AnnData, key: str = 'X_umap', how: Literal['vertical', 'horizontal'] = 'vertical')[source]

Flip the embedding in adata.obsm[key] along the given axis.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix object.

  • key (str, default "X_umap") – Key in adata.obsm to flip.

  • how (Literal["vertical", "horizontal"], default "vertical") – Axis to flip along. Can be “vertical” (flips up/down) or “horizontal” (flips left/right).

Raises:
  • KeyError – If the given key is not found in adata.obsm.

  • ValueError – If the given ‘how’ is not supported.
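What a flip means for the coordinates can be sketched with NumPy. This assumes that “vertical” negates the second component and “horizontal” negates the first; the helper `flip_coordinates` is hypothetical and, unlike a function operating on adata.obsm, it returns a copy rather than modifying anything in place.

```python
import numpy as np


def flip_coordinates(coords, how="vertical"):
    """Return a copy of an (n_cells, n_dims) coordinate array mirrored along one axis."""
    flipped = np.array(coords, dtype=float)  # copy, so the input stays untouched
    if how == "vertical":
        flipped[:, 1] = -flipped[:, 1]  # up/down: negate the second component
    elif how == "horizontal":
        flipped[:, 0] = -flipped[:, 0]  # left/right: negate the first component
    else:
        raise ValueError(f"Unsupported value for 'how': {how!r}")
    return flipped


# Toy coordinates standing in for adata.obsm["X_umap"]
umap_coords = np.array([[1.0, 2.0], [-3.0, 4.0]])
print(flip_coordinates(umap_coords, how="vertical"))
print(flip_coordinates(umap_coords, how="horizontal"))
```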

sctoolbox.plotting.embedding.plot_embedding(adata: AnnData, method: str = 'umap', color: list[str | None] | str | None = None, style: Literal['dots', 'hexbin', 'density'] = 'dots', show_borders: bool = False, show_contour: bool = False, show_count: bool = True, show_title: bool = True, hexbin_gridsize: int = 30, shrink_colorbar: float | int = 0.3, square: bool = True, save: str | None = None, **kwargs) npt.ArrayLike[source]

Plot a dimensionality reduction embedding e.g. UMAP or tSNE with different style options. This is a wrapper around scanpy.pl.embedding.

Parameters:
  • adata (anndata.AnnData) – Annotated data matrix object.

  • method (str, default "umap") – Dimensionality reduction method to use. Must be a key in adata.obsm, or a method available as “X_<method>” such as “umap”, “tsne” or “pca”.

  • color (Optional[str | list[str]], default None) – Key for annotation of observations/cells or variables/genes.

  • style (Literal["dots", "hexbin", "density"], default "dots") – Style of the plot. Must be one of “dots”, “hexbin” or “density”.

  • show_borders (bool, default False) – Whether to show borders around embedding plot. If False, the borders are removed and a small legend is added to the plot.

  • show_contour (bool, default False) – Whether to show a contour plot on top of the plot.

  • show_count (bool, default True) – Whether to show the number of cells in the plot.

  • show_title (bool, default True) – Whether to show the titles of the plots. If False, the titles are removed and the names are added to the colorbar/legend instead.

  • hexbin_gridsize (int, default 30) – Number of hexbins across plot - higher values give smaller bins. Only used if style=”hexbin”.

  • shrink_colorbar (float | int, default 0.3) – Shrink the height of the colorbar by this factor.

  • square (bool, default True) – Whether to make the plot square.

  • save (Optional[str], default None) – Filename to save the figure.

  • **kwargs (Any) – Additional keyword arguments are passed to scanpy.pl.embedding().

Returns:

axes – Array of axis objects

Return type:

npt.ArrayLike

Raises:
  • KeyError – If the given method is not found in adata.obsm.

  • ValueError – If the ‘components’ given is larger than the number of components in the embedding.

Examples

pl.plot_embedding(adata, color="louvain", legend_loc="on data")
../_images/plotting-9.png
_ = pl.plot_embedding(adata, method="pca", color="n_genes", show_contour=True, show_title=False)
../_images/plotting-10.png
_ = pl.plot_embedding(adata, color=['n_genes', 'HES4'], style="hexbin")
../_images/plotting-11.png
_ = pl.plot_embedding(adata, method="pca", color=['n_genes', 'HES4'],
                      style="hexbin", components=["1,2", "2,3"], ncols=2)
../_images/plotting-12.png
ax = pl.plot_embedding(adata, color=['n_genes', 'louvain'], style="density")
../_images/plotting-13.png
sctoolbox.plotting.embedding.search_umap_parameters(adata: AnnData, min_dist_range: tuple[float | int, float | int, float | int] = (0.2, 0.9, 0.2), spread_range: tuple[float | int, float | int, float | int] = (0.5, 2.0, 0.5), color: str | None = None, n_components: int = 2, threads: int = 4, save: str | None = None, **kwargs: Any) ndarray[source]

Plot a grid of different combinations of min_dist and spread variables for UMAP plots.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix object.

  • min_dist_range (Tuple[float | int, float | int, float | int], default: (0.2, 0.9, 0.2)) – Range of ‘min_dist’ parameter values to test. Must be a tuple in the form (min, max, step).

  • spread_range (Tuple[float | int, float | int, float | int], default (0.5, 2.0, 0.5)) – Range of ‘spread’ parameter values to test. Must be a tuple in the form (min, max, step).

  • color (Optional[str], default None) – Name of the column in adata.obs to color plots by. If None, plots are not colored.

  • n_components (int, default 2) – Number of components in UMAP calculation.

  • threads (int, default 4) – Number of threads to use for UMAP calculation.

  • save (Optional[str], default None) – Path to save the figure to. If None, the figure is not saved.

  • **kwargs (Any) – Additional keyword arguments are passed to scanpy.tl.umap().

Returns:

2D numpy array of axis objects

Return type:

np.ndarray

Examples

pl.search_umap_parameters(adata, min_dist_range=(0.2, 0.9, 0.2),
                                 spread_range=(2.0, 3.0, 0.5),
                                 color="bulk_labels")
../_images/plotting-14.png
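How two (min, max, step) tuples turn into a grid of parameter combinations can be sketched as below. The helper `expand_range` is hypothetical, and it assumes the upper bound is excluded (as with numpy.arange); the library may treat the upper bound differently.

```python
from itertools import product


def expand_range(range_tuple):
    """Expand (min, max, step) into a list of values, excluding max (half-open)."""
    start, stop, step = range_tuple
    if step <= 0:
        raise ValueError("step must be positive")
    values = []
    i = 0
    # The small tolerance guards against floating point drift near the upper bound
    while start + i * step < stop - 1e-9:
        values.append(round(start + i * step, 10))
        i += 1
    return values


min_dists = expand_range((0.2, 0.9, 0.2))
spreads = expand_range((0.5, 2.0, 0.5))

# One panel per (spread, min_dist) combination
grid = list(product(spreads, min_dists))
print(min_dists, spreads, len(grid))
```

Under this assumption, the default ranges give four min_dist values and three spread values, so each added step multiplies the number of UMAP runs.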
sctoolbox.plotting.embedding.search_tsne_parameters(adata: AnnData, perplexity_range: tuple[int, int, int] = (30, 60, 10), learning_rate_range: tuple[int, int, int] = (600, 1000, 200), color: str | None = None, threads: int = 4, save: str | None = None, **kwargs: Any) ndarray[source]

Plot a grid of different combinations of perplexity and learning_rate variables for tSNE plots.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix object.

  • perplexity_range (Tuple[int, int, int], default (30, 60, 10)) – tSNE parameter: Range of ‘perplexity’ parameter values to test. Must be a tuple in the form (min, max, step).

  • learning_rate_range (Tuple[int, int, int], default (600, 1000, 200)) – tSNE parameter: Range of ‘learning_rate’ parameter values to test. Must be a tuple in the form (min, max, step).

  • color (Optional[str], default None) – Name of the column in adata.obs to color plots by. If None, plots are not colored.

  • threads (int, default 4) – The threads parameter is currently not supported. Please set it to 1. This may be fixed in the future.

  • save (Optional[str], default None (not saved)) – Path to save the figure to.

  • **kwargs (Any) – Additional keyword arguments are passed to scanpy.tl.tsne().

Returns:

2D numpy array of axis objects

Return type:

np.ndarray

Examples

pl.search_tsne_parameters(adata, perplexity_range=(30, 60, 10),
                                 learning_rate_range=(600, 1000, 200),
                                 color="bulk_labels")
../_images/plotting-15.png
sctoolbox.plotting.embedding.plot_group_embeddings(adata: AnnData, groupby: str, embedding: Literal['umap', 'tsne', 'pca'] = 'umap', ncols: int = 4, save: str | None = None, **kwargs: Any) ndarray[source]

Plot a grid of embeddings (UMAP/tSNE/PCA) per group of cells within ‘groupby’.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix object.

  • groupby (str) – Name of the column in adata.obs to group by.

  • embedding (Literal["umap", "tsne", "pca"], default "umap") – Embedding to plot. Must be one of “umap”, “tsne”, “pca”.

  • ncols (int, default 4) – Number of columns in the figure.

  • save (Optional[str], default None) – Path to save the figure.

  • **kwargs (Any) – Additional keyword arguments are passed to scanpy.pl.umap() or scanpy.pl.tsne() or scanpy.pl.pca().

Returns:

Flat numpy array of axis objects

Return type:

np.ndarray

Examples

pl.plot_group_embeddings(adata, 'phase', embedding='umap', ncols=4)
../_images/plotting-16.png
sctoolbox.plotting.embedding.compare_embeddings(adata_list: list[AnnData], var_list: list[str] | str, embedding: Literal['umap', 'tsne', 'pca'] = 'umap', adata_names: list[str] | None = None, **kwargs: Any) ndarray[source]

Compare embeddings across different adata objects.

Plots a grid of embeddings with the different adatas on the x-axis, and colored variables on the y-axis.

Parameters:
  • adata_list (list[sc.AnnData]) – List of AnnData objects to compare.

  • var_list (list[str] | str) – List of variables to color in plot.

  • embedding (Literal["umap", "tsne", "pca"], default "umap") – Embedding to plot. Must be one of “umap”, “tsne” or “pca”.

  • adata_names (Optional[list[str]], default None (adatas will be named adata_1, adata_2, etc.)) – List of names for the adata objects. Must be the same length as adata_list or None

  • **kwargs (Any) – Additional arguments to pass to sc.pl.umap/sc.pl.tsne/sc.pl.pca.

Returns:

2D numpy array of axis objects

Return type:

np.ndarray

Raises:

ValueError – If none of the variables in var_list are found in any of the adata objects.

Examples

import scanpy as sc
adata1 = sc.datasets.pbmc68k_reduced()
adata2 = sc.datasets.pbmc3k_processed()
adata_list = [adata1, adata2]
var_list = ['n_counts', 'n_cells']
pl.compare_embeddings(adata_list, var_list)
../_images/plotting-19.png
sctoolbox.plotting.embedding.plot_3D_UMAP(adata: AnnData, color: str, save: str, **kwargs: Any) None[source]

Save a 3D UMAP plot to an HTML file.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix.

  • color (str) – Variable to color in plot. Must be a column in adata.obs or an index in adata.var.

  • save (str) – Save prefix. Plot will be saved to <save>.html.

  • **kwargs (Any) – Additional keyword arguments are passed to plotly.graph_objects.Scatter3d().

Raises:

KeyError – If the given ‘color’ attribute was not found in adata.obs columns or adata.var index.

Examples

min_dist = 0.3
spread = 2.5
sc.tl.umap(adata, min_dist=min_dist, spread=spread, n_components=3)
pl.plot_3D_UMAP(adata, color="louvain", save="my3d_umap")

This will create an .html-file with the interactive 3D UMAP: my3d_umap.html

sctoolbox.plotting.embedding.umap_marker_overview(adata: AnnData, markers: list[str] | str, ncols: int = 3, figsize: tuple[int, int] | None = None, save: str | None = None, cbar_label: str = 'Relative expr.', **kwargs: Any) list[source]

Plot a pretty grid of UMAPs with marker gene expression.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix.

  • markers (list[str] | str) – List of markers or a single marker.

  • ncols (int, default 3) – Number of columns in grid.

  • figsize (Optional[Tuple[int, int]], default None) – Tuple of figure size.

  • save (Optional[str], default None) – If not None save plot under given name.

  • cbar_label (str, default "Relative expr.") – Colorbar label

  • **kwargs (Any) – Additional parameter for scanpy.pl.umap()

Returns:

List of axis objects

Return type:

list

sctoolbox.plotting.embedding.anndata_overview(adatas: dict[str, AnnData], color_by: str | list[str], plots: list[Literal['UMAP', 'tSNE', 'PCA', 'PCA-var', 'LISI']] | Literal['UMAP', 'tSNE', 'PCA', 'PCA-var', 'LISI'] = ['PCA', 'PCA-var', 'UMAP', 'LISI'], figsize: tuple[int, int] | None = None, max_clusters: int = 20, output: str | None = None, dpi: int = 300, **kwargs: Any) npt.ArrayLike[source]

Create a multipanel plot comparing PCA/UMAP/tSNE/(…) plots for different adata objects.

Parameters:
  • adatas (dict[str, sc.AnnData]) – Dict containing an anndata object for each batch correction method as values. Keys are the name of the respective method. E.g.: {“bbknn”: anndata}

  • color_by (str | list[str]) – Name of the .obs column to use for coloring in applicable plots (e.g. for UMAP or PCA).

  • plots (list[Literal["UMAP", "tSNE", "PCA", "PCA-var", "LISI"]] | Literal["UMAP", "tSNE", "PCA", "PCA-var", "LISI"], default ["PCA", "PCA-var", "UMAP", "LISI"]) – Decide which plots should be created. Note: List order is forwarded to plot. Options are:
      - UMAP: Plots the UMAP embedding of the data.
      - tSNE: Plots the tSNE embedding of the data.
      - PCA: Plots the PCA embedding of the data.
      - PCA-var: Plots the variance explained by each PCA component.
      - LISI: Plots the distribution of any “LISI_score*” scores available in adata.obs.

  • figsize (Optional[Tuple[int, int]], default None) – Size of the plot in inch. Defaults to automatic size based on number of columns/rows.

  • max_clusters (int, default 20) – Maximum number of clusters to show in legend.

  • output (Optional[str], default None) – Path to plot output file.

  • dpi (int, default 300) – Dots per inch for output

  • **kwargs (Any) – Additional keyword arguments are passed to scanpy.pl.umap(), scanpy.pl.tsne() or scanpy.pl.pca().

Returns:

axes – Array of matplotlib.axes.Axes objects created by matplotlib.

Return type:

npt.ArrayLike

Raises:

ValueError – If any of the adatas is not of type anndata.AnnData.

Examples

adatas = {}  # dictionary of adata objects
adatas["standard"] = adata
adatas["parameter1"] = sc.tl.umap(adata, min_dist=1, copy=True)
adatas["parameter2"] = sc.tl.umap(adata, min_dist=2, copy=True)

pl.anndata_overview(adatas, color_by="louvain", plots=["PCA", "PCA-var", "UMAP"])
../_images/plotting-22.png
sctoolbox.plotting.embedding.plot_pca_variance(adata: AnnData, method: str = 'pca', n_pcs: int = 20, selected: list[int] | None = None, show_cumulative: bool = True, n_thresh: int | None = None, corr_plot: Literal['spearmanr', 'pearsonr'] | None = None, corr_on: Literal['obs', 'var'] = 'obs', corr_thresh: float | None = None, ax: Axes | None = None, save: str | None = None, sel_col: str = 'grey', om_col: str = 'lightgrey') Axes[source]

Plot the PCA variance explained by each component as a barplot.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix object.

  • method (str, default "pca") – Method used for calculating variation. Is used to look for the coordinates in adata.uns[<method>].

  • n_pcs (int, default 20) – Number of components to plot.

  • selected (Optional[List[int]], default None) – Numbers of the components to highlight in the plot.

  • show_cumulative (bool, default True) – Whether to show the cumulative variance explained in a second y-axis.

  • n_thresh (Optional[int], default None) – Enables a vertical threshold line.

  • corr_plot (Optional[str], default None) – Enable correlation plot. Shows highest absolute correlation for each bar.

  • corr_on (Literal["obs", "var"], default "obs") – Calculate correlation on either observations (adata.obs) or variables (adata.var).

  • corr_thresh (Optional[float], default None) – Enables a red threshold line in the lower plot.

  • ax (Optional[matplotlib.axes.Axes], default None) – Axes object to plot on. If None, a new figure is created.

  • save (Optional[str], default None (not saved)) – Filename to save the figure. If None, the figure is not saved.

  • sel_col (str, default "grey") – Bar color of selected bars.

  • om_col (str, default "lightgrey") – Bar color of omitted bars.

Returns:

Axes object containing the plot.

Return type:

matplotlib.axes.Axes

Raises:

KeyError – If the given method is not found in adata.uns.

Examples

pl.plot_pca_variance(adata, method="pca",
              n_pcs=20,
              selected=[2, 3, 4, 5, 7, 8, 9],
              corr_plot="spearmanr")
../_images/plotting-23.png
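The cumulative curve shown on the second y-axis is a running sum of the per-component variance ratios. A minimal sketch, using made-up ratios and hypothetical helper names; in scanpy the ratios are usually found under adata.uns["pca"]["variance_ratio"].

```python
from itertools import accumulate


def cumulative_variance(variance_ratio, n_pcs=20):
    """Running sum of the variance explained by the first n_pcs components."""
    return list(accumulate(list(variance_ratio)[:n_pcs]))


def components_needed(variance_ratio, target=0.8):
    """1-based number of components needed to reach the target, or None if never reached."""
    for n, total in enumerate(accumulate(variance_ratio), start=1):
        if total >= target - 1e-12:  # tolerance for floating point summation
            return n
    return None


# Made-up variance ratios, for illustration only
ratios = [0.40, 0.25, 0.15, 0.10, 0.05]

print(cumulative_variance(ratios, n_pcs=3))
print(components_needed(ratios, target=0.8))
```

A value found this way is one possible choice for a threshold line such as n_thresh.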
sctoolbox.plotting.embedding.plot_pca_correlation(adata: AnnData, which: Literal['obs', 'var'] = 'obs', basis: str = 'pca', n_components: int = 10, columns: list[str] | None = None, pvalue_threshold: float = 0.01, method: Literal['spearmanr', 'pearsonr'] = 'spearmanr', plot_values: Literal['corrcoefs', 'pvalues'] = 'corrcoefs', figsize: tuple[int, int] | None = None, title: str | None = None, save: str | None = None, **kwargs: Any) Axes[source]

Plot a heatmap of the correlation between dimensionality reduction coordinates (e.g. umap or pca) and the given columns.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix object.

  • which (Literal["obs", "var"], default "obs") – Whether to use the observations (“obs”) or variables (“var”) for the correlation.

  • basis (str, default "pca") – Dimensionality reduction to calculate correlation with. Must be a key in adata.obsm, or a basis available as “X_<basis>” such as “umap”, “tsne” or “pca”.

  • n_components (int, default 10) – Number of components to use for the correlation.

  • columns (Optional[list[str]], default None) – List of columns to use for the correlation. If None, all numeric columns are used.

  • pvalue_threshold (float, default 0.01) – Threshold for significance of correlation. If the p-value is below this threshold, a star is added to the heatmap.

  • method (Literal["spearmanr", "pearsonr"], default "spearmanr") – Method to use for correlation. Must be either “pearsonr” or “spearmanr”.

  • plot_values (Literal["corrcoefs", "pvalues"], default "corrcoefs") – Values which will be used to plot the heatmap, either “corrcoefs” (correlation coefficients) or “pvalues”. P-values will be shown as np.sign(corrcoefs)*np.log10(p-value), the logged p-value with the sign of the corresponding correlation coefficient.

  • figsize (Optional[Tuple[int, int]], default None) – Size of the figure in inches. If None, the size is automatically determined.

  • title (Optional[str], default None) – Title of the plot. If None, no title is added.

  • save (Optional[str], default None) – Filename to save the figure.

  • **kwargs (Any) – Additional keyword arguments are passed to seaborn.heatmap().

Returns:

ax – Axes object containing the heatmap.

Return type:

matplotlib.axes.Axes

Examples

pl.plot_pca_correlation(adata, which="obs")
../_images/plotting-24.png
pl.plot_pca_correlation(adata, basis="umap")
../_images/plotting-25.png
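The “pvalues” display value and the significance star can be worked through for single numbers. This sketch takes the formula np.sign(corrcoefs)*np.log10(p-value) literally from the plot_values description; the helper names are hypothetical.

```python
import math


def signed_log_pvalue(corrcoef, pvalue):
    """sign(corrcoef) * log10(pvalue), as written in the plot_values description."""
    sign = (corrcoef > 0) - (corrcoef < 0)  # scalar equivalent of np.sign
    return sign * math.log10(pvalue)


def is_significant(pvalue, pvalue_threshold=0.01):
    """True if the p-value is below the threshold, i.e. the cell would get a star."""
    return pvalue < pvalue_threshold


# Made-up correlation results, for illustration only
for corrcoef, pvalue in [(0.6, 0.001), (-0.6, 0.001), (0.1, 0.5)]:
    print(corrcoef, pvalue, signed_log_pvalue(corrcoef, pvalue), is_significant(pvalue))
```

Because log10 of a p-value below 1 is negative, the formula as written gives negative values for positive correlations and positive values for negative ones.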

clustering

Functions for plotting clustering results e.g. UMAPs colored by clusters.

sctoolbox.plotting.clustering.search_clustering_parameters(adata: AnnData, method: Literal['leiden', 'louvain'] = 'leiden', resolution_range: tuple[float | int, float | int, float | int] = (0.1, 1, 0.1), embedding: str = 'X_umap', ncols: int = 3, verbose: bool = True, save: str | None = None, **kwargs: Any) ndarray[source]

Plot a grid of different resolution parameters for clustering.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix object.

  • method (str, default: "leiden") – Clustering method to use. Can be one of ‘leiden’ or ‘louvain’.

  • resolution_range (Tuple[float | int, float | int, float | int], default: (0.1, 1, 0.1)) – Range of ‘resolution’ parameter values to test. Must be a tuple in the form (min, max, step).

  • embedding (str, default: "X_umap") – Embedding method to use. Must be a key in adata.obsm. If not, will try to use f”X_{embedding}”.

  • ncols (int, default: 3) – Number of columns in the grid.

  • verbose (bool, default: True) – Print progress to console.

  • save (Optional[str], default None) – Path to save figure.

  • **kwargs (Any) – Keyword arguments to be passed to sc.pl.embedding.

Returns:

axarr – Array of axes objects containing the plot(s).

Return type:

np.ndarray

Raises:
  • ValueError – If step is larger than max - min.

  • KeyError – If embedding is not found in adata.obsm.

Examples

pl.search_clustering_parameters(adata, method='louvain', resolution_range=(0.1, 2, 0.2), embedding='X_umap', ncols=3, verbose=True, save=None)
../_images/plotting-26.png
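The range check from the Raises section and the resulting grid layout can be sketched as follows. The helper `resolution_grid` is hypothetical, and the upper bound is assumed to be excluded, which may differ from the library's behavior.

```python
import math


def resolution_grid(resolution_range, ncols=3):
    """Validate (min, max, step) and return the resolutions plus the number of grid rows."""
    start, stop, step = resolution_range
    if step > stop - start:
        raise ValueError("step is larger than max - min")
    # Number of values in the half-open interval [start, stop)
    n_values = math.ceil((stop - start) / step - 1e-9)
    resolutions = [round(start + i * step, 10) for i in range(n_values)]
    nrows = math.ceil(len(resolutions) / ncols)
    return resolutions, nrows


resolutions, nrows = resolution_grid((0.1, 2, 0.2), ncols=3)
print(resolutions, nrows)
```

Each resolution corresponds to one clustering run and one panel, so a small step over a wide range quickly increases runtime.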
sctoolbox.plotting.clustering.marker_gene_clustering(adata: AnnData, groupby: str, marker_genes_dict: dict[str, list[str]], show_umap: bool = True, save: str | None = None, figsize: tuple[int | float, int | float] | None = None, **kwargs: Any) list[source]

Plot an overview of marker genes and clustering.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix.

  • groupby (str) – Key in adata.obs for which to plot the clustering.

  • marker_genes_dict (dict[str, list[str]]) – Dictionary of marker genes to plot. Keys are the names of the groups and values are lists of marker genes.

  • show_umap (bool, default: True) – Whether to show a UMAP plot on the left.

  • save (Optional[str], default: None) – If given, save the figure to this path.

  • figsize (Tuple[float | int, float | int], default: None) – Size of the figure. If None, use default size.

  • **kwargs (Any) – Keyword arguments to be passed to sc.pl.dotplot.

Returns:

axarr – List of axes objects containing the plot(s).

Return type:

list

Examples

marker_genes_dict = {"S": ["PCNA"], "G2M": ["HMGB2"]}

pl.marker_gene_clustering(adata, "phase", marker_genes_dict, show_umap=True, save=None, figsize=None)
../_images/plotting-27.png

marker_genes

Plots for marker genes e.g. as results of sc.tl.rank_genes_groups.

sctoolbox.plotting.marker_genes.rank_genes_plot(adata: AnnData, key: str | None = 'rank_genes_groups', genes: list[str] | dict[str, list[str]] | None = None, n_genes: int = 15, dendrogram: bool = False, groupby: str | None = None, title: str | None = None, style: Literal['dots', 'heatmap'] = 'dots', measure: str = 'expression', save: str | None = None, **kwargs: Any) dict[source]

Plot expression of genes from rank_genes_groups or from a gene list/dict.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix.

  • key (Optional[str], default "rank_genes_groups") – Key from adata.uns to plot. For example, rank_genes_groups or rank_genes_groups_filtered.

  • genes (Optional[list[str] | dict[str, list[str]]], default None) – List of genes to plot across groups in ‘groupby’. If a dict is passed, the keys are the group names and the values are lists of genes. Setting ‘genes’ overrides the ‘key’ parameter.

  • n_genes (int, default 15) – Number of genes to plot if key is specified.

  • dendrogram (bool, default False) – Whether to show the dendrogram for groups.

  • groupby (Optional[str], default None) – Key from adata.obs to group cells by.

  • title (Optional[str], default None) – Title for the plot.

  • style (Literal["dots", "heatmap"], default "dots") – Style of the plot. Either dots or heatmap.

  • measure (str, default "expression") – Measure to write in colorbar label. For example, expression or accessibility.

  • save (Optional[str], default None) – If given, save the figure to this path.

  • **kwargs (Any) – Additional arguments passed to sc.pl.rank_genes_groups_dotplot or sc.pl.rank_genes_groups_matrixplot.

Raises:

ValueError

  1. If style is not one of dots or heatmap.
  2. If groupby is not specified when genes is specified.

Returns:

g – Dictionary containing the matplotlib axes objects for the plot.

Return type:

dict

Examples

pl.rank_genes_plot(adata, n_genes=5)
../_images/plotting-28.png
pl.rank_genes_plot(adata, genes={"group1": adata.var.index[:10], "group2": adata.var.index[10:20]}, groupby="bulk_labels")
../_images/plotting-29.png
sctoolbox.plotting.marker_genes.grouped_violin(adata: AnnData, x: str | list[str], y: str | None = None, groupby: str | None = None, figsize: tuple[int | float, int | float] | None = None, title: str | None = None, style: Literal['violin', 'boxplot', 'bar'] = 'violin', normalize: bool = False, ax: Axes | None = None, save: str | None = None, **kwargs: Any) Axes[source]

Create a violin plot of values across cells in an adata object, grouped by x and ‘groupby’.

Can for example show the expression of one gene across groups (x = obs_group, y = gene), expression of multiple genes grouped by cell type (x = gene_list, groupby = obs_cell_type), or values from adata.obs across cells (x = obs_group, y = obs_column).

Parameters:
  • adata (sc.AnnData) – Annotated data matrix.

  • x (str | list[str]) – Column name in adata.obs or gene name(s) in adata.var.index to group by on the x-axis. Multiple gene names can be given in a list.

  • y (Optional[str], default None) – A column name in adata.obs or a gene in adata.var.index to plot values for. Only needed if x is a column in adata.obs.

  • groupby (Optional[str], default None) – Column name in adata.obs to create grouped violins. If None, a single violin is plotted per group in ‘x’.

  • figsize (Optional[Tuple[int | float, int | float]], default None) – Figure size.

  • title (Optional[str], default None) – Title of the plot. If None, no title is shown.

  • style (Literal["violin", "boxplot", "bar"], default "violin") – Plot style. Either “violin” or “boxplot” or “bar”.

  • normalize (bool, default False) – If True, normalize the values in ‘y’ to the range [0, 1] per group in ‘x’.

  • ax (Optional[matplotlib.axes.Axes], default None) – A matplotlib axes object to plot violinplots in. If None, a new figure and axes is created.

  • save (Optional[str], default None) – Path to save the figure to. If None, the figure is not saved.

  • **kwargs (Any) – Additional arguments passed to seaborn.violinplot or seaborn.boxplot.

Return type:

matplotlib.axes.Axes

Raises:

ValueError – If x or y is neither a column in adata.obs nor a gene in adata.var.index.

Examples

pl.grouped_violin(adata, 'phase', y='G2M_score')
../_images/plotting-30.png
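
The normalize option is described as rescaling values to the range [0, 1] per group in ‘x’. A minimal sketch of that kind of per-group min-max scaling in plain Python is shown here. The helper name minmax_per_group and the sample values are hypothetical, and the actual implementation in sctoolbox may differ, e.g. in how constant groups are handled.

```python
from collections import defaultdict


def minmax_per_group(groups, values):
    """Min-max scale values to [0, 1] separately within each group."""
    by_group = defaultdict(list)
    for group, value in zip(groups, values):
        by_group[group].append(value)

    # Minimum and maximum of each group
    bounds = {group: (min(vals), max(vals)) for group, vals in by_group.items()}

    scaled = []
    for group, value in zip(groups, values):
        low, high = bounds[group]
        # A group without spread cannot be scaled; map it to 0.0 (an assumption)
        scaled.append((value - low) / (high - low) if high > low else 0.0)
    return scaled


# Hypothetical values for two groups
phases = ["G1", "G1", "G1", "S", "S"]
scores = [2.0, 6.0, 10.0, -1.0, 3.0]
scaled_scores = minmax_per_group(phases, scores)
```

After scaling, each group spans the full range from 0 to 1, so the shapes of the distributions can be compared even when the groups differ in absolute level.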
sctoolbox.plotting.marker_genes.group_expression_boxplot(adata: AnnData, gene_list: list[str], groupby: str, figsize: tuple[int | float, int | float] | None = None, **kwargs: Any) Axes[source]

Plot a boxplot showing summarized gene expression of genes in gene_list across the groups in groupby.

The total gene expression is quantile normalized per group and subsequently normalized to 0-1 per gene across groups.

Parameters:
  • adata (sc.AnnData) – An annotated data matrix object containing counts in .X.

  • gene_list (list[str]) – A list of genes to show expression for.

  • groupby (str) – A column in .obs for grouping cells into groups on the x-axis.

  • figsize (Optional[Tuple[int | float, int | float]], default None (matplotlib default)) – Control the size of the output figure, e.g. (6,10).

  • **kwargs (Any) – Additional arguments passed to seaborn.boxplot.

Return type:

matplotlib.axes.Axes

Examples

gene_list = ["HES4", "PRMT2", "ITGB2"]
pl.group_expression_boxplot(adata, gene_list, groupby="bulk_labels")
../_images/plotting-31.png
sctoolbox.plotting.marker_genes.gene_expression_heatmap(adata: AnnData, genes: list[str], cluster_column: str, gene_name_column: str | None = None, title: str | None = None, groupby: str | None = None, row_cluster: bool = True, col_cluster: bool = False, show_row_dendrogram: bool = False, show_col_dendrogram: bool = False, figsize: tuple[int | float, int | float] | None = None, save: str | None = None, **kwargs: Any) Any[source]

Plot a heatmap of z-score normalized gene expression across clusters/groups.

Parameters:
  • adata (sc.AnnData) – Annotated data matrix.

  • genes (list[str]) – List of genes to plot. Must match names in adata.var.index.

  • cluster_column (str) – Key in adata.obs for which to cluster the x-axis.

  • gene_name_column (Optional[str], default None) – Column in adata.var to use for gene row names. Default is to use the .var index.

  • title (Optional[str], default None) – Title of the plot.

  • groupby (Optional[str], default None) – Key in adata.obs for which to plot a colorbar per cluster.

  • row_cluster (bool, default True) – Whether to cluster the rows.

  • col_cluster (bool, default False) – Whether to cluster the columns.

  • show_row_dendrogram (bool, default False) – Whether to show the dendrogram for the rows.

  • show_col_dendrogram (bool, default False) – Whether to show the dendrogram for the columns.

  • figsize (Optional[Tuple[int | float, int | float]], default None) – Size of the figure. If None, use default size.

  • save (Optional[str], default None) – If given, save the figure to this path.

  • **kwargs (Any) – Additional arguments passed to seaborn.clustermap.

Returns:

g – sns.matrix.ClusterGrid: The seaborn ClusterGrid object containing the heatmap. Note: the return type is annotated as Any because sns.matrix.ClusterGrid cannot be checked by beartype.

Return type:

Any

Raises:

KeyError – If gene_name_column is not a column in adata.var.

Examples

adata.obs["samples"] = np.random.choice(["CTRL1", "CTRL2", "CTRL3", "CTRL4", "TREAT1", "TREAT2", "TREAT3", "TREAT4"], size=adata.shape[0])
adata.obs["condition"] = adata.obs["samples"].str.extract("([A-Z]+)")

genes = adata.var.index[:15]
pl.gene_expression_heatmap(adata, genes, cluster_column="samples",
                           groupby="condition",
                           title="Gene expression",
                           col_cluster=True,
                           show_col_dendrogram=True,
                           colors_ratio=0.03)
../_images/plotting-32.png
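
The heatmap shows z-score normalized expression, i.e. each value is expressed as the number of standard deviations from the mean of its gene. A minimal sketch of row-wise z-scoring in plain Python follows. The helper name zscore_rows and the input matrix are hypothetical, and using the population standard deviation is an assumption; the function itself may compute this differently.

```python
from statistics import mean, pstdev


def zscore_rows(matrix):
    """Z-score each row (gene) across its columns (clusters)."""
    result = []
    for row in matrix:
        mu = mean(row)
        sd = pstdev(row)  # population standard deviation (an assumption)
        # A constant row has no spread; report zeros instead of dividing by zero
        result.append([(value - mu) / sd if sd > 0 else 0.0 for value in row])
    return result


# Hypothetical mean expression: rows are genes, columns are clusters
mean_expression = [
    [1.0, 2.0, 3.0],
    [5.0, 5.0, 5.0],
]
z = zscore_rows(mean_expression)
```

Because every gene is centered and scaled separately, genes with very different absolute expression levels become comparable on one color scale.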
sctoolbox.plotting.marker_genes.plot_differential_genes(rank_table: DataFrame, title: str = 'Differentially expressed genes', save: str | None = None, **kwargs: Any) Axes[source]

Plot number of differentially expressed genes per contrast in a barplot.

Parameters:
  • rank_table (pd.DataFrame) – Output of sctoolbox.tools.marker_genes.pairwise_rank_genes.

  • title (str, default "Differentially expressed genes") – Title of the plot.

  • save (Optional[str], default None) – If given, save the figure to this path.

  • **kwargs (Any) – Keyword arguments passed to pl.bidirectional_barplot.

Raises:

ValueError – If no significant differentially expressed genes are found in the data.

Returns:

Axes object.

Return type:

matplotlib.axes.Axes

Examples

import sctoolbox.tools as tl
adata.obs["groups"] = np.random.choice(["G1", "G2", "G3"], size=adata.shape[0])
pairwise_table = tl.marker_genes.pairwise_rank_genes(adata, foldchange_threshold=0.2, groupby="groups")

pl.plot_differential_genes(pairwise_table)
../_images/plotting-33.png
sctoolbox.plotting.marker_genes.plot_gene_correlation(adata: AnnData, ref_gene: str, gene_list: list[str] | str, ncols: int = 3, figsize: tuple[int | float, int | float] | None = None, save: str | None = None, **kwargs: Any) Iterable[Axes][source]

Plot the gene expression of one reference gene against the expression of a set of genes.

Parameters:
  • adata (sc.AnnData) – An annotated data matrix object containing counts in .X.

  • ref_gene (str) – Reference gene to which the other genes are compared.

  • gene_list (list[str] | str) – A gene or a list of genes to plot against the reference gene.

  • ncols (int, default 3) – Number of columns in plot grid.

  • figsize (Optional[Tuple[int | float, int | float]], default None) – Control the size of the output figure, e.g. (6,10).

  • save (Optional[str], default None) – Save the figure to a file.

  • **kwargs (Any) – Additional arguments passed to seaborn.regplot.

Returns:

List containing all axis objects.

Return type:

Iterable[matplotlib.axes.Axes]

Examples

gene_list = ["HES4", "PRMT2", "ITGB2"]
pl.plot_gene_correlation(adata, "SUMO3", gene_list)
../_images/plotting-34.png

genometracks

Class to create a genome track plot via pyGenomeTracks.

class sctoolbox.plotting.genometracks.GenomeTracks[source]

Bases: object

Class for creating a genome track plot via pyGenomeTracks by collecting different tracks and writing the .ini file.

Examples

import sctoolbox.plotting as pl

G = pl.GenomeTracks()

#Add bigwig tracks
G.add_track("data/tracks/bigwig1.bw", color="red")
G.add_track("data/tracks/bigwig2.bw", color="blue", orientation="inverted")

#Add hlines to previous bigwig track
G.add_hlines([100, 200], color="red")
G.add_hlines([250], color="blue", line_style="dashed")

#Add links
G.add_track("data/tracks/links.arcs", orientation="inverted")

#Add one line between tracks
G.add_hline()

#Add .bed-file regions
G.add_track("data/tracks/tad_classification.bed", title="bed")
G.add_track("data/tracks/tad_classification.bed", color="Reds", title="bed colored by score column")

#Add vlines and highlight
G.add_track("data/tracks/vlines.bed", file_type="vlines")
G.add_track("data/tracks/vhighlight.bed", file_type="vhighlight")

#Add a spacer
G.add_spacer()

#Add genes
G.add_track("data/tracks/genes.gtf", gene_rows=5)

#Add x-axis
G.add_spacer()
G.add_xaxis()

# Plot
G.plot(region="X:3000000-3500000", output="genometrack_X.png", trackLabelFraction=0.2)
../_images/genometrack_X.png
add_track(file: str, file_type: str | None = None, name: str | None = None, **kwargs: Any)[source]

Add a track to the GenomeTracks object.

The track will be added to the configuration file as one element, e.g. .add_track("file1.bed", file_type="bed", name="my_bed") will add the following to the configuration file:

[my_bed]
file = file1.bed
file_type = bed

Additional parameters are decided by <obj>.global_defaults and <obj>.type_defaults, or can be given by kwargs. All options and parameters are available at: https://pygenometracks.readthedocs.io/en/latest/content/all_tracks.html

Parameters:
  • file (str) – Path to the file containing information to be plotted. Can be .bed, .bw, .gtf etc.

  • file_type (str, default None) – Specify the ‘file_type’ argument for pyGenomeTracks. If None, the type will be predicted from the file ending.

  • name (str, default None) – Name of the track. If None, the name will be estimated from the file_type, e.g. ‘bigwig 1’ or ‘bed 2’. If the file_type is not available, the name will be the file path.

  • **kwargs (arguments) – Additional arguments to be passed to pyGenomeTracks track configuration, for example height=5 or title="My track".

Raises:

ValueError – If the file_type is not valid.
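
To make the mapping from an add_track call to a configuration section more concrete, the following sketch renders one track as an .ini section using Python's standard configparser module. The function render_track is hypothetical and only illustrates the resulting file layout; it is not how GenomeTracks writes its configuration internally.

```python
import configparser
import io


def render_track(name, file, file_type, **options):
    """Render one track as an .ini section (illustrative only)."""
    config = configparser.ConfigParser()
    config[name] = {"file": file, "file_type": file_type}
    # Extra keyword arguments become additional "key = value" lines
    for key, value in options.items():
        config[name][key] = str(value)

    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


section = render_track("my_bed", "file1.bed", "bed", height=5)
print(section)
```

Each track becomes one named section, and every keyword argument becomes one line within it. Note that configparser lowercases option names by default, so this sketch is only suitable for lowercase keys.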

add_hlines(y_values: Iterable[int | float], overlay_previous: Literal['share-y', 'no'] = 'share-y', **kwargs: Any)[source]

Add horizontal lines to the previous plot.

Parameters:
  • y_values (list of int or float) – List of y values to plot horizontal lines at.

  • overlay_previous (str, default "share-y") – Whether to plot the lines on the same y-axis as the previous plot (“share-y”) or on a new y-axis (“no”).

  • **kwargs (arguments) – Additional arguments to be passed to pyGenomeTracks track configuration, for example title="My lines".

add_hline(height: int | float = 1, line_width: int | float = 2, **kwargs: Any)[source]

Add a horizontal line between tracks, not within a track.

Can be used to visually separate tracks.

Parameters:
  • height (int, default 1) – Height of the track with the line in the middle.

  • line_width (int, default 2) – Width of the line.

  • **kwargs (arguments) – Additional arguments to be passed to pyGenomeTracks track configuration, for example title="A line".

add_spacer(height: int | float = 1)[source]

Add a spacer between tracks.

Parameters:

height (int, default 1) – Height of the spacer track.

add_xaxis(height: int | float = 1, **kwargs: Any)[source]

Add the x-axis to the plot.

Parameters:
  • height (int, default 1) – Height of the x-axis track.

  • **kwargs (arguments) – Additional arguments to be passed to pyGenomeTracks track configuration.

show_plot()[source]

Display the plot.

show_config()[source]

Show the current configuration file as a string.

plot(region: str, output: str | None = 'genometracks.png', config_file: str | None = None, title: str | None = None, show: bool = True, dpi: int = 300, **kwargs: Any)[source]

Plot the final GenomeTracks plot based on the collected tracks.

Runs pyGenomeTracks with the configuration file and the given parameters, and saves the output to the given file.

Parameters:
  • region (str) – Region to plot, e.g. “chr1:1000000-2000000”.

  • output (str, default "genometracks.png") – Path to the output file.

  • config_file (str, default None) – Path to the configuration file to create. If None, a temporary file will be created in the system’s temp directory.

  • title (str, default None) – Title of the plot. If None, no title will be shown.

  • show (bool, default True) – If the function is run in a jupyter notebook, ‘show’ controls whether to show the plot at the end of the function run.

  • dpi (int, default 300) – DPI of the plot.

  • **kwargs (arguments) – Additional arguments to be passed to pyGenomeTracks, for example trackLabelFraction=0.2.

Raises:

ValueError – If the pyGenomeTracks command fails.
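
Since a malformed region string only surfaces as a failed pyGenomeTracks command, it can help to check the region before plotting. The sketch below parses a string of the form "chr1:1000000-2000000" in plain Python. The helper parse_region is hypothetical and not part of sctoolbox; pyGenomeTracks performs its own parsing and may accept further formats.

```python
import re


def parse_region(region):
    """Split 'chrom:start-end' into (chrom, start, end)."""
    match = re.fullmatch(r"([^:]+):(\d+)-(\d+)", region)
    if match is None:
        raise ValueError(f"Region must look like 'chr1:1000000-2000000', got: {region!r}")

    chrom = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start >= end:
        raise ValueError(f"Region start must be smaller than end, got: {region!r}")
    return chrom, start, end


chrom, start, end = parse_region("X:3000000-3500000")
print(chrom, end - start)  # chromosome name and region width in bp
```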


velocity

Plot velocity related figures e.g. pseudo-time heatmap.

sctoolbox.plotting.velocity.pseudotime_heatmap(adata: AnnData, genes: list[str], sortby: str | None = None, layer: str | None = None, figsize: tuple[int | float, int | float] | None = None, shrink_cbar: int | float = 0.5, title: str | None = None, save: str | None = None, **kwargs: Any) Axes[source]

Plot heatmap of genes along pseudotime sorted by ‘sortby’ column in adata.obs.

Parameters:
  • adata (sc.AnnData) – Anndata object

  • genes (list[str]) – List of genes for heatmap.

  • sortby (Optional[str], default None) – Column in adata.obs to sort by.

  • layer (Optional[str], default None) – Use different layer of anndata object.

  • figsize (Optional[Tuple[int | float, int | float]], default None) – Tuple of integers setting the heatmap figsize.

  • shrink_cbar (int | float, default 0.5) – Shrink color bar by set ratio.

  • title (Optional[str], default None) – Set title for plot.

  • save (Optional[str], default None) – Path and name of file to be saved.

  • **kwargs (Any) – Additional arguments passed to seaborn.heatmap.

Returns:

ax – Axes object containing the plot.

Return type:

matplotlib.axes.Axes


general

General plotting functions for sctoolbox, e.g. general plots for wrappers, and saving and adding titles to figures.

sctoolbox.plotting.general.clustermap_dotplot(table: DataFrame, x: str, y: str, size: str, hue: str, cluster_on: Literal['hue', 'size'] = 'hue', fillna: float | int = 0, title: str | None = None, figsize: tuple[int | float, int | float] | None = None, dend_height: float | int = 2, dend_width: float | int = 2, palette: str = 'vlag', x_rot: int = 45, show_grid: bool = False, save: str | None = None, **kwargs: Any) list[source]

Plot a heatmap with dots (instead of squares), which can contain the dimension of “size”.

Parameters:
  • table (pd.DataFrame) – Table in long-format. Has to have at least four columns as given by x, y, size and hue.

  • x (str) – Column in table to plot on the x-axis.

  • y (str) – Column in table to plot on the y-axis.

  • size (str) – Column in table to use for the size of the dots.

  • hue (str) – Column in table to use for the color of the dots.

  • cluster_on (Literal["hue", "size"], default "hue") – Decide which values to use for creating the dendrograms. Either “hue” or “size”.

  • fillna (float | int, default 0) – Replace NaN with given value.

  • title (Optional[str], default None) – Title of the dotplot.

  • figsize (Optional[Tuple[int | float, int | float]], default None) – Figure size in inches. Default is estimated from the number of rows/columns (ncols/3, nrows/3).

  • dend_height (float | int, default 2) – Height of the x-axis dendrogram in counts of row elements, e.g. 2 represents a height of 2 rows in the dotplot.

  • dend_width (float | int, default 2) – Width of the y-axis dendrogram in counts of column elements, e.g. 2 represents a width of 2 columns in the dotplot.

  • palette (str, default "vlag") – Color palette for hue colors.

  • x_rot (int, default 45) – Rotation of xticklabels in degrees.

  • show_grid (bool, default False) – Show grid behind dots in plot.

  • save (Optional[str], default None) – Save the figure to this path.

  • **kwargs (Any) – Additional arguments to pass to seaborn.scatterplot.

Returns:

List of matplotlib.axes.Axes objects containing the dotplot and the dendrogram(s).

Return type:

list

Examples

table = adata.obs.reset_index()[:10]
pl.clustermap_dotplot(
    table=table,
    x="bulk_labels",
    y="index",
    hue="n_genes",
    size="n_counts",
    palette="viridis"
)
../_images/plotting-37.png
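
The input table is in long format, with one row per combination of x and y, whereas clustering needs a matrix with one cell per combination. The sketch below shows such a long-to-wide conversion in plain Python, with missing combinations replaced by a fill value, presumably the role of the fillna parameter. The helper long_to_wide and the records are hypothetical and not taken from sctoolbox.

```python
def long_to_wide(records, x, y, value, fillna=0):
    """Pivot long-format records into a nested dict: wide[y][x] = value."""
    x_labels = sorted({record[x] for record in records})
    y_labels = sorted({record[y] for record in records})

    # Start with every cell set to the fill value, then overwrite observed cells
    wide = {row: {col: fillna for col in x_labels} for row in y_labels}
    for record in records:
        wide[record[y]][record[x]] = record[value]
    return wide


# Hypothetical long-format table: one record per (cluster, gene) pair
records = [
    {"cluster": "c1", "gene": "A", "score": 1.5},
    {"cluster": "c2", "gene": "A", "score": -0.5},
    {"cluster": "c1", "gene": "B", "score": 2.0},
]
wide = long_to_wide(records, x="cluster", y="gene", value="score")
```

Here gene "B" has no record for cluster "c2", so that cell receives the fill value 0.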
sctoolbox.plotting.general.bidirectional_barplot(df: DataFrame, title: str | None = None, colors: dict[str, str] | None = None, figsize: tuple[int | float, int | float] | None = None, save: str | None = None) Axes[source]

Plot a bidirectional barplot.

A vertical barplot where each position has one bar going left and one going right (bidirectional).

Parameters:
  • df (pd.DataFrame) –

    Dataframe with the following mandatory column names:
    • left_label

    • right_label

    • left_value

    • right_value

  • title (Optional[str], default None) – Title of the plot.

  • colors (Optional[dict[str, str]], default None) – Dictionary with label names as keys and colors as values.

  • figsize (Optional[Tuple[int | float, int | float]], default None) – Figure size.

  • save (Optional[str], default None) – If given, the figure will be saved to this path.

Returns:

Axes containing the plot.

Return type:

matplotlib.axes.Axes

Raises:

KeyError – If df does not contain the required columns.

sctoolbox.plotting.general.boxplot(dt: DataFrame, show_median: bool = True, ax: Axes | None = None, **kwargs: Any) Axes[source]

Generate one plot containing one box per column. The median value is shown.

Parameters:
  • dt (pd.DataFrame) – pandas dataframe containing numerical values in every column.

  • show_median (bool, default True) – If True, show the median value as a small box inside the boxplot.

  • ax (Optional[matplotlib.axes.Axes], default None) – Axes object to plot on. If None, a new figure is created.

  • **kwargs (Any) – Additional arguments to pass to seaborn.boxplot.

Returns:

Axes object containing a boxplot for every column.

Return type:

matplotlib.axes.Axes

Examples

import pandas as pd
dt = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
pl.boxplot(dt, show_median=True, ax=None)
../_images/plotting-39.png
sctoolbox.plotting.general.violinplot(table: DataFrame, y: str, color_by: str | None = None, hlines: float | int | list[float | int] | dict[str, float | int | list[float | int]] | None = None, colors: list[str] | None = None, ax: Axes | None = None, title: str | None = None, ylabel: bool = True, **kwargs: Any) Axes[source]

Plot a violinplot with optional horizontal lines for each violin.

Parameters:
  • table (pd.DataFrame) – Values to create the violins from.

  • y (str) – Column name of table. Values that will be shown on y-axis.

  • color_by (Optional[str], default None) – Column name of table. Used to color group violins.

  • hlines (Optional[float | int | list[float | int] | dict[str, float | int | list[float | int]]], default None) – Define horizontal lines for each violin.

  • colors (Optional[list[str]], default None) – List of colors to use for violins.

  • ax (Optional[matplotlib.axes.Axes], default None) – Axes object to draw the plot on. Otherwise use current axes.

  • title (Optional[str], default None) – Title of the plot.

  • ylabel (bool | str, default True) – Boolean if ylabel should be shown. Or str for custom ylabel.

  • **kwargs (Any) – Additional arguments to pass to seaborn.violinplot.

Returns:

Object containing the violinplot.

Return type:

matplotlib.axes.Axes

Raises:

ValueError – If y or color_by is not a column name of table, or if hlines is not a number or list of numbers when color_by is None.

Examples

import seaborn as sns
table = sns.load_dataset("titanic")
pl.violinplot(table, "age", color_by="class", hlines=None, colors=None, ax=None, title=None, ylabel=True)
../_images/plotting-41.png
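
The hlines parameter accepts a single number, a list of numbers, or a dict. The sketch below shows one way these three forms could be brought into a common shape, assuming that dict keys name the groups defined by color_by. The helper normalize_hlines is hypothetical and only illustrates the accepted input forms; it is not the logic used inside violinplot.

```python
from numbers import Number


def normalize_hlines(hlines, groups=None):
    """Return a dict mapping each group (or None) to a list of y values."""
    def as_list(value):
        return [value] if isinstance(value, Number) else list(value)

    if hlines is None:
        return {}
    if isinstance(hlines, dict):
        # Per-group lines only make sense when the violins are grouped
        if groups is None:
            raise ValueError("A dict of hlines requires groups (color_by) to be set.")
        return {group: as_list(value) for group, value in hlines.items()}

    # A number or a list of numbers applies to every violin
    targets = groups if groups is not None else [None]
    return {group: as_list(hlines) for group in targets}


classes = ["First", "Second"]
single = normalize_hlines(30)
shared = normalize_hlines([20, 40], groups=classes)
per_group = normalize_hlines({"First": 30, "Second": [20, 40]}, groups=classes)
```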
sctoolbox.plotting.general.plot_venn(groups_dict: dict[str, list[Any]], title: str | None = None, save: str | None = None, **kwargs: Any) None[source]

Plot a Venn diagram from a dictionary of 2-3 groups of lists.

Parameters:
  • groups_dict (dict[str, list[Any]]) – A dictionary where the keys are group names (strings) and the values are lists of items belonging to that group (e.g. {‘Group A’: [‘A’, ‘B’, ‘C’], …}).

  • title (Optional[str], default None) – Title of the plot.

  • save (Optional[str], default None) – Filename to save the plot to.

  • **kwargs (Any) – Additional arguments to pass to matplotlib_venn.venn2 or matplotlib_venn.venn3.

Raises:

ValueError – If number of groups in groups_dict is not 2 or 3.

Examples

venn2_example = { 'Group A': [1, 2, 3, 4],
                  'Group B': [3, 4, 5, 6]
                }
pl.plot_venn(venn2_example, "Simple Venn2 plot")
../_images/plotting-43.png
venn3_example = { 'Fruits A': ['Lemon', 'Orange', 'Blueberry', 'Grapefruit'],
                  'Fruits B': ['Pineapple', 'Mango', 'Banana', 'Papaya', 'Blueberry', 'Strawberry'],
                  'Fruits C': ['Strawberry', 'Blueberry', 'Raspberry', 'Orange', 'Mango']
                }
pl.plot_venn(venn3_example, "Simple Venn3 plot")
../_images/plotting-45.png
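
The numbers shown in a two-group Venn diagram are the sizes of three regions: items only in the first group, items only in the second group, and items shared by both. The following sketch computes these sizes with Python sets. The helper venn2_sizes is hypothetical and independent of matplotlib_venn; it is meant for checking the expected numbers.

```python
def venn2_sizes(groups_dict):
    """Return the sizes of the three regions of a two-group Venn diagram."""
    if len(groups_dict) != 2:
        raise ValueError("Exactly two groups are required.")

    (name_a, items_a), (name_b, items_b) = groups_dict.items()
    a, b = set(items_a), set(items_b)  # duplicates within a group count once
    return {
        f"only {name_a}": len(a - b),
        f"only {name_b}": len(b - a),
        "shared": len(a & b),
    }


venn2_example = {"Group A": [1, 2, 3, 4], "Group B": [3, 4, 5, 6]}
sizes = venn2_sizes(venn2_example)
print(sizes)
```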
sctoolbox.plotting.general.pairwise_scatter(table: DataFrame, columns: list[str], thresholds: dict[str, dict[Literal['min', 'max'], int | float]] | None = None, save: str | None = None, **kwargs: Any) ndarray[source]

Plot a grid of scatterplots comparing column values pairwise.

If thresholds are given, lines are drawn for each threshold and points outside of the thresholds are colored red.

Parameters:
  • table (pd.DataFrame) – Dataframe containing the data to plot.

  • columns (list[str]) – List of column names in table to plot.

  • thresholds (Optional[dict[str, dict[Literal["min", "max"], int | float]]], default None) – Dictionary containing thresholds for each column. Keys are column names and values are dictionaries with keys “min” and “max”.

  • save (Optional[str], default None) – If given, the figure will be saved to this path.

  • **kwargs (Any) – Additional arguments to pass to matplotlib.axes.Axes.scatter.

Returns:

Array of matplotlib.axes.Axes objects.

Return type:

np.ndarray

Raises:

ValueError

  1. If columns contains less than two columns.
  2. If one of the given columns is not a table column.

Examples

columns = ["percent_mito", "n_counts", "S_score"]

thresholds = {"n_counts": {"min": 2500, "max": 8000},
              "percent_mito": {"max": 0.03},
              "S_score": {"max": 0.5}}

pl.pairwise_scatter(adata.obs, columns, thresholds=thresholds)
../_images/plotting-46.png
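
The thresholds dictionary can also be evaluated without plotting, for example to count how many cells would be flagged. The sketch below marks rows that fall outside any given threshold. The helper outside_thresholds and the example cells are hypothetical, and treating the boundaries as inclusive is an assumption; the coloring logic inside pairwise_scatter may differ, e.g. by considering only the two columns of each panel.

```python
def outside_thresholds(row, thresholds):
    """Return True if any thresholded column of row is out of bounds."""
    for column, limits in thresholds.items():
        value = row[column]
        # Values exactly on a threshold count as inside (an assumption)
        if "min" in limits and value < limits["min"]:
            return True
        if "max" in limits and value > limits["max"]:
            return True
    return False


thresholds = {"n_counts": {"min": 2500, "max": 8000},
              "percent_mito": {"max": 0.03}}

# Hypothetical per-cell values
cells = [
    {"n_counts": 3000, "percent_mito": 0.01},  # within all thresholds
    {"n_counts": 1200, "percent_mito": 0.01},  # below the n_counts minimum
    {"n_counts": 5000, "percent_mito": 0.05},  # above the percent_mito maximum
]
flags = [outside_thresholds(cell, thresholds) for cell in cells]
```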