Cell transitions

In this example, we show how to use moscot.plotting.cell_transition().

Imports and data loading

import warnings

warnings.simplefilter(action="ignore", category=FutureWarning)

import moscot.plotting as mtp
from moscot import datasets
from moscot.problems.time import TemporalProblem

Load the hspc() dataset.

adata = datasets.hspc()
adata
AnnData object with n_obs × n_vars = 4000 × 2000
    obs: 'day', 'donor', 'cell_type', 'technology', 'n_genes'
    var: 'n_cells', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'cell_type_colors', 'hvg', 'neighbors', 'neighbors_atac', 'pca', 'umap'
    obsm: 'X_lsi', 'X_pca', 'X_umap_ATAC', 'X_umap_GEX', 'peaks_tfidf'
    varm: 'PCs'
    obsp: 'connectivities', 'distances', 'neighbors_atac_connectivities', 'neighbors_atac_distances'

Prepare and solve the problem

First, we need to prepare and solve the problem. Here, we set the threshold parameter to a relatively high value to speed up convergence at the cost of a lower-quality solution.

tp = TemporalProblem(adata).prepare(time_key="day").solve(epsilon=1e-2, threshold=1e-2)
INFO     Computing pca with `n_comps=30` for `xy` using `adata.X`                                                  
INFO     Computing pca with `n_comps=30` for `xy` using `adata.X`                                                  
INFO     Computing pca with `n_comps=30` for `xy` using `adata.X`                                                  
INFO     Solving `3` problems                                                                                      
INFO     Solving problem BirthDeathProblem[stage='prepared', shape=(766, 1235)].                                   
INFO     Solving problem BirthDeathProblem[stage='prepared', shape=(1235, 1201)].                                  
INFO     Solving problem BirthDeathProblem[stage='prepared', shape=(1201, 798)].                                   
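To make the speed/quality trade-off concrete, here is a minimal NumPy sketch of Sinkhorn iterations with a stopping tolerance. It is not moscot's solver: the function `sinkhorn`, the toy point clouds, and the reading of `threshold` as a tolerance on the marginal error are assumptions made for illustration only.

```python
import numpy as np


def sinkhorn(cost, a, b, epsilon=0.05, threshold=1e-3, max_iter=10_000):
    """Illustrative Sinkhorn loop; stops once the row-marginal error is below `threshold`."""
    K = np.exp(-cost / epsilon)  # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for n_iter in range(1, max_iter + 1):
        u = a / (K @ v)
        v = b / (K.T @ u)
        # Column marginals are exact after the v update, so we monitor the rows.
        err = np.abs(u * (K @ v) - a).sum()
        if err < threshold:
            break
    coupling = u[:, None] * K * v[None, :]
    return coupling, n_iter, err


# Toy data (assumption): two small 2D point clouds with uniform marginals.
rng = np.random.default_rng(0)
x = rng.normal(size=(30, 2))
y = rng.normal(size=(40, 2)) + 0.5
cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
cost /= cost.max()
a = np.full(30, 1 / 30)
b = np.full(40, 1 / 40)

_, iters_loose, err_loose = sinkhorn(cost, a, b, threshold=1e-2)
_, iters_tight, err_tight = sinkhorn(cost, a, b, threshold=1e-8)
print(f"loose: {iters_loose} iterations, tight: {iters_tight} iterations")
```

Both runs follow the same sequence of iterates, so the looser tolerance can never need more iterations than the tighter one; it simply stops earlier, with marginals that are matched less precisely.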

As with all plotting functionality in moscot, we first call the cell_transition() method of the problem class, which stores the results of the computation in the AnnData instance. Suppose we want to plot the cell transitions between time points 4 and 7, with the rows and columns of the transition matrix representing cell types. In general, we can aggregate by any categorical column in obs via the source_groups and target_groups parameters, respectively. Since we are interested in descendants rather than ancestors, we set forward=True.

tp.cell_transition(
    source=4,
    target=7,
    source_groups="cell_type",
    target_groups="cell_type",
    forward=True,
    key_added="cell_transition",
)
                BP          EryP      HSC     MasP      MkP      MoP          NeuP
BP    2.308763e-01  5.445565e-13  0.145888  0.285380  0.000265  0.194625  1.429656e-01
EryP  1.241760e-12  7.082211e-01  0.000672  0.032356  0.213781  0.044970  3.887449e-13
HSC   1.301134e-02  1.154022e-02  0.563381  0.103650  0.046381  0.077788  1.842484e-01
MasP  7.406692e-12  1.420519e-02  0.000352  0.970003  0.007889  0.007319  2.307788e-04
MkP   3.108959e-09  8.558109e-02  0.049507  0.185418  0.558227  0.067061  5.420606e-02
MoP   1.132997e-12  7.344286e-02  0.000295  0.002366  0.220322  0.703554  2.052096e-05
NeuP  7.761030e-12  5.491939e-04  0.008162  0.109118  0.003314  0.044908  8.339480e-01

The function saves the transition matrix to uns['moscot_results']['cell_transition']['{key_added}'].
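As a rough picture of what aggregating by group means, the sketch below sums the mass of a toy coupling over source and target labels and then normalizes. It is a simplified illustration, not moscot's implementation: the helper `aggregate_transitions`, the toy coupling, and the labels are all hypothetical. The normalization follows the outputs on this page, where rows sum to one for forward=True and columns sum to one for forward=False.

```python
import numpy as np
import pandas as pd


def aggregate_transitions(coupling, source_labels, target_labels, forward=True):
    """Hypothetical helper: sum coupling mass per (source group, target group), then normalize."""
    src = pd.get_dummies(pd.Series(source_labels)).astype(float)  # cells x source groups
    tgt = pd.get_dummies(pd.Series(target_labels)).astype(float)  # cells x target groups
    mass = src.to_numpy().T @ coupling @ tgt.to_numpy()
    mass = pd.DataFrame(mass, index=src.columns, columns=tgt.columns)
    if forward:
        # Descendants: each source group distributes its mass over target groups.
        return mass.div(mass.sum(axis=1), axis=0)
    # Ancestors: each target group collects its mass from source groups.
    return mass.div(mass.sum(axis=0), axis=1)


# Toy coupling between 4 source cells and 3 target cells (made-up values).
coupling = np.array(
    [
        [0.10, 0.05, 0.10],
        [0.05, 0.15, 0.05],
        [0.00, 0.10, 0.15],
        [0.05, 0.05, 0.15],
    ]
)
source_labels = ["HSC", "HSC", "MkP", "MkP"]
target_labels = ["HSC", "MkP", "MkP"]

fwd = aggregate_transitions(coupling, source_labels, target_labels, forward=True)
bwd = aggregate_transitions(coupling, source_labels, target_labels, forward=False)
print(fwd)
print(bwd)
```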

It is also possible to aggregate by individual cells as source_groups, target_groups, or both. Here, we want the source groups to be the cells, so we pass aggregation_mode="cell" and source_groups=None. The rows of the resulting matrix are cell barcodes and the columns are cell types. Each row gives the likelihood of a single cell being mapped to each cell type.

tp.cell_transition(
    source=4,
    target=7,
    source_groups=None,
    target_groups="cell_type",
    forward=True,
    aggregation_mode="cell",
    key_added="cell_to_cell_type",
)
                       HSC           MkP          NeuP           MoP          EryP          MasP            BP
6646faeca349  3.070703e-06  2.249557e-08  9.986461e-01  1.350755e-03  2.314684e-24  5.767661e-08  7.740833e-28
83a315f89bdc  9.537811e-01  6.737983e-03  2.188746e-04  3.619146e-02  1.187883e-06  3.069365e-03  1.441773e-12
9d9116b806de  9.045426e-01  7.665823e-06  4.860442e-02  4.674972e-02  1.335694e-12  4.633884e-11  9.549022e-05
6c83c175ff32  9.703491e-01  1.099086e-02  1.764623e-02  1.013697e-03  3.738226e-10  1.186408e-07  7.496628e-18
18f1b5233585  1.409097e-15  7.704851e-09  7.291150e-15  6.623527e-17  2.047579e-13  1.000000e+00  2.138181e-19
...                    ...           ...           ...           ...           ...           ...           ...
f41351f7e128  9.752746e-01  4.017630e-06  1.139117e-03  3.388065e-05  2.378500e-13  2.354840e-02  4.914123e-09
1d9a3b8809f6  9.187290e-01  8.105103e-02  1.781465e-05  2.018890e-04  1.850676e-09  1.533686e-07  9.590943e-08
321744b37b5c  0.000000e+00  5.783020e-28  3.684824e-30  3.220448e-33  2.553858e-35  1.000000e+00  0.000000e+00
f51100f61640  1.851037e-10  7.439794e-11  1.000000e+00  2.983009e-08  7.320152e-29  2.716353e-11  1.970946e-29
666f69f1d594  2.016246e-27  0.000000e+00  5.225268e-14  1.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00

1201 rows × 7 columns
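One possible way to use such a per-cell matrix is to read off the most likely target cell type of each cell. The sketch below rebuilds three rows of the output above by hand as a pandas data frame (so that it runs without moscot) and takes the row-wise maximum; the variable names are ours.

```python
import pandas as pd

columns = ["HSC", "MkP", "NeuP", "MoP", "EryP", "MasP", "BP"]
# Three rows copied from the output above; the remaining cells are omitted.
rows = {
    "6646faeca349": [3.070703e-06, 2.249557e-08, 9.986461e-01, 1.350755e-03,
                     2.314684e-24, 5.767661e-08, 7.740833e-28],
    "83a315f89bdc": [9.537811e-01, 6.737983e-03, 2.188746e-04, 3.619146e-02,
                     1.187883e-06, 3.069365e-03, 1.441773e-12],
    "18f1b5233585": [1.409097e-15, 7.704851e-09, 7.291150e-15, 6.623527e-17,
                     2.047579e-13, 1.000000e+00, 2.138181e-19],
}
per_cell = pd.DataFrame.from_dict(rows, orient="index", columns=columns)

# Each row is a distribution over target cell types, so it sums to (about) one.
row_sums = per_cell.sum(axis=1)

# Most likely descendant cell type per source cell, with its probability.
most_likely = per_cell.idxmax(axis=1)
top_probability = per_cell.max(axis=1)
print(pd.DataFrame({"most_likely": most_likely, "probability": top_probability}))
```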

Analogously, we can compute, for each cell, the likelihood of originating from each cell type by setting source_groups="cell_type", target_groups=None and forward=False:

tp.cell_transition(
    source=4,
    target=7,
    source_groups="cell_type",
    target_groups=None,
    forward=False,
    aggregation_mode="cell",
    key_added="cell_type_to_cell",
)
45a010de2960 d3d5bb4181d5 d3e8a50c8a46 3f9844774222 66737a166e82 2e5e82629058 531c5a0af7e5 07b921a9a00e fd4ebc436466 285a79857fa4 ... 038518e366a9 809ba84a9cbb 4264c55de7d0 a7097007fbea 1381ec6cb466 946cf349f84e da6baa1b0624 ba7d40e15f3d 69451694ec4c 193992d571a5
HSC 4.234188e-01 9.816411e-01 9.294160e-01 7.374105e-07 1.234406e-05 3.075591e-01 9.750719e-01 8.333546e-01 9.972167e-01 9.756793e-01 ... 4.107326e-08 4.822306e-03 7.439283e-01 9.231182e-02 5.305157e-10 1.012360e-02 9.981840e-01 9.930704e-01 7.514880e-01 6.609700e-03
MkP 1.413742e-01 4.882960e-04 6.864639e-02 1.442815e-08 2.537763e-04 4.506757e-01 2.492809e-02 6.875102e-04 2.445679e-05 2.026044e-02 ... 6.263088e-03 5.919814e-01 1.979516e-03 2.405828e-01 2.269994e-12 3.061594e-01 2.819046e-08 1.077111e-05 1.663418e-01 4.232730e-04
NeuP 1.365205e-03 1.787006e-02 1.937666e-03 1.038806e-03 9.174507e-09 1.114286e-20 8.517445e-12 1.599320e-01 2.758771e-03 4.060269e-03 ... 4.917375e-18 2.704230e-19 2.535262e-01 6.560069e-09 8.323468e-12 6.825680e-01 3.477891e-04 3.121623e-03 1.425974e-03 5.531022e-02
MoP 3.587727e-09 4.745976e-14 2.025467e-24 9.989604e-01 7.184339e-22 5.502434e-03 2.096409e-21 6.546266e-18 7.270264e-19 7.968970e-18 ... 1.650684e-08 1.491232e-01 5.659405e-04 2.042051e-02 1.541398e-26 4.286293e-17 4.230260e-05 3.797218e-03 2.642924e-11 4.646852e-09
EryP 4.341889e-03 3.048463e-23 6.108970e-29 9.814683e-24 1.083907e-08 2.362628e-01 1.049770e-16 6.731280e-22 2.272892e-21 2.010373e-17 ... 9.937353e-01 2.540731e-01 2.150421e-25 6.283615e-01 4.660298e-16 3.321713e-25 1.831522e-19 1.722867e-14 7.132902e-02 2.050497e-06
MasP 4.293696e-01 1.579725e-13 5.585922e-20 4.697034e-22 9.997339e-01 8.649193e-13 6.619282e-11 3.422012e-13 4.340160e-13 2.861642e-13 ... 1.650411e-06 9.839462e-16 9.062148e-20 1.832335e-02 1.000000e+00 4.517032e-15 1.095685e-18 3.996125e-12 9.414924e-03 9.376547e-01
BP 1.303658e-04 5.818190e-07 2.784787e-10 6.756987e-22 8.968549e-09 2.484709e-19 1.886490e-09 6.025940e-03 8.418083e-08 1.326892e-09 ... 4.198042e-19 3.062798e-22 2.883456e-10 5.044563e-27 1.941353e-14 1.149068e-03 1.426023e-03 2.893428e-11 3.031842e-07 1.445938e-10

7 rows × 798 columns
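In this backward matrix the cells are the columns, so it is each column that forms a distribution over source cell types. The sketch below checks this on the first two columns of the output above, rebuilt by hand as a pandas data frame; it is only an illustration and the variable names are ours.

```python
import pandas as pd

cell_types = ["HSC", "MkP", "NeuP", "MoP", "EryP", "MasP", "BP"]
# First two columns copied from the output above; the other cells are omitted.
backward = pd.DataFrame(
    {
        "45a010de2960": [4.234188e-01, 1.413742e-01, 1.365205e-03, 3.587727e-09,
                         4.341889e-03, 4.293696e-01, 1.303658e-04],
        "d3d5bb4181d5": [9.816411e-01, 4.882960e-04, 1.787006e-02, 4.745976e-14,
                         3.048463e-23, 1.579725e-13, 5.818190e-07],
    },
    index=cell_types,
)

# Columns, not rows, sum to (about) one when forward=False.
column_sums = backward.sum(axis=0)

# Most likely ancestor cell type per target cell.
likely_origin = backward.idxmax(axis=0)
print(likely_origin)
```

Note that the first cell is nearly tied between MasP and HSC, so the maximum alone can hide ambiguity that the full column shows.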

Plot cell transitions

The transition matrix is a data frame containing all the information we need. We now want to visualize the result with moscot.plotting.cell_transition(). To do so, we can pass either the AnnData instance or the problem instance, together with the key of the specific transition matrix we want to plot. Here, key="cell_transition" refers to the first cell transition we computed: cell types between time points 4 and 7. Depending on the size of the transition matrix, we might want to adapt the dpi and fontsize parameters. If we don't want to plot the values of the transition matrix, e.g., because it is very large, we can set annotation=False.

mtp.cell_transition(tp, dpi=100, fontsize=10, key="cell_transition")

By default, the result of the cell_transition() method of a problem instance is saved in adata.uns['moscot_results']['cell_transition']['cell_transition'], and this entry is overwritten every time the method is called. To prevent this, we can specify the key_added parameter, which we do to store the results of the following use case.
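The overwrite behaviour can be pictured with a plain dictionary standing in for adata.uns. This is only a sketch of the storage layout described above: `store_transition` is a hypothetical helper, not a moscot function, and the matrices are made-up values.

```python
import pandas as pd

# Plain dict standing in for `adata.uns` (assumption: no AnnData or moscot involved).
uns = {}


def store_transition(uns, matrix, key_added="cell_transition"):
    """Hypothetical helper mimicking uns['moscot_results']['cell_transition'][key_added]."""
    uns.setdefault("moscot_results", {}).setdefault("cell_transition", {})[key_added] = matrix


first = pd.DataFrame([[0.7, 0.3]], index=["HSC"], columns=["HSC", "MkP"])
second = pd.DataFrame([[0.4, 0.6]], index=["HSC"], columns=["HSC", "MkP"])

store_transition(uns, first)   # stored under the default key
store_transition(uns, second)  # same key again: `first` is replaced
store_transition(uns, first, key_added="subset_cell_transition")  # separate entry

stored = uns["moscot_results"]["cell_transition"]
print(sorted(stored))
```

Using a distinct key per use case keeps earlier results available for plotting later.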

We can also visualize transitions for only a subset of the categories of an obs column by passing a dictionary for source_groups or target_groups. Passing a dictionary also allows us to specify the order of the source_groups and target_groups, respectively.

new_key = "subset_cell_transition"
tp.cell_transition(
    source=4,
    target=7,
    source_groups={"cell_type": ["HSC", "MasP", "MkP", "NeuP"]},
    target_groups={"cell_type": ["MasP", "MkP", "BP"]},
    forward=True,
    key_added=new_key,
)
mtp.cell_transition(tp, dpi=100, fontsize=10, key=new_key)