Lineage tree
This example shows how to pass lineage trees to moscot. This is particularly useful for the LineageProblem, which requires lineage information.
Check moslin [Lange et al., 2023] for examples on real-world data.
moscot accepts lineage information in one of three forms:
precomputed cost matrices,
barcode information,
or the lineage tree as a DiGraph.
In this notebook, we consider the lineage tree case.
Imports and data loading
from moscot import datasets
from moscot.problems.time import LineageProblem
Simulate data using simulate_data().
adata = datasets.simulate_data(n_distributions=3, key="day", quad_term="tree")
adata
AnnData object with n_obs × n_vars = 60 × 60
obs: 'day', 'celltype'
uns: 'trees'
We assume the trees are saved in uns as a dict, where each key is a time point (a value of the time key in obs, here 'day') and each value is a networkx DiGraph.
adata.uns["trees"]
{0: <networkx.classes.digraph.DiGraph at 0x7fb4786c9ae0>,
1: <networkx.classes.digraph.DiGraph at 0x7fb4786c9720>,
2: <networkx.classes.digraph.DiGraph at 0x7fb4786c94b0>}
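To make the expected layout concrete, here is a minimal sketch of how such a dictionary of trees could be built by hand with networkx. The node names (`root`, `a`, `cell_0`, ...) and the helpers `make_toy_tree` and `leaves` are hypothetical and only serve as illustration; this sketch also assumes that the leaves of each tree stand for the cells observed at that time point.

```python
import networkx as nx


def make_toy_tree() -> nx.DiGraph:
    """Build a small rooted tree; edges point from parent to child."""
    tree = nx.DiGraph()
    tree.add_edges_from(
        [
            ("root", "a"),       # internal node
            ("a", "cell_0"),     # leaf
            ("a", "cell_1"),     # leaf
            ("root", "cell_2"),  # leaf attached directly to the root
        ]
    )
    return tree


def leaves(tree: nx.DiGraph) -> list:
    """Return the nodes without children, i.e. the leaves."""
    return sorted(n for n in tree.nodes if tree.out_degree(n) == 0)


# One tree per time point, keyed by the time point (hypothetical values).
trees = {0: make_toy_tree(), 1: make_toy_tree()}

for day, tree in trees.items():
    # Each entry should be a proper rooted directed tree.
    print(day, nx.is_arborescence(tree), leaves(tree))
```

A dictionary like `trees` could then be stored under `adata.uns["trees"]`, mirroring the simulated data shown above.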
Leaf distance
Now, we can instantiate and prepare the LineageProblem by specifying the cost.
lp = LineageProblem(adata)
lp = lp.prepare(
    time_key="day",
    lineage_attr={"attr": "uns", "key": "trees", "cost": "leaf_distance"},
)
INFO Computing pca with `n_comps=30` for `xy` using `adata.X`
INFO Computing pca with `n_comps=30` for `xy` using `adata.X`
Internally, cost matrices have been computed from the trees using the shortest path distance between the leaves. Let us investigate the first few entries of the cost matrix computed from the first lineage tree.
lp[0, 1].x.data_src[:3, :3]
array([[0., 2., 3.],
       [2., 0., 3.],
       [3., 3., 0.]])
Similarly, we investigate parts of the cost matrix created from the second tree.
lp[0, 1].y.data_src[:3, :3]
array([[0., 2., 3.],
       [2., 0., 3.],
       [3., 3., 0.]])
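The idea of a shortest path distance between leaves can be sketched independently of moscot. The snippet below is not the library's implementation. It is a minimal illustration that assumes unweighted edges (each edge counts as one step) and ignores edge direction when measuring paths. The toy tree and the helper `leaf_distance_matrix` are hypothetical.

```python
import networkx as nx


def leaf_distance_matrix(tree: nx.DiGraph) -> tuple:
    """Return the sorted leaves and their pairwise shortest path distances."""
    # Leaves are the nodes without outgoing edges.
    leaves = sorted(n for n in tree.nodes if tree.out_degree(n) == 0)
    # Paths between leaves go up and down the tree, so drop edge direction.
    undirected = tree.to_undirected()
    lengths = dict(nx.all_pairs_shortest_path_length(undirected))
    matrix = [[float(lengths[u][v]) for v in leaves] for u in leaves]
    return leaves, matrix


# Hypothetical toy tree: two sibling leaves and one leaf attached to the root.
tree = nx.DiGraph(
    [("root", "a"), ("a", "cell_0"), ("a", "cell_1"), ("root", "cell_2")]
)

leaf_names, dist = leaf_distance_matrix(tree)
print(leaf_names)  # ['cell_0', 'cell_1', 'cell_2']
for row in dist:
    print(row)
# [0.0, 2.0, 3.0]
# [2.0, 0.0, 3.0]
# [3.0, 3.0, 0.0]
```

In this toy tree, the two siblings are two edges apart, while the leaf attached to the root is three edges away from either of them.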
Note that the gene expression term is still saved as two point clouds. The cost matrix for this term will be computed by the backend.
lp[0, 1].xy.data_src.shape, lp[0, 1].xy.data_tgt.shape
((20, 30), (20, 30))
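For intuition, the following sketch shows how a cost matrix could be derived from two point clouds of this shape. The text does not state which cost the backend uses, so the squared Euclidean distance here is an assumption, and the random arrays `src` and `tgt` as well as the helper `sq_euclidean_cost` are hypothetical stand-ins for the PCA embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for the two point clouds: 20 cells with 30 components each.
src = rng.normal(size=(20, 30))
tgt = rng.normal(size=(20, 30))


def sq_euclidean_cost(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Pairwise squared Euclidean distances between the rows of x and y."""
    x_sq = np.sum(x**2, axis=1)[:, None]  # shape (n, 1)
    y_sq = np.sum(y**2, axis=1)[None, :]  # shape (1, m)
    cost = x_sq + y_sq - 2.0 * x @ y.T    # ||x||^2 + ||y||^2 - 2 <x, y>
    # Clip tiny negative values caused by floating point round-off.
    return np.maximum(cost, 0.0)


cost = sq_euclidean_cost(src, tgt)
print(cost.shape)  # (20, 20)
```

Storing only the point clouds keeps memory usage linear in the number of cells, and the pairwise cost is formed only when it is needed.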