moscot.datasets.simulate_data

moscot.datasets.simulate_data(n_distributions=2, cells_per_distribution=20, n_genes=60, key='batch', var=1.0, obs_to_add=mappingproxy({'celltype': 3}), marginals=None, seed=0, quad_term=None, lin_cost_matrix=None, quad_cost_matrix=None, **kwargs)

Simulate data.

This function generates synthetic data, mainly to demonstrate moscot functionality.

Parameters:
  • n_distributions (int) – Number of distributions defined by key.

  • cells_per_distribution (int) – Number of cells per distribution.

  • n_genes (int) – Number of genes per simulated cell.

  • key (Literal['day', 'batch']) – Key to identify distribution allocation.

  • var (float) – Variance of one cell distribution.

  • obs_to_add (Mapping[str, Any]) – Mapping from the names of columns to add to anndata.AnnData.obs to the number of different values in each column.

  • marginals (Optional[Tuple[str, str]]) – Column names of anndata.AnnData.obs where to save the randomly generated marginals. If None, no marginals are generated.

  • seed (int) – Random seed.

  • quad_term (Optional[Literal['tree', 'barcode', 'spatial']]) – Literal indicating whether to add costs corresponding to a specific problem setting. If None, no quadratic cost element is generated.

  • lin_cost_matrix (Optional[str]) – Key where to save the linear cost matrix. It is generated according to the pairwise policy. If None, no linear cost matrix is generated.

  • quad_cost_matrix (Optional[str]) – Key where to save the quadratic cost matrices. If None, no quadratic cost matrix is generated.

  • kwargs (Any) –

Return type:

AnnData

Returns:

The simulated data as an anndata.AnnData object.
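Example:

The following is a minimal sketch of how the parameters above might be combined in a call. The argument names come from the signature; the concrete values, the column names "a" and "b", and the helper make_demo_adata are hypothetical and chosen only for illustration. The import of moscot is deferred into the helper, so the block can be loaded even where moscot is not installed.

```python
from typing import Any, Dict

# Keyword arguments named after the documented signature.
# The values themselves are illustrative assumptions, not recommended settings.
SIM_KWARGS: Dict[str, Any] = {
    "n_distributions": 3,           # three distributions, labelled via `key`
    "cells_per_distribution": 50,   # cells in each distribution
    "n_genes": 30,                  # genes per simulated cell
    "key": "day",                   # documented options: 'day' or 'batch'
    "obs_to_add": {"celltype": 4},  # extra obs column with 4 different values
    "marginals": ("a", "b"),        # obs columns for the random marginals
    "seed": 42,                     # fixed seed for reproducibility
}


def make_demo_adata(**overrides: Any):
    """Hypothetical helper: call simulate_data with SIM_KWARGS plus overrides."""
    # Imported lazily because moscot is an optional, fairly heavy dependency.
    from moscot.datasets import simulate_data

    return simulate_data(**{**SIM_KWARGS, **overrides})
```

With moscot installed, adata = make_demo_adata() should return an anndata.AnnData object, and make_demo_adata(quad_term="spatial") would additionally request a quadratic cost element. Judging from the parameter descriptions, the result is expected to hold n_distributions * cells_per_distribution cells, but that is an inference from this page rather than something it states.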