moscot.base.output.BaseDiscreteSolverOutput.sparsify
- BaseDiscreteSolverOutput.sparsify(mode, value=None, batch_size=1024, n_samples=None, seed=None, max_k=None)
Sparsify the transport_matrix.

This function sets entries of the transport matrix to \(0\) according to mode and returns a MatrixSolverOutput with the sparsified transport matrix stored as a csr_matrix. The transport matrix is materialized in row blocks of batch_size rows, so the dense part held in memory at any time is limited to batch_size rows.

Warning

This function exists only for interfacing with software that has to instantiate the transport matrix; moscot never uses the sparsified transport matrix.

- Parameters:
mode (Literal['threshold', 'percentile', 'min_row', 'mass']) – How to determine the entries that are set to \(0\). Valid options are:
  'threshold' - value is the threshold below which entries are set to \(0\).
  'percentile' - value is the percentile in \([0, 100]\) of the transport_matrix below which entries are set to \(0\).
  'min_row' - value is not used; the threshold is chosen such that each row has at least 1 non-zero entry.
  'mass' - per row, keep the largest entries capturing a fraction value of the row's mass (at most max_k entries per row); value must be in \((0, 1]\).
value (Optional[float]) – Value to use for sparsification. Its meaning depends on mode (see above).
batch_size (int) – How many rows to materialize at a time when sparsifying the transport_matrix.
n_samples (Optional[int]) – If mode = 'percentile', the number of samples from which the percentile is estimated stochastically. Note this means that a matrix of shape [n_samples, min(transport_matrix.shape)] has to be instantiated. If None, n_samples is set to batch_size.
seed (Optional[int]) – Random seed needed for sampling if mode = 'percentile'.
max_k (Optional[int]) – Maximum number of entries to keep per row. Only valid when mode = 'mass'.
- Return type:
MatrixSolverOutput
- Returns:
Output with sparsified transport matrix.
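
To make the modes concrete, the following is a minimal sketch of batched sparsification on a plain NumPy array. It is not moscot's implementation: the helper names sparsify_dense and _keep_mass, the example matrix T, and the exact tie-breaking and cut-off rule used for 'mass' (keep the fewest largest entries whose sum reaches the requested fraction) are assumptions made for illustration. The 'percentile' mode is left out because its sampling scheme is not specified here.

```python
import numpy as np
from scipy import sparse


def _keep_mass(block, frac, max_k):
    """Per row, keep the largest entries until `frac` of the row mass is reached."""
    order = np.argsort(-block, axis=1)                  # column indices, largest first
    sorted_vals = np.take_along_axis(block, order, axis=1)
    mass_before = np.cumsum(sorted_vals, axis=1) - sorted_vals
    target = frac * block.sum(axis=1, keepdims=True)
    keep_sorted = mass_before < target                  # still below the target mass
    if max_k is not None:
        keep_sorted[:, max_k:] = False                  # cap the entries kept per row
    keep = np.zeros_like(keep_sorted)
    np.put_along_axis(keep, order, keep_sorted, axis=1)  # back to original column order
    return np.where(keep, block, 0.0)


def sparsify_dense(tmat, mode, value=None, batch_size=1024, max_k=None):
    """Hypothetical stand-in for the documented behavior on a dense array."""
    tmat = np.asarray(tmat, dtype=float)
    n_rows = tmat.shape[0]
    starts = range(0, n_rows, batch_size)

    thr = None
    if mode == "threshold":
        thr = value
    elif mode == "min_row":
        # The smallest row maximum is the largest threshold that still
        # leaves at least one non-zero entry in every row.
        thr = min(tmat[s : s + batch_size].max(axis=1).min() for s in starts)
    elif mode == "mass":
        if not 0.0 < value <= 1.0:
            raise ValueError("value must be in (0, 1] for mode='mass'")
    else:
        raise ValueError(f"mode {mode!r} is not covered by this sketch")

    blocks = []
    for s in starts:
        block = tmat[s : s + batch_size].copy()         # one dense row block at a time
        if mode == "mass":
            block = _keep_mass(block, value, max_k)
        else:
            block[block < thr] = 0.0
        blocks.append(sparse.csr_matrix(block))         # stores only the non-zeros
    return sparse.vstack(blocks, format="csr")


# Small made-up transport matrix (each row sums to 1).
T = np.array(
    [
        [0.50, 0.30, 0.20, 0.00],
        [0.05, 0.05, 0.10, 0.80],
        [0.25, 0.25, 0.40, 0.10],
    ]
)
by_threshold = sparsify_dense(T, "threshold", value=0.15, batch_size=2)
by_min_row = sparsify_dense(T, "min_row", batch_size=2)
by_mass = sparsify_dense(T, "mass", value=0.7, batch_size=2, max_k=2)
```

On a solved moscot output, the corresponding call would take the form output.sparsify(mode='mass', value=0.9, max_k=50), where output and the argument values are hypothetical.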