moscot.problems.generic.GWProblem.compute_feature_correlation

GWProblem.compute_feature_correlation(obs_key, corr_method='pearson', significance_method='fisher', annotation=None, layer=None, features=None, confidence_level=0.95, n_perms=1000, seed=None, **kwargs)

Compute correlation of push-forward or pull-back distribution with features.

Correlates a feature, e.g., the counts of a gene, with the probabilities of cells being mapped to a set of cells, as given by a push-forward or pull-back distribution.
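As a rough illustration of what is being correlated, the sketch below computes a Pearson correlation between per-cell feature values and a per-cell probability vector. It uses plain Python lists rather than AnnData, and the names `pearson`, `push_forward`, and `expression` as well as the toy values are hypothetical; it is not the library's implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes neither is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: 5 cells, one probability per cell (as stored under obs_key)
push_forward = [0.05, 0.10, 0.15, 0.30, 0.40]

# Hypothetical expression values of two genes in the same 5 cells
expression = {
    "gene_a": [1.0, 2.0, 3.0, 6.0, 8.0],  # rises with the probabilities
    "gene_b": [5.0, 4.0, 4.0, 1.0, 0.0],  # falls as the probabilities rise
}

# One correlation per feature
corr = {gene: pearson(values, push_forward) for gene, values in expression.items()}
```

Features whose values rise with the mapped probability get a positive correlation, and features whose values fall get a negative one.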

Parameters:

obs_key (str) – Key in obs containing the push-forward or pull-back distribution.
corr_method (Literal['pearson', 'spearman']) – Which type of correlation to compute, either 'pearson' or 'spearman'.
significance_method (Literal['fisher', 'perm_test']) –
Mode to use when calculating p-values and confidence intervals. Valid options are:
- 'fisher' - Fisher transformation [Fisher, 1921].
- 'perm_test' - permutation test.
annotation (Optional[dict[str, Iterable[str]]]) –
How to subset the data when computing the correlation:
- None - do not subset the data.
- str - key in obs where categorical data is stored.
- dict - a dictionary with one key corresponding to a categorical column in obs and values to a subset of categories.
layer (Optional[str]) – Key in layers from which to get the expression. If None, use X.
features (Union[list[str], Literal['human', 'mouse', 'drosophila'], None]) –
Features in AnnData to correlate with obs['{obs_key}']:
- None - all features from var will be taken into account.
- list - subset of var_names or obs_names.
- 'human', 'mouse', or 'drosophila' - the features are subsetted to the transcription factors from transcription_factors().
confidence_level (float) – Confidence level for the confidence interval calculation. Must be in interval \([0, 1]\).
n_perms (int) – Number of permutations to use when significance_method = 'perm_test'.
seed (Optional[int]) – Random seed when significance_method = 'perm_test'.
kwargs (Any) – Keyword arguments for parallelization, e.g., n_jobs.
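The two significance_method options listed above can be sketched in plain Python to show how confidence_level, n_perms, and seed might enter the calculation. This is a textbook version under stated assumptions, not the library's code: the helper names are hypothetical, the Fisher variant assumes |r| < 1 and more than three cells, and the "+1" correction in the permutation p-value is one common convention chosen here for illustration.

```python
import math
import random
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes neither is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def fisher_pval_ci(r, n, confidence_level=0.95):
    """Two-sided p-value and confidence interval via the Fisher z-transformation."""
    z = math.atanh(r)              # Fisher transformation; requires |r| < 1
    se = 1.0 / math.sqrt(n - 3)    # approximate standard error of z; requires n > 3
    norm = NormalDist()
    pval = 2.0 * (1.0 - norm.cdf(abs(z) / se))
    z_crit = norm.inv_cdf(0.5 + confidence_level / 2.0)
    # Transform the interval bounds back to the correlation scale
    return pval, math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


def perm_test_pval(x, y, n_perms=1000, seed=None):
    """Two-sided permutation p-value: shuffle y and compare |r| with the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perms):
        rng.shuffle(y_perm)
        if abs(pearson(x, y_perm)) >= observed:
            hits += 1
    return (hits + 1) / (n_perms + 1)  # add-one correction (assumed convention)


# Hypothetical toy data: 10 cells with a strong, slightly noisy linear relationship
x = [float(i) for i in range(10)]
y = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9, 12.2, 13.8, 16.1, 17.9]

r = pearson(x, y)
pval_fisher, ci_low, ci_high = fisher_pval_ci(r, len(x), confidence_level=0.95)
pval_perm = perm_test_pval(x, y, n_perms=200, seed=0)
```

The Fisher variant is cheap because it relies on a normal approximation, whereas the permutation variant costs n_perms extra correlation computations per feature, which is presumably why parallelization arguments such as n_jobs are accepted.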

Return type:

DataFrame

Returns:

DataFrame of shape (n_features, 5) with one row per feature and the following columns:

- 'corr' - correlation between the count data and the push-forward or pull-back distribution.
- 'pval' - p-values of the two-sided test.
- 'qval' - p-values corrected with the Benjamini-Hochberg method at the \(0.05\) level.
- 'ci_low' - lower bound of the confidence_level correlation confidence interval.
- 'ci_high' - upper bound of the confidence_level correlation confidence interval.
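To make the relationship between the 'pval' and 'qval' columns concrete, here is a minimal sketch of the standard Benjamini-Hochberg step-up adjustment in plain Python. The function name `benjamini_hochberg` and the example p-values are hypothetical, and the library may compute the correction differently, for instance through a statistics package.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the q-values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


# Hypothetical p-values for four features
pvals = [0.01, 0.04, 0.03, 0.005]
qvals = benjamini_hochberg(pvals)

# Features that would be called significant at the 0.05 level
significant = [q <= 0.05 for q in qvals]
```

Each q-value is at least as large as its p-value, so filtering on 'qval' is the more conservative choice when many features are tested at once.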