Usage Principles¶
Import Quanp as:
import quanp as qp
Workflow¶
The typical workflow consists of subsequent calls of data analysis tools in qp.tl, e.g.:
qp.tl.umap(adata, **tool_params) # embed a neighborhood graph of the data using UMAP
where adata is an AnnData
object.
Each of these calls adds annotation to an expression matrix X,
which stores n_obs observations (subjects) of n_vars variables (features).
For each tool, there typically is an associated plotting function in qp.pl:
qp.pl.umap(adata, **plotting_params)
If you pass show=False, a Axes
instance is returned
and you have all of matplotlib’s detailed configuration possibilities.
To facilitate writing memory-efficient pipelines, by default,
Quanp tools operate inplace on adata and return None –
this also allows to easily transition to out-of-memory pipelines.
If you want to return a copy of the AnnData
object
and leave the passed adata unchanged, pass copy=True or inplace=False.
AnnData¶
Quanp is based on anndata
, which provides the AnnData
class.
At the most basic level, an AnnData
object adata stores
a data matrix adata.X, annotation of observations
adata.obs and variables adata.var as pd.DataFrame and unstructured
annotation adata.uns as dict. Names of observations and
variables can be accessed via adata.obs_names and adata.var_names,
respectively. AnnData
objects can be sliced like
dataframes, for example, adata_subset = adata[:, list_of_feature_names].
For more, see this blog post.
To read a data file to an AnnData
object, call:
adata = qp.read(filename)
to initialize an AnnData
object. Possibly add further annotation using, e.g., pd.read_csv:
import pandas as pd
anno = pd.read_csv(filename_sample_annotation)
adata.obs['subject_groups'] = anno['subject_groups'] # categorical annotation of type pandas.Categorical
adata.obs['time'] = anno['time'] # numerical annotation of type float
# alternatively, you could also set the whole dataframe
# adata.obs = anno
To write, use:
adata.write(filename)
adata.write_csvs(filename)
adata.write_loom(filename)