Releases: scverse/scirpy

v0.10.0

15 Nov 19:51
76233c5

Additions

This release adds a new feature to query reference databases (#298) comprising

  • an extension of pp.ir_dist to compute distances to a reference dataset,
  • tl.ir_query, to match immune receptors to a reference database based on the distances computed with ir_dist,
  • tl.ir_query_annotate and tl.ir_query_annotate_df to annotate cells based on the result of tl.ir_query, and
  • datasets.vdjdb, which conveniently downloads and processes the latest version of VDJdb.
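
As a rough, library-independent sketch of the idea behind this feature: each query sequence is compared with every reference sequence, and reference entries within a distance cutoff count as matches whose annotations can be transferred to the query. The function names, the toy reference table, and the identity metric below are assumptions for illustration only; this is not scirpy's implementation or API.

```python
from typing import Callable, Dict, List


def identity_distance(a: str, b: str) -> float:
    """0 for identical sequences, infinity otherwise."""
    return 0.0 if a == b else float("inf")


def query_reference(
    query: List[str],
    reference: Dict[str, str],
    distance: Callable[[str, str], float] = identity_distance,
    cutoff: float = 0,
) -> Dict[str, List[str]]:
    """Map each query sequence to the labels of all reference sequences
    whose distance to it is at most `cutoff`."""
    matches = {}
    for seq in query:
        labels = {
            label
            for ref_seq, label in reference.items()
            if distance(seq, ref_seq) <= cutoff
        }
        matches[seq] = sorted(labels)
    return matches


# Hypothetical reference entries (made-up sequences and labels, not from VDJdb).
reference = {
    "CASSLGQAYEQYF": "epitope_A",
    "CASSPDRGNTEAFF": "epitope_B",
}
query = ["CASSLGQAYEQYF", "CASSXXXXYEQYF"]

annotations = query_reference(query, reference)
```

Any other metric can be passed as `distance` together with a matching `cutoff`; queries without a match simply receive an empty list.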

Fixes

  • Bump minimal dependencies for networkx and tqdm (#300)
  • Fix issue with repertoire_overlap (Fix #302 via #305)
  • Fix issue with define_clonotype_clusters (Fix #303 via #305)
  • Suppress FutureWarnings from pandas in tutorials (#307)

Internal changes

  • Update sphinx to >= 4.1 (#306)
  • Update black version
  • Update the internal folder structure: tl, pp etc. are now real packages instead of aliases

v0.9.1

24 Sep 06:57

Fixes

  • Scirpy can now import additional columns from Cellranger 6 (#279 by @naity)
  • Fix minor issue with include_fields in AirrCell (#297)

Documentation

  • Fix broken link in README (#296)
  • Add developer documentation (#294)

v0.9.0

07 Sep 13:23
6f187bb

Additions

  • Add the new "clonotype modularity" tool, which ranks clonotypes by how strongly connected their gene expression neighborhood graph is (#282).

The below example shows three clonotypes (164, 1363, 942), two of which consist of cells that are transcriptionally related.

[Figures: example clonotypes; clonotype modularity vs. FDR]

Deprecations

  • tl.clonotype_imbalance is now deprecated in favor of the new clonotype modularity tool.

Fixes

  • Fix calling locus from gene name in some cases (#288)
  • Compatibility with networkx>=2.6 (#292)

Minor updates

  • Fix some links in README (#284)
  • Fix old instances of clonotype in docs (should be clone_id) (#287)

v0.8.0

22 Jul 07:17
fb21600

Additions

Fixes

  • Handle input data with "productive" chains which don't have a junction_aa sequence annotated (#281)
  • Fix issue with serialized "extra chains" not being imported correctly (#283 by @zktuong)

Minor changes

  • The CI can now build documentation from pull requests from forks. PR docs are no longer deployed to github-pages, but can be downloaded as an artifact from the CI run.

v0.7.1

02 Jul 07:49
6ccaa66

Fixes

  • Ensure compatibility with the latest version of dandelion (e78701c)
  • Add links to older versions of documentation (#275)
  • Fix an issue where clonotype analysis couldn't be continued after saving and reloading an h5ad object (#274)
  • Allow "None" values to be present as cell-level attributes during merge_airr_chains (#273)

Minor changes

  • Require anndata >= 0.7.6 in conda tests (#266)

v0.7.0

28 Apr 12:54
bda376e

This update features

  • a change of Scirpy's data structure to improve interoperability with the AIRR standard, and
  • a complete rewrite of the clonotype definition module for improved performance.

This required several backwards-incompatible changes. Please read the release notes below and the updated tutorials.

Backwards-incompatible changes

Improve Interoperability by fully supporting the AIRR standard (#241)

Scirpy stores receptor information in adata.obs. In this release, we updated the column names to match the AIRR Rearrangement standard. Our data model is now much more flexible and allows importing arbitrary immune-receptor (IR)-chain related information. Use scirpy.io.upgrade_schema() to update existing AnnData objects to the latest format.

Closed issues #240, #253, #258, #255, #242, #215.

This update includes the following changes:

  • IrCell is now replaced by AirrCell which has additional functionality
  • IrChain has been removed. Use a plain dictionary instead.
  • CDR3 information is now read from the junction and junction_aa columns instead of cdr3_nt and cdr3, respectively.
  • Clonotype assignments are now per default stored in the clone_id column.
  • expr and expr_raw are now duplicate_count and consensus_count.
  • {v,d,j,c}_gene is now {v,d,j,c}_call.
  • There's now an extra_chains column containing all IR-chains that don't fit into our receptor model. These chains are not used by scirpy, but can be re-exported to different formats.
  • merge_with_ir is now split up into merge_with_ir (to merge IR data with transcriptomics data) and merge_airr_chains (to merge several adatas with IR information, e.g. BCR and TCR data).
  • Tutorial and documentation updates, to reflect these changes
  • Sequences are not converted to upper case on import. Scirpy tools that consume the sequences convert them to upper case on-the-fly.
  • {to,from}_ir_objs has been renamed to {to,from}_airr_cells.
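
The column renames listed above can be summarized as a suffix mapping. The sketch below is only an illustration of that mapping in plain Python; the helper name `rename_column` and the example column names (such as `IR_VJ_1_cdr3`) are assumptions, and existing objects should be migrated with scirpy.io.upgrade_schema() rather than by hand.

```python
# Suffix renames described in the notes above (old suffix -> new suffix).
SUFFIX_RENAMES = {
    "cdr3_nt": "junction",
    "cdr3": "junction_aa",
    "expr_raw": "consensus_count",
    "expr": "duplicate_count",
    "v_gene": "v_call",
    "d_gene": "d_call",
    "j_gene": "j_call",
    "c_gene": "c_call",
}


def rename_column(name: str) -> str:
    """Return the new-style name for an old-style column name.

    Longer suffixes are tried first so that `cdr3_nt` is not mistaken
    for `cdr3`, nor `expr_raw` for `expr`. Unknown columns are returned
    unchanged.
    """
    for old in sorted(SUFFIX_RENAMES, key=len, reverse=True):
        if name.endswith("_" + old):
            return name[: -len(old)] + SUFFIX_RENAMES[old]
    return name


# Illustrative (assumed) old-style column names.
old_columns = [
    "IR_VJ_1_cdr3",
    "IR_VJ_1_cdr3_nt",
    "IR_VDJ_1_expr",
    "IR_VDJ_1_expr_raw",
    "IR_VDJ_1_v_gene",
    "has_ir",
]
new_columns = [rename_column(c) for c in old_columns]
```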

Refactor CDR3 network creation (#230)

Previously, pp.ir_neighbors constructed a cell x cell network based on clonotype similarity. This led to performance issues with highly expanded clonotypes (i.e. thousands of cells with exactly the same receptor configuration). Such cells would form dense blocks in the sparse adjacency matrix (see issue #217). Another downside was that expensive alignment distances had to be recomputed every time the parameters of ir_neighbors were changed.

The new implementation computes distances between all unique receptor configurations, only considering one instance of highly expanded clonotypes.
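
The benefit of working on unique receptor configurations can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions (a single sequence per cell, Levenshtein distance, and the hypothetical helper `unique_distances`), not the actual scirpy implementation.

```python
from collections import Counter
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(
                min(
                    prev[j] + 1,  # deletion
                    cur[j - 1] + 1,  # insertion
                    prev[j - 1] + (ca != cb),  # substitution or match
                )
            )
        prev = cur
    return prev[-1]


def unique_distances(sequences):
    """Count cells per unique sequence and compute distances only
    between the unique sequences."""
    counts = Counter(sequences)
    unique = sorted(counts)
    dist = {(a, b): levenshtein(a, b) for a, b in combinations(unique, 2)}
    return counts, dist


# One highly expanded clonotype (1000 cells) plus two singletons.
cells = ["CASSF"] * 1000 + ["CASSY", "CATSY"]
counts, dist = unique_distances(cells)
```

In this toy example, only three distances are computed for the three unique sequences instead of one per pair of the 1002 cells; the cell counts are kept separately so that expansion information is not lost.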

Closed issues #243, #217, #191, #192, #164.

This update includes the following changes:

  • pp.ir_neighbors has been replaced by pp.ir_dist.
  • The options receptor_arms and dual_ir have been moved from pp.ir_neighbors to tl.define_clonotypes and tl.define_clonotype_clusters.
  • The default key for clonotype clusters is now cc_{distance}_{metric} instead of ct_cluster_{distance}_{metric}.
  • same_v_gene now fully respects the options dual_ir and receptor_arms
  • v-genes and receptor types were previously simply appended to clonotype ids (when same_v_gene=True). Now clonotypes with different v-genes get assigned a different numeric id.
  • Distance metric classes have been moved from ir_dist to ir_dist.metrics.
  • Distance matrices generated by ir_dist are now square and symmetric instead of triangular.
  • The default value for dual_ir is now any instead of primary_only (Closes #164).
  • The API of clonotype_network has changed.
  • Clonotype network now visualizes cells with identical receptor configurations. The number of cells with identical receptor configurations is shown as point size (and optionally, as color). Clonotype network does not support plotting multiple colors at the same time any more.
Clonotype network (previous implementation): each dot represents a cell. Cells with identical receptors form a fully connected subnetwork. [image]
Clonotype network (now): each dot represents cells with identical receptors. The dot size refers to the number of cells. [image]

Drop Support for Python 3.6

  • Support Python 3.9, drop support for Python 3.6, following the numpy guidelines. (#229)

Fixes

  • tl.clonal_expansion and tl.clonotype_convergence now respect cells with missing receptors and return nan for those cells. (#252)
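
The behavior described in this fix can be illustrated with a small stand-alone sketch: cells without a receptor get nan instead of being counted as a clonotype of their own. The function below and the use of `None` as the missing-value marker are assumptions for illustration; scirpy's functions operate on AnnData objects.

```python
import math
from collections import Counter


def clonal_expansion(clone_ids):
    """Return the clonotype size for each cell, or nan for cells
    without a clonotype assignment (represented here as None)."""
    sizes = Counter(c for c in clone_ids if c is not None)
    return [float(sizes[c]) if c is not None else math.nan for c in clone_ids]


# Four cells; the third one has no receptor information.
result = clonal_expansion(["ct1", "ct1", None, "ct2"])
```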

Additions

  • util.graph.igraph_from_sparse_matrix converts a sparse connectivity or distance matrix to an igraph object.
  • ir_dist.sequence_dist now also works with sequence arrays that contain duplicate entries (#192)
  • from_dandelion and to_dandelion facilitate interaction with the Dandelion package (#240)
  • write_airr writes scirpy's adata.obs back to the AIRR Rearrangement format.
  • read_airr now tries to infer the locus from gene names if no locus column is present.
  • ir.io.upgrade_schema upgrades an existing scirpy anndata object to be compatible with the latest version of scirpy
  • define_clonotypes and define_clonotype_clusters now print a logging message indicating where the results have been stored (#215)
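
Regarding the locus inference mentioned for read_airr above: IMGT-style gene names start with the locus they belong to (e.g. TRBV7-2 belongs to TRB), so a minimal version of such an inference could look like the sketch below. The helper `infer_locus` is hypothetical and much simpler than whatever scirpy does internally.

```python
from typing import Optional

# Loci of T-cell and B-cell receptor chains.
KNOWN_LOCI = ("TRA", "TRB", "TRG", "TRD", "IGH", "IGK", "IGL")


def infer_locus(gene_name: Optional[str]) -> Optional[str]:
    """Guess the locus from the first three letters of a gene name.

    Returns None if the name is missing or does not start with a
    known locus.
    """
    if not gene_name:
        return None
    prefix = gene_name.strip().upper()[:3]
    return prefix if prefix in KNOWN_LOCI else None


loci = [infer_locus(g) for g in ["TRBV7-2", "IGHV3-23", "CD8A", None]]
```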

Minor changes

  • tqdm now uses IPython widgets to display progress bars, if available
  • The process_map from tqdm is now used to display progress bars for parallel computations instead of the custom implementation used previously (f307c2b)
  • matplotlib's "grid lines" are now suppressed by default in all plots.
  • Docs from the master branch are now deployed to icbi-lab.github.io/scirpy/develop instead of the main documentation website. The main website only gets updated on releases.
  • Refactored the _is_na function that checks if a string evaluates to None.
  • Fixed outdated documentation of the receptor_arms parameter (#264)

v0.6.1

30 Jan 16:27

Fixes

  • Fix an issue where define_clonotype failed when the clonotype network had no edges (#236).
  • Require pandas >= 1.0 and fix a pandas incompatibility in merge_with_ir (#238).
  • Ensure consistent order of the spectratype dataframe (#238).

Minor changes

  • Fix missing bibtex_bibfiles option in sphinx configuration
  • Work around pypa/flit#383.

v0.6.0

10 Dec 14:28

Backwards-incompatible changes:

  • Set more sensible defaults for the cutoff parameter in ir_neighbors. The default is now 2 for the hamming and levenshtein distance metrics and 10 for the alignment distance metric.

Additions:

  • Add Hamming-distance as additional distance metric for ir_neighbors (#216 by @ktpolanski)
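
For readers unfamiliar with the metric: the Hamming distance counts mismatching positions between two sequences of equal length. The sketch below combines it with the default cutoffs stated above; the function names, the treatment of unequal lengths, and the inclusive comparison against the cutoff are assumptions for illustration, not a description of scirpy's code.

```python
from typing import Optional

# Default cutoffs per metric, as stated in the release notes above.
DEFAULT_CUTOFFS = {"hamming": 2, "levenshtein": 2, "alignment": 10}


def hamming_distance(a: str, b: str) -> Optional[int]:
    """Number of mismatching positions, or None if the lengths differ
    (the Hamming distance is only defined for equal-length sequences)."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def is_neighbor(a: str, b: str, cutoff: int = DEFAULT_CUTOFFS["hamming"]) -> bool:
    """True if both sequences are comparable and at most `cutoff`
    mismatches apart (inclusive comparison is an assumption)."""
    d = hamming_distance(a, b)
    return d is not None and d <= cutoff


close = is_neighbor("CASSLGQ", "CASSLAA")  # two mismatches
far = is_neighbor("CASSLGQ", "CATTLAA")  # four mismatches
incomparable = is_neighbor("CASSLGQ", "CASSLG")  # different lengths
```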

Minor changes:

  • Fix MacOS CI (#221)
  • Use mamba instead of conda in CI (#216)

v0.5.0

20 Oct 16:37
cce2a03

Add support for BCRs and gamma-delta TCRs

Backwards-incompatible changes:

  • The data structure has changed. Columns have been renamed from TRA_xxx and TRB_xxx to IR_VJ_xxx and IR_VDJ_xxx. Additionally, a locus column has been added for each chain.
  • All occurrences of tcr in the function and class names have been replaced with ir. Aliases for the old names have been created and emit a FutureWarning.
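
A common way to provide such aliases is a thin wrapper that warns and then forwards to the new function. The sketch below shows this general pattern; the function names `read_ir_data` and `read_tcr_data` are hypothetical, and the code is not taken from scirpy.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name):
    """Create an alias for `new_func` that emits a FutureWarning when called."""

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated, use {new_func.__name__} instead.",
            FutureWarning,
            stacklevel=2,  # point the warning at the caller, not the wrapper
        )
        return new_func(*args, **kwargs)

    return wrapper


def read_ir_data(path):
    """Hypothetical new-style function; returns a placeholder value."""
    return f"loaded {path}"


# Hypothetical old name kept for backwards compatibility.
read_tcr_data = deprecated_alias(read_ir_data, "read_tcr_data")

# Calling the old name still works, but a FutureWarning is recorded.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = read_tcr_data("cells.csv")
```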

Additions:

  • There's now a mixed TCR/BCR example dataset (maynard2020) available (#211)
  • BCR-related amendments to the documentation (#206)
  • tl.chain_qc which supersedes chain_pairing. It additionally provides information about the receptor type.
  • io.read_tracer now supports gamma-delta T-cells (#207)
  • io.to_ir_objs converts adata to a list of IrCells (#210)
  • io.read_bracer reads in BraCeR BCR data. (#208)
  • The pp.merge_with_ir function now can handle the case when both the left and the right AnnData object contain immune receptor information. This is useful when integrating both TCR and BCR data into the same dataset. (#210)

Fixes:

  • Fix a bug in vdj_usage that was triggered by the new data structure (#203)

Minor changes:

  • Removed the tqdm monkey patch, as the issue has been resolved upstream (#200)
  • Add AIRR badge, as scirpy is now certified to comply with the AIRR software standard v1. (#202)
  • Require pycairo >1.20 which provides a windows wheel, eliminating the CI problems.

data release: BCR example data

20 Oct 15:16
c1cd6ff

The assets of this release contain the example datasets compatible with v0.5.