Releases: scverse/scirpy
v0.10.0
Additions
This release adds a new feature to query reference databases (#298) comprising
- an extension of pp.ir_dist to compute distances to a reference dataset,
- tl.ir_query, to match immune receptors to a reference database based on the distances computed with ir_dist,
- tl.ir_query_annotate and tl.ir_query_annotate_df to annotate cells based on the result of tl.ir_query, and
- datasets.vdjdb which conveniently downloads and processes the latest version of VDJDB.
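The idea behind a reference query can be illustrated without scirpy. The following is a minimal sketch, not scirpy's implementation: it matches query sequences against a toy reference using an identity metric (two sequences match only if they are equal, ignoring case). The helper name `query_reference`, the toy sequences and the annotation labels are all made up for illustration.

```python
from collections import defaultdict


def query_reference(query_seqs, reference):
    """Match each query sequence to reference entries with an identical sequence.

    `reference` is a list of (sequence, annotation) tuples.
    Hypothetical helper for illustration, not part of scirpy.
    """
    # Index the reference by upper-cased sequence so each lookup is cheap.
    index = defaultdict(list)
    for seq, annotation in reference:
        index[seq.upper()].append(annotation)

    # Only query sequences with at least one match appear in the result.
    return {
        seq: index[seq.upper()]
        for seq in query_seqs
        if seq.upper() in index
    }


# Toy data (assumed, not taken from any real database).
reference = [
    ("CASSLGQAYEQYF", "antigen A"),
    ("CASSIRSSYEQYF", "antigen B"),
    ("CASSLGQAYEQYF", "antigen C"),
]
matches = query_reference(["cassLGQAYEQYF", "CASSXXXXXXX"], reference)
```

A sequence can match several reference entries, which is why each match is a list of annotations rather than a single value.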
Fixes
- Bump minimal dependencies for networkx and tqdm (#300)
- Fix issue with repertoire_overlap (Fix #302 via #305)
- Fix issue with define_clonotype_clusters (Fix #303 via #305)
- Suppress FutureWarnings from pandas in tutorials (#307)
Internal changes
- Update sphinx to >= 4.1 (#306)
- Update black version
- Update the internal folder structure: tl, pp etc. are now real packages instead of aliases
v0.9.1
v0.9.0
Additions
- Add the new "clonotype modularity" tool which ranks clonotypes by how strongly connected their gene expression neighborhood graph is (#282).
The example below shows three clonotypes (164, 1363, 942), two of which consist of cells that are transcriptionally related.
[Figure panels: example clonotypes; clonotype modularity vs. FDR]
Deprecations
- tl.clonotype_imbalance is now deprecated in favor of the new clonotype modularity tool.
Fixes
Minor updates
v0.8.0
Additions
- tl.alpha_diversity now supports all metrics from scikit-bio, the D50 metric and custom callback functions (#277 by @naity)
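For readers unfamiliar with D50, here is a minimal sketch assuming the commonly used definition: the smallest number of clonotypes (most abundant first) that together cover at least 50% of all cells, expressed as a percentage of the number of distinct clonotypes. Whether scirpy handles edge cases in exactly this way is not stated in these notes, and the function name `d50` is chosen for illustration only.

```python
def d50(counts):
    """Return D50 in percent for a list of clonotype sizes.

    Assumed definition: the minimal number of most-abundant clonotypes
    covering >= 50 % of all cells, divided by the number of distinct
    clonotypes, times 100.
    """
    # Ignore empty clonotypes and rank the rest by abundance.
    sizes = sorted((c for c in counts if c > 0), reverse=True)
    if not sizes:
        raise ValueError("need at least one non-empty clonotype")

    total = sum(sizes)
    cumulative = 0
    for n_clonotypes, size in enumerate(sizes, start=1):
        cumulative += size
        # Integer comparison avoids floating-point issues at exactly 50 %.
        if 2 * cumulative >= total:
            return 100 * n_clonotypes / len(sizes)


skewed = d50([50, 30, 10, 5, 5])   # one clonotype already covers half
even = d50([10, 10, 10, 10])       # two of four clonotypes are needed
```

A low value indicates a repertoire dominated by a few expanded clonotypes; a value near 50 indicates an even repertoire.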
Fixes
- Handle input data with "productive" chains which don't have a junction_aa sequence annotated (#281)
- Fix issue with serialized "extra chains" not being imported correctly (#283 by @zktuong)
Minor changes
- The CI can now build documentation from pull requests from forks. PR docs are no longer deployed to github-pages, but can be downloaded as an artifact from the CI run.
v0.7.1
Fixes
- Ensure Compatibility with latest version of dandelion (e78701c)
- Add links to older versions of documentation (#275)
- Fix an issue where clonotype analysis couldn't be continued after saving and reloading an h5ad object (#274)
- Allow "None" values to be present as cell-level attributes during merge_airr_chains (#273)
Minor changes
- Require anndata >= 0.7.6 in conda tests (#266)
v0.7.0
This update features
- a change of Scirpy's data structure to improve interoperability with the AIRR standard
- a complete rewrite of the clonotype definition module for improved performance.
This required several backwards-incompatible changes. Please read the release notes below and the updated tutorials.
Backwards-incompatible changes
Improve Interoperability by fully supporting the AIRR standard (#241)
Scirpy stores receptor information in adata.obs. In this release, we updated the column names to match the AIRR Rearrangement standard. Our data model is now much more flexible and allows importing arbitrary immune-receptor (IR)-chain related information. Use scirpy.io.upgrade_schema() to update existing AnnData objects to the latest format.
Closed issues #240, #253, #258, #255, #242, #215.
This update includes the following changes:
- IrCell is now replaced by AirrCell which has additional functionality
- IrChain has been removed. Use a plain dictionary instead.
- CDR3 information is now read from the junction and junction_aa columns instead of cdr3_nt and cdr3, respectively.
- Clonotype assignments are now per default stored in the clone_id column.
- expr and expr_raw are now duplicate_count and consensus_count.
- {v,d,j,c}_gene is now {v,d,j,c}_call.
- There's now an extra_chains column containing all IR-chains that don't fit into our receptor model. These chains are not used by scirpy, but can be re-exported to different formats.
- merge_with_ir is now split up into merge_with_ir (to merge IR data with transcriptomics data) and merge_airr_chains (to merge several adatas with IR information, e.g. BCR and TCR data).
- Tutorial and documentation updates, to reflect these changes
- Sequences are not converted to upper case on import. Scirpy tools that consume the sequences convert them to upper case on-the-fly.
- {to,from}_ir_objs has been renamed to {to,from}_airr_cells.
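The field renames listed above can be summarized as a lookup table. The sketch below is illustrative only: `rename_column` is a hypothetical helper, and the example column names (such as `IR_VJ_1_cdr3`) are assumed from the `IR_VJ_xxx` / `IR_VDJ_xxx` naming introduced in v0.5.0. For real AnnData objects, use scirpy.io.upgrade_schema() rather than renaming columns by hand.

```python
# Old field suffix -> AIRR-style field suffix, as listed in these notes.
LEGACY_TO_AIRR = {
    "cdr3_nt": "junction",
    "cdr3": "junction_aa",
    "expr": "duplicate_count",
    "expr_raw": "consensus_count",
    "v_gene": "v_call",
    "d_gene": "d_call",
    "j_gene": "j_call",
    "c_gene": "c_call",
}


def rename_column(col):
    """Translate a legacy column name to the new naming scheme.

    Hypothetical helper: only the trailing field name is replaced,
    columns without a known legacy field are returned unchanged.
    """
    for old, new in LEGACY_TO_AIRR.items():
        # Match on "_<field>" so that e.g. "cdr3" never matches "cdr3_nt".
        if col.endswith("_" + old):
            return col[: -len(old)] + new
    return col


legacy_columns = ["IR_VJ_1_cdr3", "IR_VJ_1_cdr3_nt", "IR_VDJ_1_expr_raw", "IR_VJ_1_locus"]
upgraded = [rename_column(c) for c in legacy_columns]
```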
Refactor CDR3 network creation (#230)
Previously, pp.ir_neighbors constructed a cell x cell network based on clonotype similarity. This led to performance issues with highly expanded clonotypes (i.e. thousands of cells with exactly the same receptor configuration). Such cells would form dense blocks in the sparse adjacency matrix (see issue #217). Another downside was that expensive alignment-distances had to be recomputed every time the parameters of ir_neighbors were changed.
The new implementation computes distances between all unique receptor configurations, only considering one instance of highly expanded clonotypes.
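The benefit of working on unique receptor configurations can be shown with a toy example. This is a minimal sketch of the general idea, not scirpy's code: the mismatch-count distance, the function names and the toy sequences are assumptions made for illustration.

```python
def mismatch_count(a, b):
    """Toy distance: number of differing positions (equal-length inputs only)."""
    return sum(x != y for x, y in zip(a, b))


def distances_between_unique(cell_sequences, dist=mismatch_count):
    """Compute pairwise distances between unique sequences only."""
    unique = sorted(set(cell_sequences))
    pair_dist = {}
    for i, a in enumerate(unique):
        for b in unique[i:]:
            # Each unordered pair is computed once and stored in both orders.
            pair_dist[(a, b)] = pair_dist[(b, a)] = dist(a, b)
    return unique, pair_dist


# 1000 cells share one receptor sequence (a highly expanded clonotype).
cells = ["CASSL"] * 1000 + ["CASSF", "CATSF"]
unique, pair_dist = distances_between_unique(cells)

# Cell-level distances are looked up from the much smaller table.
first_vs_last = pair_dist[(cells[0], cells[-1])]
```

With 1002 cells but only 3 unique sequences, the distance function is evaluated 6 times instead of once per pair of cells.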
Closed issues #243, #217, #191, #192, #164.
This update includes the following changes:
- pp.ir_neighbors has been replaced by pp.ir_dist.
- The options receptor_arms and dual_ir have been moved from pp.ir_neighbors to tl.define_clonotypes and tl.define_clonotype_clusters.
- The default key for clonotype clusters is now cc_{distance}_{metric} instead of ct_cluster_{distance}_{metric}.
- same_v_gene now fully respects the options dual_ir and receptor_arms
- v-genes and receptor types were previously simply appended to clonotype ids (when same_v_gene=True). Now clonotypes with different v-genes get assigned a different numeric id.
- Distance metric classes have been moved from ir_dist to ir_dist.metrics.
- Distance matrices generated by ir_dist are now square and symmetric instead of triangular.
- The default value for dual_ir is now any instead of primary_only (Closes #164).
- The API of clonotype_network has changed.
- Clonotype network now visualizes cells with identical receptor configurations. The number of cells with identical receptor configurations is shown as point size (and optionally, as color). Clonotype network does not support plotting multiple colors at the same time any more.
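To make the difference between a triangular and a square, symmetric distance matrix concrete, here is a small sketch. Plain nested lists stand in for the sparse matrices a real implementation would use, and the helper name `symmetrize` is hypothetical.

```python
def symmetrize(upper):
    """Mirror an upper-triangular square matrix across its diagonal."""
    n = len(upper)
    # Entry (i, j) is always read from the upper triangle.
    return [[upper[min(i, j)][max(i, j)] for j in range(n)] for i in range(n)]


# Upper-triangular toy distances between three sequences.
triangular = [
    [0, 1, 2],
    [0, 0, 3],
    [0, 0, 0],
]
square = symmetrize(triangular)
```

In the symmetric form, the distance between two sequences can be looked up in either order, which simplifies downstream code at the cost of storing each value twice.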
Drop Support for Python 3.6
- Support Python 3.9, drop support for Python 3.6, following the numpy guidelines. (#229)
Fixes
- tl.clonal_expansion and tl.clonotype_convergence now respect cells with missing receptors and return nan for those cells. (#252)
Additions
- util.graph.igraph_from_sparse_matrix converts a sparse connectivity or distance matrix to an igraph object.
- ir_dist.sequence_dist now also works with sequence arrays that contain duplicate entries (#192)
- from_dandelion and to_dandelion facilitate interaction with the Dandelion package (#240)
- write_airr writes scirpy's adata.obs back to the AIRR Rearrangement format.
- read_airr now tries to infer the locus from gene names, if no locus column is present.
- ir.io.upgrade_schema upgrades an existing scirpy anndata object to be compatible with the latest version of scirpy
- define_clonotypes and define_clonotype_clusters now print a logging message indicating where the results have been stored (#215)
Minor changes
- tqdm now uses IPython widgets to display progress bars, if available
- the process_map from tqdm is now used to display progress bars for parallel computations instead of the custom implementation used previously (f307c2b)
- matplotlib's "grid lines" are now suppressed by default in all plots.
- Docs from the master branch are now deployed to icbi-lab.github.io/scirpy/develop instead of the main documentation website. The main website only gets updated on releases.
- Refactored the _is_na function that checks if a string evaluates to None.
- Fixed outdated documentation of the receptor_arms parameter (#264)
v0.6.1
Fixes
- Fix an issue where define_clonotype failed when the clonotype network had no edges (#236).
- Require pandas >= 1.0 and fix a pandas incompatibility in merge_with_ir (#238).
- Ensure consistent order of the spectratype dataframe (#238).
Minor changes
- Fix missing bibtex_bibfiles option in sphinx configuration
- Work around pypa/flit#383.
v0.6.0
Backwards-incompatible changes:
- Set more sensible defaults for the cutoff parameter in ir_neighbors. The default is now 2 for hamming and levenshtein distance metrics and 10 for the alignment distance metric.
Additions:
- Add Hamming-distance as additional distance metric for ir_neighbors (#216 by @ktpolanski)
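As a reminder of what a Hamming distance with a cutoff means, here is a minimal pure-Python sketch. It is not scirpy's implementation. The default cutoffs in the table are the ones stated in these notes; treating sequences of unequal length as non-neighbors and treating the cutoff as inclusive are assumptions of this sketch, and the function names are hypothetical.

```python
# Default cutoffs per metric, as stated in the release notes above.
DEFAULT_CUTOFF = {"hamming": 2, "levenshtein": 2, "alignment": 10}


def hamming_distance(a, b):
    """Number of differing positions, or None for unequal lengths.

    Assumption: the Hamming distance is undefined for sequences of
    different length, so None is returned in that case.
    """
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def within_cutoff(a, b, cutoff=DEFAULT_CUTOFF["hamming"]):
    """True if both sequences are at most `cutoff` mismatches apart."""
    d = hamming_distance(a, b)
    # Assumption: the cutoff is inclusive (distance <= cutoff counts).
    return d is not None and d <= cutoff


close = within_cutoff("CASSL", "CATSF")      # 2 mismatches
different_length = within_cutoff("CASSL", "CASS")
far = within_cutoff("AAAA", "TTTT")          # 4 mismatches
```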
Minor changes:
v0.5.0
Add support for BCRs and gamma-delta TCRs
Backwards-incompatible changes:
- The data structure has changed. Columns have been renamed from TRA_xxx and TRB_xxx to IR_VJ_xxx and IR_VDJ_xxx. Additionally, a locus column has been added for each chain.
- All occurrences of tcr in the function and class names have been replaced with ir. Aliases for the old names have been created and emit a FutureWarning.
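A common way to keep old names working while steering users to new ones is a thin wrapper that emits a FutureWarning. The sketch below shows that general pattern only; it is not taken from scirpy's source, and `deprecated_alias`, `read_ir` and `read_tcr` are placeholder names.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name):
    """Return a wrapper that warns and then forwards to `new_func`."""

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated, use {new_func.__name__} instead.",
            FutureWarning,
            stacklevel=2,  # point the warning at the caller, not the wrapper
        )
        return new_func(*args, **kwargs)

    return wrapper


def read_ir(path):
    """Placeholder for a renamed function."""
    return f"read {path}"


# The old name stays importable but warns on every call.
read_tcr = deprecated_alias(read_ir, "read_tcr")
```

Unlike DeprecationWarning, FutureWarning is shown to end users by default, which suits renames that affect interactive analysis code.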
Additions:
- There's now a mixed TCR/BCR example dataset (maynard2020) available (#211)
- BCR-related amendments to the documentation (#206)
- tl.chain_qc which supersedes chain_pairing. It additionally provides information about the receptor type.
- io.read_tracer now supports gamma-delta T-cells (#207)
- io.to_ir_objs converts adata to a list of IrCells (#210)
- io.read_bracer reads in BraCeR BCR data. (#208)
- The pp.merge_with_ir function can now handle the case when both the left and the right AnnData object contain immune receptor information. This is useful when integrating both TCR and BCR data into the same dataset. (#210)
Fixes:
- Fix a bug in vdj_usage which has been triggered by the new data structure (#203)
Minor changes:
data release: BCR example data
The assets of this release contain the example datasets compatible with v0.5.