feat: support filtering of multi-allelic variants and loading of covariates in data/ #20

Merged: 21 commits, Apr 13, 2022

@aryarm commented Apr 2, 2022

NOTE: please merge #21 before this PR

See #19.
This PR adds support for filtering out multi-allelic variants in the Genotypes class. It also introduces a new Covariates class for reading covariate data. Both features come with unit tests.
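
To illustrate what filtering out multi-allelic variants means, here is a minimal sketch. It is not the code from this PR: the `Variant` record and the `drop_multiallelic()` helper are hypothetical names made up for the example, and it simply assumes that a variant counts as multi-allelic when it has more than one ALT allele.

```python
from typing import List, NamedTuple, Tuple


class Variant(NamedTuple):
    """Hypothetical variant record, used only for this illustration."""
    chrom: str
    pos: int
    ref: str
    alts: Tuple[str, ...]  # ALT alleles listed for this site


def drop_multiallelic(variants: List[Variant]) -> List[Variant]:
    """Keep only variants that have exactly one ALT allele."""
    return [v for v in variants if len(v.alts) == 1]


variants = [
    Variant("1", 10114, "T", ("C",)),
    Variant("1", 10116, "A", ("G", "T")),  # two ALT alleles: multi-allelic
    Variant("1", 10117, "C", ("A",)),
]

biallelic = drop_multiallelic(variants)
kept_positions = [v.pos for v in biallelic]
print(kept_positions)
```

The real Genotypes class may detect multi-allelic sites differently (for example, from the genotype calls rather than the ALT field); the sketch only shows the idea.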

There are also some small changes to our dependencies in pyproject.toml to resolve conflicts between click and black. I also changed the installation instructions to make sure everyone is using Python 3.7.

I added a warning for cases where the Genotypes class loads 0 variants when the regions spec has the wrong 'chr' prefix. (This was an issue for me previously that I spent too much time debugging 😓)
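
For anyone curious, the kind of check behind such a warning could look roughly like the sketch below. This is an assumption-laden illustration, not the PR's code: `count_in_region()` is a hypothetical helper, and contigs are modeled as plain strings.

```python
import warnings
from typing import List


def count_in_region(chroms: List[str], region_chrom: str) -> int:
    """Count variants on region_chrom; warn if a 'chr' prefix mismatch is likely."""
    n = sum(1 for c in chroms if c == region_chrom)
    if n == 0:
        # Toggle the 'chr' prefix and see whether that contig name exists instead.
        if region_chrom.startswith("chr"):
            toggled = region_chrom[3:]
        else:
            toggled = "chr" + region_chrom
        if toggled in chroms:
            warnings.warn(
                f"No variants matched contig '{region_chrom}', but '{toggled}' "
                "is present. Check the 'chr' prefix in the region."
            )
    return n


# Demo: the file uses 'chr1' but the region was given as '1'.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    n_loaded = count_in_region(["chr1", "chr1", "chr2"], "1")

messages = [str(w.message) for w in caught]
print(n_loaded, messages)
```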

I also added an iterate() method to each class in the data module.
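
As a rough picture of the difference between reading everything at once and iterating, here is a small sketch. The signatures of the real methods are not shown in this PR description, so everything here is hypothetical: a `PhenotypeRecord` tuple, a two-column tab-separated input, and module-level `iterate()` and `read()` functions standing in for methods.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class PhenotypeRecord(NamedTuple):
    """Hypothetical record type: one sample and one phenotype value."""
    sample: str
    value: float


def iterate(handle: TextIO) -> Iterator[PhenotypeRecord]:
    """Yield one record per line without holding the whole file in memory."""
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        sample, value = line.split("\t")
        yield PhenotypeRecord(sample, float(value))


def read(handle: TextIO) -> List[PhenotypeRecord]:
    """Load every record at once by exhausting the iterator."""
    return list(iterate(handle))


text = "HG00096\t0.5\nHG00097\t1.25\n"

first = next(iterate(io.StringIO(text)))  # only the first line is parsed
records = read(io.StringIO(text))         # all lines are parsed
print(first, len(records))
```

Building `read()` on top of `iterate()` keeps the parsing logic in one place, which is one plausible reason to offer both.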

And I rewrote parts of the Genotypes.read() method to handle large datasets.

Lastly, I double-checked that various inputs (stdin and gz files, mostly) are supported in the Phenotypes and Covariates classes.
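
A sketch of how one reader can accept plain files, gz files, and stdin is below. It is only an illustration under stated assumptions: `open_text()` and `read_covariates()` are hypothetical helpers, `"-"` as the stdin marker is a convention chosen for this example, and the file layout (tab-separated, header row, sample ID in the first column) is assumed rather than taken from the Covariates class.

```python
import csv
import gzip
import os
import sys
import tempfile


def open_text(path):
    """Return a text handle for stdin ('-'), a .gz file, or a plain file."""
    if path == "-":
        return sys.stdin
    if path.endswith(".gz"):
        return gzip.open(path, "rt", newline="")
    return open(path, "r", newline="")


def read_covariates(handle):
    """Parse a tab-separated table: header row, then one sample per row."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    names = header[1:]  # covariate names follow the sample column
    samples, values = [], []
    for row in reader:
        if not row:
            continue  # skip empty lines
        samples.append(row[0])
        values.append([float(x) for x in row[1:]])
    return names, samples, values


# Demo: write a small gzipped table to a temporary directory and read it back.
content = "sample\tsex\tage\nHG00096\t0\t4\nHG00097\t1\t20\n"
with tempfile.TemporaryDirectory() as tmp:
    gz_path = os.path.join(tmp, "covars.tsv.gz")
    with gzip.open(gz_path, "wt", newline="") as out:
        out.write(content)
    with open_text(gz_path) as handle:
        names, samples, values = read_covariates(handle)

print(names, samples, values)
```

Because `read_covariates()` takes an open handle rather than a path, the same parsing code works for all three input types.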

@aryarm merged commit af26af6 into main Apr 13, 2022
@aryarm deleted the feat/multi-allelic branch April 13, 2022 22:52