Some questions about data preprocessing. #7

Open
Huahuatii opened this issue Sep 20, 2023 · 1 comment

Comments

@Huahuatii

Your work is very interesting, and I would like to test portal-sc on our dataset. It's great to see the work you've done in preprocess_memory_efficient. However, I've noticed that the preprocessing order seems to differ from the standard Scanpy workflow. Is there a specific reason for this difference?

@jiazhao97
Collaborator

Hi there,

Thank you for your interest in our Portal method! In Portal, we select highly variable genes with flavor 'seurat_v3'. This flavor expects raw count data, whereas the other flavors expect logarithmized data (https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.highly_variable_genes.html). Therefore, Portal selects genes before log-transforming the data, while the standard Scanpy pipeline log-transforms the data first and then selects genes with a different flavor.
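
To make the ordering concrete, here is a minimal sketch of how the two orders could look with Scanpy. It is only an illustration, not Portal's actual preprocessing code: the helper names `expects_counts` and `preprocess`, the `n_top_genes=2000` default, and `target_sum=1e4` are assumptions chosen for the example, and `adata` is assumed to hold raw counts in `.X`. Note that flavor 'seurat_v3' also needs the `scikit-misc` package installed.

```python
# Illustrative sketch only; not Portal's actual preprocessing code.

# Flavors of sc.pp.highly_variable_genes that expect raw counts (assumed set).
COUNT_FLAVORS = {"seurat_v3"}


def expects_counts(flavor):
    """Return True if the HVG flavor should be run on raw count data."""
    return flavor in COUNT_FLAVORS


def preprocess(adata, flavor="seurat_v3", n_top_genes=2000):
    """Run normalization, log1p, and HVG selection in the order the flavor needs.

    `adata` is assumed to be an AnnData object with raw counts in `.X`.
    """
    import scanpy as sc  # imported here so expects_counts() works without scanpy

    if expects_counts(flavor):
        # Count-based flavor: select genes first, then normalize and log-transform.
        sc.pp.highly_variable_genes(
            adata, n_top_genes=n_top_genes, flavor=flavor, subset=True
        )
        sc.pp.normalize_total(adata, target_sum=1e4)
        sc.pp.log1p(adata)
    else:
        # Log-based flavors: normalize and log-transform first, then select genes.
        sc.pp.normalize_total(adata, target_sum=1e4)
        sc.pp.log1p(adata)
        sc.pp.highly_variable_genes(
            adata, n_top_genes=n_top_genes, flavor=flavor, subset=True
        )
    return adata
```

Calling `preprocess(adata)` would follow the select-then-log order described above, while `preprocess(adata, flavor="seurat")` would follow the log-then-select order of the standard Scanpy workflow.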

Best,
Jia
