snplist support in the `ld` command #245

aryarm · 2024-06-05T19:13:08Z

It would be nice if the ld command could compute LD between SNPs. For now, I've just been using the haptools API:

import numpy as np
from haptools.data import GenotypesVCF
from haptools.ld import pearson_corr_ld

causal_SNP = "rs429358"

gts = GenotypesVCF.load("tests/data/apoe.vcf.gz")
get_snp_gts = lambda snp: gts.subset(variants=(snp,)).data[:, 0, :2].sum(axis=1)
target_gts = get_snp_gts(causal_SNP)
other_snps = set(gts.variants["id"])
other_snps.remove(causal_SNP)
ld = {vr: np.abs(pearson_corr_ld(target_gts, get_snp_gts(vr))) for vr in other_snps}

# get the top ten SNPs that have the highest LD with the causal SNP
sorted(ld, key=ld.get, reverse=True)[:10]

on second thought, I'm not sure the snplist file format is the best either, since it requires that you provide a beta for each SNP

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

snplist support in the `ld` command #245

snplist support in the `ld` command #245

aryarm commented Jun 5, 2024 •

edited

Loading

snplist support in the ld command #245

snplist support in the ld command #245

Comments

aryarm commented Jun 5, 2024 • edited Loading

snplist support in the `ld` command #245

snplist support in the `ld` command #245

aryarm commented Jun 5, 2024 •

edited

Loading