Skip to content

Commit

Permalink
feat: added note on UCSC liftover
Browse files Browse the repository at this point in the history
  • Loading branch information
SLundregan authored Apr 29, 2024
1 parent cf9aca9 commit f90a317
Showing 1 changed file with 35 additions and 0 deletions.
35 changes: 35 additions & 0 deletions docs/UCSC_liftover
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
# UCSC liftover tool

[UCSC liftover](https://genome.ucsc.edu/cgi-bin/hgLiftOver)

## Issue: separate peaks map to same coordinates after liftover

- When using the tool to liftover regions from hg19 to hg38, separate peaks in specific regions of hg19 map to overlapping coordinates after liftover.
- This is because some contigs were not carried forward from hg19 to hg38, due to problems with the hg19 assembly that were resolved in hg38
- All lifted over peaks with overlapping coordinates should be removed from analysis because the peaks were called on an uncertain region in the old genome assembly
- Issues like this likely apply to all genome updates, not only human
- The same issue also applies with [Ensembl assembly converter](https://www.ensembl.org/Homo_sapiens/Tools/AssemblyConverter)

## Solution
Remove any peaks with overlapping coordinates after liftover before using the lifted over peak file:

```bash
#!/bin/bash

module load bedtools

EXTL=("../data/external/")

# Sort the lifted over peakfile for use with bedtools
sort -k1,1 -k2,2n ${EXTL}wong_fig3c_peaks_GRCh38.bed > ${EXTL}peaks.tmp && mv ${EXTL}peaks.tmp ${EXTL}wong_fig3c_peaks_GRCh38.bed

# Bedtools merge count rows contributing to merged peaks (overlapping peaks will have count > 1)
bedtools merge -i ${EXTL}wong_fig3c_peaks_GRCh38.bed -c 1 -o count > ${EXTL}counted.bed

# Get non-overlapping peaks
awk '/\t1$/{print}' ${EXTL}counted.bed > ${EXTL}filtered.bed

# Intersect original file with non-overlapping peaks and output overlapping peaks
bedtools intersect -wa -a ${EXTL}wong_fig3c_peaks_GRCh38.bed -b ${EXTL}filtered.bed > ${EXTL}wong_fig3c_peaks_GRCh38_correct_liftover.bed
bedtools intersect -v -a ${EXTL}wong_fig3c_peaks_GRCh38.bed -b ${EXTL}filtered.bed > ${EXTL}wong_fig3c_peaks_GRCh38_overlapping.bed
```

0 comments on commit f90a317

Please sign in to comment.