GRangesEmpty added and major changes added. This newtype resolves many issues.

 - Added the `GRangesEmpty` newtype around `GRanges<R, ()>`. This was
   a major step forward. When the generic `GRanges<R, T>` object has no
   data container (i.e. `T = ()`), the range container `R` *always*
   holds ranges without indices. Working with the unwrapped "empty"
   `GRanges` object was therefore a headache, since the developer had
   to handle and link *two* types: `T = ()` and some sort of unindexed
   range. The newtype resolves this.

 - Added the `IntoGRangesRef` trait, which improves the ergonomics
   when writing functions that should work on both `GRanges` and `GRangesEmpty`
   objects. Also added `Into` methods for conversion from `GRangesEmpty` to
   `GRanges` objects.

 - With two different types, it is now possible to *properly* give
   `GRanges` and `GRangesEmpty` methods with the same name (without
   traits). Thus, several methods like `from_iter_ranges_only()` and
   `push_range_with_data()` have been removed and replaced with the
   unified names `from_iter()` and `push_range()`, each implemented
   appropriately for its type.

 - Added an error for unsupported genomic range file formats.
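
The design described in the bullets above can be sketched roughly as follows. This is a simplified illustration, not the crate's actual code: the struct layouts, the `RangeEmpty` and `RangeIndexed` stand-ins, the associated-type form of `IntoGRangesRef`, and the `count_ranges()` helper are all assumptions made for the example.

```rust
// Simplified, hypothetical stand-ins for illustration only; the crate's
// real definitions are richer (e.g. ranges are grouped per sequence).

/// A range with no link to a data container.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct RangeEmpty {
    pub start: u32,
    pub end: u32,
}

/// A range that stores the index of its element in the data container.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct RangeIndexed {
    pub start: u32,
    pub end: u32,
    pub index: usize,
}

/// Generic container: ranges of type `R` plus a data container `T`.
#[derive(Debug, Clone, PartialEq)]
pub struct GRanges<R, T> {
    ranges: Vec<R>,
    data: T,
}

/// Newtype that pins `T = ()`, so "no data" is a single type to handle.
#[derive(Debug, Clone, PartialEq)]
pub struct GRangesEmpty<R>(GRanges<R, ()>);

impl<R, T> GRanges<R, T> {
    /// Number of ranges stored.
    pub fn len(&self) -> usize {
        self.ranges.len()
    }
}

impl<U> GRanges<RangeIndexed, Vec<U>> {
    pub fn new() -> Self {
        GRanges { ranges: Vec::new(), data: Vec::new() }
    }

    /// Same method name as on `GRangesEmpty`, but this one takes a datum.
    pub fn push_range(&mut self, start: u32, end: u32, datum: U) {
        let index = self.data.len();
        self.data.push(datum);
        self.ranges.push(RangeIndexed { start, end, index });
    }
}

impl GRangesEmpty<RangeEmpty> {
    pub fn new() -> Self {
        GRangesEmpty(GRanges { ranges: Vec::new(), data: () })
    }

    /// Same method name as on `GRanges`, with no data argument.
    pub fn push_range(&mut self, start: u32, end: u32) {
        self.0.ranges.push(RangeEmpty { start, end });
    }
}

/// Conversion from the newtype back to the generic object.
impl<R> From<GRangesEmpty<R>> for GRanges<R, ()> {
    fn from(gr: GRangesEmpty<R>) -> Self {
        gr.0
    }
}

/// Hypothetical trait shape: borrow either type as a `&GRanges`.
pub trait IntoGRangesRef {
    type Ranges;
    type Data;
    fn into_granges_ref(&self) -> &GRanges<Self::Ranges, Self::Data>;
}

impl<R, T> IntoGRangesRef for GRanges<R, T> {
    type Ranges = R;
    type Data = T;
    fn into_granges_ref(&self) -> &GRanges<R, T> {
        self
    }
}

impl<R> IntoGRangesRef for GRangesEmpty<R> {
    type Ranges = R;
    type Data = ();
    fn into_granges_ref(&self) -> &GRanges<R, ()> {
        &self.0
    }
}

/// One function that accepts both `GRanges` and `GRangesEmpty`.
pub fn count_ranges<G: IntoGRangesRef>(gr: &G) -> usize {
    gr.into_granges_ref().len()
}
```

Because `push_range()` lives in separate inherent `impl` blocks, each type can have its own signature under one name, and a caller never has to pair `T = ()` with an unindexed range type by hand.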
vsbuffalo committed Feb 20, 2024
1 parent d3d2dbc commit 0474a07
Showing 15 changed files with 464 additions and 383 deletions.
71 changes: 32 additions & 39 deletions src/commands.rs
@@ -1,12 +1,12 @@
use std::path::PathBuf;

use crate::{
    granges::GRangesVariant,
    io::OutputFile,
    prelude::*,
    ranges::operations::adjust_range,
    reporting::{CommandOutput, Report},
    test_utilities::random_granges,
    traits::{IterableRangeContainer, TsvSerialize},
    PositionOffset,
};

@@ -69,44 +69,37 @@ pub fn granges_adjust(

    Ok(CommandOutput::new((), report))
}

/// Retain only the ranges that have at least one overlap with another set of ranges.
pub fn granges_filter(
    seqlens: &PathBuf,
    left_bedfile: &PathBuf,
    right_bedfile: &PathBuf,
    output: Option<&PathBuf>,
    sort: bool,
) -> Result<CommandOutput<()>, GRangesError> {
    let genome = read_seqlens(seqlens)?;

    // in memory (sorted queries not yet supported)
    let left_record_iter = BedlikeIterator::new(left_bedfile)?
        .into_variant()?;
    let right_record_iter = BedlikeIterator::new(right_bedfile)?
        .into_variant()?;

    let left_gr = GRanges::from_iter_variant(left_record_iter, &genome)?;
    let right_gr = GRanges::from_iter_variant(right_record_iter, &genome)?;

    todo!()
    //let right_gr = right_gr.to_coitrees();

    //let intersection = left_gr.filter_overlaps(&right_gr)?;

    //// output stream -- header is None for now (TODO)
    //let output_stream = output.map_or(OutputFile::new_stdout(None), |file| {
    //    OutputFile::new(file, None)
    //});
    //let mut writer = output_stream.writer()?;

    //// for reporting stuff to the user
    //let mut report = Report::new();

    //intersection.sort().to_tsv(output)?;

    //Ok(CommandOutput::new((), report))
}
//
// /// Retain only the ranges that have at least one overlap with another set of ranges.
// pub fn granges_filter<DL, DR>(
//     seqlens: &PathBuf,
//     left_granges: GRanges<VecRangesEmpty, DL>,
//     right_granges: GRanges<VecRangesEmpty, DR>,
//     sort: bool,
// ) -> Result<CommandOutput<()>, GRangesError>
// where
//     // we must be able to iterate over left ranges
//     VecRangesEmpty: IterableRangeContainer,
//     // we must be able to convert the right GRanges to interval trees
//     GRanges<VecRangesEmpty, ()>: GenomicRangesToIntervalTrees<()>,
// {
//     let right_granges = right_granges.to_coitrees()?;
//
//     // let intersection = left_granges.filter_overlaps_(&right_granges)?;
//
//     //// output stream -- header is None for now (TODO)
//     //let output_stream = output.map_or(OutputFile::new_stdout(None), |file| {
//     //    OutputFile::new(file, None)
//     //});
//     //let mut writer = output_stream.writer()?;
//
//     // for reporting stuff to the user
//     let mut report = Report::new();
//
//     //intersection.sort().to_tsv(output)?;
//
//     Ok(CommandOutput::new((), report))
// }

/// Generate a random BED-like file with genomic ranges.
pub fn granges_random_bed(
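
For reference, the operation that `granges_filter` is documented to perform (keep only the left ranges that overlap at least one right range) can be sketched with a brute-force scan. This is not the interval-tree approach the diff refers to via `to_coitrees()`; the `Range` struct, the half-open `[start, end)` convention, and both function names are assumptions for this example, and it ignores sequence names entirely.

```rust
/// Hypothetical range on a single sequence, half-open: [start, end).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Range {
    pub start: u32,
    pub end: u32,
}

/// Two half-open ranges overlap when each starts before the other ends.
pub fn overlaps(a: &Range, b: &Range) -> bool {
    a.start < b.end && b.start < a.end
}

/// Keep every left range that overlaps at least one right range.
/// This is O(n * m); an interval tree would make each query faster.
pub fn filter_overlaps(left: &[Range], right: &[Range]) -> Vec<Range> {
    left.iter()
        .filter(|l| right.iter().any(|r| overlaps(l, r)))
        .copied()
        .collect()
}
```

Under the half-open convention, ranges that merely touch (one ends where the other starts) do not count as overlapping.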
3 changes: 2 additions & 1 deletion src/data/mod.rs
@@ -4,4 +4,5 @@ use crate::traits::DataContainer;

pub mod vec;

-impl<U> DataContainer for Vec<U>{}
+impl<U> DataContainer for Vec<U> {}
+impl DataContainer for () {}
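
A minimal sketch of what the new `impl DataContainer for () {}` enables: once the unit type implements the marker trait, any generic code bounded by `T: DataContainer` also accepts the "no data" case. The trait is redeclared here so the example is self-contained, and `accepts_container()` is a hypothetical function, not part of the crate.

```rust
/// Marker trait, redeclared locally for this standalone example.
pub trait DataContainer {}

impl<U> DataContainer for Vec<U> {}
impl DataContainer for () {}

/// Hypothetical generic function bounded by the marker trait.
/// It compiles for `Vec<U>` and, thanks to the second impl, for `()`.
pub fn accepts_container<T: DataContainer>(data: T) -> T {
    data
}
```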
3 changes: 3 additions & 0 deletions src/error.rs
@@ -42,4 +42,7 @@ pub enum GRangesError {
    #[error("Invalid GRanges object: no data container.")]
    NoDataContainer,

    // Command line tool related errors
    #[error("Unsupported genomic range format")]
    UnsupportedGenomicRangesFileFormat,
}
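
One way the new variant might be used is sketched below. To stay self-contained, this example writes the `Display` implementation by hand instead of using the `#[error(...)]` derive shown in the diff, and redeclares a one-variant stand-in for `GRangesError`. The `check_format()` helper and its list of accepted extensions are assumptions for illustration only.

```rust
use std::fmt;
use std::path::Path;

/// One-variant stand-in for the crate's error enum.
#[derive(Debug, PartialEq)]
pub enum GRangesError {
    UnsupportedGenomicRangesFileFormat,
}

impl fmt::Display for GRangesError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Same message as the `#[error(...)]` attribute in the diff.
            GRangesError::UnsupportedGenomicRangesFileFormat => {
                write!(f, "Unsupported genomic range format")
            }
        }
    }
}

impl std::error::Error for GRangesError {}

/// Hypothetical helper: accept only extensions assumed to be supported.
pub fn check_format(path: &Path) -> Result<(), GRangesError> {
    match path.extension().and_then(|ext| ext.to_str()) {
        Some("bed") | Some("tsv") => Ok(()),
        _ => Err(GRangesError::UnsupportedGenomicRangesFileFormat),
    }
}
```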