Sample Spatial Scenes Only from TemporalMultiRasterSource #1996

jejjohnson · 2023-11-20T09:54:51Z

Hi,

I was wondering if there is a way to use the SlidingWindowGeoDataset or the RandomWindowGeoDataset on a TemporalMultiRasterSource object while being able to slice the spatial AND temporal AND channel dimensions?

Background.

I have a time series of images, e.g. TxHxWxC, which are all the same size and saved in separate folders. So the logical object for me seemed to be the TemporalMultiRasterSource. So some pseudocode would be:

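from typing import List
from rastervision.core.data import RasterioSource, TemporalMultiRasterSource

# get file names (.png, non-geo)
filenames: List[str] = ...
# create one raster source per time step
raster_sources = [RasterioSource(ifile, allow_streaming=True) for ifile in filenames]
# combine them into a single temporal raster source
rv_ds = TemporalMultiRasterSource(raster_sources=raster_sources)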

I would like to sample not just a window from the scene but also a window for the time step. So a chip could be T_chip x H_chip x W_chip x C. I can manually slice it without a problem in any dimension I want, which is nice. For example:

# slice [T, H, W, C]
rv_ds_sub: Float[Array, "1 256 256 1"] = rv_ds[1:2, :256, :256, :1]

However, I was wondering if there was a way I could use the WindowGeoDatasets to randomly sample windows in space & time? It seems that none of the arguments within the RegressionRandomWindowGeoDataset or the RegressionSlidingWindowGeoDataset offer this ability to select the limits for the temporal/channel domain.

Ideally, I am imagining something like

# create a scene
rv_scene = Scene(
    id="my_scene",
    raster_source=rv_ds,
)
# create a random window dataset
# (hypothetical: a 3-element out_size is what I would like to pass)
out_size = (1, 256, 256)
ds = RegressionRandomWindowGeoDataset(
    scene=rv_scene,
    out_size=out_size,
    size_lims=(256, 256 + 1),
    return_window=True,
)

However, this is not possible, because I think out_size is restricted to 2D windows. If that is indeed the case, is there a better way to do this?
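For reference, the manual slicing shown earlier can be randomized by hand. The following is a minimal sketch, not a Raster Vision API: `sample_spacetime_window` is a hypothetical helper that only draws random slice bounds for a (T, H, W) extent, and the example shape and chip size are made up.

```python
import random

def sample_spacetime_window(shape, chip, rng=random):
    """Draw a random window of size `chip` inside an extent of size `shape`.

    shape: (T, H, W) of the full time series.
    chip:  (T_chip, H_chip, W_chip) of the desired window.
    Returns one slice per dimension.
    """
    slices = []
    for full, size in zip(shape, chip):
        if size > full:
            raise ValueError(f"chip size {size} exceeds extent {full}")
        # randint is inclusive on both ends, so the window always fits
        start = rng.randint(0, full - size)
        slices.append(slice(start, start + size))
    return tuple(slices)

rng = random.Random(0)  # seeded only to make the sketch reproducible
t, y, x = sample_spacetime_window((10, 512, 512), (1, 256, 256), rng)
```

The resulting slices could then be used the same way as in the manual example, e.g. `rv_ds[t, y, x, :1]`, assuming the source accepts slice objects in every dimension as it does there.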

Answered by AdeelH

AdeelH · 2023-11-20T17:13:21Z

Maintainer

You are right: this is not currently possible. And I agree that it would be nice to have.

If you are only interested in sampling one timestamp at a time, one workaround would be to just treat each image as independent. That is, convert each raster source to a separate GeoDataset and then concatenate them using torch.utils.data.ConcatDataset. That way, when you randomly sample a chip from the concatenated dataset, it would return an (H, W, C) chip from a random location and a random timestamp. Would that work for you?
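To make the concatenation idea concrete, here is a small pure-Python sketch of how an index into a concatenated dataset resolves to one member dataset (i.e. one timestamp) and a local index within it. It mirrors the cumulative-size lookup that `torch.utils.data.ConcatDataset` is built around, but `locate` is a hypothetical helper for illustration, and the lengths are made-up values.

```python
from bisect import bisect_right
from itertools import accumulate

def locate(global_idx, lengths):
    """Map an index into the concatenation to (dataset_idx, local_idx)."""
    cumulative = list(accumulate(lengths))  # e.g. [100, 200, 300]
    if not 0 <= global_idx < cumulative[-1]:
        raise IndexError(global_idx)
    # first dataset whose cumulative size exceeds the global index
    dataset_idx = bisect_right(cumulative, global_idx)
    start = cumulative[dataset_idx - 1] if dataset_idx > 0 else 0
    return dataset_idx, global_idx - start

# three per-timestamp datasets with 100 windows each
lengths = [100, 100, 100]
print(locate(0, lengths))    # (0, 0)
print(locate(150, lengths))  # (1, 50)
print(locate(299, lengths))  # (2, 99)
```

With a shuffling sampler over the global indices, each draw therefore picks a random timestamp (the member dataset) and, through the random window dataset, a random location.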

1 reply

jejjohnson Nov 21, 2023
Author

Hi! Treating each image as independent is indeed a good idea, and I think it is what I want in the end. Thank you!

jejjohnson · 2023-11-21T04:44:25Z

Author

In the end, here is the pseudocode based on AdeelH's suggestion above.

from pathlib import Path
from typing import List

from torch.utils.data import ConcatDataset
from rastervision.core.data import RasterioSource, Scene
from rastervision.pytorch_learner import RegressionRandomWindowGeoDataset

# get a list of file names
data_filenames: List[str] = ...

# function to initialize a random window dataset for a single file
def init_random_window_dataset(
    file: str,
    window_size: int = 256,
    max_windows: int = 100,
    efficient_aoi_sampling: bool = True,
    **kwargs,
):

    # create the raster source for this file (one time step)
    rasterio_source = RasterioSource(file, allow_streaming=True)

    filename = Path(file).stem

    # create a scene with an ID derived from the file name
    scene = Scene(
        id=f"file_{filename}",
        raster_source=rasterio_source,
    )

    # WARNING
    # Make sure you set max_windows to a reasonable value.
    # Otherwise the dataset length becomes basically infinite, which causes
    # integer overflow problems when concatenating datasets.

    ds = RegressionRandomWindowGeoDataset(
        scene=scene,
        out_size=window_size,
        size_lims=(window_size, window_size + 1),
        max_windows=max_windows,
        efficient_aoi_sampling=efficient_aoi_sampling,
        **kwargs,
    )

    return ds

# create one dataset per file
datasets = [
    init_random_window_dataset(ifile, max_windows=100, efficient_aoi_sampling=True)
    for ifile in data_filenames
]

# concatenate all datasets
ds = ConcatDataset(datasets)
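The warning in the code above can be reproduced without Raster Vision. The sketch below assumes, as the warning suggests, that a random window dataset without `max_windows` reports a length close to the platform's maximum integer; `HugeDataset` and `Concat` are hypothetical stand-ins, not the real classes. Summing two such lengths exceeds what Python's `len()` can return.

```python
import sys

class HugeDataset:
    """Stand-in for a dataset whose length is effectively infinite (assumption)."""
    def __len__(self):
        return sys.maxsize

class Concat:
    """Stand-in for a concatenation whose length is the sum of its members."""
    def __init__(self, datasets):
        self.datasets = datasets

    def __len__(self):
        return sum(len(d) for d in self.datasets)

single = Concat([HugeDataset()])
double = Concat([HugeDataset(), HugeDataset()])

single_ok = len(single) == sys.maxsize  # still fits in an index-sized integer

try:
    len(double)  # 2 * sys.maxsize no longer fits
    overflowed = False
except OverflowError:
    overflowed = True
```

Setting `max_windows` to a finite value per file, as done above, keeps the summed length well within range.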
