Commit
Summary:
- Add a multithreaded memcpy for staging transfers in GPU table scan.
- Add variants of bit unpacking in GpuDecoder-inl.cuh. Make selective decoding templatized instead of switching at runtime.
- Extend GpuDecoderTest, for example by comparing calls via launchDecode (multi-function thread blocks) with calls via decodeGlobal (single-function thread blocks).
- Add a metric for the driver thread waiting for the first continuable stream.
- Check approximate correctness of Wave runtimeStats.
- Refactor QueryBenchmarkBase.* out of TpchBenchmark. Add logic to sweep across parameter combinations.
- Add a persistent file format to the Wave mock format.
- Add benchmarks for combinations of scan, filter, filter expression, projection, and aggregation with Wave and Dwrf.

Pull Request resolved: #10679
Differential Revision: D60880466
Pulled By: oerling
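Illustration of the first item: a minimal sketch of a multithreaded memcpy, assuming the staging copy can be split into independent contiguous chunks. The helper name `parallelMemcpy` and its parameters are hypothetical and not taken from this commit; the actual change may instead use an executor or a different chunking scheme.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <thread>
#include <vector>

// Hypothetical helper, not the code from this commit. Copies `size` bytes
// from `src` to `dst` by splitting the range into contiguous chunks and
// copying each chunk on its own thread. The ranges must not overlap.
void parallelMemcpy(
    void* dst,
    const void* src,
    size_t size,
    size_t numThreads) {
  numThreads = std::max<size_t>(1, numThreads);
  // Round up so that at most `numThreads` chunks cover the whole range.
  const size_t chunk = (size + numThreads - 1) / numThreads;
  std::vector<std::thread> workers;
  for (size_t offset = 0; offset < size; offset += chunk) {
    // The last chunk may be shorter than the others.
    const size_t bytes = std::min(chunk, size - offset);
    workers.emplace_back([=] {
      std::memcpy(
          static_cast<char*>(dst) + offset,
          static_cast<const char*>(src) + offset,
          bytes);
    });
  }
  // Wait for all chunks before the buffer is handed to the transfer.
  for (auto& worker : workers) {
    worker.join();
  }
}
```

Spawning a thread per chunk only pays off for large buffers; for small copies a plain `std::memcpy` is likely cheaper.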