Vectorization analysis returns wrong Vectorization Factor #3640

Open
jjsjann123 opened this issue Dec 24, 2024 · 0 comments

In vectorization analysis, we use MaxInfoSpanningTree to propagate the vectorization factor from the reference TV to the other TVs in the fusion. The issue is that, for a TV with multiple paths to the reference TV, only one of those paths is traversed.

For the graph below, where resize is involved, this approach runs into an issue.

tv0 is sliced twice, producing tv1 and tv2, which are then added together as tv3 (the reference TV).
One of the slices has an odd offset, tv2 = tv0[3:-5], which doesn't allow any vectorization.

But in our analysis, when we propagate the projected contiguous inner dimensions from tv3 to tv0:

  1. if we go through tv3 -> tv1 -> tv0, the resize with extents (-4, -4) gives us a vectorization factor of 4;
  2. if we go through tv3 -> tv2 -> tv0, the other resize with extents (-3, -5) doesn't allow any vectorization.

Because only one of these paths is traversed, the analysis can end up with a factor of 4 for tv0, while the test below expects 1.
```cpp
// Trivial slice
TEST_F(VectorizationAnalysisTest, ResizeFork) {
  Fusion fusion;
  FusionGuard fg(&fusion);
  std::vector<std::pair<TensorView*, int64_t>> expection_list;

  // concrete shapes to avoid dynamic Fusion
  auto tv0 = makeContigConcreteTensor({36});
  fusion.addInput(tv0);

  auto tv1 = slice(
      tv0,
      {{IrBuilder::create<Val>(4L),
        sub(tv0->axis(0)->extent(), IrBuilder::create<Val>(4L))}});
  auto tv2 = slice(
      tv0,
      {{IrBuilder::create<Val>(3L),
        sub(tv0->axis(0)->extent(), IrBuilder::create<Val>(5L))}});
  auto tv3 = add(tv1, tv2);
  fusion.addOutput(tv3);

  auto options = at::TensorOptions().dtype(at::kFloat).device(at::kCUDA, 0);
  auto t0 = at::randn({36}, options);
  std::vector<c10::IValue> aten_inputs({t0});

  std::unordered_map<TensorView*, Val*> projected_extent_map_from_producer =
      vectorize_helper::ContiguousInnerDimensionsMapper::map(
          tv3, tv3->getLogicalDomain())
          .getTvToContigMergeOfInnerSizeMap();
  checkMappedVal(projected_extent_map_from_producer, tv0, 1);
}
```
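
The arithmetic behind the two paths can be sketched in a few lines of standalone C++. This is a simplified model, not nvFuser's implementation: `gcdOf`, `largestPow2Divisor`, `pathFactor`, and `combinedFactor` are hypothetical helpers, and the cap of 4 assumes float elements with 16-byte vector accesses. The model also assumes a path's factor must divide the sliced extent and both trimmed amounts, and that a tensor reached by several paths must satisfy the most restrictive one.

```cpp
#include <cstdint>

// Hypothetical helper: greatest common divisor of two integers.
constexpr int64_t gcdOf(int64_t a, int64_t b) {
  while (b != 0) {
    int64_t t = a % b;
    a = b;
    b = t;
  }
  return a < 0 ? -a : a;
}

// Hypothetical helper: largest power of two that divides n,
// capped at max_factor.
constexpr int64_t largestPow2Divisor(int64_t n, int64_t max_factor) {
  int64_t f = 1;
  while (f * 2 <= max_factor && n % (f * 2) == 0) {
    f *= 2;
  }
  return f;
}

// Simplified model (assumption): the factor allowed along one path
// through a slice that trims `left` elements from the front and
// `right` elements from the back of a 1D tensor of size `extent`.
constexpr int64_t pathFactor(
    int64_t extent, int64_t left, int64_t right, int64_t max_factor) {
  int64_t sliced = extent - left - right;
  int64_t common = gcdOf(gcdOf(sliced, left), right);
  return largestPow2Divisor(common, max_factor);
}

// Assumption: a tensor reached through several paths can only use a
// factor that every path allows, i.e. the smaller power of two.
constexpr int64_t combinedFactor(int64_t a, int64_t b) {
  return a < b ? a : b;
}
```

With `extent = 36` and a cap of 4, `pathFactor(36, 4, 4, 4)` models tv3 -> tv1 -> tv0 and `pathFactor(36, 3, 5, 4)` models tv3 -> tv2 -> tv0. Combining the two with `combinedFactor` gives the value the test expects for tv0, whereas looking at the first path alone does not.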