With the resize scheduler (#3556), test_cat_symbolic and test_remove_empty_issue_2545 fail at deserialization. Here's the error output with test_cat_symbolic:
============================= test session starts ==============================
platform linux -- Python 3.10.15, pytest-8.3.3, pluggy-1.5.0 -- /opt/conda/pytorch/bin/python
cachedir: .pytest_cache
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase(PosixPath('/home/nmaruyama/nvfuser/debug2/.hypothesis/examples'))
rootdir: /home/nmaruyama/nvfuser/debug2
plugins: benchmark-4.0.0, hypothesis-6.115.5, typeguard-4.3.0
collecting ... collected 124 items / 123 deselected / 1 selected
tests/python/test_python_frontend.py::TestNvFuserFrontend::test_cat_symbolic
Exception For CPP Translation:
(A failure here suggests a mismatch in functionality between the original and cloned definitions.)
Does FusionDefinition supports segmentation? True
FAILED
=================================== FAILURES ===================================
____________________ TestNvFuserFrontend.test_cat_symbolic _____________________
reference_outputs = [tensor([[[-0.4806, 0.1690, -0.1517, ..., -2.1240, 0.3014, -0.7769],
[-0.0959, 0.4240, 0.0372, ..., -0.... device='cuda:0'), tensor([ -2.3808, 3.5931, 1.1129, ..., -10.6209, 13.8750, 17.9289],
device='cuda:0')]
fd =
def nvfuser_fusion_id0(fd : FusionDefinition) -> None :
S0 = fd.define_scalar(None, dtype=DataType.Double)
S1... T28 = fd.ops.sum(T27, dims=[0, 1], keepdim=False, dtype=DataType.Null)
fd.add_output(T27)
fd.add_output(T28)
inputs = [0.29730177875068026, 0.29730177875068026, 4, 64, 768, 4, ...]
supports_segmentation = True, device = None
def check_cpp_translation(
reference_outputs, fd, inputs, supports_segmentation, device=None
):
try:
torch.manual_seed(0)
# Clone
cloned_fd = FusionDefinition()
clone(fd, cloned_fd)
# Segment
if supports_segmentation:
> cloned_fd.segment(inputs)
tests/python/utils.py:268:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self =
def nvfuser_fusion_id1(fd : FusionDefinition) -> None :
S0 = fd.define_scalar(None, dtype=DataType.Double)
S1...T61 = fd.ops.sum(T60, dims=[0, 1], keepdim=False, dtype=DataType.Float)
fd.add_output(T60)
fd.add_output(T61)
inputs = [0.29730177875068026, 0.29730177875068026, 4, 64, 768, 4, ...]
def segment(self, inputs):
"""
Decompose this FusionDefinition into a sequence of segment
FusionDefinitions.
This function runs the nvfuser segmentation algorithm and translates the
segments into their corresponding FusionDefinitions.
Args:
inputs (List[Union[Tensor, Scalar]]): A list of inputs to fusion.
Returns:
List[FusionDefinition]: The FusionDefinitions corresponding to the
sub-fusion segments of this FusionDefinition.
"""
num_segments = self._setup_segmentation(inputs)
if num_segments == 1:
self._finalize_segmentation()
return []
# Track all segments for this FusionDefinition
self.segments = []
# Track map_segment_fid_to_original_fid for each segment
self.segment_index_space_maps = {}
# Track the last segment a value is used as an input
self.map_value_to_last_used_segment = {}
for idx in range(num_segments):
new_fd = FusionDefinition()
> map_segment_fid_to_original_fid = self._build_segment(new_fd, idx)
E RuntimeError: INTERNAL ASSERT FAILED at "/home/nmaruyama/nvfuser/debug2/csrc/python_frontend/fusion_state.cpp":145, please report a bug with repro script to NVFuser at https://github.com/NVIDIA/Fuser/issues.
E Detected exception while building Fusion Ir. The failing RecordFunctor is: T17 = fd.ops.cat([T1, T2, T15], dim=2, manual_padding=1)
E NvFuser error message is: INTERNAL ASSERT FAILED at "/home/nmaruyama/nvfuser/debug2/csrc/ops/alias.cpp":644, please report a bug with repro script to NVFuser at https://github.com/NVIDIA/Fuser/issues. Expected all inputs to be padded when manual_padding is True.
E Exception raised from cat at /home/nmaruyama/nvfuser/debug2/csrc/ops/alias.cpp:644 (most recent call first):
E frame #0: nvfuser::nvfCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x73 (0x7f1a56549573 in /home/nmaruyama/nvfuser/debug2/nvfuser/_C.cpython-310-x86_64-linux-gnu.so)
exec_nvfuser(..., supports_segmentation=False) is a temporary workaround (WAR).
In this issue, segmentation has separated a CatOp from its PadOp inputs. The PadOp has become a fusion input in the segmented fusion. Pre-segmentation passes rely on this assumption, so these segments do not run with them enabled.
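To make the failure mode concrete, here is a toy model in plain Python. It is not nvFuser code: the `Op` class and `check_cat_inputs_padded` are hypothetical names, and the check is only an assumed approximation of the "Expected all inputs to be padded when manual_padding is True" assertion from the traceback. The tensor names T1, T2, T15, and T17 come from the failing record; the pad inputs A, B, and C are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical stand-in for one fusion expression (not an nvFuser type)."""
    kind: str                 # "pad", "cat", ...
    output: str               # name of the produced tensor
    inputs: tuple = ()        # names of the consumed tensors
    manual_padding: bool = False


def check_cat_inputs_padded(segment):
    """Raise if a manually padded cat consumes a value that no pad in
    this segment produces (assumed approximation of the real check)."""
    producers = {op.output: op for op in segment}
    for op in segment:
        if op.kind != "cat" or not op.manual_padding:
            continue
        for name in op.inputs:
            producer = producers.get(name)  # None: `name` is a segment input
            if producer is None or producer.kind != "pad":
                raise RuntimeError(
                    f"Expected all inputs of {op.output} to be padded; "
                    f"{name} is not produced by a pad in this segment."
                )


# Unsegmented fusion: every cat input is produced by a pad.
full_fusion = [
    Op("pad", "T1", ("A",)),
    Op("pad", "T2", ("B",)),
    Op("pad", "T15", ("C",)),
    Op("cat", "T17", ("T1", "T2", "T15"), manual_padding=True),
]

# After segmentation, two of the pads live in an earlier segment, so T1 and T2
# arrive in the cat's segment as plain segment inputs.
cat_segment = [
    Op("pad", "T15", ("C",)),
    Op("cat", "T17", ("T1", "T2", "T15"), manual_padding=True),
]

check_cat_inputs_padded(full_fusion)  # passes silently
try:
    check_cat_inputs_padded(cat_segment)
except RuntimeError as err:
    print("segment rejected:", err)
```

In this sketch the whole fusion passes the check, while the segment holding the cat fails because its inputs no longer trace back to a pad within the same segment.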