Python segmentation failure with resize #3594

Open
naoyam opened this issue Dec 16, 2024 · 2 comments
Assignees: rdspring1
Labels: Python API (Issues related to the Python API), Segmentation (Issues related to nvFuser Segmentation)

Comments

@naoyam
Collaborator

naoyam commented Dec 16, 2024

With the resize scheduler (#3556), test_cat_symbolic and test_remove_empty_issue_2545 fail at deserialization. Here's the error output with test_cat_symbolic:

============================= test session starts ==============================
platform linux -- Python 3.10.15, pytest-8.3.3, pluggy-1.5.0 -- /opt/conda/pytorch/bin/python
cachedir: .pytest_cache
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase(PosixPath('/home/nmaruyama/nvfuser/debug2/.hypothesis/examples'))
rootdir: /home/nmaruyama/nvfuser/debug2
plugins: benchmark-4.0.0, hypothesis-6.115.5, typeguard-4.3.0
collecting ... collected 124 items / 123 deselected / 1 selected

tests/python/test_python_frontend.py::TestNvFuserFrontend::test_cat_symbolic
Exception For CPP Translation:
(A failure here suggests a mismatch in functionality between the original and cloned definitions.)
Does FusionDefinition supports segmentation?     True
FAILED

=================================== FAILURES ===================================
____________________ TestNvFuserFrontend.test_cat_symbolic _____________________

reference_outputs = [tensor([[[-0.4806,  0.1690, -0.1517,  ..., -2.1240,  0.3014, -0.7769],
         [-0.0959,  0.4240,  0.0372,  ..., -0....  device='cuda:0'), tensor([ -2.3808,   3.5931,   1.1129,  ..., -10.6209,  13.8750,  17.9289],
       device='cuda:0')]
fd =
def nvfuser_fusion_id0(fd : FusionDefinition) -> None :
    S0 = fd.define_scalar(None, dtype=DataType.Double)
    S1... T28 = fd.ops.sum(T27, dims=[0, 1], keepdim=False, dtype=DataType.Null)
    fd.add_output(T27)
    fd.add_output(T28)


inputs = [0.29730177875068026, 0.29730177875068026, 4, 64, 768, 4, ...]
supports_segmentation = True, device = None

    def check_cpp_translation(
        reference_outputs, fd, inputs, supports_segmentation, device=None
    ):
        try:
            torch.manual_seed(0)

            # Clone
            cloned_fd = FusionDefinition()
            clone(fd, cloned_fd)

            # Segment
            if supports_segmentation:
>               cloned_fd.segment(inputs)

tests/python/utils.py:268:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self =
def nvfuser_fusion_id1(fd : FusionDefinition) -> None :
    S0 = fd.define_scalar(None, dtype=DataType.Double)
    S1...T61 = fd.ops.sum(T60, dims=[0, 1], keepdim=False, dtype=DataType.Float)
    fd.add_output(T60)
    fd.add_output(T61)


inputs = [0.29730177875068026, 0.29730177875068026, 4, 64, 768, 4, ...]

    def segment(self, inputs):
        """
        Decompose this FusionDefinition into a sequence of segment
        FusionDefinitions.

        This function runs the nvfuser segmentation algorithm and translates the
        segments into their corresponding FusionDefinitions.

        Args:
            inputs (List[Union[Tensor, Scalar]]): A list of inputs to fusion.

        Returns:
            List[FusionDefinition]: The FusionDefinitions corresponding to the
            sub-fusion segments of this FusionDefinition.
        """
        num_segments = self._setup_segmentation(inputs)
        if num_segments == 1:
            self._finalize_segmentation()
            return []

        # Track all segments for this FusionDefinition
        self.segments = []

        # Track map_segment_fid_to_original_fid for each segment
        self.segment_index_space_maps = {}

        # Track the last segment a value is used as an input
        self.map_value_to_last_used_segment = {}

        for idx in range(num_segments):
            new_fd = FusionDefinition()
>           map_segment_fid_to_original_fid = self._build_segment(new_fd, idx)
E           RuntimeError:  INTERNAL ASSERT FAILED at "/home/nmaruyama/nvfuser/debug2/csrc/python_frontend/fusion_state.cpp":145, please report a bug with repro script to NVFuser at https://github.com/NVIDIA/Fuser/issues.
E           Detected exception while building Fusion Ir. The failing RecordFunctor is: T17 = fd.ops.cat([T1, T2, T15], dim=2, manual_padding=1)
E           NvFuser error message is:  INTERNAL ASSERT FAILED at "/home/nmaruyama/nvfuser/debug2/csrc/ops/alias.cpp":644, please report a bug with repro script to NVFuser at https://github.com/NVIDIA/Fuser/issues. Expected all inputs to be padded when manual_padding is True.
E           Exception raised from cat at /home/nmaruyama/nvfuser/debug2/csrc/ops/alias.cpp:644 (most recent call first):
E           frame #0: nvfuser::nvfCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x73 (0x7f1a56549573 in /home/nmaruyama/nvfuser/debug2/nvfuser/_C.cpython-310-x86_64-linux-gnu.so)

Calling exec_nvfuser(..., supports_segmentation=False) is a temporary workaround (WAR).

@rdspring1 rdspring1 self-assigned this Dec 16, 2024
@rdspring1 rdspring1 added Python API Issues related to the Python API Segmentation Issues related to nvFuser Segmentation labels Dec 16, 2024
@rdspring1
Collaborator

In this issue, segmentation has separated a CatOp from its PadOp inputs, so the PadOp outputs have become fusion inputs of the segment that contains the CatOp. Pre-segmentation passes rely on the assumption that the CatOp inputs are produced by PadOps, so these segments do not run with those passes enabled.
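
The failure mode described above can be sketched with a small toy model. This is not nvFuser code: `Val`, `cat`, and `as_segment_inputs` are hypothetical stand-ins invented for illustration, and only the error text and the tensor names T1, T2, T15 are taken from the log. It assumes the check amounts to "every cat input must be defined by a pad".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Val:
    """Toy tensor value (hypothetical, not an nvFuser class).

    `definition` names the op that produced the value; None means the
    value is a fusion input with no producer inside this fusion.
    """
    name: str
    definition: Optional[str] = None


def cat(inputs, manual_padding=False):
    """Toy cat: with manual_padding, every input must come from a pad."""
    if manual_padding and not all(v.definition == "pad" for v in inputs):
        raise RuntimeError(
            "Expected all inputs to be padded when manual_padding is True."
        )
    return Val("cat_out", definition="cat")


def as_segment_inputs(vals):
    """Toy segment boundary: producers are dropped, values become inputs."""
    return [Val(v.name, definition=None) for v in vals]


# Unsegmented fusion: cat sees pad-defined inputs, so the check passes.
padded = [Val(f"T{i}", definition="pad") for i in (1, 2, 15)]
whole = cat(padded, manual_padding=True)

# Segmented fusion: the pads live in an earlier segment, so the cat segment
# receives their outputs as plain inputs and the same check fails.
try:
    cat(as_segment_inputs(padded), manual_padding=True)
    segment_error = None
except RuntimeError as exc:
    segment_error = str(exc)
```

In this model the information that a value was padded is lost at the segment boundary, which is why the segment that contains only the cat cannot satisfy the assertion.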

@rdspring1
Collaborator

RemoveEmptyPass, MoveSplitCatPass, and MovePadPass are the failing pre-segmentation passes.

@rdspring1 rdspring1 changed the title Serde segmentation failure with resize Python segmentation failure with resize Dec 16, 2024