{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":280276446,"defaultBranch":"main","name":"kineto","ownerLogin":"pytorch","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2020-07-16T23:03:00.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/21003710?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1716334956.0","currentOid":""},"activityList":{"items":[{"before":"892a37e2db8fdf2d48c312eb8e683e4451b9e47a","after":"95e06b89a9851d151b2f03c92fab410ead4a0669","ref":"refs/heads/main","pushedAt":"2024-06-07T00:08:56.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Lower GPU User Annotation Margin (#944)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/944\n\nUsers have reported that some GPU events overlap on only a single side. This is partially because we add a margin to our GPU User Annotations so that they don't have the same exact start and end time as other events if there is only one event in a user annotation. We can lower the margin now that we don't have rounding issues.\n\nReviewed By: aaronenyeshi\n\nDifferential Revision: D58254807\n\nfbshipit-source-id: 6875fee26f57aba12cde9466d142c9dd62a67146","shortMessageHtmlLink":"Lower GPU User Annotation Margin (#944)"}},{"before":"40a94f81dbd571be1fbde201456a6027726cfb7a","after":"892a37e2db8fdf2d48c312eb8e683e4451b9e47a","ref":"refs/heads/main","pushedAt":"2024-06-03T08:19:23.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Fix broken test (#942)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/942\n\nConversion to percents lost in previous commit D57975306\n\nReviewed By: davidberard98\n\nDifferential Revision: D58067577\n\nfbshipit-source-id: 4845a0257841147316dc1ae3b8db97dc54b982e0","shortMessageHtmlLink":"Fix broken test (#942)"}},{"before":"c718859cddab2dcff3e08579c3e59e60e4101ef2","after":"40a94f81dbd571be1fbde201456a6027726cfb7a","ref":"refs/heads/main","pushedAt":"2024-05-31T19:28:16.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Fix division by zero (#941)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/941\n\n```\nW0530 09:23:50.932043 139856935265408 loader.py:102] Failed to parse profile data for Run train on 1716249387. Exception=float division by zero\nTraceback (most recent call last):\n File \"/mnt/xarfuse/uid-218352/391895d9-seed-nspid4026531836_cgpid68799117-ns-4026531841/torch_tb_profiler/profiler/loader.py\", line 93, in _process_data\n profile = generator.generate_run_profile()\n File \"/mnt/xarfuse/uid-218352/391895d9-seed-nspid4026531836_cgpid68799117-ns-4026531841/torch_tb_profiler/profiler/run_generator.py\", line 32, in generate_run_profile\n profile_run.overview = self._generate_overview()\n File \"/mnt/xarfuse/uid-218352/391895d9-seed-nspid4026531836_cgpid68799117-ns-4026531841/torch_tb_profiler/profiler/run_generator.py\", line 131, in _generate_overview\n build_part_time_str(costs.costs[ProfileRole.Kernel], 'Kernel'),\n File \"/mnt/xarfuse/uid-218352/391895d9-seed-nspid4026531836_cgpid68799117-ns-4026531841/torch_tb_profiler/profiler/run_generator.py\", line 87, in build_part_time_str\n percentage = round(100 * part_cost / costs.costs[ProfileRole.Total], 2)\n```\n\nReviewed By: sraikund16\n\nDifferential Revision: D57975306\n\nfbshipit-source-id: 89cd21feddecf4b91a03ab26b0392e0c65cbe569","shortMessageHtmlLink":"Fix division by zero (#941)"}},{"before":"be1317644c68b4bfc4646024a6b221066e430031","after":"c718859cddab2dcff3e08579c3e59e60e4101ef2","ref":"refs/heads/main","pushedAt":"2024-05-29T04:33:07.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add Join to ManifoldChromeTraceLogger (#938)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/938\n\nRequested in this post:\nhttps://fb.workplace.com/groups/ai.efficiency.tools.users/permalink/1777663889385748/\n\nManifold upload fail because resources get cleaned by parent thread while manifold is still being uploaded to. In order to fix this, we save any uploading thread as an instance variable. Then, during the destructor, join on this thread if possible.\n\nCurrently, the ChromeTraceLogger is declared locally so it will deconstruct once the save() function is exited. This will result in a join() happening right after the thread launch, which negates the use of multithreading. For this reason, we save the ChomeLogger as an instance variable of the MemLogger. That way, when the CuptiTraceActivity class exits, the MemLogger will exit and then the ChromeTraceLogger will exit thereby calling its destructor (which is much after the save() gets called).\n\nReviewed By: sanrise\n\nDifferential Revision: D57741999\n\nfbshipit-source-id: 15e80130d728ab88cf357f3748e8dc465cdb3778","shortMessageHtmlLink":"Add Join to ManifoldChromeTraceLogger (#938)"}},{"before":"b7b1fbe6037c7387306f8e99df48bd064b2ec196","after":"be1317644c68b4bfc4646024a6b221066e430031","ref":"refs/heads/main","pushedAt":"2024-05-24T22:15:14.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Fix DeviceProperties compile on AMD (#940)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/940\n\nOne of the headers is incorrect. Need to change to \n\nReviewed By: fenypatel99\n\nDifferential Revision: D57788137\n\nfbshipit-source-id: 1aef27c624fa85156ccec5bcfb58eaddbde824ea","shortMessageHtmlLink":"Fix DeviceProperties compile on AMD (#940)"}},{"before":"f29b01db62c03d2ad01838f78d7222a9adb2297c","after":"b7b1fbe6037c7387306f8e99df48bd064b2ec196","ref":"refs/heads/main","pushedAt":"2024-05-23T18:23:45.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Re-land with fix -> D55262038: Support CUPTI Range Profiler for H100\" for one test failure (#936)\n\nSummary: Pull Request resolved: https://github.com/pytorch/kineto/pull/936\n\nReviewed By: briancoutinho\n\nDifferential Revision: D57396669\n\nfbshipit-source-id: 4ab43b6cd2d1c54f80b8f247fd8e55e04a191eec","shortMessageHtmlLink":"Re-land with fix -> D55262038: Support CUPTI Range Profiler for H100\"…"}},{"before":"c37cac5625179a4aae994702e9b50f76aa80cca1","after":"f29b01db62c03d2ad01838f78d7222a9adb2297c","ref":"refs/heads/main","pushedAt":"2024-05-23T17:35:59.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Fix Non-CUDA Builds in Kineto (#937)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/937\n\nCurrently any Non-CUDA builds will fail to compile because the sources are not available for Device utils and Device Properties\n\nReviewed By: briancoutinho, fenypatel99\n\nDifferential Revision: D57694275\n\nfbshipit-source-id: 1fda4fbee2a53bcaf6789e80821514835a39cfb5","shortMessageHtmlLink":"Fix Non-CUDA Builds in Kineto (#937)"}},{"before":null,"after":"015a6fe5ca09488f0b87f1c24090c2e4658f3b01","ref":"refs/heads/export-D57396669","pushedAt":"2024-05-21T23:42:36.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Reland -> D55262038: Support CUPTI Range Profiler for H100\" for one test failure\n\nDifferential Revision: D57396669","shortMessageHtmlLink":"Reland -> D55262038: Support CUPTI Range Profiler for H100\" for one t…"}},{"before":"4d0641cb8331dbfccb8f896002300bb7b87a1018","after":"c37cac5625179a4aae994702e9b50f76aa80cca1","ref":"refs/heads/main","pushedAt":"2024-05-16T18:50:58.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Delayed Init for kineto for AMD\n\nSummary:\nIssue: Kineto static init doesn't happen automatically for AMD machines, and so we are unable to collect traces on-demand using dyno CLI. In libkineto_init, we initialize Profiler for NVIDIA (gated by HAS_CUPTI) and do a delayed initialize for MTIA (gated by isMTIA), but we don't initialize for AMD.\n\nAdding a delayed initialize for AMD as well.\n\nReviewed By: xw285cornell\n\nDifferential Revision: D57063901\n\nfbshipit-source-id: 76edc97ac4e0ec935d63a58359b1fee0179188b2","shortMessageHtmlLink":"Delayed Init for kineto for AMD"}},{"before":"618c6840f5f6e3007909adc96cb2f2d75b956a59","after":"4d0641cb8331dbfccb8f896002300bb7b87a1018","ref":"refs/heads/main","pushedAt":"2024-05-16T16:19:19.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add new activity type\n\nSummary: Add a new activity type for workloadd events in Kineto\n\nReviewed By: aaronenyeshi\n\nDifferential Revision: D57171609\n\nfbshipit-source-id: 1ea99bda3f957f12821d4280f0e873f9468b697c","shortMessageHtmlLink":"Add new activity type"}},{"before":"e96de1c92e3b2b019237684c00c8391fd96d1d36","after":"618c6840f5f6e3007909adc96cb2f2d75b956a59","ref":"refs/heads/main","pushedAt":"2024-05-15T18:41:06.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Quick fix mixed up function names and variables for isGpuAvailable (#934)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/934\n\nSome of the APIs got mixed up, and both of them are failing to return the right values.\n\nTest Plan: N/A\n\nDifferential Revision: D57392667\n\nPulled By: aaronenyeshi\n\nfbshipit-source-id: 6fa65af017a7a551da18945c1b1fc76c214e826d","shortMessageHtmlLink":"Quick fix mixed up function names and variables for isGpuAvailable (#934"}},{"before":"6503763fb0331b1a35b3da3a76ea97552397ed49","after":"e96de1c92e3b2b019237684c00c8391fd96d1d36","ref":"refs/heads/main","pushedAt":"2024-05-15T15:16:28.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Revert D55262038: Multisect successfully blamed \"D55262038: Support CUPTI Range Profiler for H100\" for one test failure\n\nSummary:\nThis diff reverts D55262038\nD55262038: Support CUPTI Range Profiler for H100 by pls331 causes the following test failure:\n\nTests affected:\n- [cogwheel:cogwheel_adranker_local_compare_test#main](https://www.internalfb.com/intern/test/281475018584027/)\n\nHere's the Multisect link:\nhttps://www.internalfb.com/multisect/5091629\nHere are the tasks that are relevant to this breakage:\n\nThe backout may land if someone accepts it.\n\nIf this diff has been generated in error, you can Commandeer and Abandon it.\n\nReviewed By: aaronenyeshi\n\nDifferential Revision: D57370679\n\nfbshipit-source-id: d10c0c84a57d3707ba6f0859756472c9eb7c09e4","shortMessageHtmlLink":"Revert D55262038: Multisect successfully blamed \"D55262038: Support C…"}},{"before":"ddec2046bea307a7e14dea3ca0c07be36e9b5676","after":"6503763fb0331b1a35b3da3a76ea97552397ed49","ref":"refs/heads/main","pushedAt":"2024-05-15T00:45:26.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Support CUPTI Range Profiler for H100 (#930)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/930\n\nPrevious errors when running on H100:\n\n1) //kineto/libkineto:cupti_range_profiler_test_prog failed with CUPTI_ERROR_NOT_INITIALIZED\n\n2) kineto/libkineto:kineto_playground failed with CUPTI_ERROR_MULTIPLE_SUBSCRIBERS_NOT_SUPPORTED\n\nReviewed By: briancoutinho\n\nDifferential Revision: D55262038\n\nfbshipit-source-id: 7c83207e8e3aae91a953b305e690b4067dc69c33","shortMessageHtmlLink":"Support CUPTI Range Profiler for H100 (#930)"}},{"before":"8982dd406e5bd9d7da3817ab9e9dbc8cfb99f87d","after":"ddec2046bea307a7e14dea3ca0c07be36e9b5676","ref":"refs/heads/main","pushedAt":"2024-05-14T22:42:13.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Split isGpuAvailable between AMD and CUDA (#933)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/933\n\nThis allows init.cpp to check which devices are available to help determine how to startup. Required for D57063901.\n\nTest Plan: CI\n\nDifferential Revision: D57343727\n\nPulled By: aaronenyeshi\n\nfbshipit-source-id: e008da9ae1c4f83e68791d94b06ededc9a61e4ab","shortMessageHtmlLink":"Split isGpuAvailable between AMD and CUDA (#933)"}},{"before":"9ebe54900e5f409ec15c71da0cb1df2fb854377c","after":"8982dd406e5bd9d7da3817ab9e9dbc8cfb99f87d","ref":"refs/heads/main","pushedAt":"2024-05-14T22:04:17.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Log RocTracer, HIP Driver and HIP Runtime versions to Scuba (#932)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/932\n\nLog the versions for roctracer, hip driver and hip runtime. Using same logger observer api as CUDA / CUPTI.\n\nTest Plan: CI\n\nReviewed By: houseroad\n\nDifferential Revision: D57349962\n\nPulled By: aaronenyeshi\n\nfbshipit-source-id: 8eaca35ee3daa130d1c85c19168e0e707a3ea19e","shortMessageHtmlLink":"Log RocTracer, HIP Driver and HIP Runtime versions to Scuba (#932)"}},{"before":"affe44cebfcf96b0913d127975723f3405f8763c","after":"9ebe54900e5f409ec15c71da0cb1df2fb854377c","ref":"refs/heads/main","pushedAt":"2024-05-14T21:01:40.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Rework CudaUtil into DeviceUtil to support AMD (#931)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/931\n\nRenamed CudaUtil to DeviceUtil and refactored the code to support AMD:\n- Combined CudaUtil, cupti_call and cuda_call into a single DeviceUtil.\n- Added functionality to print roctracer version, HIP driver version and HIP runtime version.\n- Updated isGpuAvailable check to support HIP.\n- Clean up TARGETS to remove file duplication and added DeviceUtil and DeviceProperties into a cpp_library, device_functions.\n\nTest Plan: CI\n\nReviewed By: xw285cornell\n\nDifferential Revision: D56943493\n\nPulled By: aaronenyeshi\n\nfbshipit-source-id: 805c2a32bdbe43a3b7def4c6484d1107dc2caa9b","shortMessageHtmlLink":"Rework CudaUtil into DeviceUtil to support AMD (#931)"}},{"before":"84d95b8232167674eee17c11c2198276f4a6482c","after":"affe44cebfcf96b0913d127975723f3405f8763c","ref":"refs/heads/main","pushedAt":"2024-05-14T19:05:44.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add deviceProperties for AMD devices to trace's metadata (#927)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/927\n\nSimilar to CUDA, save the device properties to the metadata of Roctracer GPU traces.\n- Renamed CudaDeviceProperties files to DeviceProperties, and changed TARGETS' cpp_library to device_properties\n- Added gpu agnostic definitions so we can re-use existing DeviceProperties functions to return a valid devicePropertiesJson() string for AMD. Return \"\" for any non-CUDA or non-AMD.\n- Used ifdef for a few device properties specific to CUPTI: regsPerMultiprocessor, sharedMemPerBlockOptin, sharedMemPerMultiprocessor.\n- Clean up the TARGETS, only need 1 library and include it.\n\nTest Plan: CI\n\nReviewed By: xw285cornell\n\nDifferential Revision: D56901553\n\nPulled By: aaronenyeshi\n\nfbshipit-source-id: ae87d14ab63a19abbbed836663f99e9ce6213a17","shortMessageHtmlLink":"Add deviceProperties for AMD devices to trace's metadata (#927)"}},{"before":"711cdcd8541429abe240e2619dd70d571c61829a","after":"84d95b8232167674eee17c11c2198276f4a6482c","ref":"refs/heads/main","pushedAt":"2024-05-10T20:47:20.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add Base Timestamp to Kineto Tracings (#928)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/928\n\nBecause we use relative timestamps in Kineto tracings, other plugins that use stitching, like GPUSnoop, have began to fail. In order for events to be stitched properly, they will need the basetime to get a relative timestamp off of. We output this in our chrome trace JSON as \"baseTimeNanoseconds\"\n\nReviewed By: aaronenyeshi\n\nDifferential Revision: D57186263\n\nfbshipit-source-id: f7628bebda7c40c74a558c125c81aa96312df0a1","shortMessageHtmlLink":"Add Base Timestamp to Kineto Tracings (#928)"}},{"before":"972a353358bd6878eacf1afdbcf46784812f4607","after":"711cdcd8541429abe240e2619dd70d571c61829a","ref":"refs/heads/main","pushedAt":"2024-05-10T19:29:08.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Remove ipcfabric folder which already imported from dynolog (#929)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/929\n\nRemove ipcfabric folder, which is not being used. In OSS, we are using ipcfabric that comes from the third_party/dynolog repo.\n\nTest Plan: CI\n\nReviewed By: sraikund16\n\nDifferential Revision: D57212412\n\nPulled By: aaronenyeshi\n\nfbshipit-source-id: 0225fcdc0f02545de916da6d5d05a9cd419f90c1","shortMessageHtmlLink":"Remove ipcfabric folder which already imported from dynolog (#929)"}},{"before":"3a81076cc97092666f319846f32f36b73ce2293e","after":"972a353358bd6878eacf1afdbcf46784812f4607","ref":"refs/heads/main","pushedAt":"2024-05-07T20:13:09.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Log CUPTI and CUDA Versions via LoggerCollector\n\nSummary: Add CUPTI Version, CUDA Runtime Version and CUDA Driver Version to the Logger Collector / Observer metadata. This helps us quickly check which versions are used by a tracing job. Also we need to move init of LoggerCollector before ActivityProfiler.\n\nReviewed By: briancoutinho, zhuhan0\n\nDifferential Revision: D57002323\n\nPulled By: aaronenyeshi\n\nfbshipit-source-id: 411bfa42d778f0ea0123600d2b96cdc2c8432a77","shortMessageHtmlLink":"Log CUPTI and CUDA Versions via LoggerCollector"}},{"before":"cc24537ac461f08597fab3192e59a3952719d7a2","after":"3a81076cc97092666f319846f32f36b73ce2293e","ref":"refs/heads/main","pushedAt":"2024-05-04T02:32:27.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Remove CUPTI Overhead track from AMD Traces (#924)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/924\n\nRemove CUPTI Overhead track from AMD Traces. We don't collect CUPTI overhead for Roctracer.\n\nTest Plan: CI and ran locally, see below.\n\nReviewed By: HugeEngine\n\nDifferential Revision: D56947649\n\nPulled By: aaronenyeshi\n\nfbshipit-source-id: a96d94b0763fab0c5d99d4b41d26a0d770e437f7","shortMessageHtmlLink":"Remove CUPTI Overhead track from AMD Traces (#924)"}},{"before":"d153c067decaea7cf22f1b4b86db16a48661ea43","after":"cc24537ac461f08597fab3192e59a3952719d7a2","ref":"refs/heads/main","pushedAt":"2024-05-03T00:57:27.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Delay logging for stream wait event till end to ensure filtering is (#908)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/908\n\nFixes https://github.com/pytorch/kineto/issues/904\n\nToday to avoid noisy \"even sync\" events we do some filtering such that if a kernel is not launched on a stream till now (in trace window) we do not emit these events. But this may be incorrect if a future kernel is present but we just did not see it yet in the events.\n\nFix: the idea is to not filter the CUDA event synchronizations on the fly. instead defer the logging of some of these events till all other events are processed. We then log these events only if a stream has some activity to avoid noise.\n\nReviewed By: aaronenyeshi\n\nDifferential Revision: D56198193\n\nfbshipit-source-id: 916715b063fff1d6496b719e2b85f8ca732d436b","shortMessageHtmlLink":"Delay logging for stream wait event till end to ensure filtering is (#…"}},{"before":"87ecc1626dbdf174a0f45dd51cf7921d00924618","after":"d153c067decaea7cf22f1b4b86db16a48661ea43","ref":"refs/heads/main","pushedAt":"2024-05-02T23:08:08.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Change Default Chrome Trace Unit to be in Milliseconds (#923)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/923\n\nWe have received comments stating that it would be easier to read tracings if they were back in millisecond format. For this reason, change the default back to milliseconds and if a user wants to read at nanosecond, they can just change the JSON manually.\n\nReviewed By: aaronenyeshi\n\nDifferential Revision: D56894110\n\nfbshipit-source-id: 73fad992f000f1d9a4bddb5e2f7776e6bb370e63","shortMessageHtmlLink":"Change Default Chrome Trace Unit to be in Milliseconds (#923)"}},{"before":"aac412e6e7be846062add5c84be0c3fb501c4378","after":"87ecc1626dbdf174a0f45dd51cf7921d00924618","ref":"refs/heads/main","pushedAt":"2024-05-02T15:02:27.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Improve roctracer activities metadata and names (#922)\n\nSummary:\nPull Request resolved: https://github.com/pytorch/kineto/pull/922\n\nImproving RoctracerActivity Metadata and Name:\n- Similar to CUDA, fix the naming Memcpy and Memset to include Kind, src, and dst\n- Improve metadata to properly include stream and kind\n- Also, added correlation, which is used by downstream analyzers to correlate GPU to CPU op.\n\nTest Plan:\nRan the diff on AMD sigrid predictor machine:\nBefore: https://www.internalfb.com/intern/perfdoctor/trace_view?filepath=tree/traces/dynocli/0/1714589410/localhost/libkineto_activities_1521592.json.gz&bucket=gpu_traces\nAfter: https://www.internalfb.com/intern/perfdoctor/trace_view?filepath=tree/traces/dynocli/0/1714598360/localhost/libkineto_activities_385338.json.gz&bucket=gpu_traces\n\n## Memcpy\n- Improved Memcpy names:\nBefore:\n {F1501692238}\nAfter:\n {F1501692699}\n\n- Improved Memcpy metadata:\nBefore:\n {F1501697406}\nAfter:\n {F1501698567}\n## Memset\n- Improved Memset names:\nBefore:\n {F1501696248}\nAfter:\n {F1501696391}\n- Improved Memset metadata:\nBefore:\n {F1501698081}\nAfter:\n {F1501697661}\n\nReviewed By: xw285cornell, houseroad\n\nDifferential Revision: D56848113\n\nPulled By: aaronenyeshi\n\nfbshipit-source-id: 8b9d6c7456035c69e78a9986a84fc7ff683fbec6","shortMessageHtmlLink":"Improve roctracer activities metadata and names (#922)"}},{"before":"09299a5e8138e707bf379ca44ada3fa4b47d322f","after":"aac412e6e7be846062add5c84be0c3fb501c4378","ref":"refs/heads/main","pushedAt":"2024-05-01T17:41:29.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Make Kineto Stage Log Level Lower\n\nSummary:\nUsers have complained that STAGE level logs are print by default. If a user is running many profiles it can certainly clutter STDOUT. Lower the STAGE level such that it is below the default. This will only affect OSS and internally, STAGE will still be outputted.\n\nBrought up by this issue https://github.com/pytorch/pytorch/issues/83383\n\nReviewed By: aaronenyeshi\n\nDifferential Revision: D56646009\n\nfbshipit-source-id: e8eb920c866af2e566be4c3bdc7c7d760be81e6c","shortMessageHtmlLink":"Make Kineto Stage Log Level Lower"}},{"before":"8bf50de29b08a3f51d9f1486268e43396faa0915","after":"09299a5e8138e707bf379ca44ada3fa4b47d322f","ref":"refs/heads/main","pushedAt":"2024-04-29T22:38:35.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Remove TMP_USE_TSC_AS_TIMESTAMP from OSS compilation\n\nSummary: We are taking a break on rolling out Kineto improvements until proper E2E and Unit testing is in. Lets roll back this change so we can update the kineto hash without breaking things.\n\nReviewed By: aaronenyeshi\n\nDifferential Revision: D56727137\n\nfbshipit-source-id: 311b5a64554f7e0fa6301571d2dab01842f9f963","shortMessageHtmlLink":"Remove TMP_USE_TSC_AS_TIMESTAMP from OSS compilation"}},{"before":"879cbd09083f033e1be7a75edea04b1dbbd8a723","after":"8bf50de29b08a3f51d9f1486268e43396faa0915","ref":"refs/heads/main","pushedAt":"2024-04-27T19:11:20.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Change relative time of 'ts' field from init time to the start of 3 month intervals\n\nSummary:\nUse the start of 3 month intervals as the base_time which all the timestamps will be relative to. This is better than using init time of the tracer because traces across ranks are starting at different starting points, and its very hard to correlate them especially when working on PP, FSDP, etc.\n\nThere are tools like Chrome Trace Viewer that uses double to represent\neach element in the timeline. Double has a 53 bit mantissa to support\nup to 2^53 significant digits (up to 9007199254740992). This holds at the\nnanosecond level, about 3 months and 12 days. So, let's round base time to\n3 months intervals, so we can still collect traces across ranks relative to\neach other. A month is 2629746, so for 3 months std::ratio is 7889238.\n\nReviewed By: sraikund16\n\nDifferential Revision: D56642236\n\nPulled By: aaronenyeshi\n\nfbshipit-source-id: cd66d85181036b75c50c73354215b96d468e9790","shortMessageHtmlLink":"Change relative time of 'ts' field from init time to the start of 3 m…"}},{"before":"117ab8c80a93b6536c6ff6bcd446b220940427a3","after":"879cbd09083f033e1be7a75edea04b1dbbd8a723","ref":"refs/heads/main","pushedAt":"2024-04-27T16:24:45.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Fix deprecated use of 0/NULL in kineto/libkineto/src/CuptiRangeProfilerApi.h + 1\n\nSummary:\n`nullptr` is typesafe. `0` and `NULL` are not. In the future, only `nullptr` will be allowed.\n\nThis diff helps us embrace the future _now_ in service of enabling `-Wzero-as-null-pointer-constant`.\n\nReviewed By: palmje\n\nDifferential Revision: D56650279\n\nfbshipit-source-id: 075e0bf23af90ebb97d90d7b0580ad52daa67f4c","shortMessageHtmlLink":"Fix deprecated use of 0/NULL in kineto/libkineto/src/CuptiRangeProfil…"}},{"before":"a317b861fd8a73a6d5b04e31fc05316db9836c1f","after":"117ab8c80a93b6536c6ff6bcd446b220940427a3","ref":"refs/heads/main","pushedAt":"2024-04-27T01:41:22.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Make ipcfabric have a single definition\n\nSummary:\nThere is an ODR one definition rule violation that was causing a crash on sigrid\nhttps://fb.workplace.com/groups/560979627394613/posts/2909061125919773/?comment_id=2909752389183980&reply_comment_id=2910102119149007\nSigrid includes both kineto and ipcfabric via dynolog, and on kineto the class is fused in internal. Also adding Logger.h caused a mess.\n\nTries to resolve this issue\nNote ENABLE_IPC_FABRIC define is only used in open source kineto (and PyTorch) but not in internal fbcode\n\nDifferential Revision: D56577485\n\nfbshipit-source-id: bc482e11f03d7ec360877a8f4a100fa5fc3234a3","shortMessageHtmlLink":"Make ipcfabric have a single definition"}},{"before":"83a23e270aee83b4eb8659723426faff1e531a2e","after":"a317b861fd8a73a6d5b04e31fc05316db9836c1f","ref":"refs/heads/main","pushedAt":"2024-04-27T01:32:00.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Fix deprecated use of 0/NULL in kineto/libkineto/src/CuptiCallbackApi.cpp + 1\n\nSummary:\n`nullptr` is typesafe. `0` and `NULL` are not. In the future, only `nullptr` will be allowed.\n\nThis diff helps us embrace the future _now_ in service of enabling `-Wzero-as-null-pointer-constant`.\n\nReviewed By: palmje\n\nDifferential Revision: D56650310\n\nfbshipit-source-id: ec371056d3a933e8b02fa6dec648de4003f7a2a4","shortMessageHtmlLink":"Fix deprecated use of 0/NULL in kineto/libkineto/src/CuptiCallbackApi…"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEXrb9zgA","startCursor":null,"endCursor":null}},"title":"Activity · pytorch/kineto"}