Is your feature request related to a problem? Please describe.
Since the Maxwell architecture, NVIDIA GPUs have included hardware counters that track traffic on both the incoming and outgoing PCIe links. Exposing these counters through the exporter would be very useful when monitoring GPUs in an AI/ML fleet.
Describe the solution you'd like
Update the exporter code to add the TX Throughput and RX Throughput fields obtained via the nvidia-smi tool. We probably need to check the GPU architecture first to avoid errors on pre-Maxwell GPUs.
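To make the request concrete, here is a minimal Python sketch of the parsing side. It is only an illustration under stated assumptions, not the exporter's actual code: the function name `parse_pcie_throughput` is hypothetical, the sample text is hand-written rather than captured from a real GPU, and I am assuming the fields appear as `Tx Throughput : <n> KB/s` / `Rx Throughput : <n> KB/s` in the full query output, with `N/A` on GPUs that do not support the counters. The exact format should be checked against the nvidia-smi documentation.

```python
import re

# Assumed line format (to be verified against real nvidia-smi output):
#   "Tx Throughput : 1234 KB/s"  or  "Tx Throughput : N/A"
_THROUGHPUT_RE = re.compile(r"^\s*(Tx|Rx) Throughput\s*:\s*(.+?)\s*$", re.MULTILINE)


def parse_pcie_throughput(query_output: str) -> dict:
    """Return {"tx": KB/s or None, "rx": KB/s or None} for a single-GPU report.

    A value of None means the field was missing or not numeric (for example
    "N/A"), so the caller can skip the metric instead of raising an error.
    """
    result = {"tx": None, "rx": None}
    for direction, raw in _THROUGHPUT_RE.findall(query_output):
        match = re.fullmatch(r"(\d+)\s*KB/s", raw)
        if match:
            result[direction.lower()] = int(match.group(1))
    return result


# Hypothetical, hand-written sample; not real tool output.
SAMPLE = """\
    PCI
        Tx Throughput                     : 2000 KB/s
        Rx Throughput                     : N/A
"""

print(parse_pcie_throughput(SAMPLE))
```

Treating a non-numeric value as "metric absent" is one way to handle pre-Maxwell GPUs without a separate architecture test, though an explicit check may still be preferable.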
Describe alternatives you've considered
There are no other solutions that are as clean as this. I don't see anyone wanting to write a second exporter just for those metrics, and adding more calls to nvidia-smi is probably not a wise move at the system level.
Additional context
The fields in question are discussed in the nvidia-smi documentation. Once this change is merged, we could update the Grafana dashboard to include counters and gauges for PCIe traffic.
Hi, thank you for the suggestion. Lately I don't find any time to maintain the project, and I don't think it's gonna change anytime soon. But a PR would be more than welcome, if you'd be interested.