[Performance] how to set the threads when using TRT EP #22913

noahzn · 2024-11-21T06:30:17Z

Describe the issue

I see multiple threads being created when running ONNX Runtime with the TensorRT execution provider (TRT EP). Is this normal behavior?

The documentation says:

Set number of intra-op threads
Onnxruntime sessions utilize multi-threading to parallelize computation inside each operator.

By default with intra_op_num_threads=0 or not set, each session will start with the main thread on the 1st core (not affinitized). Then extra threads per additional physical core are created, and affinitized to that core (1 or 2 logical processors).

I'm using TRT EP, although my providers list also includes CPUExecutionProvider and CUDAExecutionProvider. How can I set the number of threads for TRT EP? Thanks.
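
For context, here is a minimal sketch of how the session-level thread counts described in the quoted documentation are usually set from the Python API. The helper `make_session`, the `PROVIDERS` constant, and the model path are hypothetical names used only for illustration. The sketch assumes that `intra_op_num_threads` and `inter_op_num_threads` on `SessionOptions` govern ONNX Runtime's own CPU thread pools. Whether they also limit threads that TensorRT or CUDA create internally is not confirmed anywhere in this issue, and nothing here has been verified on JetPack 5.1.2 with onnxruntime-gpu 1.17.0.

```python
# Providers are listed in priority order: nodes TensorRT cannot take fall
# back to CUDA, and then to CPU.
PROVIDERS = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]


def make_session(model_path, intra_op=1, inter_op=1):
    """Create an InferenceSession with explicit thread counts (hypothetical helper).

    A value of 0 leaves the choice to ONNX Runtime, as the documentation
    quoted above describes for intra_op_num_threads.
    """
    # Imported lazily so this file can be loaded without onnxruntime installed.
    import onnxruntime as ort

    opts = ort.SessionOptions()
    # Threads used to parallelize work inside a single operator.
    opts.intra_op_num_threads = intra_op
    # Threads used to run independent operators concurrently; this only
    # matters when the session runs in parallel execution mode.
    opts.inter_op_num_threads = inter_op

    return ort.InferenceSession(model_path, sess_options=opts, providers=PROVIDERS)
```

A call such as `make_session("model.onnx", intra_op=1)` (placeholder path) would request a single intra-op thread. If extra threads still appear after that, they may come from the GPU runtime libraries rather than from ONNX Runtime's thread pool, but that is an assumption to check against the actual process.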

To reproduce

No code can be provided.

Urgency

No response

Platform

Other / Unknown

OS Version

JetPack=5.1.2

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

onnxruntime-gpu=1.17.0

ONNX Runtime API

Python

Architecture

ARM64

Execution Provider

TensorRT

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

No

noahzn added the "performance" label (issues related to performance regressions) on Nov 21, 2024

github-actions bot added the "platform:jetson" (issues related to the NVIDIA Jetson platform) and "ep:TensorRT" (issues related to TensorRT execution provider) labels on Nov 21, 2024
