CPU count #398

Open
jaber-the-great opened this issue Sep 23, 2023 · 4 comments

@jaber-the-great (Contributor) commented Sep 23, 2023

Are we going to dedicate one CPU core per queue? Or do we want one CPU core per pair of queues, each core handling one upstream and one downstream queue (which does not quite make sense)?
If every CPU core should handle only one TX queue, there is a bug in the implementation. It works correctly in OnAStick mode, but when we have two separate interfaces, instead of:
queuesAvailable = min(queuesAvailable, cpuCount)
we should use:
queuesAvailable = min(queuesAvailable, cpuCount // 2)
That's because the queuesAvailable value read from the number of TX queues corresponds to the queues of only one interface, not the sum of TX queues across both interfaces.
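The proposed cap can be sketched as follows (a minimal, illustrative sketch; the function and variable names are not LibreQoS.py's actual code):

```python
# Illustrative sketch of the proposed cap (not LibreQoS.py's actual code).
def queues_available(tx_queues_per_interface, cpu_count, on_a_stick):
    if on_a_stick:
        # Single interface: one CPU core per TX queue.
        return min(tx_queues_per_interface, cpu_count)
    # Two interfaces: 2 * n TX queues in total must share cpu_count
    # cores, so cap each interface at half the cores.
    return min(tx_queues_per_interface, cpu_count // 2)

print(queues_available(8, 8, on_a_stick=True))   # 8
print(queues_available(8, 8, on_a_stick=False))  # 4
```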

@rchac (Member) commented Sep 23, 2023

If you run

ls /sys/class/net/INTERFACE/queues/

you should get an equal number of TX queues and CPU cores, and of RX queues and CPU cores. RX and TX queue counts are almost always the same as the number of CPU cores unless the NIC is misconfigured. We just do the min() as a fail-safe in case of user misconfiguration of the NIC.
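The same count can be taken programmatically; this is a hedged sketch (not LibreQoS.py's actual code), with the sysfs root made a parameter so the function can be exercised against a test directory:

```python
import os

def count_tx_queues(interface, sysfs_root="/sys/class/net"):
    """Count the tx-* entries under <sysfs_root>/<interface>/queues/."""
    queue_dir = os.path.join(sysfs_root, interface, "queues")
    return sum(1 for entry in os.listdir(queue_dir)
               if entry.startswith("tx-"))
```

On a typically configured NIC this returns the same number as the CPU core count, matching the ls output above.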

If we implement what you are proposing, wouldn't we be limited to 4 queues on an 8-core system?

@jaber-the-great (Contributor, Author) commented:

I use the same command to get the number of TX queues for each interface. There are two issues here:

  1. By running the command you mentioned, we get the number of TX queues for one interface. The issue is that when we are not using OnAStick mode, we have two interfaces, so the total number of TX queues is twice the cpuCount (n TX queues for upstream and n TX queues for downstream shaping). Therefore, if we want to take the min, we should compare the number of TX queues of one interface against half of cpuCount.
    PS: Here I ignore the case where the number of TX queues differs between the interfaces (Issue Number of Available Queues #375).
  2. The other issue is that I use veth for traffic shaping and define the number of queues myself (using ethtool), so the CPU count and the number of TX queues are not the same.

@rchac (Member) commented Sep 24, 2023

By running the command you mentioned, we get the number of TX queues for one interface.

Correct

The issue is that when we are not using OnAStick mode, we have two interfaces, so the total number of TX queues is twice the cpuCount (n TX queues for upstream and n TX queues for downstream shaping).

OK, right, but we are using ls /sys/class/net/INTERFACE/queues/ in LibreQoS.py. Each CPU core gets an RX and a TX queue assigned by default. On an 8-core system, ls /sys/class/net/INTERFACE/queues/ will show 8 TX queues for interfaceA and 8 TX queues for interfaceB.

On line 479, queuesAvailable is calculated as queuesAvailable = min(InterfaceAQueuesAvailable, InterfaceBQueuesAvailable). So it's not the sum of TX queues; it's the minimum between the TX queues of each interface. I don't quite understand why we would use queuesAvailable = min(queuesAvailable, cpuCount // 2).

The two interfaces being used would be identical SFP+ ports of a dual-port NIC, so rx/tx queues would be the same between them. We do not currently support configurations where someone wants to use two different model NICs, and that would probably not be easy to officially support in the long-term.

On a standard two-interface, one-NIC config, if we were to use queuesAvailable = min(queuesAvailable, cpuCount // 2), you would only be able to utilize half of the available CPU cores for queuing.

Please correct me if I am misunderstanding your proposal. Thanks. 👍

@jaber-the-great (Contributor, Author) commented:

I think you can support configurations where someone uses two different NIC models (Issue #375 and PR #377).
Also, with:
queuesAvailable = min(InterfaceAQueuesAvailable, InterfaceBQueuesAvailable)
queuesAvailable = min(queuesAvailable, cpuCount)
on an 8-core system, interface A would have 8 TX queues, interface B would have 8 TX queues, and queuesAvailable = 8. We have a total of 16 TX queues that need to be managed but only 8 cores, which means every 2 TX queues would be handled by one CPU core. (In LibreQoS.py, we would iterate 8 times for interface A queues and 8 times for interface B queues to generate a tc command for each TX queue: 16 queues in total, as if there were 16 CPU cores!)
But if we use:
queuesAvailable = min(queuesAvailable, cpuCount // 2)
then queuesAvailable = 4, and we would iterate 4 times for interface A and 4 times for interface B (in total, 8 queues for 8 cores).
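The arithmetic above can be checked with a short sketch (illustrative variable names, assuming 8 cores and 8 TX queues per interface; not LibreQoS.py's actual code):

```python
# Counting total per-queue tc iterations on a two-interface, 8-core box
# (illustrative numbers only).
cpu_count = 8
tx_per_interface = 8  # min of interface A and B TX queue counts

q_current = min(tx_per_interface, cpu_count)        # 8
total_current = 2 * q_current                       # 16 queue setups, 8 cores

q_proposed = min(tx_per_interface, cpu_count // 2)  # 4
total_proposed = 2 * q_proposed                     # 8 queue setups, 8 cores

print(total_current, total_proposed)  # 16 8
```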

PS: This calculation would differ in OnAStick mode, or when the number of queues per interface is not the same.
If it does not make sense, I would be happy to have a short meeting and talk about it.
