Track SelectorGroupChat token usage with new Message type #4771

gziz · 2024-12-20T06:44:07Z

The changes below are not meant to be merged. They are here to open a discussion about how to correctly track token usage for calls that happen internally in some agents and teams, such as SelectorGroupChat.

The problem: how do we pass a message from SelectorGroupChat.select_speaker() to wherever token usage is tracked, so that the tokens used inside select_speaker are counted as well? I thought about publishing a new type of Message to the output_topic, one that tracks only token usage for now but could later carry other information that is not meant to be consumed by the agents.
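To make the idea concrete, here is a minimal sketch of what a usage-only message could look like. It does not use the real AutoGen classes: `RequestUsage`, `TextMessage`, and `UsageEvent` below are stand-in dataclasses with assumed fields, and `total_usage` is a hypothetical helper, not part of this PR.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class RequestUsage:
    # Stand-in for a per-request usage record (field names are assumptions).
    prompt_tokens: int = 0
    completion_tokens: int = 0


@dataclass
class TextMessage:
    # Stand-in for a regular chat message that agents consume.
    source: str
    content: str


@dataclass
class UsageEvent:
    # Hypothetical message that carries only usage and no content for agents.
    source: str
    usage: RequestUsage


def total_usage(messages: List[Union[TextMessage, UsageEvent]]) -> RequestUsage:
    # Sum the usage carried by UsageEvent messages; other messages are skipped.
    total = RequestUsage()
    for message in messages:
        if isinstance(message, UsageEvent):
            total.prompt_tokens += message.usage.prompt_tokens
            total.completion_tokens += message.usage.completion_tokens
    return total
```

With this shape, whatever consumes the output stream can tell usage events apart from chat messages with a single `isinstance` check.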

To consider

Some tests are failing!

The tests for SelectorGroupChat are failing because SelectorGroupChat.run(...) generates more messages than expected: BaseGroupChat now also yields the new UsageEvent messages. We need to yield them so that Console receives them and accounts for them.
Fix: We can make BaseGroupChat yield only messages that are not UsageEvent. These messages would still be considered when computing the token usage that gets returned with the TaskResult.
Problem with the above fix: Console would no longer receive the UsageEvent messages it needs for its internal token_usage tracking.
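The trade-off in the fix can be sketched as follows. This is an illustration under assumptions, not the BaseGroupChat code: `ChatMessage`, `UsageEvent`, `team_stream`, and `run_filtered` are hypothetical names. Usage events are counted but dropped from the visible output, which is exactly why a downstream consumer such as Console would never see them.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Tuple, Union


@dataclass
class ChatMessage:
    # Stand-in for a message that should reach the caller.
    source: str
    content: str


@dataclass
class UsageEvent:
    # Hypothetical usage-only event (assumed fields).
    source: str
    prompt_tokens: int
    completion_tokens: int


async def team_stream() -> AsyncIterator[Union[ChatMessage, UsageEvent]]:
    # Fake team output: two chat messages and one internal usage event.
    yield ChatMessage("user", "hi")
    yield UsageEvent("selector", 120, 4)
    yield ChatMessage("assistant", "hello")


async def run_filtered(
    stream: AsyncIterator[Union[ChatMessage, UsageEvent]],
) -> Tuple[List[ChatMessage], int]:
    # Keep chat messages for the caller; fold usage events into a total.
    visible: List[ChatMessage] = []
    total_tokens = 0
    async for item in stream:
        if isinstance(item, UsageEvent):
            total_tokens += item.prompt_tokens + item.completion_tokens
            continue  # Not passed on, so downstream consumers never see it.
        visible.append(item)
    return visible, total_tokens


if __name__ == "__main__":
    messages, tokens = asyncio.run(run_filtered(team_stream()))
    print(len(messages), tokens)
```

The message count seen by the tests goes back to what they expect, and the total still includes the selector's tokens, but only the component doing the filtering knows that total.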

Why are these changes needed?

Related issue number

#4719

Checks

I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
I've added tests (if relevant) corresponding to the changes introduced in this PR.
I've made sure all auto checks have passed.

husseinmozannar · 2024-12-22T00:42:17Z

I don't understand the need for the UsageEvent.

I have a suggestion. I find that the way Console tracks usage is only useful on the surface.

Two things it doesn't help with:

1. disambiguate usage by model_client (e.g., o1 vs gpt-4o)
2. disambiguate usage by agent

To address (1), RequestUsage should track model_id. To address (2), I believe we can track provenance of messages.

However, what I still don't like is that the dev of each agent/team has to track usage manually, which adds extra work and is error-prone.

Since model_client already tracks usage, I wish we could just automate this process. I can imagine a way to track the model_client of each agent in a runtime and count the usages appropriately; it could be a team method instead.
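As a rough illustration of the two breakdowns mentioned above, the sketch below groups usage by model and by agent. It assumes each usage record carries a `model_id` and the name of the agent that made the call; `UsageRecord` and `summarize` are hypothetical and do not exist in AutoGen.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class UsageRecord:
    # Hypothetical record: one model call, tagged with its origin.
    agent: str
    model_id: str
    prompt_tokens: int
    completion_tokens: int


def summarize(
    records: Iterable[UsageRecord],
) -> Tuple[Dict[str, int], Dict[str, int]]:
    # Return total tokens grouped by model_id and grouped by agent.
    by_model: Dict[str, int] = defaultdict(int)
    by_agent: Dict[str, int] = defaultdict(int)
    for record in records:
        tokens = record.prompt_tokens + record.completion_tokens
        by_model[record.model_id] += tokens
        by_agent[record.agent] += tokens
    return dict(by_model), dict(by_agent)
```

If the records were collected from each agent's model client, a team-level method could call something like `summarize` once at the end of a run instead of every agent doing its own bookkeeping.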

ekzhu · 2024-12-22T01:53:42Z

I don't think we need to introduce a new public-facing message type. We can have the selector group chat manager publish an internal event for selecting an agent, which includes the usage info. The base group chat doesn't need to output that event, but it can accumulate usage information from it.

For now the dev needs to do more; however, let's use the 80-20 rule and focus on the common user experience first.

I think grouping usage by model client is a good idea.

Track SelectorGroup select_speaker tokens with new Message type

d349011

gziz mentioned this pull request Dec 20, 2024

Total token usage and latency metrics should be reflected in TaskResult and Response #4719

Open
