Distributed Agent Runtime does not start #4784

kartikx · 2024-12-22T00:10:17Z

What happened?

I am trying to reproduce the distributed agent runtime example.

However, my program hangs on the first MyAgent.register call, and I never see the output shown in the example.

What did you expect to happen?

I expected to be able to reproduce the Distributed Agent Runtime example.

How can we reproduce it (as minimally and precisely as possible)?

Here is the exact code I am running:

from autogen_core import TRACE_LOGGER_NAME
import logging
import asyncio
from autogen_ext.runtimes.grpc import GrpcWorkerAgentRuntime
from dataclasses import dataclass

from autogen_core import DefaultTopicId, MessageContext, RoutedAgent, default_subscription, message_handler


@dataclass
class MyMessage:
    content: str


@default_subscription
class MyAgent(RoutedAgent):
    def __init__(self, name: str) -> None:
        super().__init__("My agent")
        self._name = name
        self._counter = 0

    @message_handler
    async def my_message_handler(self, message: MyMessage, ctx: MessageContext) -> None:
        self._counter += 1
        if self._counter > 5:
            return
        content = f"{self._name}: Hello x {self._counter}"
        print(content)
        await self.publish_message(MyMessage(content=content), DefaultTopicId())


logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(TRACE_LOGGER_NAME)
logger.setLevel(logging.DEBUG)


async def main():
    print("Starting the agents...")
    worker1 = GrpcWorkerAgentRuntime(host_address="localhost:50051")
    worker1.start()
    await MyAgent.register(worker1, "worker1", lambda: MyAgent("worker1"))
    print("Worker 1 started")

    worker2 = GrpcWorkerAgentRuntime(host_address="localhost:50051")
    worker2.start()
    await MyAgent.register(worker2, "worker2", lambda: MyAgent("worker2"))
    print("Worker 2 started")

    await worker2.publish_message(MyMessage(content="Hello!"), DefaultTopicId())

    # Let the agents run for a while.
    await asyncio.sleep(5)

asyncio.run(main())

The program hangs on the first MyAgent.register call and never prints the output shown in the example. Here are the logs:

Starting the agents...
INFO:autogen_core:Connecting to host: localhost:50051
INFO:autogen_core:Connecting to localhost:50051
INFO:autogen_core:Connection established
INFO:autogen_core:Send message to host: registerAgentTypeRequest {
  request_id: "1"
  type: "worker1"
}

INFO:autogen_core:Put message in send queue
INFO:autogen_core:Waiting for message from host
INFO:autogen_core:Starting read loop
INFO:autogen_core:Getting message from queue

The "Worker 1 started" message is never printed.

AutoGen version

0.4.0.dev11

Which package was this bug in

Core

Model used

No response

Python version

3.12

Operating system

Mac

Any additional info you think would be helpful for fixing this bug

No response


ekzhu · 2024-12-22T08:02:50Z

You need to start the worker runtime host in the background before the workers can connect.
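
For anyone hitting the same hang, here is a minimal sketch of that fix. It assumes the host class is GrpcWorkerAgentRuntimeHost from autogen_ext.runtimes.grpc, with a synchronous start() and an awaitable stop(), as the distributed runtime docs describe; check this against your installed version. The host_is_listening helper and the run_host wrapper are hypothetical names added for illustration and are not part of AutoGen. The helper only confirms that something accepts TCP connections on the address, not that it is an AutoGen host.

```python
import asyncio
import socket


def host_is_listening(address: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to 'host:port' succeeds within timeout.

    Hypothetical preflight helper (not an AutoGen API): it only shows that
    some process is accepting connections on the address.
    """
    host, _, port = address.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


async def run_host(address: str = "localhost:50051") -> None:
    """Run the runtime host until the task is cancelled (e.g. Ctrl+C)."""
    # Imported lazily so the helper above works without AutoGen installed.
    # Assumption: this class and its start()/stop() methods match your version.
    from autogen_ext.runtimes.grpc import GrpcWorkerAgentRuntimeHost

    host = GrpcWorkerAgentRuntimeHost(address=address)
    host.start()  # Starts the host service in the background.
    try:
        await asyncio.Event().wait()  # Block here until cancelled.
    finally:
        await host.stop()
```

One way to use it: run asyncio.run(run_host()) in a separate process or terminal first, then start the worker script. In the worker's main(), checking host_is_listening("localhost:50051") before creating worker1 turns the silent hang into an immediate, explicit failure. The check is useful because the worker log reports "Connection established" even though no host is running.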

github-actions bot added the needs-triage label Dec 22, 2024
