I can start a DHT on the EC2 instance with
python3 -m petals.cli.run_dht --host_maddrs /ip4/0.0.0.0/tcp/31337 --identity_path bootstrap1.id
On the EC2 instance I also installed with pip install . --no-cache-dir because it's a small instance, in case that has any impact.
The problem is connecting a server on my own computer to that DHT. Running
python -m petals.cli.run_server enoch/llama-65b-hf --initial_peers /ip4/IP_ADDRESS/tcp/31337/p2p/abc /ip4/127.0.0.1/tcp/31337/p2p/abc
from my computer fails with "hivemind.p2p.p2p_daemon_bindings.utils.P2PDaemonError: Daemon failed to start: 2023/12/08 23:53:41 failed to connect to bootstrap peers"
Everything works when I do this locally: I can set up a DHT on my own computer and run peers against it from new terminal tabs. But I can't get peers on my computer to connect to the DHT started on the EC2 instance.
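One thing worth checking in the command above is the second --initial_peers entry: 127.0.0.1 always refers to the machine running run_server, so from my computer it points at my own machine rather than the EC2 instance. The sketch below splits a multiaddr of the form used here and flags loopback hosts. The helper name inspect_initial_peer is made up for illustration and is not part of Petals or hivemind, and it only handles the /ip4/.../tcp/.../p2p/... layout shown in this issue.

```python
import ipaddress


def inspect_initial_peer(maddr: str) -> dict:
    """Split an /ip4/<addr>/tcp/<port>/p2p/<peer_id> multiaddr and flag loopback hosts.

    Hypothetical helper for illustration only; it does not use the Petals or hivemind APIs.
    """
    parts = maddr.strip("/").split("/")
    # Expect protocol/value pairs in this order: ip4, <addr>, tcp, <port>, p2p, <peer_id>
    if len(parts) != 6 or (parts[0], parts[2], parts[4]) != ("ip4", "tcp", "p2p"):
        raise ValueError(f"unexpected multiaddr layout: {maddr!r}")
    host = ipaddress.ip_address(parts[1])  # raises ValueError for placeholders like IP_ADDRESS
    port = int(parts[3])
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return {
        "host": str(host),
        "port": port,
        "peer_id": parts[5],
        # A loopback host is only reachable from the machine that runs the DHT itself.
        "is_loopback": host.is_loopback,
    }
```

Calling inspect_initial_peer("/ip4/127.0.0.1/tcp/31337/p2p/abc") reports is_loopback as True, which means that entry can only work when the server runs on the same machine as the DHT. The peer ID part (abc here is a placeholder) also has to match the ID of the DHT node started with bootstrap1.id.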
telnet EC2_IP 31337
Trying EC2_IP...
Connected to EC2_IP.
Escape character is '^]'.
/multistream/1.0.0
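The telnet session above shows that the port is reachable and that something answers with the /multistream/1.0.0 banner, which is the libp2p protocol negotiation header. The same basic reachability check can be scripted, for example to run it from WSL, PowerShell, and Colab and compare the results. This is a minimal sketch using only the standard library; the function name tcp_port_open is an illustrative choice, and a successful connect only shows that the TCP port is open, not that the libp2p handshake or peer ID check will pass.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For the setup in this issue the call would be tcp_port_open(EC2_IP, 31337), with EC2_IP replaced by the public address of the instance.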
setup.cfg
[metadata]
name = petals
version = attr: petals.__version__
author = Petals Developers
author_email = [email protected]
description = Easy way to efficiently run 100B+ language models without high-end GPUs
long_description = file: README.md
long_description_content_type = text/markdown
url = https://github.com/bigscience-workshop/petals
project_urls =
    Bug Tracker = https://github.com/bigscience-workshop/petals/issues
classifiers =
    Development Status :: 4 - Beta
    Intended Audience :: Developers
    Intended Audience :: Science/Research
    License :: OSI Approved :: MIT License
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.8
    Programming Language :: Python :: 3.9
    Programming Language :: Python :: 3.10
    Programming Language :: Python :: 3.11
    Topic :: Scientific/Engineering
    Topic :: Scientific/Engineering :: Mathematics
    Topic :: Scientific/Engineering :: Artificial Intelligence
    Topic :: Software Development
    Topic :: Software Development :: Libraries
    Topic :: Software Development :: Libraries :: Python Modules

[options]
package_dir =
    = src
packages = find:
python_requires = >=3.8
install_requires =
    torch>=1.12
    bitsandbytes==0.41.1
    accelerate>=0.22.0
    huggingface-hub>=0.11.1,<1.0.0
    tokenizers>=0.13.3
    transformers>=4.32.0,<4.35.0  # if you change this, please also change version assert in petals/__init__.py
    speedtest-cli==2.1.3
    pydantic>=1.10,<2.0  # 2.0 is incompatible with hivemind yet
    hivemind==1.1.10.post2
    tensor_parallel==1.0.23
    humanfriendly
    async-timeout>=4.0.2
    cpufeature>=0.2.0; platform_machine == "x86_64"
    packaging>=20.9
    sentencepiece>=0.1.99
    peft==0.5.0
    safetensors>=0.3.1
    Dijkstar>=2.6.0

[options.extras_require]
dev =
    pytest==6.2.5
    pytest-forked
    pytest-asyncio==0.16.0
    black==22.3.0
    isort==5.10.1
    psutil

[options.packages.find]
where = src
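Since the EC2 instance and my computer were installed separately, it may help to confirm that both ended up with the same versions of the packages pinned above (hivemind in particular, since the error comes from its P2P daemon). The sketch below prints the installed versions so the two machines can be compared. The function name installed_versions and the choice of packages to check are assumptions for illustration, not something Petals provides.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Running installed_versions(["petals", "hivemind", "pydantic", "transformers"]) on both machines and comparing the output would show whether the two environments differ.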
If I instead run
python -m petals.cli.run_server enoch/llama-65b-hf --port 31337 --public_ip EC2_IP_ADDRESS
I get:
"hivemind.p2p.p2p_daemon_bindings.utils.P2PDaemonError: Daemon failed to start: {"level":"info","ts":"2023-12-09T00:16:02.265-0500","logger":"dht/RtRefreshManager","caller":"rtrefresh/rt_refresh_manager.go:279","msg":"starting refreshing cpl 0 with key CIQAAABBP4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA (routing table size was 0)"}"
I also tried turning off both the public and private Windows Defender Firewall profiles before running it, but I get the same error.
Is there anything else I can try?
Could the EC2 instance size affect this, since it's a small one?
Edit / update:
I was able to connect from another EC2 instance, so I assume this is a Windows/WSL issue. Any recommendations on what I can do? What I find odd is that everything works locally: I can start a DHT and run nodes against it, and I was even able to run a peer connected to the public Petals network.
I am also unable to connect from Colab.
Here is the full error:
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/mnt/d/petals/petals/venv/lib/python3.10/site-packages/petals/cli/run_server.py", line 240, in <module>
    main()
  File "/mnt/d/petals/petals/venv/lib/python3.10/site-packages/petals/cli/run_server.py", line 224, in main
    server = Server(
  File "/mnt/d/petals/petals/venv/lib/python3.10/site-packages/petals/server/server.py", line 139, in __init__
    is_reachable = check_direct_reachability(initial_peers=initial_peers, use_relay=False, **kwargs)
  File "/mnt/d/petals/petals/venv/lib/python3.10/site-packages/petals/server/reachability.py", line 78, in check_direct_reachability
    return RemoteExpertWorker.run_coroutine(_check_direct_reachability())
  File "/mnt/d/petals/petals/venv/lib/python3.10/site-packages/hivemind/moe/client/remote_expert_worker.py", line 36, in run_coroutine
    return future if return_future else future.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/mnt/d/petals/petals/venv/lib/python3.10/site-packages/petals/server/reachability.py", line 59, in _check_direct_reachability
    target_dht = await DHTNode.create(client_mode=True, **kwargs)
  File "/mnt/d/petals/petals/venv/lib/python3.10/site-packages/hivemind/dht/node.py", line 192, in create
    p2p = await P2P.create(**kwargs)
  File "/mnt/d/petals/petals/venv/lib/python3.10/site-packages/hivemind/p2p/p2p_daemon.py", line 234, in create
    await asyncio.wait_for(ready, startup_timeout)
  File "/usr/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
    return fut.result()
hivemind.p2p.p2p_daemon_bindings.utils.P2PDaemonError: Daemon failed to start: 2023/12/09 17:57:33 failed to connect to bootstrap peers
My inbound rules:
SSH - 20 - 0.0.0.0/0
HTTP - 80 - 0.0.0.0/0
HTTPS - 443 - 0.0.0.0/0
Custom TCP - 31337 - 0.0.0.0/0
I even tried allowing All Traffic - 0.0.0.0/0, with the same result.
I'm using WSL Ubuntu 22.04.3 on Windows 11.
On powershell:
WSL ubuntu: