ucc_mem_map limitations #883
Comments
ping @manjugv @wfaderhold21
The memory regions mapped by UCC are used by collectives that rely on RDMA operations, such as PUT and GET. To ensure the collective executes and completes correctly, each buffer is checked against the mapped regions before an operation is issued. This check becomes expensive when there are many memory regions; in prior work, we found that it increases the latency of PUT/GET operations once more than 32 regions are in use. We therefore limited the number of regions to 32. This limit suits PGAS-style memory usage, such as OpenSHMEM's symmetric heap, better than MPI-style message passing, which would likely benefit more from the two-sided algorithms than from the one-sided RDMA algorithms.

As for question 2: because we are using RDMA operations, the mkeys for the memory regions must be exchanged via an allgather operation before they can be used. Doing this dynamically could be possible with API changes, but it would be expensive. By associating the memory regions with a context, we can combine the required allgather with the allgather already performed at context creation and hide some of the overhead.
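To illustrate why the per-operation check scales with the number of regions, here is a minimal sketch that assumes the lookup is a plain linear scan over the mapped segments. The type `mapped_seg_t` and the function `find_segment` are hypothetical names made up for this example; they are not UCC's internal data structures, and the real lookup may be implemented differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one mapped region; NOT a UCC type. */
typedef struct {
    uintptr_t base; /* start address of the mapped region */
    size_t    len;  /* length of the mapped region in bytes */
} mapped_seg_t;

/*
 * Return the index of the segment that fully contains [buf, buf + len),
 * or -1 if no mapped segment covers the buffer.
 * Assumption: a linear scan, so the cost grows with n_segs and is paid
 * before every PUT/GET that needs the check.
 */
static int find_segment(const mapped_seg_t *segs, size_t n_segs,
                        const void *buf, size_t len)
{
    uintptr_t start = (uintptr_t)buf;

    for (size_t i = 0; i < n_segs; i++) {
        uintptr_t seg_start = segs[i].base;

        /* Written so that neither the subtraction nor the bound overflows. */
        if (start >= seg_start && len <= segs[i].len &&
            start - seg_start <= segs[i].len - len) {
            return (int)i;
        }
    }
    return -1;
}
```

With a scan like this, a buffer that lives in the last region costs one comparison per mapped region, which is one plausible reason a small fixed cap keeps the latency of each operation bounded.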
@wfaderhold21 @manjugv I was thinking about 2 for some time.
To add to this: it also makes sense when integrating into NCCL, because NCCL has the notion of preregistering buffers.
Hi all,
I am trying to map memory regions using the `ucc_context_params.mem_params` config at the `ucc_context_create` stage. I encountered several issues here.
The segment count is limited by `#define MAX_NR_SEGMENTS 32`. That would be really small given that `ucc_mem_map_params.n_segments` is a `uint64`.
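The mismatch between the field width and the cap can be sketched as follows. This is only an illustration under stated assumptions: `seg_desc_t`, `map_params_t`, and `segments_fit` are hypothetical stand-ins written for this example, and only the 64-bit `n_segments` field and the value 32 are taken from the issue text.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NR_SEGMENTS 32 /* value quoted in the issue */

/* Hypothetical stand-in for one segment descriptor; NOT the UCC type. */
typedef struct {
    void  *address; /* assumed field name */
    size_t len;     /* assumed field name */
} seg_desc_t;

/* Hypothetical stand-in mirroring a params struct with a 64-bit count. */
typedef struct {
    seg_desc_t *segments;
    uint64_t    n_segments; /* can express far more than the cap allows */
} map_params_t;

/* Return 1 if the request fits under the compile-time cap, 0 otherwise. */
static int segments_fit(const map_params_t *p)
{
    return p->n_segments <= MAX_NR_SEGMENTS;
}
```

A caller can guard context creation with a check like this, so that a request for more than 32 segments is caught before it reaches the library.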