Conversation
…ong code reworked with additional options and odp mode
| return ret; | ||
| } | ||
|
|
||
| int comm_register_odp(comm_reg_t *creg) |
There was a problem hiding this comment.
The name of this API is not expressive enough.
You are effectively exposing all/most of the local process memory.
Explicit/implicit ODP support has limitations, see https://community.mellanox.com/docs/DOC-2898, e.g. have to check for capability.
|
|
||
| if (!*reg) { | ||
| DBG("registering implicit ODP MR\n"); | ||
| MP_CHECK(mp_register(NULL, 0, reg, IBV_EXP_ACCESS_ON_DEMAND)); |
There was a problem hiding this comment.
The documentations says:
"To register an Implicit ODP MR, in addition to the IBV_EXP_ACCESS_ON_DEMAND access flag, use in->addr = 0 and in->length = IBV_EXP_IMPLICIT_MR_SIZE."
so 0 is not a good size.
I would provide a high level API in libmp, one which does all the appropriate tests.
There was a problem hiding this comment.
I noticed that the size parameter is ignored by mp_register() when ODP is enabled. So technically this code is correct
There was a problem hiding this comment.
I wonder what happens if the app calls comm_register_odp() multiple times. Does ibv_reg_mr() succeed? Does it return the same mr / lkey ?
| static void usage() | ||
| { | ||
| printf("Options:\n"); | ||
| printf(" -g allocate GPU intead of CPU memory buffers\n"); |
| tot_iters = MAX_ITERS; | ||
| int c; | ||
|
|
||
| while (1) { |
| void mp_finalize(); | ||
|
|
||
| int mp_register(void *addr, size_t length, mp_reg_t *reg_t); | ||
| int mp_register(void *addr, size_t length, mp_reg_t *reg_t, uint64_t exp_flags); |
There was a problem hiding this comment.
non need for 64bit flags, so just use the int type as in other APIs.
There was a problem hiding this comment.
you are effectively changing both the API and the ABI of libmp, so we'll need to bump the major version.
There was a problem hiding this comment.
I would define here a MP_ flag to express the concept of "register all process memory", instead of forwarding exp_flags to ibv_exp_reg_mr()
|
|
||
| int mp_register(void *addr, size_t length, mp_reg_t *reg_) | ||
| int mp_register(void *addr, size_t length, mp_reg_t *reg_, uint64_t exp_flags) | ||
| { |
There was a problem hiding this comment.
as mentioned above, please change the type of exp_flags to int
and introduce a new enum { MP_REGISTER_FULL_VIRTUAL_ADDRESS_SPACE }; or similar
|
|
||
| if(exp_flags & IBV_EXP_ACCESS_ON_DEMAND) | ||
| { | ||
| dattr.comp_mask = IBV_EXP_DEVICE_ATTR_ODP | IBV_EXP_DEVICE_ATTR_EXP_CAP_FLAGS; |
There was a problem hiding this comment.
we should not query the caps all the time, but rather do that lazily once and cache the result
There was a problem hiding this comment.
it's ok to address this performance issue in a future fix
| mp_err_msg("ibv_reg_mr returned NULL for addr:%p size:%zu errno=%d(%s)\n", | ||
| addr, length, errno, strerror(errno)); | ||
|
|
||
| #ifdef DADO_DEBUG |
There was a problem hiding this comment.
feel free to remove that stale code
Libmp:
Comm: