Updated ABI generation code and new libraries #13280
base: main
Maybe you should somehow vendor the In short, I think it would be most convenient for everyone to use https://github.com/mpi-forum/mpi-abi-stubs as the "source of truth" for ABI-related definitions, avoiding manual synchronization of handle and constant values. |
Hello! The Git Commit Checker CI bot found a few problems with this PR:
f7d94fa: WIP: explain issue with pympistandard for callback...
4d79937: WIP: fix JSONs
ec9c45a: WIP: fix typo in pympistd arg
b047b19: WIP: add JSONs for ABI and API
c9d0c7a: WIP: bump pympistandard commit for profiling embig...
a1255ce: WIP: move Aint helper macros under ifndef OMPI_NO_...
342b072: WIP: add some workarounds for MPI_Fint and MPI_Inf...
53aeac5: WIP: mangle some more functions
93c0a39: WIP: avoid double inclusion of abi.h
afd9eb2: WIP: use pympistandard by editing PYTHONPATH (inst...
d397bfc: WIP: fix some bugs in mangling names
da2630f: WIP: fix typo for MPI_internal
956dded: WIP: add additional types and functions to be mang...
0b10ce6: WIP: temp fix for Aint problems
9830327: WIP: add input for abi.h.in
68c69df: WIP: move abi.h.in
702cb23: WIP: add in 5.0 apis.json
6b2784f: WIP: move code out of consts.py
22d4cb2: WIP: call c_header from Makefile
b0d0ff2: WIP: mangle names for internal usage
96a33c3: WIP: generate callback function prototypes
d2dd7ed: WIP: remove comment function
45d8789: WIP: print out embiggened versions of functions
b081aa2: WIP: add MPI and ABI versions
80ebfe7: WIP: generate API prototypes
93e6375: WIP: Comment out a Fortran-only category
db8a2b5: WIP: add comment pointing back to MPI standard
08dfb42: WIP: use enums for most int values
aab2023: WIP: create ABI header file from template with cat...
46fae8e: WIP: generate header with ABI values for #defines
c15a05d: WIP: remove abi.py
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
Hello! The Git Commit Checker CI bot found a few problems with this PR:
e86186b: WIP: add JSONs for ABI and API
24417ec: WIP: bump pympistandard commit for profiling embig...
718b1d0: WIP: move Aint helper macros under ifndef OMPI_NO_...
3431c8d: WIP: add some workarounds for MPI_Fint and MPI_Inf...
566fdaa: WIP: mangle some more functions
8396bef: WIP: avoid double inclusion of abi.h
39c20b7: WIP: use pympistandard by editing PYTHONPATH (inst...
d1aece4: WIP: fix some bugs in mangling names
c7c1809: WIP: fix typo for MPI_internal
2bbf9eb: WIP: add additional types and functions to be mang...
1db8082: WIP: temp fix for Aint problems
870e925: WIP: add input for abi.h.in
a56da85: WIP: move abi.h.in
4c2aee1: WIP: add in 5.0 apis.json
3d85943: WIP: move code out of consts.py
4e85726: WIP: call c_header from Makefile
8d8f554: WIP: mangle names for internal usage
5f29a48: WIP: generate callback function prototypes
c5d5f30: WIP: remove comment function
1613149: WIP: print out embiggened versions of functions
3f229b3: WIP: add MPI and ABI versions
584aeb7: WIP: generate API prototypes
7c73091: WIP: Comment out a Fortran-only category
72d2bff: WIP: add comment pointing back to MPI standard
c303345: WIP: use enums for most int values
7d79681: WIP: create ABI header file from template with cat...
99e9992: WIP: generate header with ABI values for #defines
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks! |
I wonder if we should make the bot not complain about unsigned commits on draft PRs. That would reduce some of the noise on PRs like this. |
Turns out that in commit 6bd36a7 we had a function that is not part of the MPI standard. This showed up while working on ABI support - which requires us to pay attention to the truth rather than make stuff up. This commit removes our made-up MPI_Session_set_info method. Turns out whoever was doing the Fortran bindings knew this wasn't a method in the standard, so there's no need to change the Fortran bindings. Same thing applies to the man pages.
Related to open-mpi#13280
Signed-off-by: Howard Pritchard <[email protected]>
@hppritcha There is some issue with out-of-source builds, i.e
|
interesting distcheck didn't check this. |
jenkins ci runs make distcheck |
The ABI mpi.h header is missing some MPI_T_XXX types. Also, the Status f08/c converters are declared, and they should not be. I made the following manual edits to the installed mpi.h header:
diff -up ./mpi.h.orig ./mpi.h
--- ./mpi.h.orig 2025-08-28 18:55:49.842968779 +0300
+++ ./mpi.h 2025-08-28 19:03:07.192957305 +0300
@@ -490,6 +490,13 @@ enum {
/* C preprocessor constants and Fortran parameters */
/* $CATEGORY:C_PREPROCESSOR_CONSTANTS_FORTRAN_PARAMETERS$ */
+typedef struct MPI_T_enum_t* MPI_T_enum;
+typedef struct MPI_T_cvar_handle_t* MPI_T_cvar_handle;
+typedef struct MPI_T_pvar_handle_t* MPI_T_pvar_handle;
+typedef struct MPI_T_pvar_session_t* MPI_T_pvar_session;
+typedef struct MPI_T_event_registration_t* MPI_T_event_registration;
+typedef struct MPI_T_event_instance_t* MPI_T_event_instance;
+
/* Handles used in the MPI tool information interface */
#define MPI_T_ENUM_NULL ((MPI_T_enum) 0)
#define MPI_T_CVAR_HANDLE_NULL ((MPI_T_cvar_handle) 0)
@@ -558,20 +565,20 @@ enum {
};
/* Source event ordering guarantees in the MPI tool information interface */
-enum {
+typedef enum MPI_T_source_order {
MPI_T_SOURCE_ORDERED = 1,
MPI_T_SOURCE_UNORDERED = 2,
-};
+} MPI_T_source_order;
/*
* Callback safety requirement levels used in the MPI tool information interface
*/
-enum {
+typedef enum MPI_T_cb_safety {
MPI_T_CB_REQUIRE_NONE = 0x00,
MPI_T_CB_REQUIRE_MPI_RESTRICTED = 0x03,
MPI_T_CB_REQUIRE_THREAD_SAFE = 0x0F,
MPI_T_CB_REQUIRE_ASYNC_SIGNAL_SAFE = 0x3F,
-};
+} MPI_T_cb_safety;
/* Callback functions */
@@ -1107,13 +1114,13 @@ int MPI_Ssend_init(const void* buf, int
int MPI_Ssend_init_c(const void* buf, MPI_Count count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request* request);
int MPI_Start(MPI_Request* request);
int MPI_Startall(int count, MPI_Request array_of_requests[]);
-/* int MPI_Status_c2f(const MPI_Status* c_status, MPI_Fint* f_status);
- */int MPI_Status_c2f08(const MPI_Status* c_status, MPI_F08_status* f08_status);
-int MPI_Status_f082c(const MPI_F08_status* f08_status, MPI_Status* c_status);
-/* int MPI_Status_f082f(const MPI_F08_status* f08_status, MPI_Fint* f_status);
- *//* int MPI_Status_f2c(const MPI_Fint* f_status, MPI_Status* c_status);
- *//* int MPI_Status_f2f08(const MPI_Fint* f_status, MPI_F08_status* f08_status);
- */int MPI_Status_get_error(const MPI_Status* status, int* err);
+// /* int MPI_Status_c2f(const MPI_Status* c_status, MPI_Fint* f_status);
+// */int MPI_Status_c2f08(const MPI_Status* c_status, MPI_F08_status* f08_status);
+// int MPI_Status_f082c(const MPI_F08_status* f08_status, MPI_Status* c_status);
+// /* int MPI_Status_f082f(const MPI_F08_status* f08_status, MPI_Fint* f_status);
+// *//* int MPI_Status_f2c(const MPI_Fint* f_status, MPI_Status* c_status);
+// *//* int MPI_Status_f2f08(const MPI_Fint* f_status, MPI_F08_status* f08_status);
+// */int MPI_Status_get_error(const MPI_Status* status, int* err);
int MPI_Status_get_source(const MPI_Status* status, int* source);
int MPI_Status_get_tag(const MPI_Status* status, int* tag);
int MPI_Status_set_cancelled(MPI_Status* status, int flag);
@@ -1799,13 +1806,13 @@ int PMPI_Ssend_init(const void* buf, int
int PMPI_Ssend_init_c(const void* buf, MPI_Count count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request* request);
int PMPI_Start(MPI_Request* request);
int PMPI_Startall(int count, MPI_Request array_of_requests[]);
-/* int PMPI_Status_c2f(const MPI_Status* c_status, MPI_Fint* f_status);
- */int PMPI_Status_c2f08(const MPI_Status* c_status, MPI_F08_status* f08_status);
-int PMPI_Status_f082c(const MPI_F08_status* f08_status, MPI_Status* c_status);
-/* int PMPI_Status_f082f(const MPI_F08_status* f08_status, MPI_Fint* f_status);
- *//* int PMPI_Status_f2c(const MPI_Fint* f_status, MPI_Status* c_status);
- *//* int PMPI_Status_f2f08(const MPI_Fint* f_status, MPI_F08_status* f08_status);
- */int PMPI_Status_get_error(const MPI_Status* status, int* err);
+// /* int PMPI_Status_c2f(const MPI_Status* c_status, MPI_Fint* f_status);
+// */int PMPI_Status_c2f08(const MPI_Status* c_status, MPI_F08_status* f08_status);
+// int PMPI_Status_f082c(const MPI_F08_status* f08_status, MPI_Status* c_status);
+// /* int PMPI_Status_f082f(const MPI_F08_status* f08_status, MPI_Fint* f_status);
+// *//* int PMPI_Status_f2c(const MPI_Fint* f_status, MPI_Status* c_status);
+// *//* int PMPI_Status_f2f08(const MPI_Fint* f_status, MPI_F08_status* f08_status);
+// */int PMPI_Status_get_error(const MPI_Status* status, int* err);
int PMPI_Status_get_source(const MPI_Status* status, int* source);
int PMPI_Status_get_tag(const MPI_Status* status, int* tag);
int PMPI_Status_set_cancelled(MPI_Status* status, int flag);
I'm able to compile (using
|
@hppritcha Actually, take a look at mpi-forum/mpi-abi-stubs#63 |
The installed
$ mpicc_abi helloworld.c
$ mpiexec -n 1 ./a.out
[optiplex:00000] *** An error occurred in MPI_Init_thread
[optiplex:00000] *** reported by process [3633774593,0]
[optiplex:00000] *** on a NULL communicator
[optiplex:00000] *** MPI_ERR_ARG: invalid argument of some other kind
[optiplex:00000] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[optiplex:00000] *** and MPI will try to terminate your MPI job as well)
--------------------------------------------------------------------------
prterun has exited due to process rank 0 with PID 0 on node optiplex calling
"abort". This may have caused other processes in the application to be
terminated by signals sent by prterun (as reported here).
-------------------------------------------------------------------------- |
@hppritcha The following definitions in the generated
/* Predefined functions */
#define MPI_COMM_NULL_COPY_FN ((MPI_Comm_copy_attr_function) 0)
#define MPI_COMM_DUP_FN ((MPI_Comm_copy_attr_function) 1)
#define MPI_COMM_NULL_DELETE_FN ((MPI_Comm_delete_attr_function) 0)
#define MPI_WIN_NULL_COPY_FN ((MPI_Win_copy_attr_function) 0)
#define MPI_WIN_DUP_FN ((MPI_Win_copy_attr_function) 1)
#define MPI_WIN_NULL_DELETE_FN ((MPI_Win_delete_attr_function) 0)
#define MPI_TYPE_NULL_COPY_FN ((MPI_Type_copy_attr_function) 0)
#define MPI_TYPE_DUP_FN ((MPI_Type_copy_attr_function) 1)
#define MPI_TYPE_NULL_DELETE_FN ((MPI_Type_delete_attr_function) 0)
#define MPI_CONVERSION_FN_NULL ((MPI_Datarep_conversion_function) 0)
#define MPI_CONVERSION_FN_NULL_C ((MPI_Datarep_conversion_function_c) 0)
/* Deprecated predefined functions */
#define MPI_NULL_COPY_FN ((MPI_Copy_function) 0)
#define MPI_DUP_FN ((MPI_Copy_function) 1)
#define MPI_NULL_DELETE_FN ((MPI_Delete_function) 0) |
PS: You may have a similar problems in some |
ompi/mpi/c/abi_converters.h
Outdated
__opal_attribute_always_inline__ static inline int ompi_convert_abi_ts_level_intern_ts_level(int ts_level)
{
    if (MPI_THREAD_SINGLE_ABI_INTERNAL == ts_level) {
Wouldn't a switch statement be better? It may ultimately be a matter of taste (optimizing compilers may generate similar binary code). Just a mild observation.
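For comparison, a switch-based converter could look roughly like the sketch below. Only `MPI_THREAD_SINGLE_ABI_INTERNAL` appears in the snippet under review; the other `_ABI_INTERNAL` names, the `INTERN_*` names, all numeric values, the function name `sketch_convert_ts_level`, and the `-1` error convention are hypothetical placeholders, not what Open MPI or the ABI header actually define.

```c
/* Hypothetical ABI-side values, mangled so they can coexist with the
 * implementation's own constants in one translation unit. */
enum {
    MPI_THREAD_SINGLE_ABI_INTERNAL     = 100,
    MPI_THREAD_FUNNELED_ABI_INTERNAL   = 101,
    MPI_THREAD_SERIALIZED_ABI_INTERNAL = 102,
    MPI_THREAD_MULTIPLE_ABI_INTERNAL   = 103
};

/* Hypothetical implementation-side values. */
enum {
    INTERN_THREAD_SINGLE     = 0,
    INTERN_THREAD_FUNNELED   = 1,
    INTERN_THREAD_SERIALIZED = 2,
    INTERN_THREAD_MULTIPLE   = 3
};

/* Map an ABI thread-support level to the internal one.
 * Unknown inputs return -1 (an assumed convention) so the caller
 * can report an argument error. */
static inline int sketch_convert_ts_level(int ts_level)
{
    switch (ts_level) {
    case MPI_THREAD_SINGLE_ABI_INTERNAL:     return INTERN_THREAD_SINGLE;
    case MPI_THREAD_FUNNELED_ABI_INTERNAL:   return INTERN_THREAD_FUNNELED;
    case MPI_THREAD_SERIALIZED_ABI_INTERNAL: return INTERN_THREAD_SERIALIZED;
    case MPI_THREAD_MULTIPLE_ABI_INTERNAL:   return INTERN_THREAD_MULTIPLE;
    default:                                 return -1;
    }
}
```

One practical difference from an if/else chain is that the `default` branch makes the handling of unexpected values explicit in a single place.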
@hppritcha What's your plan for stuff introduced in MPI 4.1 and MPI 5.0? The generated mpi.h header says This is a list of what I managed to detect as missing from mpi4py's configuration machinery for missing MPI stuff.
|
Current status regarding mpi4py:
|
Thanks for checking this stuff out in its early state @dalcinl. some of the above functions have been implemented but are sitting in various states in PRs. some of these should be defined - like We still need to add some plumbing in the ompi internals for callbacks. that will come in as part of this PR. The NERSC folks would really like ABI working for Doudna so to the extent there's a "plan" it would be nice to get this working sooner than later. |
thanks. hopefully we can get most (all?) of the failures cleaned up within the next week or so. |
also see about fixing an issue with some of the ompi mca components when using --enable-mca-dso. Signed-off-by: Howard Pritchard <[email protected]>
Signed-off-by: Howard Pritchard <[email protected]>
The enum for MPI_T_source_order was not correct. Found while working on the ABI effort open-mpi#13280 Signed-off-by: Howard Pritchard <[email protected]>
The MPI include file is called |
the mpicc_abi compiler wrapper uses the include file in (INSTALL_OMPI_HERE)/include/standard_abi/mpi.h
abi.h is ABSOLUTELY NOT FOR USE BY APPLICATIONS!!! It's for the internal top level ompi 'c' files (well at some point maybe the top level fortran files too) to use that have to know about
That's the reason for all the _ABI_INTERNAL symbol mangling. So, for example, in abi.h there's a
Applications are mono-lingual and should only be using the mpi.h that gets installed in standard_abi/mpi.h if they want to use the ABI. If they want to stick with speaking ompi mpi, they use the mpicc compiler wrapper and get the (INSTALL_OMPI_HERE)/include/mpi.h
Chapter 20 of the MPI 5.0 standard mandates that we use the mpi.h for applications being built against the ABI. |
Thanks for the explanation. I was wondering about the |
to start using the ompi-tests/ibm had to complete adding MPI_T_ to the abi. Also some other fixes. Attributes still need more work. Signed-off-by: Howard Pritchard <[email protected]>
    MPI_T_SOURCE_UNORDERED,
};

typedef enum ompi_mpi_t_source_order_t MPI_T_source_order;
Why isn't this typedef "merged" with the enum declaration, just like the previous MPI_T_cb_safety?
typedef enum ompi_mpi_t_source_order_t {
MPI_T_SOURCE_ORDERED,
MPI_T_SOURCE_UNORDERED,
} MPI_T_source_order;
i don't know. we can fix that in an external much smaller PR - sort of like #13424
many changes. go back to generating abi_converters.h fix up tools for abi case, etc. Signed-off-by: Howard Pritchard <[email protected]>
Signed-off-by: Howard Pritchard <[email protected]>
Signed-off-by: Howard Pritchard <[email protected]>
I'm getting some errors like the following from GCC 15.
event_get_source_abi_generated.c: In function 'PMPI_T_event_get_source':
event_get_source_abi_generated.c:45:43: error: passing argument 1 of 'ompi_abi_event_get_source' from incompatible pointer type [-Wincompatible-pointer-types]
45 | ret_value = ompi_abi_event_get_source(event, source_index);
...
event_read_abi_generated.c: In function 'PMPI_T_event_read':
event_read_abi_generated.c:45:37: error: passing argument 1 of 'ompi_abi_event_read' from incompatible pointer type [-Wincompatible-pointer-types]
45 | ret_value = ompi_abi_event_read(event, element_index, buffer);
...
event_copy_abi_generated.c: In function 'PMPI_T_event_copy':
event_copy_abi_generated.c:46:37: error: passing argument 1 of 'ompi_abi_event_copy' from incompatible pointer type [-Wincompatible-pointer-types]
46 | ret_value = ompi_abi_event_copy(event, buffer);
...
|
yes that needs a tweak in the python. |
and add in a file that wasn't in the makefile! Signed-off-by: Howard Pritchard <[email protected]>
needed to get a lot of intercommunicator collectives to pass. Signed-off-by: Howard Pritchard <[email protected]>
Looking into this
RESERVED_SOURCE = [
    'MPI_ANY_SOURCE',
]
you may be missing
On a related note, the
RESERVED_DEST = [
    'MPI_PROC_NULL',
]
PS: I'm making some wild guesses about the implementation that I have not double-checked, so feel free to ignore my observations; I may have got it really wrong. |
Signed-off-by: Howard Pritchard <[email protected]>
i added MPI_PROC_NULL to the RESERVED_SOURCE. We'll need a new type for dest. thanks for noting these. |
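The source/destination distinction raised here can be sketched as follows. An ordinary non-negative rank passes through unchanged, while a small set of reserved values has to be translated, and that set differs by argument kind: a wildcard such as `MPI_ANY_SOURCE` is meaningful only on the receive side, whereas `MPI_PROC_NULL` is valid for both. Every name and numeric value below is a hypothetical placeholder chosen for illustration, not taken from the PR or from either header.

```c
/* Hypothetical reserved values as seen through the ABI header... */
enum { ABI_ANY_SOURCE = -1, ABI_PROC_NULL = -2 };
/* ...and as used inside the implementation. */
enum { INTERN_ANY_SOURCE = -101, INTERN_PROC_NULL = -102 };

/* Assumed sentinel meaning "not a valid rank argument". */
enum { RANK_INVALID = -999 };

/* Source arguments accept both reserved values. */
static int convert_source(int rank)
{
    if (rank >= 0) {
        return rank;               /* ordinary rank: no translation */
    }
    switch (rank) {
    case ABI_ANY_SOURCE: return INTERN_ANY_SOURCE;
    case ABI_PROC_NULL:  return INTERN_PROC_NULL;
    default:             return RANK_INVALID;
    }
}

/* Destination arguments accept only the null process. */
static int convert_dest(int rank)
{
    if (rank >= 0) {
        return rank;
    }
    return (ABI_PROC_NULL == rank) ? INTERN_PROC_NULL : RANK_INVALID;
}
```

Keeping two converters, rather than one shared list of reserved values, is what lets a wildcard passed as a destination be rejected instead of silently translated.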
fixes problems with dist graph constructors and cart_shift output. Signed-off-by: Howard Pritchard <[email protected]>
oh for inquiring minds, it looks like several opal params need to be adjusted if building libmpi_abi.so. It looks like these can all be controlled with configure options, e.g.:
I'm thinking that we'll switch back to not building libmpi_abi by default and then, if it's selected, tweak the needed opal params "under the hood" during configure. I'll leave the abi building by default for now as it helps CI catch problems. |
When I made the ABI, I specifically looked at Open-MPI and MPICH to see what their values were and the ABI always chose a value at least as large as the larger of the two, so that the only thing an implementation would need to do to support the ABI was zero-pad the outputs. Are you saying that zero-padding is a deal-breaker for supporting the ABI by default?
/* Maximum Sizes for Strings */
#define MPI_MAX_DATAREP_STRING 128 /* MPICH: 128 - OMPI: 128 */
#define MPI_MAX_ERROR_STRING 512 /* MPICH: 512 - OMPI: 256 */
#define MPI_MAX_INFO_KEY 256 /* MPICH: 255 - OMPI: 36 */
#define MPI_MAX_INFO_VAL 1024 /* MPICH: 1024 - OMPI: 256 */
#define MPI_MAX_LIBRARY_VERSION_STRING 8192 /* MPICH: 8192 - OMPI: 256 */
#define MPI_MAX_OBJECT_NAME 128 /* MPICH: 128 - OMPI: 64 */
#define MPI_MAX_PORT_NAME 1024 /* MPICH: 256 - OMPI: 1024 */
#define MPI_MAX_PROCESSOR_NAME 256 /* MPICH: 128 - OMPI: 256 */
#define MPI_MAX_STRINGTAG_LEN 1024 /* MPICH: 256 - OMPI: 1024 */
#define MPI_MAX_PSET_NAME_LEN 1024 /* MPICH: 256 - OMPI: 512 */
/* Assorted Constants */
#define MPI_BSEND_OVERHEAD 512 /* MPICH: 96 - OMPI: 128 */
https://github.com/mpi-forum/mpi-abi-stubs/blob/main/mpi.h#L285
I am looking at Open-MPI 5.0.8 right now on my laptop and its value is smaller than the ABI value, which should not be a problem.
/* Maximum length of info vals (default is 256) */
#define OPAL_MAX_INFO_VAL 256 |
nope, one has to set this param, otherwise several of our ompi info tests fail. it's simple: the ompi is built with opal_max_info_len (or something like that) set to 256, but the app is seeing the 1024 value in the mpi.h header file. the tests that check to see that in fact they can use all 1024 bytes of space fail. |
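The two directions of this mismatch can be made concrete with a small sketch. On output, a library built with the smaller limit can always fit its strings into the larger buffer the ABI header promises and zero-pad the rest. On input, an application that trusts the header may hand over a value longer than the library was built to store, which is the case the failing tests exercise. The limits mirror the numbers quoted in this thread; the function names, and the assumption that each limit counts the terminating NUL, are hypothetical, and none of this is Open MPI code.

```c
#include <stddef.h>
#include <string.h>

#define ABI_MAX_INFO_VAL    1024  /* what the application sees in the ABI mpi.h */
#define INTERN_MAX_INFO_VAL 256   /* what the library was built with */

/* Output direction: copy an internal string into a caller buffer of
 * dst_size bytes, truncating if needed and zero-filling the remainder. */
static void copy_out_padded(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);
    if (0 == dst_size) {
        return;
    }
    if (len > dst_size - 1) {
        len = dst_size - 1;        /* leave room for the terminator */
    }
    memcpy(dst, src, len);
    memset(dst + len, 0, dst_size - len);
}

/* Input direction: returns 1 if the library can store the value.
 * Assumes the internal limit includes the terminating NUL. */
static int value_fits_internally(const char *abi_val)
{
    return strlen(abi_val) < INTERN_MAX_INFO_VAL;
}
```

In this sketch the output path never fails, while the input path needs either an error return or a larger internal limit, which is the choice being debated above.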
also add better support for the 'some' test/wait functions. fix testall and startall Signed-off-by: Howard Pritchard <[email protected]>
Signed-off-by: Howard Pritchard <[email protected]>
Signed-off-by: Howard Pritchard <[email protected]>
I think I'm missing something. Is Alternatively, the internal ompi/opal layer can get an additional argument
This approach is dangerous. If you build the ABI (which IMHO should be the default, but even if it is not), then the legacy OMPI library will not be ABI-compatible with the "regular" non-ABI build, nor with previous versions of Open MPI that are ABI-compatible with |
Signed-off-by: Howard Pritchard <[email protected]>
I think sir you may be correct. We may need an interceptor function for this type of argument check. On the todo list. |
Two external MPI libraries are now created: libmpi.so and libmpi_abi.so.
Backend code that was originally in libmpi.la has been extracted into
libopen-mpi.la to be linked into both libraries.
Parts of the Open MPI C interface are now being generated by a python
script (abi.py) from modified source files (named with *.in). This
script generates files for both the ompi ABI and the standard ABI from
the same source file, also including new bigcount interfaces.
To compile standard ABI code, there's a new mpicc_abi compiler wrapper.
The standard ABI does not yet include all functions or symbols, so more
complicated source files will not compile. ROMIO must be disabled for
the code to link, since it's relying on the external MPI interface.
Many todos left:
enable-mca-dso
This PR supersedes #12033