Introduction of field API interface in ectrans#1
dhaumont wants to merge 94 commits into dump-checksums
* hotfix/1.6.2:
  * For CI lumi-g cce, set CMAKE_PARALLEL_BUILD_LEVEL=1 (already so in develop)
  * More resources for CI lumi-g cce
  * etrans benchmark: specify LINKER_LANGUAGE Fortran for static linking
  * transi: link against etrans if enabled
  * etrans: fix library target names in CPU build
  * Version bump to 1.6.2
CMakeLists.txt

    ecbuild_add_option( FEATURE FIELD_API
      DEFAULT OFF
      DESCRIPTION "Compile field API interface to ectrans"
    )

    if (HAVE_FIELD_API)
      ecbuild_find_package(field_api)
    endif()

I would propose simply adding field_api as a required package of FIELD_API:

    - ecbuild_add_option( FEATURE FIELD_API
    -   DEFAULT OFF
    -   DESCRIPTION "Compile field API interface to ectrans"
    - )
    - if (HAVE_FIELD_API)
    -   ecbuild_find_package(field_api)
    - endif()
    + ecbuild_add_option( FEATURE FIELD_API
    +   DEFAULT OFF
    +   DESCRIPTION "Compile field API interface to ectrans"
    +   REQUIRED_PACKAGES "field_api"
    + )

(Needs testing, I'm not 100% certain of the syntax.)
src/programs/ectrans-benchmark.F90

    & output_wrapped_fields, output_fields_lists, &
    & nullify_wrapped_fields, synchost_rdonly_wrapped_fields, &
    & synchost_rdwr_wrapped_fields

Suggested change:

    - & output_wrapped_fields, output_fields_lists, &
    - & nullify_wrapped_fields, synchost_rdonly_wrapped_fields, &
    - & synchost_rdwr_wrapped_fields
    + & nullify_wrapped_fields, synchost_rdonly_wrapped_fields

I don't think these are used.
Removed output_wrapped_fields, output_fields_lists and SYNCHOST_RDWR_WRAPPED_FIELDS, which were not used.
src/programs/ectrans-benchmark.F90

    !===================================================================================================

    #include "fspgl_intf.h"

Suggested change:

    - #include "fspgl_intf.h"

Is this used?
src/programs/ectrans-benchmark.F90

    ztstep1(jstep) = timef()
    call gstats(4,0)

    if (lfield_api) then

Suggested change:

    - if (lfield_api) then
    + if (lfield_api) then

src/programs/ectrans-benchmark.F90

    & pgp2=zgp2, &
    & pgp3a=zgp3a)
    endif
    endif

src/programs/ectrans-benchmark.F90

    & kproc=myproc, ldacc=llacc)
    call synchost_rdonly_wrapped_fields(ywflds)
    #else
    call abor1('ectrans_benchmark: No field API support')

Suggested change:

    - call abor1('ectrans_benchmark: No field API support')
    + call abor1('ectrans_benchmark: No field API support')

src/programs/ectrans-benchmark.F90

    #else
    call abor1('ectrans_benchmark: No field API support')
    #endif
    else

Suggested change:

    - else
    + else

The if just below should also be indented.
src/programs/ectrans-benchmark.F90

    & kvsetsc2=ivsetsc, &
    & kvsetsc3a=ivset)
    endif
    endif

src/programs/ectrans-benchmark.F90

    call abor1('ectrans_benchmark: No field API support')
    #endif
    else
    if (lvordiv) then

src/programs/ectrans-benchmark.F90

    ! clamp small spectral values to ensure bit reproductibility with field Api interface
    ! Only activated in dp, with nvhpc and on cpu
    IF (JPRB == JPRD) THEN
    write(nout,*) "clamp using clamp_epsilon = ", clamp_epsilon
    if (associated(zspsc2)) where (abs(zspsc2) < clamp_epsilon)zspsc2 = 0
    if (associated(sp3d)) where (abs(sp3d) < clamp_epsilon)sp3d = 0
    endif

Suggested change:

    - ! clamp small spectral values to ensure bit reproductibility with field Api interface
    - ! Only activated in dp, with nvhpc and on cpu
    - IF (JPRB == JPRD) THEN
    - write(nout,*) "clamp using clamp_epsilon = ", clamp_epsilon
    - if (associated(zspsc2)) where (abs(zspsc2) < clamp_epsilon)zspsc2 = 0
    - if (associated(sp3d)) where (abs(sp3d) < clamp_epsilon)sp3d = 0
    - endif
    + ! clamp small spectral values to ensure bit reproductibility with field Api interface
    + ! Only activated in dp, with nvhpc and on cpu
    + if (jprb == jprd) then
    + write(nout,*) "clamp using clamp_epsilon = ", clamp_epsilon
    + if (associated(zspsc2)) where (abs(zspsc2) < clamp_epsilon)zspsc2 = 0
    + if (associated(sp3d)) where (abs(sp3d) < clamp_epsilon)sp3d = 0
    + endif

Done, as well as the other indentation.
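The effect of the clamp discussed in this thread can be illustrated outside Fortran. The following is a minimal Python sketch, not the benchmark code: the array contents and the threshold `eps` are invented for illustration (the PR does not state the value of `clamp_epsilon`), and `clamp_small` is a hypothetical helper that mirrors the `where (abs(x) < clamp_epsilon) x = 0` statements.

```python
def clamp_small(values, eps):
    """Return a copy of values with entries of magnitude below eps set to 0.0."""
    return [0.0 if abs(v) < eps else v for v in values]


# Two hypothetical spectral arrays that agree except for noise-level
# residue in a coefficient that should be exactly zero.
regular = [1.25, -3.5e-17, 0.75]
field_api = [1.25, 2.1e-17, 0.75]

eps = 1.0e-15  # assumed threshold, not the value used in the PR

print(regular == field_api)                                      # raw arrays differ
print(clamp_small(regular, eps) == clamp_small(field_api, eps))  # clamped arrays match
```

Clamping only hides differences that sit below the threshold; any discrepancy in a coefficient larger than `eps` would still show up in a checksum comparison.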
src/field_api/CMakeLists.txt

    field_api_ectrans_mod.F90
    trans/dir_trans_field_api.F90
    trans/inv_trans_field_api.F90
    field_api_basic_type_mod.F90

This file doesn't seem to have any precision- or platform-specific code, so I don't think we need to put it through generate_file. Why not just manually add it to ${outfiles}?
Indeed, field_api_basic_type_mod.F90 is now added to outfiles.
UPDATE: I had to revert commit dafa4db, because it created a duplicate target file name for field_api_basic_type_mod.F90.
I also tried to create an ectrans_field_api_common library to compile field_api_basic_type_mod.F90, but this will not work. In fact, field_api_basic_type_mod.F90 implicitly contains some precision-specific code, via the FIELD_BASIC object from field_api. To be able to use this object, you have to link with field_api_${prec}, which is precision dependent. This means that it is not possible to compile `ectrans_field_api_common` without specifying the precision. Or maybe I am missing something?
I don't think we need to go down this route, because field_api is only going to be needed to wrap the outermost dir_trans- and inv_trans-like functions.
What do you mean? That I can keep the code like this?
Yes, exactly: there is no need to try to make a precision-independent field_api API.
3a622e1 to 5e59be7
* Exit with error when incorrect argument applied
  Without this a test with an incorrect argument still passes.
* Fix typo: baseargs -> base_args
* Make sure base_args is expanded properly
* Initial implementation
* Implement dump-checksums
* typo
* dump with gather (not working)
* fix crashes
* small refactoring
* typo
* add checksums for lam
* change filename
* Add script to compare checksums
* python3
* Add unit test for checksum
* rename txt in checksums
* remove initialization of arrays before inv_trans
* rename checksums
* adding underscore in checksums
* remove typo correction after fix in another branch
* Change color enum
* Split long function calls and reorder arguments
* fix naming of checksums files
* Fix indentation
* Fix out-of-bound access in Nvidia debug mode
* The test compare_checksum will trigger other tests
* Indent if
* Fix out-of-bound access in Nvidia debug mode for lam
* Add missing False statement
* fixup merge
* Implement Sam's fixes
* Allow some more time for build-hpc
* Allow some more time for build-hpc
* Allow some more time for build-hpc
* Fix minor indentation inconsistencies
* Revert workflow modifications
* Revert workflow modifications
* Revert workflow modifications
* Add missing intents
* Fix case
* Add some final tweaks to ectrans-lam-benchmark
---------
Co-authored-by: Sam Hatfield <samuel.hatfield@ecmwf.int>
Co-authored-by: Denis Haumont <denis.haumont@meteo.be>
Co-authored-by: Willem Deconinck <willem.deconinck@ecmwf.int>
This library only contains empty dummy versions of the etrans external subroutines. The subroutines are automatically generated from the interface files, by stripping out the INTERFACE statement and adding an abort statement. This feature allows one to build IFS against ecTrans without having to enable the ETRANS feature. This is useful for development, where one would like to experiment with ecTrans global transform changes and perform tests within the IFS without having to make sure the changes are compatible also with etrans. One can simply disable etrans and instead build IFS against the etrans dummy library.
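The stub generation described in this commit message can be sketched as a small text transformation. This is an illustrative Python sketch only: the commit does not show how ecTrans actually generates the dummies, and `make_dummy`, the sample interface `EXAMPLE_TRANS` and the abort message are all hypothetical. `ABOR1` is used as the abort routine because it is the one called elsewhere in this PR.

```python
import re

# Hypothetical interface file content, invented for illustration.
SAMPLE = """\
INTERFACE
SUBROUTINE EXAMPLE_TRANS(KARG)
INTEGER, INTENT(IN) :: KARG
END SUBROUTINE EXAMPLE_TRANS
END INTERFACE
"""


def make_dummy(interface_text, message):
    """Turn a Fortran interface block into an aborting subroutine stub."""
    out = []
    for line in interface_text.splitlines():
        stripped = line.strip().upper()
        # Drop the INTERFACE / END INTERFACE wrapper lines.
        if re.fullmatch(r"(END\s*)?INTERFACE", stripped):
            continue
        # Insert the abort call just before the subroutine ends.
        if re.match(r"END\s*SUBROUTINE\b", stripped):
            out.append(f"CALL ABOR1('{message}')")
        out.append(line)
    return "\n".join(out)


stub = make_dummy(SAMPLE, "EXAMPLE_TRANS: ETRANS not enabled")
print(stub)
```

The sketch assumes one subroutine per interface block and fixed keywords on their own lines; a real generator would also have to cope with continuation lines and functions.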
* Add --npromatr option to global benchmark program
* Add test case for --npromatr 20
* Add npromatr print to benchmark program
* Update nv stack on ac
The fact that we've been accidentally testing 0 fields for the "10 field" test without this guard suggests that it actually works now.
This reverts commit 0bd76d1.
This reverts commit 3fc859c.
* Remove references to unused FSPGL_PROC in inverse
* Also remove unused FSPGL_PROC from etrans
This does nothing.
* Remove deprecated error codes
  Quoting Nvidia docs (https://docs.nvidia.com/cuda/cufft/):
  Starting from CUDA 13.0: The following error codes have been removed: CUFFT_INCOMPLETE_PARAMETER_LIST, CUFFT_PARSE_ERROR, CUFFT_LICENSE_ERROR.
  Starting from CUDA 12.9: The following error codes are deprecated and will be removed in a future release: CUFFT_INCOMPLETE_PARAMETER_LIST, CUFFT_PARSE_ERROR, CUFFT_LICENSE_ERROR.
* Add missing cuFFT error codes
* Add back deprecated error codes for earlier CUDA runtime versions
* Guard CUDART>=13 error codes
It's literally exactly the same as LDFOU2.
* Split dumpchecksums into 4 methods
* Correct typo
* Adjust whitespace
* Delete unused variables
---------
Co-authored-by: Denis Haumont <denis.haumont@meteo.be>
Co-authored-by: Sam Hatfield <samuel.hatfield@ecmwf.int>
Field API is required to run unit tests
Remove unique symbols for ectrans_field_api
Summary
This PR introduces a field API interface to ectrans. It is built on top of dump-checksums (PR ecmwf-ifs#287).
Description of changes
The new field API interface is implemented in src/field_api:
* src/field_api/trans/inv_trans_field_api.F90: inverse transform (calling inv_trans internally)
* src/field_api/trans/dir_trans_field_api.F90: direct transform (calling dir_trans internally)
* src/field_api/field_api_ectrans_mod.F90: helper functions

Optional compilation triggered by a new USE_FIELD_API configuration option. Adds a new dependency to field API.
Single and double precision for CPU and GPU
Accessible in ectrans-benchmark via a new option --field-api:
* src/programs/util/ectrans_field_api_helper.F90 contains the field API implementation
* In the GPU version of ectrans-benchmark, setting llacc = .true. will force the intermediate fields on GPU. This requires additional memory copies host <-> device before and after the regular ectrans calls.

New unit tests have been added to test --field-api (CPU and GPU)
compare_checksums.py updated to compare the checksums obtained with field-api against those of the regular version
nvhpc22.11 on CPU: bit reproducibility between the field API and the regular version has been partially addressed by clamping small values in spectral space after dir_trans in ectrans benchmark (2a7aa87). This only applies to double precision (single precision is only bit reproducible during the first time step and slightly different afterwards).
Testing and validation
* Gfortran 13.3.0, CPU: bit reproducible
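
To make "bit reproducible" concrete, here is a minimal Python sketch of a bitwise checksum comparison. It is an assumption-laden illustration, not the logic of compare_checksums.py or of the dump-checksums output: the CRC32 algorithm, the `checksum` and `compare` helpers, the dictionary layout and the sample values are all invented; only the array names zspsc2 and sp3d come from the benchmark code.

```python
import struct
import zlib


def checksum(values):
    """CRC32 over the raw IEEE-754 double-precision bytes of values."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return zlib.crc32(raw)


def compare(reference, candidate):
    """Return the sorted names whose checksums differ or exist on one side only."""
    names = set(reference) | set(candidate)
    return sorted(n for n in names if reference.get(n) != candidate.get(n))


# 0.1 + 0.2 and 0.3 print almost identically but differ in their last bits,
# so the two "zspsc2" checksums are not equal.
regular = {"zspsc2": checksum([0.1 + 0.2, 1.0]), "sp3d": checksum([2.0, 4.0])}
field_api = {"zspsc2": checksum([0.3, 1.0]), "sp3d": checksum([2.0, 4.0])}

mismatches = compare(regular, field_api)
print(mismatches)
```

A checksum over the raw bytes flags any difference in the last bit, which is why rounding-level differences between the two code paths are enough to break the comparison.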
Known limitations
llacc = .true. requires memory transfers between CPU and GPU, which should be avoided
NEW PR in official ectrans
This PR will be replaced by ecmwf-ifs#314.
We keep this one open for now because it contains comments from Sam Hatfield that still need to be integrated.