Timestamp:
Aug 29, 2025, 5:57:47 PM (3 months ago)
Author:
asima
Message:

On 18/08/2025, the SLURM version on Adastra changed from 23 to 24.05.8. Since then, the "--overcommit" option must be added to srun to avoid crashes caused by memory problems when using the binding script slurm_set_cpu_binding.sh.

Before :

@ADS srun --cpu-bind=none --mem-bind=none -- ./slurm_set_cpu_binding.sh ./gcm.e > listing

After :

@ADS srun --overcommit --cpu-bind=none --mem-bind=none -- ./slurm_set_cpu_binding.sh ./gcm.e > listing

This solution is preferred to the initial fix:

@ADS srun --cpu-bind=cores -c $nthreads -n $ntasks ./gcm.e > listing

because without the binding script, the runtime increases from 50 min to 90 min for 1 year of simulation with the standard LMDZ_Setup configuration.
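As an illustration only, the version dependence described above could be handled by choosing the srun flags from the installed SLURM version rather than hard-coding them. The sketch below is not part of this changeset. The function name `srun_binding_flags` is hypothetical, and the rule "major version 24 or later needs --overcommit" is an assumption extrapolated from the single observation on Adastra (23 worked without it, 24.05.8 did not).

```shell
#!/bin/sh
# Hypothetical helper (not in LMDZ_Setup): print the srun binding flags
# for a version string of the form "slurm 24.05.8", which is the format
# printed by `srun --version`.
srun_binding_flags() {
    version_string=$1
    version=${version_string#slurm }   # strip the leading "slurm "
    major=${version%%.*}               # keep the text before the first dot
    flags="--cpu-bind=none --mem-bind=none"
    case $major in
        ''|*[!0-9]*)
            # Version could not be parsed: leave the flags unchanged.
            ;;
        *)
            # ASSUMPTION: every release with major version >= 24 needs
            # --overcommit; only 24.05.8 is reported in the message above.
            if [ "$major" -ge 24 ]; then
                flags="--overcommit $flags"
            fi
            ;;
    esac
    printf '%s\n' "$flags"
}

# Possible use in a job script (commented out so the sketch runs anywhere):
# srun $(srun_binding_flags "$(srun --version)") -- ./slurm_set_cpu_binding.sh ./gcm.e > listing
```

The changeset itself adds "--overcommit" unconditionally, which is simpler and sufficient as long as Adastra stays on SLURM 24 or later.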

File:
1 edited

  • BOL/LMDZ_Setup/script_SIMU

    r5798 r5800
    303   303   time $mpicmd ./gcm.e > listing
    304   304   #@ADS else
    305         #@ADS # Pour memoire jusqu au 20/08/2025 : srun --cpu-bind=none --mem-bind=none -- ./slurm_set_cpu_binding.sh ./gcm.e > listing
    306         #@ADS srun --cpu-bind=cores -c $nthreads -n $ntasks ./gcm.e > listing
          305   #@ADS # 18/08/2025 : SLURM version changed from 23 to 24.05.8 -->  "--overcommit" added to srun, to avoid crashes for memory problems
          306   #@ADS srun --overcommit --cpu-bind=none --mem-bind=none -- ./slurm_set_cpu_binding.sh ./gcm.e > listing
    307   307   #@ADS fi
    308   308