Changeset 4796 for BOL


Timestamp: Feb 2, 2024, 1:06:35 PM
Author: asima
Message:

The usual combination of 5 MPI processes * 8 OpenMP threads per node now crashes with the following messages:

srun: warning: can't honor --ntasks-per-node set to 5 ... Ignoring --ntasks-per-node.
srun: error: Unable to create step for job ...: More processors requested than permitted

(Strangely enough, it still runs with 4 MPI processes * 10 OpenMP threads per node.)

Solution adopted: comment out "--ntasks-per-node=..." (some explanatory comments were added as well).

For the record, other solutions that keep "--ntasks-per-node=...":
--> add "--nodes=..." (= ntasks / ntasks-per-node)
--> or replace "nthreads=8 ; export OMP_NUM_THREADS=$nthreads" by "export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK"
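The arithmetic behind the "--nodes=..." alternative can be sketched as follows. This is only an illustration, not part of LMDZ_Setup: the helper names `tasks_per_node` and `nodes_needed` are hypothetical, the 40 cores per node is the Jean-Zay figure quoted in the script_SIMU comment, and rounding the node count up when ntasks is not a multiple of ntasks-per-node is an assumption (the message only gives nodes = ntasks / ntasks-per-node).

```shell
#!/bin/bash
# Hypothetical helpers (not part of LMDZ_Setup): derive a consistent Slurm layout.

# MPI tasks that fit on one node = cores per node / OpenMP threads per task.
tasks_per_node() {
    local cores_per_node=$1 cpus_per_task=$2
    echo $(( cores_per_node / cpus_per_task ))
}

# Nodes needed for ntasks, rounded up (assumption: a partly filled node
# still counts as a whole node).
nodes_needed() {
    local ntasks=$1 per_node=$2
    echo $(( (ntasks + per_node - 1) / per_node ))
}

ntasks=8
cpus_per_task=8
cores_per_node=40   # value quoted for Jean-Zay in the script_SIMU comment

per_node=$(tasks_per_node "$cores_per_node" "$cpus_per_task")
nodes=$(nodes_needed "$ntasks" "$per_node")

# Print the header lines that would have to stay consistent with each other.
echo "#SBATCH --ntasks=$ntasks"
echo "#SBATCH --ntasks-per-node=$per_node"
echo "#SBATCH --nodes=$nodes"
echo "#SBATCH --cpus-per-task=$cpus_per_task"
```

With the values used in this changeset (8 MPI tasks, 8 OpenMP threads each), the sketch yields 5 tasks per node and 2 nodes. Whether Slurm accepts this combination on a given cluster still has to be checked there.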

(It is not easy to write an HPC job script that could also, one day, run on a PC...)
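One possible way to reconcile the two settings, given here only as a sketch and not as what this changeset does: take the thread count from Slurm when it is available and fall back to a fixed value otherwise. The function name `pick_omp_threads` is hypothetical; the sketch assumes SLURM_CPUS_PER_TASK is defined only inside a job submitted with "--cpus-per-task".

```shell
#!/bin/bash
# Hypothetical helper: prefer Slurm's value, else use the given default.
pick_omp_threads() {
    local default_threads=$1
    # ${VAR:-default} expands to the default when VAR is unset or empty,
    # which is the case outside a Slurm job (e.g. on a PC).
    echo "${SLURM_CPUS_PER_TASK:-$default_threads}"
}

nthreads=8
export OMP_NUM_THREADS
OMP_NUM_THREADS=$(pick_omp_threads "$nthreads")
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

On a PC the script would then run with nthreads, while under Slurm it would follow "#SBATCH --cpus-per-task" without a second value to keep in sync.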

Location: BOL/LMDZ_Setup
Files: 2 edited

  • BOL/LMDZ_Setup/script_SIMU

    --- r4752
    +++ r4796
    @@ -5,8 +5,8 @@
     # Nombre de processus MPI :
     #SBATCH --ntasks=8
    -# number of MPI processes per node :
    -#SBATCH --ntasks-per-node=5
    +##### number of MPI processes per node : 40(procs/node on Jean-Zay) / cpus-per-task (ex : =5 for 8 OMP)
    +####SBATCH --ntasks-per-node=5    # if specified, also add "#SBATCH --nodes= ..."  with nodes=ntasks/(ntasks-per-node)
     # nombre de threads OpenMP
    -#SBATCH --cpus-per-task=5
    +#SBATCH --cpus-per-task=8
     # de Slurm "multithread" fait bien reference a l'hyperthreading.
     #SBATCH --hint=nomultithread       # 1 thread par coeur physique (pas d'hyperthreading)
    @@ -20,9 +20,10 @@
     set -ex
     
    -
    +# Number of MPI processes :
     ntasks=8
    -nthreads=4
    -# number of OpenMP threads:
    -export OMP_NUM_THREADS=$nthreads
    +# number of OpenMP threads
    +nthreads=8
    +export OMP_NUM_THREADS=$nthreads # for Jean-Zay it would be recommendend to use :
    +#       export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
     # private memory for each thread
     export OMP_STACKSIZE=800M
  • BOL/LMDZ_Setup/setup.sh

    --- r4734
    +++ r4796
    @@ -456,5 +456,5 @@
     # Choix du nombre de processeurs
     # NOTES :
    -# omp=8 by default, but we need
    +# omp=8 by default (for Jean-Zay must be a divisor of 40 procs/node), but we need
     #   omp=1 for SPLA (only MPI parallelisation)
     #   omp=2 for veget=CMIP6 beacause of a bug in ORCHIDEE/src_xml/xios_orchidee.f90
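
The divisor constraint mentioned in the new setup.sh comment could be checked along these lines. This is a sketch only: the function `omp_divides_node` is hypothetical and does not exist in setup.sh, and the default of 40 cores per node is the Jean-Zay figure taken from that comment.

```shell
#!/bin/bash
# Hypothetical check: succeed only if omp evenly divides the cores per node.
omp_divides_node() {
    local omp=$1
    local cores_per_node=${2:-40}   # 40 = Jean-Zay value from the comment
    [ "$omp" -gt 0 ] && [ $(( cores_per_node % omp )) -eq 0 ]
}

# Try the omp values named in the setup.sh notes, plus one that does not fit.
for omp in 1 2 8 6; do
    if omp_divides_node "$omp"; then
        echo "omp=$omp: OK ($(( 40 / omp )) MPI tasks per node)"
    else
        echo "omp=$omp: not a divisor of 40"
    fi
done
```

The three values from the notes (1, 2 and 8) pass the check; 6 is included only to show the failing branch.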