Changeset 5178


Timestamp: Sep 10, 2024, 11:41:21 AM
Author: abarral
Message:

Fix libfabric crash on ADS
Increase rebuild job times

Location: BOL/LMDZ_Setup_amaury
Files: 2 edited

  • BOL/LMDZ_Setup_amaury/lmdz_env.sh

    r5177  r5178
       75     75          module load PrgEnv-gnu  # we need to load the env because lmdz links some shared libraries
       76     76          module load gcc/13.2.0  # required, see https://dci.dci-gitlab.cines.fr/webextranet/user_support/index.html#prgenv-and-compilers
              77  +       export CRAY_CPU_TARGET=x86-64  # to suppress warnings during Cmake netcdf95 build
              78  +       export FI_CXI_RX_MATCH_MODE=hybrid  # 09/24 otherwise we get random SIGABRT e.g. "libfabric:2490616:1725895288::cxi:core:cxip_ux_onload_cb():2657<warn> c1456: RXC (0x5130:21) PtlTE 84:[Fatal] LE resources not recovered during flow control. FI_CXI_RX_MATCH_MODE=[hybrid|software] is required"
       77     79
       78     80          function cdo {  # cdo is available as a spack cmd which requires a specific, incompatible env
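
The libfabric message quoted in the new line 78 names two accepted values for FI_CXI_RX_MATCH_MODE, hybrid and software. As a minimal sketch, a job script could check the variable before launching the model. The function name cxi_match_mode_ok is hypothetical and is not part of lmdz_env.sh.

```shell
#!/bin/bash
# Hypothetical pre-flight check, not part of lmdz_env.sh.
# Succeeds only for the two modes named in the libfabric error message.
cxi_match_mode_ok() {
    case "${1:-}" in
        hybrid|software) return 0 ;;
        *) return 1 ;;
    esac
}

export FI_CXI_RX_MATCH_MODE=hybrid  # same value as the changeset

if cxi_match_mode_ok "$FI_CXI_RX_MATCH_MODE"; then
    echo "FI_CXI_RX_MATCH_MODE=$FI_CXI_RX_MATCH_MODE"
else
    echo "FI_CXI_RX_MATCH_MODE must be hybrid or software" >&2
fi
```

Passing the value as an argument rather than reading the environment inside the function keeps the check easy to exercise with other values.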
  • BOL/LMDZ_Setup_amaury/reb.sh

    r5006  r5178
       21     21      job=$SIM$type
       22     22
       23         -   cat <<eod >| $job
              23  +   cat <<eod >| "$job"
       24     24    #!/bin/bash
       25     25    ## Headers managed by sed
        …      …
       28     28    #@JZ#SBATCH --nodes=1                   # nombre de noeuds
       29     29    #@JZ#SBATCH --ntasks-per-node=1         # nombre de taches MPI par noeud
       30         - #@JZ#SBATCH --time=00:30:00             # temps d execution maximum demande (HH:MM:SS)
              30  + #@JZ#SBATCH --time=00:59:00             # temps d execution maximum demande (HH:MM:SS)
       31     31    #@JZ#SBATCH --output=post${type}%j.out  # nom du fichier de sortie
       32     32    #@JZ#SBATCH --error=post${type}%j.out   # nom du fichier d'erreur (ici en commun avec la sortie)
        …      …
       35     35    #@SP#SBATCH --nodes=1
       36     36    #@SP#SBATCH --ntasks-per-node=1
       37         - #@SP#SBATCH --time=00:30:00
              37  + #@SP#SBATCH --time=00:59:00
       38     38    #@SP#SBATCH --output=post${type}%j.out
       39     39    #@SP#SBATCH --error=post${type}%j.out
        …      …
       42     42    #@ADS#SBATCH --nodes=1
       43     43    #@ADS#SBATCH --ntasks-per-node=1
       44         - #@ADS#SBATCH --time=00:30:00
              44  + #@ADS#SBATCH --time=00:59:00
       45     45    #@ADS#SBATCH --output=post${type}%j.out
       46     46    #@ADS#SBATCH --error=post${type}%j.out
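
The job template tags each header with a machine prefix (#@JZ, #@SP, #@ADS) and notes that the headers are "managed by sed". The sed call itself is not part of this changeset, so the following is only a sketch of how such tagging could work, assuming the tag of the target machine is stripped to leave a regular #SBATCH directive. The function name activate_headers and the sample template are hypothetical.

```shell
#!/bin/bash
# Hypothetical illustration: the real sed call in reb.sh is not shown in this changeset.
# activate_headers TAG reads a job template on stdin and rewrites "#@TAG#SBATCH"
# to "#SBATCH". Headers tagged for other machines are left untouched, so sbatch
# sees them as plain comments.
activate_headers() {
    sed -e "s/^#@${1}#SBATCH/#SBATCH/"
}

# Small sample in the style of the template above (assumed, shortened).
template='#@JZ#SBATCH --time=00:59:00
#@ADS#SBATCH --nodes=1
#@ADS#SBATCH --time=00:59:00'

printf '%s\n' "$template" | activate_headers ADS
```

With the ADS tag, the two ADS lines become active #SBATCH directives and the JZ line stays inert, which is why the new 00:59:00 limit has to be set once per machine in the template.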