Run Probtest on Säntis

Use Probtest to verify whether your test case produces consistent results on GPU. It compares a GPU test run to a CPU ensemble with perturbed input conditions.

1. Compile ICON

Compile ICON on CPU and on GPU as out-of-source builds. Note that the build directories must be subdirectories of the ICON root folder; otherwise, the Probtest container does not have access to the data.
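
A minimal sketch of the expected layout, assuming you build via configure wrapper scripts (the wrapper names below are placeholders; use the ones matching your machine and compilers):

mkdir -p build_cpu && cd build_cpu        # CPU build as a subdirectory of the ICON root
../config/cscs/<cpu-configure-wrapper>    # placeholder: your CPU (nvhpc) configure wrapper
make -j 8
cd ..
mkdir -p build_gpu && cd build_gpu        # GPU build, also inside the ICON root
../config/cscs/<gpu-configure-wrapper>    # placeholder: your GPU configure wrapper
make -j 8
cd ..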

2. Set Up the Probtest Container and Environment on Säntis

To run Probtest for ICON on Säntis, use the prebuilt container available on Docker Hub (c2sm/probtest). ICON provides the wrapper script probtest_container_wrapper.py.

Note

If your ICON version doesn’t include this script, add it as scripts/cscs_ci/probtest_container_wrapper.py, together with the appropriate PROBTEST_TAG under run/tolerance/PROBTEST_TAG and yaml_experiment_test_processor.py under scripts/experiments/yaml_experiment_test_processor.py (replacing any existing versions).

When Setting Up ICON from Scratch

In your ICON root directory, import the container:

PROBTEST_TAG=$(cat run/tolerance/PROBTEST_TAG)
enroot import docker://c2sm/probtest:${PROBTEST_TAG}

Create a TOML configuration and export the EDF path (used when running the container):

echo "image = \"$(pwd)/c2sm+probtest+${PROBTEST_TAG}.sqsh\"" > probtest.toml
echo "mounts = [ \"$(pwd)\" ]" >> probtest.toml
echo "workdir = \"$(pwd)\"" >> probtest.toml
echo "writable = true" >> probtest.toml
export EDF_PATH=$(pwd)
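
The resulting probtest.toml should then look roughly like this, with /path/to/icon standing in for your ICON root directory:

image = "/path/to/icon/c2sm+probtest+<PROBTEST_TAG>.sqsh"
mounts = [ "/path/to/icon" ]
workdir = "/path/to/icon"
writable = true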

Create and activate a Python environment:

python3 -m venv .venv
source .venv/bin/activate
pip install pyyaml pandas click toml
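
As a quick, optional sanity check that the environment is usable, you can try importing the installed packages:

python3 -c "import yaml, pandas, click, toml; print('Python environment OK')"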

Every Time You Reconnect to the Server

If the container and environment are already set up, simply re-run:

export EDF_PATH=$(pwd)
source .venv/bin/activate

Set the experiment name, e.g.:

export EXPERIMENT=c2sm_clm_r13b03_seaice

Export required environment variables:

export BB_NAME=santis_cpu_nvhpc
export UENV_VERSION=$(cat config/cscs/SANTIS_ENV_TAG)

3. Run Perturbed Ensemble on CPU

Navigate to your CPU build directory, then generate and run a 10-member ensemble (this may take some time):

./make_runscripts $EXPERIMENT
uenv run ${UENV_VERSION} -- python3 scripts/cscs_ci/probtest_container_wrapper.py ensemble $EXPERIMENT --build-dir $(pwd) --member-ids $(seq -s, 1 10)
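
The seq -s, 1 10 call expands to a comma-separated list of member IDs, so the command above is equivalent to:

uenv run ${UENV_VERSION} -- python3 scripts/cscs_ci/probtest_container_wrapper.py ensemble $EXPERIMENT --build-dir $(pwd) --member-ids 1,2,3,4,5,6,7,8,9,10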

This generates:

  • stats_${EXPERIMENT}_<member_id>.csv
  • ${EXPERIMENT}_reference.csv
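
As a quick check that all ensemble members produced output, you can list the generated files (this assumes they are written to the current CPU build directory):

ls stats_${EXPERIMENT}_*.csv ${EXPERIMENT}_reference.csv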

4. Generate Reference and Tolerance from Ensemble

Create reference and tolerance files using the 10 ensemble members:

python3 scripts/cscs_ci/probtest_container_wrapper.py tolerance $EXPERIMENT --build-dir $(pwd) --member-ids $(seq -s, 1 10)

This generates:

  • ${EXPERIMENT}_tolerance.csv

5. Run the Test Case on GPU and Collect Statistics

In the following, replace <...> with the corresponding paths for your setup.

Navigate to your GPU build folder and run the same test case, e.g.:

cd <path-to-GPU-build>
./make_runscripts $EXPERIMENT
cd run && sbatch --uenv ${UENV_VERSION} ./exp.c2sm_clm_r13b03_seaice.run
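
You can monitor the submitted job with the usual Slurm tools until it has finished, e.g.:

squeue -u $USER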

Navigate back to the ICON root folder and collect the GPU statistics:

cd <ICON root folder>
python3 scripts/cscs_ci/probtest_container_wrapper.py stats $EXPERIMENT --stats-file-path <path-to-CPU-build>/stats_gpu.csv --build-dir <path-to-GPU-build>

This saves the GPU stats as stats_gpu.csv in your CPU build directory.

6. Check GPU Statistics Against Reference and Tolerance

From your ICON directory, run the check using the generated reference and tolerance:

python3 scripts/cscs_ci/probtest_container_wrapper.py check $EXPERIMENT --input-file-cur stats_gpu.csv --input-file-ref <path-to-CPU-build>/${EXPERIMENT}_reference.csv --tolerance-file-name <path-to-CPU-build>/${EXPERIMENT}_tolerance.csv --build-dir $(pwd)
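
If you script this step, you can inspect the exit status right after the check, assuming the wrapper propagates Probtest's non-zero exit code on failure (verify this for your Probtest version):

echo "Probtest check exit status: $?"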

7. Increase Ensemble Size if Validation Fails

A 10-member ensemble may not capture the full variability, causing false negatives. For better coverage, increase the ensemble to 49 members from your CPU build directory.

Run additional members (11–49):

./make_runscripts $EXPERIMENT
uenv run ${UENV_VERSION} -- python3 scripts/cscs_ci/probtest_container_wrapper.py ensemble $EXPERIMENT --build-dir $(pwd) --member-ids $(seq -s, 11 49)

Regenerate reference and tolerance using all 49 members:

python3 scripts/cscs_ci/probtest_container_wrapper.py tolerance $EXPERIMENT --build-dir $(pwd) --member-ids $(seq -s, 1 49)

Then re-run the check from step 6 with the regenerated files. If the check still fails with the larger ensemble, the GPU result is likely incorrect.