Skip to content

ICON Meeting 2024/3

17 September 2024, ETH Zurich (L 17.1) and via Zoom

Participants (on-site)

Michael Jähn (MJ), Matthieu Leclair (ML), Annika Lauber (AL), Tina Schnadt (TS), Doris Folini (DF), Anurag Dipankar (AD), Diego Villanueva (AV), Andrina Caratsch (AC)

Participants (Zoom)

Matthias Kraushaar (MK), Marco Arpagaus (MA), Brigitta Goger (BG), Arash Hamzehloo (AH), Mikael Stellio (MS), Christian Steger (CS), Will Sawyer (WS), Nander Wever (NW)

Minutes by Annika Lauber

Reports

C2SM (Michael Jähn, Annika Lauber, Matthieu Leclair)

MJ welcomes everyone and shares the latest news about ICON from C2SM .

AD asks about the transient aerosol port and who helped fix the bug. AL explains that many people were involved, including DWD, C2SM, and MPI, and that there were actually three bugs, which made the issue difficult to identify. MJ adds that finishing up the port also takes longer now because so many people are involved, requiring a lot of communication and waiting.

WS asks if preparatory projects can really be submitted at any time. MK answers that while project proposals are gathered for specific deadlines, preparatory projects can be submitted at any time.

CS asks about the namelist variable isolrad and what it does. MJ responds that he can't say much, as he hasn't really looked into it yet.

AH wants to ask about ALPS and mentions that they were given access to Daint but don’t have a user environment set up there and are struggling with ICON on the new system. Erik Koene tried compiling ICON with MJ’s help and was successful. AH asks if they should have access to Säntis instead of Daint. ML responds that this is a question for CSCS and depends on their decision—whether weather and climate work will be limited to Säntis or shared between Säntis and Tödi. Ben Cumming thought that everything should run on Säntis. Once CSCS provides more details, they will offer support. ML asks if AH's access is part of a preparatory project, but AH clarifies that it’s a larger project and they believed they had admin rights to create a user environment. ML confirms they do, but advises that for now, they should rely on what is already available.

MK adds that the differences between Tödi, Säntis, and Daint will be minimal, and the systems should function the same way. Admin rights are not necessary to create a user environment, but it will take some time to adapt to the new system. MK suggests the possibility of tagging user environments for specific systems, although most should be visible. He also shares the following links:

AH asks whether Daint or Säntis will be the long-term solution for their applications. MK explains that Säntis will be mainly dedicated to climate and weather, specifically for ICON (EXCLAIM), while a few tasks will still go to Daint. DF mentions that their colleagues are currently doing benchmarking on Tödi, but are now on Daint.Alps, where certain images were missing. DF asks if those images will be available. ML advises not to rely on his images anymore, explaining that there's a workaround using the user environment. You can query the database to see what's installed on the system. DF asks if this is possible even without Tödi access, and ML thinks it is.

DF then asks if they can expect the images from Tödi to be available on Daint. ML replies that although it's not a long-term solution, those images should eventually be deployed on Daint. However, it won’t be as straightforward. DF points out that many people need to complete benchmarking by October. AH asks if ICON can be compiled by loading the user environment without those missing images. ML explains that the software stack image is the same as the user environment and can be accessed on each cluster, but using it isn't always straightforward. Currently, the images are interchangeable, though they still depend on the operating system. In the future, user environments on one cluster may not work on another. MK confirms that the risk of incompatibility exists and they've already seen this with MCH. Regarding the missing images on Daint, MK explains that the NVIDIA part wasn’t provided completely, which is why he created another environment. These images are not preloaded on the system, so users need to search for them using the user environment tool. MK suggests trying the command uenv image find -s todi to locate the images.

BG mentions using Tödi and that while the instructions on C2SM initially worked, two weeks later the executables stopped working and the instructions disappeared. ML responds that there is now an Alps section on the landing page. It should be possible to do the same tasks as before, but without using absolute paths to his environment. Instead, tags can be used. The updated instructions are available on the workshop materials page as well as the user landing page . ML encourages anybody to reach out to him on Slack if any issues arise.

AD from Empa asks if ML has managed to run ICON-ART. ML responds that he has only been testing regular ICON, but Erik was successful in compiling it. AH confirms that Erik followed a workaround. AD asks if there is a plan for C2SM to support ICON-ART. ML explains that ICON-ART is not in the C2SM pipeline and suggests that users create a test case, and C2SM can help set up the CI infrastructure. AH mentions they have an old test case (icon-kit) that needs updating. MJ adds that ICON-ART is now part of ICON-NWP, and a test case could be set up and added.

DF asks about using default ICON versions and whether the run scripts will be used on Alps in the same way. ML replies that it’s not yet finalized for ICON. DF inquires about a timeline, and ML says it will still take some time. MJ mentions they will provide the latest OS release and testing.

WS brings up that Jonas integrated the CSCS-CI into the normal BuildBot workflow, and much of the software is available. They set up a service user and have a project for that but still need to complete a few tasks before Jonas returns. There’s a discussion about the mechanism for triggering the runners, and they will need to talk to Ralf and Sergey. WS estimates that it will take three to four weeks until the CI system is fully operational, aiming for the end of October to overlap with Piz Daint.

DF shares her experience from last week, where she had to furnish a run script herself for a simple test case (heldsuarez). She asks if they should use the job reporting tool suggested by CSCS. WS says they’ll need to discuss that further, as he’s not directly involved. He suggests continuing the conversation offline with Matthias. DF notes that several people are relying on the outcome of this work.

Doris Folini

DF tried benchmarking with the latest ICON from the C2SM website but struggled to find input files. She realized that the user environment (uenv) is no longer using the default mount point. ML explains that there has been a small change, and DF needs to access metadata that wasn’t initially available, so she should download the files again.

DF looked through various resources but is still having trouble. She asks if she would need to recompile from scratch if, for example, JSbach is needed. ML replies that it’s not a small change, so likely yes, recompiling would be necessary, as it’s not just a matter of changing a source file.

Diego Villanueva

DV is planning to use ICON-HAM and a postdoc will start next year. They might also look into ICON-CLM, hoping it will be ready by then.

Andrina Caratsch

AC is working on aerosol-cloud interactions in tropical cyclones and is going to use ICON-HAM.

Nander Wever

NW, who works for MCH and SLF, is examining the mixed snow component in ICON. This feature, which is a multilayer snow model, is expected to become operational in ICON.

Marco Arpagaus

MCH switched off COSMO two weeks ago, though it is still running in other countries. MA plans to start working on reanalysis using 20 years of reanalysis data.

Brigitta Goger

Published two papers in the recent month about clouds in ICON:

Arash Hamzehloo

AH is porting components of ART to GPU and is trying to compile ICON-ART on the new system.

Mikael Stellio

MK is using ICON-HAM and working with Sylvaine, focusing mainly on the GPU side. They worked on transitioning from the old Daint to the new Daint, including compiling on the GPU. They started running simulations at a 10km resolution, had some runtime problems initially, but everything is now working and they are currently benchmarking.

Christian Steger

CS, working at MCH, is focusing on the radiation topography scheme in ICON. This includes both EXTPAR-related fields and a new topography set. CS is also supervising a master's thesis project with BG, which involves high-resolution land use data. The work is mainly focused on EXTPAR-related ICON tasks.

Will Sawyer

WS has been working on benchmarking ICON on the new system. Alps was officially inaugurated on Saturday. WS has benchmarked various versions of ICON and is preparing a presentation: one for this afternoon at EXCLAIM. They can share an early version of the presentation. Shares presentation .

Matthias Kraushaar

MK is the service manager for climate and weather. Although his training is not related to weather and climate, he understands the challenges and is focused on finding solutions to keep the work moving forward. Alps is very beneficial for machine learning and will also become part of the climate and weather platform. MK will help introduce the new features and has created a Slack space for users . A channel for the weather and climate platform will be set up, and while it will be monitored from the CSCS side, very technical issues should be turned into tickets.

WS asks about reporting failing nodes and which channel to use. MK suggests using the ICON channel or opening a ticket. WS notes that there is no database for checking current failures. MK responds that they are working on status pages, which can also be used to communicate issues.