This is the second part of the posting on the SC10 Report on Climate Thrust Area by Will Sawyer.
Community Climate Model Development
In a masterworks talk, Ricky Rood (professor at the Univ. of Michigan) insisted that climate projections offer an historic opportunity: for the first time we have a glimpse into the future and can potentially adapt our way of living to minimize the risks. He emphasized the considerable difference between weather prediction and climate projection. Ricky presented several interesting problems, such as a category-5 hurricane hitting New Orleans, and how irrigation in the Sahara would affect hurricanes. Since there are so many more climate data consumers than producers, “smart” (i.e., automated) consultants are needed, for example the MAGICC/SCENGEN project.
Ricky’s discussion led up to the introduction of open community activities that use climate projections to analyze climate change, a concept he puts under the umbrella of the OpenClimate community (which he is spearheading). The Community Earth System Model (CESM) is a successful example of open community collaboration, but open source is not part of climate modeling culture. The model for such a community might be Linux development. This talk did not directly address HPC issues, but rather offered a philosophical view on how we need to proceed as a community to effectively use HPC climate models to address real-world problems.
Revolutionizing Climate Modeling: Advantages of Dedicated High-Performance Computing
Jim Kinter from the Center for Ocean-Land-Atmosphere Studies (COLA) discussed their experimental approach to climate modeling, namely an interdisciplinary, inter-center collaboration to solve real climate questions on dedicated HPC resources. COLA set out to determine whether improved horizontal resolution actually does improve climate projections. Jim gave several examples of why it could, e.g. by resolving ocean eddies that are important for driving the atmospheric flow. To test this, COLA performed multiple climate runs on the dedicated “Athena” machine (Cray XT4) at the National Institute for Computational Sciences (NICS), which is the former Kraken machine at ORNL from before its upgrade to XT5. Using the IFS and NICAM models, they were able to generate 70 TB of data in various experiments. He presented a sampling of results, for example on the phenomenon of atmospheric blocking: the high-resolution T1279 run describes this much better than the T159 run. Snow depth is also much better represented with T1279, and the same holds for the occurrence of hurricanes. The results clearly support the thesis that higher resolution gives better climate projections.
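To get a feeling for what T159 and T1279 mean in terms of grid spacing, here is a rough back-of-the-envelope sketch. It assumes a linear Gaussian grid with 2(N+1) longitudes at the equator for a spectral truncation TN; that assumption, the function name and the circumference constant are mine and were not part of the talk.

```python
# Rough conversion from spectral truncation to equatorial grid spacing.
# Assumption (not from the talk): a linear Gaussian grid with
# 2 * (N + 1) longitudes at the equator for truncation T<N>.

EARTH_CIRCUMFERENCE_KM = 40075.0  # equatorial circumference of the Earth


def approx_grid_spacing_km(truncation: int) -> float:
    """Approximate equatorial grid spacing in km for truncation T<truncation>."""
    n_longitudes = 2 * (truncation + 1)
    return EARTH_CIRCUMFERENCE_KM / n_longitudes


if __name__ == "__main__":
    for t in (159, 1279):
        print(f"T{t}: roughly {approx_grid_spacing_km(t):.0f} km")
```

Under this assumption T159 corresponds to a spacing of roughly 125 km and T1279 to roughly 16 km, i.e. about a factor of eight in each horizontal direction.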
GPUs for Climate Models
Mark Govett of NOAA gave an HPC-oriented presentation of climate science. He first gave an overview of current architectures, illustrating the emergence of GPUs in the new Top500 list. After this brief introduction to GPU computing, he continued with the GPU parallelization of the Non-hydrostatic Icosahedral Model (NIM). They used an F2C-to-accelerator cross-compiler to generate CUDA code; this compiler resembles PGI F90 with accelerator directives in some ways, and PGI CUDA Fortran in other ways. He also evaluated the CAPS HMPP and PGI compiler solutions.
After going through a checklist of optimizations, he presented a comparative analysis of the speedups for the three different compilers. All three give speedups between 5x and 50x for various components; the average is roughly 20x. The I/O bottleneck now has to be addressed. The GPU version is not ready for production, but it is already a valuable research tool.
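Why I/O becomes the next concern once the compute kernels run about 20x faster can be illustrated with Amdahl's law. This is only a sketch: the 20x figure is the rough average quoted in the talk, while the I/O shares of 5% and 20% are hypothetical numbers chosen for illustration, not measurements from NIM.

```python
# Amdahl's law: overall speedup when only part of the runtime is accelerated.
# The I/O shares below are hypothetical and only illustrate the trend.

def overall_speedup(accelerated_fraction: float, component_speedup: float) -> float:
    """Whole-application speedup if `accelerated_fraction` of the original
    runtime gets faster by `component_speedup` and the rest is unchanged."""
    unaccelerated = 1.0 - accelerated_fraction
    return 1.0 / (unaccelerated + accelerated_fraction / component_speedup)


if __name__ == "__main__":
    for io_share in (0.0, 0.05, 0.20):  # share of runtime spent in I/O
        s = overall_speedup(1.0 - io_share, 20.0)
        print(f"I/O share {io_share:4.0%}: overall speedup {s:.1f}x")
```

Even a 5% share of non-accelerated I/O roughly halves the overall gain, from 20x to about 10x.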
BoF Session: The NOAA Global Interoperability Project
Cecelia DeLuca (NOAA) and Venkatramani Balaji (GFDL) presented the Global Interoperability Project (gip.noaa.gov). One of the goals is to educate interdisciplinary experts. GIP identifies interoperable components and enables information exchange. Its components appear to include (1) a program to analyze the governance of modeling efforts and to help coordinate them, (2) maintenance and extension of the Earth System Modeling Framework (ESMF) for the Community Climate System Model (CCSM), and (3) the Earth System Curator, which assists in developing collaborative software. Several users of the GIP infrastructure were in the audience and explained their projects.
High-End Computing and Climate Modeling: Trends and Prospects
In a masterworks talk, Phil Colella had an axe to grind: he would like to be objective in the face of the claims of massive performance increases in underlying architectures. As a mathematician, he has taken climate modeling as a case study. Climate modeling at exascale entails: (1) 1 km resolution, (2) physical processes that are no longer columnar, (3) a more efficient representation of local ocean currents and land processes, and (4) a more systematic evaluation of uncertainties. But he stated the “brutal facts” of the transition to exascale: reduction of memory bandwidth per Flop, massive increase in concurrency, etc. Adaptively refining the grid in order not to waste memory, utilizing non-hydrostatic models, and using implicit methods with multigrid solvers will be necessary. Co-design is key, since hardware constraints are reflected in the algorithms to be employed. One interesting view is that we will need a low-latency, but not necessarily high-bandwidth, communication layer to reduce small amounts of data globally. Hardware developers must tackle this.
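The point about low latency mattering more than bandwidth for small global reductions can be made concrete with the standard latency/bandwidth cost model. The following is a sketch under my own assumptions: a tree-shaped reduction with ceil(log2 P) steps, and hypothetical hardware parameters (1 microsecond latency, 10 GB/s bandwidth) that do not come from the talk.

```python
import math

# Cost model for a tree-based global reduction: each of the ceil(log2(P))
# steps costs latency + message_bytes / bandwidth.
# The hardware numbers are hypothetical and only serve as an illustration.

LATENCY_S = 1.0e-6          # assumed per-message latency: 1 microsecond
BANDWIDTH_BYTES_S = 1.0e10  # assumed link bandwidth: 10 GB/s


def reduction_time_s(n_processes: int, message_bytes: int) -> float:
    """Estimated time of a tree reduction over n_processes."""
    steps = math.ceil(math.log2(n_processes))
    return steps * (LATENCY_S + message_bytes / BANDWIDTH_BYTES_S)


def latency_share(message_bytes: int) -> float:
    """Fraction of each step's cost that is pure latency."""
    per_step = LATENCY_S + message_bytes / BANDWIDTH_BYTES_S
    return LATENCY_S / per_step


if __name__ == "__main__":
    p = 2 ** 20  # about one million processes
    for nbytes in (8, 8 * 1024 * 1024):  # one double vs. an 8 MiB buffer
        t = reduction_time_s(p, nbytes)
        print(f"{nbytes:>8d} bytes: {t:.2e} s, "
              f"latency share {latency_share(nbytes):.1%}")
```

For a single 8-byte value, such as a global residual norm in an implicit solver, essentially all of the cost in this model is latency, so extra bandwidth would not help.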
Technical Paper: ASUCA
This technical paper described the GPU port of ASUCA, the production non-hydrostatic model used by the Japan Meteorological Agency (JMA). They achieved 145 TFlop/s using 3990 GPUs of TSUBAME 2.0. For historical context: AFES attained 26.58 TFlop/s on the Earth Simulator in 2002 (Gordon Bell prize winner), and WRF was partially ported to GPU and attained 50 TFlop/s. ASUCA was rewritten from scratch, first from Fortran to C/C++ and then to CUDA. A horizontal X/Y data decomposition is used, with MPI halo updates that can be overlapped with computation. The results, if true, represent a huge achievement for a weather model, and correspondingly the paper has been nominated for best paper. However, several in the audience expressed doubts about the fairness of the CPU/GPU performance comparison, citing the non-optimized nature of the CPU code.
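The idea of overlapping halo updates with computation in an X/Y decomposition can be sketched as follows. This is not the ASUCA code: it is a serial NumPy simulation without MPI, with a 5-point stencil as a stand-in for the model dynamics, periodic boundaries assumed, and all function names invented for illustration. Each tile first updates the cells that need no halo data (the work that could run while messages are in flight), then receives its halo and finishes the boundary ring.

```python
import numpy as np


def laplacian_global(f):
    """Reference: 5-point stencil on the full periodic field."""
    return (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
            np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4.0 * f)


def stencil(p, r0, r1, c0, c1):
    """5-point stencil on rows r0:r1, cols c0:c1 of the padded tile p."""
    return (p[r0 - 1:r1 - 1, c0:c1] + p[r0 + 1:r1 + 1, c0:c1] +
            p[r0:r1, c0 - 1:c1 - 1] + p[r0:r1, c0 + 1:c1 + 1] -
            4.0 * p[r0:r1, c0:c1])


def laplacian_decomposed(f, py, px):
    """Same stencil, computed tile by tile on a py x px decomposition."""
    ny, nx = f.shape[0] // py, f.shape[1] // px
    out = np.empty_like(f)
    for j in range(py):
        for i in range(px):
            r, c = j * ny, i * nx
            # Padded tile; halo cells are NaN until the "exchange" completes.
            p = np.full((ny + 2, nx + 2), np.nan)
            p[1:-1, 1:-1] = f[r:r + ny, c:c + nx]
            res = np.empty((ny, nx))
            # Step 1: interior cells need no halo data; in a real code this
            # runs while the non-blocking halo messages are in flight.
            res[1:-1, 1:-1] = stencil(p, 2, ny, 2, nx)
            # Step 2: the halo "arrives". Here the tile plus its one-cell
            # periodic halo is simply copied from the global field.
            rows = np.arange(r - 1, r + ny + 1) % f.shape[0]
            cols = np.arange(c - 1, c + nx + 1) % f.shape[1]
            p[:] = f[np.ix_(rows, cols)]
            # Step 3: update the one-cell boundary ring that uses the halo.
            res[0, :] = stencil(p, 1, 2, 1, nx + 1)[0]
            res[-1, :] = stencil(p, ny, ny + 1, 1, nx + 1)[0]
            res[:, 0] = stencil(p, 1, ny + 1, 1, 2)[:, 0]
            res[:, -1] = stencil(p, 1, ny + 1, nx, nx + 1)[:, 0]
            out[r:r + ny, c:c + nx] = res
    return out


if __name__ == "__main__":
    field = np.random.default_rng(0).random((12, 16))
    same = np.allclose(laplacian_decomposed(field, 3, 4), laplacian_global(field))
    print("decomposed result matches global result:", same)
```

Because the halo cells hold NaN during step 1, the comparison with the global result would fail if the interior update accidentally depended on data that had not yet arrived.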
BoF: GPUs and NWP
In a Birds-of-a-Feather session, Stan Posey and Will Sawyer invited five panelists to explain their approaches to porting their codes to GPUs: Takayuki Aoki from the ASUCA project, Tokyo Institute of Technology; Mark Govett from the NIM model, NOAA; Tom Clune from the GEOS5 project, NASA GSFC; Craig Toepfer, NVIDIA; and Per Nyberg, Cray. Their short presentations explained several differing techniques: a Fortran-to-CUDA translator (Mark), a CUDA rewrite (Takayuki), CUDA Fortran (Tom), and PGI accelerator directives (Craig, and, in more general terms, Per). The subsequent discussion treated issues such as the priorities between the dynamics, physics and chemistry parts of models (chemistry was generally deemed to be the black sheep), the difficulties inherent in semi-structured meshes (Mark had the most experience here, and reported good performance in spite of the indirect addressing), and retaining the investment in software (accelerator directives appear to be central here).