Posts Tagged ‘SC10’
Monday, February 7th, 2011
This is the second part of the posting on the SC10 Report on Climate Thrust Area by Will Sawyer.
Community Climate Model Development
In a masterworks talk, Ricky Rood (professor at Univ. of Michigan) insisted that climate projections offer a historic opportunity: for the first time we have a glimpse into the future and can potentially adapt our way of living to minimize the risks. He emphasized the considerable difference between weather prediction and climate projections. Ricky presented several interesting problems, such as a level-5 hurricane hitting New Orleans and the question of how irrigation in the Sahara would affect hurricanes. Since there are so many more climate data consumers than producers, “smart” (i.e., automated) consultants are needed, for example the MAGICC/SCENGEN project:
http://www.cgd.ucar.edu/cas/wigley/magicc/.
Ricky’s discussion led up to the introduction of open community activities that use climate projections to analyze climate change, a concept he puts under the umbrella of the OpenClimate community (which he is spearheading). The Community Earth System Model (CESM) is a successful example of open community collaboration, but open source is not part of climate modeling culture. The model for such a community might be Linux development. This talk did not directly address HPC issues, but was rather a philosophical view on how we need to proceed as a community to effectively use HPC climate models to address real-world problems.
(more…)
Tags: Climate, Conference, SC10 Posted in Conference | Comments Off
Monday, January 31st, 2011
The short series of reports from the SC10 conference closes with two postings from our colleague Will Sawyer (CSCS) on the Climate Thrust Area. The climate thrust area had a significant presence at the conference. The basic program consisted of one plenary talk, four masterworks presentations, one BoF session, two panels and a number of technical papers. The first posting is dedicated to the Climate Plenary Talks, Challenges in Analysis and Visualization of Large Climate Data Sets, Parallel I/O, Pushing the Frontiers of Climate and Weather Models and Climate Computing.
Climate Plenary Talks
In the climate thrust area plenary talk “Climate Prediction and Research: The Next 20 years”, Terry Davies from the UK Meteorological Office presented the current status of climate models, and the prospects for the future. As a general overview, there were not many surprises: ensembles are central in climate studies; climate projections are improving, but the metrics of reliability are not the same as forecast skill for numerical weather prediction; and there are significant impediments to scaling the current models to petascale. One such hurdle is the amount of data which models currently generate.
(more…)
Tags: Climate, Conference, SC10 Posted in Conference | Comments Off
Thursday, January 20th, 2011
We continue our short series of reports from the SC10 conference based on the input of our colleague Will Sawyer. This posting is dedicated to advanced topics in heterogeneous programming with OpenCL.
Trying to find out whether OpenCL is the right horse to bet on, we attended Advanced Topics in Heterogeneous Programming with OpenCL, given by Tim Mattson (Intel), Ben Gaster (AMD), Ian Buck (NVIDIA), Peng Wang (NVIDIA), and Mike Houston (AMD).
Tim gave a quick summary of the OpenCL introduction (previous post) and treated the limited synchronization model: there are synchronization primitives within work groups, but between groups all synchronization happens at the level of commands. He also reminded us of the event model: commands return events and obey wait lists.
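The event model can be pictured with a toy scheduler. The following is only a sketch in plain Python, not OpenCL host code: `Event`, `Queue`, `enqueue` and `finish` are hypothetical names chosen for illustration. In real OpenCL the same idea is exposed through `cl_event` handles and the wait-list arguments of the `clEnqueue*` calls.

```python
# Toy model of the OpenCL event mechanism (hypothetical names, not the real API).

class Event:
    """Handle returned by every enqueued command."""
    def __init__(self, name):
        self.name = name
        self.complete = False


class Queue:
    """Out-of-order queue: a command may run once its wait list is complete."""
    def __init__(self):
        self.pending = []   # entries are (event, action, wait_list)
        self.log = []       # order in which commands actually ran

    def enqueue(self, name, action, wait_list=()):
        event = Event(name)
        self.pending.append((event, action, tuple(wait_list)))
        return event

    def finish(self):
        """Run all pending commands, always respecting the wait lists."""
        while self.pending:
            runnable = [cmd for cmd in self.pending
                        if all(e.complete for e in cmd[2])]
            if not runnable:
                raise RuntimeError("wait lists can never be satisfied")
            # Deliberately pick the most recently enqueued runnable command,
            # to show that ordering comes from events, not from enqueue order.
            cmd = runnable[-1]
            event, action, _ = cmd
            action()
            event.complete = True
            self.log.append(event.name)
            self.pending.remove(cmd)


data = {}
q = Queue()
write = q.enqueue("write", lambda: data.update(x=[1, 2, 3]))
kernel = q.enqueue("kernel",
                   lambda: data.update(y=[2 * v for v in data["x"]]),
                   wait_list=[write])
read = q.enqueue("read",
                 lambda: data.update(result=list(data["y"])),
                 wait_list=[kernel])
independent = q.enqueue("independent", lambda: data.update(z=0))
q.finish()
```

The three dependent commands always run in the order write, kernel, read, while the independent command is free to run at any point. That is the only kind of ordering guarantee available between work groups.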
Ben discussed the mapping of hardware platforms to OpenCL, using the AMD 5870 and Cell BE as examples.
Ian discussed the Fermi (GTX 480) architecture. The mapping was not well presented, leading Tim to ask exactly what values the OpenCL querying commands will return on the various architectures.
Mike answered that question for all the architectures, e.g., the Fermi will support 15 work groups, with 32 work items per group. To fill these chips, you typically need tens of thousands of work items.
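To make the relation between work groups and work items concrete, here is a minimal sketch that emulates a one-dimensional launch in plain Python. The names `round_up`, `launch` and `vector_add` are hypothetical, and the problem size of 50,000 is an arbitrary assumption. In OpenCL C the corresponding indices come from `get_group_id`, `get_local_id` and `get_global_id`.

```python
def round_up(n, multiple):
    """Smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


def launch(kernel, n, local_size):
    """Emulate a 1-D launch: call the kernel once per work item, group by group."""
    # The global size is padded so that it divides evenly into work groups.
    global_size = round_up(n, local_size)
    num_groups = global_size // local_size
    for group_id in range(num_groups):
        for local_id in range(local_size):
            global_id = group_id * local_size + local_id
            kernel(global_id)
    return num_groups


n = 50_000                      # assumed problem size
a = list(range(n))
b = [2 * v for v in a]
c = [0] * n


def vector_add(gid):
    # Guard: padded work items beyond the end of the data do nothing.
    if gid < n:
        c[gid] = a[gid] + b[gid]


num_groups = launch(vector_add, n, local_size=32)
```

With 32 work items per group, 50,000 elements are padded to 50,016 work items in 1,563 groups, which is why the kernel needs the bounds check.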
Peng presented his implementation of the conjugate gradient solver for sparse matrices, closely analogous to Nathan Bell’s and Michael Garland’s (both from NVIDIA) work on CUSP. The hybrid (ELLPACK+COO) format performs best for almost all matrices tested. The conversion from/to the typical CSR format is expensive, but can be parallelized on multiple cores. Unlike the CUSP team, Peng has no plans to wrap this work into a library, which seems quite unfortunate.
Mike Houston presented the discrete Fourier transform for GPUs, which illuminated many of the performance issues, but showed very little OpenCL code. The discursive nature of this session symbolized the lack of maturity of OpenCL. Clearly, OpenCL has a lot of potential, but it is a work in progress. Other paradigms, such as CUDA, may be proprietary and less solid, but they are farther along.
Tags: OpenCL, SC10 Posted in Conference, Technology | Comments Off
Monday, January 17th, 2011
We start a short series of reports from the SC10 conference (thanks to Will Sawyer of CSCS for sharing his experiences with us). The first episode is dedicated to an Introduction to OpenCL given by Benedict Gaster (AMD) and Tim Mattson (Intel).
General impressions: the architecture and programming model seem cleaner than CUDA, e.g., the architecture model hierarchy: compute devices -> compute units -> processing elements, as well as the execution model of work groups which contain work items. As a paradigm, OpenCL seems to be sufficiently powerful to map programs to a non-trivial set of devices, i.e., multiple GPUs and multiple CPUs with multiple cores. But there are many steps to writing an OpenCL program, some of them complex. The presenters made it clear this is necessary to support such a wide range of processors, but claim it is mostly boilerplate and can be cut and pasted after the first implementation (and auto-generated at some later stage).
A key deficiency seems to be the decision to base OpenCL on the ISO C99 language standard (with minor restrictions, e.g., no recursion and no variable length arrays), instead of C++. The other deficiency, which both presenters admitted, is the myth of portability: although OpenCL code can be run on both CPU and GPU, optimization choices made in several of the examples for GPUs caused the code not to run on CPUs. Besides the typical simple vector add and matrix multiplication examples, they also presented a (non-trivial) parallel radix sort, which is allegedly the best performing sort algorithm on GPUs.
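For readers unfamiliar with radix sort, the following is a minimal sequential sketch in Python of the least-significant-digit variant. It is not the presenters' parallel implementation. The function name, the 4-bit digit size and the sample data are assumptions. The three phases in each pass (histogram, exclusive prefix sum, stable scatter) are the parts that a GPU version distributes across work groups.

```python
def radix_sort(keys, bits_per_pass=4, key_bits=32):
    """LSD radix sort of non-negative integers smaller than 2**key_bits."""
    radix = 1 << bits_per_pass
    mask = radix - 1
    keys = list(keys)
    for shift in range(0, key_bits, bits_per_pass):
        # 1. Histogram: count how many keys have each digit value.
        counts = [0] * radix
        for k in keys:
            counts[(k >> shift) & mask] += 1
        # 2. Exclusive prefix sum: first output position for each digit value.
        offsets = [0] * radix
        total = 0
        for d in range(radix):
            offsets[d] = total
            total += counts[d]
        # 3. Stable scatter: keys with equal digits keep their relative order.
        out = [0] * len(keys)
        for k in keys:
            d = (k >> shift) & mask
            out[offsets[d]] = k
            offsets[d] += 1
        keys = out
    return keys


data = [170, 45, 75, 90, 802, 24, 2, 66]
result = radix_sort(data)
```

Because each pass only needs counting, a scan and a scatter, with no data-dependent comparisons between keys, the algorithm maps well onto many independent work items.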
In the final question of the session, Tim and Ben were asked whether companies like Intel, AMD and NVIDIA are playing both sides by supporting OpenCL on the one hand while also pursuing proprietary alternatives. Tim (in spite of his affiliation) made an emphatic statement: “industries will act stupidly if you [i.e., the users] allow them to. Every time you use CUDA you are damaging the community.”
The release schedule has a cadence of about 18 months: OpenCL 1.0 in Dec. 2008, 1.1 in Jun. 2010 and 1.2 expected around Dec. 2011.
Tags: OpenCL, SC10 Posted in Conference, Technology | Comments Off
Monday, November 8th, 2010
A look at the schedule of the Supercomputing Conference (SC10) in New Orleans (November 13-19, 2010) shows that Switzerland is represented in panels, in doctoral research showcases and as an ACM Gordon Bell finalist.

Will Sawyer (CSCS) will be chair of the technological thrust area Addressing Climate Change Uncertainties.
Each year, the SC Technical Program highlights key thrust areas that are integrated throughout the various components of the program to showcase the SC community’s impact on these new and emerging fields. For SC10, the technological thrust areas are climate simulation, heterogeneous computing and data-intensive computing. An article on scientificcomputing.org introduces the three thrust areas.
Sadaf R. Alam (CSCS) is a member of the Technical Paper program committee and of the Performance Committee. Sadaf will chair the doctoral research showcase, is co-author of an accepted technical paper on Optimal Utilization of Heterogeneous Resources for Biomolecular Simulations, and will present a poster at the PGAS booth.
John Biddiscombe (CSCS) will moderate the Panel Parallel I/O: Libraries and Applications, Making the Most of Resources.
The need for parallel IO is becoming ever more pressing. With several libraries available, some more established than others, and some seemingly filling niche roles for particular scientific domains, the developer has a difficult time choosing the right library and getting the best performance from it. If parallel file systems can be scaled up to support more and more IO nodes and metadata servers, then is the issue of raw bandwidth a ‘solved’ problem? If so, how can library developers ensure that scientific applications get the maximum performance without requiring extensive understanding of the underlying methods and issues? In this panel, we bring together developers of some of the leading parallel IO libraries used in the scientific computing community and ask how they can best be used by developers (such as climate modelers) producing prodigious amounts of data, and how the libraries will change to meet future needs.
Anton Kozhevnikov (ETH Zürich), Adolfo G. Eguiluz (University of Tennessee, Knoxville) and Thomas C. Schulthess (ETH Zürich/CSCS) are ACM Gordon Bell finalists and will present Toward First Principles Electronic Structure Simulations of Excited States and Strong Correlations in Nano- and Materials Science.
Methods based on the many-body Green’s function are generally accepted as the path forward beyond Kohn-Sham based density functional theory, in order to compute from first principles electronic structure of materials with strong correlations and excited state properties in nano- and materials science. Here we present an efficient method to compute the screened Coulomb interaction W, the crucial and computationally most demanding ingredient in the GW method, within the framework of the all-electron Linearized Augmented Plane Wave method. We use the method to compute from first principles the frequency dependent screened Hubbard U parameter for La2CuO4, the canonical parent compound of several cuprate high-temperature superconductors. These results were computed at scale on the Cray XT5 at ORNL, sustaining 1.30 petaflop. We discuss the details of the algorithm and implementation that allowed us to reach high efficiency and minimal time to solution on today’s petaflops supercomputers.
Tags: CSCS, SC10 Posted in Conference, CSCS | Comments Off