Blog

Archive for the ‘HPC System’ Category

Installation of the Phoenix Upgrade for CHIPP

Saturday, February 6th, 2010

In January 2010, CSCS started the upgrade of the “Phoenix” cluster, which is used as a Tier 2 site by the Swiss Institute of Particle Physics (CHIPP) for the LHC experiments at CERN. The Swiss commitment to 3 of the 4 large LHC experiments (ATLAS, CMS and LHCb) mandates establishing a dedicated Grid computing infrastructure for LHC physics data analysis in Switzerland.

The previous system, based on SunBlade 8000 for the Compute Nodes, SF_X4200 M2 machines with 2.8 GHz AMD processors for the Service Nodes, and SF_X4500 for the Storage Nodes with our ZFS-Solaris technology, has now been running for 3 years, and an upgrade and expansion was necessary.

The new HPC system is based on a Gigabit Ethernet network and an additional InfiniBand network using QDR technology with the Sun Datacenter InfiniBand Switch 648, and it uses Lustre as its parallel file system. The worker nodes are based on Sun X6275 blade servers with the new Intel X5500 processor generation (Nehalem, 2.53 GHz, quad-core, 8 MB cache).

The hardware was delivered at the beginning of January 2010 and will be functional in March 2010. The old compute nodes will be decommissioned by mid-year.

In the next pictures you can follow the assembly of the system during the first month.

The truck delivering the hardware at the beginning of January, on a sunny day in Manno.

The boxes with the delivered hardware in the CSCS computing room.

Unloading the racks, which can weigh up to 1’080 kg.

Christoph Grab (CHIPP) inspects the location of the Phoenix upgrade (which is next to the existing system).

The first QDR InfiniBand cables (in blue) are connected to the IB fabric. Because they are very fragile, the cables are suspended from the ceiling rather than run under the raised floor.

The sysadmins of Phoenix, Jason and Fotis, are getting ready to take over the new system.

A New Home for Palu: The ENTER Museum for Computer

Wednesday, February 3rd, 2010

In a previous blog entry we reported on the Cray XT3, Palu, being decommissioned at CSCS. In the meantime we were able to find a new home for three of the six computing racks at ENTER, the Museum for Computer and Technology close to Solothurn.

ENTER is the only museum in Switzerland dedicated to the full breadth of computers, computer peripherals and technology. The number of computer systems still in working order is probably unique worldwide. Besides the approximately 400 computers, the museum also shows 100 pocket calculators, about 50 mechanical calculating machines, telecommunications equipment ranging from the telegraph and exchanges through telephony to the modern mobile phone, tube radios, tube television sets, cipher machines and much more.

It is a good feeling to know that computing systems that once ran at CSCS can be viewed by future generations. The Cray XT3 was an important step for CSCS, being the first massively parallel MIMD supercomputer used at the centre. Palu was used both by researchers of Swiss universities and by MeteoSwiss for weather forecasts.

In the next photos you can see some steps of the shipping of the three racks (each weighing about 830 kg). In addition, an SGI Origin 9500 has also been donated to the museum.

The first Cray XT3 rack exits the CSCS building:

The rack being loaded on the truck for transport:

The console of the Cray XT3 and an additional rack are loaded in the computer room for transport.

The director of CSCS, Prof. Thomas Schulthess, bids farewell to the Cray.

Decommissioning of «Palu» at CSCS

Wednesday, January 27th, 2010

On January 27th, 2010, the Cray XT3 reached the end of its life here at CSCS: the system was shut down for the last time and disassembled.

This computer was named after Piz Palü, a peak in the Bernina range in Graubünden, eastern Switzerland, with an elevation of 3901 m. Its last configuration was as follows:

  • 6 cabinets containing 14 service processing elements (PEs), subdivided into 7 service blades, and 548 dual-core nodes giving 1096 compute PEs, subdivided into 137 compute blades
  • 3 GB of RAM (compute nodes)
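
The figures in this list are consistent with each other, which a few lines of Python can check. This is only an illustrative sketch: the names `palu_summary`, `COMPUTE_NODES` and so on are made up for the example, and the per-blade values are derived here from the numbers above rather than stated in the original configuration.

```python
# Figures quoted in the configuration list above.
COMPUTE_NODES = 548
CORES_PER_NODE = 2       # "dual-core" nodes
COMPUTE_BLADES = 137
SERVICE_PES = 14
SERVICE_BLADES = 7


def palu_summary():
    """Derive per-blade figures from the quoted totals (hypothetical helper)."""
    # One compute PE per core.
    compute_pes = COMPUTE_NODES * CORES_PER_NODE

    # Both divisions should come out even if the quoted totals are consistent.
    nodes_per_blade, node_rest = divmod(COMPUTE_NODES, COMPUTE_BLADES)
    service_per_blade, service_rest = divmod(SERVICE_PES, SERVICE_BLADES)
    if node_rest or service_rest:
        raise ValueError("quoted totals do not divide evenly across blades")

    return {
        "compute_pes": compute_pes,
        "nodes_per_compute_blade": nodes_per_blade,
        "service_pes_per_blade": service_per_blade,
    }


if __name__ == "__main__":
    for name, value in palu_summary().items():
        print(f"{name}: {value}")
```

Under these assumptions the 548 dual-core nodes do give the 1096 compute PEs quoted above, and both blade counts divide evenly.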

Palu started its life as a production system in January 2006. This computer was the very first Cray XT machine to set foot in Europe. It was the first supercomputer based on the XT architecture that ran the Catamount operating system using Infiniband as high-speed interconnect. Palu was mainly used for massively parallel jobs.

Of the 6 cabinets of Palu, two will be disassembled by Cray to be used as spare parts, and one will stay at CSCS and become the first exhibit of the centre's own museum. The last three cabinets will be shipped next week to a computer museum near Solothurn. Stay tuned on this blog to get additional information on the museum next week…

In the next photos you can see the Cray technician starting to disassemble Palu (this will take almost three working days).

Disassembling of the interconnect.

The interconnect cables, and the removal of the power supply cables.

Another view of the interconnect being disassembled:

Disassembling part of the compute blades to be used as spare parts.

Move of Dôle at CSCS to a New Location

Tuesday, January 26th, 2010

Today the HPC system called “Dôle” was moved to a new location inside CSCS. Dôle is used by MeteoSwiss as failover for Buin, the main production system for weather forecasting.

Previously Dôle and Buin were placed next to each other. Moving them to two separate locations in the CSCS computer room ensures maximum availability in case of a technical issue.

HPC @ University of Bern: ubelix – Uni BErn LInuX cluster

Friday, January 22nd, 2010

See also the original slides by Andres Aeschlimann (PDF).

Purpose
This Grid HPC infrastructure is primarily designed to support the researchers at the Campus. They should use their time doing research and not be bothered by deploying a Grid HPC infrastructure.

Picture of ubelix

Some facts

  • first Linux Cluster was installed in 2001 (1 master and 32 single core nodes)
  • continuously expanded to ~1000 cores in >200 nodes today
  • Dual- and quadcore worker nodes
  • Mostly Opterons, increasing number of Intel CPUs (Nehalem)
  • several suppliers (mostly SUN, but currently also IBM and some Dell)
  • < 100kW
  • Gentoo Linux www.gentoo.org
  • Kernel 2.6.22/2.6.27
  • 2 TB memory, 50 TB disk
  • Lustre filesystem: 1.8.1
  • Sun Grid Engine 6.2
  • Gb Switch
  • Currently no Infiniband Switch

Internal (private) network

  • TCP/IP
  • Stackable switches (~40 Gb/s)
  • “normal” Gigabit Ethernet on the worker nodes
  • 10 Gigabit Ethernet for high-throughput servers

Internal Network of ubelix

Lustre@ubelix

ubelix Lustre

Application portfolio (local users)

  • High Energy Physics
  • Astronomy
  • Computational and Molecular Population Genetics Lab
  • Space Research Physics
  • Computer Vision and Artificial Intelligence
  • Chemistry and Biochemistry

Applications from remote (SMSCG)

  • ATLAS: high energy physics application developed for the LHC experiment at CERN
  • RSA768: cryptographic application
  • NAMD and GROMACS: biochemistry applications
  • GAMESS: biochemistry application (work in progress)

Remote access to cluster

Other clusters @ UniBE

  • The LHEP UNIBE Atlas T3 2009 – A ROCKS cluster with ~200 cores (Sun Fire X2200 1U dual quad-core machines) and ~50 TB on CentOS. Located in the same room as the ID UNIBE cluster. Mainly serves local and remote ATLAS scientists. Backfilled with remote users and applications. Speciality: access only via ARC clients, i.e. remote and local users have the same interface. http://ce.lhep.unibe.ch
  • Theoretical Physics (~200 cores, with interconnect)
  • Climate Physics (~100 cores)
  • Space Physics (~100 cores)
  • Chemistry (~100 cores, with interconnect)
  • Computational and Molecular Population Genetics (~60 cores)