The Oak Ridge Leadership Computing Facility (OLCF) is holding a workshop on scientific workflows for large data sets
on August 6-8 at the University of Tennessee in Knoxville, Tennessee, USA.
The workshop will be transmitted over the web (technical details will follow).
This workshop will help HPC users learn how to deal with large scientific data, “from cradle to publication,” by bringing together emerging communities that produce large HPC-related scientific data and communities with experience managing large, permanent scientific data. It is also aimed at the broader “Big Data” community. Planned topics include:
- How do I know if I have BIG data?
- What you should use for large data prep and analysis
- Why shuffling data during a job kills performance and how you can improve it
- Libraries: better ways to do parallel I/O
- Are all file formats the same?
- How do I begin to visualize enormous datasets?
- In-situ analysis: a how-to
- Sharing your massive data with friends and strangers
- Future outlook on a growing data problem
Hands-on tutorials are also planned for various scripting languages, parallel I/O libraries, and visualization and analysis tools.
Speakers from institutions such as ORNL, CERN, NERSC, JGI, and CSCS (collaborators on the Blue Brain Project) will discuss lessons learned and what has worked for them.
The workshop is free of charge, and registration is now open.