Legacy HPC Application Migration 2013

The Legacy HPC Application Migration (LHAM) 2013 special session will be held in conjunction with the IEEE 7th International Symposium on Embedded Multicore/Many-core System-on-Chip (MCSoC-13) at NII, Tokyo, Japan, September 26-28, 2013.


Topics of Interest

In HPC software development, the top priority is always given to performance. As system-specific optimizations are almost always required to fully exploit the potential of a system, application programmers usually optimize their application programs for particular systems. Thus, whenever the target system of an application program changes, they need to adapt the program to the new system. This process is known as legacy HPC application migration. The migration cost increases with the hardware complexity of the target system. Since future HPC systems are expected to be extremely massive and heterogeneous, it will become even more difficult to afford the migration cost in the upcoming post-Petascale era. Therefore, this special session, LHAM, offers an opportunity to share practices and experiences of legacy HPC application migration, and to discuss promising technologies for reducing the migration cost.

Schedule

Workshop Date: September 27, 2013

Program

Invited 1

"Ease of Porting Parallel HPC Codes to Intel Xeon Phi"

Michael McCool, 10:35-11:20 (45 min)

Abstract:

The Intel Xeon Phi co-processor runs Linux and supports many existing parallel and distributed software development systems, including OpenMP and MPI. The Intel compiler also supports offload functionality, so that existing computations running on a host processor can be annotated to offload parts of the computation to the Xeon Phi. Both of these approaches can be used to develop Xeon Phi applications with very little modification of existing code. However, applications may still need to be tuned for the best performance. The Xeon Phi has a very large number of cores and a wide vector width, and applications need to make good use of both of these features to get the best performance. Intel also provides a number of tools to analyze performance and identify bottlenecks. In this talk I will go over the features of the Xeon Phi, discuss the parallel and distributed programming systems that work with it, and then present some case studies of applications that have been ported and tuned for the Xeon Phi.
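As a concrete illustration of the offload approach described above (a minimal sketch, not taken from the talk), the following C fragment annotates a simple loop so that the Intel compiler's offload pragma runs it on the Xeon Phi, where it is parallelized with OpenMP. The array names, sizes, and the element-wise addition are purely illustrative.

#include <stdio.h>
#include <stdlib.h>

#define N 1000000

int main(void)
{
    float *a = malloc(N * sizeof(float));
    float *b = malloc(N * sizeof(float));
    float *c = malloc(N * sizeof(float));

    for (int i = 0; i < N; i++) {
        a[i] = (float)i;
        b[i] = 2.0f * i;
    }

    /* Copy a and b to the coprocessor, run the loop there, and copy c back.
       If no coprocessor is available, the offload runtime falls back to the host. */
    #pragma offload target(mic) in(a, b : length(N)) out(c : length(N))
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[42] = %f\n", c[42]);
    free(a); free(b); free(c);
    return 0;
}

Compiled with the Intel compiler (OpenMP enabled), this yields a single executable containing both host and coprocessor code. The point of the example is that such annotations leave the original loop essentially unchanged; tuning for the many cores and wide vector units still requires additional effort.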

Bio:

Michael McCool is an Intel Principal Engineer. He has degrees in Computer Engineering (University of Waterloo, BASc) and Computer Science (University of Toronto, MSc and PhD), with specializations in mathematics (BASc) and biomedical engineering (MSc), as well as computer graphics and parallel computing (MSc, PhD). He has research and application experience in the areas of data mining, computer graphics (specifically sampling, rasterization, path rendering, texture hardware, antialiasing, shading, illumination, function approximation, compression, and visualization), medical imaging, signal and image processing, financial analysis, and parallel languages and programming platforms. In order to commercialize research on many-core computing platforms done while he was an Associate Professor at the University of Waterloo, in 2004 he co-founded RapidMind, which was acquired by Intel in 2009. Currently he is a software architect with Intel working on parallel programming languages, applications, and mobile computing. In addition to his university teaching, he has presented numerous tutorials at Eurographics, SIGGRAPH, and SC on graphics and/or parallel computing, and has co-authored three books. The most recent book, Structured Parallel Programming, was co-authored with James Reinders and Arch Robison. It presents a pattern-based approach to parallel programming using a large number of examples in Intel Cilk Plus and Intel Threading Building Blocks.

Invited 2

"Software Engineering of Scientific Applications for Portability, Evolvability, and Performance"

Shirley Moore, 14:50-15:35 (45 min)

Abstract:

Execution environments and scientific applications are both in a period of rapid evolution and are becoming increasingly complex. High-performance architectures are evolving towards combining many processors with diverse architectures, ranging from multicore chips to SIMD accelerators. Applications must execute effectively on a diversity of architectures over their lifetimes. The increasing diversity and complexity of execution environments and application codes is leading to a steep increase in the cost and effort of attaining portability, evolvability, and performance. Design and development methodologies appropriate for life-cycle use with complex, long-lived application systems, such as software architectures and component-based software engineering, are now well established in mainline computer science. Methods and tools exist that, when extended and combined, can yield a tool chain for partial automation of the necessary code transformations. Adapting and extending these software engineering methodologies and developing a tool chain for life-cycle development of complex scientific application codes has the potential to yield a paradigm change in how these codes are mapped onto emerging complex architectures. We describe initial steps towards establishing such a tool chain, as well as initial successes in restructuring legacy application codes.

Bio:

Shirley Moore is an Associate Professor of Computer Science at the University of Texas at El Paso (UTEP). She is also a core faculty member in the graduate interdisciplinary Computational Sciences Program at UTEP. She received a PhD in Computer Sciences from Purdue University in 1990. Her research interests are in software engineering, performance modeling, and performance optimization of scientific applications, and in hardware-software co-design. She is a Principal Investigator on U.S. Department of Energy and Air Force Office of Scientific Research funded projects in these areas.

Session 1

11:25-12:40 (75 min)

"Towards an Extensible Programming Environment for Software Evolution"

Hiroyuki Takizawa, 25min

"Experience of Implementing Parallel FFTs on GPU Clusters"

Daisuke Takahashi, 25min

Session 2

15:50-17:05 (75 min)

"The Future of Accelerator Programming: Abstraction, Performance or Can We Have Both?"

Kamil Rocki, 25min

"An HPC Refactoring Catalog; Guidelines to Bridge The Gap between HPC Systems"

Ryusuke Egawa, 25min

Committees

Organizing Committee

Advisory Committee

Supports

JST CREST Basic Research Programs, research area "Development of System Software Technologies for post-Peta Scale High Performance Computing", project "An evolutionary approach to construction of a software development environment for massively-parallel heterogeneous systems".

Contact

E-mail: lham2013 .at. xev.arch.is.tohoku.ac.jp (replace ".at." by "@" in the email address)