Previous events


ADER-DG - a high-order, compute-bound scheme for future supercomputers?


You are invited to a research talk by Michael Bader from the Department of Informatics of the Technical University of Munich. Prof Bader will be around all afternoon, so please get in touch if you would like to have a meeting with him.


Date: December 15th, 2016
Time: 16:00 - 17:00
Venue: LT 311, Huxley  (building 13 on the map)
Campus: South Kensington Campus, Imperial College London



Supercomputers of the current petascale and future exascale class are posing new requirements on simulation software. Besides demands for energy efficiency or resiliency, a key requirement on exascale-ready numerical schemes is posed by the trend towards an increasing ratio between executed floating point operations and transferred bytes of memory - hence, numerical schemes with high arithmetic intensity will be privileged on future architectures.

This presentation will discuss two simulation packages that exploit the ADER-DG (Arbitrary high-order DERivative Discontinuous Galerkin) scheme. SeisSol simulates dynamic rupture processes and seismic wave propagation on adaptive tetrahedral meshes for a highly accurate physics-based modelling of earthquakes. It has achieved multi-PFlop/s performance on several of the largest supercomputers. ExaHyPE (an Exascale Hyperbolic PDE Engine) is being developed as part of a Horizon 2020 project to meet the requirements of exascale hardware. The engine will solve hyperbolic PDEs using high-order ADER-DG on tree-structured Cartesian grids. The talk will discuss experiences with optimising SeisSol and introduce plans and first results for the ExaHyPE engine.
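
To make the notion of arithmetic intensity in the abstract above concrete, here is a minimal Python sketch based on the standard roofline performance model, which the announcement itself does not mention. The machine figures and the two kernels are hypothetical and chosen only for illustration; they are not measurements of SeisSol or ExaHyPE.

```python
def arithmetic_intensity(flops, bytes_moved):
    """Floating-point operations executed per byte transferred from memory."""
    return flops / bytes_moved


def attainable_gflops(intensity, peak_gflops, bandwidth_gbs):
    """Roofline bound: limited either by memory bandwidth or by peak compute."""
    return min(peak_gflops, bandwidth_gbs * intensity)


# Hypothetical machine: 1000 GFlop/s peak, 100 GB/s memory bandwidth.
PEAK, BANDWIDTH = 1000.0, 100.0

# Hypothetical kernels: a low-order update (2 flops per 16 bytes moved)
# and a high-order element kernel (400 flops per 16 bytes moved).
low_order = arithmetic_intensity(2, 16)
high_order = arithmetic_intensity(400, 16)

print(attainable_gflops(low_order, PEAK, BANDWIDTH))   # bandwidth-limited
print(attainable_gflops(high_order, PEAK, BANDWIDTH))  # compute-limited
```

Under these assumed numbers the low-order kernel is capped by memory bandwidth, while the high-order kernel can reach the compute peak, which is the sense in which high arithmetic intensity is favoured.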


Michael Bader is Associate Professor at the Department of Informatics of the Technical University of Munich. He works on hardware-aware algorithms in computational science and engineering and in high performance computing.  His main focus is on the challenges imposed by the latest supercomputing platforms and the development of suitable efficient and scalable algorithms and software for simulation tasks in science and engineering.  His research group is located at the Leibniz Supercomputing Center.



Program Analysis and Transformation for Scientific Computing 

You are warmly invited to a research talk by Paul Hovland, Argonne National Laboratory. If you would like to meet Paul, he is visiting Dec 2-6.

Date: December 5th, 2016
Time: 11:00 - 12:00
Venue: Room 217, Huxley  (building 13 on the map)
Campus: South Kensington Campus, Imperial College London



We discuss several applications of program analysis and transformation in scientific computing.  We begin with a discussion of automatic empirical performance tuning (autotuning) techniques and strategies for dealing with multiple, competing objectives (such as time and power).  We continue with a discussion of automatic (also called algorithmic) differentiation techniques for computing the derivatives of functions defined by computer subprograms.  We conclude with a consideration of program verification, with an emphasis on proving the equivalence of two implementations.
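
As an illustration of the automatic differentiation idea mentioned in the abstract, the following is a minimal forward-mode sketch using dual numbers in Python. It is not taken from the speaker's tools; the names `Dual`, `derivative` and `f` are hypothetical, and only addition, multiplication and sine are supported.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative with respect to the input."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def sin(x):
    """Sine with the chain rule applied to the derivative part."""
    x = _lift(x)
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)


def derivative(f, x):
    """Evaluate f'(x) by seeding the input with derivative 1."""
    return f(Dual(x, 1.0)).der


def f(x):
    # An ordinary subprogram: f(x) = x * sin(x) + 3 * x
    return x * sin(x) + 3 * x


print(derivative(f, 2.0))  # compare with sin(2) + 2*cos(2) + 3
```

The derivative is computed alongside the value by overloading the arithmetic operators, so `f` itself does not need to be rewritten.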



Paul Hovland's research focuses on program analysis and transformation tools for high performance scientific computing applications. He holds a B.S. in computer engineering and an M.S. in computer science from Michigan State University. He received his Ph.D. in computer science with a computational science and engineering option from the University of Illinois at Urbana-Champaign, advised by Michael T. Heath. He is a Senior Computer Scientist and the Strategic Lead for Applied Mathematics in the Mathematics and Computer Science Division at Argonne National Laboratory.



High-Precision Anchored Accumulators for Reproducible Floating-Point Summation

You are invited to a talk by Neil Burgess (ARM, Cambridge) – it should be very interesting.  Neil will be around afterwards and is looking for research collaboration opportunities in this area.

Date: November 24th, 2016
Time: 15:00 - 16:00
Venue: Room 217, Huxley  (building 13 on the map)
Campus: South Kensington Campus, Imperial College London


We propose a new datatype and new instructions that allow reproducible accumulation of floating-point (FP) numbers and products in a programmer-selectable range. The new datatype has a larger significand and a smaller range than existing floating-point formats and has much better arithmetic and computational properties. In particular, it is associative, parallelizable, reproducible, and correct. For the modest ranges that will accommodate most problems, it is also much faster: 3 to 12 times faster on a single 256-bit SIMD implementation, and potentially thousands of times faster for large multicore implementations.
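
The reproducibility property can be illustrated in software with a minimal Python sketch that accumulates in fixed point rather than floating point. This is only an analogy under stated assumptions, not the proposed ARM datatype or instructions: the function names and the `frac_bits` parameter are hypothetical, and Python's unbounded integers stand in for a fixed-width hardware accumulator.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def anchored_sum(values, frac_bits=32):
    """Accumulate in fixed point, in integer units of 2**-frac_bits."""
    acc = 0  # Python integers are arbitrary precision, so adds are exact
    for v in values:
        acc += round(math.ldexp(v, frac_bits))  # v * 2**frac_bits, rounded
    return acc / (1 << frac_bits)


a = [1e16, 1.0, -1e16, 1.0]
b = [1.0, 1.0, 1e16, -1e16]  # same numbers, different order

print(naive_sum(a), naive_sum(b))
print(anchored_sum(a), anchored_sum(b))
```

Because integer addition is associative, the fixed-point result does not depend on the order of accumulation, whereas the two floating-point sums above disagree.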


Neil is a Principal Design Engineer at ARM, where he has worked (in Cambridge and Austin, Texas) since 2009. Before that he worked in silicon design at Icera. He has a PhD in VLSI Design and Test from Southampton, and last year was recognised as an ARM Inventor of the Year.



Accelerating Scientific Python code with Numba

You are warmly invited to this talk by Graham Markall, one of the PhD graduates behind the Firedrake Project. The talk is based on his work at Continuum Analytics, a key player in the scientific Python and PyData worlds:

Date: November 17th, 2016
Time: 11:00 - 12:00
Venue: Room 217, Huxley  (building 13 on the map)
Campus: South Kensington Campus, Imperial College London


Scaling up the performance of Python applications without having to resort to writing the performance-critical sections in native code can be challenging. Numba is a tool that solves this problem by JIT-compiling user-selected Python functions using LLVM to deliver execution speed on a par with languages more traditionally used in scientific computing, such as Fortran and C++. As well as supporting CPU targets, Numba includes CUDA and HSA GPU backends that allow offloading of vectorised operations with little programmer effort. For more complicated GPU workloads, Numba provides similar capabilities to CUDA C within Python, and debugging tools that integrate with Python debugging tools such as pdb and pudb.

This talk discusses the implementation of Numba and provides guidance for getting the best performance out of Numba-compiled code. Some examples of real-world applications that use Numba will be presented.
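
As a small illustration of the "user-selected functions" idea, the sketch below marks one numerical loop for compilation with Numba's `njit` decorator. The function itself is a hypothetical example, and the fallback decorator is an assumption added here so that the snippet still runs (uncompiled) where Numba is not installed.

```python
import math

try:
    from numba import njit  # compiles the decorated function with LLVM
except ImportError:
    # Assumption for this sketch: without Numba, run as plain Python.
    def njit(func):
        return func


@njit
def basel_partial_sum(n):
    """Sum 1/k**2 for k = 1..n using a plain Python loop."""
    total = 0.0
    for k in range(1, n + 1):
        total += 1.0 / (k * k)
    return total


# The first call triggers compilation; later calls reuse the compiled code.
print(basel_partial_sum(100_000), math.pi ** 2 / 6)
```

Only the decorated function is compiled; the rest of the program remains ordinary Python, which is what keeps the programmer effort low.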

Speaker bio:

Graham Markall came to Imperial for our MSc in Advanced Computing in 2008, and stayed to do a PhD on “Multilayered Abstractions for Partial Differential Equations” – work which formed the foundation for the Firedrake Project. Since graduating, Graham has worked at OpenGamma and Continuum Analytics, and is currently a compiler engineer at Embecosm.



Drop Impact: from Coalescence to Splashing

You are cordially invited to attend Drop Impact: from Coalescence to Splashing, a one-day meeting bringing together specialists in the areas of drop impact onto liquid and solid surfaces. 

The event has attracted very strong participation from institutions across the United Kingdom and mainland Europe. Speakers from varied backgrounds developing analytical, computational as well as experimental methods will present their recent work on this fascinating subject. 

The meeting is to be held on Friday November 11th in the Imperial College Business School (ICBS 300 LT3). The scientific programme consists of four sessions of contributed talks, separated by sponsored coffee breaks and lunch. We encourage you to take advantage of the free registration in order to facilitate logistics and catering.
For further information please consult the meeting website: 

The event is generously supported by the Centre for Computational Methods in Science and Engineering of Imperial College London.

We are looking forward to welcoming you on the day.

The Organising Committee
Dr. Radu Cimpeanu
Dr. Matthew Moore
Prof. Demetrios Papageorgiou
Department of Mathematics
Imperial College London


HPC summer school 2016

September 26 - 30, 2016

The second instalment of the HPC summer school starts on Monday September 26th and lasts one week.

The programme includes a code optimisation tutorial, one day of performance tuning for cx2 (helen) and a two-day MPI class. On Friday, we'll host six community sessions on imaging, research software engineering, computational molecular sciences, genomics, research data management and simulation methods. The event concludes with a keynote lecture on compilers, the HPC prize announcement and a reception.

You can register for separate workshops or community sessions.  All are welcome!



10 years of HPC at Imperial

The HPC Service group at the College is pleased to invite you to their 10th birthday celebration.  You are welcome to join us and celebrate with exciting presentations and refreshments.

We'll start with messages from the HPC champion Professor Peter Haynes, from the Dean of Faculty of Engineering Professor Jeff Magee and from the HPC manager Simon Burbidge.  Next, we'll have scientific presentations from Professor Robert Glen from the Department of Surgery and Cancer and Professor Spencer Sherwin from the Department of Aeronautics.  The Imperial students will be represented by Ioan Hadade from the Department of Mechanical Engineering.  Finally, we'll have the pleasure to welcome Dr. Eng Lim Goh, the Senior Vice President & Chief Technology Officer of SGI.  The event will conclude with a reception. 

 We encourage all to bring posters showing their research that was supported by HPC.

Date: July 7th, 2016
Time: 14:00 - 17:00
Venue: Room 340, Huxley  (building 13 on the map)
Campus: South Kensington Campus, Imperial College London
Registration: Please register here.



14:05 Welcome
Professor Peter Haynes, Head of the Department of Materials and the HPC champion
14:10 Message from the Dean
Professor Jeff Magee, Dean of the Faculty of Engineering
14:20 Celebrating a Decade of HPC at Imperial - looking to the future
Simon Burbidge, HPC Manager
14:30 From Molecules to Man - HPC accelerates discovery
Professor Robert Glen, Department of Surgery and Cancer
14:45 Algorithms, Arteries & Automobiles
Professor Spencer Sherwin, Department of Aeronautics
15:00 Turbomachinery CFD on Modern Multicore and Manycore Architectures
Ioan Hadade, Department of Mechanical Engineering
15:15 HPC and Beyond
Dr. Eng Lim Goh, Senior Vice President and Chief Technology Officer of SGI
16:00 - 17:00 Celebration with food, drinks and cake 



Nektar++ Workshop 2016

Date: June 7th - 8th, 2016
Venue: Room 207, Skempton building
Campus: South Kensington Campus, Imperial College London
Registration: Please register here.

The event is organised in association with the EPSRC Centre for Doctoral Training in Fluid Dynamics across Scales, and PRISM.

The purpose of this meeting is to bring together the Nektar++ developers and design team with users of any experience level within the broader community. This two-day workshop will feature talks providing an overview of Nektar++, discussions on the current design roadmap, tutorials on how to use some of the more recently added and/or requested features within Nektar++ (such as high-order mesh generation and parallelism), and presentations by those adapting and using Nektar++ to push the boundaries of their own scientific and engineering disciplines.

The programme and the details can be found here.



Modelling and simulation of complex fluids

May, 2016

Visiting Nelder fellow Prof. John Tsamopoulos delivered a series of lectures on complex fluids:

Rheology of complex fluids: introduction

Part 2: Shear flows

Part 3: Extensional flows

Part 4: Generalized Newtonian fluids

Part 5: Linear viscoelasticity

Part 6: Nonlinear viscoelasticity



Numbers to the rescue: can maths save the planet?

Date: Wednesday 18 May, 2016
Time: 18:00 - 19:00 
Venue: Room G34, Sir Alexander Fleming Building, Imperial College London
Campus: South Kensington Campus (#33 on the map)
Registration: Please register here by May 16th.


The Grantham Institute and the Mathematics of Planet Earth Centre for Doctoral Training invite members of the business community to attend a panel discussion at Imperial College London where experts will discuss how mathematics can combat environmental problems, and their impact on society and business.

The panel discussion will be chaired by Imperial's Professor Sir Brian Hoskins, an esteemed meteorologist and member of the UK Committee on Climate Change. Following the discussion, we would like to offer you the opportunity to meet our academics and PhD students, and see examples of what our research in climate modelling, uncertainty, data assimilation and computational science can do for you.

The lecture will be followed by networking and a research exhibition.


Should you have any enquiries, please email Chloe Stockford.

PRISM Event: Workshop on Embracing Accelerators 

Date: Monday 18 April, 2016
Time: 10:00 - 17:00 
Venue: Lecture Theatre 201, Skempton Building
Campus: South Kensington Campus
Registration: The event is free and open to all but registration is essential.  Please register here.
Speakers: Alex Heinecke (Intel), Karl Rupp and members of Imperial College


Members of the PRISM community are delighted to invite you to Imperial College on Monday 18th April for an event that explores the use of accelerators with finite element methods. We will cover CG vs DG, higher-order versus lower-order methods in CFD and the tradeoffs between implicit and explicit methods in the context of manycore and GPU hardware developments.

The event will consist of speakers from Imperial College and we are pleased to have Alex Heinecke (Intel) and Karl Rupp as guest speakers.


10:00: Coffee

10:30: Alex Heinecke (Intel) 
Fighting B/F Ratios in Scientific Computing by Solving PDEs in High Order

11:30: Short talks (TBC)

12:30: Lunch and posters

14:00: Karl Rupp
FEM Integration with Quadrature and Preconditioners on GPUs

15:00: Further talks (TBC) plus primer for discussion panel on implicit vs explicit methods on accelerators.

15:45: Discussion: Implicit vs explicit methods on accelerators.

16:15: Networking Drinks 

Speaker Biographies and Abstracts:

Alex Heinecke
Fighting B/F Ratios in Scientific Computing by Solving PDEs in High Order

Today’s and tomorrow’s architectures follow a common trend: wider vector instructions which offer denser arithmetic intensity, but constant and therefore relatively lower bandwidth. When solving PDEs, high-order methods are a possible candidate for adapting to this hardware development. Their computing cost increases with higher order due to the higher arithmetic intensity, while relatively reducing the required memory bandwidth. Therefore, they offer an adjustable trade-off between the computational costs, required bandwidth and the accuracy delivered per degree of freedom. In this talk we examine the impact of convergence order, clock frequency, vector instruction sets, alignment and chip-level parallelism for higher order discretization on their time to solution, more precisely their time to accuracy, with respect to yesterday’s, today’s and tomorrow’s CPU architectures. From a performance perspective, especially on state-of-the-art and future architectures, the shift from a memory- to a compute-bound scheme and the need for double precision arithmetic with increasing order describes a compelling path for modern PDE solvers.

Alexander Heinecke studied Computer Science and Finance and Information Management at Technische Universität München, Germany. In 2010 and 2012, he completed internships at Intel in Munich, Germany and at Intel Labs Santa Clara, CA, USA. In 2013 he completed his Ph.D. studies at TUM and joined Intel’s Parallel Computing in Santa Clara in 2014. His core research topic is the use of multi- and many-core architectures in advanced scientific computing applications.
Awards: Alexander Heinecke was awarded the Intel Doctoral Student Honor Programme Award in 2012. In Nov. 2012 he was part of a team which placed the Beacon System #1 on the Green500 list. In 2013 and 2014 he and his co-authors received the PRACE ISC Award for achieving peta-scale performance in the fields of molecular dynamics and seismic hazard modelling on more than 140,000 cores. In 2014, he and his co-authors were additionally selected as Gordon Bell finalists for running multi-physics earthquake simulations at multi-petaflop performance on more than 1.5 million cores.

Karl Rupp
FEM Integration with Quadrature and Preconditioners on GPUs

Efficient integration of low-order elements on a GPU has proven difficult. Former work has shown how to integrate a differential form (such as Laplace or elasticity) efficiently using algebraic simplification and exact integration. This, however, breaks down for multilinear forms or when using a coefficient. In this talk, I present results from joint work with M. Knepley and A. Terrel on how to efficiently integrate an arbitrary form using quadrature. The key is a technique we call “thread transposition” which matches the work done during evaluation at quadrature points to that done during basis coefficient evaluation. We are able to achieve more than 300GF/s for the variable-coefficient Laplacian, and provide a performance model to explain these results.
The second part of the talk discusses performance aspects of preconditioners for GPUs, in particular algebraic multigrid. While the preconditioner application maps well to the fine-grained parallelism provided by GPUs, our benchmarks indicate that GPUs have to be paired with powerful CPUs to obtain best performance.

Karl Rupp holds master’s degrees in microelectronics and in technical mathematics from the TU Wien and completed his doctoral degree on deterministic numerical solutions of the Boltzmann transport equation in 2011. During his doctoral studies, he started several interacting free open source projects, including the GPU-accelerated linear algebra library ViennaCL. After a one-year postdoctoral research position working with the PETSc-team at the Argonne National Laboratory, USA, and a research project on improving the efficiency of semiconductor device simulators at TU Wien, he is now a freelance scientist. His current activities include the GPU-acceleration of large-scale geodynamics simulations.

Allinea Forge/DDT graphical debugger hands-on tutorial

Date: 11 February 2016
Time: 10:00-17:00
Venue: Huxley 410
Campus: South Kensington Campus
Audience: Open to all staff and students, please register at
Speaker: Florent Lebeau

The HPC group has recently acquired the Allinea Forge/DDT graphical debugger for use on all the HPC systems. Florent Lebeau, a technical expert from Allinea, is coming to Imperial to give a full-day tutorial for all interested students and staff. Please refer to the HPC wiki for details and registration. The class is highly recommended to those who develop their own code in C, C++ and Fortran, including MPI and threaded parallel code.






AMMP Colloquium

Professor Michael Siegel: Analysis and computations of the initial value problem for hydroelastic waves

Date: 18 Dec 2015
Time: 15:00-16:30
Venue: Huxley 139
Campus: South Kensington Campus
Audience: Open to all
Speaker: Professor Michael Siegel, New Jersey Institute of Technology, USA
Contact: Pavel Berloff, Demetrios Papageorgiou

The hydroelastic problem describes the evolution of a thin elastic membrane in potential flow. It arises in many applications, including the dynamics of flapping flags and ice sheets in the ocean. An efficient, non-stiff boundary integral method for the 3D hydroelastic problem is presented. The stiffness is removed by a small-scale decomposition, following prior work on 2D interfacial flow with surface tension. A convergence proof for a version of the numerical method will be discussed (joint work with David Ambrose and Yang Liu).



HPC summer school

September 28th - October 2nd, 2015

The HPC summer school brought together the HPC support team, scientific community and businesses for one week of tutorials, lectures, community sessions and exchange of ideas.

Abstractions for Data-Centric Computing

Didem Unat, Koç University
Wednesday July 29th at 15:30 in Huxley 217

Programming models play a crucial role in providing the necessary tools to express locality and minimize data movement, while also abstracting complexity from programmers. Unfortunately, existing compute-centric programming environments provide few abstractions to manage data movement and locality, and rely on a large shared cache to virtualize data movement. We propose three programming abstractions, tiles, layout and loop traversal, that address data locality and increased parallelism on emerging parallel computing systems. The TiDA library implements these abstractions in the data structures through domain decomposition and provides performance portable codes. We demonstrate how TiDA provides high performance with minimal coding effort on current and future NUMA node architectures.
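
To give a flavour of what a tile abstraction with a separate loop traversal might look like, here is a minimal Python sketch. It does not reproduce TiDA's actual interface; the `tiles` generator and `tiled_transpose` function are hypothetical names used only to show the traversal being expressed once and reused by the computation.

```python
def tiles(n_rows, n_cols, tile_rows, tile_cols):
    """Yield (row_start, row_end, col_start, col_end) for each tile."""
    for r0 in range(0, n_rows, tile_rows):
        for c0 in range(0, n_cols, tile_cols):
            yield (r0, min(r0 + tile_rows, n_rows),
                   c0, min(c0 + tile_cols, n_cols))


def tiled_transpose(matrix, tile_rows=2, tile_cols=2):
    """Transpose a list-of-lists matrix one tile at a time."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    result = [[None] * n_rows for _ in range(n_cols)]
    for r0, r1, c0, c1 in tiles(n_rows, n_cols, tile_rows, tile_cols):
        # All accesses inside this block stay within one small tile.
        for i in range(r0, r1):
            for j in range(c0, c1):
                result[j][i] = matrix[i][j]
    return result


grid = [[10 * i + j for j in range(5)] for i in range(3)]
print(tiled_transpose(grid))
```

Keeping the tile traversal in one place means the tile shape can be tuned for a given memory hierarchy without touching the computation inside the loop.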

Bio: Didem Unat joined Koç University in Istanbul in September 2014 as a full-time faculty member. Previously she was at the Lawrence Berkeley National Laboratory. She is the recipient of the Luis Alvarez Fellowship in 2012 at the Berkeley Lab. Her research interest lies primarily in the area of high performance computing, parallel programming models, compiler analysis and performance modeling. She is currently working on designing and evaluating programming models for future exascale architectures, as part of a hardware-software co-design project. She received her Ph.D. in Prof. Scott B. Baden's research group at the University of California-San Diego. In her thesis, she developed the Mint programming model and its source-to-source compiler to facilitate GPGPU programming. She holds a B.S. in computer engineering from Boğaziçi University.

ExaStencils: Towards the automatic generation of highly parallel multigrid implementations

Hannah Rittich (Univ Wuppertal)
Friday June 26th 2015 at 10:30 in Huxley 218

In an ideal (scientific computing) world we would have software that requires only a partial differential equation (PDE) as input to compute its solution efficiently on a highly parallel computer. ExaStencils is one of the research projects that work towards this ideal world. Within the project we are especially focused on efficient and highly parallel multigrid solvers. To implement them we design domain specific languages (DSLs) that allow us to conveniently describe the problems and the algorithms that solve them. These DSLs are designed to be translated into efficient and parallel code easily. Furthermore, to achieve good performance of the solver, we employ methods that automatically choose the more efficient algorithms that solve the given PDE on the given hardware. I would like to present the current state of the project, and the lessons we have learned so far.

Hannah Rittich is a PhD student at the University of Wuppertal in the group of Andreas Frommer, supervised by Matthias Bolten. She is working on the ExaStencils project.

A Nonhydrostatic Atmospheric Dynamical Core in CAM-SE

Henry Tufo, University of Colorado
Friday June 5th at 4pm, location to be announced

In order to perform climate simulations in the Community Earth-System Model (CESM) with atmospheric resolutions well beyond 1/8th of a degree, a number of new technologies must be put into place. Most fundamentally, the Primitive-Equations of motion utilized by the Community Atmosphere Model (CAM) dynamical-core must be replaced or augmented with one or more nonhydrostatic models of the rotating, compressible-Euler / Navier-Stokes system. Additionally, these solvers must perform at extremely high levels of throughput and scale with near optimal parallel efficiency in order to make such expensive high-resolution simulations practical. In this talk, we will discuss our efforts to develop a nonhydrostatic model in CAM-SE, its parallel scaling, and related supporting technologies.

Dr. Henry Tufo received his Ph.D. from Brown University in 1998. Henry was a member of the DOE ASC Center for Astrophysical Thermonuclear Flashes at the University of Chicago and Argonne National Laboratory from 1998 to 2002 and from 2002 to 2013 was a Scientist and Computer Science Section Head at the National Center for Atmospheric Research. He is currently a Full Professor of Computer Science at the University of Colorado and Director and founder of the Computational Science Center. Henry conducts research in high-performance scientific computing, parallel architectures and algorithms, Linux clusters, scalable solvers, high-order numerical methods, computational fluid dynamics, and flow visualization. He is co-developer of NEK5000, long-time contributor to the HOMME effort, lead architect for the Janus computer system and its co-designed facility, manager of NCAR’s TeraGrid resources, and recipient of the Gordon Bell award in 1999 and 2000 for demonstrated excellence in high-performance and large-scale parallel computing.

Two talks by Stephane Popinet

Adaptive numerical methods for fluid mechanics

Date: Thursday June 4
Time & place: 4pm Huxley 340

The equations of fluid mechanics can be used to describe natural processes over a wide range of scales, from the behaviour of micro-organisms to astrophysics. Each of these processes is in turn often controlled by internal interactions on widely different scales. Numerical methods able to efficiently resolve these interactions are, in combination with theoretical analysis and lab experiments, an essential tool for advancing our understanding. I will give a general overview of the hierarchical numerical methods I have worked on, as implemented within the free software Gerris Flow Solver and Basilisk, and discuss a range of applications including microscale high-energy droplet dynamics, multiphase and complex flows, tsunamis and climate dynamics.

Continuum modelling of granular materials

Date: Friday June 5
Time and place: 11am Huxley 340

Granular materials such as sand/gravel are notoriously difficult to model. One approach is to treat them as fluids with a specific "non-Newtonian" rheology. Although this approach dates back to the early 20th century, rheologies suited to dry granular materials have only been discovered very recently. I will show how these advances can be combined with numerical methods to obtain very accurate models of avalanches and other granular flows.

Stephane Popinet is a member of the Complex Fluids and Hydrodynamic Instabilities group of the Institut Jean Le Rond d'Alembert, which is part of the Universite Pierre et Marie Curie (Paris 6) and CNRS. Stephane is the author of the public domain hydrodynamics software Gerris and Basilisk, and has published extensively in the area of multiphase flows, complex fluids and droplet dynamics among others. He is one of the leading figures in the numerical simulation of fluid flows and combines expertise in both fluid dynamics and scientific computation to address complex nonlinear hydrodynamics problems.

Sneaking up on reliable and effective jet noise control

Daniel J Bodony (Univ Illinois)
Huxley 130 2-3pm on Tuesday May 5th

Abstract: The loudest source of high-speed jet noise, such as found on naval tactical fighters, appears to be unsteady wavepackets that are acoustically efficient but relatively weak compared to the main jet turbulence. These wavepackets can be usefully described by linear dynamics and connected to transient growth mechanisms. Through a component-wise structural sensitivity analysis of the turbulent jet baseflow, using both the equilibrium and time-average fields, estimates are given as to what location and kind of actuators and sensors are most effective, in a linear feedback context, to control the wavepackets to reduce their noise. Low and high frequency approaches are examined where the controlling mechanisms differ: the low-frequency control indirectly targets the slow variation of the mean on which the wavepackets propagate while the high-frequency control targets the wavepackets themselves. The predicted control strategy is evaluated using direct numerical simulations on a series of Mach 1.5 turbulent jets.

Bio: Daniel J. Bodony is the Blue Waters Associate Professor and Donald Biggar Willett Faculty Scholar in the Department of Aerospace Engineering at the University of Illinois. Prior to joining the University of Illinois he was an engineering research associate at the NASA Ames/Stanford Center for Turbulence Research. He received his PhD from Stanford University in 2005, he received an NSF CAREER award in 2012 in fluid dynamics, and he is an Associate Fellow of the AIAA.

OPS - An abstraction for multi-block structured mesh computations

Istvan Reguly (Oxford University)
Monday April 27th, 11am in Huxley 144

In this talk Istvan will introduce the OPS abstraction for multi-block structured grids and discuss some of the targeted applications, primarily CloverLeaf. CloverLeaf is a mini-app from AWE with several hand-tuned implementations for various architectures, which serve as good baselines for comparison with high-level approaches such as OPS. He will discuss performance with MPI, OpenMP, CUDA, OpenCL and OpenACC, both on a single node (including IBM's Power8) and on large-scale systems. Referring back to OP2, he will discuss strategies for GPU and vectorised CPU code generation and the challenges involved from the perspective of a simple code-to-code transformation tool in Python. Finally, he will take a look at OPS's checkpointing functionality and how it can provide near-optimal functionality with very little user input, by relying on the loop chaining abstraction and the high-level description of computations in OPS.

Bio: Istvan has been working as an RA at Oxford on high level abstractions and Domain Specific Languages for scientific computing, and also as a GPU specialist associated with the GPU-accelerated Emerald supercomputer. He has a PhD from the Pázmány Péter Catholic University in Hungary.

A bilevel programming problem occurring in smart grids

Prof. Leo Liberti (CNRS LIX, Ecole Polytechnique)
7th May, 11am, CPSE Seminar Room, RODH C615, Roderic Hill Building. Refreshments beforehand in the CPSE Common Room (Centre for Process Systems Engineering 2015 Seminar Series).

Abstract: A key property that makes a power grid "smart" is its real-time, fine-grained monitoring capability. For this reason, a variety of monitoring equipment must be installed on the grid. We look at the problem of fully monitoring a power grid by means of Phasor Measurement Units (PMUs), which is a graph covering problem with some equipment-specific constraints. We show that, surprisingly, a bilevel formulation turns out to provide the most efficient algorithm.

Computational Progress in Linear and Mixed Integer Programming

Prof. Robert Bixby
13th May, 11am, CPSE Seminar Room, RODH C615, Roderic Hill Building. Refreshments beforehand in the CPSE Common Room (Centre for Process Systems Engineering 2015 Seminar Series).

Abstract: We will look at the progress in linear and mixed-integer programming software over the last 25 years. As a result of this progress, modern linear programming codes are now capable of robustly and efficiently solving instances with multiple millions of variables and constraints. With these linear programming advances as a foundation, mixed-integer programming then provides the modeling framework and solution technology that enables the overwhelming majority of present-day business planning and scheduling applications, and is the key technology behind prescriptive analytics. The performance improvements in mixed-integer programming codes over the last 25 years have been nothing short of remarkable, well beyond those of linear programming, and have transformed this technology into an out-of-the-box tool with applications to an almost unlimited range of real-world problems.

Towards better HPC with Allinea tools

Date: 22nd April 2015, venue: RSM 3.35

Abstract: Allinea offers a wide range of products for production and development environments to help the HPC community improve supercomputer workloads. Allinea Forge is a professional development environment designed for the challenges that face software developers with multi-threaded and multi-process codes. This toolkit helps resolve defects throughout the development workflow, from initial design to integration with testing facilities. With Allinea Forge, HPC developers can easily:

  • Discover the critical performance bottlenecks
  • Measure the scaling properties of codes across threads and processes
  • Observe and control threads and processes simultaneously
  • Resolve complex destructive bugs

Once an application is optimized and debugged, it goes to production. At this stage of an application's life cycle, Allinea Performance Reports is very useful to make sure that the application is running to its full capabilities. By finding the sweet spot, it is possible to drastically increase the number of runs within a given timeframe and make better use of the allocations available.

During this hands-on workshop, we will discover Allinea tools on the ICL cluster. At the end of the session, the attendees will:

  • have the basic knowledge necessary to get started with the tools in a real-life environment
  • know how to choose the right feature to use during development
  • be able to conduct benchmarks easily and analyze performance reports to make the right submission choices

During this workshop, an equal amount of time will be spent on Allinea DDT - the parallel debugger, Allinea MAP - the parallel profiler, and Allinea Performance Reports - the application analysis tool.

Proposed Agenda:
14:00 - 14:15 : Roundtable and introduction
14:15 - 14:30 : Presentation of Allinea tools (ppt)
14:30 - 15:15 : Getting started with Allinea Forge : optimizing a simple code
15:15 - 16:00 : From profiling to debugging with Allinea Forge
16:00 - 16:45 : Improving an HPC workload with Allinea Performance Reports

BEM++ - Efficient solution of boundary integral equation problems

Date: Thursday 26 Feb 2015
Time: 16:00 - 17:00
Venue: Huxley 340
Campus: South Kensington Campus
Speaker: Timo Betcke, UCL
Contact: Pavel Berloff

The BEM++ boundary element library is a software project that was started in 2010 at University College London to provide an open-source general purpose BEM library for a variety of application areas. In this talk we introduce the underlying design concepts of the library and discuss several applications, including high-frequency preconditioning for ultrasound applications, the solution of time-domain problems via convolution quadrature, light-scattering from ice crystals, and the solution of coupled FEM/BEM problems with FEniCS and BEM++.

Worst-case complexity of nonlinear optimization: Where do we stand?

Speaker: Philippe Toint, Uni de Namur
Wed 11th February at 11am, CPSE Seminar Room RODH C615, Roderic Hill Bldg
Organiser: Ruth Misener, DoC

We review the available results on the evaluation complexity of algorithms using Lipschitz-continuous Hessians for the approximate solution of nonlinear and potentially nonconvex optimization problems. Here, evaluation complexity is a bound on the largest number of evaluations of the problem functions (objective, constraints) and their derivatives that are needed before an approximate first-order critical point of the problem is guaranteed to be found. We start by considering the unconstrained case and examine classical methods (such as Newton's method) and the more recent ARC2 method, which we show is optimal under reasonable assumptions. We then turn to constrained problems and analyze the case of convex constraints first, showing that a suitable adaptation ARC2CC of the ARC2 approach also possesses remarkable complexity properties. We finally extend the results obtained in simpler settings to the general equality and inequality constrained nonlinear optimization problem by constructing a suitable ARC2GC algorithm whose evaluation complexity also exhibits the same remarkable properties.

Philippe L. Toint (born 1952) received his degree in Mathematics from the University of Namur (Belgium) in 1974 and his Ph.D. in 1978 under the guidance of Prof M.J.D. Powell. He was appointed as lecturer at the University of Namur in 1979, where he became associate professor in 1987 and full professor in 1993. Since 1979, he has been the co-director of the Numerical Analysis Unit and director of the Transportation Research Group in this department. He was in charge of the University Computer Services from 1998 to 2000 and director of the Department of Mathematics from 2006 to 2009. He currently serves as Vice-rector for Research and IT for the university. His research interests include numerical optimization, numerical analysis and transportation research. He has published four books and more than 280 papers and technical reports. Elected as SIAM Fellow (2009), he was also awarded the Beale-Orchard-Hayes Prize (1994, with Conn and Gould) and the Lagrange Prize in Continuous Optimization (2006, with Fletcher and Leyffer). He is the past Chairman (2010-2013) of the Mathematical Programming Society, the international scientific body gathering most researchers in mathematical optimization world-wide. Married and father of two girls, he is a keen music and poetry lover as well as an enthusiastic scuba diver.

Electron correlation in van der Waals interactions

Ridgway Scott, Professor of Computer Science and of Mathematics at the University of Chicago
Coordinates: Room G01, Royal School of Mines, 12:00 Monday January 26th 2015

We examine a technique of Slater and Kirkwood which provides an exact resolution of the asymptotic behavior of the van der Waals attraction between two hydrogen atoms. We modify their technique to make the problem more tractable analytically and more easily solvable by numerical methods. Moreover, we prove rigorously that this approach provides an exact solution for the asymptotic electron correlation. The proof makes use of recent results that utilize the Feshbach-Schur perturbation technique. We provide visual representations of the asymptotic electron correlation (entanglement) based on the use of Laguerre approximations.


Formal Alchemy for Real-World HPC Parallelism

Ganesh Gopalakrishnan, Professor, School of Computing, University of Utah
11:00 Friday 31 October in Huxley 217

Long-lived parallel programming notations for high performance computing tend to grow by accretion of features. Deconstructing these notations into their elemental components may allow us to view what once appeared ugly and confusing as basically beautifully put together, with only a mild amount of tarnish. We will demonstrate that there is some truth to this assertion by presenting how we once deconstructed MPI - a venerable API for HPC - and built an active testing tool that helps "predict" the behavior of MPI programs. The key binding principle turned out to be a somewhat non-traditional "matches before" relation that is aware of not only program control flow but also the status of message buffer availability, helping us explain deadlocks that occur either when there is limited buffering or also when there is excessive buffering.

Unfortunately, the magic of such theories always seems to fall short of the full scope of any practical API; yet, we believe that the daunting world of HPC concurrency desperately needs such acts of alchemy. I will raise the question of how to deconstruct future APIs that may, in addition to behavioral elements, contain aspects of allowed degrees of floating-point result non-determinism or even fault-handling methods. I hope to explain our current projects in this context.

A Computational Viewpoint on Classical Density Functional Theory

Dr Matt Knepley, Senior Research Associate, Computation Institute, Univ. of Chicago
2.42 at 4pm, Thursday October 9th 2014.

Classical Density Functional Theory (CDFT) has become a powerful tool for investigating the physics of molecular scale biophysical systems. It can accurately represent the entropic contribution to the free energy. When augmented with an electrostatic contribution, it can treat complex systems such as ion channels. We will look at the basic computational steps involved in the Rosenfeld formulation of CDFT, using the Reference Fluid Density (RFD) method of Gillespie. We present a scalable and accurate implementation using a hybrid numerical-analytic Fourier method.

Workshop: Computational Cardiac Electrophysiology

Organisers: Dr Chris Cantwell, Dr Richard Clayton, Professor Spencer Sherwin; Visit the event website
15-16 May 2014, room RH266, Roderic Hill Building

With recent advances in computer technology it is now becoming feasible not only to simulate the highly complex electrophysiological processes which occur in myocardium, but also to do so on timescales which could enable modelling to directly influence and assist clinical intervention and mechanistic understanding across multiple scales.

This two-day workshop aims to bring together UK groups with an interest in using computer modelling to tackle basic science and clinical challenges in cardiac electrophysiology and discuss recent advances.

Topics include:

  • Challenges in patient-specific modelling
  • Image acquisition, pre-processing and mesh generation
  • Clinical interfacing: assisting diagnosis and treatment
  • High-performance computing: towards real-time simulation
  • Continuous modelling: monodomain/bidomain, FEM, SEM, finite difference
  • Discrete modelling: cellular automata, modelling cell coupling and ion channel changes.
  • Multiscale modelling from cell to organ: the right model for the job.

As a secondary focus of the workshop, we hope to discuss and define a range of benchmark problems of increasing complexity, covering both cellular and tissue aspects of the electrophysiology models. A number of cardiac electrophysiology codes exist and some are widely used, but few benchmark problems have been designed to verify these codes and assess their performance in preparation for research and clinical use. By designing a suite of test problems, we can further increase confidence in the available software tools.

The Pochoir Stencil Compiler

Bradley Kuszmaul, MIT Computer Science and Artificial Intelligence Laboratory and Tokutek Inc
Friday January 17th at 11am, room 212 in the William Penney Building

A stencil computation repeatedly updates each point of a $d$-dimensional grid as a function of itself and its near neighbors. Parallel cache-efficient stencil algorithms based on trapezoidal decompositions are known, but most programmers find them difficult to write. The Pochoir stencil compiler allows a programmer to write a simple specification of a stencil in a domain-specific stencil language embedded in C++ which the Pochoir compiler then translates into high-performing Cilk code that employs an efficient parallel cache-oblivious algorithm. Pochoir supports general $d$-dimensional stencils and handles both periodic and aperiodic boundary conditions in one unified algorithm. The Pochoir system provides a C++ template library that allows the user's stencil specification to be executed directly in C++ without the Pochoir compiler (albeit more slowly), which simplifies user debugging and greatly simplified the implementation of the Pochoir compiler itself. A host of stencil benchmarks run on a modern multicore machine demonstrates that Pochoir outperforms standard parallel-loop implementations, typically running 2--10 times faster. The algorithm behind Pochoir improves on prior cache-efficient algorithms on multidimensional grids by making hyperspace cuts, which yield asymptotically more parallelism for the same cache efficiency.
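To make the definition above concrete, here is a minimal sketch of a stencil computation in plain Python: a three-point heat stencil on a 1D grid with periodic boundaries. It is only an illustration of the kind of update rule a stencil specifies. The function names (`heat_step`, `run_stencil`) and the coefficient `alpha` are assumptions made for this example, and the straightforward loop shown here is not Pochoir's specification language or its cache-oblivious trapezoidal algorithm.

```python
def heat_step(u, alpha=0.1):
    """One time step of a 1D three-point heat stencil with periodic boundaries.

    `alpha` is a hypothetical diffusion coefficient chosen for illustration.
    """
    n = len(u)
    # Each new value depends only on the point itself and its two nearest
    # neighbours; the modulo wraps indices around (periodic boundary).
    return [
        u[i] + alpha * (u[(i - 1) % n] - 2.0 * u[i] + u[(i + 1) % n])
        for i in range(n)
    ]


def run_stencil(u0, steps, alpha=0.1):
    """Apply the stencil repeatedly, one full sweep of the grid per time step."""
    u = list(u0)
    for _ in range(steps):
        u = heat_step(u, alpha)
    return u


if __name__ == "__main__":
    grid = [0.0] * 8
    grid[4] = 1.0  # a single hot spot that diffuses to its neighbours
    print(run_stencil(grid, steps=3))
```

Sweeping the whole grid once per time step, as this loop does, is the simple formulation that cache-efficient stencil algorithms aim to improve on, because each sweep streams the entire grid through memory.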


Workshop on Global Optimisation

Thursday December 19th Centre for Process Systems Engineering Seminar Room C615

Global optimization problems arise in all disciplines of science, applied science and engineering, including molecular biology, computational chemistry, thermodynamics, process systems design and control, finance and transportation. Examples of important applications of global optimization are the structure prediction of molecules, protein folding, peptide docking, phase and chemical equilibrium, portfolio selection, air traffic control, chip layout and compaction problems, and the optimal design and control of non-ideal separations, to name a few.
This one-day workshop will provide a unique opportunity, for both academic and industrial communities, to discuss the latest developments in global optimization, meet experts in the field and explore new research directions. The workshop is supported by the Centre for Computational Science and Engineering (CMSE) at Imperial and jointly organised by the Department of Computing (DoC) and the Centre for Process Systems Engineering (CPSE) of Imperial College London.


  • Claire Adjiman (Imperial College London)
  • Roy L. Johnston (University of Birmingham)
  • Ruth Misener (Imperial College London)
  • Panos Parpas (Imperial College London)
  • Fabio Schoen (Universita degli Studi di Firenze)
  • David Wales (University of Cambridge)

Two talks by Jan Treibig on multicore performance engineering

Jan Treibig

The first is more of a research talk, while the second is more of a tutorial in making software run fast on multicore processors.

Talk 1: 11:00-12:00 on Monday December 9th 2013, Room 212, William Penney Building, Imperial College:

Comparing the performance of different x86 SIMD instruction sets for a medical imaging application on modern Multi- and Manycore chips - Exploring performance and power properties of modern multicore chips via simple machine models

Talk 2: 11:00-13:00 on Tuesday Dec 10th 2013 in the Grantham Institute's Boardroom:

Performance Engineering for Multi-Core Processors

Performance optimization for multi-core processors is a complex task requiring intimate knowledge about algorithm, implementation level, and processor architecture. This talk will introduce the principles of modern computer architecture illustrated with micro-benchmarking explorations. Based on this knowledge a pattern-driven performance engineering process will be described which focuses on a resource utilization based view on optimization. The proposed process will be illustrated with several examples.

Prospects for next-generation multigridding: low communication, low memory, adaptive, and resilient

Jed Brown, Argonne National Laboratory
5pm Thursday 26th September 2013, Room G.39, Royal School of Mines, Imperial College

Several hardware trends jeopardize traditional multilevel solvers: data motion rather than flops is becoming the leading hardware and energy cost, hybrid architectures move the primary compute capability and local memory bandwidth further from the global network, and the frequency of hardware failures is expected to increase. We propose to address the challenge of versatile multilevel solver performance at extreme scale with a radical departure from the status quo: inverting the multilevel philosophy from "coarse accelerates fine" to "fine improves accuracy of coarse", allowing the removal of "horizontal" communication and merging infrastructure with multiscale modeling. This approach is derived from a prescient and nearly forgotten observation by Achi Brandt over 30 years ago that a multigrid solve (specifically, evaluating functionals of the solution) can be performed in poly-logarithmic memory while preserving O(N) complexity. This ideal of ephemeral state is rarely feasible in real applications, but the same principle removes communication and breaks dependencies, which also allows local recovery from faults despite the fact that implicit solves are globally coupled; it is naturally suited to architectures in which only a small fraction of total compute capability is latency-optimized next to a fast network, and has the potential to transform the postprocessing workflow for PDE-based models to drastically reduce the amount of data that must be stored to disk without compromising analysis capability.

The Periodic Table of Finite Elements

Douglas Arnold, University of Minnesota
4pm Wednesday 12th June 2013, Room 340, Huxley Building, Imperial College

Finite element methodology, reinforced by deep mathematical analysis, provides one of the most important and powerful toolsets for numerical simulation. Over the past forty years a bewildering variety of different finite element spaces have been invented to meet the demands of many different problems. The relationship between these finite elements has often not been clear, and the techniques developed to analyze them can seem like a collection of ad hoc tricks. The finite element exterior calculus, developed over the last decade, has elucidated the requirements for stable finite element methods for a large class of problems, clarifying and unifying this zoo of methods, and enabling the development of new finite elements suited to previously intractable problems. In this talk, we will discuss the big picture that emerges, providing a sort of periodic table of finite element methods.

A Comparative Study of the Construction of Global Climate Models

Steve Easterbrook, University of Toronto
4pm Tuesday March 12th 2013, Room 217, Huxley Building, Imperial College

In the literature, comparisons between climate models tend to focus on how well each model captures various physical processes, and how skillful they are in reproducing the climatology of observational datasets. In this talk, we present a different type of comparison, based on an analysis of the software architecture of global climate models. As with the architecture of a historic building, the architecture of a typical climate model includes a mix of older and newer elements, and evidence of re-purposing as new generations of scientists have adapted the model to new uses. An analysis of the development history of each model shows a mix of 'essence' and 'accident'. The essence includes explicit design decisions that all climate modellers face (e.g. the choice of grids, selection of numerical methods for the dynamical core, use of particular coupling technologies, etc), while the accidental aspects represent constraints that are beyond the immediate control of a model's designers (e.g. available funding and expertise, the demands of different user groups, the rhythms imposed by collaborations between different research groups, etc). The talk will draw on observations from case studies of four major models from four different countries: the UK Met Office Hadley Centre (UKMO); the US National Centre for Atmospheric Research (NCAR); the German Max-Planck Institute for Meteorology (MPI-M); and the French Institute Pierre Simon Laplace (IPSL). We will use differences in the architecture of these four models to illustrate some of the different organizational constraints faced by climate research labs. The result of the analysis suggests that there may be much more structural diversity among the current generation of earth system models than is revealed in recent comparisons of their climatology.

TAU - open-source performance tools for HPC

Sameer Shende will be giving a two-day training course on TAU at Imperial College London, 28-29 January.

TAU is one of the best known tools for performance tuning and analysis of parallel codes for high end computing. It is an essential tool for anyone serious about developing high performance parallel software.

Location: Room 3.38, Royal School of Mines, Imperial College London, SW7 2AZ
Date: 28-29 January 2013
Time: 09:00-17:00 each day with breaks for lunch, and morning and afternoon coffee.

To register for the event please contact Gerard Gorman. This is a free event and places are allocated on a first-come, first-served basis.

About TAU

TAU Performance System is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, Java, Python.

TAU (Tuning and Analysis Utilities) is capable of gathering performance information through instrumentation of functions, methods, basic blocks, and statements.

All C++ language features are supported including templates and namespaces. The API also provides selection of profiling groups for organizing and controlling instrumentation. The instrumentation can be inserted in the source code using an automatic instrumentor tool based on the Program Database Toolkit (PDT), dynamically using DyninstAPI, at runtime in the Java Virtual Machine, or manually using the instrumentation API.

TAU's profile visualization tool, paraprof, provides graphical displays of all the performance analysis results, in aggregate and single node/context/thread forms. The user can quickly identify sources of performance bottlenecks in the application using the graphical interface. In addition, TAU can generate event traces that can be displayed with the Vampir, Paraver or JumpShot trace visualization tools.

About Sameer Shende

Sameer Shende serves as the Director of the Performance Research Lab, NIC, University of Oregon as well as being one of the developers of TAU.