Schedule
Monday, July 2, 2018
If you are interested in participating, make sure you add this option at the time of registration.
Please note that places are limited. Further details available here.
09:00 - 10:00
Registration
Foyer 2nd Floor
Welcome from the Local Hosts
Hans-Peter Wessels (City of Basel, Switzerland)
Andrea Schenker-Wicki (University of Basel, Switzerland)
Welcome from the Conference Co-Chairs
Florina Ciorba (University of Basel, Switzerland)
Erik Lindahl (Stockholm University, Sweden)
Chair: Dimitri Komatitsch (CNRS, France)
Earthquakes are highly non-linear multiscale problems, encapsulating geometry and rheology of faults within the Earth’s crust torn apart by propagating shear fracture and emanating seismic wave radiation. This talk will focus on using physics-based scenarios, modern numerical methods and hardware specific optimizations to shed light on the dynamics, and severity, of earthquake behaviour. It will present the largest-scale dynamic earthquake rupture simulation to date, which models the 2004 Sumatra-Andaman event - an unexpected subduction zone earthquake which generated a rupture of over 1,500 km in length within the ocean floor followed by a series of devastating tsunamis. The core components of the simulation software will be described, highlighting the benefits of strong collaborations between domain and computational scientists. Lastly, future directions in coupling the short-term elastodynamics phenomena to long-term tectonics and tsunami generation will be discussed.
Biography:
Alice-Agnes Gabriel is an Assistant Professor of Geophysics at Ludwig Maximilian University of Munich. She received a PhD in seismology from ETH Zurich in 2013. She fuses expertise from Earth science, physics and computational mathematics to study the fundamentals of earthquake physics and develop methodological innovations for seismology. She is specifically interested in simulating waves and rupture processes within arbitrarily complex geological structures to enhance classic probabilistic seismic hazard assessment and a wide range of industry applications. Her career is distinguished by first-rate earthquake scenarios realized on some of the largest supercomputers worldwide.
Chair: Sabine Roller (University of Siegen, Germany)
Track(s):
Emerging Application Domains, Climate and Weather
12:10 - 13:00
Lunch (sponsored by Cray Inc.)
Foyer 2nd Floor
13:00 - 15:00
Minisymposia Session I
MS01 - Adaptive Parallel Strategies for the Exploration of Challenging Search Spaces with Applications in Particle Simulations and Optimization, Part I
Samarkand Room
Organizer(s):
Andreas Vitalis (University of Zurich, Switzerland)
Marco Bacci (University of Zurich, Switzerland)
Amedeo Caflisch (University of Zurich, Switzerland)
Track(s):
Life Sciences, Emerging Application Domains, Computer Science and Applied Mathematics
A common problem in numerical optimization and sampling is the detection of relevant states. These could be, for instance, the local minima on a rugged parameter surface or the transition state of a chemical reaction. For most cases, an exhaustive search for the optimal solution is intractable. Here, we focus on parallel sampling and optimization strategies relying on multiple replicas, most prominently, adaptive methods where all simulated replicas use the same propagator and sample the same underlying surface. In these methods, replica intercommunication is used to provide a global assessment as to which replicas are most interesting. This implies, in general, periodic data mining steps across replicas. Furthermore, in order to extract and utilize the gained information in post-processing, data must often be stored, which poses stringent data management and analysis challenges in particular for high-dimensional cases. The minisymposium wishes to discuss the following questions: What are meaningful and easily generalizable tools, strategies, and algorithms to guide the sampling/exploration? How can we maintain scalability and load balance? What types of post-processing algorithms can be applied to the generated data, and are those scalable to provide on-the-fly solutions to direct the exploration?
13:00 - 13:30
FAST - Goal-Oriented Adaptive Sampling of Protein Dynamics
Gregory Bowman (Washington University School of Medicine, United States of America)
Abstract:
Molecular dynamics simulations are a powerful means of understanding conformational changes. However, it is still difficult to simulate biologically relevant time scales without the use of specialized supercomputers. Here, we introduce a goal-oriented sampling method, called fluctuation amplification of specific traits (FAST), for extending the capabilities of commodity hardware. FAST works by iteratively running a batch of simulations, building a Markov state model (MSM), and then using the last MSM to decide what subset of the states that have been discovered so far it would be most valuable to run the next set of simulations from. Importantly, the ranking function we use to choose starting points for each batch of simulations includes an exploitation term that favors states with desirable geometric properties and an exploration term that favors poorly sampled states. FAST outperforms conventional simulations and other MSM-based adaptive sampling algorithms by at least an order of magnitude. Furthermore, FAST yields both the proper thermodynamics and kinetics because, in contrast to many other enhanced sampling algorithms, the Hamiltonian used during individual simulations is unperturbed. Therefore, we expect FAST to be of great utility for a wide range of applications.
13:30 - 14:00
Applications and Advancements of the Progress-Index Guided Sampling Method in Molecular Dynamics Simulations
Marco Bacci (University of Zurich, Switzerland)
Abstract:
Computer simulations of molecules offer unparalleled spatial and temporal resolution to the characterization of atomistic processes. However, the complexity and ruggedness of the free energy landscape often hamper the usefulness of brute-force molecular dynamics as most of the simulation time is spent in a few metastable states. To help in overcoming these limitations, we have recently developed the Progress Index Guided Sampling (PIGS) method. PIGS is a multi-replica unsupervised adaptive sampling protocol that aims to maximize phase space coverage by reseeding redundant replicas with interesting ones. Interesting replicas are detected on-the-fly by using a heuristic, which is informed by scalable data-mining algorithms that take as input a user-defined representation of the simulated system. Therefore, PIGS allows focusing the sampling enhancement on selected regions of interest without the need for reaction coordinates or external potentials. This also means that it is a straightforward task to retrieve in post-processing the thermodynamics and kinetics of the system within a Markovian approximation of the true dynamics. Here we show results from real-life simulations of biomolecules in explicit solvent performed with sampling enhancement on segments of different length. Additionally, we present algorithmic advancements, especially a fully scalable implementation of PIGS in the simulation engine GROMACS.
14:00 - 14:30
iMapD: Intrinsic Map Dynamics Exploration for Uncharted Effective Free Energy Surfaces
Roberto Covino (Max Planck Institute of Biophysics, Germany)
Abstract:
Molecular dynamics (MD) simulations explore the configurational space of physical systems at their natural pace. Simulations extensively revisit typical configurations until rare and interesting transition events occur. Biasing the simulator away from the region already explored can, therefore, drastically accelerate the discovery of new regions, and is often the only way to gain access to all relevant states. We propose iMapD, an enhanced exploration simulation framework, where MD and machine learning adaptively bootstrap each other. Machine learning guides the search for important configurations by processing information from previous explorations. This search proceeds iteratively in an algorithmically orchestrated fashion without advance knowledge of suitable collective variables. The enhanced exploration occurs through strategically initialized short unbiased simulations, and does not rely on any unphysical force steering the dynamics of the system. Applied to a molecular sensor of lipid saturation in membranes, a dimer dissociation pathway not seen in millisecond long equilibrium simulations is discovered at the second iteration. In combination with path sampling techniques, iMapD enables us to characterize even the slowest dynamics of the system.
14:30 - 15:00
Exploiting Task-Based Parallelism in Bayesian Uncertainty Quantification and Stochastic Optimization
Panagiotis Hadjidoukas (ETH Zurich, Switzerland)
Abstract:
The mapping of Uncertainty Quantification (UQ) to computing architectures is a very challenging process and at the same time an essential aspect for all fields of simulation science. In UQ the aggregate scientific knowledge is obtained by ensembles of simulation runs, created dynamically by the employed UQ algorithm and scheduled on the available compute nodes. Π4U is a computational framework that exploits the capabilities of massively parallel and hybrid computer architectures for large scale Bayesian uncertainty quantification, reliability analysis and stochastic optimization. At the core of the framework, a platform-agnostic task-parallel library supports nested parallelism and provides automatic load balancing on computing architectures that range from multicore systems to hybrid GPU clusters. The software is open-source and includes HPC implementations of algorithms such as Transitional Markov Chain Monte Carlo and Approximate Bayesian Computation. Experimental results using representative applications demonstrate the flexibility and excellent scalability of the proposed framework.
MS02 - Capability Computing, Performance Portability, and Co-Design in the PASC Projects
Sydney Room
Organizer(s):
Joost VandeVondele (ETH Zurich / CSCS, Switzerland)
Track(s):
Life Sciences, Computer Science and Applied Mathematics, Climate and Weather, Chemistry and Materials, Solid Earth Dynamics
Selected PASC projects will present the scientific challenge they aim to solve by using high-end supercomputers, and in particular the computational approach adopted. The topics include astrophysics (smooth particle hydrodynamics), numerical weather and climate (stencils and grids), linear algebra for electronic structure (sparse matrix and tensor operations), and biomedical applications (fluid-structure interaction, machine learning).
The focus will be on aspects concerning (i) capability computing: how to scale to several hundreds/thousands of compute nodes, including the use of communication-optimal algorithms and asynchronous communication; (ii) performance portability: how to address the growing diversity in hardware on a compute node, including generic software design and auto-tuning; and (iii) co-design in these projects: how to engage with vendors to optimally exploit current hardware, and to provide feedback that has influenced or will influence next-generation hardware.
Topics include: side-by-side comparisons of multi-core, many-core, and GPU compute nodes; optimization techniques for flops, memory bandwidth, or network performance; JIT compilation of machine-specific kernels; programming approaches such as the use of domain-specific languages (DSLs), remote memory access (RMA), and task-based programming.
13:30 - 14:00
Portability and Scalability of the COSMO Weather and Climate Model on Heterogeneous Architectures
Carlos E. Osuna (MeteoSwiss, Switzerland)
+ Abstract { "session": {"id":"sess159","title":"MS02 - Capability Computing, Performance Portability, and Co-Design in the PASC Projects","date":"Monday, July 2nd 2018","begin_time":"13:00","end_time":"15:00","room":"Sydney Room","contributors":[{"type":"Session Chair","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland"}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Life Sciences","Computer Science and Applied Mathematics","Climate and Weather","Chemistry and Materials","Solid Earth Dynamics"],"slots":[{"id":"symp108","type":"minisymposia","title":"MS02 - Capability Computing, Performance Portability, and Co-Design in the PASC Projects","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"Selected PASC projects will present the scientific challenge they aim to solve by using high-end supercomputers, and in particular the computational approach adopted. The topics include astrophysics (smooth particle hydrodynamics), numerical weather and climate (stencils and grids), linear algebra for electronic structure (sparse matrix and tensor operations), and biomedical applications (fluid-structure interaction, machine learning).\u003Cbr \/\u003E\u003Cbr \/\u003EThe focus will be on aspects concerning (i) capability computing: how to scale to several hundreds\/thousands of compute nodes, including the use of communication optimal algorithms and asynchronous communication; (ii) performance portability: how to address the growing diversity in hardware on a compute node, including generic software design, and auto-tuning; and (iii) co-design in these projects: how to engage with vendors to optimally exploit current hardware, and to provide feedback that has or will influence next-generation hardware.\u003Cbr \/\u003E\u003Cbr \/\u003ETopics include: side by side comparisons of multi-core, many core, and GPU compute nodes; optimization techniques for flops, memory bandwidth, or network 
performance; JIT compilation of machine specific kernels; programming approaches such as the use of domain specific languages (DSLs), remote memory access (RMA), task based programming.","bio":"","contributors":[{"type":"Organizer","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa123","type":"child","title":"SPH-EXA: Optimizing Smooth Particle Hydrodynamics for Exascale Computing","begin_time":"13:00","end_time":"13:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Understanding fluid and plasma behavior under complex physical conditions forms the basis of highly\u00a0important research questions. Numerical simulations of fluids in\u00a0astrophysics and computational fluid dynamics are among the most computationally demanding calculations in\u00a0terms of sustained floating point operations per second, which are expected to benefit from the upcoming Exascale high-performance computers. A well-known hydrodynamics solver is\u00a0Smooth Particle Hydrodynamics (SPH). The parallelization of codes implementing the SPH method is not trivial\u00a0due to the nature of the\u00a0physics and the algorithms involved. The SPH-EXA project targets the design of a\u00a0scalable\u00a0and\u00a0fault tolerant\u00a0SPH-EXA mini-app. The scientific insights from the optimized executions of the SPH-EXA mini-app will be\u00a0incorporated into current SPH-based production codes in the fields of astrophysics\u00a0(SPHYNX, ChaNGa), and CFD (SPH-flow), resulting in, what we call, the SPH-EXA\u00a0version of those codes. 
The SPH-EXA mini-app will employ advanced parallelization methods, scalable dynamic load balancing within single compute nodes and across massive numbers of nodes, and fault-tolerance mechanisms to sustain its scalable execution. An essential outcome of this project is a repository of experiments to enable verification, reproducibility, and portability of the execution and simulation results to other SPH-EXA codes.","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"msa287","type":"child","title":"Portability and Scalability of the COSMO Weather and Climate Model on Heterogeneous Architectures","begin_time":"13:30","end_time":"14:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The clear evidence of the importance of high horizontal resolutions in the quality and accuracy of weather and climate simulations is demanding unprecedented computational capacity. 
Previous developments from PASC projects resulted in a GPU capable version of the COSMO model that provides significant speedup in the time to solution on NVIDIA GPUs, which allowed the first operational GPU enabled weather forecast system at MeteoSwiss as well as European-scale decadal climate simulations at unprecedented resolutions of 2 km. In order to improve the performance portability of COSMO in heterogeneous systems, recent efforts are supporting and optimizing COSMO for Xeon Phi KNL architecture and further improving the performance on accelerators exploiting advanced optimizations like task parallelism on the model that improve the performance on strong scalability regimes on massively parallel accelerators. Additionally, recent developments of a toolchain allow to combine all these advanced optimizations with a performance model specific to the domain and configuration of the model. We present results and performance comparisons for the COSMO 1km resolution configuration on Xeon Phi KNL and NVIDIA P100 systems.","filename":"msa287s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Carlos E.","last_name":"Osuna","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Stefan","last_name":"Moosbrugger","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Felix","last_name":"Thaler","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Oliver","last_name":"Fuhrer","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Hoefler","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Carlos 
E.","last_name":"Osuna","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa155","type":"child","title":"Implementing a Sparse Tensor Linear Algebra Library for Electronic Structure Calculations","begin_time":"14:00","end_time":"14:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Sparse matrix-matrix multiplication is an essential building block for a wide range of algorithms in various scientific fields. For this task, the sparse matrix library DBCSR (Distributed Block Compressed Sparse Row) has been developed. Its multi-layered structure automatically takes care of and optimizes several computational aspects like parallelism (MPI, OpenMP, CUDA), data (cache) locality and on-the-fly filtering. As part of the PASC project, we are extending the library to include tensor algebra, based on the realization that most tensor operations can be mapped on matrix multiplications. First, we introduce the library, describing the repository on GitHub, how to compile it, the test methods, the tutorial, and the API in Fortran and C\/C++. Then we give details on the implemented solutions to tackle scalability on large node-counts, based on a communication optimal algorithm with dynamically distributed load-balancing, implemented with remote memory access MPI communications. At the node level, we present a novel approach for the generation of optimal kernels based on autotuning and JIT compilation. 
Finally, we report the performance results, in terms of time-to-solution and energy-to-solution, of DBCSR on systems with Intel Xeon CPUs, Intel Xeon Phi Knights Landing (KNL) processors, and systems with NVIDIA GPUs.","bio":"","contributors":[{"type":"Author","first_name":"Juerg","last_name":"Hutter","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alfio","last_name":"Lazzaro","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Ilia","last_name":"Sivkov","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Seewald","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juerg","last_name":"Hutter","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa214","type":"child","title":"AV-FLOW: A High-Performance Library for Fluid-Structure Interaction with Complex Materials and Transitional Flow","begin_time":"14:30","end_time":"15:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The flow systems of the heart and the great blood vessels comprise complex materials (soft tissue) and flows at moderately high Reynolds numbers which may undergo transition from laminar to turbulent flow. Computational modelling of such fluid-structure interaction (FSI) problems requires efficient high-fidelity solvers for structure and flow as well as a robust scheme for coupling the two phases. We present a new FSI framework based on the immersed boundary method which has been developed for modelling cardiovascular flow systems. 
This high-performance library is optimized for parallel execution on the Cray XC40\/50 system at CSCS. The structural and flow solvers use geometric domain decomposition for parallelization on multi-core multi-node platforms. The coupling between the structure and flow uses a parallel transfer library to minimize communication between the different computing cores. Compute intensive kernels were written in CUDA to make use of the GPGPUs on the nodes of the Cray XC40\/50. We show performance benchmarks and different FSI test cases including a benchmark for solid inertia and a problem with transitional flow past an obstacle made of a complex material with fibers.","bio":"","contributors":[{"type":"Author","first_name":"Dominik","last_name":"Obrist","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dominik","last_name":"Obrist","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"msa287","type":"child","title":"Portability and Scalability of the COSMO Weather and Climate Model on Heterogeneous Architectures","begin_time":"13:30","end_time":"14:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The clear evidence of the importance of high horizontal resolutions in the quality and accuracy of weather and climate simulations is demanding unprecedented computational capacity. Previous developments from PASC projects resulted in a GPU capable version of the COSMO model that provides significant speedup in the time to solution on NVIDIA GPUs, which allowed the first operational GPU enabled weather forecast system at MeteoSwiss as well as European-scale decadal climate simulations at unprecedented resolutions of 2 km. 
In order to improve the performance portability of COSMO in heterogeneous systems, recent efforts are supporting and optimizing COSMO for Xeon Phi KNL architecture and further improving the performance on accelerators exploiting advanced optimizations like task parallelism on the model that improve the performance on strong scalability regimes on massively parallel accelerators. Additionally, recent developments of a toolchain allow to combine all these advanced optimizations with a performance model specific to the domain and configuration of the model. We present results and performance comparisons for the COSMO 1km resolution configuration on Xeon Phi KNL and NVIDIA P100 systems.","filename":"msa287s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Carlos E.","last_name":"Osuna","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Stefan","last_name":"Moosbrugger","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Felix","last_name":"Thaler","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Oliver","last_name":"Fuhrer","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Hoefler","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Carlos E.","last_name":"Osuna","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Carlos 
E.","last_name":"Osuna","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Stefan","last_name":"Moosbrugger","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Felix","last_name":"Thaler","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Oliver","last_name":"Fuhrer","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Hoefler","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false}] } Presentation
Organizer(s):
Felix Kubler (University of Zurich, Switzerland)
Track(s):
Emerging Application Domains
Discrete-time, infinite-horizon, general equilibrium models are routinely used in macroeconomics and public finance to explore the quantitative features of model economies and to conduct counterfactual policy analysis. One important question is how much household heterogeneity matters for the amplification and propagation of macroeconomic shocks.
In this session we bring together leading young researchers in the field to present alternative approaches to the computation of equilibria in dynamic stochastic models with heterogeneous agents and/or with financial frictions. Three of the papers directly propose new methods for the solution of models with a continuum of ex post heterogeneous agents.
13:30 - 14:00
Solving Heterogeneous Agent Models with Nonconvex Optimization Problems: Linearization and Beyond
Michael Reiter (Institute for Advanced Studies, Austria)
Session:
MS03 - Computational Aspects of Heterogeneous Agents Macro
Room:
Nairobi Room
Abstract:
In this talk I present a methodology for the solution of dynamic stochastic general equilibrium models with heterogeneous agents, with an emphasis on models with nonconvex decision problems. First I present an implementation of a linearization method that makes the solution of large models feasible, using dimension reduction methods both on the states and on the equilibrium variables of the model. The linearized solution serves as a starting point to compute global approximation solutions, by providing a guess of the value functions and a suitable collocation grid. The method is applied to a two-asset model where households hold a financial asset and face a discrete housing choice.
14:00 - 14:30
Comparative Valuation Dynamics in Models with Financing Restrictions
Paymon Khorrami (University of Chicago, United States of America)
Co-author(s):
Fabrice Tourre (Northwestern University, United States of America)
Session:
MS03 - Computational Aspects of Heterogeneous Agents Macro
Room:
Nairobi Room
Abstract:
This contribution develops a theoretical framework to nest many recent dynamic stochastic general equilibrium economies with financial frictions into one common generic model. Our goal is to study the macroeconomic and asset pricing properties of this class of models, identify common features and highlight areas where these models depart from each other. In order to characterize the asset pricing implications of this family of models, we study their term structure of risk prices and risk exposures, the natural extension of impulse response functions in economic environments exhibiting non-linear dynamics. Given our continuous-time setup with a Brownian information structure, our study requires us to solve systems of non-linear partial differential equations of up to 4 state variables; the occasionally binding nature of our financial frictions gives rise to a free boundary problem in the 4-dimensional state space. We use finite difference schemes coded in C++ and an iterative procedure to compute the equilibrium dynamics, the stationary distribution, the shock exposure and cost elasticities, and rho-mixing coefficients of our model.
Organizer(s):
Jean-Roch Vlimant (California Institute of Technology, United States of America)
Sofia Vallecorsa (CERN, Switzerland)
Wahid Bhimji (Lawrence Berkeley National Laboratory, United States of America)
Track(s):
Computer Science and Applied Mathematics, Physics
The power of artificial neural nets at executing challenging tasks learned from data is very attractive to the physical sciences and especially to High Energy Physics (HEP). In recent years, a significant number of articles have reported promising results from applying deep learning to HEP challenges. Because deep and wide artificial neural nets have a very large number of parameters and are trained with stochastic gradient descent, a lot of representative data must be processed in order to obtain accurate models. Training such models requires a tremendous amount of computing, and commonly takes days, if not weeks, to converge. GP-GPU technology has enabled much of this computation, thanks to the high degree of parallelism inherent in the artificial neural net formalism, but more can be gained by parallelising the calculation of the stochastic gradient descent itself. Supercomputing facilities are particularly suited for distributed training of deep neural nets, thanks to their large computation power and excellent connectivity. This minisymposium will address the current state of the art and the performance currently achieved in training models for high energy physics, with a particular view to software availability and to fostering further utilization of supercomputers.
14:00 - 14:30
Extreme Scale Deep Learning at NERSC
Thorsten Kurth (Lawrence Berkeley National Laboratory, United States of America)
Co-author(s):
Wahid Bhimji (Lawrence Berkeley National Laboratory, United States of America)
Abstract:
We present various studies on very large scale distributed deep learning on HPC systems, including the ~10k node Intel Xeon Phi-based Cori system at NERSC. We explore CNN classification architectures and generative adversarial networks for HEP problems using large images corresponding to full LHC detectors and high-resolution cosmology convergence maps. We have explored distributed scaling in different deep-learning frameworks, including Caffe, TensorFlow and PyTorch, with different communication layers, i.e. Google RPC or MPI-based approaches such as Intel MLSL, Uber Horovod and Cray's CPE ML Plugin. We describe various approaches for scaling out the training of single models up to the full Cori system. We further discuss recent work contrasting performance with different frameworks, systems and system architectures.
Presentation: msa182s1.pdf
Session:
MS04 - Distributed Training of Deep Neural Net Models for High Energy Physics, Monday, July 2, 2018, 13:00 - 15:00, Osaka Room
Session Chair:
Jean-Roch Vlimant (California Institute of Technology)
Other talks in this session:
13:00 - 13:30
Large Scale Training for Model Optimization
Felice Pantaleo (CERN, Switzerland)
Co-author(s):
Maurizio Pierini (CERN, Switzerland)
Jean-Roch Vlimant (California Institute of Technology, United States of America)
Thong Nguyen (California Institute of Technology, United States of America)
Abstract:
In recent years, several studies have demonstrated the benefit of using deep learning to solve typical tasks related to high energy physics data taking and analysis. The computational need for inference of a model once trained is rather modest and does not usually need specific treatment. The training of neural net models requires a lot of data, especially for deep models with numerous parameters. Training of such models has been made tractable by improved optimization methods and by the advent of GPUs well adapted to the task of training neural nets. It is important to scale up the available network-training resources and to provide tools for optimal large-scale training. One of the avenues to further accelerate training is data parallelism, in which the gradients are computed on multiple subsets of the data in parallel and used collectively to update the model toward the optimum parameters. Several frameworks exist for performing distributed training, all with their strengths and limitations. In this context, we describe our development of a new training workflow, which scales on multi-node/multi-GPU architectures with an eye to deployment on high performance computing machines.
13:30 - 14:00
Training Generative Adversarial Models over Distributed Computing System
Gul Rukh Khattak (CERN, Switzerland)
Co-author(s):
Sofia Vallecorsa (CERN, Switzerland)
Jean-Roch Vlimant (California Institute of Technology, United States of America)
Federico Carminati (CERN, Switzerland)
Abstract:
In the High Energy Physics field, simulation of the interaction of particles in detector material is a computing-intensive task, even more so with complex and fine-grained detectors. The complete and most accurate simulation of particle/matter interaction is essential while calibrating and understanding the detector, but is seldom required at physics analysis level, since several detector effects can hide slight imperfections in simulation. Some level of approximation is therefore acceptable and less computationally intensive approaches can be implemented. We present a fast simulation based on conditional generative adversarial networks. We use a dataset composed of the energy deposition from electrons, photons, charged and neutral hadrons in a fine-grained digital calorimeter. The training of these models is quite computing intensive, even with the help of GPGPU, and we propose a method to train them over multiple nodes and GPGPU using a standard message passing interface. We report on the scaling of time-to-solution. Further tuning of the models' hyper-parameters is rendered tractable, and we present the physics performance of the best model obtained via a Bayesian optimization using Gaussian processes. We demonstrate how a high performance computing center can be utilized to globally optimize these kinds of models.
14:30 - 15:00
Practical Scaling Techniques
Fernanda Foertter (NVIDIA Inc., Switzerland)
Co-author(s):
Peter Messmer (NVIDIA Inc., Switzerland)
Abstract:
The need for large scale training of neural networks stems from the advent of ever-growing labeled datasets in data science, combined with the successes of deep learning at achieving super-human performance at pattern recognition and other tasks. Fast and powerful GP-GPUs have enabled such training thanks to an impressive level of parallelisation of computation. There remain, however, large problems which may take days to weeks to converge. To this end, additional levels of parallelisation across computing units are used for additional speed-up. We present an overview of the practical techniques which can be used for scaling throughput of model training.
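The talks in this session rely on data parallelism: each worker computes gradients on its own subset of the data, and the results are combined to update a shared model. As a minimal sketch only (it is not the code of any framework named in the session, and the toy model, function names and parameter values are all illustrative assumptions), the idea can be simulated in plain Python, with ordinary loops standing in for the workers and a simple average standing in for an MPI-style all-reduce:

```python
# Sketch of synchronous data-parallel SGD on a toy 1-D linear model y = w * x.
# Workers are simulated by a loop; names and values here are illustrative only.

def shard(data, n_workers):
    """Split data into n_workers equal contiguous shards (remainder dropped)."""
    size = len(data) // n_workers
    return [data[i * size:(i + 1) * size] for i in range(n_workers)]

def local_gradient(w, batch):
    """Mean gradient of the loss 0.5 * (w*x - y)**2 over one worker's shard."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)

def allreduce_mean(values):
    """Stand-in for an all-reduce: every worker would receive this mean."""
    return sum(values) / len(values)

def train(data, n_workers=4, lr=0.01, steps=200):
    """Run synchronous SGD: local gradients, global average, shared update."""
    w = 0.0
    shards = shard(data, n_workers)
    for _ in range(steps):
        grads = [local_gradient(w, s) for s in shards]  # parallel in practice
        w -= lr * allreduce_mean(grads)
    return w

# Eight noise-free samples generated with the true weight w = 3.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = train(data)
```

With equal-sized shards, the averaged per-worker gradient equals the gradient over the full batch, so the update matches single-node training while the gradient work is divided among the workers.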
Organizer(s):
Gerhard Wellein (Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany)
Georg Hager (Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany)
Helmar Burkhart (University of Basel, Switzerland)
Track(s):
Solid Earth Dynamics, Physics, Life Sciences, Engineering, Emerging Application Domains, Computer Science and Applied Mathematics, Climate and Weather, Chemistry and Materials
Achieving hardware and energy efficiency is important for current large-scale numerical simulations and will be a key component in the exascale era. In a world of heterogeneous, highly parallel computer architectures with deep memory hierarchies, complex application scenarios, and a broad spectrum of algorithms, a thorough analysis and understanding of the complex interaction of software, data structures, algorithms, and hardware features, a.k.a. performance engineering, is required for implementing codes that allow for portable performance on the computer generations to come. The minisymposium addresses a broad range of topics in performance engineering for modern HPC architectures, ranging from recent advances in performance models and tools supporting a "white-box" performance engineering approach to application performance tuning case studies and "black-box" solutions. The presentations will point out the potential and limitations of performance engineering activities and demonstrate the wide spectrum of performance models used in performance engineering, including simple performance expectations, automatic model parameter selection, and analytic models.
13:00 - 13:30
Performance Engineering - Why and How?
Georg Hager (Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany)
Abstract:
We give an overview of Performance Engineering (PE) techniques used in scientific computing. Starting from a motivation based on resource efficiency, we demonstrate how PE can support computational science along several directions of thrust: classification, insight, prediction, and optimization. There is a wide range of PE techniques, all of which have a modeling component of some kind. Such models come in all shapes and sizes, but we classify them on a scale from black to gray to white. Black-box models ignore all or most of the actual "inner workings" of hardware-software interactions and try to classify or predict interesting metrics automatically, based mostly on measurements. White-box models try to derive useful predictions from first principles, i.e., known properties of the hardware and the software. The wide range of "gray-box" models in between bridges the gap and uses the best of both worlds. Examples from physics and high performance computing are given.
Presentation: msa145s1.pdf
Session:
MS05 - Foundations and Applications of Performance Engineering, Monday, July 2, 2018, 13:00 - 15:00, Singapore Room
Session Chair:
Gerhard Wellein (Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany)
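White-box models, as described in the abstract above, predict performance from known properties of the hardware and the software. One widely used example is the Roofline model, which bounds attainable performance by the lesser of peak compute throughput and memory bandwidth multiplied by arithmetic intensity. The sketch below is only an illustration: the machine figures and the kernel's intensity are hypothetical values chosen for the example and are not taken from the talk.

```python
# Roofline sketch: attainable performance = min(peak, bandwidth * intensity).
# All machine and kernel numbers below are hypothetical.

def roofline(peak_flops, mem_bandwidth, intensity):
    """Return the attainable performance in flop/s.

    peak_flops    -- peak compute throughput in flop/s
    mem_bandwidth -- sustained memory bandwidth in byte/s
    intensity     -- arithmetic intensity of the kernel in flop/byte
    """
    return min(peak_flops, mem_bandwidth * intensity)

PEAK = 1.0e12  # assumed machine peak: 1 Tflop/s
BW = 1.0e11    # assumed memory bandwidth: 100 GB/s

# A streaming kernel assumed to perform 2 flops per 24 bytes of memory traffic
# is limited by memory bandwidth, far below the machine peak.
streaming = roofline(PEAK, BW, 2 / 24)

# A kernel assumed to perform 100 flops per byte is limited by peak compute.
compute_bound = roofline(PEAK, BW, 100.0)
```

Comparing a measured rate against such a bound indicates whether a kernel is limited by data transfer or by computation, which is the kind of first-principles insight that a purely measurement-based black-box model does not provide.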
13:30 - 14:00
Towards a Discipline of Performance Engineering: Lessons Learned from Stencil Kernel Benchmarks
Danilo Guerrera (University of Basel, Switzerland)
Measuring performance accurately is often challenging, and the measurement process is seldom well documented and accurate, raising a credibility problem concerning the collected data. In this talk, we show how performance models and tools can work together. The selected test cases are kernels belonging to the stencil pattern, which is present in several scientific applications, ranging from geophysics to astronomy, fluid dynamics, image processing, and weather forecasts. First, we comment on how to pass from a description of a stencil to pseudo code and move to a modeling phase based on the "Kerncraft" tool for automatic Roofline and Execution-Cache-Memory performance modeling. After the automatic generation of compilable source code, we will focus on how to ensure the reproducibility of the performance results of its execution, using "PROVA!", a distributed workflow and system management tool for reproducible research. Knowing that a specific code performs in accordance with the model(s) can lead to the identification of relevant bottlenecks and therefore to potential optimizations. The ultimate goal is to generalize our approach to modeling, predicting and benchmarking to a general application context.
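As a rough illustration of the Roofline modeling mentioned in the abstract above, the following minimal sketch computes the basic Roofline bound: attainable performance is the smaller of the peak compute rate and the memory bandwidth multiplied by the arithmetic intensity. It is not taken from Kerncraft or PROVA!; the function name `roofline_bound`, the machine figures, and the stencil traffic numbers are hypothetical values chosen only for the example.

```python
def roofline_bound(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Upper performance bound in GFlop/s under the basic Roofline model.

    All parameter names are illustrative assumptions, not a tool's API.
    """
    if min(peak_gflops, bandwidth_gbs, intensity_flop_per_byte) <= 0:
        raise ValueError("all inputs must be positive")
    # Memory-bound ceiling: bandwidth (GB/s) times intensity (flop/byte).
    memory_ceiling = bandwidth_gbs * intensity_flop_per_byte
    return min(peak_gflops, memory_ceiling)


# Hypothetical machine: 1000 GFlop/s peak, 100 GB/s memory bandwidth.
# Hypothetical stencil: 8 flops and 24 bytes of memory traffic per update.
intensity = 8 / 24
bound = roofline_bound(1000.0, 100.0, intensity)
print(f"{bound:.1f} GFlop/s")  # prints "33.3 GFlop/s"
```

With these assumed numbers the kernel is memory bound, so the bound is set by bandwidth rather than by peak compute rate. Comparing a measured rate against such a bound is one simple way to judge whether a code performs in line with the model.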
14:00 - 14:30
Holistic Performance Engineering for Sparse Iterative Solvers
Jonas Thies (German Aerospace Center, Germany)
In many applications, sparse (linear and/or eigenvalue) solvers take up a large fraction of the overall runtime. We believe that the increasingly complex hardware of today's and future HPC systems has led to a gap in the understanding of the performance achieved by actual applications, many of which are still using a monolithic 'MPI only' approach despite the heterogeneous nature of the hardware. We have developed a new sparse solver library PHIST (https://bitbucket.org/essex/phist/) that defines a simple "kernel interface" layer inspired by MPI. Algorithms implemented in PHIST are portable in terms of software and performance as they only call building blocks of linear algebra via this interface. We have introduced simple performance models for these basic building blocks at the interface level, so that regardless of the backend providing the implementation, an overview of the optimization potential on the kernel level can be obtained, and performance pitfalls in the application (e.g. strided memory accesses) may be revealed. Available backends for PHIST include established libraries such as Trilinos/Epetra or PETSc, as well as more recent "MPI+X" approaches as implemented in Trilinos/Tpetra or our own kernel library GHOST (https://bitbucket.org/essex/ghost).
Organizer(s):
Stefan Goedecker (University of Basel, Switzerland)
Andre Schleife (University of Illinois at Urbana-Champaign, United States of America)
Matthieu Verstraete (Universite de Liege, Belgium)
Track(s):
Engineering, Chemistry and Materials, Physics
Modern electronic-structure methods provide parameter-free simulations of properties across the whole spectrum of Physics, Chemistry, and Materials Science. They have become a bedrock of advanced analytical methods and are used systematically to interpret advanced experiments and complex interactions, with ever-growing prospects for more "realistic" systems, including defects, thermal and external fields, and transient phenomena. This minisymposium explores the next generation of electronic-structure software, which will lead users to exascale supercomputers through highly efficient and highly parallel algorithms. We will showcase recent advances in ground- and excited-state calculations, spectroscopic quantities and transport, and adaptive methods which exploit different algorithms for different systems, such as periodic/localized or many/few electrons per atom.
13:00 - 13:30
First-Principles Electron Transport with Phonon Coupling: Large Scale at Low Cost
Tue Gunst (Technical University of Denmark, Denmark)
+ Abstract { "session": {"id":"sess184","title":"MS06 - Large Scale Electronic-Structure Calculations on Modern and Future High-Performance Supercomputers","date":"Monday, July 2nd 2018","begin_time":"13:00","end_time":"15:00","room":"Boston 3 Room","contributors":[{"type":"Session Chair","first_name":"Stefan","last_name":"Goedecker","affiliation":"University of Basel","country":"Switzerland"}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Engineering","Chemistry and Materials","Physics"],"slots":[{"id":"symp137","type":"minisymposia","title":"MS06 - Large Scale Electronic-Structure Calculations on Modern and Future High-Performance Supercomputers","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"Modern electronic-structure methods provide parameter-free simulations of properties across the whole spectrum of Physics, Chemistry, and Materials Science. They have become a bedrock of advanced analytical methods and are used systematically to interpret advanced experiments and complex interactions, with ever growing perspectives for more \u0022realistic\u0022 systems, including defects, thermal and external fields, and transient phenomena. This minisymposium explores the next generation of electronic-structure software, which will lead users to exascale supercomputers, through highly efficient and highly parallel algorithms. 
We will showcase recent advances in ground- and excited-state calculations, spectroscopic quantities and transport, and adaptive methods which exploit different algorithms for different systems such as periodic\/localized or many\/few electrons per atom.","bio":"","contributors":[{"type":"Organizer","first_name":"Stefan","last_name":"Goedecker","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Organizer","first_name":"Andre","last_name":"Schleife","affiliation":"University of Illinois at Urbana-Champaign","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Organizer","first_name":"Matthieu","last_name":"Verstraete","affiliation":"Universite de Liege","country":"Belgium","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Stefan","last_name":"Goedecker","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa121","type":"child","title":"First-Principles Electron Transport with Phonon Coupling: Large Scale at Low Cost","begin_time":"13:00","end_time":"13:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In the race towards high-performance nanometer-scaled devices the electronics industry now faces a major challenge from phonon-assisted tunneling. Despite the rapid size-reduction in experiments, system sizes still fall outside what is feasible for existing device models including electron-phonon coupling from first-principles. Therefore, the role of phonon-assisted tunneling in sub-10-nanometer gate-length devices has not been accurately quantified so far. We present a method that include phonon-assisted tunneling in large-scale first-principles calculations using a single \u0022special thermal displacement\u0022 of the atomic coordinates at almost the same cost as elastic transport calculations [1]. 
We apply the method to ultrascaled silicon devices and demonstrate the importance of phonon-assisted band-to-band and source-to-drain tunneling. In a diode the phonons lead to a rectification ratio suppression in good agreement with experiments, while in an ultrathin body transistor the phonons increase off currents by four orders of magnitude, in agreement with our state-of-the-art perturbation theory calculations. In addition, electron-phonon coupling of nanostructured devices in operation conditions can change significantly from its bulk value [2]. This makes the method an appealing design tool for next-generation devices and nanomaterials. [1]T. Gunst \u003Cem\u003Eet al\u003C\/em\u003E.,\u00a0Phys. Rev. B 96, 161404(R) (2017). [2]T. Gunst \u003Cem\u003Eet al\u003C\/em\u003E., Phys. Rev. Lett. 118, 046601 (2017).","filename":"msa121s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Tue","last_name":"Gunst","affiliation":"Technical University of Denmark","country":"Denmark","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Tue","last_name":"Gunst","affiliation":"Technical University of Denmark","country":"Denmark","bio":"","order":"1","is_presenter":true}]},{"id":"msa140","type":"child","title":"Large-Scale First-Principles Electronic Structure Calculations in Petascale and Exascale Supercomputers: A Real-Space Density Functional Theory Code","begin_time":"13:30","end_time":"14:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"First-principles electronic structure calculation based on the Density Functional Theory (DFT) has been an indispensable tool for many fields of material science and engineering. 
With the development of supercomputers, the size of the targets of the first-principles DFT calculations becomes larger and larger, and nowadays, the target systems with a few hundreds to a thousand of atoms have been computable with standard plane-wave based DFT program codes. However, the computable size is not yet satisfactory to clarify the properties of materials in the situations close to realistic applications. We\u0027d like to introduce our program code RSDFT, which has been developed to perform large-scale first-principles calculations on massively-parallel supercomputers including the Japanese flagship machine K computer. RSDFT is based on the real-space finite-difference pseudopotential method. Contrary to the standard plane-wave methods, the real-space method does not need to use Fast Fourier Transformations, which requires heavy communication burden in parallel computations, and therefore RSDFT shows rather good scalability even in the computations with tens of thousands of compute nodes. It has also been started to develop RSDFT for the next Japanese flagship computer called post-K computer. 
We aim to make the first-principles calculations of tens-of-thousand-atom systems easy as a daily work.","filename":"msa140s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jun-Ichi","last_name":"Iwata","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Atsushi","last_name":"Oshiyama","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jun-Ichi","last_name":"Iwata","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"1","is_presenter":true}]},{"id":"msa270","type":"child","title":"Potentialities of Wavelet Formalism towards a Reduction of the Complexity of Large Scale Electronic Structure Calculations","begin_time":"14:00","end_time":"14:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For the last few years, the BigDFT software package has implemented a linear scaling Kohn-Sham density functional theory optimization algorithm based on Daubechies wavelets, where a minimal set of localized support functions are optimized in situ and therefore adapted to the physico-chemical properties of the system under investigation. We illustrate, from a general perspective, a quantitative method to identify and assess the partitioning of a large quantum-mechanical system into fragments. Our approach reduces arbitrariness in the fragmentation procedure and enables the possibility of assessing quantitatively whether the corresponding fragment multipoles can be interpreted as observable quantities associated with a system moiety. Such an approach is based on general grounds and its implementation is unrelated to the wavelet formalism. 
However, we show that the use of a minimal set of in situ-optimized basis functions allows at the same time a proper fragment definition and an accurate description of the electronic structure.","bio":"","contributors":[{"type":"Author","first_name":"Luigi","last_name":"Genovese","affiliation":"CEA","country":"France","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Stephan","last_name":"Mohr","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Laura","last_name":"Ratcliff","affiliation":"Imperial College London","country":"United Kingdom","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Luigi","last_name":"Genovese","affiliation":"CEA","country":"France","bio":"","order":"1","is_presenter":true}]},{"id":"msa134","type":"child","title":"ABINIT on Pre-Exascale Supercomputers: Hybrid Parallelism and Numerical Stability","begin_time":"14:30","end_time":"15:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"ABINIT is one of the most widely used electronic structure codes, implementing plane-wave based Density-Functional Theory. With multiple levels of parallelism, changing the code with every hardware evolution is a tedious task. To avoid obsolescence and allow adaptivity, an abstract layer for intensive low-level computing tasks has been introduced. Low-level sections have been rewritten specifically for a few hardware types. A global change of the hybrid parallelism is necessary to adapt the code to the new and future many-core architectures, as well as to memory bandwidth. A positive side effect of memory sharing is a better convergence of the diagonalization algorithm. Performances will be shown on Intel Xeon Skylake and Intel Xeon Phi KNL. 
Vectorization with large vectors and multithreading with more and more tasks induce a non-predictability of the floating-point operations that increase numerical noise and instabilities. We tackle this issue with the use of stochastic arithmetic to estimate the number of significant digits of each code section. Doing this, we can identify numerically sensitive code sections.","filename":"msa134s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marc","last_name":"Torrent","affiliation":"CEA","country":"France","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jordan","last_name":"Bieder","affiliation":"CEA","country":"France","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Yohan","last_name":"Chatelain","affiliation":"Universit\u00e9 de Versailles Saint-Quentin-en-Yvelines","country":"France","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Pablo","last_name":"Oliveira","affiliation":"Universit\u00e9 de Versailles Saint-Quentin-en-Yvelines","country":"France","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marc","last_name":"Torrent","affiliation":"CEA","country":"France","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"msa121","type":"child","title":"First-Principles Electron Transport with Phonon Coupling: Large Scale at Low Cost","begin_time":"13:00","end_time":"13:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In the race towards high-performance nanometer-scaled devices the electronics industry now faces a major challenge from phonon-assisted tunneling. Despite the rapid size-reduction in experiments, system sizes still fall outside what is feasible for existing device models including electron-phonon coupling from first-principles. 
Therefore, the role of phonon-assisted tunneling in sub-10-nanometer gate-length devices has not been accurately quantified so far. We present a method that include phonon-assisted tunneling in large-scale first-principles calculations using a single \u0022special thermal displacement\u0022 of the atomic coordinates at almost the same cost as elastic transport calculations [1]. We apply the method to ultrascaled silicon devices and demonstrate the importance of phonon-assisted band-to-band and source-to-drain tunneling. In a diode the phonons lead to a rectification ratio suppression in good agreement with experiments, while in an ultrathin body transistor the phonons increase off currents by four orders of magnitude, in agreement with our state-of-the-art perturbation theory calculations. In addition, electron-phonon coupling of nanostructured devices in operation conditions can change significantly from its bulk value [2]. This makes the method an appealing design tool for next-generation devices and nanomaterials. [1]T. Gunst \u003Cem\u003Eet al\u003C\/em\u003E.,\u00a0Phys. Rev. B 96, 161404(R) (2017). [2]T. Gunst \u003Cem\u003Eet al\u003C\/em\u003E., Phys. Rev. Lett. 118, 046601 (2017).","filename":"msa121s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Tue","last_name":"Gunst","affiliation":"Technical University of Denmark","country":"Denmark","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Tue","last_name":"Gunst","affiliation":"Technical University of Denmark","country":"Denmark","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Tue","last_name":"Gunst","affiliation":"Technical University of Denmark","country":"Denmark","bio":"","order":"1","is_presenter":true}] } Presentation
13:30 - 14:00
Large-Scale First-Principles Electronic Structure Calculations in Petascale and Exascale Supercomputers: A Real-Space Density Functional Theory Code
Jun-Ichi Iwata (The University of Tokyo, Japan)
First-principles electronic structure calculation based on Density Functional Theory (DFT) has become an indispensable tool in many fields of materials science and engineering. As supercomputers have developed, the systems targeted by first-principles DFT calculations have grown steadily larger, and systems of a few hundred to a thousand atoms are now computable with standard plane-wave DFT codes. However, this size is not yet sufficient to clarify the properties of materials under conditions close to realistic applications. We introduce our code RSDFT, which has been developed to perform large-scale first-principles calculations on massively parallel supercomputers, including the Japanese flagship machine, the K computer. RSDFT is based on the real-space finite-difference pseudopotential method. Unlike standard plane-wave methods, the real-space method does not need Fast Fourier Transforms, which impose a heavy communication burden in parallel computations, and RSDFT therefore scales rather well even on tens of thousands of compute nodes. Development of RSDFT for the next Japanese flagship machine, the post-K computer, has also begun. We aim to make first-principles calculations of systems with tens of thousands of atoms a routine, everyday task.
Co-author(s):
Atsushi Oshiyama (The University of Tokyo, Japan)
14:30 - 15:00
ABINIT on Pre-Exascale Supercomputers: Hybrid Parallelism and Numerical Stability
Marc Torrent (CEA, France)
ABINIT is one of the most widely used electronic structure codes, implementing plane-wave based Density-Functional Theory. With multiple levels of parallelism, changing the code with every hardware evolution is a tedious task. To avoid obsolescence and allow adaptivity, an abstraction layer for intensive low-level computing tasks has been introduced. Low-level sections have been rewritten specifically for a few hardware types. A global change of the hybrid parallelism is necessary to adapt the code to new and future many-core architectures, as well as to memory bandwidth. A positive side effect of memory sharing is better convergence of the diagonalization algorithm. Performance will be shown on Intel Xeon Skylake and Intel Xeon Phi KNL. Vectorization with large vectors and multithreading with more and more tasks make floating-point operations less predictable, which increases numerical noise and instabilities. We tackle this issue by using stochastic arithmetic to estimate the number of significant digits of each code section. In this way, we can identify numerically sensitive code sections.
Co-author(s):
Jordan Bieder (CEA, France)
Yohan Chatelain (Université de Versailles Saint-Quentin-en-Yvelines, France)
Pablo Oliveira (Université de Versailles Saint-Quentin-en-Yvelines, France)
Organizer(s):
Peter Dominik Dueben (ECMWF, United Kingdom)
Rupert Ford (Science and Technology Facilities Council, United Kingdom)
Willem Deconinck (ECMWF, United Kingdom)
Track(s):
Climate and Weather
The increasingly large amounts of data being produced by weather and climate simulations and earth system observations are sometimes characterised as a deluge. This deluge of data is both a challenge and an opportunity. The main opportunities are to make use of this wealth of data to 1) improve knowledge by extracting additional information from the data and 2) improve the quality of the models themselves by analysing the accuracy, or lack thereof, of the resultant simulation data. An example of the former is improved prediction of large-scale phenomena such as El Niño. An example of the latter is the improvement of a physics parameterisation scheme through detailed analysis of the errors in a large number of datasets.
One way to realise these opportunities is to use machine learning approaches. As machine learning in weather and climate is a relatively new topic, this minisymposium introduces the audience to how machine learning could be used in weather and climate and outlines its implications in terms of computing costs. To ground the ideas, it also illustrates the use of machine learning in the weather and climate domain with practical examples.
13:30 - 14:00
Deep Learning in Weather and Climate, Part 2: The Computing Perspective
Jakob Progsch (NVIDIA Inc., Germany)
Author(s): Christoph Angerer (NVIDIA Inc., Germany), Jakob Progsch (NVIDIA Inc., Germany)
In this presentation we will discuss ways in which deep neural networks can be integrated with traditional climate and weather simulations. In particular, we will be focusing on the design, implementation, and training of a deep convolutional neural network and its integration with the IFS Forecast Model inside RAPS as a new stand-alone radiation scheme. This work is a case study for how AI and large scale simulation may be applied on a cooperative basis and let the strengths of each converge to form a new tool for science.
14:00 - 14:30
Integrating Machine Learning Algorithms and HPDA Frameworks to Run Predictive Analytics on Large-Scale Climate and Weather Datasets
Alessandro D'Anca (CMCC, Italy)
Author(s): Alessandro D'Anca (CMCC, Italy), Giovanni Aloisio (University of Salento, Italy), Sandro Fiore (CMCC Foundation, Italy)
This work relates to the integration of a recurrent neural network algorithm (Long Short-Term Memory, LSTM) into Ophidia, a datacube-oriented High Performance Data Analytics (HPDA) framework. More specifically, Ophidia provides a (big) datacube abstraction to end users, while it physically builds its core set of functionalities (namely 'operators') on top of an array-database system. Operators in Ophidia run in parallel over a cluster to tackle big data challenges on massive scientific datasets. At the array-database level, Ophidia allows end users to develop their own analytics functions (namely 'primitives'), which by definition represent a sequential array-based data transformation. By implementing the LSTM algorithm as a primitive running over a long time series, machine learning capabilities can be integrated into Ophidia, taking advantage of an HPDA approach applied to large-scale datasets. Two case studies have been considered: the former relates to the output of a WRF model running over the Brazilian region of Curitiba, whereas the latter includes both simulated data, from an unstructured-grid forecasting model run at CMCC by the Ocean Predictions and Applications Division, and observations over the Apulia region in the south-east of Italy. Preliminary insights into the proposed approach seem promising and will be presented.
14:30 - 15:00
Using Self-Organising Maps to Understand Relationships between Clouds and Cloud Controlling Factors
Samantha V. Adams (Met Office, United Kingdom)
+ Abstract { "session": {"id":"sess185","title":"MS07 - Machine Learning in Weather and Climate","date":"Monday, July 2nd 2018","begin_time":"13:00","end_time":"15:00","room":"Rio Room","contributors":[{"type":"Session Chair","first_name":"Peter Dominik","last_name":"Dueben","affiliation":"University of Oxford","country":""}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Climate and Weather"],"slots":[{"id":"symp155","type":"minisymposia","title":"MS07 - Machine Learning in Weather and Climate","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"The increasingly large amounts of data being produced by weather and climate simulations and earth system observations is sometimes characterised as a deluge. This deluge of data is both a challenge and an opportunity. The main opportunities are to make use of this wealth of data to 1) improve knowledge by extracting additional knowledge from the data and 2) to improve the quality of the models themselves by analysing the accuracy, or lack thereof, of the resultant simulation data. An example of the former case is improved prediction of large scale phenomena such as El Nino. An example of the latter is the improvement of a Physics parameterisation scheme through detailed analysis of the errors in a large number of datasets.\u003Cbr \/\u003E\u003Cbr \/\u003EOne way to realise these opportunities is to use machine learning approaches. As machine learning in weather and climate is a relatively new topic this minisymposium introduces the audience to how machine learning could be used in weather and climate and outlines its implications in terms of computing costs. 
To ground the ideas in concrete examples it also illustrates the use of machine learning in the weather and climate domain with practical examples.","bio":"","contributors":[{"type":"Organizer","first_name":"Peter Dominik","last_name":"Dueben","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Organizer","first_name":"Rupert","last_name":"Ford","affiliation":"Science and Technology Facilities Council","country":"United Kingdom","bio":"","order":"2","is_presenter":false},{"type":"Organizer","first_name":"Willem","last_name":"Deconinck","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Peter Dominik","last_name":"Dueben","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"msa111","type":"child","title":"Deep Learning in Weather and Climate, Part 1: The Domain Perspective","begin_time":"13:00","end_time":"13:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"From the perspective of Earth System modelling, the use of machine learning, an in particular deep learning, is still in its infancy. There are many possible ways how deep learning could improve model quality or generate significant speed-ups for simulations. However, it has yet to be shown that deep learning can hold what it is promising for this application and its specific needs. This talk will provide an overview how deep learning may impact Earth System modelling in the future. We will provide examples how these methods have been used until today and discuss both limitations and prospects for their application. We will present results when using deep learning to improve model simulations for a toy model of atmospheric dynamics (the Lorenz\u002795 model). 
We will also show preliminary results that use deep neural networks that are trained from global atmospheric data to represent atmospheric dynamics and networks that are designed to speed-up parts of a weather forecast model at full complexity.","bio":"","contributors":[{"type":"Author","first_name":"Peter Dominik","last_name":"Dueben","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Peter Dominik","last_name":"Dueben","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"msa149","type":"child","title":"Deep Learning in Weather and Climate, Part 2: The Computing Perspective","begin_time":"13:30","end_time":"14:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In this presentation we will discuss ways in which deep neural networks can be integrated with traditional climate and weather\u00a0simulations. In particular, we will be focusing on\u00a0the design, implementation, and training of a deep convolutional neural network and its integration with the IFS Forecast Model inside RAPS as a new stand-alone\u00a0radiation scheme. 
This work is a case study for how\u00a0AI and large scale simulation may be applied on a cooperative basis and let the strengths of each converge to form a new tool for science.","filename":"msa149s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Angerer","affiliation":"NVIDIA Inc.","country":"Germany","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jakob","last_name":"Progsch","affiliation":"NVIDIA Inc.","country":"Germany","bio":"","order":"2","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jakob","last_name":"Progsch","affiliation":"NVIDIA Inc.","country":"Germany","bio":"","order":"2","is_presenter":true}]},{"id":"msa253","type":"child","title":"Integrating Machine Learning Algorithms and HPDA Frameworks to Run Predictive Analytics on Large-Scale Climate and Weather Datasets","begin_time":"14:00","end_time":"14:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"This work relates to the integration of a recurrent neural network algorithm (Long Short-Term Memory - LSTM) into Ophidia, a datacube-oriented High Performance Data Analytics framework. More specifically, Ophidia provides a (big) datacube abstraction to the end users, while it physically builds its core set of functionalities (namely \u0027\u003Cem\u003Eoperators\u003C\/em\u003E\u0027) on top of an array-database system. Operators in Ophidia run in parallel over a cluster to tackle big data challenges on massive scientific datasets. At the array-database level, Ophidia allows end-user developing her own analytics functions (namely \u0027\u003Cem\u003Eprimitives\u003C\/em\u003E\u0027), which by definition represent a sequential array-based data transformation. 
By implementing the LSTM algorithm as a \u003Cem\u003Eprimitive\u003C\/em\u003E running over a long time series, machine learning capabilities can be integrated into Ophidia taking advantage of a HPDA approach applied over large-scale datasets. A couple of case studies have been considered: the former relates to the output of a WRF model running over the Brazilian region of Curitiba, whereas the latter includes both simulated data, through an unstructured grid forecasting model running at CMCC by the Ocean Predictions and Applications Division, and observations over the Apulia region in the South-East of Italy. Preliminary insights about the proposed approach seems promising and will be presented.","filename":"msa253s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Alessandro","last_name":"D\u0027Anca","affiliation":"CMCC","country":"Italy","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Giovanni","last_name":"Aloisio","affiliation":"University of Salento","country":"Italy","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Sandro","last_name":"Fiore","affiliation":"CMCC Foundation","country":"Italy","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Alessandro","last_name":"D\u0027Anca","affiliation":"CMCC","country":"Italy","bio":"","order":"1","is_presenter":true}]},{"id":"msa112","type":"child","title":"Using Self-Organising Maps to Understand Relationships between Clouds and Cloud Controlling Factors","begin_time":"14:30","end_time":"15:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The long term global warming predicted for a doubling of carbon dioxide is known as the \u0027climate sensitivity\u0027. Many future regional impacts of climate change become more serious for larger climate sensitivities. 
Current estimates of climate sensitivity from different climate models around the world vary by more than a factor of two, from approximately 2 to 5K. The reason for such a large range of estimates is due to the uncertainty around how clouds will change as the climate warms. Changes in clouds are hard to predict because they depend non-linearly on many interacting environmental factors. In this work we are interested in establishing whether machine learning can provide new insights into a) the factors controlling cloud changes and b) how various climate models represent these relationships. We use the Self-Organising Map (SOM), an unsupervised learning technique well suited to analysing high-dimensional data, to explore relationships between cloud controlling factors and compare the results to standard linear correlation. We find that potentially interesting new relationships not shown by linear correlation are revealed by the SOM technique.","filename":"msa112s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samantha V.","last_name":"Adams","affiliation":"Met Office","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Mark","last_name":"Webb","affiliation":"Met Office","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samantha V.","last_name":"Adams","affiliation":"Met Office","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"msa112","type":"child","title":"Using Self-Organising Maps to Understand Relationships between Clouds and Cloud Controlling Factors","begin_time":"14:30","end_time":"15:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The long term global warming predicted for a doubling of carbon dioxide is known as the \u0027climate sensitivity\u0027. 
Many future regional impacts of climate change become more serious for larger climate sensitivities. Current estimates of climate sensitivity from different climate models around the world vary by more than a factor of two, from approximately 2 to 5K. The reason for such a large range of estimates is due to the uncertainty around how clouds will change as the climate warms. Changes in clouds are hard to predict because they depend non-linearly on many interacting environmental factors. In this work we are interested in establishing whether machine learning can provide new insights into a) the factors controlling cloud changes and b) how various climate models represent these relationships. We use the Self-Organising Map (SOM), an unsupervised learning technique well suited to analysing high-dimensional data, to explore relationships between cloud controlling factors and compare the results to standard linear correlation. We find that potentially interesting new relationships not shown by linear correlation are revealed by the SOM technique.","filename":"msa112s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samantha V.","last_name":"Adams","affiliation":"Met Office","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Mark","last_name":"Webb","affiliation":"Met Office","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samantha V.","last_name":"Adams","affiliation":"Met Office","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Samantha V.","last_name":"Adams","affiliation":"Met Office","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Mark","last_name":"Webb","affiliation":"Met Office","country":"United Kingdom","bio":"","order":"2","is_presenter":false}] } Presentation
Organizer(s):
Ramesh Balakrishnan (Argonne National Laboratory, United States of America)
Philipp Schlatter (KTH Royal Institute of Technology, Sweden)
Track(s):
Engineering, Computer Science and Applied Mathematics
Wall-resolved LES is prohibitively expensive, and a first-principles-based wall-modeled LES that is free of tunable parameters is far from becoming an alternative to RANS as a predictive design tool. On the spatial discretization front, most higher-order and spectral methods have had their successes mainly in low to moderate Reynolds number flows and on relatively simple geometries. The bulk of complex-flow, high-Re simulations still employ nominally second-order accurate schemes in unstructured-mesh finite-volume flow solvers, and finite-element-based solvers with modest polynomial order. Even in these fairly mature solvers, evidence suggests that merely running larger cases, with more grid points and on larger computational domains, does not guarantee a better solution (in the LES sense). There is, therefore, a need for a two-level effort in which benchmark higher-order simulations inform sub-grid models and so improve the predictive capability of existing mature flow solvers. We hope to motivate a discussion along these lines with examples of results from higher-order simulations, and of their role in assessing and improving the predictive capabilities of sub-grid models in more conventional flow solvers.
15:00 - 15:30
Coffee Break
Foyer 2nd Floor
15:30 - 17:30
Minisymposia Session II
Organizer(s):
Andreas Vitalis (University of Zurich, Switzerland)
Marco Bacci (University of Zurich, Switzerland)
Amedeo Caflisch (University of Zurich, Switzerland)
Track(s):
Life Sciences, Emerging Application Domains, Computer Science and Applied Mathematics
A common problem in numerical optimization and sampling is the detection of relevant states. These could be, for instance, the local minima on a rugged parameter surface or the transition state of a chemical reaction. For most cases, an exhaustive search for the optimal solution is intractable. Here, we focus on parallel sampling and optimization strategies relying on multiple replicas, most prominently, adaptive methods where all simulated replicas use the same propagator and sample the same underlying surface. In these methods, replica intercommunication is used to provide a global assessment as to which replicas are most interesting. This implies, in general, periodic data mining steps across replicas. Furthermore, in order to extract and utilize the gained information in post-processing, data must often be stored, which poses stringent data management and analysis challenges in particular for high-dimensional cases. The minisymposium wishes to discuss the following questions: What are meaningful and easily generalizable tools, strategies, and algorithms to guide the sampling/exploration? How can we maintain scalability and load balance? What types of post-processing algorithms can be applied to the generated data, and are those scalable to provide on-the-fly solutions to direct the exploration?
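As a rough illustration of the adaptive multi-replica idea described above, the following minimal Python sketch propagates several replicas with the same Metropolis propagator on the same surface, then periodically pools their visit counts to decide which replica is most interesting. Everything in it is an assumption made for illustration and is not taken from any talk in this session: the 1D energy function, the unit-width histogram bins, the "least-visited bin" ranking criterion, and all function and parameter names are hypothetical.

```python
import math
import random


def energy(x):
    """Hypothetical rugged 1D surface: a broad well plus short-wavelength ripples."""
    return 0.05 * x * x + math.sin(3.0 * x)


def metropolis_step(x, rng, step=0.5, temperature=0.3):
    """One Metropolis move; every replica uses this same propagator."""
    trial = x + rng.uniform(-step, step)
    delta = energy(trial) - energy(x)
    if delta <= 0.0 or rng.random() < math.exp(-delta / temperature):
        return trial
    return x


def adaptive_replicas(n_replicas=8, n_cycles=50, steps_per_cycle=20, seed=0):
    """Propagate replicas, then periodically reseed using pooled visit counts."""
    rng = random.Random(seed)
    replicas = [rng.uniform(-10.0, 10.0) for _ in range(n_replicas)]
    visits = {}  # shared histogram: bin index -> visits summed over all replicas

    for _cycle in range(n_cycles):
        # Independent propagation (the part that would run in parallel).
        for i in range(n_replicas):
            for _step in range(steps_per_cycle):
                replicas[i] = metropolis_step(replicas[i], rng)
                bin_index = math.floor(replicas[i])
                visits[bin_index] = visits.get(bin_index, 0) + 1

        # Global assessment across replicas (assumed criterion): a replica
        # sitting in a rarely visited bin is treated as more interesting.
        ranked = sorted(
            range(n_replicas),
            key=lambda i: visits[math.floor(replicas[i])],
        )
        # Reseed the least interesting replica from the most interesting one.
        replicas[ranked[-1]] = replicas[ranked[0]]

    return replicas, visits


if __name__ == "__main__":
    final_positions, histogram = adaptive_replicas()
    print("bins visited:", len(histogram))
    print("final positions:", [round(x, 2) for x in final_positions])
```

The ranking step is where replica intercommunication and the periodic data mining step would occur in a real parallel run. Because replicas finish their work at different times, that step is also where the load-balancing and scalability questions raised above would arise.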
Session: MS09 - Adaptive Parallel Strategies for the Exploration of Challenging Search Spaces with Applications in Particle Simulations and Optimization, Part II
Room: Samarkand Room
Chair: Andreas Vitalis (University of Zurich, Switzerland)
15:30 - 16:00
Task-Based Parallelization of Replica Exchange Transition Interface Sampling in OpenPathSampling
David W. H. Swenson (University of Amsterdam, Netherlands)
Path sampling methods, such as transition path sampling and transition interface sampling, are powerful tools for studying rare events. They perform Monte Carlo simulations in the space of trajectories, focusing the simulation effort on the transition itself to avoid spending long waiting times in the stable states. Since they are Monte Carlo approaches, they can use multiple walkers, but some approaches also use replicas from different path ensembles. In particular, replica exchange transition interface sampling (RETIS) involves simultaneously sampling trajectories from several path ensembles. However, even within a single ensemble, the lengths of the sampled trajectories can vary and are unpredictable. This makes load balancing an extremely challenging problem. This presentation describes the parallelization of RETIS in the software package OpenPathSampling using dask.distributed, a Python package for task-based programming. The task-based approach enables parallelization that provides optimal use of computational resources, not only by load balancing, but also by allowing the allocated resources to be scaled up or down according to the needs of the simulation. While the approach is described here in the context of path sampling, the same technique could be applied to many trajectory-based simulation methods.
16:00 - 16:30
Replica-Exchange Enveloping Distribution Sampling (RE-EDS) to Calculate Multiple Free-Energy Differences in a Single Simulation
Sereina Z. Riniker (ETH Zurich, Switzerland)
Enveloping distribution sampling (EDS) allows the calculation of free-energy differences between multiple end states from a single simulation. A reference-state Hamiltonian is simulated which envelops the Hamiltonians of the end states. The challenge when using EDS is the determination of optimal parameters for the reference-state Hamiltonian. Previously, the choice of parameters for an EDS simulation with multiple end states was a non-trivial problem that limited the application of the methodology. To overcome these limitations, we have generalized the replica-exchange EDS (RE-EDS) methodology to arbitrary systems. By exchanging configurations between replicas with different parameters for the reference-state Hamiltonian, much of the problem of choosing optimal parameters is circumvented. Algorithms to estimate the energy offsets and optimize the replica distribution have been developed. Our approach was tested successfully using a system consisting of nine inhibitors of phenylethanolamine N-methyltransferase (PNMT), which were studied previously with thermodynamic integration and EDS.
16:30 - 17:00
On the Interpretation of Non-Equilibrium MD Trajectories
Tanja Schilling (University of Freiburg, Germany)
Co-author(s): Thomas Voigtmann (German Aerospace Center, Germany), Hugues Meyer (University of Luxembourg, Luxembourg)
As a researcher in statistical physics, one may often be interested in reducing the complexity of a many-particle system to the study of a set of relevant observables. If the system is in equilibrium, a systematic way to derive an equation of motion for the "relevant" observables from the microscopic dynamics has been known for some time as the "Mori-Zwanzig" formalism, which leads to the Langevin equation. In contrast, if the dynamics is not stationary, it is not a priori clear which form the equation of motion for an averaged observable will have. We adapt the Mori-Zwanzig formalism to derive the equation of motion for a non-equilibrium trajectory-averaged observable as well as for its non-stationary auto-correlation function. We also derive a fluctuation-dissipation-like relation which relates the memory kernel and the autocorrelation function of the fluctuating force. In addition, we show how to relate the Taylor expansion of the memory kernel to experimental data, which allows the equation of motion to be constructed from direct measurements.
17:00 - 17:30
Dynamic Histogram Analysis to Determine Free Energies and Rates from Biased Simulations
Lukas S. Stelzl (Max Planck Institute of Biophysics, Germany)
Co-author(s): Adam Kells (King's College London, United Kingdom), Edina Rosta (King's College London, United Kingdom), Gerhard Hummer (Max Planck Institute of Biophysics, Germany)
Transitions between metastable states govern many fundamental processes in biology, such as biomolecular folding. The underlying free energy surfaces can be obtained from simulations using enhanced sampling methods. We present an algorithm to calculate free energies and rates from enhanced sampling simulations on biased potential energy surfaces. Inputs are the accumulated times spent in each state or bin of a histogram, and the transition counts between them. For each of the states/bins optimal unbiased free energies are obtained by maximizing the likelihood of a master equation (i.e., first-order kinetic rate model). Unbiased rate coefficients for transitions between states can then be estimated. The resulting "dynamic histogram analysis method extended to detailed balance" (DHAMed) improves on the DHAM method. DHAMed yields accurate free energies in cases where the common weighted-histogram analysis method (WHAM) for umbrella sampling fails because dynamics within the windows is slow. We illustrate DHAMed with applications to proteins and RNAs and accurately estimate free energies from sets of short trajectories, providing a way forward for computational drug design. Our rate formalism can be used to construct Markov state models from biased simulations and we demonstrate its practical applicability by determining RNA folding kinetics from replica exchange molecular dynamics.
Organizer(s):
Xavier Lapillonne (MeteoSwiss, Switzerland)
Valentin Clement (Center for Climate System Modeling, Switzerland)
Track(s):
Climate and Weather
Numerical weather prediction and climate models are large and complex software applications that need to run efficiently on today's and future massively parallel computer systems. The rapid change in these computing architectures and their increasing diversity seriously affect the ability to retain a single source code that runs efficiently on different architectures. Several weather models have successfully adapted their codebases to many-core and heterogeneous architectures such as GPUs and Xeon Phi using a combination of traditional programming models for parallel architectures such as OpenMP, OpenACC and MPI. However, porting existing large community codes to multiple architectures is a daunting task and leads to codes that are more complex and difficult to maintain. As a result, numerous new technologies and approaches have emerged in recent years to provide new programming models, such as domain-specific languages (DSLs) or source-to-source translation tools, that can increase development productivity for weather codes while providing a high degree of performance portability. In this minisymposium we propose a discussion of some of the most prominent novel approaches, presenting new advances in programming models for heterogeneous architectures in weather and climate models.
16:00 - 16:30
Performance Portability for Next Generation HPC Architectures in E3SM via the Kokkos Programming Model
Luca Bertagna (Sandia National Laboratories, United States of America)
+ Abstract { "session": {"id":"sess158","title":"MS10 - Bridging the Software Productivity Gap for Climate and Weather Models","date":"Monday, July 2nd 2018","begin_time":"15:30","end_time":"17:30","room":"Rio Room","contributors":[{"type":"Session Chair","first_name":"Xavier","last_name":"Lapillonne","affiliation":"MeteoSwiss","country":"Switzerland"}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Climate and Weather"],"slots":[{"id":"symp119","type":"minisymposia","title":"MS10 - Bridging the Software Productivity Gap for Climate and Weather Models","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"Numerical weather prediction and climate models are large and complex software applications that need to run efficiently on today\u0027s and future massively parallel computer systems. The rapid change in these computing architectures and the increase in diversity are seriously affecting the ability to retain a single source code that runs efficiently in different architectures. Several weather models have successfully adapted their codebases to many-core and heterogeneous architectures like GPUs and Xeon Phi using a combination of multiple traditional programming models for parallel architectures like OpenMP, OpenACC and MPI. However porting existing large community codes to multiple architectures is a daunting task and leads to codes that are more complex and difficult to maintain. As a result in the past years numerous new technologies and approaches are emerging in order to provide new programming models, like domain-specific languages (DSLs) or source-to-source translation tools that can increase the productivity of development in weather codes while providing a high degree of performance portability. 
In this minisymposium we propose a discussion with some of the most prominent novel approaches where the new advances in programming models used for heterogeneous architectures in weather and climate models will be presented.","bio":"","contributors":[{"type":"Organizer","first_name":"Xavier","last_name":"Lapillonne","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Organizer","first_name":"Valentin","last_name":"Clement","affiliation":"Center for Climate System Modeling","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Xavier","last_name":"Lapillonne","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa230","type":"child","title":"Experience on Porting Atmosphere Kernels on Many-Core Processors and Accelerators","begin_time":"15:30","end_time":"16:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"This talk includes a summary of our previous work that ports different atmosphere kernels onto various state-of-the-art platforms, including the Sunway TaihuLight system. Performance portability for atmosphere codes is no doubt a big challenge, so great efforts have to be made and patience is required as well. In addition to some experiences and lessons, we also take this opportunity to discuss on the novel Sunway processors. 
For Sunway system, different software is being developed to make it easy for applications to be ported.","bio":"","contributors":[{"type":"Author","first_name":"Lin","last_name":"Gan","affiliation":"Tsinghua University","country":"China","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Lin","last_name":"Gan","affiliation":"Tsinghua University","country":"China","bio":"","order":"1","is_presenter":true}]},{"id":"msa181","type":"child","title":"Performance Portability for Next Generation HPC Architectures in E3SM via the Kokkos Programming Model","begin_time":"16:00","end_time":"16:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"This work converts the atmospheric dynamical core (HOMME) of the Energy Exascale Earth System Model (E3SM) from the current CPU-centric implementation, in Fortran 90, to a new performance-portable implementation, in C++ with the Kokkos performance-portability framework. HOMME simulates the dynamics and physical processes of the atmosphere. It is the most computationally demanding part of E3SM. Kokkos provides performance-portable multidimensional arrays and intraprocess parallel execution constructs. These form an abstraction layer over the hardware architecture of a compute node within a supercomputer. 
We will present results for the performance of our implementation on conventional CPU, Intel Xeon Phi, and Nvidia GPU; compare performance with the original Fortran on CPU and Xeon Phi; and discuss details of the implementation.","filename":"msa181s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Luca","last_name":"Bertagna","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andrew","last_name":"Salinger","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Irina","last_name":"Tezaur","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Andrew","last_name":"Bradley","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Michael","last_name":"Deakin","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Sunderland","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Oksana","last_name":"Guba","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Luca","last_name":"Bertagna","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"msa138","type":"child","title":"Experience Applying the PSyclone Configurable Domain Specific Compiler to the Met Office LFRic 
Model","begin_time":"16:30","end_time":"17:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Earth-system models tend to be large, complex codes developed by large teams of scientists over periods of years. However, the scale of the problems to be simulated calls for the highest levels of computational performance. Achieving good performance when both computer architectures and the underlying code base are constantly evolving is a complex challenge. In recent years, the use of Domain-Specific Languages (DSLs) as a potential solution to this problem has begun to be investigated. The UK Met Office\u0027s LFRic project is developing a new, Finite Element dynamical core and has adopted a DSL approach. In this talk we will describe this work and the functionality of the domain-specific compiler, PSyclone, which has been developed to process the (serial) code written by the natural scientists and generate the code required to run on massively parallel machines.","bio":"","contributors":[{"type":"Author","first_name":"Rupert","last_name":"Ford","affiliation":"Science and Technology Facilities Council","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andrew R.","last_name":"Porter","affiliation":"Science and Technology Facilities Council","country":"United Kingdom","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Sergi","last_name":"Siso","affiliation":"Science and Technology Facilities Council","country":"United Kingdom","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Rupert","last_name":"Ford","affiliation":"Science and Technology Facilities Council","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"msa254","type":"child","title":"Novel Programming Models for Large Geophysical Fluid Dynamics 
Models","begin_time":"17:00","end_time":"17:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Running operationally high resolution (~1km) global weather and climate models will be a milestone for the scientific community since there is clear evidence of the importance of high horizontal resolutions in the quality and accuracy of the simulations. Yet achieving this will pose serious computational challenges for large scientific codes that are developed using traditional programming models such as OpenMP and MPI. In order to adapt models to run efficiently on modern computing architectures and accelerators, numerous domain specific languages (DSL) and libraries that abstract architecture dependent optimizations have been proposed, like the GridTools libraries used operationally for running COSMO on GPUs. Yet these tools are specific to a domain or model, and have little reuse among them of architecture specific optimizers which leads to high maintenance costs. We present a novel programming model based on the GridTools ecosystem of libraries, a toolchain that allows to develop and interoperate various DSL frontends by providing domain and architecture specific optimizers. It aims at standardizing tools for performance portability by proposing a standard intermediate representation for weather and climate codes. 
We demonstrate the toolchain for the COSMO regional model and evaluate performance results compared to the operational model running on NVIDIA GPUs.","filename":"msa254s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Carlos E.","last_name":"Osuna","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Tobias","last_name":"Wicky","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Stefan","last_name":"Moosbrugger","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Oliver","last_name":"Fuhrer","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Hoefler","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Carlos E.","last_name":"Osuna","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"msa181","type":"child","title":"Performance Portability for Next Generation HPC Architectures in E3SM via the Kokkos Programming Model","begin_time":"16:00","end_time":"16:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"This work converts the atmospheric dynamical core (HOMME) of the Energy Exascale Earth System Model (E3SM) from the current CPU-centric implementation, in Fortran 90, to a new performance-portable implementation, in C++ with the Kokkos performance-portability framework. HOMME simulates the dynamics and physical processes of the atmosphere. It is the most computationally demanding part of E3SM. Kokkos provides performance-portable multidimensional arrays and intraprocess parallel execution constructs. 
These form an abstraction layer over the hardware architecture of a compute node within a supercomputer. We will present results for the performance of our implementation on conventional CPU, Intel Xeon Phi, and Nvidia GPU; compare performance with the original Fortran on CPU and Xeon Phi; and discuss details of the implementation.","filename":"msa181s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Luca","last_name":"Bertagna","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andrew","last_name":"Salinger","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Irina","last_name":"Tezaur","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Andrew","last_name":"Bradley","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Michael","last_name":"Deakin","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Sunderland","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Oksana","last_name":"Guba","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Luca","last_name":"Bertagna","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"1","is_presenter":true}]}, "slotContributors": 
[{"type":"Author","first_name":"Luca","last_name":"Bertagna","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andrew","last_name":"Salinger","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Irina","last_name":"Tezaur","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Andrew","last_name":"Bradley","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Michael","last_name":"Deakin","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Sunderland","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Oksana","last_name":"Guba","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"7","is_presenter":false}] } Presentation
17:00 - 17:30
Novel Programming Models for Large Geophysical Fluid Dynamics Models
Carlos E. Osuna (MeteoSwiss, Switzerland)
Co-author(s): Tobias Wicky (ETH Zurich, Switzerland), Stefan Moosbrugger (MeteoSwiss, Switzerland), Oliver Fuhrer (MeteoSwiss, Switzerland), Torsten Hoefler (ETH Zurich, Switzerland)
Running high-resolution (~1 km) global weather and climate models operationally will be a milestone for the scientific community, since there is clear evidence that high horizontal resolution improves the quality and accuracy of the simulations. Yet achieving this will pose serious computational challenges for large scientific codes that are developed using traditional programming models such as OpenMP and MPI. To adapt models to run efficiently on modern computing architectures and accelerators, numerous domain-specific languages (DSLs) and libraries that abstract architecture-dependent optimizations have been proposed, such as the GridTools libraries used operationally for running COSMO on GPUs. Yet these tools are specific to a domain or model and share few architecture-specific optimizers among them, which leads to high maintenance costs. We present a novel programming model based on the GridTools ecosystem of libraries: a toolchain that makes it possible to develop various DSL frontends and have them interoperate by providing domain- and architecture-specific optimizers. It aims at standardizing tools for performance portability by proposing a standard intermediate representation for weather and climate codes. We demonstrate the toolchain for the COSMO regional model and evaluate performance results compared to the operational model running on NVIDIA GPUs.
15:30 - 17:30
MS11 - Computing the Effect of Risk
Montreal Room
Organizer(s):
Michel Juillard (Banque de France, France)
Track(s):
Emerging Application Domains
Many important economic phenomena relate to the notion of risk. Economic actors make decisions based not only on their current situation but also on their expectations of future developments. Because economic systems are not deterministic, future economic events are usually treated as stochastic phenomena. The general form of the problem at hand is to determine how the probability distribution of future economic events influences current decisions. The wider the distribution, the more risk in today's decisions. The papers in this session present different computational challenges involved in attempting to describe the effect of risk on economic decisions.
15:30 - 16:00
Approximating Equilibria with Ex-Post Heterogeneity and Aggregate Risk
Elisabeth Proehl (University of Geneva, Switzerland)
Dynamic stochastic general equilibrium models with ex-post heterogeneity due to idiosyncratic risk have to be solved numerically. This is a nontrivial task as the cross-sectional distribution of endogenous variables becomes an element of the state space due to aggregate risk. Existing global solution methods have assumed bounded rationality in terms of a parametric law of motion of aggregate variables in order to reduce dimensionality. In this paper, I remove that assumption and compute a fully rational equilibrium dependent on the whole cross-sectional distribution. Dimensionality is tackled by polynomial chaos expansions, a projection technique for square-integrable random variables, resulting in a nonparametric law of motion. I establish conditions under which the method converges and derive approximation error bounds. To illustrate the method, I compute the Aiyagari-Bewley growth model and the Huggett model with aggregate risk. In the former, I find that the bounded rationality assumption leads to significantly more inequality than in a fully rational equilibrium. Furthermore, more risk sharing in the form of redistribution can lead to higher systemic risk. In the latter model, I find that prices increase with more stringent selling constraints, but are also more negatively skewed.
16:00 - 16:30
The Extended Perturbation Method
Martin M. Andreasen (Aarhus University, Denmark)
Abstract: This presentation introduces the extended perturbation method, which improves upon standard perturbation by removing approximation errors under certainty equivalence. Using the neoclassical growth model and a New Keynesian model, we show that extended perturbation achieves higher accuracy than standard perturbation when using third order approximations. We also show that extended perturbation generates stable approximations even when standard perturbation explodes. This paper also adds to the literature on downward nominal wage rigidities in the New Keynesian model, by showing that this friction only plays a significant role when using standard perturbation but not when using the more accurate extended perturbation approximation.
Co-author: Anders Kronborg (Danish National Bank, Denmark)
16:30 - 17:00
Back in Time. Fast. Improved Time Iterations.
Pablo Winant (Bank of England, United Kingdom)
Abstract: We consider a new solution algorithm to solve nonlinear economic models using projections. For Bellman problems, our method is a variant of Howard's improvement steps. Contrary to the original improvements, it generalizes to models specified by equilibrium conditions, in which case it is equivalent to the Newton-Raphson algorithm applied to one big nonlinear system of equations, without requiring the explicit inversion of the (memory-hungry) Jacobian. In particular, convergence is quadratic, i.e. much faster than regular time-iterations. Convergence of each gradient improvement step requires the (local) contractivity of the time-iterations operator. We show how this property relates to eigenvalues coming from local perturbation analysis, and how to estimate the local spectral radius of this operator close to a solution candidate. Gradient improvements can be implemented easily, essentially by composing the same elements as time-iterations. Our timing comparisons still suggest it performs much faster, especially when the number of dimensions or the number of grid points increases.
17:00 - 17:30
Taking Risk into Account with Higher-Order Approximations
Michel Juillard (Banque de France, France)
Abstract: In a nonlinear model, expectation of future shocks entails expected benefits or expected losses. Rational agents can make decisions today so as to maximize expected benefits or minimize expected loss. These behaviors are related to economic concepts such as precautionary saving, asset prices, risk premium and term premium. By contrast, linear models are characterized by certainty equivalence, and in such environments, agents are indifferent to future uncertainty. One of the major benefits of using higher-order approximation of a certain class of economic models is the ability to analyse attitude towards risk. Computing higher-order approximation of DSGE models involves several computational challenges. Derivatives of the original model must be evaluated. These high dimensional objects must be stored in a convenient manner. Above second order, computations involve tensor algebra. A key component is a fast implementation of the Faà di Bruno formula for the derivatives of the composition of two functions. Until now, all these steps have only been programmed in C++ in dynare++. They represent challenging tasks for a rather new programming language such as Julia and an interesting test case.
Organizer(s):
Guido Juckeland (Helmholtz-Zentrum Dresden-Rossendorf, Germany)
Track(s):
Emerging Application Domains, Computer Science and Applied Mathematics
The dawn of Web 2.0 applications, smartphones and tablets, and omnipresent yet invisible cloud computing has dramatically changed our perception of the IT landscape over the last decade. At the same time, these technologies have delivered an abundance of new tools that are better suited to the needs of domain scientists: GitHub/GitLab with their inherent support for agile programming, user-space package managers for easy software installation even in the face of complex dependencies, and web portals to HPC systems or private clouds, so that no prior knowledge is needed to use state-of-the-art compute resources. This minisymposium will showcase these tools and how they are used in real scientific workflows. The best practices shown are easy to reproduce because they are all based on freely available software packages, so the audience can use the presentations both as inspiration and as a kickstart for doing better science of their own.
15:30 - 16:00
HPC-as-a-Service to Domain Scientists
Sunita Chandrasekaran (University of Delaware, United States of America)
Abstract: Applications-as-a-service operating modes have changed the computing landscape in a multi-disciplinary research laboratory, both from a user's and an HPC operator's perspective. Lately, applications are being offered as web-based user interfaces regardless of the actual location of the computation. The cloud computing revolution has had the welcome side effect that everybody can now easily accept that certain tasks are transparently performed elsewhere - this talk will give an example from the bioinformatics application domain showing how cloud resources can be used for DNA sequencing. More and more HPC centers offer web portals to access their systems, and application developers also offer web-based front-ends, so that the "obscure green font on black screen magic" of a typical SSH session is hidden from the end user. This enables new groups to use HPC systems and also gives users a more error-proof and efficient way of using installed applications. This talk will highlight the criticality of an application-as-a-service mode and will discuss how Docker containers and Jupyter notebooks can be used for this easy-to-use application-as-a-service notion. The parallel benchmark suite from the SPEC HPG organization will be used for demo purposes.
16:00 - 16:30
The Reality of Scientific Software Development is Agile - Best Practices and Lessons Learned
Guido Juckeland (Helmholtz-Zentrum Dresden-Rossendorf, Germany)
Co-author: Tobias Frust (Helmholtz-Zentrum Dresden-Rossendorf, Germany)
Abstract: The reality of the development of scientific software is often far from the clearly structured, cascading work packages that grant applications require, but rather in the spirit of agile programming: start from a working minimal prototype, always have running code, work in sprints (typically towards the end of reporting periods). Those characteristics actually match rather well onto the concept of agile programming. This talk will explain the principles of agile software development, the existing software tool support using the free-of-charge GitLab Libre Edition as an example, and the implementation of the processes, including team communication, continuous integration and publication of documentation.
16:30 - 17:00
Using Jetstream and High Performance Remote Research Desktops to Lower the Barrier of Entry for HPC Resources
Robert Henschel (Indiana University, United States of America)
Co-author: Dave Hancock (Indiana University, United States of America)
Abstract: Indiana University operates two environments designed to lower the barrier of entry for HPC resources. One is the NSF Jetstream project, the first NSF-funded cloud designed for those who have not previously used high performance computing resources. Jetstream provides users with long-running virtual machines with a customizable software stack to meet the needs of non-traditional HPC applications. The other environment is a research desktop solution that makes high performance Linux desktops available remotely. The desktops contain all the normal HPC command line tools and allow for direct job submission to the HPC machines, but also provide access to interactive applications like Matlab, Comsol Multiphysics, R-Studio and Jupyter. The goal of both projects is to lower the barrier of entry and broaden adoption of traditional HPC and high-throughput computing environments. The talk will provide an architectural overview, use cases and experiences from operating such environments.
17:00 - 17:30
Spack: A Package Manager for Scientific Software
Todd Gamblin (Lawrence Livermore National Laboratory, United States of America)
Massimiliano Culpo (EPFL, Switzerland)
Abstract: On mainstream Linux distributions, package managers simplify the software installation process by providing pre-built, generic binaries. Users can leverage a wide variety of libraries and applications without knowing how to build from source. On HPC systems users typically build from source, and the software is notoriously complex. Building even a moderately sized parallel simulation code can be a major effort. Scientists who use application codes must typically also know how to build them from scratch, along with tens or hundreds of dependency libraries. Spack is an open source package manager that handles the complexity of HPC environments and allows scientists to automatically and reproducibly install complex software stacks. It allows users to experiment with different compilers, optimizations, build options, and dependency versions, without in-depth build knowledge. Spack is built to handle the complexities of the HPC environment that seldom arise on commodity systems, such as swapping compilers and ABI-incompatible dependencies, cross-compilation, compiler runtime libraries, and optimized binaries. Spack has a rapidly growing community, with over 240 contributors at organizations worldwide. In this talk, we will introduce Spack, show how it can make scientists more productive, and give an overview of ongoing Spack projects and its development road map.
Organizer(s):
Sofia Vallecorsa (CERN, Switzerland)
Jean-Roch Vlimant (California Institute of Technology, United States of America)
Michela Paganini (Yale University, United States of America)
Track(s):
Computer Science and Applied Mathematics, Physics
The Large Hadron Collider at CERN collides high-density bunches of protons travelling near the speed of light at a frequency of 40 MHz. Most of the thousands of particles emitted at each bunch crossing are measured and collected by building-sized detectors consisting of multiple sub-detectors, each serving its own purpose. The signal created by a particle interacting with such a detector is typically simulated in great detail, with the particle stepped in infinitesimal increments through meters of material. The simulation is as compute-intensive as the geometry is complex. For current and future detector designs, this fine-grained simulation consumes a large part of the experiments' full computing budget and poses a computing challenge. While a great deal of effort is being made to parallelise such software, one possible avenue for reducing the computational requirements is to use generative models from the field of deep learning. Such generative models have seen success in conditionally generating images and video of various types. We present how such models are built and trained, how well they can capture the physics of particle interaction, and how they can help generate realistic samples for high energy physics analysis.
15:30 - 16:00
The Success of Deep Generative Models
Jakub Tomczak (University of Amsterdam, Netherlands)
Abstract: Deep generative models allow us to learn hidden representations of data and generate new examples. Two major families of models are exploited in current applications: Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs). The principle of GANs is to train a generator that produces examples from random noise, as the adversary of a discriminative model that tries to tell true samples from generated ones. Images generated by GANs are very sharp and detailed. The biggest disadvantage of GANs is that they are trained by solving a minimax optimization problem, which causes significant learning instability. VAEs are based on a fully probabilistic perspective, that of variational inference. The learning problem aims at maximizing the variational lower bound for a given family of variational posteriors. The model can be trained by backpropagation, but the resulting generated images have been observed to be rather blurry. However, VAEs are probabilistic models and can therefore be incorporated in almost any probabilistic framework. We will discuss the basics of both approaches and present recent extensions. We will point out advantages and disadvantages of GANs and VAEs. Some of the most promising applications of deep generative models will be shown.
17:00 - 17:30
Generative Models for Simulating Highly Granular Calorimeters
, Tobias Golling (University of Geneva, Switzerland)
Co-author(s): Sofia Vallecorsa (CERN, Switzerland), Dalila Salamani (University of Geneva, Switzerland), Federico Carminati (CERN, Switzerland), Gul Rukh Khattak (CERN, Switzerland)
Abstract: Machine learning techniques have been used in different applications by the HEP community; in this talk, we discuss the case of detector simulation. The need for simulated events expected in future High Luminosity LHC experiments is increasing dramatically and requires new fast simulation solutions. We will describe R&D activities aimed at reproducing the detector response and replacing standard Monte Carlo simulation with generative models of the kind typically used in computer vision applications. Two common aspects characterize many of these applications: the representation of input data as regular arrays of numerical values, and the use of raw data as the input information to feed the network. Next-generation HEP experiments are expected to be increasingly characterized by detector components that comply with this paradigm. Calorimeters of the ILC and CLIC detector concepts are effectively 3D arrays of sensors. We will introduce the first application of three-dimensional convolutional Generative Adversarial Networks and of Variational Autoencoders to the simulation of highly granular calorimeters. Finally, we will present detailed validation studies comparing results to Monte Carlo simulation, showing the very good agreement we obtain for high-level physics quantities and calorimeter response.
Organizer(s):
Jean-Michel Benkert (Baloise Group, Switzerland)
, Michelle Allgöwer (Baloise Group, Switzerland)
Track(s):
Emerging Application Domains, Computer Science and Applied Mathematics
In the past couple of years, a large number of fintech and, more recently, insurtech startups have been founded and are challenging established players in the financial services sector. At Baloise – a Swiss company providing insurance services in Switzerland, Belgium, Germany and Luxembourg as well as banking services in Switzerland – we view startups as potential partners on our digital transformation journey rather than as competition. This minisymposium aims to demonstrate the problems that companies such as Baloise face in digitizing their business and making use of their large amounts of data. To do so, the four sessions will cover the innovation framework Baloise employs to rapidly test prototypes; a presentation by Brainalyzed, a startup which aims to optimize and automate investment decisions at Baloise using AI; a presentation about the challenges arising in the context of data warehouses and legacy systems; and a panel discussion with all the speakers.
15:30 - 16:00
Open Innovation at Baloise
, Jean-Michel Benkert (Baloise Group, Switzerland)
+ Abstract { "session": {"id":"sess179","title":"MS14 - How Fintech and Big Data Change and Challenge the Insurance Sector","date":"Monday, July 2nd 2018","begin_time":"15:30","end_time":"17:30","room":"Nairobi Room","contributors":[{"type":"Session Chair","first_name":"Jean-Michel","last_name":"Benkert","affiliation":"Baloise Group","country":""}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Emerging Application Domains","Computer Science and Applied Mathematics"],"slots":[{"id":"symp120","type":"minisymposia","title":"MS14 - How Fintech and Big Data Change and Challenge the Insurance Sector","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"In the past couple years a large number of fintech and more recently insurtech startups have been founded and are challenging established players in the financial services sector. At Baloise \u2013 a Swiss company providing insurance services in Switzerland, Belgium, Germany and Luxembourg as well as banking services in Switzerland \u2013 we view startups as potential partners on our digital transformation journey rather than competition. This minisymposium aims to demonstrate what problems companies such as Baloise face in terms of digitizing their business and making use of their large amounts of data. 
To do so, the four sessions will cover the innovation framework Baloise employs in order to rapidly test prototypes, a presentation by Brainalyzed, a startup which aims to optimize and automatize investment decisions at Baloise using AI, a presentation about the challenges arising in the context of data warehouses and legacy systems, and a panel discussion with all the speakers.","bio":"","contributors":[{"type":"Organizer","first_name":"Jean-Michel","last_name":"Benkert","affiliation":"Baloise Group","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Organizer","first_name":"Michelle","last_name":"Allg\u00f6wer","affiliation":"Baloise Group","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Jean-Michel","last_name":"Benkert","affiliation":"Baloise Group","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa194","type":"child","title":"Open Innovation at Baloise","begin_time":"15:30","end_time":"16:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years a large number of fintech and more recently insurtech startups have been founded and are challenging established players in the financial services sector.\u00a0 At Baloise \u2013 a Swiss company providing insurance services in Switzerland, Belgium, Germany and Luxembourg as well as banking services in Switzerland \u2013 we view startups as potential partners on our digital transformation journey rather than competition. Baloise has developed an open innovation framework with the goal of enabling easy and fast cooperation with startups and other external partners as well as intrapreneurs. 
In this session we will present this open innovation framework and its evolution over time as we have tailored it to the requirements of startups.","filename":"msa194s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jean-Michel","last_name":"Benkert","affiliation":"Baloise Group","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jean-Michel","last_name":"Benkert","affiliation":"Baloise Group","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa116","type":"child","title":"Artificial Intelligence for Automated Investment Management","begin_time":"16:00","end_time":"16:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Due to increasing digitization in all sectors, the amount of available data is almost unlimited. The challenge is not only to manage this data, but to make it usable. Therefore, data analysis becomes a key success factor for organizations. Especially in the financial sector, data-driven applications are necessary to keep up with the fast-moving financial market and growing competition. The answer to low interest rates and high volatility in the market are automated data-driven investment processes. Data analysis using artificial intelligence (AI) is therefore becoming increasingly important. In this session we will give insights and some practical examples how we worked together with Baloise Asset Management to use some of their data to enhance the investment management process. 
We will show how the scalability of the learning solution helps to analyze even very complex problems in a short time and what our vision of AI in the financial world looks like.","filename":"msa116s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Gunter","last_name":"Fischer","affiliation":"Brainalyzed","country":"Germany","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Gunter","last_name":"Fischer","affiliation":"Brainalyzed","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"msa169","type":"child","title":"The Challenges of Big Data for a Traditional Insurance Company","begin_time":"16:30","end_time":"17:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With a company history of over 150 years, our IT landscape has grown in a highly fragmented way and consists of numerous legacy systems which have evolved over the last couple of decades covering a wide range of computer languages. Therefore a greenfield approach in terms of big data is out of question and the integration of data originating from these systems represents a costly and time-consuming challenge for Baloise. Securing the availability of internal data on one side and meeting the fast growing business requirements in connection with external (big) data integration on the other side is the balancing act of our digital transformation in the domain of business intelligence. How Baloise tackles these challenges and how the company benefits from cooperation with startups using artificial intelligence to boost this transformation will be explained in this session of the minisymposium. 
In the second part of this session an insight into projected use cases will be delivered to illuminate Baloise\u0027s strategic approaches related to machine learning and big data.","filename":"msa169s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Geering","affiliation":"Baloise Group","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Klaus","last_name":"Rieger","affiliation":"Baloise Group","country":"Switzerland","bio":"","order":"2","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Klaus","last_name":"Rieger","affiliation":"Baloise Group","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"msa195","type":"child","title":"Panel Discussion on How Fintech and Big Data Change and Challenge the Insurance Sector","begin_time":"17:00","end_time":"17:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Join us for a panel discussion on how fintech and big data change and challenge the insurance sector. The panelists are Dr. Gunter Fischer from Brainalyzed, an AI startup in the fintech space, Christoph Geering, responsible for business intelligence at Baloise Switzerland, and Dr. 
Jean-Michel Benkert, Innovation Manager at Baloise Group.","bio":"","contributors":[{"type":"Author","first_name":"Jean-Michel","last_name":"Benkert","affiliation":"Baloise Group","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jean-Michel","last_name":"Benkert","affiliation":"Baloise Group","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"msa194","type":"child","title":"Open Innovation at Baloise","begin_time":"15:30","end_time":"16:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years a large number of fintech and more recently insurtech startups have been founded and are challenging established players in the financial services sector.\u00a0 At Baloise \u2013 a Swiss company providing insurance services in Switzerland, Belgium, Germany and Luxembourg as well as banking services in Switzerland \u2013 we view startups as potential partners on our digital transformation journey rather than competition. Baloise has developed an open innovation framework with the goal of enabling easy and fast cooperation with startups and other external partners as well as intrapreneurs. 
In this session we will present this open innovation framework and its evolution over time as we have tailored it to the requirements of startups.","filename":"msa194s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jean-Michel","last_name":"Benkert","affiliation":"Baloise Group","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jean-Michel","last_name":"Benkert","affiliation":"Baloise Group","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Jean-Michel","last_name":"Benkert","affiliation":"Baloise Group","country":"Switzerland","bio":"","order":"1","is_presenter":true}] } Presentation
16:00 - 16:30
Artificial Intelligence for Automated Investment Management
Gunter Fischer (Brainalyzed, Germany)
Abstract: Due to increasing digitization in all sectors, the amount of available data is almost unlimited. The challenge is not only to manage this data, but to make it usable. Data analysis therefore becomes a key success factor for organizations. Especially in the financial sector, data-driven applications are necessary to keep up with the fast-moving financial market and growing competition. The answer to low interest rates and high market volatility is automated, data-driven investment processes, so data analysis using artificial intelligence (AI) is becoming increasingly important. In this session we will give insights and some practical examples of how we worked together with Baloise Asset Management to use some of their data to enhance the investment management process. We will show how the scalability of the learning solution helps to analyze even very complex problems in a short time and what our vision of AI in the financial world looks like.
16:30 - 17:00
The Challenges of Big Data for a Traditional Insurance Company
Klaus Rieger (Baloise Group, Switzerland)
Abstract: With a company history of over 150 years, our IT landscape has grown in a highly fragmented way and consists of numerous legacy systems, which have evolved over the last couple of decades and cover a wide range of computer languages. A greenfield approach to big data is therefore out of the question, and the integration of data originating from these systems represents a costly and time-consuming challenge for Baloise. Securing the availability of internal data on the one hand and meeting the fast-growing business requirements for external (big) data integration on the other is the balancing act of our digital transformation in the domain of business intelligence. This session explains how Baloise tackles these challenges and how the company benefits from cooperation with startups that use artificial intelligence to boost this transformation. The second part of the session gives an insight into projected use cases to illuminate Baloise's strategic approaches to machine learning and big data.
15:30 - 17:30
MS15 - Machine Learning and Quantum Chemistry
Boston 3 Room
Organizer(s):
Roland Lindh (Uppsala University, Sweden)
Track(s):
Emerging Application Domains, Computer Science and Applied Mathematics, Chemistry and Materials, Physics
Machine learning is currently a booming field of computer science, with applications in computer-human interfaces, the analysis of medical data from huge populations, the maintenance of cars, planes and elevators, and self-driving cars, to mention a few. Over the last twenty years the field has undergone spectacular development and refinement. The use of the technology in pure science has been lagging behind; however, we are now starting to see machine learning used in quantum chemistry. Here, the approach can enhance the performance of standard quantum chemical calculations by improving convergence, serve as a tool for post-analysis of huge sets of ab initio results, or simply replace computationally expensive procedures. Machine learning offers practical alternatives where standard quantum chemical simulations would be prohibitive. During the last few years a small number of quantum chemistry groups have explored the potential of machine learning, with extraordinary results. In this minisymposium we would like to inspire by presenting four different applications in which machine learning is fundamental to success.
15:30 - 16:00
Quantum Machine Learning in Chemical Compound Space
Anders S. Christensen (University of Basel, Switzerland)
+ Abstract { "session": {"id":"sess186","title":"MS15 - Machine Learning and Quantum Chemistry","date":"Monday, July 2nd 2018","begin_time":"15:30","end_time":"17:30","room":"Boston 3 Room","contributors":[{"type":"Session Chair","first_name":"Roland","last_name":"Lindh","affiliation":"Uppsala University","country":""}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Emerging Application Domains","Computer Science and Applied Mathematics","Chemistry and Materials","Physics"],"slots":[{"id":"symp135","type":"minisymposia","title":"MS15 - Machine Learning and Quantum Chemistry","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"Machine Learning is right now a booming field of computer science which finds applications in the development of computer-human interfaces, in the analysis of medical data of huge populations, in the maintenance of cars, planes and elevators, and self-driving cars, to mention a few. During the last twenty years the field has gone through a development and refinement which has been spectacular. For some reason the use of the technology in pure science has been lagging behind; however, we are now starting to see the use of machine learning in the field of quantum chemistry. Here, the approach will enhance the performance of standard quantum chemical calculations \u2013 improving convergence, could serve as a tool for post-analysis of huge sets of \u003Cem\u003Eab initio\u003C\/em\u003E results, or could simply replace computationally expensive procedures. Machine learning offers practical alternatives where standard quantum chemical simulations would be prohibitive. During the last few years a small number of quantum chemistry groups have explored the potential of machine learning \u2013 the results have been extraordinary and spectacular. 
Here in this minisymposium we would like to inspire by presenting four different applications in which machine learning is fundamental to success.","bio":"","contributors":[{"type":"Organizer","first_name":"Roland","last_name":"Lindh","affiliation":"Uppsala University","country":"Sweden","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Roland","last_name":"Lindh","affiliation":"Uppsala University","country":"Sweden","bio":"","order":"1","is_presenter":true}]},{"id":"msa199","type":"child","title":"Quantum Machine Learning in Chemical Compound Space","begin_time":"15:30","end_time":"16:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Many of the most relevant chemical properties of matter depend explicitly on atomistic and electronic details, rendering a first principles approach to chemistry mandatory. Alas, even when using high-performance computers, brute force high-throughput screening of compounds is beyond any capacity for all but the simplest systems and properties due to the combinatorial nature of chemical space, i.e. all compositional, constitutional, and conformational isomers. Consequently, efficient exploration algorithms need to exploit all implicit redundancies present in chemical space. I will discuss recently developed statistical learning approaches for interpolating quantum mechanical observables in compositional and constitutional space. 
Results for our models indicate remarkable performance in terms of accuracy, speed, universality, and size scalability.","filename":"msa199s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anders S.","last_name":"Christensen","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anders S.","last_name":"Christensen","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa269","type":"child","title":"Neural Networks Learning Quantum Chemistry","begin_time":"16:00","end_time":"16:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The development of accurate and transferable machine learning (ML) potentials for predicting molecular energetics is a challenging task. The process of data generation to train such ML potentials is a task neither well understood nor researched in detail. In this talk, we will present a fully transferable deep learning potential that is applicable to complex and diverse molecular systems well beyond the training dataset. Recently we introduced ANAKIN-ME (Accurate NeurAl networK engINe for Molecular Energies) or ANI in short. [doi: 10.1039\/C6SC05720A] ANI is a new \u003Cem\u003Emethod and sampling procedure\u003C\/em\u003E for training NNPs that utilizes a special kind of symmetry functions to build single-atom atomic environment vectors (AEV) as a molecular representation. To train ANI potential we use fully automated approach for the generation of datasets.[arXiv:1801.09319] It is based on the concept of active learning (AL). We show the use of our proposed AL technique develops a universal ANI potential, which provides very accurate energy and force predictions on the entire COMP6 benchmark. 
This universal potential achieves a level of accuracy on par with the best ML potentials for single molecule or materials while remaining applicable to the general class of organic molecules comprised of the elements CHNOSFCl.","filename":"msa269s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Olexandr","last_name":"Isayev","affiliation":"University of North Carolina","country":"United States of America","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Olexandr","last_name":"Isayev","affiliation":"University of North Carolina","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"msa237","type":"child","title":"Neural Network Representations of Non-Equilibrium Potential Energy Surfaces Sampled in Virtual Reality","begin_time":"16:30","end_time":"17:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"I will outline recent developments in our group aimed at developing efficient potential energy surface (PES) representations of molecular geometries which are far from equilibrium. Recent progress developing a framework for interactive molecular dynamics in a multi-user virtual reality environment (combining rigorous cloud-mounted physical atomistic simulation with commodity virtual reality hardware) enables us to visualize and sample, with atomic-level precision, the structures and dynamics of complex molecular structures \u0027on the fly\u0027.(arXiv:1801.02884) From within this framework, we can run real-time molecular dynamics (using density functional and semi-empirical theory), accelerating the sampling of high-energy reaction pathways. Combined, these reactive pathways represent a test set of molecular geometries whose energies and forces we then calculate at higher levels (e.g., explicitly correlated local coupled cluster theory), and fit using neural networks. 
The resultant PES provides coupled cluster quality energies at the cost of classical force fields, enabling us to run thousands of trajectories and thereby make comparisons with experimental dynamical observables in non-equilibrium regimes. I will illustrate this coupled virtual-reality-machine-learning workflow by focusing on recent applications where we have been studying heterogeneous reaction dynamics wherein cyano radicals (CN) undergo reactive scattering at the surfaces of liquids which are composed of long chain hydrocarbons.","filename":"msa237s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Silvia","last_name":"Amabilino","affiliation":"University of Bristol","country":"United Kingdom","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lars","last_name":"Bratholm","affiliation":"University of Bristol","country":"United Kingdom","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Simon","last_name":"Bennie","affiliation":"University of Bristol","country":"United Kingdom","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Glowacki","affiliation":"University of Bristol","country":"United Kingdom","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"David","last_name":"Glowacki","affiliation":"University of Bristol","country":"United Kingdom","bio":"","order":"4","is_presenter":true}]},{"id":"msa272","type":"child","title":"Predicting the Stability of Solids with Density Functional Theory and Machine Learning","begin_time":"17:00","end_time":"17:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"We use a combination of machine learning techniques and high-throughput density-functional theory calculations to explore ternary compounds with the AB2C2 composition. 
We chose the two most common intermetallic prototypes for this composition, namely the tI10-CeAl2Ga 2 and the tP10-FeMo2B2 structures. We find that there may be \u223c10 times more stable compounds in these phases than previously known. These are mostly metallic and non-magnetic. While the use of machine learning reduces the overall calculation cost by around 75%, some limitations still exist, in particular for compounds involving the second-row of the periodic table or magnetic elements.","filename":"msa272s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Miguel A. L.","last_name":"Marques","affiliation":"Martin Luther University Halle-Wittenberg","country":"Germany","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Miguel A. L.","last_name":"Marques","affiliation":"Martin Luther University Halle-Wittenberg","country":"Germany","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"msa199","type":"child","title":"Quantum Machine Learning in Chemical Compound Space","begin_time":"15:30","end_time":"16:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Many of the most relevant chemical properties of matter depend explicitly on atomistic and electronic details, rendering a first principles approach to chemistry mandatory. Alas, even when using high-performance computers, brute force high-throughput screening of compounds is beyond any capacity for all but the simplest systems and properties due to the combinatorial nature of chemical space, i.e. all compositional, constitutional, and conformational isomers. Consequently, efficient exploration algorithms need to exploit all implicit redundancies present in chemical space. I will discuss recently developed statistical learning approaches for interpolating quantum mechanical observables in compositional and constitutional space. 
Results for our models indicate remarkable performance in terms of accuracy, speed, universality, and size scalability.","filename":"msa199s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anders S.","last_name":"Christensen","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anders S.","last_name":"Christensen","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Anders S.","last_name":"Christensen","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}] } Presentation
16:00 - 16:30
Neural Networks Learning Quantum Chemistry
Olexandr Isayev (University of North Carolina, United States of America)
Abstract: The development of accurate and transferable machine learning (ML) potentials for predicting molecular energetics is a challenging task. The process of data generation to train such ML potentials is neither well understood nor researched in detail. In this talk, we will present a fully transferable deep learning potential that is applicable to complex and diverse molecular systems well beyond the training dataset. Recently we introduced ANAKIN-ME (Accurate NeurAl networK engINe for Molecular Energies), or ANI for short [doi: 10.1039/C6SC05720A]. ANI is a new method and sampling procedure for training neural network potentials (NNPs) that uses a special kind of symmetry function to build single-atom atomic environment vectors (AEVs) as a molecular representation. To train the ANI potential we use a fully automated approach for the generation of datasets [arXiv:1801.09319], based on the concept of active learning (AL). We show that our proposed AL technique yields a universal ANI potential, which provides very accurate energy and force predictions on the entire COMP6 benchmark. This universal potential achieves a level of accuracy on par with the best ML potentials for single molecules or materials while remaining applicable to the general class of organic molecules composed of the elements CHNOSFCl.
16:30 - 17:00
Neural Network Representations of Non-Equilibrium Potential Energy Surfaces Sampled in Virtual Reality
David Glowacki (University of Bristol, United Kingdom)
Co-authors: Silvia Amabilino, Lars Bratholm, Simon Bennie (all University of Bristol, United Kingdom)
Abstract: I will outline recent developments in our group aimed at developing efficient potential energy surface (PES) representations of molecular geometries which are far from equilibrium. Recent progress developing a framework for interactive molecular dynamics in a multi-user virtual reality environment (combining rigorous cloud-mounted physical atomistic simulation with commodity virtual reality hardware) enables us to visualize and sample, with atomic-level precision, the structures and dynamics of complex molecular structures 'on the fly' (arXiv:1801.02884). From within this framework, we can run real-time molecular dynamics (using density functional and semi-empirical theory), accelerating the sampling of high-energy reaction pathways. Combined, these reactive pathways represent a test set of molecular geometries whose energies and forces we then calculate at higher levels (e.g., explicitly correlated local coupled cluster theory), and fit using neural networks. The resultant PES provides coupled cluster quality energies at the cost of classical force fields, enabling us to run thousands of trajectories and thereby make comparisons with experimental dynamical observables in non-equilibrium regimes. I will illustrate this coupled virtual-reality-machine-learning workflow by focusing on recent applications where we have been studying heterogeneous reaction dynamics wherein cyano radicals (CN) undergo reactive scattering at the surfaces of liquids which are composed of long-chain hydrocarbons.
17:00 - 17:30
Predicting the Stability of Solids with Density Functional Theory and Machine Learning
Miguel A. L. Marques (Martin Luther University Halle-Wittenberg, Germany)
We use a combination of machine learning techniques and high-throughput density-functional theory calculations to explore ternary compounds with the AB2C2 composition. We chose the two most common intermetallic prototypes for this composition, namely the tI10-CeAl2Ga2 and the tP10-FeMo2B2 structures. We find that there may be ~10 times more stable compounds in these phases than previously known. These are mostly metallic and non-magnetic. While the use of machine learning reduces the overall calculation cost by around 75%, some limitations still exist, in particular for compounds involving the second row of the periodic table or magnetic elements.
Organizer(s):
Sharlee Climer (University of Missouri - St. Louis, United States of America)
Daniel Jacobson (Oak Ridge National Laboratory, United States of America)
Track(s):
Life Sciences, Engineering, Emerging Application Domains, Computer Science and Applied Mathematics
Difficult combinatorial problems permeate virtually every area of the sciences, business, and government, and many of these problems can be cast as mixed-integer programs (MIPs). A MIP is a mathematical definition of a problem that consists of a set of constraints and an objective function. In general, MIPs are NP-hard and require exponential amounts of computation time in the worst case. However, search strategies such as branch-and-bound, branch-and-cut, and cut-and-solve have evolved to provide optimal solutions for many instances.
Although there has been great progress in this field and computational power has dramatically increased over the years, many important MIPs remain intractable and the use of massive parallelization appears to be a promising means to address this great need. However, many challenges lie ahead. This minisymposium will elucidate some of these challenges, while highlighting progress in this field. It includes a round table discussion with Michael Chan, Sharlee Climer, Daniel Jacobson, Sarah Powers, and Daniel Rehfeldt, and is open to conference participants. The goal of the discussions will be to explore and integrate high-performance expertise with domain-specific insights with an aim to identify strategies that may resolve these pressing challenges.
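As a rough illustration of the branch-and-bound strategy named in the abstract above, the following minimal Python sketch solves a 0/1 knapsack problem, which is a small pure-integer special case of a MIP. It is not the cut-and-solve method presented in this session, and the function name, the choice of problem, and the use of a greedy fractional relaxation as the bound are all assumptions made for the example.

```python
from typing import List, Tuple


def knapsack_branch_and_bound(
    values: List[int], weights: List[int], capacity: int
) -> Tuple[int, List[int]]:
    """Maximize total value subject to a weight limit, with x_i in {0, 1}.

    Hypothetical helper for illustration only; weights must be positive.
    """
    n = len(values)
    # Sorting by value density lets the fractional relaxation be solved greedily.
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)

    def relaxation_bound(k: int, value: int, room: int) -> float:
        """Upper bound from the relaxation: at most one item may be fractional."""
        bound = float(value)
        for i in order[k:]:
            if weights[i] <= room:
                room -= weights[i]
                bound += values[i]
            else:
                bound += values[i] * room / weights[i]
                break
        return bound

    best_value = 0
    best_choice: List[int] = []

    def branch(k: int, value: int, room: int, chosen: List[int]) -> None:
        nonlocal best_value, best_choice
        if value > best_value:
            # Every partial selection is feasible, so it can update the incumbent.
            best_value, best_choice = value, chosen.copy()
        if k == n or relaxation_bound(k, value, room) <= best_value:
            return  # prune: this subtree cannot beat the incumbent
        i = order[k]
        if weights[i] <= room:
            chosen.append(i)  # branch x_i = 1
            branch(k + 1, value + values[i], room - weights[i], chosen)
            chosen.pop()
        branch(k + 1, value, room, chosen)  # branch x_i = 0

    branch(0, 0, capacity, [])
    return best_value, sorted(best_choice)


if __name__ == "__main__":
    print(knapsack_branch_and_bound([60, 100, 120], [10, 20, 30], 50))
```

The two recursive calls explore independent subtrees, which is the part that parallel solvers distribute across workers. The shared incumbent (best_value here) is what has to be communicated between them, and that is one reason why scaling such a search is harder than it first appears.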
16:00 - 16:30
Parallel Cut-and-Solve: A Method for Solving Mixed-Integer Programs Utilizing Distributed Computational Power
Michael Chan (University of Missouri - St. Louis, United States of America)
A great number of problems can be cast as mixed-integer programs (MIPs), but their difficulty prevents many instances from being solved to optimality. The large amounts of distributed computational power becoming more readily available may provide a solution for tackling previously insolvable instances. But current MIP solvers' implementations using branch-and-cut can only be effectively parallelized to a certain degree. Here we present a potential method for parallelizing MIPs on a large scale using cut-and-solve and demonstrate our approach for a combinatorial genetics problem.
Organizer(s):
Philipp Schlatter (KTH Royal Institute of Technology, Sweden)
Ramesh Balakrishnan (Argonne National Laboratory, United States of America)
Track(s):
Engineering, Computer Science and Applied Mathematics
Computational Fluid Dynamics (CFD) is a natural driver for exascale computing in both academic and industrial cases, and has the potential for substantial societal impact, such as reduced energy consumption, alternative sources of energy, improved health care, and improved climate models. This minisymposium focuses on algorithms and methods applicable on the way to exascale for CFD simulations. Application cases were discussed in Part I, whereas Part II focuses on some of the relevant methodological aspects. The main driver is the EU-funded Horizon 2020 project ExaFLOW, and the session will feature presentations showcasing its work on key algorithmic challenges in CFD in order to facilitate simulations at exascale, e.g. accurate and scalable solvers, data reduction methods (compression), and strategies to ensure fault tolerance and resilience. In particular, the talks in this minisymposium will highlight the following topics: adaptive mesh refinement and adjoint-based error estimators, resilience in transient flow solvers, efficient communication operators using PGAS, and mixed CG-HDG formulations for higher-order simulations.
17:30 - 18:00
Coffee Break
Foyer 2nd Floor
Constantia Alexandrou (University of Cyprus, Cyprus)
Petros Koumoutsakos (ETH Zurich, Switzerland)
Petros Koumoutsakos, a computational science researcher from ETH Zurich, will interview Constantia Alexandrou from the University of Cyprus about her domain of expertise: quantum chromodynamics. Many, if not most, fields in physics employ high performance computing (HPC), yet quantum chromodynamics (QCD) might be the premier example of an area that is very difficult to understand from outside the field. In this dialogue, Constantia and Petros will look at what computational QCD achieves through the use of HPC, contextualizing it within a more general discussion of modern-day scientific computing. They will attempt to answer such questions as, “How do we ‘compute’ theory?” and “Will future computers change the way that theoretical physics ‘experiments’ are performed?”
Further details are available here.
If you are interested in participating, make sure you add this option at the time of registration.
Please note that places are limited. Further details available here.
Tuesday, July 3, 2018
Chair: Florina Ciorba (University of Basel, Switzerland)
Traditionally, scientific laws have been applied deductively: to predict the performance of a pacemaker before implant, the downforce of a Formula 1 car, the pricing of derivatives in finance, or the motion of planets for a trip to Mars. With Artificial Intelligence, we are starting to also use the data-intensive inductive approach, enabled by the re-emergence of Machine Learning, which has been fueled by decades of accumulated data.
Moderators:
Florina Ciorba (University of Basel, Switzerland)
Erik Lindahl (Stockholm University, Sweden)
This panel discussion will address the main theme of PASC18: "Fast and Big Data, Fast and Big Computation". Are these two worlds evolving and converging together? Or is HPC facing a game-changing moment as the appetite for computation in the scientific computing community and industry is for a different type of computation than what we're used to? The panelists will discuss the critical challenges facing key HPC application areas in the next 5-10 years, based on a mix of knowledge and speculation. They will explore whether we need to make radical changes to our practices, methods, tools, and techniques to be able to use modern resources and make faster and bigger progress on our scientific problems. Do the current and projected developments of HPC systems and HPC software match the needs of computational scientists? Can we influence these developments in any meaningful way, or is it just a matter of adapting to the (new) hardware? Do computational scientists need to learn and apply techniques and algorithms from other areas, such as artificial intelligence and machine learning? Or is it that the other areas need to learn how to use and apply HPC to their algorithms?
Panelists:
Eng Lim Goh (Hewlett Packard Enterprise, US) will bring a perspective from industry.
Nuria Lopez (ICIQ, Spain) will bring a perspective from the chemistry domain.
Matthias Scheffler (Fritz Haber Institute, Germany) will bring a perspective from the physics and materials domains.
Torsten Schwede (University of Basel, Switzerland) will bring a perspective from the life sciences domain.
Further details are available here.
Chair: Maria Grazia Giuffreda (ETH Zurich / CSCS, Switzerland)
The aim of this rapid-fire session is to allow poster presenters to introduce the topic of their poster and motivate the audience to visit them at the evening poster session. Authors will be strictly limited to 40 seconds each - after this time the presentation will be stopped automatically.
11:00 - 11:30
Coffee Break
Foyer 2nd Floor
11:30 - 12:30
Papers Session
Chair: Sunita Chandrasekaran (University of Delaware, United States of America)
Track(s):
Emerging Application Domains, Computer Science and Applied Mathematics, Physics
Chair: Michael A. Heroux (Sandia National Laboratories, United States of America)
Track(s):
Computer Science and Applied Mathematics
Chair: Olaf Schenk (Università della Svizzera italiana, Switzerland)
Track(s):
Computer Science and Applied Mathematics
12:30 - 13:30
Lunch
Foyer 2nd Floor
13:30 - 15:30
Minisymposia Session III
Organizer(s):
Aurélien Cavelan (University of Basel, Switzerland)
Florina Ciorba (University of Basel, Switzerland)
Track(s):
Computer Science and Applied Mathematics
This minisymposium will discuss faults, errors, and failures that occur in extreme-scale computing systems. We want to increase awareness that resilience is a critical topic and that there are efforts and results that offer solutions to scientists and users of extreme-scale computing systems. Hardware-level faults fall into two categories: hard faults are a consequence of permanent component failures (requiring repairs), while soft, or transient, faults result from single-event upsets (e.g. a bit-flip in a memory cell) and have impermanent effects on the system. Hard faults typically result in system-wide failures, and in the absence of a fault management mechanism, a running application is interrupted and its data is lost. The standard approach to cope with failures is to checkpoint, roll back, and recover applications. However, it is expected that this approach may no longer be a viable solution on the upcoming exascale systems. In contrast to hard faults, transient faults cannot always be detected, and they can lead to Silent Data Corruptions (SDCs). Significant research efforts have been pursued to develop efficient SDC detectors, but there is no perfect solution yet. The speakers selected for this minisymposium will give an overview of current solutions and future challenges.
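The checkpoint, rollback and recover cycle described above can be sketched in a few lines of Python. This is a toy single-process simulation under stated assumptions, not a tool from any of the talks: failures are drawn from a seeded random generator, the checkpoint is an in-memory copy rather than a file on stable storage, and the names run_with_checkpoints and young_interval are hypothetical. The interval helper uses Young's well-known first-order approximation, sqrt(2 x checkpoint cost x mean time between failures), which is offered only as a common starting point.

```python
import copy
import math
import random
from typing import Tuple


def young_interval(checkpoint_cost: float, mtbf: float) -> float:
    """Young's first-order approximation of the optimal time between checkpoints."""
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


def run_with_checkpoints(
    total_steps: int, checkpoint_every: int, failure_prob: float, seed: int = 0
) -> Tuple[int, int]:
    """Run a toy computation that survives simulated failures via rollback.

    Returns the final accumulator and the number of rollbacks performed.
    failure_prob must be below 1, otherwise the loop cannot make progress.
    """
    rng = random.Random(seed)  # seeded so the simulated failures are repeatable
    state = {"step": 0, "acc": 0}
    checkpoint = copy.deepcopy(state)  # stand-in for a checkpoint on stable storage
    rollbacks = 0
    while state["step"] < total_steps:
        if rng.random() < failure_prob:
            # Simulated failure: work since the last checkpoint is lost.
            state = copy.deepcopy(checkpoint)
            rollbacks += 1
            continue
        state["acc"] += state["step"]  # one unit of "useful work"
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            checkpoint = copy.deepcopy(state)
    return state["acc"], rollbacks


if __name__ == "__main__":
    acc, rollbacks = run_with_checkpoints(100, 10, 0.05)
    print(acc, rollbacks, young_interval(50.0, 10_000.0))
```

Because the work is deterministic, the final accumulator is the same with or without failures; only the amount of recomputation changes. Shorter checkpoint intervals reduce lost work but add checkpoint overhead, which is the trade-off that the interval formula tries to balance.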
13:30 - 14:00
Characterizing Faults, Errors and Failures in Extreme-Scale Computing Systems
Christian Engelmann (Oak Ridge National Laboratory, United States of America)
+ Abstract { "session": {"id":"sess154","title":"MS18 - Addressing Resilience Challenges for Computing at Extreme Scale","date":"Tuesday, July 3rd 2018","begin_time":"13:30","end_time":"15:30","room":"Montreal Room","contributors":[{"type":"Session Chair","first_name":"Aurelien","last_name":"Cavelan","affiliation":"University of Basel","country":""}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"symp121","type":"minisymposia","title":"MS18 - Addressing Resilience Challenges for Computing at Extreme Scale","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"This minisymposium will discuss faults, errors, and failures that occur in extreme-scale computing systems. We want to increase awareness that resilience is a critical topic and that there are efforts and results that offer solutions to scientists and users of extreme-scale computing systems. Hardware level faults fall into two categories: hard faults are a consequence of permanent component failures (requiring repairs), while soft, or transient faults result from single upset events (e.g. a bit-flip in a memory cell) and have impermanent effects on the system. Hard faults typically result in system-wide failures, and in the absence of a fault management mechanism, an executed application is interrupted and its data is lost. The standard approach to cope with failures is to checkpoint, rollback and recover applications. However, it is expected that this approach may no longer be a viable solution on the upcoming Exascale systems. In contrast to hard faults, transient faults cannot always be detected and they can lead to Silent Data Corruptions (SDCs). Significant research efforts have been pursued to develop efficient SDC detectors, but there is not a perfect solution yet. 
The speakers selected for this minisymposium will give an overview of current solutions and future challenges.","bio":"","contributors":[{"type":"Organizer","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Organizer","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa139","type":"child","title":"Characterizing Faults, Errors and Failures in Extreme-Scale Computing Systems","begin_time":"13:30","end_time":"14:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Building a reliable supercomputer that achieves the expected performance within a given cost budget and providing efficiency and correctness during operation in the presence of faults, errors,\u00a0and failures requires a full understanding of the resilience problem. The Catalog project develops a fault taxonomy, catalog and models that capture the observed and inferred conditions in\u00a0current supercomputers and extrapolates this knowledge to future-generation systems. To date, the Catalog project has analyzed billions of node hours of system logs from supercomputers at\u00a0Oak Ridge National Laboratory and Argonne National Laboratory. 
This talk provides an overview of our findings and lessons learned.","filename":"msa139s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christian","last_name":"Engelmann","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christian","last_name":"Engelmann","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"msa117","type":"child","title":"Easy and Efficient Multilevel Checkpointing for Extreme Scale Systems","begin_time":"14:00","end_time":"14:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Extreme scale supercomputers offer thousands of computing nodes to their users to satisfy their computing needs. As the need for massively parallel computing increases in industry, computing centers are being forced to increase in size and to transition to new computing technologies. While the advantage for the users is clear, such evolution imposes significant challenges, such as energy consumption and reliability. In this talk, we will discuss how to guarantee high reliability to high performance applications running in extreme scale supercomputers. In particular, we cover the tools necessary to implement scalable multilevel checkpointing for tightly coupled applications. This includes an overview of failure types and frequency in current HPC systems. The talk will also cover the theoretical analysis necessary to achieve optimal utilization of the computing resources. 
Moreover, we will discuss the internals of the FTI library tool, to study how multilevel checkpointing is implemented today.","filename":"msa117s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Leonardo","last_name":"Bautista","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Leonardo","last_name":"Bautista","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"1","is_presenter":true}]},{"id":"msa128","type":"child","title":"Recent Results and Open Problems for Resilience at Scale","begin_time":"14:30","end_time":"15:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The talk will address the following three questions: (i) fail-stop errors: checkpointing or replication or both? (ii) silent errors: application-specific detectors or plain old trustworthy replication? (iii) workflows: how to avoid checkpointing every task?","bio":"","contributors":[{"type":"Author","first_name":"Yves","last_name":"Robert","affiliation":"\u00c9cole normale sup\u00e9rieure de Lyon","country":"France","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Yves","last_name":"Robert","affiliation":"\u00c9cole normale sup\u00e9rieure de Lyon","country":"France","bio":"","order":"1","is_presenter":true}]},{"id":"msa261","type":"child","title":"Panel Discussion on Upcoming Challenges at Exascale","begin_time":"15:00","end_time":"15:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"This panel discussion will summarize the three preceding talks and offer guidelines and recommendations for the resilience challenges and opportunities available to scientists and users of extreme-scale computing 
systems.","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Leonardo","last_name":"Bautista","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Yves","last_name":"Robert","affiliation":"\u00c9cole normale sup\u00e9rieure de Lyon","country":"France","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Christian","last_name":"Engelmann","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"msa139","type":"child","title":"Characterizing Faults, Errors and Failures in Extreme-Scale Computing Systems","begin_time":"13:30","end_time":"14:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Building a reliable supercomputer that achieves the expected performance within a given cost budget and providing efficiency and correctness during operation in the presence of faults, errors,\u00a0and failures requires a full understanding of the resilience problem. The Catalog project develops a fault taxonomy, catalog and models that capture the observed and inferred conditions in\u00a0current supercomputers and extrapolates this knowledge to future-generation systems. To date, the Catalog project has analyzed billions of node hours of system logs from supercomputers at\u00a0Oak Ridge National Laboratory and Argonne National Laboratory. 
This talk provides an overview of our findings and lessons learned.","filename":"msa139s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christian","last_name":"Engelmann","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christian","last_name":"Engelmann","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Christian","last_name":"Engelmann","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true}] } Presentation
14:00 - 14:30
Easy and Efficient Multilevel Checkpointing for Extreme Scale Systems
Leonardo Bautista (Barcelona Supercomputing Center, Spain)
Abstract:
Extreme-scale supercomputers offer thousands of computing nodes to their users to satisfy their computing needs. As the need for massively parallel computing increases in industry, computing centers are being forced to increase in size and to transition to new computing technologies. While the advantage for the users is clear, such evolution imposes significant challenges, such as energy consumption and reliability. In this talk, we will discuss how to guarantee high reliability for high-performance applications running on extreme-scale supercomputers. In particular, we cover the tools necessary to implement scalable multilevel checkpointing for tightly coupled applications. This includes an overview of failure types and frequency in current HPC systems. The talk will also cover the theoretical analysis necessary to achieve optimal utilization of the computing resources. Moreover, we will discuss the internals of the FTI library to study how multilevel checkpointing is implemented today.
14:30 - 15:00
Recent Results and Open Problems for Resilience at Scale
Yves Robert (École normale supérieure de Lyon, France)
Abstract:
The talk will address the following three questions: (i) fail-stop errors: checkpointing or replication or both? (ii) silent errors: application-specific detectors or plain old trustworthy replication? (iii) workflows: how to avoid checkpointing every task?
15:00 - 15:30
Panel Discussion on Upcoming Challenges at Exascale
Aurélien Cavelan (University of Basel, Switzerland)
Abstract:
This panel discussion will summarize the three preceding talks and offer guidelines and recommendations for the resilience challenges and opportunities available to scientists and users of extreme-scale computing systems.
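The multilevel checkpointing abstract above mentions a theoretical analysis for choosing how often to checkpoint. As a rough illustration only, the sketch below uses the classic first-order (Young) approximation, in which the compute time between checkpoints is sqrt(2 × checkpoint cost × mean time between failures). This is not taken from the talk or from the FTI library: the function names, the two storage levels and all numbers are hypothetical, and treating each level independently is a simplification of a real multilevel analysis.

```python
import math


def young_interval(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """First-order (Young) estimate of the compute time between checkpoints."""
    if checkpoint_cost_s <= 0 or mtbf_s <= 0:
        raise ValueError("checkpoint cost and MTBF must be positive")
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)


def expected_waste(interval_s: float, checkpoint_cost_s: float, mtbf_s: float) -> float:
    """First-order fraction of time lost: checkpoint overhead plus expected rework."""
    return checkpoint_cost_s / interval_s + interval_s / (2.0 * mtbf_s)


# Hypothetical levels: (name, checkpoint cost in seconds,
# MTBF in seconds of the failures that this level can recover from).
LEVELS = [
    ("node-local storage", 5.0, 4 * 3600.0),
    ("parallel file system", 300.0, 48 * 3600.0),
]

for name, cost, mtbf in LEVELS:
    interval = young_interval(cost, mtbf)
    waste = expected_waste(interval, cost, mtbf)
    print(f"{name}: checkpoint every {interval:.0f} s, estimated waste {waste:.1%}")
```

The cheap level ends up checkpointing far more often than the expensive one, which is the basic motivation for having several checkpoint levels in the first place.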
Organizer(s):
Ebru Bozdag (Colorado School of Mines, United States of America)
Dimitri Komatitsch (CNRS, France)
Track(s):
Computer Science and Applied Mathematics, Solid Earth Dynamics, Physics
Recent advances in theory and numerical methods, together with the availability of high-quality massive data sets and high-performance computing, provide unprecedented opportunities to improve our understanding of Earth’s interior and its mechanisms. The goal of this session is to bring computational and Earth scientists together to discuss the current status, challenges and future directions in computational geosciences, highlighting numerical simulations, state-of-the-art HPC applications and their scientific outcomes. Contributions include, but are not limited to, the areas of earthquake engineering, passive and active-source seismic imaging, geodynamical modelling and magneto-fluid dynamics, in conjunction with computational approaches such as numerical solvers, large-scale workflows, big data and optimisation strategies on HPC systems.
13:30 - 14:00
High-Resolution 3D Viscoelastic Full Waveform Imaging of a Real Seismic Dataset: The Volve Oil Field Studied up to 12 Hz
Dimitri Komatitsch (CNRS, France)
Abstract:
We will present recent advances in high-resolution 3D viscoelastic full waveform imaging, focusing in particular on a real seismic dataset for the Volve oil field, which we study up to 12 Hz on a large GPU cluster: the Piz Daint machine at CSCS in Switzerland. We will present both the workflow used and the final high-resolution pictures obtained.
14:00 - 14:30
Elastic Full Waveform Inversion with Active Seismic Data
Rene-Edouard Plessix (Shell Technology Center Amsterdam, Netherlands)
Abstract:
In exploration geophysics, seismic full waveform inversion is nowadays regularly applied. Most of the time, the acoustic approximation is made when active seismic data are inverted. This assumption considerably reduces the computational requirements. Indeed, depending on the type of acquisition (streamer versus ocean-bottom node), the number of simulations per iteration can range from several thousand to hundreds of thousands, and the grid can contain from millions to hundreds of millions of points. However, the acoustic approximation limits the range of applications. In complex geology with large earth parameter contrasts, ignoring the elastic effects can lead to significant artefacts in the full waveform results. In this presentation, I will discuss some of the challenges we face when considering elastic propagation. The first is obviously the large increase in computational cost; the second is the multi-parameter inversion aspect. I will also discuss some examples to illustrate the importance of considering complex physical phenomena in our simulation during full waveform inversion.
14:30 - 15:00
Accelerating Low-Order Unstructured Finite Element Earthquake Simulation by Time-Parallel Computation on Recent HPC Architectures
Kohei Fujita (The University of Tokyo, Japan)
Abstract:
The implicit low-order unstructured finite-element method is suitable for accurate modeling of time-history earthquake problems in complex geometry domains. However, it is costly due to large and random data access. To circumvent this bottleneck, we developed a time-parallel method that reduces the number of solver iterations by using sparse matrix vector products with multiple right-hand sides. This leads to a reduction in total data transfer and random data access, and thus faster time-to-solution on recent architectures. We demonstrate the performance of the developed method and show application runs on earthquake problems.
15:00 - 15:30
Computational Models of Magnetic Field Generation in the Earth
Andy Jackson (ETH Zurich, Switzerland)
Abstract:
Earth's magnetic field is generated by fluid motion in the outer core by a process termed self-exciting dynamo action. In this process, electrically conducting fluid flows through a magnetic field, inducing electrical currents that reinforce the original magnetic field. The driving force for this is thought to be thermal convection. This process can be simulated on the computer in a self-consistent way, albeit in a parameter regime that is somewhat distant from planetary settings. In particular, the values of viscosity used are too large, and the prospects for reducing these viscosities to more appropriate values are remote. Despite this, the approach has met with great success and has demonstrated that magnetic fields can be generated in this way. Many features are quite Earth-like, most likely because the magnetic Reynolds number (the ratio of magnetic induction to magnetic diffusion) is in the correct regime. We will contrast conventional models with a different approach in which both inertia and viscosity are omitted from the equations at the outset. This approach, whilst in its infancy, holds the promise of providing complementary models of planetary magnetic field generation.
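The magnetic field generation abstract in this session describes the magnetic Reynolds number as the ratio of magnetic induction to magnetic diffusion. A minimal sketch of the standard definition, Rm = U L / η, is given below. The function name and the input values are illustrative assumptions, not figures from the talk: a flow speed of about 0.5 mm/s, a length scale of about 2260 km and a magnetic diffusivity of about 1 m²/s are only rough order-of-magnitude choices for Earth's outer core.

```python
def magnetic_reynolds(velocity_m_s: float, length_m: float, diffusivity_m2_s: float) -> float:
    """Magnetic Reynolds number Rm = U * L / eta (dimensionless)."""
    if diffusivity_m2_s <= 0:
        raise ValueError("magnetic diffusivity must be positive")
    return velocity_m_s * length_m / diffusivity_m2_s


# Illustrative, order-of-magnitude inputs (assumptions, see the text above).
CORE_FLOW_SPEED = 5e-4        # m/s
OUTER_CORE_THICKNESS = 2.26e6  # m
MAGNETIC_DIFFUSIVITY = 1.0     # m^2/s

rm = magnetic_reynolds(CORE_FLOW_SPEED, OUTER_CORE_THICKNESS, MAGNETIC_DIFFUSIVITY)
print(f"Rm is roughly {rm:.0f}")
```

With these assumed inputs Rm comes out on the order of a thousand, so induction dominates diffusion, and that is the regime the abstract says simulations can match even when their viscosities are too large.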
Organizer(s):
Richard Loft (National Center for Atmospheric Research, United States of America)
Oliver Fuhrer (MeteoSwiss, Switzerland)
Track(s):
Climate and Weather
Weather and climate models provide society with increasingly reliable weather forecasts and climate projections: critical information that can save both lives and money. With the advent of accelerators in high performance computing, several efforts around the world have begun porting weather and climate models to these emerging hardware architectures using different approaches and programming models. However, little attention has been given to the impact different porting approaches have on the maintainability of the codes. This minisymposium will provide an overview of the porting and maintainability experiences from four different community models in the US and Europe and will include a discussion of the pros/cons of different approaches in view of maintaining performance-portable production atmospheric community models.
13:30 - 14:00
Porting and Maintaining a GPU-Enabled and Performance-Portable Version of the Model for Prediction Across Scales (MPAS)
Richard Loft (National Center for Atmospheric Research, United States of America)
This talk will discuss our efforts to build a portable and maintainable CPU- and GPU-enabled version of the Model for Prediction Across Scales (MPAS), a global atmospheric model currently used for meteorological studies and, in the future, climate research. Our approach uses a combination of OpenMP and OpenACC directives to achieve performance portability. We have focused on three target architectures: traditional multi-core processors (e.g. Intel Xeon and IBM Power), many-core processors like the Intel Xeon Phi, and NVIDIA GPUs. Leveraging tools that accelerate the optimization and verification process, our team has managed to keep the port synchronized with developer updates originating from the core MPAS science team and to maintain readability and excellent performance across the three architectures. The results are encouraging, suggesting a path forward for our community models based on exposing parallelism to standard directive systems.
14:00 - 14:30
Experiences of Porting and Maintaining the ICON Model on Accelerators
William Sawyer (ETH Zurich / CSCS, Switzerland)
The dynamical core of the ICON model was first ported to accelerators within a PRACE 2IP project using the evolving OpenACC standard for accelerator directives. Building on this GPU porting effort, the PASC-funded ENIAC project aims to port the full model using OpenACC compiler directives. While this technique can achieve good performance gains, some further improvement may be achieved by using hardware-specific languages or optimizations. In addition, there are concerns about long-term maintainability: for example, would the investment be lost if OpenMP 4.5 usurped OpenACC as the de facto standard? The CLAW source-to-source translator ensures that OpenACC, OpenMP or other paradigms can be employed at the backend. CLAW also allows for a single-column abstraction of the physical parameterizations: this is more pleasing for the scientific developer, while allowing the introduction of domain- and hardware-specific optimizations. Finally, we envision that the inherently more static dynamical core will be reimplemented in a performance-portable, platform-agnostic manner, e.g., using the GridTools framework used for the COSMO model. GridTools is implementing support for the underlying icosahedral grid. We present ICON-component examples for each of these paradigms, along with the resulting performance. We derive therefrom a long-term strategy for a maintainable ICON implementation for GPUs.
15:00 - 15:30
Experience and Challenges with Maintaining a GPU-Capable Version of COSMO in a Production Environment at MeteoSwiss and ETH
Xavier Lapillonne (MeteoSwiss, Switzerland)
Advances in computer technologies can greatly enhance our climate and weather modeling capabilities by allowing increases in grid resolution or in the complexity of the physical processes accounted for. To benefit from hardware improvements, models need to be adapted, which is a challenging task. This talk will first present the port of the climate and weather model COSMO to heterogeneous GPU architectures, which was achieved using different technologies. A rewrite using a domain-specific language (DSL), allowing a high-level hardware-agnostic formulation of the model equations, was considered for some components, while OpenACC compiler directives have been used for the rest of the model. Performance results and implications for the production environment at MeteoSwiss, for weather forecasting as well as for climate simulation on the leadership-class heterogeneous HPC system Piz Daint in Switzerland, will be shown. Finally, challenges related to the two technologies employed, a domain-specific language and compiler directives, in terms of maintenance, performance portability, compiler support, and user acceptance will be discussed.
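13:30 - 15:30
MS21 - Computational Solutions to Large-Scale Data Management and Analysis Challenges in Personalized Health
Samarkand Room
Chair: Torsten Schwede (University of Basel)
Organizer(s):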
Leila Tamara Alexander (Swiss Institute of Bioinformatics, Switzerland), Torsten Schwede (University of Basel, Switzerland)
Track(s):
Life Sciences, Emerging Application Domains
Personalized health aims to provide the right treatment at the right time for each individual. A major premise is that empowerment with more knowledge leads to better decision-making. Tailored, predictive interventions have the potential to shift care from a reactive to a preventative approach, thereby significantly extending the duration of health. To achieve this, highly performant computational environments to store, transfer, analyse and integrate data produced at astonishing rates are prerequisites. Today personalized health is becoming a practical reality, and the explosion of 'big data' in healthcare provides exciting IT opportunities.
This minisymposium highlights computational approaches to enable the next steps in biomedical discovery. The introductory keynote will outline the current clinical and scientific needs for advanced computational approaches. The second talk covers computational methods that are already revolutionising medical practice through virtual reality systems for personalized surgery. The third talk discusses the utilisation of HPC for finding new anti-cancer therapies, from discovery to clinical trials. Our final speaker will present the latest developments in the workflow environments used by the Swiss Personalized Health Network (SPHN) to support Swiss researchers in exploring the next generation of data-driven health and care innovations.
13:30 - 14:00
Semantic Interoperability Challenges for Sharing and Reusing Large Amounts of Heterogeneous Data
Marie-Christine Jaulent (INSERM, France)
Although the Big Data approach seems promising in various analytic uses, sharing or integrating data within the same analysis space remains a complex task, as existing data is highly heterogeneous and difficult to compare. In this presentation, we address the Variety and Veracity dimensions of Big Data when integrating, sharing and reusing large amounts of heterogeneous data for data analysis and decision making applications in the healthcare domain. Many issues are raised by the necessity to conform Big Data to standards in order to make data more interoperable both by humans or computations such as data mining. We discuss how ontologies (computerized meaning) can contribute to the improvement of information sharing and address the problem of data sharing together with semantic interoperability data frameworks.
14:00 - 14:30
Challenges of Volume Rendering in a Virtual Reality Environment
Philippe Cattin (University of Basel, Switzerland)
Volume Rendering of medical three-dimensional data is challenging for Virtual Reality due to the computational complexity and the required high frame rate of 90 frames per second per eye. In this presentation I will show how we achieved this goal using standard GPU hardware and what possible applications of the technology are in the Medical Field.
14:30 - 15:00
HPC-Supported Therapy Development in Oncology
Olivier Michielin (University of Lausanne, Switzerland), Vincent Zoete (University of Lausanne, Switzerland)
Oncology is being revolutionized by technological breakthroughs that permit unprecedented in-depth analysis of the tumour tissue; until now, decision making was based on the analysis of a few well known molecular alterations. Recent technologies are now providing complete interrogation of germline and somatic mutations (genomics), of the gene expression level (transcriptomics), of the resulting protein levels (proteomics), of the metabolic status (metabolomics), of the antigens presented at the surface of tumour cells (immuno-peptidomics), as well as many additional omics to complete this very rich data set. The clinical decision process is therefore expected to heavily rely on computational approaches in the years to come, with machine learning technologies playing a key role. In addition, once key targets are identified through such -omics approaches, computational methods are also key to accelerate the drug discovery process. Two computer-based drug discovery projects will be discussed, both aiming at providing improved immunotherapies for melanoma patients. These projects will illustrate the contribution of high performance computing to the drug design and to the protein design approaches.
15:00 - 15:30
Achieving Workflow Interoperability for Personalized Health Research in Switzerland
Thierry Sengstag (University of Basel, Switzerland)
The Swiss Personalized Health Network initiative (SPHN) aims to accelerate biomedical research by making clinical-care data collected in multiple hospitals available to scientists. The first phase of the SPHN initiative is supported by the BioMedIT project (SIB), which will establish the necessary IT environment to achieve this goal. Traditionally research involving clinical data requires defining procedures which are developed ad hoc for each project (e.g. data access control, data transfer processes, security framework, methods to ensure reproducibility of results). Establishing such procedures is highly time consuming and the associated complexity makes them error prone. BioMedIT will establish a unified set of IT standards and services which will greatly mitigate these difficulties. Considering the heterogeneity of the research landscape (hospitals, academic institutions), and the rapid evolution of technologies, it is essential to define interoperability standards which are agnostic to the underlying technologies. In this context, workflow environments are an essential component to facilitate the development of reproducible and portable methods. Analysis of clinical data creates additional constraints regarding security and data access control that must be considered. In this session we will present the current consensus on implementing workflow environments for clinical research, and identify gaps still to be worked on.
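13:30 - 15:30
MS22 - Fostering Software Engineering Best Practice within Research Teams
Singapore Room
Chair: Mark Abraham (KTH Royal Institute of Technology)
Organizer(s):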
Mark Abraham (KTH Royal Institute of Technology, Sweden), Anshu Dubey (Argonne National Laboratory, United States of America)
Track(s):
Solid Earth Dynamics, Physics, Life Sciences, Engineering, Emerging Application Domains, Computer Science and Applied Mathematics, Climate and Weather, Chemistry and Materials
Theory and experiment have long been two equal pillars of science, and many would hope to add simulation as a third pillar. However, the challenges for those writing the simulation software are immense. The development team must (a) encompass strong domain expertise, so that the simulations are fit for their purpose; (b) develop the code in a way that can be sustained even without the original authors; and (c) demonstrate to their user communities through testing, benchmarking and documentation that the software will be useful in the hands of researchers, who will not be able to read the code.
In this minisymposium, we will hear from the developers of large community codes about approaches they have adopted to unite the teams of people around the development and maintenance of shared codebases. These will cover not just the programming languages and development tools that have been shown to work well, but also how to encourage adoption of good software engineering techniques by professionals and students of other disciplines, and the career-development needs of the research software engineers who will execute the bulk of the work.
13:30 - 14:00
The Evolution of Software Practice in GROMACS: To Suit Both the Laptop and the Exascale
Mark Abraham (KTH Royal Institute of Technology, Sweden)
+ Abstract { "session": {"id":"sess173","title":"MS22 - Fostering Software Engineering Best Practice within Research Teams","date":"Tuesday, July 3rd 2018","begin_time":"13:30","end_time":"15:30","room":"Singapore Room","contributors":[{"type":"Session Chair","first_name":"Mark","last_name":"Abraham","affiliation":"KTH Royal Institute of Technology","country":""}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Solid Earth Dynamics","Physics","Life Sciences","Engineering","Emerging Application Domains","Computer Science and Applied Mathematics","Climate and Weather","Chemistry and Materials"],"slots":[{"id":"symp140","type":"minisymposia","title":"MS22 - Fostering Software Engineering Best Practice within Research Teams","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"Theory and experiment have long been two equal pillars of science, and many would hope to add simulation as a third pillar. However, the challenges for those writing the simulation software are immense. The development team must (a) encompass strong domain expertise, so that the simulations are fit for their purpose; (b) develop the code in a way that can be sustained even without the original authors; and (c) demonstrate to their user communities through testing, benchmarking and documentation that the software will be useful in the hands of researchers, who will not be able to read the code.\u003Cbr \/\u003E\u003Cbr \/\u003EIn this minisymposium, we will hear from the developers of large community codes about approaches they have adopted to unite the teams of people around the development and maintenance of shared codebases. 
These will cover not just the programming languages and development tools that have been shown to work well, but also how to encourage adoption of good software engineering techniques by professionals and students of other disciplines, and the career-development needs of the research software engineers who will execute the bulk of the work.","bio":"","contributors":[{"type":"Organizer","first_name":"Mark","last_name":"Abraham","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true},{"type":"Organizer","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Mark","last_name":"Abraham","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true}]},{"id":"msa290","type":"child","title":"The Evolution of Software Practice in GROMACS: To Suit Both the Laptop and the Exascale","begin_time":"13:30","end_time":"14:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Molecular dynamics simulations are now a widely used investigative technique, often complementing or even serving in place of experiments. The implementation in GROMACS is already one of the most frequently used of all codes in HPC, however needs radical changes in computational efficiency to maximize users\u0027 scientific quality. Those changes must act at all scales of parallelism, whether a single laptop or the largest supercomputers, so key algorithms have been redesigned to permit implementations that can be tailored to current and emerging architectures. The different implementations require heavy investment in software development process, so that the global development team can deliver their projects in ways that users will trust. 
In this talk, I will recount some of the changes we have made, describing approaches that have worked, and why. Developers facing similar challenges will learn how they can benefit from these practices.","filename":"msa290s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Mark","last_name":"Abraham","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Mark","last_name":"Abraham","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true}]},{"id":"msa118","type":"child","title":"Software Process for FLASH, a Code Serving Multiple Scientific Communities","begin_time":"14:00","end_time":"14:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"FLASH is a multiphysics multiscale code that has been in existence for nearly two decades. It was originally developed for simulating astrophysical phenomena, however, investment in designing an extensible architecture has resulted in several science communities adding capabilities and adopting FLASH for their use. Challenges of various kinds have occurred at different stages of the code\u0027s evolution, ranging from deeply technical such as interoperating heterogeneous solvers, to sociological such as interdisciplinary interactions and building a community. Many software practices adopted by the FLASH team were ahead of their time compared to the broader scientific communities, therefore, tools such as testing harness, or checking compliance with coding standards were built in-house. This presentation will outline the evolution of FLASH\u0027s software process and tools development in response to specific challenges. 
It will also highlight the benefits of early investment in software design in terms of ongoing scientific productivity of the code.","filename":"msa118s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"msa295","type":"child","title":"Challenges in Evolving Software for Cryo-Electron Microscopy: From CPUs to GPUs and Back Again","begin_time":"14:30","end_time":"15:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In a few years, cryo-EM has gone from being the ugly duckling of structural biology to one of the hottest techniques in science, recently recognized by the 2017 Chemistry Nobel Prize.\u00a0Modern cryo-EM is entirely dependent on advanced computational tools to reconstruct three-dimensional structures from millions of extremely noisy two-dimensional images, and with faster detectors and more advanced processing algorithms the computational step has become a critical bottleneck - some experimental facilities have clusters with tens of thousands of CPUs. Here, I will present how we have managed to reformulate the Bayesian REgularized LIkelihood OptimizatoN algorithm used in the RELION code into data-parallel algorithms that made it possible to move the dominant parts to GPU accelerators using CUDA. I will also describe the work necessary to reformulate these algorithms to benefit from GPUs, general challenges when implementing CUDA parts in large production codes, and show examples of how GPU-specific features such as texture units enabled exceptional performance acceleration. 
However, all processors benefit from data-parallel algorithms: By porting our CUDA implementations back to C++ with threading and fast math libraries we have now also achieved tremendous speedup on standard x86 processors, with the compiler generating SIMD code instead of manually introducing hardware-specific instructions.","filename":"msa295s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Erik","last_name":"Lindahl","affiliation":"Stockholm University","country":"Sweden","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Erik","last_name":"Lindahl","affiliation":"Stockholm University","country":"Sweden","bio":"","order":"1","is_presenter":true}]},{"id":"msa302","type":"child","title":"More than Top-Down or Bottom-Up: Fostering Software Engineering Best Practice in Diverse Groups","begin_time":"15:00","end_time":"15:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"We often think of practice being driven by grass-roots change (encouraging participation) or clear leadership (providing direction). In reality, successful application of best practice requires both, particularly in the diverse communities where research software is developed and used. Over the last five years, significant changes related to research software have taken place\u00a0from training initiatives like Software Carpentry, to the\u00a0recognition of the role\u00a0of the Research Software Engineer, to the development and adoption of community guidelines and practices. 
My talk will look at how each of these is related, and how the drive towards reproducibility, FAIR research outputs, and open science is having an effect on the way that software development is being done in research teams.","filename":"msa302s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Neil","last_name":"Chue Hong","affiliation":"University of Edinburgh","country":"United Kingdom","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Neil","last_name":"Chue Hong","affiliation":"University of Edinburgh","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"msa290","type":"child","title":"The Evolution of Software Practice in GROMACS: To Suit Both the Laptop and the Exascale","begin_time":"13:30","end_time":"14:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Molecular dynamics simulations are now a widely used investigative technique, often complementing or even serving in place of experiments. The implementation in GROMACS is already one of the most frequently used of all codes in HPC, however needs radical changes in computational efficiency to maximize users\u0027 scientific quality. Those changes must act at all scales of parallelism, whether a single laptop or the largest supercomputers, so key algorithms have been redesigned to permit implementations that can be tailored to current and emerging architectures. The different implementations require heavy investment in software development process, so that the global development team can deliver their projects in ways that users will trust. In this talk, I will recount some of the changes we have made, describing approaches that have worked, and why. 
Developers facing similar challenges will learn how they can benefit from these practices.","filename":"msa290s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Mark","last_name":"Abraham","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Mark","last_name":"Abraham","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Mark","last_name":"Abraham","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true}] } Presentation
14:00 - 14:30
Software Process for FLASH, a Code Serving Multiple Scientific Communities
Anshu Dubey (Argonne National Laboratory, United States of America)
Abstract: FLASH is a multiphysics, multiscale code that has been in existence for nearly two decades. It was originally developed for simulating astrophysical phenomena; however, investment in designing an extensible architecture has resulted in several science communities adding capabilities and adopting FLASH for their use. Challenges of various kinds have occurred at different stages of the code's evolution, ranging from deeply technical ones, such as interoperating heterogeneous solvers, to sociological ones, such as interdisciplinary interactions and building a community. Many software practices adopted by the FLASH team were ahead of their time compared to the broader scientific communities, so tools such as a testing harness and checks for compliance with coding standards were built in-house. This presentation will outline the evolution of FLASH's software process and tools development in response to specific challenges. It will also highlight the benefits of early investment in software design in terms of ongoing scientific productivity of the code.
14:30 - 15:00
Challenges in Evolving Software for Cryo-Electron Microscopy: From CPUs to GPUs and Back Again
Erik Lindahl (Stockholm University, Sweden)
Abstract: In a few years, cryo-EM has gone from being the ugly duckling of structural biology to one of the hottest techniques in science, recently recognized by the 2017 Nobel Prize in Chemistry. Modern cryo-EM is entirely dependent on advanced computational tools to reconstruct three-dimensional structures from millions of extremely noisy two-dimensional images, and with faster detectors and more advanced processing algorithms the computational step has become a critical bottleneck: some experimental facilities have clusters with tens of thousands of CPUs. Here, I will present how we have managed to reformulate the Bayesian REgularized LIkelihood OptimizatioN algorithm used in the RELION code into data-parallel algorithms that made it possible to move the dominant parts to GPU accelerators using CUDA. I will also describe the work necessary to reformulate these algorithms to benefit from GPUs and the general challenges of implementing CUDA parts in large production codes, and show examples of how GPU-specific features such as texture units enabled exceptional performance acceleration. However, all processors benefit from data-parallel algorithms: by porting our CUDA implementations back to C++ with threading and fast math libraries, we have now also achieved tremendous speedup on standard x86 processors, with the compiler generating SIMD code rather than us manually introducing hardware-specific instructions.
15:00 - 15:30
More than Top-Down or Bottom-Up: Fostering Software Engineering Best Practice in Diverse Groups
Neil Chue Hong (University of Edinburgh, United Kingdom)
We often think of practice being driven by grass-roots change (encouraging participation) or clear leadership (providing direction). In reality, successful application of best practice requires both, particularly in the diverse communities where research software is developed and used. Over the last five years, significant changes related to research software have taken place from training initiatives like Software Carpentry, to the recognition of the role of the Research Software Engineer, to the development and adoption of community guidelines and practices. My talk will look at how each of these is related, and how the drive towards reproducibility, FAIR research outputs, and open science is having an effect on the way that software development is being done in research teams.
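13:30 - 15:30
MS23 - High-Performance Graph Algorithms
Sydney Room
Chair: Gerhard Wellein (Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany)
Organizer(s):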
Olaf Schenk (Università della Svizzera italiana, Switzerland)
Gerhard Wellein (Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany)
Georg Hager (Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany)
Track(s):
Computer Science and Applied Mathematics
Graphs (or networks) are a very powerful abstraction of various phenomena that can be expressed as a relation between entities. For several decades, researchers in theoretical computer science and discrete mathematics have been developing a wealth of graph theory and graph algorithms. Recently, however, we see a qualitative change in how graph algorithms are used in practice: the complex structure of graphs in new and emerging applications, the size of typical inputs, and the computer architectures on which graph problems are solved call for novel algorithms and hardware-efficient approaches. As computer architectures and memory hierarchies are becoming more complex, increasingly parallel and heterogeneous, it is important to develop parallel algorithms and tools with these specific hardware constraints in mind. The minisymposium thus aims to bring together experts in developing, implementing and using modern graph algorithms to address this issue. Existing algorithms and tools will be reviewed in terms of modern HPC architectures, and novel hardware-efficient approaches will be presented. The application areas for such techniques include sparse matrix partitioning and coloring and network graph analysis, which are traditionally sequential, as well as applications that need the extreme performance of emerging hardware architectures.
13:30 - 14:00
Tracking Communities in Streaming Graphs
David Bader (Georgia Institute of Technology, United States of America)
A variety of massive datasets, such as social networks and biological data, are represented as graphs that reveal underlying connections, trends, and anomalies. Community detection is the task of discovering dense groups of vertices in a graph. One specific form of it is seed set expansion, which finds the best local community for a given set of seed vertices. Greedy, agglomerative algorithms, which are commonly used in seed set expansion, have previously been designed only for a static, unchanging graph. However, in many applications, new data are constantly produced, and vertices and edges are inserted into and removed from a graph. We present an algorithm for dynamic seed set expansion, which maintains a local community over time by incrementally updating as the underlying graph changes. We show that our dynamic algorithm outputs high-quality communities that are similar to those found when using a standard static algorithm. It works well both when beginning with an already existing graph and in the fully streaming case when starting with no data. The dynamic approach is also faster than re-computation when low-latency updates are needed.
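For readers unfamiliar with seed set expansion, the static greedy baseline mentioned in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the speaker's algorithm: the choice of conductance as the quality measure, the function names `conductance` and `expand_seed_set`, and the toy graph `two_triangles` are all assumptions made for this example, and the dynamic (incremental) variant presented in the talk is not shown.

```python
def conductance(adj, community):
    """Cut edges divided by the smaller of the two side volumes (assumed fitness)."""
    cut = sum(1 for u in community for v in adj[u] if v not in community)
    vol_in = sum(len(adj[u]) for u in community)
    vol_out = sum(len(neighbors) for neighbors in adj.values()) - vol_in
    denom = min(vol_in, vol_out)
    return cut / denom if denom > 0 else 1.0


def expand_seed_set(adj, seeds):
    """Greedily add the neighboring vertex that lowers conductance the most.

    Stops as soon as no single neighboring vertex improves the score.
    """
    community = set(seeds)
    best = conductance(adj, community)
    while True:
        # Vertices adjacent to the community but not yet part of it.
        frontier = {v for u in community for v in adj[u]} - community
        candidate, candidate_score = None, best
        for v in sorted(frontier):  # sorted only to make tie-breaking deterministic
            score = conductance(adj, community | {v})
            if score < candidate_score:
                candidate, candidate_score = v, score
        if candidate is None:
            return community
        community.add(candidate)
        best = candidate_score


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
two_triangles = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(expand_seed_set(two_triangles, {0}))
```

Starting from vertex 0, the sketch grows the community until crossing the bridge edge would worsen the score. Every step recomputes conductance from scratch, which is the kind of re-computation cost that an incremental approach would aim to avoid when the graph changes.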
14:00 - 14:30
Parallel Mesh Partitioning with Balanced K-Means
Moritz von Looz (University of Cologne, Germany)
+ Abstract { "session": {"id":"sess178","title":"MS23 - High-Performance Graph Algorithms","date":"Tuesday, July 3rd 2018","begin_time":"13:30","end_time":"15:30","room":"Sydney Room","contributors":[{"type":"Session Chair","first_name":"Gerhard","last_name":"Wellein","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany"}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"symp131","type":"minisymposia","title":"MS23 - High-Performance Graph Algorithms","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"Graphs (or networks) are a very powerful abstraction of various phenomena that can be expressed as a relation between entities. For several decades researchers in theoretical computer science and discrete mathematics have been developing a wealth of graph theory and graph algorithms. Recently, however, we see a qualitative change in how graph algorithms are used in practice: The complex structure of graphs in new and emerging applications, the size of typical inputs, and the computer architectures on which graph problems are solved call for novel algorithms and hardware efficient approaches. As computer architectures and memory hierarchies are becoming more complex, increasingly parallel and heterogeneous it is important to develop parallel algorithms and tools with these specific hardware constraints in mind. The minisymposium thus aims to bring together experts in developing, implementing and using modern graph algorithms to address this issue. Existing algorithms and tools will be reviewed in terms of modern HPC architectures and novel hardware efficient approaches will be presented. 
The application areas for such techniques include sparse matrix partitioning\/coloring, and network graph analysis that are traditionally sequential as well as applications that need the extreme performance of emerging hardware architectures.","bio":"","contributors":[{"type":"Organizer","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Organizer","first_name":"Gerhard","last_name":"Wellein","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":true},{"type":"Organizer","first_name":"Georg","last_name":"Hager","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Gerhard","last_name":"Wellein","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":true}]},{"id":"msa297","type":"child","title":"Tracking Communities in Streaming Graphs","begin_time":"13:30","end_time":"14:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A variety of massive datasets, such as social networks and biological data, are represented as graphs that reveal underlying connections, trends, and anomalies. Community detection is the task of discovering dense groups of vertices in a graph. Its one specific form is seed set expansion, which finds the best local community for a given set of seed vertices. Greedy, agglomerative algorithms, which are commonly used in seed set expansion, have been previously designed only for a static, unchanging graph. However, in many applications, new data are constantly produced, and vertices and edges are inserted and removed from a graph. 
We present an algorithm for dynamic seed set expansion, which maintains a local community over time by incrementally updating as the underlying graph changes. We show that our dynamic algorithm outputs high-quality communities that are similar to those found when using a standard static algorithm. It works well both when beginning with an already existing graph and in the fully streaming case when starting with no data. The dynamic approach is also faster than re-computation when low latency updates are needed.","filename":"msa297s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"David","last_name":"Bader","affiliation":"Georgia Institute of Technology","country":"United States of America","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"David","last_name":"Bader","affiliation":"Georgia Institute of Technology","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"msa289","type":"child","title":"Parallel Mesh Partitioning with Balanced K-Means","begin_time":"14:00","end_time":"14:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Graph partitioning is an indispensable tool for efficient matrix and graph processing in distributed memory, balancing the computational load while minimizing communication. The required methods largely depend on the graph type: Numerical simulation meshes mostly have homogeneous degrees, high diameter and often spatial information, enabling geometric approaches. Complex networks have a low diameter, heterogeneous degrees and no useful spatial information. However, even for numerical simulation meshes, purely geometric approaches often suffer from unsatisfactory solution quality. We discuss two graph partitioners addressing these challenges: (i) ParHIP (Meyerhenke, Sanders, and Schulz), the parallel version of KaHIP, a graph partitioner for complex networks and meshes. 
In a multilevel process, it performs coarsening and local refinement based on size-constrained label propagation. As an example, using 512 cores, the resulting algorithm produces a high-quality partition of a web graph with 3.3G edges in 16 seconds; (ii) Geographer, the main focus of this presentation, is a new approach for mesh partitioning combining space-filling curves, balanced k-means and combinatorial local refinement. In experiments with meshes on up to 16384 processes, it scales well and relevant quality measures are often better than with ParHIP and ParMeTiS. The core of Geographer is a scalable version of k-means adapted to yield balanced clusters.","filename":"msa289s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Moritz","last_name":"von Looz","affiliation":"University of Cologne","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Charilaos","last_name":"Tzovas","affiliation":"University of Cologne","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Henning","last_name":"Meyerhenke","affiliation":"University of Cologne","country":"Germany","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Moritz","last_name":"von Looz","affiliation":"University of Cologne","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"msa278","type":"child","title":"Improvement of Graph Partitions Using the Graph p-Laplacian","begin_time":"14:30","end_time":"15:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. 
The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. The suggested approach exhibits significant improvements over both METIS and KaHIP for graphs originating from various application domains of graph partitioning, ranging from triangular Delaunay meshes to power networks. 
Particular emphasis is placed on the benefits of applying the p-Laplacian method on graphs emerging from social networks.","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"msa228","type":"child","title":"RACE: Recursive Algebraic Coloring Engine","begin_time":"15:00","end_time":"15:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Graph coloring is an important method used to parallelize sparse matrix kernels having inherent data-dependencies. Typical examples range from exact kernels like sparse matrix transpose vector (SpMTV), symmetric sparse matrix vector (SymmSpMV) to iterative solvers like Kaczmarz (KACZ) and Gauss-Seidel (GS). Most of the typical schemes currently available to parallelize such kernels suffer from performance issues on modern hardware or are highly matrix specific or require changes in the entire matrix storage format. We propose a novel method called RACE that achieves high hardware efficiency on modern multi-core architectures and at the same time uses simple storage formats like compressed row storage (CRS). The method used is a recursive level-based method that aims at finding optimal permutations while preserving good data locality. 
A thorough performance analysis shows that RACE out-performs traditional multi-coloring methods, Intel MKL implementations and even the recursive sparse block (RSB) implementation of SymmSpMV that uses tailored storage format for such operations.","bio":"","contributors":[{"type":"Author","first_name":"Christie Louis","last_name":"Alappat","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Gerhard","last_name":"Wellein","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Georg","last_name":"Hager","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Holger","last_name":"Fehske","affiliation":"University of Greifswald","country":"Germany","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christie Louis","last_name":"Alappat","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"msa289","type":"child","title":"Parallel Mesh Partitioning with Balanced K-Means","begin_time":"14:00","end_time":"14:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Graph partitioning is an indispensable tool for efficient matrix and graph processing in distributed memory, balancing the computational load while minimizing communication. The required methods largely depend on the graph type: Numerical simulation meshes mostly have homogeneous degrees, high diameter and often spatial information, enabling geometric approaches. Complex networks have a low diameter, heterogeneous degrees and no useful spatial information. 
However, even for numerical simulation meshes, purely geometric approaches often suffer from unsatisfactory solution quality. We discuss two graph partitioners addressing these challenges: (i) ParHIP (Meyerhenke, Sanders, and Schulz), the parallel version of KaHIP, a graph partitioner for complex networks and meshes. In a multilevel process, it performs coarsening and local refinement based on size-constrained label propagation. As an example, using 512 cores, the resulting algorithm produces a high-quality partition of a web graph with 3.3G edges in 16 seconds; (ii) Geographer, the main focus of this presentation, is a new approach for mesh partitioning combining space-filling curves, balanced k-means and combinatorial local refinement. In experiments with meshes on up to 16384 processes, it scales well and relevant quality measures are often better than with ParHIP and ParMeTiS. The core of Geographer is a scalable version of k-means adapted to yield balanced clusters.","filename":"msa289s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Moritz","last_name":"von Looz","affiliation":"University of Cologne","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Charilaos","last_name":"Tzovas","affiliation":"University of Cologne","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Henning","last_name":"Meyerhenke","affiliation":"University of Cologne","country":"Germany","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Moritz","last_name":"von Looz","affiliation":"University of Cologne","country":"Germany","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Moritz","last_name":"von Looz","affiliation":"University of 
Cologne","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Charilaos","last_name":"Tzovas","affiliation":"University of Cologne","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Henning","last_name":"Meyerhenke","affiliation":"University of Cologne","country":"Germany","bio":"","order":"3","is_presenter":false}] } Presentation
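The Parallel Mesh Partitioning abstract above describes the core of Geographer as a k-means variant adapted to yield balanced clusters. The following is a minimal sketch of that general idea only, not Geographer's actual algorithm: it assumes a simple greedy rule in which each cluster has a fixed capacity of ceil(n/k) points, and the function names (`balanced_assign`, `update_centers`, `balanced_kmeans`) are hypothetical.

```python
import math


def balanced_assign(points, centers):
    """Assign each point to the nearest center that still has room.

    Assumption for this sketch: every cluster may hold at most
    ceil(n / k) points, so cluster sizes differ by a bounded amount.
    """
    k = len(centers)
    capacity = math.ceil(len(points) / k)
    sizes = [0] * k
    assignment = [None] * len(points)
    # Visit all (point, center) pairs from closest to farthest.
    pairs = sorted(
        (math.dist(p, c), i, j)
        for i, p in enumerate(points)
        for j, c in enumerate(centers)
    )
    for _, i, j in pairs:
        if assignment[i] is None and sizes[j] < capacity:
            assignment[i] = j
            sizes[j] += 1
    return assignment


def update_centers(points, assignment, centers):
    """Move each center to the mean of its points (keep it if empty)."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for p, j in zip(points, assignment):
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return [
        tuple(s / counts[j] for s in sums[j]) if counts[j] else centers[j]
        for j in range(len(centers))
    ]


def balanced_kmeans(points, centers, iterations=10):
    """Alternate balanced assignment and center updates until stable."""
    assignment = balanced_assign(points, centers)
    for _ in range(iterations):
        centers = update_centers(points, assignment, centers)
        new_assignment = balanced_assign(points, centers)
        if new_assignment == assignment:
            break
        assignment = new_assignment
    return assignment, centers


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
assignment, centers = balanced_kmeans(points, [(0.0, 0.0), (10.0, 0.0)])
print(assignment)  # [0, 0, 0, 1, 1, 1]
```

Plain k-means would put the first four points into one cluster (sizes 4 and 2); the capacity limit pushes the point at x = 3 into the second cluster so that both clusters hold three points. A production partitioner would also need the space-filling-curve initialization and local refinement that the abstract mentions, which this sketch omits.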
Organizer(s):
Frank Jenko (Max Planck Institute for Plasma Physics, Germany)
Track(s):
Physics
This minisymposium will address exciting opportunities for plasma simulation in the pre-exascale era, addressing many issues which transcend plasma physics and are of interest to a wide audience. Plasma physics offers many opportunities to explore quintessential complex systems characterized by multi-scale and multi-physics problems. The underlying nonlinear integro-differential equations can only be solved with the help of supercomputers. Therefore, not surprisingly, many computational plasma scientists view the upcoming exascale era as the golden age, finally allowing them for the very first time to attack several long-standing fundamental questions, from turbulence to magnetic reconnection to dynamo action - as well as their self-consistent interactions. The goal of the present minisymposium is to present examples of how the computational plasma physics community is preparing for the exascale era. A particularly fascinating aspect of this theme can be put into the formula "big data meets computation." Handling massive amounts of data - before, during, and after a simulation - and combining experimental or observational data with simulation data are two key challenges in this context. Novel ideas along these lines will be presented.
14:30 - 15:00
Variable Precision: Making Every Bit Count
Jeffrey A. F. Hittinger (Lawrence Livermore National Laboratory, United States of America)
Abstract: Decades ago, when memory was a scarce resource, computational scientists routinely worked in single precision and were more sophisticated in dealing with the pitfalls of finite-precision arithmetic. Today, however, we typically compute and store results in 64-bit double precision by default even when very few significant digits are required. Many of these bits are representing errors instead of useful information. This over-allocation of resources is wasteful; we communicate and compute on many meaningless bits. At LLNL, we are developing the methods and tools that will enable the routine use of dynamically adjustable precision at a per-bit level depending on the needs of the task at hand. Our goal is to provide more or less precision as needed locally. Acceptance from the community requires that we address three concerns: that we can ensure accuracy, ensure efficiency, and ensure ease of use in development, debugging, and application. In this talk, I will discuss the benefits and the challenges of variable precision computing, highlighting aspects of our ongoing research in data representations, numerical algorithms, and testing and development tools. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Presentation: msa273s1.pdf
Session: MS24 - Plasma I: Exciting Opportunities for Plasma Simulation in the Pre-Exascale Era
Tuesday, July 3, 2018, 13:30 - 15:30, Osaka Room
Chair: Frank Jenko (IPP Max Planck, Germany)
Other talks in this session:
13:30 - 14:00
Design and Development of Particle-in-Cell Methods for Emerging Tensor Architectures
Stefano Markidis (KTH Royal Institute of Technology, Sweden)
Co-authors: Chaitanya Prasad, Steven Wei Der Chien, Erwin Laure (KTH Royal Institute of Technology, Sweden); Vyacheslav Olshevsky, Giovanni Lapenta (KU Leuven, Belgium)
Abstract: Several companies are developing specialized hardware to boost the performance of dense matrix low-precision computations, as the market for AI-based data analytics has grown considerably in the last decade. For instance, Google and NVIDIA designed the Tensor Processing Unit (TPU) and the Tensor Cores in Volta GPUs [1], respectively. The next pre-exascale machines, such as the Summit and Sierra supercomputers, will feature NVIDIA Volta Tensor Cores. However, it is still unclear how codes for plasma simulations will take advantage of tensor architectures. Widely used massively parallel Particle-in-Cell (PIC) models of plasmas are not yet capable of exploiting these new systems and need to be redesigned. Two main aspects have to be considered: first, the PIC algorithms have to be reformulated to use dense matrix multiplications; second, new algorithms have to cope with low-precision calculations while still retaining acceptable accuracy. In this talk, we review the emerging tensor architectures and propose algorithmic changes in PIC codes to exploit tensor hardware.
[1] S. Markidis, S.W.D. Chien, E. Laure, I.B. Peng, J.S. Vetter, NVIDIA Tensor Core Programmability, Performance & Precision, accepted for publication in AsHES'18 workshop at IPDPS 2018, 2018.
14:00 - 14:30
Vlasiator – Understanding Near-Earth Space in Six Dimensions
Minna Palmroth (University of Helsinki, Finland)
Co-authors: Urs Ganse, Markus Battarbee, Brito Thiago, Maxime Grandin, Yann Pfau-Kempf, Lucile Turc (University of Helsinki, Finland); Sebastian von Alfthan (CSC - IT Centre for Science, Finland)
Abstract: The constant flow of solar wind from our star, the Sun, builds the richest reachable plasma laboratory, with spatial and temporal scales not attainable in terrestrial laboratories. Plasma phenomena within the near-Earth space create space weather, referring to harmful effects that can endanger technological systems or human life in space. Space weather predictions are mostly at an empirical stage, while future forecasts will be based on numerical simulations. Up to now, large-scale space weather simulations have been based on a very simple theory assuming that plasma is a fluid. Vlasiator is a newly developed large-scale space physics model. Vlasiator's modelling targets are immense: to model the entire near-Earth space with a breakthrough resolution, using a description going far beyond the existing large-scale plasma simulations. Therefore, Vlasiator includes advanced high-performance computing techniques, from load balancing to highly scalable grids, to allow massively parallel computations. Due to the unprecedented accuracy at global scales, Vlasiator has been used to discover phenomena that no one thought would exist. The presentation introduces Vlasiator and some of the recent science results. Future application areas may include space weather, 6D fusion modelling, spacecraft instrument specification definitions, and use as a benchmark to test new facilities and architectures.
15:00 - 15:30
Towards a Virtual Fusion Facility on Exascale Supercomputers
Frank Jenko (Max Planck Institute for Plasma Physics, Germany)
Abstract: Building on remarkable advances throughout the last two decades or so, and in expectation of the emergence of exascale supercomputers, the fusion theory community has started to target its ultimate goal: the development of a validated predictive capability, aka a virtual fusion facility. This will constitute a milestone in fusion research, providing countless novel opportunities for optimizing the design and operation of fusion experiments and for accelerating the development of fusion energy. One crucial step in this direction is the creation of a backbone for a virtual plasma via the tight coupling of two scalable, cutting-edge gyrokinetic codes, GENE and XGC, addressing the physics in the core and boundary region, respectively. This work is carried out within the U.S. Exascale Computing Project. We will present mathematical and computational aspects of the coupling scheme as well as initial simulation results. Moreover, we will discuss general lessons learned in this ambitious attempt to couple two ab initio codes on pre-exascale machines, which are likely to be relevant to similar efforts in other areas of computational science.
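The Variable Precision abstract above argues that many of the bits in a 64-bit double carry no useful information when only a few significant digits are needed. As a rough illustration, and not a description of the LLNL tools, the sketch below zeroes the low-order mantissa bits of an IEEE 754 double and measures the relative error that results. The helper names `truncate_mantissa` and `relative_error` are hypothetical, and the sketch assumes normal, finite, nonzero inputs.

```python
import struct


def truncate_mantissa(x: float, keep_bits: int) -> float:
    """Keep only the top `keep_bits` of the 52 explicit mantissa bits."""
    if not 0 <= keep_bits <= 52:
        raise ValueError("keep_bits must be between 0 and 52")
    # Reinterpret the double as a 64-bit unsigned integer.
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    drop = 52 - keep_bits
    # Clear the `drop` lowest bits; sign and exponent stay untouched.
    mask = ~((1 << drop) - 1) & 0xFFFFFFFFFFFFFFFF
    (y,) = struct.unpack("<d", struct.pack("<Q", bits & mask))
    return y


def relative_error(x: float, keep_bits: int) -> float:
    """Relative error introduced by truncating x to keep_bits mantissa bits."""
    return abs(truncate_mantissa(x, keep_bits) - x) / abs(x)


value = 3.141592653589793
for keep_bits in (8, 16, 23, 52):
    print(keep_bits, truncate_mantissa(value, keep_bits), relative_error(value, keep_bits))
```

Because truncation only discards bits below the kept ones, the relative error for a normal number stays below 2^-keep_bits. If a result is needed to about three decimal digits, roughly ten mantissa bits would suffice under this simple model, which is the kind of saving the abstract has in mind.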
Organizer(s):
Sunita Chandrasekaran (University of Delaware, United States of America)
Track(s):
Emerging Application Domains, Physics
Heterogeneity has become very evident among HPC systems, and the trend continues to evolve rapidly. It involves systems equipped with hierarchical processors along with accelerators and memory/storage components, which are expected to facilitate the migration of an increasingly diverse set of scientific applications and thus meet the demands of a wide user community. However, to do so effectively, we need rich programming models and languages that can tap into the massive potential of these large-scale systems. This minisymposium addresses how widely popular parallel programming paradigms such as CUDA, OpenMP 4.5, OpenACC and Alpaka can be adapted for a variety of applications such as atomistic simulation, turbulence combustion, simulation of smoke propagation, and plasma physics. Using these real-world applications, the talks will explain how to balance performance and portability for a minimum of future work.
14:00 - 14:30
Zero Overhead Modern C++ for Mapping to Any Programming Model
Axel Huebl (Helmholtz-Zentrum Dresden-Rossendorf, Germany)
+ Abstract { "session": {"id":"sess194","title":"MS25 - Scientific Computing in times of MPI+X: Looking at Multiple \u201cX\u201d with regard to Performance and Portability","date":"Tuesday, July 3rd 2018","begin_time":"13:30","end_time":"15:30","room":"Nairobi Room","contributors":[{"type":"Session Chair","first_name":"Sunita","last_name":"Chandrasekaran","affiliation":"University of Delaware","country":""}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Emerging Application Domains","Physics"],"slots":[{"id":"symp148","type":"minisymposia","title":"MS25 - Scientific Computing in times of MPI+X: Looking at Multiple \u201cX\u201d with regard to Performance and Portability","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"Heterogeneity has been very evident among HPC systems and the trend only continues to rapidly evolve. Such a trend involves systems equipped with hierarchical processors along with accelerators and memory\/storage components that is expected to facilitate migration of an increasingly diverse set of scientific applications thus meeting the demands of a wide user community. However in order to do so, effectively, we need rich programming models and languages that can tap into the massive potential of these large scale systems. This minisymposium addresses how the widely popular parallel programming paradigms such as CUDA, OpenMP 4.5, OpenACC and Alpaka can be adapted for a variety of applications such as atomistic simulation, turbulence combustion, simulation of smoke propagation, and plasma physics. 
The talks will explain using these real world applications how to balance performance and portability for a minimum of future work.","bio":"","contributors":[{"type":"Organizer","first_name":"Sunita","last_name":"Chandrasekaran","affiliation":"University of Delaware","country":"United States of America","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Sunita","last_name":"Chandrasekaran","affiliation":"University of Delaware","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"msa313","type":"child","title":"Porting Physical Parameterizations from a Climate Model to Accelerators","begin_time":"13:30","end_time":"14:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"ICON (ICOsahedral Non-hydrostatic) is a climate and numerical weather prediction model being developed by the Max Planck Institute for Meteorology (MPI-M) and the German Weather Service (DWD). Together with MPI-M and DWD, MeteoSwiss, the Center for Climate Systems Modeling (C2SM\/ETH), and the Swiss National Supercomputing Center (CSCS) are porting ICON to GPUs and many-core architectures. Within the model, physical parameterizations calculate the collective effect of physical phenomena which occur on a sub-grid scale. We suggest multiple directive-based approaches of porting these parameterizations to accelerators, such as using the OpenACC standard or the CLAW source-to-source translator. Allowing the retention of a single Fortran code, directive approach can offer a high degree of performance portability. Using the FortranTestGenerator tool for automatic unit test generation for Fortran subroutines, the turbulence parameterization is isolated in a testbed subset of the model, so that subsequent changes can be easily validated. Within our talk we accentuate the challenges of porting CPU cache memory bound programs to GPUs. 
Tool-based analysis of loop kernels is used to estimate attainable performance on various platforms, in particular i86-based multi-core CPUs as well as NVIDIA GPUs. The validated turbulence parameterization, running within a testbed framework, can be integrated into the overall ICON model.","bio":"","contributors":[{"type":"Author","first_name":"Thomas","last_name":"K\u00f6ster","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"William","last_name":"Sawyer","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Gerhard","last_name":"Wellein","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Xavier","last_name":"Lapillonne","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Thomas","last_name":"K\u00f6ster","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"William","last_name":"Sawyer","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"msa247","type":"child","title":"Zero Overhead Modern C++ for Mapping to Any Programming Model","begin_time":"14:00","end_time":"14:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Towards exascale computing, today\u0027s HPC systems have become heterogeneous and diverse. 
Accounting for both host and accelerator, the TOP10 supercomputers in 11\/2017 alone provided as much as 11 different computing architectures. On top of the hardware follow the accompanying programming models: from directive based, implicit and explicit descriptions up to task-based. Scientific code developers are facing a tough choice as commitment to a specific hardware and\/or programming model narrows down potential target systems. With limited development resources but usually multi-decade long project lifetimes, maintaining multiple implementations of the same algorithms to widen platform support is unfeasible for most teams. Alpaka is a standard C++, compile-time meta-programming library providing a unified, explicit, parallel programming model. On typical MPI+X parallelized applications, Alpaka enables developers to describe shared-memory, in-node parallelism. Zero-overhead abstraction is achieved by compile-time specializing C++ templates to native backends (e.g. CUDA, OpenMP, TBB, ...). Alpaka stays with modern C++ as a standardized, widely supported language without introducing pre-processor or pragma-based annotations to the user directly. It naturally allows inlining, kernel fusion and code-reuse on a single-source programming paradigm. 
With such, abstractions and control within the final software stack are achievable without duplicating implementations leading to a maintainable code base even for large applications.","filename":"msa247s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Axel","last_name":"Huebl","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alexander","last_name":"Matthes","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Benjamin","last_name":"Worpitz","affiliation":"LogMeIn Inc.","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Erik","last_name":"Zenker","affiliation":"LogMeIn Inc.","country":"Germany","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Ren\u00e9","last_name":"Widera","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Guido","last_name":"Juckeland","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Michael","last_name":"Bussmann","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Axel","last_name":"Huebl","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"msa215","type":"child","title":"Porting Quantum ESPRESSO to GPUs - Lessons Learnt and Remaining Challenges","begin_time":"14:30","end_time":"15:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Quantum ESPRESSO is a very popular open-source suite of codes for electronic-structure calculations and materials 
modelling at the nanoscale. PWscf package is the main package and focus of this talk. The aim of this talk is to present the multi-year journey to allow Quantum ESPRESSO to exploit NVIDIA GPU accelerators. The first GPU porting was done in CUDA C back in 2012, and a new version based on CUDA Fortran and CUF kernel was developed and released during the last year by Nvidia and Filippo Spiga. Based on these two experiences, a new effort to embed specific accelerated kernels in the main repository is ongoing. We will present the envisaged strategy for the integration of the accelerated code and discuss the lessons learnt developing and maintaining a continuously evolving complex Fortran community code. The talk will also review how the porting is done, various tricks to integrate in the same code base both the CPU and the GPU code path, integration with libraries and the comparison between CUF kernels and OpenACC directives for selected kernels.","bio":"","contributors":[{"type":"Author","first_name":"Pietro","last_name":"Bonf\u00e0","affiliation":"CINECA","country":"Italy","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Fabio","last_name":"Affinito","affiliation":"CINECA","country":"Italy","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Carlo","last_name":"Cavazzoni","affiliation":"CINECA","country":"Italy","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Pietro","last_name":"Bonf\u00e0","affiliation":"CINECA","country":"Italy","bio":"","order":"1","is_presenter":true}]},{"id":"msa167","type":"child","title":"OpenMP 4.5 Acceleration for Turbulence Simulations on GPUs","begin_time":"15:00","end_time":"15:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Optimal use of GPUs in a heterogeneous computing environment requires careful consideration of data movement and (usually) how operations on the CPU and attached GPU(s) 
can be overlapped as much as possible. Further complexities also arise if the base algorithm requires substantial communication among a large number of MPI parallel processes. We have devised a successful OpenMP 4.5 application that solves 3D advection-diffusion equations for mixing of a scalar (concentration) field transported in a turbulent fluid flow, with a 5X GPU-to-CPU speedup on 8192 nodes of the Cray XK7 Titan machine at Oak Ridge National Laboratory, USA. To minimize data movements between CPU and GPU memory spaces, the entire memory space required for the scalar field is transferred to the GPUs, where most of the computations are performed. Computational loops that perform compact finite difference calculations are accelerated using OpenMP 4.X constructs, where in some instances a change from the default memory layout is beneficial. Scalability is improved by overlapping computation on the GPUs with (i) communication on the CPUs and (ii) data movement between the CPUs and GPUs using the latest tasking capabilities added to the TARGET constructs in OpenMP 4.5 (e.g., the DEPEND and NOWAIT clauses).","bio":"","contributors":[{"type":"Author","first_name":"Matthew P.","last_name":"Clay","affiliation":"Georgia Institute of Technology","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Dhawal","last_name":"Buaria","affiliation":"Max Planck Institute for Dynamics and Self Organization","country":"Germany","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"P. 
K.","last_name":"Yeung","affiliation":"Georgia Institute of Technology","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dhawal","last_name":"Buaria","affiliation":"Max Planck Institute for Dynamics and Self Organization","country":"Germany","bio":"","order":"2","is_presenter":true}]}]}, "slot": {"id":"msa247","type":"child","title":"Zero Overhead Modern C++ for Mapping to Any Programming Model","begin_time":"14:00","end_time":"14:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Towards exascale computing, today\u0027s HPC systems have become heterogeneous and diverse. Accounting for both host and accelerator, the TOP10 supercomputers in 11\/2017 alone provided as much as 11 different computing architectures. On top of the hardware follow the accompanying programming models: from directive based, implicit and explicit descriptions up to task-based. Scientific code developers are facing a tough choice as commitment to a specific hardware and\/or programming model narrows down potential target systems. With limited development resources but usually multi-decade long project lifetimes, maintaining multiple implementations of the same algorithms to widen platform support is unfeasible for most teams. Alpaka is a standard C++, compile-time meta-programming library providing a unified, explicit, parallel programming model. On typical MPI+X parallelized applications, Alpaka enables developers to describe shared-memory, in-node parallelism. Zero-overhead abstraction is achieved by compile-time specializing C++ templates to native backends (e.g. CUDA, OpenMP, TBB, ...). Alpaka stays with modern C++ as a standardized, widely supported language without introducing pre-processor or pragma-based annotations to the user directly. It naturally allows inlining, kernel fusion and code-reuse on a single-source programming paradigm. 
With such, abstractions and control within the final software stack are achievable without duplicating implementations leading to a maintainable code base even for large applications.","filename":"msa247s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Axel","last_name":"Huebl","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alexander","last_name":"Matthes","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Benjamin","last_name":"Worpitz","affiliation":"LogMeIn Inc.","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Erik","last_name":"Zenker","affiliation":"LogMeIn Inc.","country":"Germany","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Ren\u00e9","last_name":"Widera","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Guido","last_name":"Juckeland","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Michael","last_name":"Bussmann","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Axel","last_name":"Huebl","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Axel","last_name":"Huebl","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alexander","last_name":"Matthes","affiliation":"Helmholtz-Zentrum 
Dresden-Rossendorf","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Benjamin","last_name":"Worpitz","affiliation":"LogMeIn Inc.","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Erik","last_name":"Zenker","affiliation":"LogMeIn Inc.","country":"Germany","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Ren\u00e9","last_name":"Widera","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Guido","last_name":"Juckeland","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Michael","last_name":"Bussmann","affiliation":"Helmholtz-Zentrum Dresden-Rossendorf","country":"Germany","bio":"","order":"7","is_presenter":false}] } Presentation
Organizer(s):
Alfio Lazzaro (University of Zurich, Switzerland)
Edgar Solomonik (University of Illinois Urbana-Champaign, United States of America)
Juerg Hutter (University of Zurich, Switzerland)
Track(s):
Computer Science and Applied Mathematics, Chemistry and Materials
Tensor algebra operations are ubiquitous in domains including data analytics, machine learning, engineering, and science, and the use of tensor methods in these domains has grown tremendously in the last ten years. High-performance implementations and parallelization of the tensor algebra operations underlying these methods require considerations beyond the standard techniques used in linear algebra. A further key challenge is the development of effective abstractions and libraries for tensor algebra, as no standard interface or library (comparable to LAPACK for linear algebra) has been established in the scientific community. The presentations in this minisymposium will cover challenges faced in the development of four major tensor software library efforts. These libraries focus on sparse and dense tensor contractions: three target distributed memory (Cyclops, NWChem, and DBCSR), and a fourth, TACO, targets shared memory. These efforts have been application-driven, in particular by the tensor problems prevalent in electronic structure calculations. The minisymposium will serve as a forum to discuss the current state of the art of tensor libraries, to identify possible synergies between the projects, and to explore the possibility of building a common interface for tensor algebra operations.
13:30 - 14:00
Parallel Tensor Computations in Python or C++ Using Cyclops
Edgar Solomonik (University of Illinois Urbana-Champaign, United States of America)
Tensor algebra provides a mathematical language prevalent in numerous domains of computing, including computational chemistry, machine learning, and quantum information. The Cyclops library provides high-performance algorithms for fundamental operations (summation, contraction, factorization, slicing, reshaping, etc.) on dense or sparse tensors. The library uses distributed storage via MPI and executes each tensor operation bulk synchronously. Algebraic tensor operations are specified via a high-level Einstein summation syntax, accessible as a standalone library in C++ or Python. We will describe the methods Cyclops uses to achieve high performance, including communication-avoiding algorithms for (sparse) matrix multiplication, optimized transposition/redistribution routines, and runtime mapping/algorithm selection based on trainable performance models. The library has been used to break computational frontiers in electronic structure calculations and quantum circuit simulation, with further efforts on Cyclops applications for graph analysis, neural networks, and multilevel optimization underway.
Presentation: msa200s1.pdf
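As an illustration of the Einstein summation notation mentioned in this abstract, the following is a minimal pure-Python sketch of a contraction between two sparse tensors. It is not the Cyclops API: the function name `contract`, the dictionary-based tensor storage, and the naive pairwise loop are assumptions made for illustration only, and nothing here is distributed or optimized. It also assumes that no index label is repeated within a single operand.

```python
from collections import defaultdict


def contract(spec, a, b):
    """Contract two sparse tensors stored as {index_tuple: value} dicts.

    `spec` uses Einstein notation such as "ij,jk->ik": index labels that
    do not appear in the output are summed over. Hypothetical helper,
    not part of any tensor library.
    """
    inputs, out = spec.split("->")
    idx_a, idx_b = inputs.split(",")
    result = defaultdict(float)
    for key_a, val_a in a.items():
        bind_a = dict(zip(idx_a, key_a))
        for key_b, val_b in b.items():
            bind_b = dict(zip(idx_b, key_b))
            # Labels shared by both operands must take the same value.
            shared = bind_a.keys() & bind_b.keys()
            if any(bind_a[s] != bind_b[s] for s in shared):
                continue
            merged = {**bind_a, **bind_b}
            result[tuple(merged[s] for s in out)] += val_a * val_b
    return dict(result)


# Sparse 2x2 matrices: only nonzero entries are stored.
A = {(0, 0): 1.0, (0, 1): 2.0, (1, 1): 3.0}
B = {(0, 0): 4.0, (1, 0): 5.0, (1, 1): 6.0}
C = contract("ij,jk->ik", A, B)  # matrix product written as a contraction
```

The same specification string covers other operations: for example, `"i,i->"` would express an inner product, with the scalar stored under the empty index tuple. A production library replaces the quadratic loop over nonzero pairs with index matching, data redistribution, and calls to optimized matrix multiplication.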
14:00 - 14:30
Tensor Transposition and Contraction on GPUs
Ponnuswamy Sadayappan (Ohio State University, United States of America)
+ Abstract { "session": {"id":"sess195","title":"MS26 - Tensor Algebra Computation: Implementations and Applications","date":"Tuesday, July 3rd 2018","begin_time":"13:30","end_time":"15:30","room":"Boston 3 Room","contributors":[{"type":"Session Chair","first_name":"Alfio","last_name":"Lazzaro","affiliation":"University of Zurich","country":""}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Computer Science and Applied Mathematics","Chemistry and Materials"],"slots":[{"id":"symp129","type":"minisymposia","title":"MS26 - Tensor Algebra Computation: Implementations and Applications","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"Tensor algebra operations are ubiquitous in domains including data analytics, machine learning, engineering, and science. Moreover, the use of tensor methods in these domains has grown tremendously in the last ten years. High-performance implementations and parallelization of tensor algebra operations underlying these methods require considerations beyond standard techniques used in linear algebra. A further key challenge is the development of effective abstractions and libraries for tensor algebra, as no standard interface or library (like LAPACK) has been established in the scientific community. The presentations in this minisymposium will cover challenges faced in the development of four distinct major tensor software library efforts. These libraries focus on sparse and dense tensor contractions, with three targeting distributed memory: Cyclops, NWChem, and DBCSR, and a fourth, TACO, targeting shared memory. These efforts have been application-driven, in particular by the tensor problems prevalent in electronic structure calculations. 
The minisymposium will serve as a discussion on the current state of the art of the tensor libraries in order to understand possible synergy between the projects and with the possibility to build a common interface for tensor algebra operations.","bio":"","contributors":[{"type":"Organizer","first_name":"Alfio","last_name":"Lazzaro","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Organizer","first_name":"Edgar","last_name":"Solomonik","affiliation":"University of Illinois Urbana-Champaign","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Organizer","first_name":"Juerg","last_name":"Hutter","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Alfio","last_name":"Lazzaro","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa200","type":"child","title":"Parallel Tensor Computations in Python or C++ Using Cyclops","begin_time":"13:30","end_time":"14:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Tensor algebra provides a mathematical language prevalent in numerous domains of computing, including computational chemistry, machine learning, and quantum information. The Cyclops library provides high-performance algorithms for fundamental operations (summation, contraction, factorization, slicing, reshaping, etc.) on dense or sparse tensors. The library uses distributed storage via MPI and executes each tensor operation bulk synchronously. Algebraic tensor operations are specified via a high-level Einstein summation syntax, accessible as a standalone library in C++ or Python. 
We will describe the methods Cyclops uses to achieve high-performance, including communication-avoiding algorithms for (sparse) matrix multiplication, optimized transposition\/redistribution routines, and runtime mapping\/algorithm selection based on trainable performance models. The library has been used to break computational frontiers in electronic structure calculations and quantum circuit simulation, with further efforts on Cyclops applications for graph analysis, neural networks, and multilevel optimization underway.","filename":"msa200s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Edgar","last_name":"Solomonik","affiliation":"University of Illinois Urbana-Champaign","country":"United States of America","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Edgar","last_name":"Solomonik","affiliation":"University of Illinois Urbana-Champaign","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"msa164","type":"child","title":"Tensor Transposition and Contraction on GPUs","begin_time":"14:00","end_time":"14:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A large\u00a0variety of contractions\u00a0involving tensors of different dimensionalities and index combinations represent the most computationally demanding components for several models in the NWChem computational chemistry suite. We discuss a library for implementing arbitrary dense tensor contractions on GPUs, using a lower level library for tensor transposition. For a given tensor contraction, there often exist many different\u00a0choices of intermediate transposed tensors\u00a0that enable\u00a0efficient vendor libraries for matrix multiplication\u00a0(eg., cuBLAS) to be used to perform the tensor contraction. Performance models are\u00a0used\u00a0to enable choice between alternatives. 
The effectiveness of the library on tensor contractions for the CCSD(T) coupled cluster method will be discussed.","filename":"msa164s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jinsung","last_name":"Kim","affiliation":"Ohio State University","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Sriram","last_name":"Krishnamoorthy","affiliation":"Pacific Northwest National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Aravind","last_name":"Sukumaran-Rajam","affiliation":"Ohio State University","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ponnuswamy","last_name":"Sadayappan","affiliation":"Ohio State University","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ponnuswamy","last_name":"Sadayappan","affiliation":"Ohio State University","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"msa174","type":"child","title":"Extending the DBCSR Library to Sparse Tensor Linear Algebra for Electronic Structure Methods beyond Density Functional Theory","begin_time":"14:30","end_time":"15:00","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Advanced algorithms for large-scale electronic structure calculations are mostly based on processing multi-dimensional sparse data. Examples are sparse matrix-matrix multiplications in linear-scaling Kohn-Sham calculations or the efficient determination of the exact exchange energy. When going beyond mean field approaches, e.g. for Moller-Plesset perturbation theory, RPA and Coupled Cluster methods, or the GW methods, it becomes necessary to manipulate higher-order sparse tensors. 
Very similar problems are also encountered in other domains, like signal processing, data mining, computer vision, and machine learning. Our\u00a0project is concerned with the development of such a tensor library.\u00a0The starting point of the project is the realization that most tensor operations can be mapped on matrix multiplications. We can therefore base the development on the already existing domain library DBCSR, a distributed block sparse matrix multiplication library. DBCSR\u00a0has a multi-layered structure that automatically takes care of and optimizes several computational aspects like parallelism (MPI, OpenMP, CUDA), data (cache) locality and on-the-fly filtering.\u00a0In this presentation, we describe the status of the library development, the implemented functionalities,\u00a0and the API in Fortran and C\/C++. Then, we report on a comparison with other libraries available in the community, in terms of functionalities and performance.","filename":"msa174s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Alfio","last_name":"Lazzaro","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Juerg","last_name":"Hutter","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Seewald","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ilia","last_name":"Sivkov","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Alfio","last_name":"Lazzaro","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa162","type":"child","title":"The Tensor Algebra 
Compiler","begin_time":"15:00","end_time":"15:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The Tensor Algebra Compiler (taco) automatically generates kernels to compute tensor and linear algebra expressions on both dense and sparse data. This frees application and library developers from hand-coding these kernels. The generated sparse kernels have excellent performance and match hand-coded kernels where these are available, while generalizing to an uncountable number of other kernels. Ref: www.tensor-compiler.org.","filename":"msa162s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Fredrik","last_name":"Kjolstad","affiliation":"Massachusetts Institute of Technology","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Shoaib","last_name":"Kamil","affiliation":"Adobe Research","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Stephen","last_name":"Chou","affiliation":"Massachusetts Institute of Technology","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Lugato","affiliation":"CEA","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Saman","last_name":"Amarasinghe","affiliation":"Massachusetts Institute of Technology","country":"United States of America","bio":"","order":"5","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Saman","last_name":"Amarasinghe","affiliation":"Massachusetts Institute of Technology","country":"United States of America","bio":"","order":"5","is_presenter":true}]}]}, "slot": {"id":"msa164","type":"child","title":"Tensor Transposition and Contraction on GPUs","begin_time":"14:00","end_time":"14:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A large\u00a0variety of 
contractions\u00a0involving tensors of different dimensionalities and index combinations represent the most computationally demanding components for several models in the NWChem computational chemistry suite. We discuss a library for implementing arbitrary dense tensor contractions on GPUs, using a lower level library for tensor transposition. For a given tensor contraction, there often exist many different\u00a0choices of intermediate transposed tensors\u00a0that enable\u00a0efficient vendor libraries for matrix multiplication\u00a0(eg., cuBLAS) to be used to perform the tensor contraction. Performance models are\u00a0used\u00a0to enable choice between alternatives. The effectiveness of the library on tensor contractions for the CCSD(T) coupled cluster method will be discussed.","filename":"msa164s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jinsung","last_name":"Kim","affiliation":"Ohio State University","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Sriram","last_name":"Krishnamoorthy","affiliation":"Pacific Northwest National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Aravind","last_name":"Sukumaran-Rajam","affiliation":"Ohio State University","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ponnuswamy","last_name":"Sadayappan","affiliation":"Ohio State University","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ponnuswamy","last_name":"Sadayappan","affiliation":"Ohio State University","country":"United States of America","bio":"","order":"4","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Jinsung","last_name":"Kim","affiliation":"Ohio State University","country":"United States of 
America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Sriram","last_name":"Krishnamoorthy","affiliation":"Pacific Northwest National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Aravind","last_name":"Sukumaran-Rajam","affiliation":"Ohio State University","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ponnuswamy","last_name":"Sadayappan","affiliation":"Ohio State University","country":"United States of America","bio":"","order":"4","is_presenter":true}] } Presentation
14:30 - 15:00
Extending the DBCSR Library to Sparse Tensor Linear Algebra for Electronic Structure Methods beyond Density Functional Theory
Alfio Lazzaro (University of Zurich, Switzerland)
Advanced algorithms for large-scale electronic structure calculations are mostly based on processing multi-dimensional sparse data. Examples are sparse matrix-matrix multiplications in linear-scaling Kohn-Sham calculations or the efficient determination of the exact exchange energy. When going beyond mean field approaches, e.g. for Moller-Plesset perturbation theory, RPA and Coupled Cluster methods, or the GW methods, it becomes necessary to manipulate higher-order sparse tensors. Very similar problems are also encountered in other domains, like signal processing, data mining, computer vision, and machine learning. Our project is concerned with the development of such a tensor library. The starting point of the project is the realization that most tensor operations can be mapped on matrix multiplications. We can therefore base the development on the already existing domain library DBCSR, a distributed block sparse matrix multiplication library. DBCSR has a multi-layered structure that automatically takes care of and optimizes several computational aspects like parallelism (MPI, OpenMP, CUDA), data (cache) locality and on-the-fly filtering. In this presentation, we describe the status of the library development, the implemented functionalities, and the API in Fortran and C/C++. Then, we report on a comparison with other libraries available in the community, in terms of functionalities and performance.
Co-Author(s):
Juerg Hutter (University of Zurich, Switzerland), Patrick Seewald (University of Zurich, Switzerland), Ilia Sivkov (University of Zurich, Switzerland)
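The DBCSR abstract states that most tensor operations can be mapped on matrix multiplications. As a minimal sketch of that idea only (plain Python on small dense nested lists, not the DBCSR API, and with function names invented for this illustration), the contraction C[i][l] = sum over j,k of A[i][j][k] * B[j][k][l] can be computed by fusing the contracted indices (j, k) into a single index and calling an ordinary matrix product:

```python
from itertools import product


def contract_direct(A, B):
    """C[i][l] = sum over j, k of A[i][j][k] * B[j][k][l], with explicit loops."""
    I, J, K, L = len(A), len(A[0]), len(A[0][0]), len(B[0][0])
    C = [[0] * L for _ in range(I)]
    for i, j, k, l in product(range(I), range(J), range(K), range(L)):
        C[i][l] += A[i][j][k] * B[j][k][l]
    return C


def matmul(X, Y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def contract_via_matmul(A, B):
    """Same contraction, done by fusing (j, k) into one index and calling matmul."""
    # A: (I, J, K) -> matrix of shape (I, J*K); row i lists A[i][j][k] in (j, k) order.
    A_mat = [[a for row in slab for a in row] for slab in A]
    # B: (J, K, L) -> matrix of shape (J*K, L); rows follow the same (j, k) order.
    B_mat = [row for slab in B for row in slab]
    return matmul(A_mat, B_mat)


# Hypothetical example data, chosen only to make the result easy to check by hand.
A = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # shape (2, 2, 2)
B = [[[1, 0], [0, 1]], [[2, 0], [0, 2]]]  # shape (2, 2, 2)
assert contract_direct(A, B) == contract_via_matmul(A, B)
```

A production library would additionally have to handle sparsity, blocking and data distribution, none of which this sketch attempts.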
15:00 - 15:30
The Tensor Algebra Compiler
Saman Amarasinghe (Massachusetts Institute of Technology, United States of America)
The Tensor Algebra Compiler (taco) automatically generates kernels to compute tensor and linear algebra expressions on both dense and sparse data. This frees application and library developers from hand-coding these kernels. The generated sparse kernels have excellent performance and match hand-coded kernels where these are available, while generalizing to an uncountable number of other kernels. Ref: www.tensor-compiler.org.
Co-Author(s):
Fredrik Kjolstad (Massachusetts Institute of Technology, United States of America), Shoaib Kamil (Adobe Research, United States of America), Stephen Chou (Massachusetts Institute of Technology, United States of America), David Lugato (CEA, France)
15:30 - 16:00
Coffee Break
Foyer 2nd Floor
16:00 - 18:00
Minisymposia Session IV
Organizer(s):
Georgia Tourassi (Oak Ridge National Laboratory, United States of America)
Simon Scheidegger (University of Zurich, Switzerland)
Track(s):
Emerging Application Domains
The healthcare sector is clearly experiencing a data revolution. With advances in digital health records, genomic sequencing, the burgeoning growth of social networks and media for community health, and the emerging "app" market for health-related mobile and web-enabled applications, there is tremendous access to private and public data for advancing precision medicine and improving population health. The ability to leverage these datasets for translational value across the continuum of basic, preclinical, and clinical science will be critical for addressing emerging personalized and population healthcare challenges effectively and in a timely manner. At the same time, artificial intelligence continues to advance in biomedicine. However, there are outstanding questions about how AI can provide actionable clinical insights. We will bring together a community of biomedical researchers and computer scientists to present the latest advances and to discuss successes, challenges, and next frontiers in health intelligence. The overarching goal of the symposium is to highlight health informatics applications and related methodological advances that involve heterogeneous biomedical data (e.g., imaging, genomic, text, and sensor data) while emphasizing the current and emerging challenges of ensuring their translational value and broad population impact.
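MS28 - Advances in Automation and Efficiency for the Exascale Era - Experiences from the Biomolecular Sciences
Samarkand Room
Organizer(s):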
Rossen Apostolov (KTH Royal Institute of Technology, Sweden)
Track(s):
Life Sciences, Chemistry and Materials
Life Sciences have become crucially dependent on software for analysis of experimental data, systems modelling and simulation, and data integration across various repositories and databases. The dramatic increase in available tools has enabled scientists to perform ever more complex studies while taking advantage of modern high-end (HPC and HTC) compute facilities. Experimental facilities are producing staggering amounts of data, which has led to the rapid development of novel data analytics and machine learning techniques. We are at a stage where additional focus is needed on improving the interoperability of software applications, enabling better coupling of tools with data sources, and devising efficient workflows and libraries for the upcoming Exascale era in HPC. Such advances will considerably improve the productivity of researchers and allow them to address novel scientific problems. This minisymposium brings together invited experts from two leading institutions in the field: BioExcel, the European Center of Excellence for Computational Biomolecular Research (www.bioexcel.eu), and MolSSI, the Molecular Sciences Software Institute (www.molssi.org) in the US, to discuss advances in this important field. After the talks we will hold a round-table discussion on "Simulations at Exascale - myth or reality?".
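16:00 - 16:30
Building Blocks for Adaptive Workflows
Shantenu Jha (Rutgers University, United States of America)
Abstract: Next-generation exascale systems will fundamentally expand the reach of biomolecular simulations and the resulting scientific insight, enabling the simulation of larger biological systems (weak scaling), longer timescales (strong scaling), more complex molecular interactions, and robust uncertainty quantification (more accurate sampling). Solving biological problems that require longer timescales, involve more complex interactions and need robust uncertainty quantification will require significant algorithmic improvements that incorporate high-level parallelism and leverage the statistical nature of molecular processes. Interestingly, many such simulation algorithms require adaptive workflows. We argue for workflow systems that use a building-blocks approach to support adaptive workflows on extreme-scale heterogeneous and dynamic resources. We present RADICAL-Cybertools as an implementation of the building-block concept and discuss how they are being used to support a wide range of adaptive workflows in biomolecular simulations.
16:30 - 17:00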
Facing Compute Platform Portability Challenges with Scientific Workflows - Experiences from Common Workflow Language
Stian Soiland-Reyes (University of Manchester, United Kingdom)
Abstract: Scientific Workflow systems are well established for computational analysis in all science domains, following the rapid development of workflow technology and community practices over the past two decades, the eScience era. Workflow systems have gained traction in the era of Big Data Science due to their "ASAP properties" (Bertram Ludascher, WORKS 2015 Keynote): Automation over repetitive pipelines and simulation sweep campaigns; Scaling over computational infrastructure and handling large data; Abstraction to shield users and programs from complexity and incompatibilities; and Provenance to auto-document execution logs and data lineage for future analysis. A major hindrance to wider adoption and reuse of workflows, even when open source, is that they are written for specific workflow systems or infrastructures. Common Workflow Language (CWL) has emerged as a community initiative with support across a range of existing workflow engines, using a language specification that focuses on the common denominator of command line tools exchanging files. Support for CWL on HPC has expanded in recent months, for example with IBM's CWLEXEC on LSF, or Toil with Singularity. In this talk we will present the challenges of moving CWL workflows towards Exascale, while retaining key features of workflows such as reproducibility, interoperability, usability and provenance.
17:00 - 17:30
Workflow Automation and Efficiency for Macromolecular Simulations and Screening
Adam Hospital (Institute for Research in Biomedicine, Spain)
Abstract: Life science is one of the largest and fastest growing communities in terms of needs for high-end computing. Biological studies usually require an integration of different computational approaches, defining complex, automated multi-step analysis workflows with inter-dependent steps, including CPU-intensive tasks generating large amounts of data. The number and diversity of tasks to be integrated, together with the short lifetime and fast turnover of computer codes and life sciences-related methods, make standardization of these workflows an extremely challenging task. BioExcel CoE has been working, together with the Elixir project, on putting forward a set of best practices to develop, document and describe life sciences workflows, following the FAIR principles: Findability, Accessibility, Interoperability and Reusability. Examples of the first workflow prototypes implemented following this approach (automatic modeling of protein mutations and virtual screening), illustrating the benefits of the introduced best practices, will be presented.
17:30 - 18:00
Round-Table Discussion: Simulations at Exascale - Myth or Reality?
Rossen Apostolov (KTH Royal Institute of Technology, Sweden)
Abstract: Exascale supercomputers seem to be around the corner. Producing them will no doubt be a real challenge, considering issues with processor design, power consumption and so on, but engineers are confident about their delivery within a few years. Software applications in the life sciences and beyond are capable of running at peta-scale in the HPC/HTC regime, but are they ready for the next-level push? When the Exa-machines come, will there be simulation engines and job dispatchers able to orchestrate billions of cores? Will researchers be able to tackle major scientific problems and deliver amazing discoveries that are unattainable at lower computing scale? How well prepared are the communities? We have invited Prof. Erik Lindahl, Lead Scientist of BioExcel, the European Center of Excellence for Computational Biomolecular Research (www.bioexcel.eu), and Prof. Daniel Crawford, Director of MolSSI, the Molecular Sciences Software Institute (www.molssi.org), USA, together with leading experts in the field, Shantenu Jha (MolSSI), Adam Hospital and Stian Soiland-Reyes (BioExcel), to address these questions and try to understand what is needed to improve the interoperability of software applications, enable better coupling of tools with data sources, develop efficient libraries and devise user-friendly and extensible workflows/pipelines for the upcoming Exascale era in HPC.
MS29 - Advances in Computational Geosciences, Part II
Darwin Room
Organizer(s):
Ebru Bozdag (Colorado School of Mines, United States of America)
Dimitri Komatitsch (CNRS, France)
Track(s):
Computer Science and Applied Mathematics, Solid Earth Dynamics, Physics
Recent advances in theory and numerical methods, together with the availability of high-quality massive data sets and high-performance computing, provide unprecedented opportunities to improve our understanding of Earth’s interior and its mechanisms. The goal of this session is to bring computational and Earth scientists together to discuss the current status, challenges and future directions in computational geosciences, highlighting numerical simulations, state-of-the-art HPC applications and their scientific outcomes. Contributions include, but are not limited to, earthquake engineering, passive and active-source seismic imaging, geodynamical modelling and magneto-fluid dynamics, in conjunction with computational approaches such as numerical solvers, large-scale workflows, big data and optimisation strategies on HPC systems.
16:00 - 16:30
Simulating the Solid Earth and Planets over Billions of Years: From Magma Oceans to Plate Tectonics to Exoplanets
Paul J. Tackley (ETH Zurich, Switzerland)
Abstract:
The coupled system of plate tectonics and convection of the Earth’s solid mantle is the driver of geological change on our planet, including continental drift, volcanoes, earthquakes, crustal production, atmospheric degassing and recycling, and cooling of the core, which drives the geodynamo. Modelling this process is challenging due to the wide range of length scales (from faults to continents) and time scales (seconds to billions of years) and the complex rheology of rocks, which exhibit visco-elasto-plastic behaviour with strongly temperature-dependent viscosity varying by orders of magnitude over short length scales. Nevertheless, it is now routine to perform global-scale 3-D spherical simulations that span the 4.5 billion year age of our planet and contain complex effects such as partial melting and crustal production and solid-solid phase transitions. StagYY is one of the leading codes for performing such simulations. It uses a finite-volume discretization on a yin-yang spherical grid. Both built-in geometric multigrid, and the range of solvers available through PETSc, can be used. Here, technical details will be discussed, including recent enhancements made with PASC funding such as use of hybrid (GPU-CPU) architectures. Some recent scientific results published in Science and Nature will then be summarised.
16:30 - 17:00
Dynamic Viability of Earthquake Rupture Cascades on Complex Fault Systems
Alice-Agnes Gabriel (Ludwig Maximilian University of Munich, Germany)
Abstract:
Puzzling features of earthquake dynamics are inferred from recent well recorded events. A prominent example is the 2016 Mw7.8 Kaikoura, New Zealand earthquake, considered the most complex rupture observed to date and causing surface rupture of at least 21 segments of the Marlborough fault system. High-quality observations suggest a large gap separating surface rupture traces, the possibility of significant slip on the subduction interface, and slow apparent rupture speed. I will present a comprehensive 3D dynamic model of the Kaikoura earthquake unraveling the event’s riddles in a physics-based manner. High resolution modeling is enabled by the open-source software SeisSol (www.seissol.org) that couples seismic wave propagation with frictional fault failure and off-fault inelasticity with high-order accuracy in space and time (minimal dispersion errors). SeisSol exploits unstructured tetrahedral meshes to account for complex geometries, e.g. high resolution topography and bathymetry, 3D subsurface structure, and fault networks. The achieved degree of realism and accuracy is enabled by recent computational optimizations targeting strong scalability on many-core CPUs and a ten-fold speedup owing to an efficient local time-stepping algorithm. Understanding the physical conditions that allow rupture cascades will advance our ability to quantify earthquake hazard, especially regarding the possibility of extreme events on real fault networks.
17:00 - 17:30
Imaging of the Italian Lithosphere Based on Adjoint Tomography
Emanuele Casarotti (INGV, Italy), Federica Magnoni (INGV, Italy), Dimitri Komatitsch (CNRS, France), Jeroen Tromp (Princeton University, United States of America)
Abstract:
During the PRACE project IMAGINE_IT (3D full-wave tomographic IMAGINg of the Entire ITalian lithosphere) we iteratively create a 3D tomographic model of the Italian lithospheric structure. Our goal was to build a new reference 3D seismic velocity model for the region constrained by a large number of observed full seismic waveforms. We used recorded data of dense seismological networks (400 seismic stations) together with extremely efficient numerical techniques and an enormous computational power (36 million core hours) provided by European Tier-0 system CURIE (GENCI, FR). We exploited the powerful combination of a spectral-element method (code SPECFEM3D) and an adjoint method, for tomographic inversion and imaging based on misfit reduction between observed data (associated to 163 regional earthquakes) and synthetic full waveforms. We performed 25 tomographic iterations, moment tensor inversions and some point spread function resolution analysis. We are able to constrain Vp and in particular Vs at unprecedented resolution and interesting structural and tectonic features start to be accurately modelled. Creating a refined geological model of the lithosphere in Italy will enhance the capability of analysing seismic effects. This has consequences for the assessment of seismic hazard and for planning effective measures based on rapid scenarios.
17:30 - 18:00
Full-Waveform Inversion of the Solid Earth from Crust to Core
Ebru Bozdag (Colorado School of Mines, United States of America), Matthieu Lefebvre (Princeton University, United States of America), Wenjie Lei (Princeton University, United States of America), Ridvan Orsvuran (University of Cote d’Azur, France), Daniel Peter (King Abdullah University of Science and Technology, Saudi Arabia), Youyi Ruan (Princeton University, United States of America), James Smith (Princeton University, United States of America), Dimitri Komatitsch (CNRS, France), Jeroen Tromp (Princeton University, United States of America)
Abstract:
Accurate and high-resolution images of Earth’s interior are crucial to improve our understanding of the inner dynamics of our planet. Global adjoint tomography is one of the extreme projects in seismology due to the intense computational requirements and vast amount of data that can potentially be assimilated in inversions. The first-generation global adjoint tomography model, GLAD-M15, was constructed using data from 253 earthquakes with transverse isotropy confined to the upper mantle, inverting crust and mantle simultaneously. We now perform inversions for next-generation global adjoint models with more complete parameterisations including surface-wave azimuthal anisotropy, anelasticity, etc. while increasing the database in complementary inversions. The GPU version of SPECFEM3D_GLOBE is used for forward and adjoint simulations on the Oak Ridge Leadership Computing Facility’s Cray XK7 Titan system, a computer with 18,688 GPU accelerators. We will perform 9 s simulations (currently 17 s) on Oak Ridge’s next generation supercomputer "Summit". The ultimate aim is to go down to 1 Hz in global simulations to perform whole-Earth inversions including the core and assimilate all available seismic data from all global CMT earthquakes within the magnitude range of 5.5 to 7.0 in the construction of global models.
Organizer(s):
Helmut Harbrecht (University of Basel, Switzerland), Peter Zaspel (University of Basel, Switzerland)
Track(s):
Engineering, Computer Science and Applied Mathematics, Chemistry and Materials, Physics
The aim of this minisymposium is to discuss research at the intersection of high-dimensional approximation and parallel computing. High-dimensional approximation drives, for example, uncertainty quantification, optimization, machine learning and big data, as well as simulations of complex physics models. It is well known that approximating functions of growing dimension suffers from the curse of dimensionality. Over the last decades, many powerful mathematical tools have been developed to weaken or overcome it. These include, but are not limited to, (quasi) Monte Carlo, multi-level / multi-fidelity techniques, sparse grids, low-rank and tensor product approximations, hierarchical matrices, compressed sensing and meshfree methods.
There is growing interest in solving high-dimensional approximation problems at large scale. While many of the methods discussed have good or even optimal approximation properties and complexities for larger dimensions, some were developed primarily for sequential execution. Solving large-scale approximation problems, however, requires fast, scalable and parallel numerical methods. This minisymposium invites contributions in high-dimensional approximation ranging from initial studies on the use of parallel techniques up to full-scale parallel methods that run on large HPC clusters. Speakers will showcase both algorithm-oriented and application-centered research.
Organizer(s):
Joachim Biercamp (German Climate Computing Center, Germany), Oliver Fuhrer (MeteoSwiss, Switzerland), Christoph Schär (ETH Zurich, Switzerland)
Track(s):
Computer Science and Applied Mathematics, Climate and Weather
Increasing ensemble size, grid resolution and complexity of Earth system models have driven data volumes to a point where scientists are forced to discard some of the data. Estimates for total data volumes of the Coupled Model Intercomparison Project Phase 6 (CMIP6, Eyring et al. 2016) range from 15 to 30 PBytes. Traditional disk and tape storage bandwidth and capacity are not keeping pace with increases in computational capacity, and data storage and movement constraints are already limiting scientific analysis of these enormous simulation efforts.
With significant momentum towards global convection-resolving climate simulations, traditional compute-store-analyze workflows still common in climate science today will break down and research into alternative data analysis workflows is urgently needed. This minisymposium will highlight and critically discuss the potential and challenges of several pathways to alleviate the data bottleneck in climate science: data compression, in-situ/in-transit data analysis and processing, recomputation.
16:00 - 16:30
Trends in Data Technology: Opportunities and Challenges for Earth System Simulation and Analysis
Venkatramani Balaji (Princeton University, United States of America)
Abstract:
Earth system modeling, since its origin at the dawn of modern computing, has operated at the very limits of technological possibility. This has led to tremendous advances in weather forecasting, and the use of models to project climate change both for understanding the Earth system, and in service of downstream science and policy. In this talk, we examine changes in underlying technology, including the physical limits of miniaturization, the emergence of a deep memory-strategy hierarchy, which make "business as usual" approaches to simulation and analysis appear somewhat risky. We look simultaneously at trends in Earth system modeling, in terms of the evolution of globally coordinated climate science experiments (CMIP-IPCC) and the emergence of "seamless prediction", blurring the boundaries between weather and climate. Together, these point to new directions of research and development in data software and data science. Innovative and nimble approaches to analysis will be needed. A companion talk (Session MS42) examines this in the context of computational science and software, but it seems apparent that computation and data are inseparable problems, and a unified approach is indicated.
16:30 - 17:00
Lossy Data Compression for Climate Simulation Data: Reducing Data Volume while Preserving Information
Allison H. Baker (National Center for Atmospheric Research, United States of America)
Co-author(s):
Dorit M. Hammerling, Haiying Xu, John Clyne, John Dennis, Shaomeng Li (all National Center for Atmospheric Research, United States of America)
Abstract:
High-resolution climate model simulations generate enormous data volumes that strain computing center storage resources at institutions such as the National Center for Atmospheric Research (NCAR). Further, storage limitations are negatively impacting science objectives by forcing scientists to run fewer or shorter simulations and/or output data less frequently. Therefore, NCAR has been investigating using data compression to reduce data volumes from the widely used Community Earth System Model (CESM). Striking a balance between meaningfully reducing data volume and preserving the integrity of the simulation data is non-trivial, particularly given the large and diverse set of climate variables. In this talk, we first discuss the challenges of compressing climate data. We then describe our efforts thus far to evaluate the effects of data compression on the original data, which we believe should, at a minimum, not be distinguishable from the natural variability of the climate system. The ultimate goal is that the reconstructed and original climate simulation data are indistinguishable during post-processing analyses, which vary widely according to climate scientists' interests.
17:00 - 17:30
In-Situ to the Rescue?
Jan Frederik Engels (German Climate Computing Center, Germany)
Co-author(s):
Niklas Röber (German Climate Computing Center, Germany)
Abstract:
The maturing of high-resolution simulation models signifies the end of the classic post-processing/post-visualization workflow. With growing data sizes and simulation complexity, the processes of visualization and analysis become technically more difficult, but at the same time also more important. The human observer needs assistance to perceive and comprehend these large amounts of data, as well as guidance to find the important information within. Fortunately, a large variety of approaches exists to handle such simulation data. One of these is in-situ visualization, which analyzes and visualizes the data alongside the simulation process. A choice of different setups exists, allowing either a loose or a tight coupling between the simulation model and the visualization software, so that the data can be visualized on the fly, or images, animations and even geometry can be stored to disk for later analysis. The biggest benefit of in-situ visualization is also its largest drawback: the data size and complexity are vastly reduced and cannot be recovered to their full complexity after the simulation and visualization are finished. This presentation discusses both benefits and drawbacks of the most common in-situ setups and presents such an implementation based on the ICON model using ParaView/Catalyst.
17:30 - 18:00
SimFS: A Simulation Data Virtualizing File System Interface
Salvatore Di Girolamo (ETH Zurich, Switzerland)
Co-author(s):
Pirmin Schmid (ETH Zurich, Switzerland), Thomas Schulthess (ETH Zurich / CSCS, Switzerland), Torsten Hoefler (ETH Zurich, Switzerland)
Abstract:
Scientific simulations, such as climate predictions, often produce petabytes of data to be stored in parallel filesystems or large-scale databases. This data is then analyzed, often by thousands of researchers, over the course of decades. However, storing these volumes of data for long time periods is not cost effective and, in some cases, practically impossible. SimFS virtualizes the simulation output: it only stores a small part and the missing data is re-simulated on demand. SimFS decides which data to store according to the observed analysis access patterns. The data virtualization is invisible to the analysis applications because SimFS intercepts all calls to standard I/O libraries. SimFS enables a trade-off between on-disk solutions, where all the simulation data is stored on disk, and in situ, where no data is stored and analyses are always coupled with simulations. This trade-off is driven by the amount of storage and compute resources assigned to SimFS. Overall, by exploiting the growing computing power and relaxing the storage capacity requirements, SimFS offers a viable path towards exascale simulations.
Organizer(s):
Anshu Dubey (Argonne National Laboratory, United States of America)
Michael A. Heroux (Sandia National Laboratories, United States of America)
Mark Abraham (KTH Royal Institute of Technology, Sweden)
Track(s):
Solid Earth Dynamics, Physics, Life Sciences, Engineering, Emerging Application Domains, Computer Science and Applied Mathematics, Climate and Weather, Chemistry and Materials
Recent years have seen a rise in both scrutiny of and attention to scientific software because of the corresponding rise in scientific discovery through simulations and data analysis. To produce high-quality science it is critical to employ high-quality tools and processes. The vast majority of projects engaged in using computations for advancing scientific understanding have yet to attain sufficient robustness in either their software or their process. What has changed is the realization by these projects that they must strive for robustness if they want credibility. However, given the fundamentally multidisciplinary nature of computational science, and the scarcity of resources, training and experience, research teams struggle. One of the recently launched community efforts for addressing this gap is the Better Scientific Software website (BSSw.io), a resource for scientific software developers and users which includes curated and contributed content, discussion forums and many other helpful features. Through this minisymposium we seek to introduce the scientific software community to BSSw, highlighting some of the content offered. We also seek to broaden our reach with presentations from practitioners in several scientific communities highlighting what they do and their plans for the sustainability of their software.
16:30 - 17:00
Reproducibility in Scientific Software
Michael A. Heroux (Sandia National Laboratories, United States of America)
Abstract: Reproducibility is at the heart of the scientific method. However, reproducible computational science has historically been a challenge. Difficulties in fully capturing and preserving the computational environment details, challenges of working with floating point arithmetic, and the lack of software practices that enable confident retrieval of specific computational data and software are among the challenges. In addition, our incentive systems do not adequately reward investment in best practices for reproducibility in relation to other demands. In this presentation, we discuss recent changes in expectations for publications, funding and community recognition that provide improved incentives for better reproducibility, and highlight a few of the technical improvements that make reproducible computation more achievable.
17:00 - 17:30
Outreach for Better Scientific Software
David E. Bernholdt (Oak Ridge National Laboratory, United States of America)
Abstract: Historically, there have been many impediments to widespread improvement in the software development practices in the scientific and HPC software communities, a situation which has clear implications for the credibility of the resulting software. Professional reward systems tend to place far more emphasis on scientific results than on software; workshops and journals that welcome discussions of software and the software development experience are still very limited. Researchers are not trained in software engineering, and there is limited and sometimes conflicting information about which software engineering practices are useful in scientific computing and how to implement them in environments which may differ significantly from the typical commercial software settings in which they were typically developed. In this talk, I will discuss some of the strategies the IDEAS Productivity project has been pursuing to try to enhance awareness and bring relevant resources to the community through a broad range of outreach activities. And I hope to engage the audience in discussions of how they can contribute as well.
Organizer(s):
Stefan Goedecker (University of Basel, Switzerland)
Track(s):
Emerging Application Domains, Computer Science and Applied Mathematics, Chemistry and Materials
Materials discovery requires one to explore many possible chemical compositions as well as many possible structures for a given composition. Even though highly powerful computers allow us to perform density functional calculations quite rapidly for a limited number of configurations of moderate size, the huge number of force evaluations required for structural and stoichiometric explorations of a very large number of materials is too expensive at the density functional level.
Machine learning schemes may come to our assistance in this context. They have been shown to deliver density functional accuracy at a greatly reduced numerical cost for limited test sets. Since all machine learning schemes are intrinsically interpolation schemes, it is however not clear how well they can extrapolate to discover, for instance, entirely new materials that were not contained in the fitting database.
The minisymposium will therefore focus on problems and systems that are difficult to treat with present day machine learning schemes and present methods that might overcome some of the current limitations.
16:30 - 17:00
Structure and Dynamics of Au Nanoclusters Using ANN Based Interatomic Potentials
Satya Bulusu (IIT Indore, India)
Co-authors: Shweta Jindal (IIT Indore, India), Siva Chiriki (IIT Indore, India)
Abstract: Here we demonstrate that artificial neural network (ANN) based interatomic potentials are accurate in describing interatomic interactions in metal nanoclusters and also make the calculations affordable. In terms of computational speed, ANNs are very fast and hence it allows us to run molecular dynamic simulations up to time scales of 3 ns on a single CPU for a medium sized nanocluster. ANN potentials are explored for bare, doped ((AgAu)55) and thio protected Au nanoclusters to study the effect of size, composition on structure and dynamical properties. For bare Au nanoclusters, molecular dynamics simulations were performed on Au17, Au34, Au58, Au147 and Au309. The study shows that there is a dynamical coexistence of solidlike and liquidlike phases near melting transition. For (AgAu)55, using c-T phase diagram, surface area, surface charge, probability of isomers and Landau free energies, we show enhancement of catalytic property of Ag-Au nanoalloys by incorporation of Ag up to 24 % by composition in Au nanoparticles. We show, using ANN, the effect of composition of SH for different sizes of thio protected nanoclusters. UV-visible spectra were utilized to probe the structure of nanoclusters.
Organizer(s):
Stephan Brunner (EPFL, Switzerland)
Laurent Villard (EPFL, Switzerland)
Track(s):
Physics
Turbulence in plasmas confined by a background magnetic field stands out as one of today’s great challenges in physics. To describe such systems, the most appropriate approach is based on gyrokinetic theory, which takes advantage of the small characteristic frequency of turbulence compared to the cyclotron frequency. It reduces phase space from 6D to 5D and eliminates the fastest time scales from the description. Despite this reduction in complexity, solving the set of gyrokinetic equations consistently with the electromagnetic field equations remains a formidable task. It is an intrinsically nonlinear, multi-scale problem and thus requires major HPC resources, for which advanced numerical techniques are necessary to obtain a reasonable time-to-solution. In this minisymposium, progress on state-of-the-art codes based on all three major computational approaches (Eulerian, semi-Lagrangian and Lagrangian-PIC) will be presented, as well as possible innovative alternatives.
16:30 - 17:00
Advances and Optimizations of Gyrokinetic Turbulence Code GKV towards Exascale Computing
Masanori Nunami (National Institute for Fusion Science, Japan)
Co-author(s):
Motoki Nakata (National Institute for Fusion Science, Japan)
Tomo-Hiko Watanabe (Nagoya University, Japan)
Shinya Maeyama (Nagoya University, Japan)
Akihiro Ishizawa (Kyoto University, Japan)
Yasuhiro Idomura (Japan Atomic Energy Agency, Japan)
Understanding the physics of turbulent transport in plasmas is one of the critical issues in fusion research. Gyrokinetic simulation, which solves the time evolution of the five-dimensional plasma distribution function, is a promising approach, but it is computationally challenging. To establish predictive turbulence simulations with multiple-scale fluctuations and multiple particle species, the local flux-tube gyrokinetic code GKV has been developed. Towards exascale supercomputing, a fully non-blocking, optimized communication-computation overlap technique using assistant cores (AC), which are independent of the calculation cores, is proposed for application to GKV with spectral (FFT) and finite-difference schemes. The effects of the optimization are examined on the Fujitsu FX100 (with 32 computing cores and 2 assistant cores per node), where AC enables fully non-blocking communications overlapped with OpenMP thread parallelization at much lower overhead. The combination of the AC-based non-blocking techniques and thread-parallelization scheduling leads not only to reduced OpenMP overhead but also to improved load/store and cache performance. In this talk, we discuss recent advances and optimization techniques of GKV towards exascale computing.
17:00 - 17:30
CPU and GPU Parallelization of Spectral Particle Methods
Jakob Ameres (TU Munich, Germany)
Co-author(s):
Eric Sonnendrücker (TU Munich, Germany)
Lagrangian particle methods suffer from noise in the form of variance, which increases with the number of degrees of freedom of the field discretization. Using a spectral basis, e.g. Fourier modes, for the fields reduces the degrees of freedom while retaining high-order convergence. Additionally, such a Fourier spectral field discretization yields an energy- and momentum-conserving scheme for the Vlasov-Poisson and Vlasov-Maxwell equations, which is very attractive from a physical point of view. In contrast to standard Particle-in-Cell, every particle contributes to every spectral basis function, yielding a dense charge assignment. The evaluation of such an orthogonal basis can be implemented in various ways, ranging from decompositions into computationally expensive work packages with a tiny memory footprint to the reverse. This means that the underlying algorithm can be tuned between a compute-bound and a memory-bound regime. Benchmarks of different methods and kernels using OpenCL on CPU and GPU are therefore discussed.
18:00 - 18:30
Break
Foyer 2nd Floor
Chair: Bastien Chopard (University of Geneva, Switzerland)
Emerging real-world graph problems include: detecting and preventing disease in human populations; revealing community structure in large social networks; and improving the resilience of the electric power grid. Unlike traditional applications in computational science and engineering, solving these social problems at scale often raises new challenges because of the sparsity and lack of locality in the data, the need for research on scalable algorithms and development of frameworks for solving these real-world problems on high performance computers, and for improved models that capture the noise and bias inherent in the torrential data streams.
In this talk, Bader will discuss the opportunities and challenges in massive data-intensive computing for applications in social sciences, physical sciences, and engineering.
Biography
David Bader is Professor and Chair of the School of Computational Science and Engineering at Georgia Institute of Technology, and is regarded as one of the world’s leading experts in data sciences. His interests are at the intersection of high performance computing (HPC) and real-world applications, including cybersecurity, massive-scale analytics, and computational genomics. Bader has co-authored over 200 articles in peer-reviewed journals and conferences, and is an associate editor for high-impact publications including IEEE Transactions on Computers, ACM Transactions on Parallel Computing, and ACM Journal of Experimental Algorithmics. He is a Fellow of the IEEE and AAAS, and has served on a number of advisory committees in scientific computing and cyber-infrastructure, including the White House's National Strategic Computing Initiative. Bader has served as a lead scientist in several DARPA programs and is a co-founder of the Graph500 list, a rating of "Big Data" computing platforms. He was recognized as a “Rock Star of HPC” by InsideHPC and as HPCwire's “People to Watch” in 2012 and 2014.
19:30 - 21:30
Poster Session & Reception (sponsored by PRACE)
Foyer 2nd Floor
CHM01 - Accurate and Efficient Molecular Dynamics with Nuclear Quantum Effects
Venkat Kapil (EPFL, Switzerland)
Co-authors: Joost VandeVondele (ETH Zurich / CSCS, Switzerland), Michele Ceriotti (EPFL, Switzerland)
Abstract
Molecules and materials that contain light nuclei exhibit considerable deviations from classical behavior which are most pronounced at cryogenic temperatures, but extend up to room temperature and beyond. Properties such as dissociation of water in bulk phase or on catalytic surfaces, heat capacity, band gaps etc. are influenced by the quantum nature of nuclei. The precise description of quantum nuclear fluctuations in atomistic simulations is possible by employing path integral techniques, which involve a considerable computational overhead due to the need for simulating multiple replicas of the system. Consequently, simulations combined with advanced electronic structure methods are still prohibitive. In this talk, I will present some methodologies based on high order factorizations of the Boltzmann operator and multiple time steps in real and imaginary time, that can practically reduce the computational cost of including nuclear quantum fluctuations down to zero, while keeping interatomic interactions at high levels of electronic structure theory.
CHM02 - AiiDA: A Simulation Platform with Full Provenance Support and Flexible Workflows
Spyros Zoupanos (EPFL, Switzerland)
+ Abstract
In recent years, there has been a great increase in the performance and capabilities of computers. Materials science has greatly benefited from this computational boom, which is continuously boosting research, the discovery of new materials and the development of simulation codes. The "materials by design" approach has become very powerful, but requires running large numbers of simulations and building databases of computed properties. A key challenge is the need to automatically prepare, execute and monitor workflows of calculations, and then retrieve and store the results in a format that is easy to browse and query. The AiiDA open-source platform [1] provides researchers with a tool that fulfills those requirements, by implementing the four "ADES" requirement pillars of Automation, Data, Environment and Sharing. AiiDA is continuously being developed and has matured into an ecosystem with multiple backend options for increased performance and flexibility, a powerful graph querying tool for easy result analysis, a redesigned plugin system to simplify external user contributions, new more powerful and easy to write workflows and a continuous integration system to ensure the stability of the platform.
[1] https://doi.org/10.1016/j.commatsci.2015.09.013
Authors: Spyros Zoupanos (EPFL, Switzerland), Leonid Kahle (EPFL, Switzerland), Sebastiaan Huber (EPFL, Switzerland), Martin Uhrin (EPFL, Switzerland), Nicolas Mounet (EPFL, Switzerland), Rico Andreas Häuselmann (EPFL, Switzerland), Snehal Kumbhar (EPFL, Switzerland), Leopold Talirz (EPFL, Switzerland), Andrea Cepellotti (UC Berkeley, United States of America), Fernando Gargiulo (EPFL, Switzerland), Andrius Merkys (Vilnius University, Lithuania), Boris Kozinsky (Harvard University, United States of America), Nicola Marzari (EPFL, Switzerland), Giovanni Pizzi (EPFL, Switzerland)
Presentation
CHM04 - The Crucial Role of the Hydrogen Bonding Network in Water Oxidation Catalyzed by a Cobalt-Cubane
Mauro Schilling (University of Zurich, Switzerland)
20:06 - 20:18
Co-author: Sandra Luber (University of Zurich, Switzerland)
Facing a potentially serious energy crisis towards the end of the century, the development of renewable energy sources has become one of the "hot" topics in modern-day research. Among many different solutions, artificial water splitting promises to be a valuable source of sustainable and affordable energy in the future. However, the improvement of existing catalysts, as well as the design of novel catalysts, is a prerequisite for the successful implementation of this technology in devices and power plants. Informed design requires a firm understanding of the underlying reaction mechanisms and of how certain bottlenecks might be overcome. We apply ab initio molecular dynamics simulations to an explicitly solvated water oxidation catalyst (Co4(dpO(OH))4) in order to elucidate the elementary reaction steps and propose guidelines for the design of novel catalysts. We study not only the catalyst itself but also the crucial solute-solvent interactions, which were found to play a decisive role in intramolecular proton transfer reactions, giving access to a large variety of possible intermediates during the water oxidation cycles.
CHM05 - Datamining of Magnetic Double Perovskites
Michele Visciarelli (KTH Royal Institute of Technology, Sweden)
20:18 - 20:30
Abstract
Double perovskites (DPs) are a class of materials with AB'B''C perovskite-like structures. Given the element for the A site, the choice of B' and B'' leads to different properties (conducting/insulating/semiconducting behaviour, different magnetic properties, etc.) [1]. Even though a fair number of B'/B'' combinations have been studied, there is still a lot of "uncharted territory" to explore. We adopt a data-driven approach, generating a database of computed formation energies for more than 1000 DPs with A = Sr, Ba, Ca. All calculations, as well as database generation and management, have been done through the AiiDA materials informatics infrastructure [2]. Comparison with data available in the Materials Project (MP) database is promising, with formation energy errors of less than 50 meV/atom. We retrieve the structures of all the competing phases from the MP database, generate convex hulls, and give stability predictions for all the DPs analyzed in this study. With a specific focus on DPs with A = Sr and B' or B'' = Ir, we are able to determine the stability of compounds already present in the literature, and give stability predictions for compounds for which no experimental or theoretical data are available.
[1] S. Vasala et al., Progress in Solid State Chemistry 43 (2015).
[2] G. Pizzi et al., Comp. Mat. Sci. 111, 218 (2016).
Authors
Michele Visciarelli (KTH Royal Institute of Technology, Sweden), presenter
Thor Wikfeldt (KTH Royal Institute of Technology, Sweden)
Anna Delin (KTH Royal Institute of Technology, Sweden)
Presentation: post127s2.pdf
CHM06 - Development of a Modular API for Computation of Non-Bonded Interactions in Particle Simulations
Prashanth Kanduri (ETH Zurich / CSCS, Switzerland)
Abstract
Non-bonded interactions are at the heart of every particle simulation program. Over 90% of the time in classical MD simulations is spent computing forces due to non-bonded interactions. Current methods don't offer scalability properties usable at the exascale. It is also a huge software engineering overhead to use new 'exascale-friendly' methods in the most popular packages. We also need to expand the realm of the physics which our interactions describe. Particle simulation packages presently restrict their force fields and methods to cater to specific domains. We propose an API to bring modularity and enable proliferation of methods, force fields (physics), parallelisation paradigms and hardware optimisations. The vision is to be able to call alternative back-ends to any of these from custom MD programs as well as popular software packages. This separation of the most resource-intensive part from the rest of the MD pipeline is important for focussed development for the emerging exascale platforms. It would also make particle simulations practical in more domains such as astrophysics, fluid mechanics, etc. This poster highlights our vision, current progress and preliminary results.
Authors: Prashanth Kanduri (presenter), Victor Holanda Rusu, Joost VandeVondele, Claudio Gheller (all ETH Zurich / CSCS, Switzerland)
Time: 20:30 - 20:42
Presentation: post182s2.pdf
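As an illustration only, the separation of concerns described in the CHM06 abstract (the physics of the interaction kept apart from the back-end that evaluates it) might be sketched as below. This is not the authors' API: every name here (`PairPotential`, `LennardJones`, `Backend`, `AllPairsBackend`, `nonbonded`) is hypothetical, and the back-end is a plain O(N^2) reference loop rather than a scalable method.

```python
import math
from dataclasses import dataclass
from typing import List, Protocol, Sequence, Tuple

Vec = Tuple[float, float, float]


class PairPotential(Protocol):
    """Physics layer (hypothetical): a radial pair interaction."""

    def energy(self, r: float) -> float: ...

    def force(self, r: float) -> float: ...  # -dU/dr, positive means repulsive


@dataclass(frozen=True)
class LennardJones:
    """One possible force field: the 12-6 Lennard-Jones potential."""

    epsilon: float = 1.0
    sigma: float = 1.0

    def energy(self, r: float) -> float:
        sr6 = (self.sigma / r) ** 6
        return 4.0 * self.epsilon * (sr6 * sr6 - sr6)

    def force(self, r: float) -> float:
        sr6 = (self.sigma / r) ** 6
        return 24.0 * self.epsilon * (2.0 * sr6 * sr6 - sr6) / r


class Backend(Protocol):
    """Evaluation layer (hypothetical): how the pair sum is carried out."""

    def compute(
        self, positions: Sequence[Vec], potential: PairPotential, cutoff: float
    ) -> Tuple[float, List[Vec]]: ...


class AllPairsBackend:
    """Reference back-end: direct O(N^2) loop, no neighbour lists, no periodicity."""

    def compute(self, positions, potential, cutoff):
        n = len(positions)
        forces = [[0.0, 0.0, 0.0] for _ in range(n)]
        energy = 0.0
        for i in range(n):
            for j in range(i + 1, n):
                d = [positions[i][k] - positions[j][k] for k in range(3)]
                r = math.sqrt(sum(c * c for c in d))
                if r >= cutoff:
                    continue  # pair lies outside the interaction range
                energy += potential.energy(r)
                f = potential.force(r)
                for k in range(3):
                    # Equal and opposite forces along the separation vector.
                    forces[i][k] += f * d[k] / r
                    forces[j][k] -= f * d[k] / r
        return energy, [(fx, fy, fz) for fx, fy, fz in forces]


def nonbonded(positions, potential, backend, cutoff=2.5):
    """Single entry point: the caller picks the physics and the back-end."""
    return backend.compute(positions, potential, cutoff)
```

In this sketch a host MD program would call only `nonbonded`, so a different force field or a different back-end (for example a cell-list or GPU implementation) could be substituted without touching the rest of the pipeline.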
CHM07 - DFT+U Gamma-Surfaces of UO2
Monica Kosa (Paul Scherrer Institute, Switzerland)
+ Abstract { "session": {"id":"sess143","title":"Posters in Chemistry and Materials","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Chemistry and Materials"],"slots":[{"id":"post163","type":"poster","title":"CHM01 - Accurate and Efficient Molecular Dynamics with Nuclear Quantum Effects","begin_time":"19:30","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Molecules and materials that contain light nuclei exhibit considerable deviations from classical behavior which are most pronounced at cryogenic temperatures, but extend up to room temperature and beyond. Properties such as dissociation of water in bulk phase or on catalytic surfaces, heat capacity, band gaps etc. are influenced by the quantum nature of nuclei. The precise description of quantum nuclear fluctuations in atomistic simulations is possible by employing path integral techniques, which involve a considerable computational overhead due to the need for simulating multiple replicas of the system. Consequently, simulations combined with advanced electronic structure methods are still prohibitive. 
In this talk, I will present some methodologies based on high order factorizations of the Boltzmann operator and multiple time steps in real and imaginary time, that can practically reduce the computational cost of including nuclear quantum fluctuations down to zero, while keeping interatomic interactions at high levels of electronic structure theory.","filename":"post163s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Venkat","last_name":"Kapil","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michele","last_name":"Ceriotti","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Venkat","last_name":"Kapil","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post168","type":"poster","title":"CHM02 - AiiDA: A Simulation Platform with Full Provenance Support and Flexible Workflows","begin_time":"19:42","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years, there has been a great increase in the performance and capabilities of computers. Materials science has greatly benefited from this computational boom, which is continuously boosting research, the discovery of new materials and the development of simulation codes. The \u0022materials by design\u0022 approach has become very powerful, but requires running large numbers of simulations and building databases of computed properties. A key challenge is the need to automatically prepare, execute and monitor workflows of calculations, and then retrieve and store the results in a format that is easy to browse and query. 
The AiiDA open-source platform[1] provides researchers with a tool that fulfills those requirements, by implementing the four \u0022ADES\u0022 requirement pillars of Automation, Data, Environment and Sharing. AiiDA is continuously being developed and has matured into an ecosystem with multiple backend options for increased performance and flexibility, a powerful graph querying tool for easy result analysis, a redesigned plugin system to simplify external user contributions, new more powerful and easy to write workflows and a continuous integration system to ensure the stability of the platform. [1]https:\/\/doi.org\/10.1016\/j.commatsci.2015.09.013.","filename":"post168s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Spyros","last_name":"Zoupanos","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Leonid","last_name":"Kahle","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Sebastiaan","last_name":"Huber","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Martin","last_name":"Uhrin","affiliation":"EPFL","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Nicolas","last_name":"Mounet","affiliation":"EPFL","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Rico Andreas","last_name":"H\u00e4uselmann","affiliation":"EPFL","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Snehal","last_name":"Kumbhar","affiliation":"EPFL","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Leopold","last_name":"Talirz","affiliation":"EPFL","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Andrea","last_name":"Cepellotti","affiliation":"UC 
Berkeley","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Fernando","last_name":"Gargiulo","affiliation":"EPFL","country":"Switzerland","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Andrius","last_name":"Merkys","affiliation":"Vilnius University","country":"Lithuania","bio":"","order":"11","is_presenter":false},{"type":"Author","first_name":"Boris","last_name":"Kozinsky","affiliation":"Harvard University","country":"United States of America","bio":"","order":"12","is_presenter":false},{"type":"Author","first_name":"Nicola","last_name":"Marzari","affiliation":"EPFL","country":"Switzerland","bio":"","order":"13","is_presenter":false},{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"14","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Spyros","last_name":"Zoupanos","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post171","type":"poster","title":"CHM03 - Bridging the Gap between Atomistic and Macroscopic Models of Homogeneous Nucleation","begin_time":"19:54","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Nucleation has many implications in science and technology, including metal casting, the assembly of microtubules in cells, and the formation of water droplets in the atmosphere. Because the experimental investigation of dynamical nucleation processes is very difficult, much attention has been paid to atomistic simulation efforts in the last two decades. However, atomistic simulation studies of nucleation face two major challenges. Firstly, the free energy barrier separating the metastable phase and the stable phase can be very high, making nucleation times much larger than the time scales accessible to molecular dynamics simulations. 
Secondly, it is highly non-trivial to develop a predictive macroscopic model of nucleation using the microscopic quantities directly obtained from atomistic simulations. In this poster, I aim to address the aforementioned difficulties. I will first briefly introduce state-of-the-art enhanced sampling methods for atomistic simulations, and their applications to studying homogeneous nucleation. I will then discuss our latest thermodynamic model that links macroscopic theories and atomic-scale simulations and thus provides a simple and elegant framework to verify and extend classical nucleation theory.","bio":"","contributors":[{"type":"Author","first_name":"Bingqing","last_name":"Cheng","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Michele","last_name":"Ceriotti","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Bingqing","last_name":"Cheng","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post143","type":"poster","title":"CHM04 - The Crucial Role of the Hydrogen Bonding Network in Water Oxidation Catalyzed by a\u00a0Cobalt-Cubane","begin_time":"20:06","end_time":"20:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Facing a potentially serious energy crisis towards the end of the century, development of renewable energy sources has become one of the \u0022hot\u0022 topics in modern day research. Among many different solutions, artificial water splitting promises to be a valuable source of sustainable and affordable energy in the future. However, the improvement of existing catalysts, as well as the design of novel catalysts, is a prerequisite for the successful implementation of this technology into devices and powerplants. 
Informed design requires a firm understanding of underlying reaction mechanisms and how certain bottlenecks might be overcome. We employ ab initio molecular dynamics simulations to an explicitly solvated water oxidation catalyst (Co4(dpO(OH))4) in order to elucidate the elementary reaction steps and propose guidelines for the design of novel catalysts. In our studies we do not only study the catalyst itself \u2013 we also pay close attention to the crucial solute-solvent interactions which were found to play a decisive role in intramolecular proton transfer reactions giving access to a large variety of possible intermediates during the water oxidation cycles.","filename":"post143s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Mauro","last_name":"Schilling","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sandra","last_name":"Luber","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Mauro","last_name":"Schilling","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post127","type":"poster","title":"CHM05 - Datamining of Magnetic Double Perovskites","begin_time":"20:18","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Double perovskites (DPs) are a class of materials with AB\u0027B\u0027\u0027C perovskite-like structures. Given the element for the A site, the choice of B\u0027 and B\u0027\u0027 lead to different properties (conducting\/insulating\/semiconducting behaviour, different magnetic properties, etc. [1]). Even though a fair number of B\u0027\/B\u0027\u0027 combinations have been studied, there\u0027s still a lot of \u0022uncharted territory\u0022 to explore. 
[1] We adopt a data-driven approach, generating a database of computed formation energies for more than 1000 A = Sr, Ba, Ca DPs. All the calculations and database generation and management have been done through the AiiDA materials\u0027 informatics infrastructure. [2] Comparison with data available in the Materials Project (MP) database is promising, with formation energy errors less than 50 meV\/atom. We retrieve the structures of all the competing phases from the MP database, we generate convex hulls and we give stability predictions for all the DPs analyzed in this study. With a specific focus on A = Sr, B\u0027 or B\u0027\u0027 = Ir DPs, we are able to determine the stability of compounds already present in literature, and give stability predictions for compounds for which there isn\u0027t experimental or theoretical data available. [1] S. Vasala et al., Progress in Solid State Chemistry 43 (2015). [2] G. Pizzi et al., Comp. Mat. Sci. 111, 218 (2016).","filename":"post127s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michele","last_name":"Visciarelli","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thor","last_name":"Wikfeldt","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Anna","last_name":"Delin","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michele","last_name":"Visciarelli","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true}]},{"id":"post182","type":"poster","title":"CHM06 - Development of a Modular API for Computation of Non-Bonded Interactions in Particle 
Simulations","begin_time":"20:30","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Non-Bonded Interactions are the heart of every particle simulation program. Over 90% of the time spent in classical MD simulations is on this step of computing forces due to non-bonded interactions. Current methods don\u0027t offer scalability properties usable in the exascale. It is also a huge software engineering overhead to use new \u0027exascale-friendly\u0027 methods on the most popular packages. We also need to expand the realm of the physics which our interactions describe. Particle simulation packages presently restrict their force fields and methods catering to specific domains. We propose an API to bring modularity and enable proliferation of methods, force fields (physics), parallelisation paradigms and hardware optimisations. The vision is to be able to call alternative back-ends to any of these on custom MD programs as well as popular software packages. This separation of concerns of the most resource intensive part from the rest of the MD pipeline is important for focussed development for the emerging exascale platforms. This would also enable particle simulations to be practical for usage in more domains such as astrophysics, fluid mechanics, etc. 
This poster highlights our vision, current progress and preliminary results.","filename":"post182s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Prashanth","last_name":"Kanduri","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Victor","last_name":"Holanda Rusu","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Claudio","last_name":"Gheller","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Prashanth","last_name":"Kanduri","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post161","type":"poster","title":"CHM07 - DFT+U Gamma-Surfaces of UO2","begin_time":"20:42","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The nuclear fuel material uranium dioxide, UO2, undergoes severe microstructural changes along its fuel cycle, forming extended defects like dislocations. Experimental characterization of structural deformations in UO2 is difficult due to safety and cost, making computer simulations vital to bridge this gap. We have carried out a systematic DFT+U study on {001}, {110} and {111} oriented gamma-surfaces of UO2, i.e. the potential energy surfaces of displacement of one crystal part with respect to the other. The DFT+U scheme relies on our earlier work on f-orbitals occupations\u0027 control. Using a similar strategy, all possible f-orbital occupation patterns were considered both via single point energy calculations and the subsequent structure optimization of UO2. 
The f-orbital occupation patterns resulting in the lowest energy and minimally deformed structures were further used for gamma-surface calculations. This procedure was repeated for the {001}, {110} and {111} oriented planes. The resulting gamma-surfaces calculated at the DFT+U level of theory are both qualitatively and quantitatively different, i.e. the shape and the energies, from those computed previously by us and others using any empirical potential. It is the result of peculiar bonding interactions imposed by the resulting geometries during the gamma-surface calculation in conjunction with the specific f-orbital occupation pattern.","filename":"post161s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Monica","last_name":"Kosa","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Raoul","last_name":"Ngayam Happy","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"S\u00e9bastien","last_name":"Groh","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Matthias","last_name":"Krack","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Monica","last_name":"Kosa","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post148","type":"poster","title":"CHM08 - Improving the Performance of the DBCSR Library for Sparse Matrix Multiplication for Many-Core and GPU Computing Systems","begin_time":"20:54","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Sparse matrix-matrix multiplication is an essential building block for a wide range of algorithms in various scientific fields. 
For this task, the sparse matrix library DBCSR (Distributed Block Compressed Sparse Row) has been developed. Its multi-layered structure automatically takes care of and optimizes several computational aspects like parallelism (MPI, OpenMP, CUDA), data (cache) locality and on-the-fly filtering. Here we report on the latest performance optimization we implemented for improving the CUDA and OpenMP parallelization. For the former, a novel algorithm was implemented for the work-scheduling of the multiplication of the matrix blocks. The latter specifically addresses many-core computing systems, namely Intel Xeon Phi, where we implemented an OpenMP task-based parallel algorithm in order to improve the load-balance by means of a more dynamic scheduling of the workload. We report the performance results, in terms of time-to-solution and energy-to-solution,\u00a0of DBCSR\u00a0on systems with Intel Xeon Phi Knights Landing (KNL) processors,\u00a0and\u00a0systems with Intel Xeon CPUs,\u00a0and NVIDIA GPUs. 
Finally, we analyze the performance of the library when using compilers from different vendors (GNU, Intel, NAG, PGI, FLANG).","filename":"post148s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Alfio","last_name":"Lazzaro","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Andreas","last_name":"Gloess","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Juerg","last_name":"Hutter","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Tiziano","last_name":"Mueller","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Seewald","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Ilia","last_name":"Sivkov","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Andreas","last_name":"Gloess","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post165","type":"poster","title":"CHM09 - Materials Cloud: A Platform for Open Materials Science","begin_time":"21:06","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Materials Cloud (www.materialscloud.org) is an Open Science web portal, built to enable seamless sharing and dissemination of resources in computational materials science. It includes educational material and videos, interactive tools, cloud simulation services based on AiiDA[1] and Jupyter, and displays interactively both curated data along with the corresponding raw data. 
Being powered by AiiDA, all data is accompanied by their full provenance, allowing peers to inspect how the results have been obtained, download individual files or the whole database, and start their research from where the original authors left off. Combined also with the Archive section, where DOIs are assigned to each entry (making them citable), Materials Cloud empowers data-based discovery, while being compliant with data management plans and the FAIR principles. Among the curated data, it features SSSP, a library of pseudopotentials for electronic-structure calculations, tested and optimized for accuracy and efficiency, as well as a large database of novel 2D materials [2] with their materials properties. [1]G. Pizzi et al., Comp. Mat. Sci. 111, 218 (2016) - www.aiida.net. [2]N. Mounet et al., Nat. Nanotech. doi:10.1038\/s41565-017-0035-5 (2018)","filename":"post165s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Leopold","last_name":"Talirz","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Snehal","last_name":"Kumbhar","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Fernando","last_name":"Gargiulo","affiliation":"EPFL","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Marco","last_name":"Borelli","affiliation":"EPFL","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Elsa","last_name":"Passaro","affiliation":"EPFL","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Aliaksandr","last_name":"Yakutovich","affiliation":"EPFL","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Ol
e","last_name":"Sch\u00fctt","affiliation":"Empa","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Thomas","last_name":"Schulthess","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Nicola","last_name":"Marzari","affiliation":"EPFL","country":"Switzerland","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post176","type":"poster","title":"CHM10 - A Symmetry-Adapted Approach to Machine Learning of Tensors","begin_time":"21:18","end_time":"21:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years, machine learning has become a popular method to predict atomic-scale properties of molecular and material systems. While perhaps the most popular applications of machine learning in Chemistry and Materials Science have been to the potential energy surface of a system, a full statistical-mechanical description requires not only the potential energy but also tensorial properties such as dipole moments and polarizabilities. A machine-learning model for these properties must give predictions that transform covariantly with rigid-body rotations of the system, which adds an extra layer of complexity. This poster describes a framework for machine-learning of tensor properties of arbitrary ranks, accounting fully for the covariance of these properties. This method is an extension of Gaussian process regression (GPR), and involves a generalization of the similarity function, or kernel, of GPR to a tensorial kernel. 
This kernel builds upon the smooth overlap of atomic positions\u00a0(SOAP) kernel used for scalar properties. Results for prediction of electric response tensors of several orders, for a number of water oligomers and for bulk water, show that this method is an extremely promising one for machine-learning of these properties.","filename":"post176s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Andrea","last_name":"Grisafi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David M.","last_name":"Wilkins","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Michele","last_name":"Ceriotti","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"David M.","last_name":"Wilkins","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":true}]}]}, "slot": {"id":"post161","type":"poster","title":"CHM07 - DFT+U Gamma-Surfaces of UO2","begin_time":"20:42","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The nuclear fuel material uranium dioxide, UO2, undergoes severe microstructural changes along its fuel cycle, forming extended defects like dislocations. Experimental characterization of structural deformations in UO2 is difficult due to safety and cost, making computer simulations vital to bridge this gap. We have carried out a systematic DFT+U study on {001}, {110} and {111} oriented gamma-surfaces of UO2, i.e. the potential energy surfaces of displacement of one crystal part with respect to the other. The DFT+U scheme relies on our earlier work on f-orbitals occupations\u0027 control. 
Using a similar strategy, all possible f-orbital occupation patterns were considered both via single point energy calculations and the subsequent structure optimization of UO2. The f-orbital occupation patterns resulting in the lowest energy and minimally deformed structures were further used for gamma-surface calculations. This procedure was repeated for the {001}, {110} and {111} oriented planes. The resulting gamma-surfaces calculated at the DFT+U level of theory are both qualitatively and quantitatively different, i.e. the shape and the energies, from those computed previously by us and others using any empirical potential. It is the result of peculiar bonding interactions imposed by the resulting geometries during the gamma-surface calculation in conjunction with the specific f-orbital occupation pattern.","filename":"post161s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Monica","last_name":"Kosa","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Raoul","last_name":"Ngayam Happy","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"S\u00e9bastien","last_name":"Groh","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Matthias","last_name":"Krack","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Monica","last_name":"Kosa","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Monica","last_name":"Kosa","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Raoul","last_name":"Ngayam 
Happy","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"S\u00e9bastien","last_name":"Groh","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Matthias","last_name":"Krack","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"4","is_presenter":false}] } Presentation
CHM08 - Improving the Performance of the DBCSR Library for Sparse Matrix Multiplication for Many-Core and GPU Computing Systems
Andreas Gloess (University of Zurich, Switzerland)
+ Abstract { "session": {"id":"sess143","title":"Posters in Chemistry and Materials","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Chemistry and Materials"],"slots":[{"id":"post163","type":"poster","title":"CHM01 - Accurate and Efficient Molecular Dynamics with Nuclear Quantum Effects","begin_time":"19:30","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Molecules and materials that contain light nuclei exhibit considerable deviations from classical behavior which are most pronounced at cryogenic temperatures, but extend up to room temperature and beyond. Properties such as dissociation of water in bulk phase or on catalytic surfaces, heat capacity, band gaps etc. are influenced by the quantum nature of nuclei. The precise description of quantum nuclear fluctuations in atomistic simulations is possible by employing path integral techniques, which involve a considerable computational overhead due to the need for simulating multiple replicas of the system. Consequently, simulations combined with advanced electronic structure methods are still prohibitive. 
In this talk, I will present some methodologies based on high order factorizations of the Boltzmann operator and multiple time steps in real and imaginary time, that can practically reduce the computational cost of including nuclear quantum fluctuations down to zero, while keeping interatomic interactions at high levels of electronic structure theory.","filename":"post163s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Venkat","last_name":"Kapil","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michele","last_name":"Ceriotti","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Venkat","last_name":"Kapil","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post168","type":"poster","title":"CHM02 - AiiDA: A Simulation Platform with Full Provenance Support and Flexible Workflows","begin_time":"19:42","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years, there has been a great increase in the performance and capabilities of computers. Materials science has greatly benefited from this computational boom, which is continuously boosting research, the discovery of new materials and the development of simulation codes. The \u0022materials by design\u0022 approach has become very powerful, but requires running large numbers of simulations and building databases of computed properties. A key challenge is the need to automatically prepare, execute and monitor workflows of calculations, and then retrieve and store the results in a format that is easy to browse and query. 
The AiiDA open-source platform[1] provides researchers with a tool that fulfills those requirements, by implementing the four \u0022ADES\u0022 requirement pillars of Automation, Data, Environment and Sharing. AiiDA is continuously being developed and has matured into an ecosystem with multiple backend options for increased performance and flexibility, a powerful graph querying tool for easy result analysis, a redesigned plugin system to simplify external user contributions, new more powerful and easy to write workflows and a continuous integration system to ensure the stability of the platform. [1]https:\/\/doi.org\/10.1016\/j.commatsci.2015.09.013.","filename":"post168s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Spyros","last_name":"Zoupanos","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Leonid","last_name":"Kahle","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Sebastiaan","last_name":"Huber","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Martin","last_name":"Uhrin","affiliation":"EPFL","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Nicolas","last_name":"Mounet","affiliation":"EPFL","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Rico Andreas","last_name":"H\u00e4uselmann","affiliation":"EPFL","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Snehal","last_name":"Kumbhar","affiliation":"EPFL","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Leopold","last_name":"Talirz","affiliation":"EPFL","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Andrea","last_name":"Cepellotti","affiliation":"UC 
Berkeley","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Fernando","last_name":"Gargiulo","affiliation":"EPFL","country":"Switzerland","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Andrius","last_name":"Merkys","affiliation":"Vilnius University","country":"Lithuania","bio":"","order":"11","is_presenter":false},{"type":"Author","first_name":"Boris","last_name":"Kozinsky","affiliation":"Harvard University","country":"United States of America","bio":"","order":"12","is_presenter":false},{"type":"Author","first_name":"Nicola","last_name":"Marzari","affiliation":"EPFL","country":"Switzerland","bio":"","order":"13","is_presenter":false},{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"14","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Spyros","last_name":"Zoupanos","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post171","type":"poster","title":"CHM03 - Bridging the Gap between Atomistic and Macroscopic Models of Homogeneous Nucleation","begin_time":"19:54","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Nucleation has many implications in science and technology, including metal casting, the assembly of microtubules in cells, and the formation of water droplets in the atmosphere. Because the experimental investigation of dynamical nucleation processes is very difficult, much attention has been paid to atomistic simulation efforts in the last two decades. However, atomistic simulation studies of nucleation face two major challenges. Firstly, the free energy barrier separating the metastable phase and the stable phase can be very high, making nucleation times much larger than the time scales accessible to molecular dynamics simulations. 
Secondly, it is highly non-trivial to develop a predictive macroscopic model of nucleation using the microscopic quantities directly obtained from atomistic simulations. In this poster, I aim to address the aforementioned difficulties. I will first briefly introduce state-of-the-art enhanced sampling methods for atomistic simulations, and their applications to studying homogeneous nucleation. I will then discuss our latest thermodynamic model that links macroscopic theories and atomic-scale simulations and thus provides a simple and elegant framework to verify and extend classical nucleation theory.","bio":"","contributors":[{"type":"Author","first_name":"Bingqing","last_name":"Cheng","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Michele","last_name":"Ceriotti","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Bingqing","last_name":"Cheng","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post143","type":"poster","title":"CHM04 - The Crucial Role of the Hydrogen Bonding Network in Water Oxidation Catalyzed by a\u00a0Cobalt-Cubane","begin_time":"20:06","end_time":"20:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Facing a potentially serious energy crisis towards the end of the century, development of renewable energy sources has become one of the \u0022hot\u0022 topics in modern day research. Among many different solutions, artificial water splitting promises to be a valuable source of sustainable and affordable energy in the future. However, the improvement of existing catalysts, as well as the design of novel catalysts, is a prerequisite for the successful implementation of this technology into devices and powerplants. 
Informed design requires a firm understanding of underlying reaction mechanisms and how certain bottlenecks might be overcome. We employ ab initio molecular dynamics simulations to an explicitly solvated water oxidation catalyst (Co4(dpO(OH))4) in order to elucidate the elementary reaction steps and propose guidelines for the design of novel catalysts. In our studies we do not only study the catalyst itself \u2013 we also pay close attention to the crucial solute-solvent interactions which were found to play a decisive role in intramolecular proton transfer reactions giving access to a large variety of possible intermediates during the water oxidation cycles.","filename":"post143s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Mauro","last_name":"Schilling","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sandra","last_name":"Luber","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Mauro","last_name":"Schilling","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post127","type":"poster","title":"CHM05 - Datamining of Magnetic Double Perovskites","begin_time":"20:18","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Double perovskites (DPs) are a class of materials with AB\u0027B\u0027\u0027C perovskite-like structures. Given the element for the A site, the choice of B\u0027 and B\u0027\u0027 lead to different properties (conducting\/insulating\/semiconducting behaviour, different magnetic properties, etc. [1]). Even though a fair number of B\u0027\/B\u0027\u0027 combinations have been studied, there\u0027s still a lot of \u0022uncharted territory\u0022 to explore. 
[1] We adopt a data-driven approach, generating a database of computed formation energies for more than 1000 A = Sr, Ba, Ca DPs. All the calculations and database generation and management have been done through the AiiDA materials informatics infrastructure. [2] Comparison with data available in the Materials Project (MP) database is promising, with formation energy errors less than 50 meV\/atom. We retrieve the structures of all the competing phases from the MP database, we generate convex hulls and we give stability predictions for all the DPs analyzed in this study. With a specific focus on A = Sr, B\u0027 or B\u0027\u0027 = Ir DPs, we are able to determine the stability of compounds already present in literature, and give stability predictions for compounds for which there isn\u0027t experimental or theoretical data available. [1] S. Vasala et al., Progress in Solid State Chemistry 43 (2015). [2] G. Pizzi et al., Comp. Mat. Sci. 111, 218 (2016).","filename":"post127s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michele","last_name":"Visciarelli","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thor","last_name":"Wikfeldt","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Anna","last_name":"Delin","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michele","last_name":"Visciarelli","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true}]},{"id":"post182","type":"poster","title":"CHM06 - Development of a Modular API for Computation of Non-Bonded Interactions in Particle 
Simulations","begin_time":"20:30","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Non-Bonded Interactions are the heart of every particle simulation program. Over 90% of the time spent in classical MD simulations is on this step of computing forces due to non-bonded interactions. Current methods don\u0027t offer scalability properties usable in the exascale. It is also a huge software engineering overhead to use new \u0027exascale-friendly\u0027 methods on the most popular packages. We also need to expand the realm of the physics which our interactions describe. Particle simulation packages presently restrict their force fields and methods catering to specific domains. We propose an API to bring modularity and enable proliferation of methods, force fields (physics), parallelisation paradigms and hardware optimisations. The vision is to be able to call alternative back-ends to any of these on custom MD programs as well as popular software packages. This separation of concerns of the most resource intensive part from the rest of the MD pipeline is important for focussed development for the emerging exascale platforms. This would also enable particle simulations to be practical for usage in more domains such as astrophysics, fluid mechanics, etc. 
This poster highlights our vision, current progress and preliminary results.","filename":"post182s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Prashanth","last_name":"Kanduri","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Victor","last_name":"Holanda Rusu","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Claudio","last_name":"Gheller","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Prashanth","last_name":"Kanduri","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post161","type":"poster","title":"CHM07 - DFT+U Gamma-Surfaces of UO2","begin_time":"20:42","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The nuclear fuel material uranium dioxide, UO2, undergoes severe microstructural changes along its fuel cycle, forming extended defects like dislocations. Experimental characterization of structural deformations in UO2 is difficult due to safety and cost, making computer simulations vital to bridge this gap. We have carried out a systematic DFT+U study on {001}, {110} and {111} oriented gamma-surfaces of UO2, i.e. the potential energy surfaces of displacement of one crystal part with respect to the other. The DFT+U scheme relies on our earlier work on f-orbital occupation control. Using a similar strategy, all possible f-orbital occupation patterns were considered both via single point energy calculations and the subsequent structure optimization of UO2. 
The f-orbital occupation patterns resulting in the lowest energy and minimally deformed structures were further used for gamma-surface calculations. This procedure was repeated for the {001}, {110} and {111} oriented planes. The resulting gamma-surfaces calculated at the DFT+U level of theory are both qualitatively and quantitatively different, i.e. the shape and the energies, from those computed previously by us and others using any empirical potential. It is the result of peculiar bonding interactions imposed by the resulting geometries during the gamma-surface calculation in conjunction with the specific f-orbital occupation pattern.","filename":"post161s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Monica","last_name":"Kosa","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Raoul","last_name":"Ngayam Happy","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"S\u00e9bastien","last_name":"Groh","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Matthias","last_name":"Krack","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Monica","last_name":"Kosa","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post148","type":"poster","title":"CHM08 - Improving the Performance of the DBCSR Library for Sparse Matrix Multiplication for Many-Core and GPU Computing Systems","begin_time":"20:54","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Sparse matrix-matrix multiplication is an essential building block for a wide range of algorithms in various scientific fields. 
For this task, the sparse matrix library DBCSR (Distributed Block Compressed Sparse Row) has been developed. Its multi-layered structure automatically takes care of and optimizes several computational aspects like parallelism (MPI, OpenMP, CUDA), data (cache) locality and on-the-fly filtering. Here we report on the latest performance optimization we implemented for improving the CUDA and OpenMP parallelization. For the former, a novel algorithm was implemented for the work-scheduling of the multiplication of the matrix blocks. The latter specifically addresses many-core computing systems, namely Intel Xeon Phi, where we implemented an OpenMP task-based parallel algorithm in order to improve the load-balance by means of a more dynamic scheduling of the workload. We report the performance results, in terms of time-to-solution and energy-to-solution,\u00a0of DBCSR\u00a0on systems with Intel Xeon Phi Knights Landing (KNL) processors,\u00a0and\u00a0systems with Intel Xeon CPUs,\u00a0and NVIDIA GPUs. 
Finally, we analyze the performance of the library when using compilers from different vendors (GNU, Intel, NAG, PGI, FLANG).","filename":"post148s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Alfio","last_name":"Lazzaro","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Andreas","last_name":"Gloess","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Juerg","last_name":"Hutter","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Tiziano","last_name":"Mueller","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Seewald","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Ilia","last_name":"Sivkov","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Andreas","last_name":"Gloess","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post165","type":"poster","title":"CHM09 - Materials Cloud: A Platform for Open Materials Science","begin_time":"21:06","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Materials Cloud (www.materialscloud.org) is an Open Science web portal, built to enable seamless sharing and dissemination of resources in computational materials science. It includes educational material and videos, interactive tools, cloud simulation services based on AiiDA[1] and Jupyter, and displays interactively both curated data along with the corresponding raw data. 
Being powered by AiiDA, all data is accompanied by their full provenance, allowing peers to inspect how the results have been obtained, download individual files or the whole database, and start their research from where the original authors left off. Combined also with the Archive section, where DOIs are assigned to each entry (making them citable), Materials Cloud empowers data-based discovery, while being compliant with data management plans and the FAIR principles. Among the curated data, it features SSSP, a library of pseudopotentials for electronic-structure calculations, tested and optimized for accuracy and efficiency, as well as a large database of novel 2D materials [2] with their materials properties. [1]G. Pizzi et al., Comp. Mat. Sci. 111, 218 (2016) - www.aiida.net. [2]N. Mounet et al., Nat. Nanotech. doi:10.1038\/s41565-017-0035-5 (2018)","filename":"post165s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Leopold","last_name":"Talirz","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Snehal","last_name":"Kumbhar","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Fernando","last_name":"Gargiulo","affiliation":"EPFL","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Marco","last_name":"Borelli","affiliation":"EPFL","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Elsa","last_name":"Passaro","affiliation":"EPFL","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Aliaksandr","last_name":"Yakutovich","affiliation":"EPFL","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Ol
e","last_name":"Sch\u00fctt","affiliation":"Empa","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Thomas","last_name":"Schulthess","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Nicola","last_name":"Marzari","affiliation":"EPFL","country":"Switzerland","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post176","type":"poster","title":"CHM10 - A Symmetry-Adapted Approach to Machine Learning of Tensors","begin_time":"21:18","end_time":"21:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years, machine learning has become a popular method to predict atomic-scale properties of molecular and material systems. While perhaps the most popular applications of machine learning in Chemistry and Materials Science have been to the potential energy surface of a system, a full statistical-mechanical description requires not only the potential energy but also tensorial properties such as dipole moments and polarizabilities. A machine-learning model for these properties must give predictions that transform covariantly with rigid-body rotations of the system, which adds an extra layer of complexity. This poster describes a framework for machine-learning of tensor properties of arbitrary ranks, accounting fully for the covariance of these properties. This method is an extension of Gaussian process regression (GPR), and involves a generalization of the similarity function, or kernel, of GPR to a tensorial kernel. 
This kernel builds upon the smooth overlap of atomic positions\u00a0(SOAP) kernel used for scalar properties. Results for prediction of electric response tensors of several orders, for a number of water oligomers and for bulk water, show that this method is an extremely promising one for machine-learning of these properties.","filename":"post176s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Andrea","last_name":"Grisafi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David M.","last_name":"Wilkins","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Michele","last_name":"Ceriotti","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"David M.","last_name":"Wilkins","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":true}]}]}, "slot": {"id":"post148","type":"poster","title":"CHM08 - Improving the Performance of the DBCSR Library for Sparse Matrix Multiplication for Many-Core and GPU Computing Systems","begin_time":"20:54","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Sparse matrix-matrix multiplication is an essential building block for a wide range of algorithms in various scientific fields. For this task, the sparse matrix library DBCSR (Distributed Block Compressed Sparse Row) has been developed. Its multi-layered structure automatically takes care of and optimizes several computational aspects like parallelism (MPI, OpenMP, CUDA), data (cache) locality and on-the-fly filtering. Here we report on the latest performance optimization we implemented for improving the CUDA and OpenMP parallelization. For the former, a novel algorithm was implemented for the work-scheduling of the multiplication of the matrix blocks. 
The latter specifically addresses many-core computing systems, namely Intel Xeon Phi, where we implemented an OpenMP task-based parallel algorithm in order to improve the load-balance by means of a more dynamic scheduling of the workload. We report the performance results, in terms of time-to-solution and energy-to-solution,\u00a0of DBCSR\u00a0on systems with Intel Xeon Phi Knights Landing (KNL) processors,\u00a0and\u00a0systems with Intel Xeon CPUs,\u00a0and NVIDIA GPUs. Finally, we analyze the performance of the library when using compilers from different vendors (GNU, Intel, NAG, PGI, FLANG).","filename":"post148s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Alfio","last_name":"Lazzaro","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Andreas","last_name":"Gloess","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Juerg","last_name":"Hutter","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Tiziano","last_name":"Mueller","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Seewald","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Ilia","last_name":"Sivkov","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Andreas","last_name":"Gloess","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Alfio","last_name":"Lazzaro","affiliation":"University of 
Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Andreas","last_name":"Gloess","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Juerg","last_name":"Hutter","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Tiziano","last_name":"Mueller","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Seewald","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Ilia","last_name":"Sivkov","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false}] } Presentation
CHM09 - Materials Cloud: A Platform for Open Materials Science
Giovanni Pizzi (EPFL, Switzerland)
+ Abstract { "session": {"id":"sess143","title":"Posters in Chemistry and Materials","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Chemistry and Materials"],"slots":[{"id":"post163","type":"poster","title":"CHM01 - Accurate and Efficient Molecular Dynamics with Nuclear Quantum Effects","begin_time":"19:30","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Molecules and materials that contain light nuclei exhibit considerable deviations from classical behavior which are most pronounced at cryogenic temperatures, but extend up to room temperature and beyond. Properties such as dissociation of water in bulk phase or on catalytic surfaces, heat capacity, band gaps etc. are influenced by the quantum nature of nuclei. The precise description of quantum nuclear fluctuations in atomistic simulations is possible by employing path integral techniques, which involve a considerable computational overhead due to the need for simulating multiple replicas of the system. Consequently, simulations combined with advanced electronic structure methods are still prohibitive. 
In this talk, I will present some methodologies based on high order factorizations of the Boltzmann operator and multiple time steps in real and imaginary time, that can practically reduce the computational cost of including nuclear quantum fluctuations down to zero, while keeping interatomic interactions at high levels of electronic structure theory.","filename":"post163s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Venkat","last_name":"Kapil","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michele","last_name":"Ceriotti","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Venkat","last_name":"Kapil","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post168","type":"poster","title":"CHM02 - AiiDA: A Simulation Platform with Full Provenance Support and Flexible Workflows","begin_time":"19:42","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years, there has been a great increase in the performance and capabilities of computers. Materials science has greatly benefited from this computational boom, which is continuously boosting research, the discovery of new materials and the development of simulation codes. The \u0022materials by design\u0022 approach has become very powerful, but requires running large numbers of simulations and building databases of computed properties. A key challenge is the need to automatically prepare, execute and monitor workflows of calculations, and then retrieve and store the results in a format that is easy to browse and query. 
The AiiDA open-source platform[1] provides researchers with a tool that fulfills those requirements, by implementing the four \u0022ADES\u0022 requirement pillars of Automation, Data, Environment and Sharing. AiiDA is continuously being developed and has matured into an ecosystem with multiple backend options for increased performance and flexibility, a powerful graph querying tool for easy result analysis, a redesigned plugin system to simplify external user contributions, new more powerful and easy to write workflows and a continuous integration system to ensure the stability of the platform. [1]https:\/\/doi.org\/10.1016\/j.commatsci.2015.09.013.","filename":"post168s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Spyros","last_name":"Zoupanos","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Leonid","last_name":"Kahle","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Sebastiaan","last_name":"Huber","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Martin","last_name":"Uhrin","affiliation":"EPFL","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Nicolas","last_name":"Mounet","affiliation":"EPFL","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Rico Andreas","last_name":"H\u00e4uselmann","affiliation":"EPFL","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Snehal","last_name":"Kumbhar","affiliation":"EPFL","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Leopold","last_name":"Talirz","affiliation":"EPFL","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Andrea","last_name":"Cepellotti","affiliation":"UC 
Berkeley","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Fernando","last_name":"Gargiulo","affiliation":"EPFL","country":"Switzerland","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Andrius","last_name":"Merkys","affiliation":"Vilnius University","country":"Lithuania","bio":"","order":"11","is_presenter":false},{"type":"Author","first_name":"Boris","last_name":"Kozinsky","affiliation":"Harvard University","country":"United States of America","bio":"","order":"12","is_presenter":false},{"type":"Author","first_name":"Nicola","last_name":"Marzari","affiliation":"EPFL","country":"Switzerland","bio":"","order":"13","is_presenter":false},{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"14","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Spyros","last_name":"Zoupanos","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post171","type":"poster","title":"CHM03 - Bridging the Gap between Atomistic and Macroscopic Models of Homogeneous Nucleation","begin_time":"19:54","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Nucleation has many implications in science and technology, including metal casting, the assembly of microtubules in cells, and the formation of water droplets in the atmosphere. Because the experimental investigation of dynamical nucleation processes is very difficult, much attention has been paid to atomistic simulation efforts in the last two decades. However, atomistic simulation studies of nucleation face two major challenges. Firstly, the free energy barrier separating the metastable phase and the stable phase can be very high, making nucleation times much larger than the time scales accessible to molecular dynamics simulations. 
Secondly, it is highly non-trivial to develop a predictive macroscopic model of nucleation using the microscopic quantities directly obtained from atomistic simulations. In this poster, I aim to address the aforementioned difficulties. I will first briefly introduce state-of-the-art enhanced sampling methods for atomistic simulations, and their applications to studying homogeneous nucleation. I will then discuss our latest thermodynamic model that links macroscopic theories and atomic-scale simulations and thus provides a simple and elegant framework to verify and extend classical nucleation theory.","bio":"","contributors":[{"type":"Author","first_name":"Bingqing","last_name":"Cheng","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Michele","last_name":"Ceriotti","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Bingqing","last_name":"Cheng","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post143","type":"poster","title":"CHM04 - The Crucial Role of the Hydrogen Bonding Network in Water Oxidation Catalyzed by a\u00a0Cobalt-Cubane","begin_time":"20:06","end_time":"20:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Facing a potentially serious energy crisis towards the end of the century, development of renewable energy sources has become one of the \u0022hot\u0022 topics in modern day research. Among many different solutions, artificial water splitting promises to be a valuable source of sustainable and affordable energy in the future. However, the improvement of existing catalysts, as well as the design of novel catalysts, is a prerequisite for the successful implementation of this technology into devices and power plants. 
Informed design requires a firm understanding of underlying reaction mechanisms and how certain bottlenecks might be overcome. We employ ab initio molecular dynamics simulations to an explicitly solvated water oxidation catalyst (Co4(dpO(OH))4) in order to elucidate the elementary reaction steps and propose guidelines for the design of novel catalysts. In our studies we do not only study the catalyst itself \u2013 we also pay close attention to the crucial solute-solvent interactions which were found to play a decisive role in intramolecular proton transfer reactions giving access to a large variety of possible intermediates during the water oxidation cycles.","filename":"post143s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Mauro","last_name":"Schilling","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sandra","last_name":"Luber","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Mauro","last_name":"Schilling","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post127","type":"poster","title":"CHM05 - Datamining of Magnetic Double Perovskites","begin_time":"20:18","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Double perovskites (DPs) are a class of materials with AB\u0027B\u0027\u0027C perovskite-like structures. Given the element for the A site, the choice of B\u0027 and B\u0027\u0027 lead to different properties (conducting\/insulating\/semiconducting behaviour, different magnetic properties, etc. [1]). Even though a fair number of B\u0027\/B\u0027\u0027 combinations have been studied, there\u0027s still a lot of \u0022uncharted territory\u0022 to explore. 
[1] We adopt a data-driven approach, generating a database of computed formation energies for more than 1000 A = Sr, Ba, Ca DPs. All the calculations and database generation and management have been done through the AiiDA materials informatics infrastructure. [2] Comparison with data available in the Materials Project (MP) database is promising, with formation energy errors less than 50 meV\/atom. We retrieve the structures of all the competing phases from the MP database, we generate convex hulls and we give stability predictions for all the DPs analyzed in this study. With a specific focus on A = Sr, B\u0027 or B\u0027\u0027 = Ir DPs, we are able to determine the stability of compounds already present in literature, and give stability predictions for compounds for which there isn\u0027t experimental or theoretical data available. [1] S. Vasala et al., Progress in Solid State Chemistry 43 (2015). [2] G. Pizzi et al., Comp. Mat. Sci. 111, 218 (2016).","filename":"post127s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michele","last_name":"Visciarelli","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thor","last_name":"Wikfeldt","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Anna","last_name":"Delin","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michele","last_name":"Visciarelli","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true}]},{"id":"post182","type":"poster","title":"CHM06 - Development of a Modular API for Computation of Non-Bonded Interactions in Particle 
Simulations","begin_time":"20:30","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Non-Bonded Interactions are the heart of every particle simulation program. Over 90% of the time spent in classical MD simulations is on this step of computing forces due to non-bonded interactions. Current methods don\u0027t offer scalability properties usable in the exascale. It is also a huge software engineering overhead to use new \u0027exascale-friendly\u0027 methods on the most popular packages. We also need to expand the realm of the physics which our interactions describe. Particle simulation packages presently restrict their force fields and methods catering to specific domains. We propose an API to bring modularity and enable proliferation of methods, force fields (physics), parallelisation paradigms and hardware optimisations. The vision is to be able to call alternative back-ends to any of these on custom MD programs as well as popular software packages. This separation of concerns of the most resource intensive part from the rest of the MD pipeline is important for focussed development for the emerging exascale platforms. This would also enable particle simulations to be practical for usage in more domains such as astrophysics, fluid mechanics, etc. 
This poster highlights our vision, current progress and preliminary results.","filename":"post182s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Prashanth","last_name":"Kanduri","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Victor","last_name":"Holanda Rusu","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Claudio","last_name":"Gheller","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Prashanth","last_name":"Kanduri","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post161","type":"poster","title":"CHM07 - DFT+U Gamma-Surfaces of UO2","begin_time":"20:42","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The nuclear fuel material uranium dioxide, UO2, undergoes severe microstructural changes along its fuel cycle, forming extended defects like dislocations. Experimental characterization of structural deformations in UO2 is difficult due to safety and cost, making computer simulations vital to bridge this gap. We have carried out a systematic DFT+U study on {001}, {110} and {111} oriented gamma-surfaces of UO2, i.e. the potential energy surfaces of displacement of one crystal part with respect to the other. The DFT+U scheme relies on our earlier work on f-orbitals occupations\u0027 control. Using similar strategy, all possible f-orbital occupation patterns were considered both via single point energy calculations and the subsequent structure optimization of UO2. 
The f-orbital occupation patterns resulting in the lowest energy and minimally deformed structures were then used for gamma-surface calculations. This procedure was repeated for the {001}, {110} and {111} oriented planes. The resulting gamma-surfaces calculated at the DFT+U level of theory are both qualitatively and quantitatively different, i.e. in shape and in energies, from those computed previously by us and others using any empirical potential. This is the result of peculiar bonding interactions imposed by the resulting geometries during the gamma-surface calculation in conjunction with the specific f-orbital occupation pattern.","filename":"post161s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Monica","last_name":"Kosa","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Raoul","last_name":"Ngayam Happy","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"S\u00e9bastien","last_name":"Groh","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Matthias","last_name":"Krack","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Monica","last_name":"Kosa","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post148","type":"poster","title":"CHM08 - Improving the Performance of the DBCSR Library for Sparse Matrix Multiplication for Many-Core and GPU Computing Systems","begin_time":"20:54","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Sparse matrix-matrix multiplication is an essential building block for a wide range of algorithms in various scientific fields. 
For this task, the sparse matrix library DBCSR (Distributed Block Compressed Sparse Row) has been developed. Its multi-layered structure automatically takes care of and optimizes several computational aspects like parallelism (MPI, OpenMP, CUDA), data (cache) locality and on-the-fly filtering. Here we report on the latest performance optimization we implemented for improving the CUDA and OpenMP parallelization. For the former, a novel algorithm was implemented for the work-scheduling of the multiplication of the matrix blocks. The latter specifically addresses many-core computing systems, namely Intel Xeon Phi, where we implemented an OpenMP task-based parallel algorithm in order to improve the load-balance by means of a more dynamic scheduling of the workload. We report the performance results, in terms of time-to-solution and energy-to-solution,\u00a0of DBCSR\u00a0on systems with Intel Xeon Phi Knights Landing (KNL) processors,\u00a0and\u00a0systems with Intel Xeon CPUs,\u00a0and NVIDIA GPUs. 
Finally, we analyze the performance of the library when using compilers from different vendors (GNU, Intel, NAG, PGI, FLANG).","filename":"post148s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Alfio","last_name":"Lazzaro","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Andreas","last_name":"Gloess","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Juerg","last_name":"Hutter","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Tiziano","last_name":"Mueller","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Seewald","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Ilia","last_name":"Sivkov","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Andreas","last_name":"Gloess","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post165","type":"poster","title":"CHM09 - Materials Cloud: A Platform for Open Materials Science","begin_time":"21:06","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Materials Cloud (www.materialscloud.org) is an Open Science web portal, built to enable seamless sharing and dissemination of resources in computational materials science. It includes educational material and videos, interactive tools, cloud simulation services based on AiiDA[1] and Jupyter, and displays interactively both curated data along with the corresponding raw data. 
Being powered by AiiDA, all data is accompanied by their full provenance, allowing peers to inspect how the results have been obtained, download individual files or the whole database, and start their research from where the original authors left off. Combined also with the Archive section, where DOIs are assigned to each entry (making them citable), Materials Cloud empowers data-based discovery, while being compliant with data management plans and the FAIR principles. Among the curated data, it features SSSP, a library of pseudopotentials for electronic-structure calculations, tested and optimized for accuracy and efficiency, as well as a large database of novel 2D materials [2] with their materials properties. [1]G. Pizzi et al., Comp. Mat. Sci. 111, 218 (2016) - www.aiida.net. [2]N. Mounet et al., Nat. Nanotech. doi:10.1038\/s41565-017-0035-5 (2018)","filename":"post165s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Leopold","last_name":"Talirz","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Snehal","last_name":"Kumbhar","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Fernando","last_name":"Gargiulo","affiliation":"EPFL","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Marco","last_name":"Borelli","affiliation":"EPFL","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Elsa","last_name":"Passaro","affiliation":"EPFL","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Aliaksandr","last_name":"Yakutovich","affiliation":"EPFL","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Ol
e","last_name":"Sch\u00fctt","affiliation":"Empa","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Thomas","last_name":"Schulthess","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Nicola","last_name":"Marzari","affiliation":"EPFL","country":"Switzerland","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post176","type":"poster","title":"CHM10 - A Symmetry-Adapted Approach to Machine Learning of Tensors","begin_time":"21:18","end_time":"21:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years, machine learning has become a popular method to predict atomic-scale properties of molecular and material systems. While perhaps the most popular applications of machine learning in Chemistry and Materials Science have been to the potential energy surface of a system, a full statistical-mechanical description requires not only the potential energy but also tensorial properties such as dipole moments and polarizabilities. A machine-learning model for these properties must give predictions that transform covariantly with rigid-body rotations of the system, which adds an extra layer of complexity. This poster describes a framework for machine-learning of tensor properties of arbitrary ranks, accounting fully for the covariance of these properties. This method is an extension of Gaussian process regression (GPR), and involves a generalization of the similarity function, or kernel, of GPR to a tensorial kernel. 
This kernel builds upon the smooth overlap of atomic positions\u00a0(SOAP) kernel used for scalar properties. Results for prediction of electric response tensors of several orders, for a number of water oligomers and for bulk water, show that this method is an extremely promising one for machine-learning of these properties.","filename":"post176s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Andrea","last_name":"Grisafi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David M.","last_name":"Wilkins","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Michele","last_name":"Ceriotti","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"David M.","last_name":"Wilkins","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":true}]}]}, "slot": {"id":"post165","type":"poster","title":"CHM09 - Materials Cloud: A Platform for Open Materials Science","begin_time":"21:06","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Materials Cloud (www.materialscloud.org) is an Open Science web portal, built to enable seamless sharing and dissemination of resources in computational materials science. It includes educational material and videos, interactive tools, cloud simulation services based on AiiDA[1] and Jupyter, and displays interactively both curated data along with the corresponding raw data. Being powered by AiiDA, all data is accompanied by their full provenance, allowing peers to inspect how the results have been obtained, download individual files or the whole database, and start their research from where the original authors left off. 
Combined also with the Archive section, where DOIs are assigned to each entry (making them citable), Materials Cloud empowers data-based discovery, while being compliant with data management plans and the FAIR principles. Among the curated data, it features SSSP, a library of pseudopotentials for electronic-structure calculations, tested and optimized for accuracy and efficiency, as well as a large database of novel 2D materials [2] with their materials properties. [1]G. Pizzi et al., Comp. Mat. Sci. 111, 218 (2016) - www.aiida.net. [2]N. Mounet et al., Nat. Nanotech. doi:10.1038\/s41565-017-0035-5 (2018)","filename":"post165s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Leopold","last_name":"Talirz","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Snehal","last_name":"Kumbhar","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Fernando","last_name":"Gargiulo","affiliation":"EPFL","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Marco","last_name":"Borelli","affiliation":"EPFL","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Elsa","last_name":"Passaro","affiliation":"EPFL","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Aliaksandr","last_name":"Yakutovich","affiliation":"EPFL","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Ole","last_name":"Sch\u00fctt","affiliation":"Empa","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ 
CSCS","country":"Switzerland","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Thomas","last_name":"Schulthess","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Nicola","last_name":"Marzari","affiliation":"EPFL","country":"Switzerland","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Leopold","last_name":"Talirz","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Snehal","last_name":"Kumbhar","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Fernando","last_name":"Gargiulo","affiliation":"EPFL","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Marco","last_name":"Borelli","affiliation":"EPFL","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Elsa","last_name":"Passaro","affiliation":"EPFL","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Aliaksandr","last_name":"Yakutovich","affiliation":"EPFL","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Ole","last_name":"Sch\u00fctt","affiliation":"Empa","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ 
CSCS","country":"Switzerland","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Thomas","last_name":"Schulthess","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Nicola","last_name":"Marzari","affiliation":"EPFL","country":"Switzerland","bio":"","order":"11","is_presenter":false}] } Presentation
CHM10 - A Symmetry-Adapted Approach to Machine Learning of Tensors
David M. Wilkins (EPFL, Switzerland)
+ Abstract { "session": {"id":"sess143","title":"Posters in Chemistry and Materials","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Chemistry and Materials"],"slots":[{"id":"post163","type":"poster","title":"CHM01 - Accurate and Efficient Molecular Dynamics with Nuclear Quantum Effects","begin_time":"19:30","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Molecules and materials that contain light nuclei exhibit considerable deviations from classical behavior which are most pronounced at cryogenic temperatures, but extend up to room temperature and beyond. Properties such as dissociation of water in bulk phase or on catalytic surfaces, heat capacity, band gaps etc. are influenced by the quantum nature of nuclei. The precise description of quantum nuclear fluctuations in atomistic simulations is possible by employing path integral techniques, which involve a considerable computational overhead due to the need for simulating multiple replicas of the system. Consequently, simulations combined with advanced electronic structure methods are still prohibitive. 
In this talk, I will present some methodologies based on high order factorizations of the Boltzmann operator and multiple time steps in real and imaginary time, that can practically reduce the computational cost of including nuclear quantum fluctuations down to zero, while keeping interatomic interactions at high levels of electronic structure theory.","filename":"post163s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Venkat","last_name":"Kapil","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michele","last_name":"Ceriotti","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Venkat","last_name":"Kapil","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post168","type":"poster","title":"CHM02 - AiiDA: A Simulation Platform with Full Provenance Support and Flexible Workflows","begin_time":"19:42","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years, there has been a great increase in the performance and capabilities of computers. Materials science has greatly benefited from this computational boom, which is continuously boosting research, the discovery of new materials and the development of simulation codes. The \u0022materials by design\u0022 approach has become very powerful, but requires running large numbers of simulations and building databases of computed properties. A key challenge is the need to automatically prepare, execute and monitor workflows of calculations, and then retrieve and store the results in a format that is easy to browse and query. 
The AiiDA open-source platform[1] provides researchers with a tool that fulfills those requirements, by implementing the four \u0022ADES\u0022 requirement pillars of Automation, Data, Environment and Sharing. AiiDA is continuously being developed and has matured into an ecosystem with multiple backend options for increased performance and flexibility, a powerful graph querying tool for easy result analysis, a redesigned plugin system to simplify external user contributions, new more powerful and easy to write workflows and a continuous integration system to ensure the stability of the platform. [1]https:\/\/doi.org\/10.1016\/j.commatsci.2015.09.013.","filename":"post168s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Spyros","last_name":"Zoupanos","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Leonid","last_name":"Kahle","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Sebastiaan","last_name":"Huber","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Martin","last_name":"Uhrin","affiliation":"EPFL","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Nicolas","last_name":"Mounet","affiliation":"EPFL","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Rico Andreas","last_name":"H\u00e4uselmann","affiliation":"EPFL","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Snehal","last_name":"Kumbhar","affiliation":"EPFL","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Leopold","last_name":"Talirz","affiliation":"EPFL","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Andrea","last_name":"Cepellotti","affiliation":"UC 
Berkeley","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Fernando","last_name":"Gargiulo","affiliation":"EPFL","country":"Switzerland","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Andrius","last_name":"Merkys","affiliation":"Vilnius University","country":"Lithuania","bio":"","order":"11","is_presenter":false},{"type":"Author","first_name":"Boris","last_name":"Kozinsky","affiliation":"Harvard University","country":"United States of America","bio":"","order":"12","is_presenter":false},{"type":"Author","first_name":"Nicola","last_name":"Marzari","affiliation":"EPFL","country":"Switzerland","bio":"","order":"13","is_presenter":false},{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"14","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Spyros","last_name":"Zoupanos","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post171","type":"poster","title":"CHM03 - Bridging the Gap between Atomistic and Macroscopic Models of Homogeneous Nucleation","begin_time":"19:54","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Nucleation has many implications in science and technology, including metal casting, the assembly of microtubules in cells, and the formation of water droplets in the atmosphere. Because the experimental investigation of dynamical nucleation processes is very difficult, much attention has been paid to atomistic simulation efforts in the last two decades. However, atomistic simulation studies of nucleation face two major challenges. Firstly, the free energy barrier separating the metastable phase and the stable phase can be very high, making nucleation times much larger than the time scales accessible to molecular dynamics simulations. 
Secondly, it is highly non-trivial to develop a predictive macroscopic model of nucleation using the microscopic quantities directly obtained from atomistic simulations. In this poster, I aim to address the aforementioned difficulties. I will first briefly introduce state-of-the-art enhanced sampling methods for atomistic simulations, and their applications to studying homogeneous nucleation. I will then discuss our latest thermodynamic model that links macroscopic theories and atomic-scale simulations and thus provides a simple and elegant framework to verify and extend classical nucleation theory.","bio":"","contributors":[{"type":"Author","first_name":"Bingqing","last_name":"Cheng","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Michele","last_name":"Ceriotti","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Bingqing","last_name":"Cheng","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post143","type":"poster","title":"CHM04 - The Crucial Role of the Hydrogen Bonding Network in Water Oxidation Catalyzed by a\u00a0Cobalt-Cubane","begin_time":"20:06","end_time":"20:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Facing a potentially serious energy crisis towards the end of the century, development of renewable energy sources has become one of the \u0022hot\u0022 topics in modern-day research. Among many different solutions, artificial water splitting promises to be a valuable source of sustainable and affordable energy in the future. However, the improvement of existing catalysts, as well as the design of novel catalysts, is a prerequisite for the successful implementation of this technology in devices and power plants. 
Informed design requires a firm understanding of underlying reaction mechanisms and how certain bottlenecks might be overcome. We employ ab initio molecular dynamics simulations to an explicitly solvated water oxidation catalyst (Co4(dpO(OH))4) in order to elucidate the elementary reaction steps and propose guidelines for the design of novel catalysts. In our studies we do not only study the catalyst itself \u2013 we also pay close attention to the crucial solute-solvent interactions which were found to play a decisive role in intramolecular proton transfer reactions giving access to a large variety of possible intermediates during the water oxidation cycles.","filename":"post143s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Mauro","last_name":"Schilling","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sandra","last_name":"Luber","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Mauro","last_name":"Schilling","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post127","type":"poster","title":"CHM05 - Datamining of Magnetic Double Perovskites","begin_time":"20:18","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Double perovskites (DPs) are a class of materials with AB\u0027B\u0027\u0027C perovskite-like structures. Given the element for the A site, the choice of B\u0027 and B\u0027\u0027 lead to different properties (conducting\/insulating\/semiconducting behaviour, different magnetic properties, etc. [1]). Even though a fair number of B\u0027\/B\u0027\u0027 combinations have been studied, there\u0027s still a lot of \u0022uncharted territory\u0022 to explore. 
[1] We adopt a data-driven approach, generating a database of computed formation energies for more than 1000 A = Sr, Ba, Ca DPs. All the calculations and database generation and management have been done through the AiiDA materials\u0027 informatics infrastructure. [2] Comparison with data available in the Materials Project (MP) database is promising, with formation energy errors less than 50 meV\/atom. We retrieve the structures of all the competing phases from the MP database, we generate convex hulls and we give stability predictions for all the DPs analyzed in this study. With a specific focus on A = Sr, B\u0027 or B\u0027\u0027 = Ir DPs, we are able to determine the stability of compounds already present in the literature, and give stability predictions for compounds for which no experimental or theoretical data are available. [1] S. Vasala et al., Progress in Solid State Chemistry 43 (2015). [2] G. Pizzi et al., Comp. Mat. Sci. 111, 218 (2016).","filename":"post127s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michele","last_name":"Visciarelli","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thor","last_name":"Wikfeldt","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Anna","last_name":"Delin","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michele","last_name":"Visciarelli","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"1","is_presenter":true}]},{"id":"post182","type":"poster","title":"CHM06 - Development of a Modular API for Computation of Non-Bonded Interactions in Particle 
Simulations","begin_time":"20:30","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Non-Bonded Interactions are the heart of every particle simulation program. Over 90% of the time spent in classical MD simulations is on this step of computing forces due to non-bonded interactions. Current methods don\u0027t offer scalability properties usable in the exascale. It is also a huge software engineering overhead to use new \u0027exascale-friendly\u0027 methods on the most popular packages. We also need to expand the realm of the physics which our interactions describe. Particle simulation packages presently restrict their force fields and methods catering to specific domains. We propose an API to bring modularity and enable proliferation of methods, force fields (physics), parallelisation paradigms and hardware optimisations. The vision is to be able to call alternative back-ends to any of these on custom MD programs as well as popular software packages. This separation of concerns of the most resource intensive part from the rest of the MD pipeline is important for focussed development for the emerging exascale platforms. This would also enable particle simulations to be practical for usage in more domains such as astrophysics, fluid mechanics, etc. 
This poster highlights our vision, current progress and preliminary results.","filename":"post182s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Prashanth","last_name":"Kanduri","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Victor","last_name":"Holanda Rusu","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Claudio","last_name":"Gheller","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Prashanth","last_name":"Kanduri","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post161","type":"poster","title":"CHM07 - DFT+U Gamma-Surfaces of UO2","begin_time":"20:42","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The nuclear fuel material uranium dioxide, UO2, undergoes severe microstructural changes along its fuel cycle, forming extended defects like dislocations. Experimental characterization of structural deformations in UO2 is difficult due to safety and cost, making computer simulations vital to bridge this gap. We have carried out a systematic DFT+U study on {001}, {110} and {111} oriented gamma-surfaces of UO2, i.e. the potential energy surfaces of displacement of one crystal part with respect to the other. The DFT+U scheme relies on our earlier work on f-orbitals occupations\u0027 control. Using similar strategy, all possible f-orbital occupation patterns were considered both via single point energy calculations and the subsequent structure optimization of UO2. 
The f-orbital occupations patterns resulting in lowest energy and minimal deformed structures were further used for gamma-surface calculations. This procedure was repeated for the {001}, {110} and {111} oriented planes. The resulting gamma-surfaces calculated at the DFT+U level of theory are both qualitatively and quantitatively different, i.e. the shape and the energies, from those computed previously by us and others using any empirical potential. It is the result of a peculiar bonding interactions imposed by the resulting geometries during the gamma-surface calculation in conjunction with the specific f-orbital occupation pattern.","filename":"post161s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Monica","last_name":"Kosa","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Raoul","last_name":"Ngayam Happy","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"S\u00e9bastien","last_name":"Groh","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Matthias","last_name":"Krack","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Monica","last_name":"Kosa","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post148","type":"poster","title":"CHM08 - Improving the Performance of the DBCSR Library for Sparse Matrix Multiplication for Many-Core and GPU Computing Systems","begin_time":"20:54","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Sparse matrix-matrix multiplication is an essential building block for a wide range of algorithms in various scientific fields. 
For this task, the sparse matrix library DBCSR (Distributed Block Compressed Sparse Row) has been developed. Its multi-layered structure automatically takes care of and optimizes several computational aspects like parallelism (MPI, OpenMP, CUDA), data (cache) locality and on-the-fly filtering. Here we report on the latest performance optimization we implemented for improving the CUDA and OpenMP parallelization. For the former, a novel algorithm was implemented for the work-scheduling of the multiplication of the matrix blocks. The latter specifically addresses many-core computing systems, namely Intel Xeon Phi, where we implemented an OpenMP task-based parallel algorithm in order to improve the load-balance by means of a more dynamic scheduling of the workload. We report the performance results, in terms of time-to-solution and energy-to-solution,\u00a0of DBCSR\u00a0on systems with Intel Xeon Phi Knights Landing (KNL) processors,\u00a0and\u00a0systems with Intel Xeon CPUs,\u00a0and NVIDIA GPUs. 
Finally, we analyze the performance of the library when using compilers from different vendors (GNU, Intel, NAG, PGI, FLANG).","filename":"post148s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Alfio","last_name":"Lazzaro","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Andreas","last_name":"Gloess","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Juerg","last_name":"Hutter","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Tiziano","last_name":"Mueller","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Seewald","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Ilia","last_name":"Sivkov","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Andreas","last_name":"Gloess","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post165","type":"poster","title":"CHM09 - Materials Cloud: A Platform for Open Materials Science","begin_time":"21:06","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Materials Cloud (www.materialscloud.org) is an Open Science web portal, built to enable seamless sharing and dissemination of resources in computational materials science. It includes educational material and videos, interactive tools, cloud simulation services based on AiiDA[1] and Jupyter, and displays interactively both curated data along with the corresponding raw data. 
Being powered by AiiDA, all data is accompanied by their full provenance, allowing peers to inspect how the results have been obtained, download individual files or the whole database, and start their research from where the original authors left off. Combined also with the Archive section, where DOIs are assigned to each entry (making them citable), Materials Cloud empowers data-based discovery, while being compliant with data management plans and the FAIR principles. Among the curated data, it features SSSP, a library of pseudopotentials for electronic-structure calculations, tested and optimized for accuracy and efficiency, as well as a large database of novel 2D materials [2] with their materials properties. [1]G. Pizzi et al., Comp. Mat. Sci. 111, 218 (2016) - www.aiida.net. [2]N. Mounet et al., Nat. Nanotech. doi:10.1038\/s41565-017-0035-5 (2018)","filename":"post165s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Leopold","last_name":"Talirz","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Snehal","last_name":"Kumbhar","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Fernando","last_name":"Gargiulo","affiliation":"EPFL","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Marco","last_name":"Borelli","affiliation":"EPFL","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Elsa","last_name":"Passaro","affiliation":"EPFL","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Aliaksandr","last_name":"Yakutovich","affiliation":"EPFL","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Ol
e","last_name":"Sch\u00fctt","affiliation":"Empa","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Thomas","last_name":"Schulthess","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Nicola","last_name":"Marzari","affiliation":"EPFL","country":"Switzerland","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Giovanni","last_name":"Pizzi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post176","type":"poster","title":"CHM10 - A Symmetry-Adapted Approach to Machine Learning of Tensors","begin_time":"21:18","end_time":"21:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years, machine learning has become a popular method to predict atomic-scale properties of molecular and material systems. While perhaps the most popular applications of machine learning in Chemistry and Materials Science have been to the potential energy surface of a system, a full statistical-mechanical description requires not only the potential energy but also tensorial properties such as dipole moments and polarizabilities. A machine-learning model for these properties must give predictions that transform covariantly with rigid-body rotations of the system, which adds an extra layer of complexity. This poster describes a framework for machine-learning of tensor properties of arbitrary ranks, accounting fully for the covariance of these properties. This method is an extension of Gaussian process regression (GPR), and involves a generalization of the similarity function, or kernel, of GPR to a tensorial kernel. 
This kernel builds upon the smooth overlap of atomic positions\u00a0(SOAP) kernel used for scalar properties. Results for prediction of electric response tensors of several orders, for a number of water oligomers and for bulk water, show that this method is an extremely promising one for machine-learning of these properties.","filename":"post176s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Andrea","last_name":"Grisafi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David M.","last_name":"Wilkins","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Michele","last_name":"Ceriotti","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"David M.","last_name":"Wilkins","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":true}]}]}, "slot": {"id":"post176","type":"poster","title":"CHM10 - A Symmetry-Adapted Approach to Machine Learning of Tensors","begin_time":"21:18","end_time":"21:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years, machine learning has become a popular method to predict atomic-scale properties of molecular and material systems. While perhaps the most popular applications of machine learning in Chemistry and Materials Science have been to the potential energy surface of a system, a full statistical-mechanical description requires not only the potential energy but also tensorial properties such as dipole moments and polarizabilities. A machine-learning model for these properties must give predictions that transform covariantly with rigid-body rotations of the system, which adds an extra layer of complexity. 
This poster describes a framework for machine-learning of tensor properties of arbitrary ranks, accounting fully for the covariance of these properties. This method is an extension of Gaussian process regression (GPR), and involves a generalization of the similarity function, or kernel, of GPR to a tensorial kernel. This kernel builds upon the smooth overlap of atomic positions\u00a0(SOAP) kernel used for scalar properties. Results for prediction of electric response tensors of several orders, for a number of water oligomers and for bulk water, show that this method is an extremely promising one for machine-learning of these properties.","filename":"post176s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Andrea","last_name":"Grisafi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David M.","last_name":"Wilkins","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Michele","last_name":"Ceriotti","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"David M.","last_name":"Wilkins","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Andrea","last_name":"Grisafi","affiliation":"EPFL","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David M.","last_name":"Wilkins","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Michele","last_name":"Ceriotti","affiliation":"EPFL","country":"Switzerland","bio":"","order":"3","is_presenter":false}] } Presentation
CLW01 - Automatic Optimization of Domain Specific Languages for Weather and Climate Models
Tobias Wicky (ETH Zurich, Switzerland)
+ Abstract
Session: Posters in Climate and Weather (track: Climate and Weather), Tuesday, July 3rd 2018, 19:30 - 21:30, Foyer 2nd Floor
Poster slot: 19:30 - 19:50
Authors: Tobias Wicky (ETH Zurich, Switzerland; presenter), Fabian Thuering (ETH Zurich, Switzerland), Carlos E. Osuna (MeteoSwiss, Switzerland), Oliver Fuhrer (MeteoSwiss, Switzerland), Torsten Hoefler (ETH Zurich, Switzerland)
We present a compiler toolchain for domain-specific languages (DSLs) that allows for easier design of high-level DSLs for weather and climate models. The growing diversity of computing architectures where scientific models need to run is leading to a decrease of productivity due to the necessity to incorporate multiple programming models. DSLs have been proposed to separate the algorithmic implementation from the architecture-specific implementation. However, DSLs are developed for specific domains or individual applications. Therefore, there is little reuse between existing complex tools, leading to high maintenance costs. Our toolchain provides efficient code optimizers and code generation for multiple architectures to any high-level DSL by means of a standard intermediate representation (IR). This significantly reduces development efforts since these parts can be fully reused. Based on that toolchain, we build a high-level DSL that allows for a full implementation of the dynamical core of the COSMO numerical weather prediction and regional climate model. This results in a decrease of code size by a factor of about ten compared to the implementation that is currently in production. This is valuable as it increases productivity as well as maintainability. The automatic optimization is able to achieve speedups compared to expert-tuned production code.
Presentation
CLW02 - ESiWACE: Performance Predictions for Storm-Resolving Simulations of the Climate System
Joachim Biercamp (German Climate Computing Center, Germany)
+ Abstract { "session": {"id":"sess144","title":"Posters in Climate and Weather","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Climate and Weather"],"slots":[{"id":"post155","type":"poster","title":"CLW01 - Automatic Optimization of Domain Specific Languages for Weather and Climate Models","begin_time":"19:30","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"We present a compiler toolchain for domain-specific languages (DSLs) that allows for easier design of high-level DSLs for weather and climate models. The growing diversity of computing architectures where scientific models need to run is leading to a decrease of productivity due to the necessity to incorporate multiple programming models. DSLs have been proposed to separate the algorithmic implementation from the architecture-specific implementation. However, DSLs are developed for specific domains or individual applications. Therefore, there is little reuse between existing complex tools, leading to high maintenance costs. Our toolchain provides efficient code optimizers and code generation for multiple architectures to any high-level DSL by means of a standard intermediate representation (IR). This significantly reduces development efforts since these parts can be fully reused. Based on that toolchain, we build a high-level DSL that allows for a full implementation of the dynamical core of the COSMO numerical weather prediction and regional climate model. This results in a decrease of code size by a factor of about ten compared to the implementation that is currently in production. This is valuable as it increases productivity as well as maintainability. 
The automatic optimization is able to achieve speedups compared to expert tuned production code.","filename":"post155s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Tobias","last_name":"Wicky","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Fabian","last_name":"Thuering","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Carlos E.","last_name":"Osuna","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Oliver","last_name":"Fuhrer","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Hoefler","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Tobias","last_name":"Wicky","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post121","type":"poster","title":"CLW02 - ESiWACE: Performance Predictions for Storm-Resolving Simulations of the Climate System","begin_time":"19:50","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With exascale computing becoming available in the next decade, global weather prediction at the kilometer scale will be enabled. Moreover, the climate community has already begun to contemplate a new generation of high-resolution climate models. High-resolution model development is confronted with several challenges. Scalability of the models needs to be optimal, including all relevant components such as I\/O which easily becomes a bottleneck; both runtime and I\/O will dictate how fine a resolution can be chosen while still being able to run the model at production level, e.g. 
at 1-30 years\/day depending on the questions to be addressed. Moreover, given various scalability experiments from prototypical runs and additional model data, estimating performance for new simulations can become challenging. I present results achieved in the scope of the Centre of Excellence in Simulation of Weather and Climate in Europe (ESiWACE) using the ICON model for global high-resolution simulations. I give an overview of the project, I show results from multi-week global 5km simulations, and I discuss current features and limits of the simulations. I further link the findings to a new intercomparison initiative DYAMOND for high-resolution predictions. Finally, I discuss performance prediction approaches for existing performance data.","filename":"post121s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Philipp","last_name":"Neumann","affiliation":"German Climate Computing Center","country":"Germany","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Joachim","last_name":"Biercamp","affiliation":"German Climate Computing Center","country":"Germany","bio":"","order":"2","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Joachim","last_name":"Biercamp","affiliation":"German Climate Computing Center","country":"Germany","bio":"","order":"2","is_presenter":true}]},{"id":"post136","type":"poster","title":"CLW03 - Experiments with Containerising a State-of-the-Art Weather and Climate Model for Application in HPC","begin_time":"20:10","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Modern weather and climate models have massively parallel complex codes and complex workflow environments. Porting these codes is inherently difficult, as is installing, developing and maintaining the relevant infrastructure - particularly where reproducibility of workflow and output is required. The associated effort is a major drain on resources. 
Containerisation directly addresses many of the issues; by packaging the entire application software stack and runtime into a portable stand-alone container, much of the repetitive work is removed, as is the scope for \u0022installer error\u0022. In this work we present an ongoing study into the use of containers to package (specific) climate configurations of the Met Office Unified Model - a complete weather and climate model and workflow system. Three different container solutions have been used: Docker, Singularity and Shifter. A single container was developed, compatible with all three container systems, and comprising model build and control infrastructures and executables. Issues addressed include making use of the MPI ABI, to facilitate the use by the containerised executable of local fast interconnects, and access to the local file system. Results are presented from environments which range from exploiting a virtual cluster using SLURM on a laptop through to using a Cray XC30 supercomputer.","filename":"post136s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Simon","last_name":"Wilson","affiliation":"NCAS-CMS","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Bryan","last_name":"Lawrence","affiliation":"NCAS-CMS","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Simon","last_name":"Wilson","affiliation":"NCAS-CMS","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post133","type":"poster","title":"CLW04 - Performance Study of Climate and Weather Models: Towards a More Efficiently Scalable Model","begin_time":"20:30","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The enhancement of numerical codes is given a lot of attention around Europe. 
Weather and climate models are improving the accuracy of their simulations with some factors such as the reduction of parametrization or the increase of grid resolution. However, this accuracy improvement will need more computational resources through a new generation of supercomputers. To take advantage of this new generation, performance analysis could help to know in detail the computational behavior of the models and the information obtained could be used to introduce optimizations. The optimizations will improve the energy efficiency of the models when thousands of resources are used for their parallel execution. Similarly to previous works using profiling tools for EC-Earth, NEMO and IFS, in this study we present our methodology and analyse results to know more about the computational behavior of different Weather and Climate models. Additionally, we present how to improve the functionality and performance of some of the bottlenecks.","filename":"post133s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Mario","last_name":"Acosta","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Xavier","last_name":"Yepes","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Oriol","last_name":"Tinto","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kim","last_name":"Serradell","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Miguel","last_name":"Castrillo","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Mario","last_name":"Acosta","affiliation":"Barcelona Supercomputing 
Center","country":"Spain","bio":"","order":"1","is_presenter":true}]},{"id":"post164","type":"poster","title":"CLW05 - Porting Physical Parameterizations from a Climate Model to Accelerators","begin_time":"20:50","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"ICON (ICOsahedral Non-hydrostatic) is a climate and numerical weather prediction model being developed by the Max Planck Institute for Meteorology (MPI-M) and the German Weather Service (DWD). Together with MPI-M and DWD, MeteoSwiss, the Center for Climate Systems Modeling (C2SM\/ETH), and the Swiss National Supercomputing Center (CSCS) are porting ICON to GPUs and many-core architectures. Within the model, physical parameterizations calculate the collective effect of physical phenomena which occur on a sub-grid scale. We suggest multiple directive-based approaches of porting these parameterizations to accelerators, such as using the OpenACC standard or the CLAW source-to-source translator. Allowing the retention of a single Fortran code, directive approach can offer a high degree of performance portability. Using the FortranTestGenerator tool for automatic unit test generation for Fortran subroutines, the turbulence parameterization is isolated in a testbed subset of the model, so that subsequent changes can be easily validated. Tool-based analysis of loop kernels using the Roofline model is used to estimate attainable performance on various platforms, in particular i86-based multi-core CPUs as well as NVIDIA GPUs. 
The validated turbulence parameterization, running within a testbed framework, can be integrated into the overall ICON model.","filename":"post164s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Thomas","last_name":"K\u00f6ster","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gerhard","last_name":"Wellein","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"William","last_name":"Sawyer","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Xavier","last_name":"Lapillonne","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Thomas","last_name":"K\u00f6ster","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post129","type":"poster","title":"CLW06 - Towards More Efficient Adjustment of Free Parameters in a Global Climate Model?","begin_time":"21:10","end_time":"21:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Several climate relevant processes are not resolved by the spatial discretization of current global climate models (GCMs). Tropical thunderstorms, for example, with their deep convection must be included via some sub-grid-scale parameterization. Associated free parameters must be adjusted to obtain a \u0027physically meaningful climate\u0027. 
This \u0027model tuning\u0027 is expensive: the number of tuning parameters is significant (several tens), the response of the model climate to changes in these parameters is typically non-linear, and to quantify this response the GCM must be run for some time, at considerable computational costs. We illustrate model tuning with the example of MPI-ESM-HAM (Max Planck Earth System Model, coupled to the Hamburg Aerosol Module), and discuss how the steadily growing number of GCM simulations (data points) during the tuning process may be exploited to arrive at an overall more efficient, more resource friendly tuning procedure.","filename":"post129s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Doris","last_name":"Folini","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Martin","last_name":"Wild","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Doris","last_name":"Folini","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"post121","type":"poster","title":"CLW02 - ESiWACE: Performance Predictions for Storm-Resolving Simulations of the Climate System","begin_time":"19:50","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With exascale computing becoming available in the next decade, global weather prediction at the kilometer scale will be enabled. Moreover, the climate community has already begun to contemplate a new generation of high-resolution climate models. High-resolution model development is confronted with several challenges. 
Scalability of the models needs to be optimal, including all relevant components such as I\/O which easily becomes a bottleneck; both runtime and I\/O will dictate how fine a resolution can be chosen while still being able to run the model at production level, e.g. at 1-30 years\/day depending on the questions to be addressed. Moreover, given various scalability experiments from prototypical runs and additional model data, estimating performance for new simulations can become challenging. I present results achieved in the scope of the Centre of Excellence in Simulation of Weather and Climate in Europe (ESiWACE) using the ICON model for global high-resolution simulations. I give an overview of the project, I show results from multi-week global 5km simulations, and I discuss current features and limits of the simulations. I further link the findings to a new intercomparison initiative DYAMOND for high-resolution predictions. Finally, I discuss performance prediction approaches for existing performance data.","filename":"post121s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Philipp","last_name":"Neumann","affiliation":"German Climate Computing Center","country":"Germany","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Joachim","last_name":"Biercamp","affiliation":"German Climate Computing Center","country":"Germany","bio":"","order":"2","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Joachim","last_name":"Biercamp","affiliation":"German Climate Computing Center","country":"Germany","bio":"","order":"2","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Philipp","last_name":"Neumann","affiliation":"German Climate Computing Center","country":"Germany","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Joachim","last_name":"Biercamp","affiliation":"German Climate Computing 
Center","country":"Germany","bio":"","order":"2","is_presenter":true}] } Presentation
CLW03 - Experiments with Containerising a State-of-the-Art Weather and Climate Model for Application in HPC
Simon Wilson (NCAS-CMS, United Kingdom)
Abstract: Modern weather and climate models have massively parallel complex codes and complex workflow environments. Porting these codes is inherently difficult, as is installing, developing and maintaining the relevant infrastructure - particularly where reproducibility of workflow and output is required. The associated effort is a major drain on resources. Containerisation directly addresses many of the issues; by packaging the entire application software stack and runtime into a portable stand-alone container, much of the repetitive work is removed, as is the scope for "installer error". In this work we present an ongoing study into the use of containers to package (specific) climate configurations of the Met Office Unified Model - a complete weather and climate model and workflow system. Three different container solutions have been used: Docker, Singularity and Shifter. A single container was developed, compatible with all three container systems, and comprising model build and control infrastructures and executables. Issues addressed include making use of the MPI ABI, so that the containerised executable can use local fast interconnects, and access to the local file system. Results are presented from environments which range from exploiting a virtual cluster using SLURM on a laptop through to using a Cray XC30 supercomputer.
Authors: Simon Wilson (NCAS-CMS, United Kingdom; presenter), Bryan Lawrence (NCAS-CMS, United Kingdom)
Poster slot: 20:10 - 20:30, Foyer 2nd Floor
Presentation (post136s2.pdf)
CLW04 - Performance Study of Climate and Weather Models: Towards a More Efficiently Scalable Model
Mario Acosta (Barcelona Supercomputing Center, Spain)
Posters in Climate and Weather
Tuesday, July 3, 2018
20:30 - 20:50
Foyer 2nd Floor
Abstract
The enhancement of numerical codes is receiving a lot of attention around Europe. Weather and climate models are improving the accuracy of their simulations through factors such as the reduction of parametrization or the increase of grid resolution. However, this accuracy improvement will need more computational resources, provided by a new generation of supercomputers. To take advantage of this new generation, performance analysis can reveal the computational behavior of the models in detail, and the information obtained can be used to introduce optimizations. The optimizations will improve the energy efficiency of the models when thousands of resources are used for their parallel execution. As in previous work using profiling tools for EC-Earth, NEMO and IFS, in this study we present our methodology and analyse results to learn more about the computational behavior of different weather and climate models. Additionally, we present how to improve the functionality and performance of some of the bottlenecks.
Authors
Mario Acosta (Barcelona Supercomputing Center, Spain), presenter
Xavier Yepes (Barcelona Supercomputing Center, Spain)
Oriol Tinto (Barcelona Supercomputing Center, Spain)
Kim Serradell (Barcelona Supercomputing Center, Spain)
Miguel Castrillo (Barcelona Supercomputing Center, Spain)
Presentation: post133s2.pdf
CLW05 - Porting Physical Parameterizations from a Climate Model to Accelerators
Thomas Köster (Università della Svizzera italiana, Switzerland)
Posters in Climate and Weather
Tuesday, July 3, 2018
20:50 - 21:10
Foyer 2nd Floor
Abstract
ICON (ICOsahedral Non-hydrostatic) is a climate and numerical weather prediction model being developed by the Max Planck Institute for Meteorology (MPI-M) and the German Weather Service (DWD). Together with MPI-M and DWD, MeteoSwiss, the Center for Climate Systems Modeling (C2SM/ETH), and the Swiss National Supercomputing Center (CSCS) are porting ICON to GPUs and many-core architectures. Within the model, physical parameterizations calculate the collective effect of physical phenomena which occur on a sub-grid scale. We suggest multiple directive-based approaches to porting these parameterizations to accelerators, such as using the OpenACC standard or the CLAW source-to-source translator. Because it allows a single Fortran code to be retained, a directive-based approach can offer a high degree of performance portability. Using the FortranTestGenerator tool for automatic unit test generation for Fortran subroutines, the turbulence parameterization is isolated in a testbed subset of the model, so that subsequent changes can be easily validated. Tool-based analysis of loop kernels using the Roofline model is used to estimate attainable performance on various platforms, in particular i86-based multi-core CPUs as well as NVIDIA GPUs. The validated turbulence parameterization, running within a testbed framework, can be integrated into the overall ICON model.
Authors
Thomas Köster (Università della Svizzera italiana, Switzerland), presenter
Olaf Schenk (Università della Svizzera italiana, Switzerland)
Gerhard Wellein (Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany)
William Sawyer (ETH Zurich / CSCS, Switzerland)
Xavier Lapillonne (MeteoSwiss, Switzerland)
Presentation: post164s2.pdf
CLW06 - Towards More Efficient Adjustment of Free Parameters in a Global Climate Model?
Doris Folini (ETH Zurich, Switzerland)
21:10 - 21:30
Posters in Climate and Weather, Foyer 2nd Floor
Authors: Doris Folini (ETH Zurich, Switzerland), Martin Wild (ETH Zurich, Switzerland)
Several climate relevant processes are not resolved by the spatial discretization of current global climate models (GCMs). Tropical thunderstorms, for example, with their deep convection must be included via some sub-grid-scale parameterization. Associated free parameters must be adjusted to obtain a 'physically meaningful climate'. This 'model tuning' is expensive: the number of tuning parameters is significant (several tens), the response of the model climate to changes in these parameters is typically non-linear, and to quantify this response the GCM must be run for some time, at considerable computational costs. We illustrate model tuning with the example of MPI-ESM-HAM (Max Planck Earth System Model, coupled to the Hamburg Aerosol Module), and discuss how the steadily growing number of GCM simulations (data points) during the tuning process may be exploited to arrive at an overall more efficient, more resource friendly tuning procedure.
CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases
Antonio Maffia (University of Basel, Switzerland)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservations laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementation of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on a Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massive parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging tasks for domain specialists who do not have sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications being executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution and ensures authentication and authorization to provided functions, necessary security for data management, monitoring and reporting of executed HPC jobs and their progress and provides current information about the state of the cluster. 
The ability to execute HPC jobs through a web platform gives users intuitive and straightforward access to HPC resources without requiring HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations of irregular workloads may also suffer from process misplacement, especially in strong scaling mode of operations. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today simulating such networks is possible on petascale computers as, for example, the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking, and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iterations execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include, decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables entire infrastructure by encrypting sensitive files on a system\/network and demands for huge amounts of ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly rely on high-level programming interfaces or scripting languages. This is achieved separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges for smart cities are the solutions envisioning a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements the traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed the enhanced Traffic simulator on HPC infrastructure for testing an efficiency and usability of the navigation system. Building blocks of the simulator include server-side navigation system, virtual Smart City world, benchmark settings, and navigation test bed, which contains industrial Sygic client-side navigation and simplified simulation of vehicles. The important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without enabled global view calculation of traffic network, and for a given percentage of vehicles connected to the server-side service. 
The integration of the Sygic navigation to the large-scale traffic simulator enables to perform compliance test of real navigation applications to the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. 
Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of 
Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}] } Presentation
CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method
Christoph Rettinger (Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservations laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards its operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. To exploit node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on a Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging for domain specialists who do not have a sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution, ensures authentication and authorization for the provided functions and the necessary security for data management, monitors and reports on executed HPC jobs and their progress, and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations with irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today simulating such networks is possible on petascale computers as, for example, the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from the computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires the data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library, leads to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables entire infrastructure by encrypting sensitive files on a system\/network and demands for huge amounts of ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Navigation in smart cities calls for solutions built around a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements a traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure for testing the efficiency and usability of the navigation system. The building blocks of the simulator include a server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without the global view calculation of the traffic network enabled, and for a given percentage of vehicles connected to the server-side service. 
The integration of the Sygic navigation into the large-scale traffic simulator makes it possible to test the compliance of real navigation applications with the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. 
Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. 
Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}] } Presentation
CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?
Aurélien Cavelan (University of Basel, Switzerland)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data are investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on the Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massive parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging tasks for domain specialists who do not have sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications being executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution and ensures authentication and authorization to provided functions, necessary security for data management, monitoring and reporting of executed HPC jobs and their progress and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations of irregular workloads may also suffer from process misplacement especially in strong scaling mode of operations. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today simulating such networks is possible on petascale computers as, for example, the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. A computer vision application is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension, creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking, and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provide such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables an entire infrastructure by encrypting sensitive files on a system\/network and demands a large ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators: file system analysis for malicious contents using Hadoop, data integrity checks by generating hash codes using C#, machine learning algorithms to predict ransomware-prone files, and monitoring of the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges for smart cities are the solutions envisioning a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements the traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed the enhanced Traffic simulator on HPC infrastructure for testing an efficiency and usability of the navigation system. Building blocks of the simulator include server-side navigation system, virtual Smart City world, benchmark settings, and navigation test bed, which contains industrial Sygic client-side navigation and simplified simulation of vehicles. The important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without enabled global view calculation of traffic network, and for a given percentage of vehicles connected to the server-side service. 
Integrating the Sygic navigation into the large-scale traffic simulator makes it possible to test the compliance of real navigation applications with the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of 
Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}] } Presentation
CSM04 - Balanced Graph Partition Refinement Using the Graph p-Laplacian
Dimosthenis Pasadakis (Università della Svizzera italiana, Switzerland)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom-tailored interventions. This will require changes not only in how biomedical research is performed, but also in the associated IT infrastructure. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges, BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on a Pan-European power grid with a large number of contingency scenarios demonstrate that the problem can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are regarded as highly computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massive parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging tasks for domain specialists who do not have sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications being executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution and ensures authentication and authorization to provided functions, necessary security for data management, monitoring and reporting of executed HPC jobs and their progress and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations of irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations were performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today simulating such networks is possible on petascale computers as, for example, the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension, creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking, and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes (PaRSEC, StarPU, and OpenMP), which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparison with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, and larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: a 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provide such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables an entire infrastructure by encrypting sensitive files on a system\/network and demands a large ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators: file system analysis for malicious contents using Hadoop, data integrity checks by generating hash codes using C#, machine learning algorithms to predict ransomware-prone files, and file system log monitoring to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly rely on high-level programming interfaces or scripting languages. This is achieved separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges for smart cities are the solutions envisioning a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements the traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed the enhanced Traffic simulator on HPC infrastructure for testing an efficiency and usability of the navigation system. Building blocks of the simulator include server-side navigation system, virtual Smart City world, benchmark settings, and navigation test bed, which contains industrial Sygic client-side navigation and simplified simulation of vehicles. The important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without enabled global view calculation of traffic network, and for a given percentage of vehicles connected to the server-side service. 
The integration of the Sygic navigation to the large-scale traffic simulator enables to perform compliance test of real navigation applications to the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. 
The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementation of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of 
Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}] } Presentation
CSM05 - BioMedIT: Enabling Interoperable Biomedical Analysis
Jaroslaw Surkont (University of Basel, Switzerland)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservations laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementation of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on Pan-European power grid with large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations require considering a large number of particles, leading to immense computational costs. Simulating such systems thus require increasingly long time frames and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among different parallel threads available in currently available accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massive parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging tasks for domain specialists who do not have sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications being executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution and ensures authentication and authorization to provided functions, necessary security for data management, monitoring and reporting of executed HPC jobs and their progress and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations of irregular workloads may also suffer from process misplacement especially in strong scaling mode of operations. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to careful process placement. Simulations were performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today, simulating such networks is possible on petascale computers such as the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance-critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables entire infrastructure by encrypting sensitive files on a system\/network and demands for huge amounts of ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges for smart cities are the solutions envisioning a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements a traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure for testing the efficiency and usability of the navigation system. Building blocks of the simulator include a server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without enabled global view calculation of the traffic network, and for a given percentage of vehicles connected to the server-side service. 
The integration of the Sygic navigation to the large-scale traffic simulator enables to perform compliance test of real navigation applications to the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. 
The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of 
Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}] } Presentation
CSM06 - A Distributed Parallel Approach for Large Scale Optimal Power Flow with Security Constraints
Juraj Kardos (Università della Svizzera italiana, Switzerland)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementation of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on the Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general nonlinear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against the state-of-the-art interior point optimization package IPOPT, which is frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation, and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging for domain specialists who do not have a sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC-as-a-Service middleware. The middleware provides functionality for remote execution; ensures authentication and authorization for the provided functions and the necessary security for data management; supports monitoring and reporting of executed HPC jobs and their progress; and provides current information about the state of the cluster. 
The ability to execute HPC jobs through a web platform gives users intuitive and straightforward access to HPC resources without requiring HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations with irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to careful process placement. Simulations have been performed on Cray XC systems using the rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today, simulating such networks is possible on petascale computers such as the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lam\u00e9, Helmholtz, and wave equations. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the most computationally intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes: PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served as the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance-critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related, in a uniform way through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe its semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iterations execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include, decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables entire infrastructure by encrypting sensitive files on a system\/network and demands for huge amounts of ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly rely on high-level programming interfaces or scripting languages. This is achieved separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges for smart cities are the solutions envisioning a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements the traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed the enhanced Traffic simulator on HPC infrastructure for testing an efficiency and usability of the navigation system. Building blocks of the simulator include server-side navigation system, virtual Smart City world, benchmark settings, and navigation test bed, which contains industrial Sygic client-side navigation and simplified simulation of vehicles. The important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without enabled global view calculation of traffic network, and for a given percentage of vehicles connected to the server-side service. 
The integration of the Sygic navigation to the large-scale traffic simulator enables to perform compliance test of real navigation applications to the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate with strict security measures and\u00a0be resilient to failures of its components. 
Increased penetration of renewable energy sources is placing greater stress on the grid, pushing power grid equipment towards its operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security-constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools within the short time frames required by industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. To exploit node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on Pan-European power grid with large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}] } Presentation
CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation
Samuel Adolfo Cruz Alegría (Università della Svizzera italiana, Switzerland)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservations laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementation of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom-tailored intervention. This will require changes not only in how biomedical research is performed, but also in the associated IT infrastructure. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges, BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards its operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. To exploit node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on the Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive-based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges representing data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging for domain specialists who do not have a sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC-as-a-Service middleware. The middleware provides functionality for remote execution and ensures authentication and authorization to provided functions, necessary security for data management, monitoring and reporting of executed HPC jobs and their progress, and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations of irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to careful process placement. Simulations were performed on Cray XC systems using the rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today, simulating such networks is possible on petascale computers such as the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension, creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables an entire infrastructure by encrypting sensitive files on a system\/network and demands a large ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators: file system analysis for malicious content using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware-prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges of smart cities call for solutions that envision a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements the traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure for testing the efficiency and usability of the navigation system. Building blocks of the simulator include a server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. The important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without global view calculation of the traffic network enabled, and for a given percentage of vehicles connected to the server-side service. 
The integration of the Sygic navigation into the large-scale traffic simulator enables compliance testing of real navigation applications against the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. 
Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive-based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master\u0027s student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Samuel 
Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}] } Presentation
CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems
Manav Choudhary (Università della Svizzera italiana, Switzerland)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have shown increasing demand for high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are, however, not appropriate for the majority of this community. These users prefer more widespread, high-level approaches, such as those offered by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data are investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of optimal 2-way graph partitioning based on p-norm minimization of the graph Laplacian Rayleigh quotient is presented. It provides a sharp approximation to the balanced graph partitioning problem, which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient on parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra.
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom-tailored interventions. This will require changes not only in how biomedical research is performed, but also in the associated IT infrastructure. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges, BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites.
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and be resilient to failures of its components. Increased penetration of renewable energy sources is placing greater stress on the grid, shifting the operation of power grid equipment towards its operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security-constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power grid will remain secure and within operational limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices. We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. To exploit node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem.
Numerical experiments on a Pan-European power grid with a large number of contingency scenarios demonstrate that the problem can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators.
This poster explores the efficiency and scalability of parallelization based on the OpenACC programming standard, a directive-based standard for parallel computing that offloads computational kernels to a GPU accelerator. The poster is based on a master's student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges representing data tensors.
In this work, we investigate the potential of using TensorFlow for solving large-scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general nonlinear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against the state-of-the-art interior-point optimization package IPOPT, which is frequently used for solving such problems. This work was done as a master's student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is a state-of-the-art method for solving constrained optimization problems that arise in structural engineering. TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution.
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are regarded as highly computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in two dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high-resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master's student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging for domain specialists who do not have a sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC-as-a-Service middleware. The middleware provides functionality for remote execution; ensures authentication and authorization for the provided functions and the necessary security for data management; monitors and reports on executed HPC jobs and their progress; and provides current information about the state of the cluster.
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations with irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to careful process placement. Simulations were performed on Cray XC systems using the rank reordering features of the cray-mpich library.
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today, simulating such networks is possible on petascale computers such as the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from the computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires the data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library, leads to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a load-balanced execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables entire infrastructure by encrypting sensitive files on a system\/network and demands for huge amounts of ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges for smart cities call for solutions that envision a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements a traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure for testing the efficiency and usability of the navigation system. The building blocks of the simulator include a server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without the global view calculation of the traffic network enabled, and for a given percentage of vehicles connected to the server-side service. 
The integration of the Sygic navigation to the large-scale traffic simulator enables to perform compliance test of real navigation applications to the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}] } Presentation
CSM09 - High Performance Topology Optimization
Ezekiel Barnett (Università della Svizzera italiana, Switzerland)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data are investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of renewable energy sources is placing greater stress on the grid, shifting the operation of power grid equipment towards its operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security-constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools within the short time frames required by industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. To exploit node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on the Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation, and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging for domain specialists who lack sufficient experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC-as-a-Service middleware. The middleware provides functionality for remote execution; ensures authentication and authorization for the provided functions and the necessary security for data management; monitors and reports on executed HPC jobs and their progress; and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations with irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to careful process placement. Simulations were performed on Cray XC systems using the rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today simulating such networks is possible on petascale computers as, for example, the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. A computer vision application is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension, creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires the data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library, leads to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served as the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance-critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a load-balanced execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables entire infrastructure by encrypting sensitive files on a system\/network and demands for huge amounts of ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly rely on high-level programming interfaces or scripting languages. This is achieved separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges for smart cities are the solutions envisioning a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements the traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed the enhanced Traffic simulator on HPC infrastructure for testing an efficiency and usability of the navigation system. Building blocks of the simulator include server-side navigation system, virtual Smart City world, benchmark settings, and navigation test bed, which contains industrial Sygic client-side navigation and simplified simulation of vehicles. The important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without enabled global view calculation of traffic network, and for a given percentage of vehicles connected to the server-side service. 
The integration of the Sygic navigation to the large-scale traffic simulator enables to perform compliance test of real navigation applications to the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera 
italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}] } Presentation
CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug Discovery
Vaclav Svaton (IT4Innovations National Supercomputing Center, Czech Republic)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementation of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on the Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massive parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging tasks for domain specialists who do not have sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications being executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution and ensures authentication and authorization to provided functions, necessary security for data management, monitoring and reporting of executed HPC jobs and their progress and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations with irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today, simulating such networks is possible on petascale computers such as the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: a 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provide such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables an entire infrastructure by encrypting sensitive files on a system\/network and demands a large ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators: file system analysis for malicious content using Hadoop, data integrity checks based on hash codes generated using C#, machine learning algorithms that predict ransomware-prone files, and monitoring of the file system log for suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Navigation solutions for smart cities envision a central, knowledgeable routing server that collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements a traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure for testing the efficiency and usability of the navigation system. The building blocks of the simulator include a server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is its ability to evaluate the traffic flow control strategy in the Smart City world, with and without the global view calculation of the traffic network enabled, and for a given percentage of vehicles connected to the server-side service. 
Integrating the Sygic navigation into the large-scale traffic simulator makes it possible to test the compliance of real navigation applications with the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. 
Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation, and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging for domain specialists who lack sufficient experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC-as-a-Service middleware. The middleware provides functionality for remote execution; ensures authentication and authorization for the provided functions, the necessary security for data management, and monitoring and reporting of executed HPC jobs and their progress; and provides current information about the state of the cluster. Executing HPC jobs through a web platform gives users intuitive and straightforward access to HPC resources without requiring HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing 
Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}] } Presentation
CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms
Aniello Esposito (Cray Inc., United Kingdom)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have had increasing demands for high-performance computing systems. The de facto HPC programming standards (OpenMP and MPI) are, however, not appropriate for the majority of this community. These users prefer more widespread, high-level approaches, such as those offered by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists can specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom-tailored interventions. This will require changes not only in how biomedical research is performed, but also in the associated IT infrastructure. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges, BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and be resilient to failures of its components. Increased penetration of renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards its operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power grid will remain secure and within operational limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices. We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. To utilize node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on a Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive-based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. Nodes of the graph represent mathematical operations, and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large-scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general nonlinear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against IPOPT, a state-of-the-art interior-point optimization package frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is a state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in two dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high-resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging for domain specialists who do not have a sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution, ensures authentication and authorization for the provided functions and the necessary security for data management, monitors and reports on executed HPC jobs and their progress, and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations with irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today, simulating such networks is possible on petascale computers such as the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparison with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables an entire infrastructure by encrypting sensitive files on a system\/network and demands a large ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators: file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware-prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Navigation solutions for smart cities envision a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements a traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure for testing the efficiency and usability of the navigation system. Building blocks of the simulator include a server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without enabled global view calculation of the traffic network, and for a given percentage of vehicles connected to the server-side service. 
The integration of the Sygic navigation into the large-scale traffic simulator makes it possible to perform compliance tests of real navigation applications against the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication time and load balance especially in network-bound codes. 
We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations with irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}] } Presentation
CSM13 - Neuronal Network Simulation Code for the Exascale Era
, Susanne Kunkel (Norwegian University of Life Sciences, Norway)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have had an increasing demand for high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more widespread, high-level approaches, such as those offered by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom-tailored intervention. This will require changes not only in how biomedical research is performed, but also in the associated IT infrastructure. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges, BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on a Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive-based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges representing data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massive parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging tasks for domain specialists who do not have a sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution and ensures authentication and authorization to provided functions, necessary security for data management, monitoring and reporting of executed HPC jobs and their progress and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations of irregular workloads may also suffer from process misplacement especially in strong scaling mode of operations. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and number of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today simulating such networks is possible on petascale computers as, for example, the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from the computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires the data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library, leads to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task-scheduling runtime derives the dataflow using per-task data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes (PaRSEC, StarPU, and OpenMP), which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has served as the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance-critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe its semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iterations execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include, decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained through the implementation (configuration and extension of the original code base), execution, evaluation, and analysis of these codes on two modern HPC systems for a common test case: a 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables entire infrastructure by encrypting sensitive files on a system\/network and demands for huge amounts of ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges of smart cities call for solutions that envision a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements a traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure to test the efficiency and usability of the navigation system. Building blocks of the simulator include a server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without the global view calculation of the traffic network enabled, and for a given percentage of vehicles connected to the server-side service. 
The integration of the Sygic navigation into the large-scale traffic simulator makes it possible to perform compliance tests of real navigation applications against the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. 
Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. Today simulating such networks is possible on petascale computers as, for example, the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum 
J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum 
J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}] } Presentation
CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange
Anshu Dubey (Argonne National Laboratory, United States of America)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data are investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, a problem known to be NP-hard to solve optimally. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards its operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in the short time frames of industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. To exploit node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on a Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation, and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging for domain specialists who lack sufficient experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution, ensures authentication and authorization for the provided functions, provides the necessary security for data management, monitors and reports on executed HPC jobs and their progress, and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations of irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today, simulating such networks is possible on petascale computers such as the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reduced communication, which is expensive. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension, creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While the algorithm is appealing and simple at first sight, its implementation details are tricky, and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns, and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking, and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes (PaRSEC, StarPU, and OpenMP), which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provide such a framework. In this poster, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables an entire infrastructure by encrypting sensitive files on a system\/network and demands a large ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators: file system analysis for malicious contents using Hadoop, data integrity checks based on hash codes generated using C#, machine learning algorithms that predict ransomware-prone files, and monitoring of the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Navigation challenges for smart cities call for solutions built around a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements a traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure for testing the efficiency and usability of the navigation system. Building blocks of the simulator include a server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without global view calculation of the traffic network enabled, and for a given percentage of vehicles connected to the server-side service. 
Integrating the Sygic navigation into the large-scale traffic simulator makes it possible to test the compliance of real navigation applications with the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. 
We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}] } Presentation
CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases
Michael Chan (University of Missouri - St. Louis, United States of America)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data are investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient on parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on a Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation, and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging for domain specialists who lack sufficient experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution; ensures authentication and authorization for the provided functions and the necessary security for data management; monitors and reports on executed HPC jobs and their progress; and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations of irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today simulating such networks is possible on petascale computers as, for example, the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations, and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance-critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables entire infrastructure by encrypting sensitive files on a system\/network and demands for huge amounts of ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges for smart cities are the solutions envisioning a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements the traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure for testing the efficiency and usability of the navigation system. Building blocks of the simulator include a server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without global view calculation of the traffic network enabled, and for a given percentage of vehicles connected to the server-side service. 
Integrating the Sygic navigation into the large-scale traffic simulator makes it possible to test the compliance of real navigation applications with the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. 
A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}] } Presentation
CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos
, Matthias Frey (Paul Scherrer Institute, Switzerland)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on the Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be a challenging task for domain specialists who do not have a sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution and ensures authentication and authorization to provided functions, necessary security for data management, monitoring and reporting of executed HPC jobs and their progress and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations of irregular workloads may also suffer from process misplacement especially in strong scaling mode of operations. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today, simulating such networks is possible on petascale computers such as the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: a 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extract their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables an entire infrastructure by encrypting sensitive files on a system\/network and demands a large ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators: file system analysis for malicious content using Hadoop, data integrity checks via hash codes generated in C#, machine learning algorithms to predict ransomware-prone files, and file system log monitoring to track suspicious file activity. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges of smart cities call for solutions that envision a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements the traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed the enhanced Traffic simulator on HPC infrastructure to test the efficiency and usability of the navigation system. Building blocks of the simulator include the server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without global view calculation of the traffic network enabled, and for a given percentage of vehicles connected to the server-side service. 
Integrating the Sygic navigation into the large-scale traffic simulator makes it possible to test the compliance of real navigation applications with the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. 
In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}] } Presentation
CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems
Ahmed Eleliemy (University of Basel, Switzerland)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient on parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom-tailored intervention. This will require changes not only in how biomedical research is performed, but also in the associated IT infrastructure. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges, BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards its operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. To exploit node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on a Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massive parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging tasks for domain specialists who do not have sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications being executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution and ensures authentication and authorization to provided functions, necessary security for data management, monitoring and reporting of executed HPC jobs and their progress and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations of irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today simulating such networks is possible on petascale computers as, for example, the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from the computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and that do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking, and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served as the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance-critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe its semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve load-balanced execution of scientific applications on high-performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, and larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables entire infrastructure by encrypting sensitive files on a system\/network and demands for huge amounts of ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid such modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Navigation in smart cities calls for solutions built around a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements a traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure for testing the efficiency and usability of the navigation system. Building blocks of the simulator include the server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without global view calculation of the traffic network enabled, and for a given percentage of vehicles connected to the server-side service. 
Integrating the Sygic navigation into the large-scale traffic simulator makes it possible to test the compliance of real navigation applications with the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load-balanced executions of applications. 
The use of DLS techniques, such as the self-scheduling-based techniques, in scientific applications has shown significant performance advantages over static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. 
The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}] } Presentation
CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication
Marko Kabic (ETH Zurich / CSCS, Switzerland)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data are investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementation of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of renewable energy sources is placing greater stress on the grid, pushing power grid equipment towards its operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security-constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools within the short time frames required by industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. To exploit node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on a Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation, and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging for domain specialists who lack sufficient experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC-as-a-Service middleware. The middleware provides remote execution, authentication and authorization for the functions it exposes, the security needed for data management, monitoring and reporting of executed HPC jobs and their progress, and current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations with irregular workloads may also suffer from process misplacement, especially in strong-scaling mode of operation. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today simulating such networks is possible on petascale computers as, for example, the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is that it reduces the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lam\u00e9, Helmholtz, and wave equations. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared-memory parallelization and SIMD vectorization of the most computationally intensive kernels. Two approaches have been implemented for the distributed-memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. A computer vision application is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension, creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a load-balanced execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables entire infrastructure by encrypting sensitive files on a system\/network and demands for huge amounts of ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Navigation challenges in smart cities call for solutions built around a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements the traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure to test the efficiency and usability of the navigation system. Building blocks of the simulator include a server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without global-view calculation of the traffic network enabled, and for a given percentage of vehicles connected to the server-side service. 
The integration of the Sygic navigation into the large-scale traffic simulator makes it possible to perform compliance tests of real navigation applications against the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. 
CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension, creating smaller subproblems that are then recursively solved sequentially or in parallel, depending on the available memory. While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ 
CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}] } Presentation
CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software
Heike Jagode (University of Tennessee, United States of America)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data are investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on a Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massive parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging tasks for domain specialists who do not have sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications being executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution and ensures authentication and authorization to provided functions, necessary security for data management, monitoring and reporting of executed HPC jobs and their progress and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations with irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today simulating such networks is possible on petascale computers as, for example, the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from the computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires the data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library, leads to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iterations execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include, decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables entire infrastructure by encrypting sensitive files on a system\/network and demands for huge amounts of ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly rely on high-level programming interfaces or scripting languages. This is achieved separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges for smart cities are the solutions envisioning a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements the traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed the enhanced Traffic simulator on HPC infrastructure for testing an efficiency and usability of the navigation system. Building blocks of the simulator include server-side navigation system, virtual Smart City world, benchmark settings, and navigation test bed, which contains industrial Sygic client-side navigation and simplified simulation of vehicles. The important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without enabled global view calculation of traffic network, and for a given percentage of vehicles connected to the server-side service. 
The integration of the Sygic navigation to the large-scale traffic simulator enables to perform compliance test of real navigation applications to the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. 
The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]}, "slotContributors": 
[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}] } Presentation
CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment
, James W. D. Hobro (Schlumberger, United Kingdom)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data are investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on the Pan-European power grid with a large number of contingency scenarios demonstrate that the problem can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads of currently available accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general nonlinear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against the state-of-the-art interior-point optimization package IPOPT, which is frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are regarded as highly computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massive parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging tasks for domain specialists who do not have sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications being executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution and ensures authentication and authorization to provided functions, necessary security for data management, monitoring and reporting of executed HPC jobs and their progress and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations with irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today, simulating such networks is possible on petascale computers such as the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lam\u00e9, Helmholtz, and wave equations. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared-memory parallelization and SIMD vectorization of the most computationally intensive kernels. Two approaches have been implemented for the distributed-memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and the approaches to vectorization and parallelization, as well as the results of scalability experiments performed on Xeon- and Xeon Phi-based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models, an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement wastes memory in regions of void; block-structured adaptive mesh refinement algorithms are therefore more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension, creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
Although the algorithm is appealing and simple at first sight, the implementation details are tricky, and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking, and a compatibility layer to the established ScaLAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes (PaRSEC, StarPU, and OpenMP), which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparison with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance-critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provide such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables an entire infrastructure by encrypting sensitive files on a system\/network and demands a huge ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators: file system analysis for malicious contents using Hadoop, data integrity checks by generating hash codes using C#, machine learning algorithms to predict ransomware-prone files, and monitoring of the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Navigation for smart cities calls for solutions built around a central, knowledgeable routing server that collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements a traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure to test the efficiency and usability of the navigation system. The building blocks of the simulator include the server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without global view calculation of the traffic network enabled, and for a given percentage of vehicles connected to the server-side service. 
Integrating the Sygic navigation into the large-scale traffic simulator makes it possible to run compliance tests of real navigation applications against the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. 
However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"James W. 
D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}] } Presentation
CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance
Anthony Danalis (University of Tennessee, United States of America)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards its operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. To utilize node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on a Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general nonlinear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against the state-of-the-art interior point optimization package IPOPT, which is frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation, and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging for domain specialists who do not have a sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC-as-a-Service middleware. The middleware provides functionality for remote execution, ensures authentication and authorization for the provided functions and the necessary security for data management, supports monitoring and reporting of executed HPC jobs and their progress, and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations with irregular workloads may also suffer from process misplacement, especially in strong scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to careful process placement. Simulations have been performed on Cray XC systems using the rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today simulating such networks is possible on petascale computers as, for example, the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from the computer vision domain is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded to reduce expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension, creating smaller subproblems that are then solved recursively, either sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality but requires fewer intermediate copies during execution, has improved memory access patterns, and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking, and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which are the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparison with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served as the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance-critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe its semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables an entire infrastructure by encrypting sensitive files on a system\/network and demands a large ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators: file system analysis for malicious content using Hadoop, data integrity checks based on hash codes generated with C#, machine learning algorithms that predict ransomware-prone files, and monitoring of the file system log for suspicious file activity. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Navigation for smart cities calls for solutions built around a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements a traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure to test the efficiency and usability of the navigation system. Building blocks of the simulator include the server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without enabled global view calculation of the traffic network, and for a given percentage of vehicles connected to the server-side service. 
Integrating the Sygic navigation into the large-scale traffic simulator makes it possible to run compliance tests of real navigation applications against the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. 
This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]}, "slotContributors": 
[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}] } Presentation
CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations
, Ali Mohammed (University of Basel, Switzerland)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have had a growing demand for high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are, however, not appropriate for the majority of this community. These users prefer more widespread, high-level approaches, such as those offered by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of optimal 2-way graph partitioning, based on p-norm minimization of the graph Laplacian Rayleigh quotient, is presented. It provides a sharp approximation to the balanced graph partitioning problem, whose optimal solution is known to be NP-hard to compute. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom-tailored interventions. This will require changes not only in how biomedical research is performed, but also in the associated IT infrastructure. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges, BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on a Pan-European power grid with a large number of contingency scenarios demonstrate that the problem can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation, and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging for domain specialists who lack sufficient experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC-as-a-Service middleware. The middleware provides functionality for remote execution, ensures authentication and authorization for the provided functions, supplies the necessary security for data management, monitors and reports on executed HPC jobs and their progress, and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are usually assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations with irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms: the QR-based Dynamically Weighted Halley method (QDWH) and the method based on Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54% thanks to careful process placement. Simulations were performed on Cray XC systems using the rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today simulating such networks is possible on petascale computers as, for example, the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from the computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires the data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library, leads to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the major challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served as the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance-critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related, in a uniform way through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe its semantics. Second, we showcase the usefulness of SDE through its use in software layers as diverse as the compiler\/library tool ByFL and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iterations execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include, decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables an entire infrastructure by encrypting sensitive files on a system\/network and demands a large ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators: file system analysis for malicious content using Hadoop, data integrity checks based on hash codes generated in C#, machine learning algorithms that predict ransomware-prone files, and monitoring of the file system log for suspicious file activity. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly relies on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Navigation in smart cities calls for solutions built around a central, knowledgeable routing server that collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements a traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure to test the efficiency and usability of the navigation system. The building blocks of the simulator include the server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without the global view calculation of the traffic network enabled, and for a given percentage of vehicles connected to the server-side service. 
Integrating the Sygic navigation into the large-scale traffic simulator makes it possible to test the compliance of real navigation applications with the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. 
Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high-performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University 
of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}] } Presentation
CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics
Danilo Guerrera (University of Basel, Switzerland)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data are investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom-tailored interventions. This will require changes not only in how biomedical research is performed, but also in the associated IT infrastructure. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges, BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on the Pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive-based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges representing data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large-scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general nonlinear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against the state-of-the-art interior-point optimization package IPOPT, which is frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is a state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation, and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging for domain specialists who do not have a sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC-as-a-Service middleware. The middleware provides functionality for remote execution; ensures authentication and authorization for the provided functions and the necessary security for data management; supports monitoring and reporting of executed HPC jobs and their progress; and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations with irregular workloads may also suffer from process misplacement, especially in strong scaling mode of operation. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today, simulating such networks is possible on petascale computers such as the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equations. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the most computationally intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes: PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables an entire infrastructure by encrypting sensitive files on a system\/network and demanding a large ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators: file system analysis for malicious content using Hadoop, data integrity checks based on hash codes generated using C#, machine learning algorithms that predict ransomware-prone files, and file system log monitoring to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software relies mainly on high-level programming interfaces or scripting languages. This is achieved by separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges for smart cities call for solutions built around a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements a traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure for testing the efficiency and usability of the navigation system. Building blocks of the simulator include the server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without global view calculation of the traffic network enabled, and for a given percentage of vehicles connected to the server-side service. 
Integrating the Sygic navigation into the large-scale traffic simulator makes it possible to perform compliance tests of real navigation applications against the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. 
Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de 
Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of America","bio":"","order":"11","is_presenter":false}] } Presentation
CSM26 - Towards Whole Program Generation for Ocean Modeling
Sebastian Kuckuk (Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data is investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient on parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and be resilient to failures of its components. Increased penetration of renewable energy sources is placing greater stress on the grid, shifting operation of power grid equipment towards its operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security-constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power grid will remain secure and within operational limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in the short time frames of industrial operations, such as real-time electricity market responses to electricity prices. We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. To exploit node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on the Pan-European power grid with a large number of contingency scenarios demonstrate that the problem can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general nonlinear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against the state-of-the-art interior point optimization package IPOPT, which is frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is a state-of-the-art method for solving constrained optimization problems that arise in structural engineering. TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are regarded as highly computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in two dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation, and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be a challenging task for domain specialists who do not have a sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution; ensures authentication and authorization for the provided functions, the necessary security for data management, and monitoring and reporting of executed HPC jobs and their progress; and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations with irregular workloads may also suffer from process misplacement, especially in strong scaling mode of operations. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today, simulating such networks is possible on petascale computers such as the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from the computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires the data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library, leads to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iterations execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include, decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables entire infrastructure by encrypting sensitive files on a system\/network and demands for huge amounts of ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly rely on high-level programming interfaces or scripting languages. This is achieved separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges for smart cities are the solutions envisioning a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements the traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed the enhanced Traffic simulator on HPC infrastructure for testing an efficiency and usability of the navigation system. Building blocks of the simulator include server-side navigation system, virtual Smart City world, benchmark settings, and navigation test bed, which contains industrial Sygic client-side navigation and simplified simulation of vehicles. The important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without enabled global view calculation of traffic network, and for a given percentage of vehicles connected to the server-side service. 
The integration of the Sygic navigation to the large-scale traffic simulator enables to perform compliance test of real navigation applications to the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. 
ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t 
Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}] } Presentation
CSM27 - Using Data Analysis Techniques to Detect Ransomware
Sushma Yellapragada (The Northcap University, India)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data are investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach\u00a0reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient in parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom-tailored interventions. This will require changes not only in how biomedical research is performed, but also in the associated IT infrastructure. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges, BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on a pan-European power grid with a large number of contingency scenarios demonstrate that the problem\u00a0can be efficiently solved.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads of currently available accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large-scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general nonlinear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against the state-of-the-art interior-point optimization package IPOPT, which is frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are\u00a0regarded as highly\u00a0computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging tasks for domain specialists who do not have a sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution; ensures authentication and authorization for the provided functions and the necessary security for data management; supports monitoring and reporting of executed HPC jobs and their progress; and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations of irregular workloads may also suffer from process misplacement, especially in strong scaling mode of operations. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today, simulating such networks is possible on petascale computers such as the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from the computer vision is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires the data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library, leads to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance-critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe its semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iterations execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include, decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors, such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized, mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables entire infrastructure by encrypting sensitive files on a system\/network and demands for huge amounts of ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly rely on high-level programming interfaces or scripting languages. This is achieved separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges for smart cities are the solutions envisioning a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements the traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed the enhanced Traffic simulator on HPC infrastructure for testing an efficiency and usability of the navigation system. Building blocks of the simulator include server-side navigation system, virtual Smart City world, benchmark settings, and navigation test bed, which contains industrial Sygic client-side navigation and simplified simulation of vehicles. The important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without enabled global view calculation of traffic network, and for a given percentage of vehicles connected to the server-side service. 
Integrating the Sygic navigation into the large-scale traffic simulator makes it possible to perform compliance tests of real navigation applications against the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables an entire infrastructure by encrypting sensitive files on a system\/network and demands a large ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. 
In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap 
University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}] } Presentation
CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing
Nur Aiman Fadel (ETH Zurich / CSCS, Switzerland)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data are investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, the optimality of which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient on parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on the Pan-European power grid with a large number of contingency scenarios demonstrate that the problem can be solved efficiently.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general non linear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against state-of-the-art interior point optimization package IPOPT frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is one state-of-the-art method for solving\u00a0constrained optimization problems that arise in structural engineering.\u00a0TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are regarded as highly computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be a challenging task for domain specialists who do not have a sufficient level of experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC as a Service Middleware. The middleware provides functionality for remote execution and ensures authentication and authorization to provided functions, necessary security for data management, monitoring and reporting of executed HPC jobs and their progress and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations of irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54%, thanks to a careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today, simulating such networks is possible on petascale computers such as the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. A computer vision application is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes.\u00a0The algorithm recursively splits the largest matrix dimension creating smaller subproblems which are then recursively solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries.\u00a0Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries.\u00a0Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparison with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served as the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance-critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally-intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a balanced load execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained by the implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems, for a common test case: a 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables an entire infrastructure by encrypting sensitive files on a system\/network and demands a large ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators: file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware-prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly rely on high-level programming interfaces or scripting languages. This is achieved separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Navigation in smart cities calls for solutions built around a central, knowledgeable routing server that collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements a traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed an enhanced traffic simulator on HPC infrastructure to test the efficiency and usability of the navigation system. The building blocks of the simulator include a server-side navigation system, a virtual Smart City world, benchmark settings, and a navigation test bed, which contains the industrial Sygic client-side navigation and a simplified simulation of vehicles. An important feature of the simulator is its ability to evaluate the traffic flow control strategy in the Smart City world, with and without the global-view calculation of the traffic network enabled, and for a given percentage of vehicles connected to the server-side service. 
Integrating the Sygic navigation into the large-scale traffic simulator makes it possible to run compliance tests of real navigation applications against the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]}, "slot": {"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. 
To avoid modifications, state-of-the-art software mainly rely on high-level programming interfaces or scripting languages. This is achieved separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ 
CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}] } Presentation
CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic Simulator
Vit Ptosek (IT4Innovations National Supercomputing Center, Czech Republic)
+ Abstract { "session": {"id":"sess145","title":"Posters in Computer Science and Applied Mathematics","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"post149","type":"poster","title":"CSM01 - Accelerating Life Science Notebook Applications: Architectural Issues and Use Cases","begin_time":"19:30","end_time":"19:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"For quite some time, life science researchers have increasing demands in using high-performance computing systems. The de-facto HPC programming standards (OpenMP and MPI) are however not appropriate for the majority of this community. These users prefer more wide-spread, high-level approaches, such as given by Python and R environments. Our HPC and web computing project builds a bridge between these two worlds. Computational pharmacists are enabled to specify their problems in a Jupyter Notebook environment (jupyter.org). Depending on the computational load, a notebook can be executed either locally on a user workstation or remotely on an HPC system. Users are freed from knowing HPC system-specific details because remote calls will be assisted by HPC container support (e.g. Docker). Our prototype implementation is a distributed architecture which consists of two subsystems: an extended Jupyter Notebook for supporting Python\/R programming and Prova! (prova.io) for handling user sessions and interfacing with remote HPC systems (computational experiment server). As drug design will more and more depend on simulation, computational reproducibility will be a mandatory requirement, which our system fully supports. 
During the poster session we explain the architecture and demonstrate sample use cases such as lung cancer image detection and stochastic optimization.","filename":"post149s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Helmar","last_name":"Burkhart","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Gang","last_name":"Mu","affiliation":"Roche","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Antonio","last_name":"Maffia","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post130","type":"poster","title":"CSM02 - Adaptive Grid Refinement Techniques for Particulate Flow Simulations with the Lattice Boltzmann Method","begin_time":"19:34","end_time":"19:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Particulate flows are encountered in various application fields, examples being fluidized beds in chemical engineering and sediment transport in riverbeds relevant in environmental engineering. Here, simulations that feature geometrically fully resolved particles are desired since they enable accurate predictions from first principles. The high computational costs, however, usually impose a strong limitation on the system size. In many cases, the flow structures in the vicinity of the particles are of special interest since they influence the particle motion and thus need to be appropriately numerically resolved. On the other hand, regions without particles have less restrictive resolution requirements and allow for coarser grids. 
With adaptive grid refinement, we can significantly improve the efficiency of such simulations since the overall workload is reduced. We present and evaluate different refinement approaches for particulate flows by comparing their accuracy and performance to simulations with uniform grids. Furthermore, we discuss load balancing strategies to distribute the workload evenly among the available computing resources. This is essential for efficient massively parallel simulations and requires accurate predictors for the local workload generated by the coupled simulation. Illustrating examples from the aforementioned application fields will be presented to demonstrate the generality and flexibility of our approach.","filename":"post130s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"R\u00fcde","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Christoph","last_name":"Rettinger","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post154","type":"poster","title":"CSM03 - Are Smooth Particle Hydrodynamics Applications Inherently Resilient to Faults?","begin_time":"19:38","end_time":"19:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Increasing the number of system components is the most viable path towards increasing the computational power of current and future computing systems. Unfortunately, this also contributes to increasing the number of faults, errors, and failures in high performance computing (HPC) applications. 
Silent data corruptions (SDC) typically result from bit-flips in the HPC system memory and pose a major threat to the correctness of the results. Current error detection techniques for hydrodynamics applications rely on global invariants: properties that hold in the simulated physical model, such as total mass, momentum, and energy conservation. Yet, state-of-the-art methods to resolve conservation laws are based on approximations, which result in imperfect preservation of the invariant properties. As a result, SDC detection during simulation is only possible when an error causes a significant variation in the quantities of one of these properties. This poster considers smooth particle hydrodynamics applications that tend to conserve such physical properties more accurately than classical hydrodynamics techniques. Initially, the impact and propagation of SDC through the data are investigated. Subsequently, the error detection range of this technique is experimentally quantified in terms of recall and precision for different test cases and problem sizes.","filename":"post154s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post162","type":"poster","title":"CSM04 - Balanced Graph Partition Refinement Using the Graph 
p-Laplacian","begin_time":"19:42","end_time":"19:46","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A continuous formulation of the optimal 2-way graph partitioning based on the p-norm minimization of the graph Laplacian Rayleigh quotient is presented, which provides a sharp approximation to the balanced graph partitioning problem, which is known to be NP-hard. The minimization is initialized from a cut provided by a state-of-the-art multilevel recursive bisection algorithm, and then a continuation approach reduces the p-norm from a 2-norm towards a 1-norm, employing for each value of p a feasibility-preserving steepest-descent method that converges on the p-Laplacian eigenvector. A filter favors iterates advancing towards minimum edge-cut and partition load imbalance. The complexity of the suggested approach is linear in the number of graph edges. The simplicity of the steepest-descent algorithm renders the overall approach highly scalable and efficient on parallel distributed architectures. Parallel implementations of recursive bisection on multi-core CPUs and GPUs are presented for large-scale graphs with up to 1.9 billion tetrahedra. 
The suggested approach exhibits improvements of up to 52.8% over METIS for graphs originating from triangular Delaunay meshes, 34.7% over METIS and 21.9% over KaHIP for power network graphs, 40.8% over METIS and 20.6% over KaHIP for sparse matrix graphs, and finally 93.2% over METIS for graphs emerging from social networks.","filename":"post162s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Toby","last_name":"Simpson","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Kohei","last_name":"Fujita","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Takuma","last_name":"Yamaguchi","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Tsuyoshi","last_name":"Ichimura","affiliation":"The University of Tokyo","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post137","type":"poster","title":"CSM05 - BioMedIT: Enabling Interoperable Biomedical 
Analysis","begin_time":"19:46","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Personalized medicine will enable more efficient treatment of patients with custom tailored intervention. This will require not only changes in how biomedical research is performed, but also to the associated IT infrastructure utilized. The datasets required to gain insight into complex diseases are often spread across institutions with limits on access, transfer, and software. To address these challenges the BioMedIT, a federation of national IT centers, is developing an interoperable infrastructure for the biomedical research being performed by the Swiss Personalized Health Network (SPHN). This infrastructure will enable researchers to develop new analysis workflows on their local computing environment and then seamlessly execute them on larger, possibly distant, computing resources while ensuring patient privacy and security. The initial phase of this project has looked at approaches for providing software interoperability between sites. 
This work provides an overview of the technologies assessed to enable proof-of-concept multi-site workflow execution including workflow engines, containerization, and HPC strategies.","filename":"post137s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Kevin","last_name":"Sayers","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Thierry","last_name":"Sengstag","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ioannis","last_name":"Xenarios","affiliation":"Swiss Institute of Bioinformatics","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Bernd","last_name":"Rinn","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Marcel","last_name":"Riedi","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Schwede","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jaroslaw","last_name":"Surkont","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post160","type":"poster","title":"CSM06 - A Distributed Parallel Approach for Large\u00a0Scale Optimal Power Flow with Security Constraints","begin_time":"19:50","end_time":"19:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The electrical power grid is a critical infrastructure, and in addition to economic dispatch, the grid should operate 
with strict security measures and\u00a0be resilient to failures of its components. Increased penetration of the renewable energy sources is placing greater stress on the grid, shifting operation of the power grid equipment towards their operational limits. Thus, any unexpected contingency could be critical to the overall operation. Security constrained optimal power flow (SCOPF) imposes additional security constraints, such that in the event of any contingency, the power\u00a0grid\u00a0will remain secure and within operational\u00a0limits. For a realistic power network with numerous contingencies considered, the overall problem size becomes intractable for single-core optimization tools in short time frames for industrial operations, such as real-time electricity market responses to electricity prices.\u00a0We propose an efficient distributed interior-point framework exploiting the block-structured KKT linear system arising from the optimality conditions of the augmented Lagrangian of the SCOPF problem. In order to utilize a node-level parallelism, an incomplete augmented multicore sparse factorization is used, which further exploits the sparse structure of the problem. 
Numerical experiments on the Pan-European power grid with a large number of contingency scenarios demonstrate that the problem can be solved efficiently.","filename":"post160s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Drosos","last_name":"Kourounis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post185","type":"poster","title":"CSM07 - Evaluating OpenACC on a Large Scale Particle Simulation","begin_time":"19:54","end_time":"19:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The simulation of particle systems has become essential for visualizing the behaviour of relevant physical systems, ranging from simulations of molecular dynamics to simulations of colliding galaxies. Performing realistic simulations requires considering a large number of particles, leading to immense computational costs. Simulating such systems thus requires increasingly long time frames, and performing increasingly complex simulations may become intractable for single-core simulation tools. Thus, it is essential to develop simulation tools which scale with the number of bodies used in a simulation. A possible approach for scalable simulation tools is to distribute the workload among the parallel threads available in current accelerators. 
This poster aims to explore the efficiency and scalability of parallelization based on the OpenACC programming standard, which is a directive based standard for parallel computing that offloads the computational kernels to a GPU accelerator. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post185s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Alessandra Martha","last_name":"De Felice","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hrishikesh","last_name":"Gupta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Juraj","last_name":"Kardos","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Samuel Adolfo","last_name":"Cruz Alegr\u00eda","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post186","type":"poster","title":"CSM08 - Evaluating TensorFlow Optimization Techniques for Solving Elliptic Boundary Control Problems","begin_time":"19:58","end_time":"20:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"TensorFlow is a software library which uses data flow graphs for numerical computations. The graph contains nodes representing mathematical operations and edges represent data tensors. 
In this work, we investigate the potential of using TensorFlow for solving large scale optimal control problems constrained by elliptic partial differential equations. We use finite difference discretization techniques to formulate the optimal control problem as a general nonlinear programming problem, which may contain up to tens of thousands of control and state variables. We compare the performance and accuracy of TensorFlow against the state-of-the-art interior point optimization package IPOPT, which is frequently used for solving such problems. This work is done as a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post186s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Olaf","last_name":"Schenk","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Manav","last_name":"Choudhary","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post184","type":"poster","title":"CSM09 - High Performance Topology Optimization","begin_time":"20:02","end_time":"20:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Topology Optimization (TO) is a state-of-the-art method for solving constrained optimization problems that arise in structural engineering. TO formulates the material design problem as an optimization procedure, which incurs significant computational costs that grow rapidly with the mesh resolution. 
Each iteration includes a Finite Element (FE) analysis and an optimization procedure, and most problems are regarded as highly computationally expensive. In this poster we consider a minimum compliance TO procedure for a maximum stiffness problem in 2 dimensions on an arbitrary domain, with Dirichlet boundary conditions (i.e. static load). Our implementation of this canonical TO problem improves both the speed and accuracy on high resolution meshes. The improvements are primarily achieved through the parallelization of the FE procedure, which is implemented through FEniCS and DOLFIN. The poster is based on a master student project within the course \u0022Software Atelier: Simulation, Data Science \u0026 Supercomputing\u0022 at Universit\u00e0 della Svizzera italiana.","filename":"post184s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sameer","last_name":"Rawat","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true},{"type":"Author","first_name":"Sumeet","last_name":"Gyanchandani","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimosthenis","last_name":"Pasadakis","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ezekiel","last_name":"Barnett","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"post180","type":"poster","title":"CSM10 - HPC-as-a-Service for Driving Artificial Intelligence for Drug 
Discovery","begin_time":"20:06","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"HPC-as-a-Service further lowers the entry barrier for users who are interested in utilizing massively parallel computers for modelling. Real-world pharma industry applications often encompass end-to-end data processing pipelines composed of a large number of interconnected tasks of various granularity. Most of the common tasks in the prediction of activity and toxicity of chemical compounds consist of several typical steps, such as compiling, cleaning and combining datasets, feature calculation, feature selection, model training and validation, and applying models to predict properties of new compounds. Building and executing such pipelines on HPC systems can be challenging for domain specialists who lack sufficient experience in distributed computing. Therefore, we introduce a drug discovery web platform that enables large-scale machine learning applications to be executed on supercomputing facilities via HPC-as-a-Service middleware. The middleware provides functionality for remote execution; ensures authentication and authorization for the provided functions and the necessary security for data management; monitors and reports on executed HPC jobs and their progress; and provides current information about the state of the cluster. 
The ability of HPC job execution through a web platform provides users intuitive and straightforward access to HPC resources without necessary HPC knowledge.","filename":"post180s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Vojtech","last_name":"Cima","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nina","last_name":"Jeliazkova","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Vedrin","last_name":"Jeliazkov","affiliation":"Ideaconsult Ltd.","country":"Bulgaria","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Chupakhin","affiliation":"Janssen Pharmaceutica NV","country":"Belgium","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vaclav","last_name":"Svaton","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"5","is_presenter":true}]},{"id":"post150","type":"poster","title":"CSM11 - Importance of Rank Reordering for Advanced Polar Decomposition Algorithms","begin_time":"20:10","end_time":"20:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A major goal of reordering the processing elements of a distributed-memory application is to maximize the on-node point-to-point communication and therefore reduce the corresponding off-node traffic in order to improve the total communication 
time and load balance, especially in network-bound codes. We demonstrate the importance of MPI rank reordering in the context of advanced dense linear algebra (DLA) applications, which are naturally assumed to be computation-bound. However, applications composed of successive calls to high-level DLA matrix operations of irregular workloads may also suffer from process misplacement, especially in strong-scaling mode. In particular, we focus on two advanced polar decomposition (PD) algorithms, i.e. the QR-based Dynamically Weighted Halley method (QDWH) and the Zolotarev rational functions (ZOLOPD). PD is the first computational step toward solving symmetric eigenvalue problems and the singular value decomposition. We consider an extensive combination of grid topologies and rank reorderings for different matrix sizes and numbers of nodes. Performance profiling reveals an improvement of up to 54% thanks to careful process placement. Simulations have been performed on Cray XC systems using rank reordering features of the cray-mpich library. 
Results presented here are part of a paper submitted to the Cray User Group 2018.","filename":"post150s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"David","last_name":"Keyes","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Hatem","last_name":"Ltaief","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dalal","last_name":"Sukkari","affiliation":"King Abdullah University of Science and Technology","country":"Saudi Arabia","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Aniello","last_name":"Esposito","affiliation":"Cray Inc.","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post175","type":"poster","title":"CSM13 - Neuronal Network Simulation Code for the Exascale Era","begin_time":"20:18","end_time":"20:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Numerical simulation of neuronal networks has become an important part of modern neuroscience, next to experimental and theoretical approaches. Simulation software for spiking neuronal networks, such as the open-source simulator NEST (www.nest-simulator.org), is based on the hypothesis that the main processes of brain function can be captured at the level of individual neurons, their connections, and their interactions through electric pulses, called spikes. As neurons have on average a few thousand incoming connections, connectivity is very sparse in large-scale network models of a billion neurons, which is approximately one percent of the human brain. 
Today simulating such networks is possible on petascale computers as, for example, the K computer. To manage memory usage and runtime, neuronal simulators ultimately targeting brain-scale simulations on the next generation of supercomputers need to fully exploit the even sparser connectivity of these networks. To this end, we have developed a two-tier connection infrastructure and a framework for directed communication among compute nodes. We show that the new technology implemented in NEST achieves perfect weak scaling with respect to memory usage and good weak scaling with respect to runtime, which is a breakthrough on the way to brain-scale simulations in the exascale era.","filename":"post175s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jakob","last_name":"Jordan","affiliation":"University of Bern","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tammo","last_name":"Ippen","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Moritz","last_name":"Helias","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Itaru","last_name":"Kitayama","affiliation":"RIKEN","country":"Japan","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Mitsuhisa","last_name":"Sato","affiliation":"RIKEN","country":"Japan","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jun","last_name":"Igarashi","affiliation":"RIKEN","country":"Japan","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Markus","last_name":"Diesmann","affiliation":"Forschungszentrum J\u00fclich","country":"Germany","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life 
Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Susanne","last_name":"Kunkel","affiliation":"Norwegian University of Life Sciences","country":"Norway","bio":"","order":"8","is_presenter":true}]},{"id":"post174","type":"poster","title":"CSM14 - A New Community-Driven Resource for Scientific Software Improvement Exchange","begin_time":"20:22","end_time":"20:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Better Scientific Software is an organization dedicated to improving developer productivity and software sustainability for computational science and engineering (CSE). This poster introduces the BSSw website (https:\/\/bssw.io), a new community-based resource for scientific software improvement exchange. We\u0027re creating a central hub for sharing information on practices, techniques, experiences, and tools to improve developer productivity and software sustainability for CSE. The site aims to raise awareness of the importance of good software practices to scientific productivity and to the quality and reliability of computationally-based scientific results. Additional goals are to raise awareness of the increasing challenges facing CSE software developers as high-end computing heads to extreme scales, and to facilitate CSE collaboration via software in order to advance scientific discoveries. Site users can find information on scientific software topics and can propose to curate or create new content based on their own experiences. Communities can also create content tailored to the unique needs and perspectives of a focused scientific domain. The backend enables collaborative content development using standard GitHub tools and processes. We need community contributions to build the BSSw site into a vibrant resource, with content and editorial processes provided by volunteers throughout the international CSE community. 
Join us!","filename":"post174s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lois C.","last_name":"McInnes","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"David E.","last_name":"Bernholdt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael A.","last_name":"Heroux","affiliation":"Sandia National Laboratories","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anshu","last_name":"Dubey","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":true}]},{"id":"post146","type":"poster","title":"CSM15 - ORCA and Cut-and-Solve: A Potential High-Performance Solution to Learning Genetic Causes of Complex Diseases","begin_time":"20:26","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the advent of genetic sequencing, there was much hope of finding the inherited elements underlying complex diseases, such as Alzheimer\u0027s disease, but it has been a challenge to find useful information hidden in the data. A likely contributor to this failure is the fact that the pathogenesis of most complex diseases involves patterns of genetic markers rather than single markers working alone. To combat this, we propose an integer programming model called ORCA which finds the pattern with the absolute maximum percentage difference between cases and controls. 
However, this optimization problem requires massive computations and conventional methods, such as branch-and-cut, are not suitable for large-scale parallelization. We present a novel implementation that utilizes an alternative search strategy, cut-and-solve. Cut-and-solve employs a linear search path where chunks of the solution space are \u0027cut\u0027 away and treated as separate problems. Leveraging this structure, we are in the process of massively parallelizing cut-and-solve to find candidate genetic patterns highly associated with Alzheimer\u0027s disease.","filename":"post146s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Sharlee","last_name":"Climer","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sanjiv K.","last_name":"Bhatia","affiliation":"University of Missouri - St. Louis","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Carlos","last_name":"Cruchaga","affiliation":"Washington University School of Medicine","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michael","last_name":"Chan","affiliation":"University of Missouri - St. 
Louis","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post131","type":"poster","title":"CSM16 - Parallelization of the Boundary Element Method","begin_time":"20:30","end_time":"20:34","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The main advantage of the boundary element method (BEM) is a reduction of the problem to the boundary of the computational domain. This makes it well suited for problems stated on unbounded domains, such as sound or electromagnetic wave scattering. We present the BEM4I library of parallel BEM-based solvers for problems modeled by the Laplace, Lame, Helmholtz, and wave equation. The library has been parallelized and optimized on multiple levels. OpenMP 4.5 directives have been used for the shared memory parallelization and SIMD vectorization of the computationally most intensive kernels. Two approaches have been implemented for the distributed memory parallelization;\u00a0the first one is based on the parallelization of the adaptive cross approximation method (ACA) while the second uses the boundary element tearing and interconnecting (BETI) domain decomposition method. 
In the poster, we present the structure of the library and approaches for the vectorization and parallelization as well as the results of the scalability experiments performed on Xeon and Xeon Phi based clusters.","bio":"","contributors":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jan","last_name":"Zapletal","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michal","last_name":"Kravcenko","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Michal","last_name":"Merta","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]},{"id":"post126","type":"poster","title":"CSM17 - Performance and Implementation of a Geometric Multigrid Solver with Trilinos","begin_time":"20:34","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The accurate and efficient simulation of neighbouring bunch effects in high intensity cyclotrons requires one to solve large-scale \u003Cem\u003EN\u003C\/em\u003E-body problems of \u003Cem\u003EO\u003C\/em\u003E(10^9...10^10) particles coupled with Maxwell\u0027s equations. In order to capture those effects with standard particle-in-cell models an extremely fine mesh with \u003Cem\u003EO\u003C\/em\u003E(10^8...10^9) grid points is necessary to meet the condition of high resolution. This requirement represents a waste of memory in regions of void, therefore, the usage of block-structured adaptive mesh refinement algorithms is more suitable. 
The \u003Cem\u003EN\u003C\/em\u003E-body problem is then solved on a hierarchy of levels and grids using geometric multigrid algorithms. We show benchmarks of a new implementation of a geometric multigrid algorithm using Trilinos that ran on Piz Daint with \u003Cem\u003EO\u003C\/em\u003E(10^4...10^5) cores.","filename":"post126s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Adelmann","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Matthias","last_name":"Frey","affiliation":"Paul Scherrer Institute","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post153","type":"poster","title":"CSM18 - Performance Evaluation of Dynamic Loop Scheduling Techniques Using MPI Passive RDMA on Distributed Memory Systems","begin_time":"20:38","end_time":"20:42","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Large parallel loops are present in many scientific applications. Static and dynamic loop scheduling (DLS) techniques aim to achieve load balanced executions of applications. The use of DLS techniques in scientific applications, such as the self-scheduling-based techniques, showed significant performance advantages compared to static techniques. On distributed-memory systems, DLS techniques have been implemented using the message-passing interface (MPI). Existing implementations of MPI-based DLS libraries do not consider the novel features of the latest MPI standards, such as one-sided communication, shared-memory window creation, and atomic read-modify-write operations. This poster considers these features and proposes an MPI-based DLS library written in the C language. 
Unlike existing libraries, the proposed DLS library does not employ a master-worker execution model. Moreover, it contains implementations of five well-known DLS techniques, namely self-scheduling, fixed-size chunking, guided self-scheduling, trapezoid self-scheduling, and factoring. An application from the computer vision domain is used to assess and compare the performance of the proposed library against the performance of existing solutions. The evaluation results show improved performance and highlight the need to revise and upgrade existing solutions in light of the significant advancements in the MPI standards.","filename":"post153s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ahmed","last_name":"Eleliemy","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post172","type":"poster","title":"CSM20 - Practical Communication-Optimal Algorithm for Dense Matrix-Matrix Multiplication","begin_time":"20:46","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Available memory can be traded for reducing expensive communication. The optimal strategy depends on the precise workload and the available memory. CARMA (Demmel et al., 2013) is the first matrix-matrix multiplication algorithm that is communication-optimal for all memory ranges and all matrix shapes. The algorithm recursively splits the largest matrix dimension, creating smaller subproblems that are then solved sequentially or in parallel, depending on the available memory. 
While appealing and simple at first sight, the implementation details are tricky and the distributed version requires a data layout very different from any layout used in existing linear-algebra libraries. Here, we present results from an implementation of CARMA that provides functionality not present in earlier published prototypes, namely the ability to deal with matrix dimensions and processor numbers that are not powers of two, and do not necessarily share common divisors. Furthermore, we derive a relatively simple data layout, which preserves communication-optimality, but requires fewer intermediate copies during execution, has improved memory access patterns and is potentially more compatible with existing linear algebra libraries. Additional validation and verification, benchmarking and a compatibility layer to the established SCALAPACK library lead to a matrix-matrix multiplication software package that can be used in other applications.","filename":"post172s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Thibault","last_name":"Notargiacomo","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Joost","last_name":"VandeVondele","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Marko","last_name":"Kabic","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post140","type":"poster","title":"CSM21 - Practical Experience with Task-Based Programming Techniques for Quantum Chemistry Software","begin_time":"20:50","end_time":"20:54","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"With the 
increase in scale, complexity, and heterogeneity of modern high-performance computing (HPC) platforms, one of the grim challenges for traditional programming models is sustaining the expected performance at scale. The main objective of this work is to move away from traditional programming models that force scientific applications to be developed for specific architectures or platforms. Instead, we use dataflow programming models to represent the algorithms in a way that enables us to observe and capture data dependencies, which is the most essential property of an algorithm. We discuss dataflow programming models for computational chemistry applications, because they comprise one of the driving forces of HPC, and compare different dataflow executions in terms of programmability, resource utilization, and scalability. In particular, we evaluate two programming paradigms: (1) explicit dataflow, where the dataflow is specified explicitly by the developer; and (2) implicit dataflow, where a task scheduling runtime derives the dataflow using per-task, data-access information embedded in a serial program. 
We use the state-of-the-art NWChem chemistry application as our science driver, and we present our findings using three different task-based runtimes PaRSEC, StarPU, and OpenMP, which enable the different forms of dataflow execution.","filename":"post140s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post173","type":"poster","title":"CSM22 - Redesigning Numerical Modelling Algorithms for Efficient, Large-Scale Cloud Deployment","begin_time":"20:54","end_time":"20:58","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The ready availability of cloud computing resources presents an opportunity for rapid turnaround and increased flexibility for large-scale numerical modelling, opening up new possibilities for interactive applications. However, achieving linear scaling and efficient data handling for complex, coupled numerical modelling problems on standard high-latency cloud virtual machines is still challenging. 
We explore the improvements in scalability and data transfer hiding that are achievable for elastic wave equation modelling by moving away from a sequential programming approach as conventionally used with the Message Passing Interface (MPI), in which it is difficult to avoid synchronization across a parallel system. Instead, we use the concepts of actor-based and reactive programming to remove all unnecessary synchronization within and between virtual machines. We do this by introducing flexibility into the order of computation and data exchange, and by making extensive use of task and data prioritization. This is effective in eliminating wait time and spreads communication out evenly, reducing network contention. We use a theoretical model to examine the scalability characteristics of the new system in comparisons with an optimized traditional MPI implementation. The new system scales linearly to within measurable errors in tests on commodity cloud clusters of up to 2000 cores.","filename":"post173s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Anindya","last_name":"Sharma","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"James W. D.","last_name":"Hobro","affiliation":"Schlumberger","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"post141","type":"poster","title":"CSM23 - Software-Defined Events through PAPI for In-Depth Analysis of Application Performance","begin_time":"20:58","end_time":"21:02","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"One of the most recent developments of the Performance API (PAPI) is the addition of Software-Defined Events (SDE). 
PAPI has successfully served the role of the abstraction and unification layer for hardware performance counters for over a decade. This poster presents our effort to extend this role to encompass performance critical information that does not originate in hardware, but rather in critical software layers, such as libraries and runtime systems. Our overall objective is to enable monitoring of both types of performance events, hardware- and software-related events, in a uniform way, through one consistent PAPI interface. Performance analysts will be able to form a complete picture of the entire application performance without learning new instrumentation primitives. The goal of the poster is threefold. First, we outline PAPI\u0027s new SDE API and describe the semantics. Second, we showcase the usefulness of SDE through its employment in software layers as diverse as the compiler\/library tool ByFL, and the state-of-the-art chemistry application NWChem. We outline the process of instrumenting these software packages and highlight the performance information that can be acquired with SDEs. 
Third, we present our vision for future, more advanced features and discuss the benefits and the caveats associated with them.","filename":"post141s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Heike","last_name":"Jagode","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Dongarra","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Anthony","last_name":"Danalis","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post152","type":"poster","title":"CSM24 - A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations","begin_time":"21:02","end_time":"21:06","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics, consist of large loops. These loops contain computationally intensive operations, resulting in heavy loop bodies. Loop scheduling techniques are used to parallelize such applications. Dynamic loop scheduling (DLS) techniques are used to mitigate variations in loop iteration execution times caused by problem, algorithmic, or systemic characteristics and, therefore, achieve a load-balanced execution of scientific applications on high performance computing systems. Such variations are referred to as perturbations and include decreased delivered computational speed, reduced available network bandwidth, or larger network latencies. 
The perturbations can also be caused by other applications or processes that share the same resources, or a temporary system fault or malfunction. In this poster, the performance of a computer vision application scheduled using DLS is studied under nine different perturbation scenarios. The application execution is simulated and its performance is analyzed. The evaluation of the simulation results suggests that no single scheduling technique achieves the best overall performance in all the considered scenarios. This work reveals the need for a mechanism to select the best performing scheduling technique based on the system state during execution to achieve improved application performance.","filename":"post152s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ali","last_name":"Mohammed","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post144","type":"poster","title":"CSM25 - Towards an Exascale-Ready Mini-App for Smooth Particle Hydrodynamics","begin_time":"21:06","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The smooth particle hydrodynamics (SPH) technique is a purely Lagrangian method, used in numerical simulations of fluids in astrophysics and computational fluid dynamics, among many other fields. SPH simulations represent computationally demanding calculations. Therefore, trade-offs are made between temporal and spatial scales, resolution, dimensionality (2-D or 3-D), and approximate versions of the physics involved. 
The parallelization of SPH codes is not trivial due to the absence of a structured particle grid. This poster presents insights into the current performance and functionalities of three SPH implementations of the SPH-EXA PASC project[1]: SPHYNX[2], ChaNGa[3], and SPH-flow[4]. The insights are obtained through implementation (configuration and extension of the original code base), execution, evaluation, and analysis on two modern HPC systems for a common test case: a 3D rotating square patch[5] with 1 million particles. The performance of these codes is negatively impacted by factors such as multiple time-stepping, gravity, or boundary conditions. Therefore, the goal is to extrapolate their common basic SPH features, with the aim of consolidating them into a pure-SPH, Exascale-ready, MPI+X, optimized mini-app. The SPH mini-app will integrate further specific physics models. [1]https:\/\/www.pasc-ch.org\/projects\/2017-2020\/sph-exa\/. [2]http:\/\/astro.physik.unibas.ch\/sphynx. [3]http:\/\/faculty.washington.edu\/trq\/hpcc\/tools\/changa.html. [4]http:\/\/www.sph-flow.com. 
[5]http:\/\/padis.uniroma1.it\/handle\/10805\/688 (2D version).","filename":"post144s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Florina","last_name":"Ciorba","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Lucio","last_name":"Mayer","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Rub\u00e9n","last_name":"Cabezon","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Imbert","affiliation":"NEXTFLOW Software","country":"France","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Aur\u00e9lien","last_name":"Cavelan","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Darren S.","last_name":"Reed","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Jean-Guillaume","last_name":"Piccinali","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Ioana","last_name":"Banicescu","affiliation":"Mississippi State University","country":"United States of America","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Domingo","last_name":"Garci\u00e1-Senz","affiliation":"Universitat Polit\u00e8cnica de Catalunya","country":"Spain","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Thomas R.","last_name":"Quinn","affiliation":"University of Washington","country":"United States of 
America","bio":"","order":"11","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danilo","last_name":"Guerrera","affiliation":"University of Basel","country":"Switzerland","bio":"","order":"5","is_presenter":true}]},{"id":"post138","type":"poster","title":"CSM26 - Towards Whole Program Generation for Ocean Modeling","begin_time":"21:10","end_time":"21:14","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"(Numerical) ocean modeling provides a crucial tool for researching effects such as tsunamis and flooding. However, creating efficient implementations can be challenging, especially when covering a wide range of methods and target hardware. One possible remedy is employing domain-specific languages (DSLs) in conjunction with code generation techniques. ExaStencils and its multi-layered external DSL ExaSlang (ExaStencils language) provides such a framework. In this poster presentation, we present our advances towards developing and adapting code generation techniques for ocean modeling applications. For this, we implement a prototype solver for the shallow water equations (SWE) in ExaSlang. Its base is a finite volume discretization and the Lax-Friedrichs method. We showcase DSL code examples as well as performance results obtained on Piz Daint. Additionally, a roadmap for future extensions is sketched: We aim at adding support for real-world geometries such as coastlines and islands. Here, a patch-based approach allows us to combine the flexibility of an unstructured coarse-grid mesh and the performance benefits of topological structure within patches. Moreover, code generation allows specializing generated applications to varying aspects of the chosen discretization as well as the target hardware. 
This becomes especially important when switching to more sophisticated discretization techniques such as Discontinuous Galerkin (DG).","filename":"post138s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Harald","last_name":"K\u00f6stler","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sebastian","last_name":"Kuckuk","affiliation":"Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post183","type":"poster","title":"CSM27 - Using Data Analysis Techniques to Detect Ransomware","begin_time":"21:14","end_time":"21:18","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A ransomware infection typically disables entire infrastructure by encrypting sensitive files on a system\/network and demands for huge amounts of ransom to unlock these files. Several attempts at protecting vital data from such fatal attacks have been made, but many of the newly developed ransomware variants bypass the existing anti-malware detection systems. In this work, we deployed more robust and efficient techniques on large system and user files that could immediately detect malicious activities and alert the user before a significant amount of information is lost. We monitored four indicators which include file system analysis for malicious contents using Hadoop, checking data integrity by generating hash codes using C#, using machine learning algorithms to predict ransomware prone files, and monitoring the file system log to keep a check on suspicious file activities. 
Further, we studied how using data processing platforms like Hadoop and R helped improve the computational speed and how these indicators can be deployed on a computer network or HDFS clusters. Various classification tree models were studied for their computational efficiency and scalability. Our ultimate aim is to utilize these techniques in protecting large sets of real-time data that all big research labs and organizations work with.","filename":"post183s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Upasna","last_name":"Sharma","affiliation":"The Northcap University","country":"India","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Abhishek","last_name":"Barry","affiliation":"The Northcap University","country":"India","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Sushma","last_name":"Yellapragada","affiliation":"The Northcap University","country":"India","bio":"","order":"1","is_presenter":true}]},{"id":"post166","type":"poster","title":"CSM28 - Utopia: A High Performance C++ Embedded Domain Specific Language for Scientific Computing","begin_time":"21:18","end_time":"21:22","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The rise of new technologies is a driver for changes in scientific-computing software libraries. However, such changes affect the whole simulation software, inducing unwanted modifications to high-level code in the application. To avoid modifications, state-of-the-art software mainly rely on high-level programming interfaces or scripting languages. This is achieved separating the model from the computation, thus allowing one to keep the implementation details hidden from the application code. 
We achieve this separation by using C++ meta-programming and particular evaluation strategies. We present the open source project Utopia, a common application programming interface to the best established parallel linear algebra libraries as a possible candidate of \u0022write once, run everywhere\u0022 while maintaining performance portability. We focus on the Utopia back-end implementation based on Trilinos and show how to provide both basic functionalities and extensions targeting backend-specific performance in a simple way. Furthermore, we consider one application to the end-user software FASTER showing the ease of porting and its improved performance.","filename":"post166s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Andreas","last_name":"Fink","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Patrick","last_name":"Zulian","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Dimitrios","last_name":"Karvounis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Rolf","last_name":"Krause","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nur Aiman","last_name":"Fadel","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post181","type":"poster","title":"CSM29 - Validation of the Self-Adaptive Navigation System by Enhanced HPC Traffic 
Simulator","begin_time":"21:22","end_time":"21:26","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The navigation challenges for smart cities are the solutions envisioning a central and knowledgeable routing server, which collects and fuses all useful data sources and controls overall traffic in an intelligent way. The self-adaptive navigation system developed within the FET-HPC project ANTAREX implements the traffic flow optimization service coordinated with external client-side navigation applications and heterogeneous data sources. We have developed the enhanced Traffic simulator on HPC infrastructure for testing an efficiency and usability of the navigation system. Building blocks of the simulator include server-side navigation system, virtual Smart City world, benchmark settings, and navigation test bed, which contains industrial Sygic client-side navigation and simplified simulation of vehicles. The important feature of the simulator is the ability to evaluate the traffic flow control strategy in the Smart City world, with and without enabled global view calculation of traffic network, and for a given percentage of vehicles connected to the server-side service. 
The integration of the Sygic navigation to the large-scale traffic simulator enables to perform compliance test of real navigation applications to the developed central navigation system.","filename":"post181s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jan","last_name":"Martinovi\u010d","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Sevcik","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true},{"type":"Author","first_name":"Katerina","last_name":"Slaninova","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Radim","last_name":"Cmar","affiliation":"Sygic","country":"Slovakia","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Vit","last_name":"Ptosek","affiliation":"IT4Innovations National Supercomputing Center","country":"Czech Republic","bio":"","order":"3","is_presenter":true}]}]} } Presentation
EAD01 - On the Solution of Macroeconomic Models with Distributional Channels and Default Risk
Luca Mazzone (University of Zurich, Switzerland)
+ Abstract { "session": {"id":"sess146","title":"Posters in Emerging Application Domains","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Emerging Application Domains"],"slots":[{"id":"post158","type":"poster","title":"EAD01 - On the Solution of Macroeconomic Models with Distributional Channels and Default Risk","begin_time":"19:30","end_time":"21:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The importance of distributional channels in macroeconomic dynamics has been object of considerable attention from empirical studies. Despite considerable amount of effort aimed at incorporating heterogeneity into macroeconomics, however, their explicit inclusion in the standard policy toolbox is far from widespread. A relevant obstacle, in such cases, is the computation of equilibria. I propose a global solution method for solving infinite-horizon, heterogeneous agent macroeconomic models with aggregate uncertainty. Details of the algorithm are illustrated by presenting its application in an example model: in it, aggregate dynamics depends explicitly on firm entry and exit, and individual choices are often constrained by a form of market incompleteness. Existing computational strategies are either unfeasible or provide inaccurate solutions. Moreover, global solutions are computationally expensive because the minimal representation of the aggregate state space - and thus the aggregate law of motion - faces the curse of dimensionality. In my example model, the approximate law of motion is still a five-dimensional object, making it already too expensive to evaluate using Cartesian grids. 
The proposed strategy thus combines adaptive sparse grids with a cross-sectional density approximation, and introduces a framework for solving the more general class of dynamic models with firm or household heterogeneity accurately.","filename":"post158s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Luca","last_name":"Mazzone","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Luca","last_name":"Mazzone","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}]} } Presentation
ENG01 - Adaptive Particle Representation: A Novel Framework for Adaptive-Resolution Simulations
Suryanarayana Maddu (Max Planck Institute for Molecular Cell Biology and Genetics, Germany)
+ Abstract { "session": {"id":"sess147","title":"Posters in Engineering","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Engineering"],"slots":[{"id":"post135","type":"poster","title":"ENG01 - Adaptive Particle Representation: A Novel Framework for Adaptive-Resolution Simulations","begin_time":"19:30","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"As progress in HPC hardware and software enables the numerical solution of increasingly complex non-linear models, adaptive-resolution and multi-resolution methods continue to gain importance. Models of this sort can only be efficiently simulated using self-adaptive discretization schemes like AMR and wavelets. However, these schemes are poorly suited for distributed-memory HPC systems, as they rely on global data tree structures and global mappings. We\u00a0therefore\u00a0present the Adaptive Particle Representation (APR), a novel and efficient method for adaptive-resolution simulation. The APR is based on local information only and is designed for scalability. It is based on a linear-time algorithm to construct an optimal resolution function for each point in the domain. It also provides point-wise error bounds for the function value and any derivative approximation. Differential operators can then be consistently evaluated on the evolving APR using discretization-corrected operators. While the APR combines ideas from wavelets and AMR, it effectively avoids\u00a0their performance issues. We benchmark this in numerical experiments and test the consistency and convergence of differential operators approximated on APR discretizations. 
These include the numerical solution of advection-diffusion of a Gaussian pulse, the two-dimensional unsteady Burgers equation, and adaptive three-dimensional Taylor-Green vortex simulation and\u00a0scaling experiments for testing performance.","filename":"post135s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Suryanarayana","last_name":"Maddu","affiliation":"Max Planck Institute for Molecular Cell Biology and Genetics","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Bevan L.","last_name":"Cheeseman","affiliation":"Max Planck Institute for Molecular Cell Biology and Genetics","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Pietro","last_name":"Incardona","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ivo","last_name":"Sbalzarini","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Suryanarayana","last_name":"Maddu","affiliation":"Max Planck Institute for Molecular Cell Biology and Genetics","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post147","type":"poster","title":"ENG02 - Assessment of Detached Eddy Simulation in Predicting Separated Flow over Airfoils at a Moderate Reynolds Number","begin_time":"19:50","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A highly scalable stabilized finite element flow solver is employed in the current study to simulate the flow around the NACA-4412 wing at a moderate chord Reynolds number (\u003Cem\u003ERe_c\u003C\/em\u003E = 400,000) with an angle of attack of 5 degrees. 
The flow under investigation involves a wide range of scales and complicated flow physics induced by the geometry, such as wall-bounded turbulence, flow separation, and turbulent wake flows. Previous DNS investigations have already revealed these complex flow patterns at similar Reynolds numbers. However, industry applications targeting higher Reynolds number flows still depend heavily on RANS or hybrid RANS\/LES approaches in aircraft design. The accurate prediction of separated flows is a cornerstone of successful CFD estimations. A delayed detached eddy simulation (DDES) turbulence model is utilized herein. The RANS model is primarily applied in near-wall regions, while it gives way to LES in regions where the flow is separated. To assess the readiness of this DDES approach, the obtained simulation results are to be compared with the DNS study conducted at KTH with the same flow configurations. The present research will better inform researchers of the strengths and possible limitations of DDES in predicting complicated flow separation phenomena.","filename":"post147s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jun","last_name":"Fang","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Kenneth","last_name":"Jansen","affiliation":"University of Colorado Boulder","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Philipp","last_name":"Schlatter","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ricardo","last_name":"Vinuesa","affiliation":"KTH Royal Institute of 
Technology","country":"Sweden","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Michel","last_name":"Rasquin","affiliation":"Cenaero","country":"Belgium","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Ramesh","last_name":"Balakrishnan","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"6","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ramesh","last_name":"Balakrishnan","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"6","is_presenter":true}]},{"id":"post157","type":"poster","title":"ENG03 - Large Eddy Simulation of Tsunami Triggered Coastal Inundation in the Presence of Mitigation Parks","begin_time":"20:10","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years we have observed a paradigm shift from 2D shallow water models to models based on the 3D incompressible Navier-Stokes equations to simulate tsunami triggered inundation. The shift is due to the limitations of shallow water models for the simulation of onshore turbulent flows. The fully three-dimensional nature of turbulence and entrainment at the scales of coastal flooding requires massive parallel computing and the ability to have properly designed sub-grid scale models for large eddy simulation. In this poster, numerical mathematics, computational fluid dynamics, and tsunami physics are linked to understand the mechanisms that govern the coastal hydrodynamics as tsunami propagates inland. We look at what geometrical features of the coast line most affect the extension and propagation of flooding, in particular so when artificial, mitigation hills are installed at the shore line. Our numerical simulations rely on a massively parallel (hybrid MPI\/OpenMP) finite element solver of the incompressible 3D Navier-Stokes equations. 
Turbulence is modeled via dynamic Large Eddy Simulation.","filename":"post157s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Simone","last_name":"Marras","affiliation":"New Jersey Institute of Technology","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jenny","last_name":"Suckale","affiliation":"Stanford University","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Yilang","last_name":"Xu","affiliation":"Stanford University","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Beatriz","last_name":"Eguzkitza","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Guillaume","last_name":"Houzeaux","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Mariano","last_name":"V\u00e1zquez","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Simone","last_name":"Marras","affiliation":"New Jersey Institute of Technology","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post139","type":"poster","title":"ENG04 - A Low-Mach Simulation of Flow and Heat Transfer in a Motored Internal Combustion Engine Using the Spectral Element Method","begin_time":"20:30","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In this work, a proof-of-concept, wall-resolved implicit large eddy simulation (LES) campaign was performed to capture in-cylinder flow and heat transfer within an internal combustion engine (ICE). 
Over 20 cycles of the gas-exchange process were simulated, revealing cycle-to-cycle variations in the intake jet impingement angle on the wall and piston surfaces. This calculation confirms a rising level of turbulent intensity during the compression stroke, which is crucial to efficient engine operation. This work marks a milestone achievement in using Nek5000, a highly-scalable computational fluid dynamics (CFD) solver, to capture turbulent flow and thermal fields inside realistic engine geometries. In the context of an arbitrary Lagrangian-Eulerian (ALE) framework, several algorithms have been developed and integrated into Nek5000 in order to overcome the computational challenges associated with moving boundaries (i.e. valves and pistons). In particular, this simulation makes use of a characteristic-based time-stepping scheme to overcome Courant (CFL) conditions associated with standard semi-implicit schemes. In addition, a grid-to-grid algorithm is employed to interpolate field variables from one mesh to another in order to maintain spatial resolution requirements. 
The potential impact of this work is an unmatched, accurate, and highly-scalable tool for researchers to facilitate the efficient design of ICEs with short turn-around time.","filename":"post139s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Saumil","last_name":"Patel","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Georgios","last_name":"Giannakopoulos","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Ananias","last_name":"Tomboulides","affiliation":"Aristotle University of Thessaloniki","country":"Greece","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Paul","last_name":"Fischer","affiliation":"University of Illinois Urbana-Champaign","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Christos","last_name":"Frouzakis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Misun","last_name":"Min","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Konstantinos","last_name":"Boulouchos","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Saumil","last_name":"Patel","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post134","type":"poster","title":"ENG05 - OpenFPM for Scalable Particle-Mesh Simulations on Distributed-Memory Computers","begin_time":"20:50","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scalable and efficient numerical simulations are of increasing importance 
in all areas of science and technology. This is fueled by a steady growth in the performance of computing hardware and increasing heterogeneous parallelism. However, efficiently implementing scalable simulation codes on heterogeneous, distributed hardware systems is the current bottleneck. This bottleneck can be relaxed by intermediate software layers that provide abstractions closer to the problem domain, allowing the computational scientist to focus on the simulation algorithm. Here, we present OpenFPM, an open and scalable framework that provides an abstraction layer for numerical simulations using particles and\/or meshes. OpenFPM provides transparent and scalable infrastructure for shared-memory and distributed-memory implementations of hybrid particle-mesh simulations of both discrete and continuous models. This infrastructure is complemented with portable implementations of frequently used numerical routines, as well as interfaces to third-party libraries. We present the architecture and design of OpenFPM, detail the underlying abstractions, and benchmark the framework in applications ranging from Smoothed Particle Hydrodynamics (SPH) to Molecular Dynamics (MD), Discrete Element Methods (DEM), high-dimensional Monte Carlo sampling (CMA-ES), and Reaction-Diffusion solvers, comparing it to the current state of the art and existing simulation software frameworks.","filename":"post134s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Pietro","last_name":"Incardona","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ivo","last_name":"Sbalzarini","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Pietro","last_name":"Incardona","affiliation":"Centre of System Biology 
Dresden","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post167","type":"poster","title":"ENG06 - Wavelet Based Data Compression Strategies for Exascale CFD Simulations","begin_time":"21:10","end_time":"21:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The steady increase of available computer resources has enabled engineers and scientists to use progressively more complex models to simulate a myriad of fluid flow problems. Yet, whereas modern high performance computers have seen a steady growth in computing power, the same trend has not been mirrored by a significant gain in data transfer rates. Current systems are capable of producing and processing high amounts of data quickly, while the overall performance is often limited by their ability to transfer and store the computed data. Considering that researchers invariably seek to study simulations with increasingly higher spatial and temporal resolution, the imminent move to exascale computing will consequently only exacerbate this problem. One of the major pitfalls of storing \u0027raw\u0027 simulation data lies in the implicit and redundant manner in which it represents the flow physics. Thus, using image compression algorithms to transform a large \u0027raw\u0027 into a compact data format could help to overcome the I\/O bottleneck. 
We therefore propose to adapt the wavelet-based JPEG-2000 compression standard for volumetric floating-point arrays.","filename":"post167s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Patrick","last_name":"Vogler","affiliation":"University of Stuttgart","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"Rist","affiliation":"University of Stuttgart","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Patrick","last_name":"Vogler","affiliation":"University of Stuttgart","country":"Germany","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"post135","type":"poster","title":"ENG01 - Adaptive Particle Representation: A Novel Framework for Adaptive-Resolution Simulations","begin_time":"19:30","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"As progress in HPC hardware and software enables the numerical solution of increasingly complex non-linear models, adaptive-resolution and multi-resolution methods continue to gain importance. Models of this sort can only be efficiently simulated using self-adaptive discretization schemes like AMR and wavelets. However, these schemes are poorly suited for distributed-memory HPC systems, as they rely on global data tree structures and global mappings. We\u00a0therefore\u00a0present the Adaptive Particle Representation (APR), a novel and efficient method for adaptive-resolution simulation. The APR is based on local information only and is designed for scalability. It is based on a linear-time algorithm to construct an optimal resolution function for each point in the domain. It also provides point-wise error bounds for the function value and any derivative approximation. Differential operators can then be consistently evaluated on the evolving APR using discretization-corrected operators. 
While the APR combines ideas from wavelets and AMR, it effectively avoids\u00a0their performance issues. We benchmark this in numerical experiments and test the consistency and convergence of differential operators approximated on APR discretizations. These include the numerical solution of advection-diffusion of a Gaussian pulse, the two-dimensional unsteady Burgers equation, and adaptive three-dimensional Taylor-Green vortex simulation and\u00a0scaling experiments for testing performance.","filename":"post135s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Suryanarayana","last_name":"Maddu","affiliation":"Max Planck Institute for Molecular Cell Biology and Genetics","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Bevan L.","last_name":"Cheeseman","affiliation":"Max Planck Institute for Molecular Cell Biology and Genetics","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Pietro","last_name":"Incardona","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ivo","last_name":"Sbalzarini","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Suryanarayana","last_name":"Maddu","affiliation":"Max Planck Institute for Molecular Cell Biology and Genetics","country":"Germany","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Suryanarayana","last_name":"Maddu","affiliation":"Max Planck Institute for Molecular Cell Biology and Genetics","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Bevan L.","last_name":"Cheeseman","affiliation":"Max Planck Institute for Molecular Cell Biology and 
Genetics","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Pietro","last_name":"Incardona","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ivo","last_name":"Sbalzarini","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"4","is_presenter":false}] } Presentation
ENG02 - Assessment of Detached Eddy Simulation in Predicting Separated Flow over Airfoils at a Moderate Reynolds Number
Ramesh Balakrishnan (Argonne National Laboratory, United States of America)
Co-authors: Jun Fang (Argonne National Laboratory, United States of America), Kenneth Jansen (University of Colorado Boulder, United States of America), Philipp Schlatter (KTH Royal Institute of Technology, Sweden), Ricardo Vinuesa (KTH Royal Institute of Technology, Sweden), Michel Rasquin (Cenaero, Belgium)
A highly scalable stabilized finite element flow solver is employed in the current study to simulate the flow around the NACA-4412 wing at a moderate chord Reynolds number (Re_c = 400,000) with an angle of attack of 5 degrees. The flow under investigation involves a wide range of scales and complicated flow physics induced by the geometry, such as wall-bounded turbulence, flow separation, and turbulent wake flows. Previous DNS investigations have already revealed these complex flow patterns at similar Reynolds numbers. However, industry applications targeting higher Reynolds number flows still depend heavily on RANS or hybrid RANS/LES approaches in aircraft design. The accurate prediction of separated flows is a cornerstone of successful CFD estimations. A delayed detached eddy simulation (DDES) turbulence model is utilized herein. The RANS model is primarily applied in near-wall regions, while it gives way to LES in regions where the flow is separated. To assess the readiness of this DDES approach, the simulation results are to be compared with the DNS study conducted at KTH with the same flow configuration. The present research will better inform researchers of the strengths and possible limitations of DDES in predicting complicated flow separation phenomena.
ENG03 - Large Eddy Simulation of Tsunami Triggered Coastal Inundation in the Presence of Mitigation Parks
Simone Marras (New Jersey Institute of Technology, United States of America)
+ Abstract
20:10 - 20:30, Posters in Engineering, Foyer 2nd Floor
In recent years we have observed a paradigm shift from 2D shallow water models to models based on the 3D incompressible Navier-Stokes equations for simulating tsunami-triggered inundation. The shift is due to the limitations of shallow water models in simulating onshore turbulent flows. The fully three-dimensional nature of turbulence and entrainment at the scales of coastal flooding requires massively parallel computing and properly designed sub-grid scale models for large eddy simulation. In this poster, numerical mathematics, computational fluid dynamics, and tsunami physics are linked to understand the mechanisms that govern coastal hydrodynamics as a tsunami propagates inland. We examine which geometrical features of the coastline most affect the extent and propagation of flooding, in particular when artificial mitigation hills are installed at the shoreline. Our numerical simulations rely on a massively parallel (hybrid MPI/OpenMP) finite element solver of the incompressible 3D Navier-Stokes equations. Turbulence is modeled via dynamic Large Eddy Simulation.
Authors:
Simone Marras (New Jersey Institute of Technology, United States of America), presenter
Jenny Suckale (Stanford University, United States of America)
Yilang Xu (Stanford University, United States of America)
Beatriz Eguzkitza (Barcelona Supercomputing Center, Spain)
Guillaume Houzeaux (Barcelona Supercomputing Center, Spain)
Mariano Vázquez (Barcelona Supercomputing Center, Spain)
Presentation
ENG04 - A Low-Mach Simulation of Flow and Heat Transfer in a Motored Internal Combustion Engine Using the Spectral Element Method
Saumil Patel (Argonne National Laboratory, United States of America)
+ Abstract { "session": {"id":"sess147","title":"Posters in Engineering","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Engineering"],"slots":[{"id":"post135","type":"poster","title":"ENG01 - Adaptive Particle Representation: A Novel Framework for Adaptive-Resolution Simulations","begin_time":"19:30","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"As progress in HPC hardware and software enables the numerical solution of increasingly complex non-linear models, adaptive-resolution and multi-resolution methods continue to gain importance. Models of this sort can only be efficiently simulated using self-adaptive discretization schemes like AMR and wavelets. However, these schemes are poorly suited for distributed-memory HPC systems, as they rely on global data tree structures and global mappings. We\u00a0therefore\u00a0present the Adaptive Particle Representation (APR), a novel and efficient method for adaptive-resolution simulation. The APR is based on local information only and is designed for scalability. It is based on a linear-time algorithm to construct an optimal resolution function for each point in the domain. It also provides point-wise error bounds for the function value and any derivative approximation. Differential operators can then be consistently evaluated on the evolving APR using discretization-corrected operators. While the APR combines ideas from wavelets and AMR, it effectively avoids\u00a0their performance issues. We benchmark this in numerical experiments and test the consistency and convergence of differential operators approximated on APR discretizations. 
These include the numerical solution of advection-diffusion of a Gaussian pulse, the two-dimensional unsteady Burgers equation, and adaptive three-dimensional Taylor-Green vortex simulation and\u00a0scaling experiments for testing performance.","filename":"post135s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Suryanarayana","last_name":"Maddu","affiliation":"Max Planck Institute for Molecular Cell Biology and Genetics","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Bevan L.","last_name":"Cheeseman","affiliation":"Max Planck Institute for Molecular Cell Biology and Genetics","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Pietro","last_name":"Incardona","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ivo","last_name":"Sbalzarini","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Suryanarayana","last_name":"Maddu","affiliation":"Max Planck Institute for Molecular Cell Biology and Genetics","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post147","type":"poster","title":"ENG02 - Assessment of Detached Eddy Simulation in Predicting Separated Flow over Airfoils at a Moderate Reynolds Number","begin_time":"19:50","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A highly scalable stabilized finite element flow solver is employed in the current study to simulate the flow around the NACA-4412 wing at a moderate chord Reynolds number (\u003Cem\u003ERe_c\u003C\/em\u003E = 400,000) with an angle of attack of 5 degrees. 
The flow under investigation involves a wide range of scales and complicated flow physics induced by the geometry, such as wall-bounded turbulence, flow separation, and turbulent wake flows. Previous DNS investigations have already revealed these complex flow patterns at similar Reynolds numbers. However, industry applications nowadays, targeting higher Reynolds number flows, still heavily depend on RANS or hybrid RANS\/LES approach in aircraft designs. The accurate prediction of separated flows is a cornerstone in successful CFD estimations. A delayed detached eddy simulation (DDES) turbulence model is utilized herein. The RANS model is primarily applied in near wall regions, while it gives way to LES in regions where flow is separated. To assess the readiness of this DDES approach, the obtained simulation results are to be compared with the DNS study conducted at KTH with the same flow configurations. The present research will better inform researchers the strengths and possible limitations of DDES in predicting complicated flow separation phenomenon.","filename":"post147s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jun","last_name":"Fang","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Kenneth","last_name":"Jansen","affiliation":"University of Colorado Boulder","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Philipp","last_name":"Schlatter","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ricardo","last_name":"Vinuesa","affiliation":"KTH Royal Institute of 
Technology","country":"Sweden","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Michel","last_name":"Rasquin","affiliation":"Cenaero","country":"Belgium","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Ramesh","last_name":"Balakrishnan","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"6","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ramesh","last_name":"Balakrishnan","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"6","is_presenter":true}]},{"id":"post157","type":"poster","title":"ENG03 - Large Eddy Simulation of Tsunami Triggered Coastal Inundation in the Presence of Mitigation Parks","begin_time":"20:10","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years we have observed a paradigm shift from 2D shallow water models to models based on the 3D incompressible Navier-Stokes equations to simulate tsunami triggered inundation. The shift is due to the limitations of shallow water models for the simulation of onshore turbulent flows. The fully three-dimensional nature of turbulence and entrainment at the scales of coastal flooding requires massive parallel computing and the ability to have properly designed sub-grid scale models for large eddy simulation. In this poster, numerical mathematics, computational fluid dynamics, and tsunami physics are linked to understand the mechanisms that govern the coastal hydrodynamics as tsunami propagates inland. We look at what geometrical features of the coast line most affect the extension and propagation of flooding, in particular so when artificial, mitigation hills are installed at the shore line. Our numerical simulations rely on a massively parallel (hybrid MPI\/OpenMP) finite element solver of the incompressible 3D Navier-Stokes equations. 
Turbulence is modeled via dynamic Large Eddy Simulation.","filename":"post157s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Simone","last_name":"Marras","affiliation":"New Jersey Institute of Technology","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jenny","last_name":"Suckale","affiliation":"Stanford University","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Yilang","last_name":"Xu","affiliation":"Stanford University","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Beatriz","last_name":"Eguzkitza","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Guillaume","last_name":"Houzeaux","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Mariano","last_name":"V\u00e1zquez","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Simone","last_name":"Marras","affiliation":"New Jersey Institute of Technology","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post139","type":"poster","title":"ENG04 - A Low-Mach Simulation of Flow and Heat Transfer in a Motored Internal Combustion Engine Using the Spectral Element Method","begin_time":"20:30","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In this work, a proof-of-concept, wall-resolved implicit large eddy simulation (LES) campaign was performed to capture in-cylinder flow and heat transfer within an internal combustion engine (ICE). 
Over 20 cycles of the gas-exchange process was simulated, revealing variations in the intake jet impingement angle on the wall and piston surfaces from cycle-to-cycle. This calculation confirms a rising level of turbulent intensity during the compression stroke which is crucial to efficient engine operation. This work marks a milestone achievement in using Nek5000, a highly-scalable computational fluid dynamics (CFD) solver, to capture turbulent flow and thermal fields inside realistic engine geometries. In the context of an arbitrary Lagrangian-Eulerian (ALE) framework, several algorithms have been developed and integrated into Nek5000 in order overcome the computational challenges associated with moving boundaries (i.e. valves and pistons). In particular, this simulation makes use of a characteristic-based, time-stepping scheme to overcome Courant (CFL) conditions associated with standard, semi-implicit schemes. In addition, a grid-to-grid algorithm is employed to interpolate field variables from one mesh to another in order maintain spatial resolution requirements. 
The potential impact of this work is an unmatched, accurate, and highly-scalable tool for researchers to facilitate the efficient design of ICEs with short turn-around time.","filename":"post139s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Saumil","last_name":"Patel","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Georgios","last_name":"Giannakopoulos","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Ananias","last_name":"Tomboulides","affiliation":"Aristotle University of Thessaloniki","country":"Greece","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Paul","last_name":"Fischer","affiliation":"University of Illinois Urbana-Champaign","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Christos","last_name":"Frouzakis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Misun","last_name":"Min","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Konstantinos","last_name":"Boulouchos","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Saumil","last_name":"Patel","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post134","type":"poster","title":"ENG05 - OpenFPM for Scalable Particle-Mesh Simulations on Distributed-Memory Computers","begin_time":"20:50","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scalable and efficient numerical simulations are of increasing importance 
in all areas of science and technology. This is fueled by a steady growth in the performance of computing hardware and increasing heterogeneous parallelism. However, efficiently implementing scalable simulation codes on heterogeneous, distributed hardware systems is the current bottleneck. This bottleneck can be relaxed by intermediate software layers that provide abstractions closer to the problem domain, allowing the computational scientist to focus on the simulation algorithm. Here, we present OpenFPM, an open and scalable framework that provides an abstraction layer for numerical simulations using particles and\/or meshes. OpenFPM provides transparent and scalable infrastructure for shared-memory and distributed-memory implementations of hybrid particle-mesh simulations of both discrete and continuous models. This infrastructure is complemented with portable implementations of frequently used numerical routines, as well as interfaces to third-party libraries. We present the architecture and design of OpenFPM, detail the underlying abstractions, and benchmark the framework in applications ranging from Smoothed Particle Hydrodynamics (SPH) to Molecular Dynamics (MD), Discrete Element Methods (DEM), high-dimensional Monte Carlo sampling (CMA-ES), and Reaction-Diffusion solvers, comparing it to the current state of the art and existing simulation software frameworks.","filename":"post134s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Pietro","last_name":"Incardona","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ivo","last_name":"Sbalzarini","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Pietro","last_name":"Incardona","affiliation":"Centre of System Biology 
Dresden","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post167","type":"poster","title":"ENG06 - Wavelet Based Data Compression Strategies for Exascale CFD Simulations","begin_time":"21:10","end_time":"21:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The steady increase of available computer resources has enabled engineers and scientists to use progressively more complex models to simulate a myriad of fluid flow problems. Yet, whereas modern high performance computers have seen a steady growth in computing power, the same trend has not been mirrored by a significant gain in data transfer rates. Current systems are capable of producing and processing high amounts of data quickly, while the overall performance is often limited by their ability to transfer and store the computed data. Considering that researchers invariably seek to study simulations with increasingly higher spatial and temporal resolution, the imminent move to exascale computing will consequently only exacerbate this problem. One of the major pitfalls of storing \u0027raw\u0027 simulation data lies in the implicit and redundant manner in which it represents the flow physics. Thus, using image compression algorithms to transform a large \u0027raw\u0027 into a compact data format could help to overcome the I\/O bottleneck. 
We therefore propose to adapt the wavelet-based JPEG-2000 compression standard for volumetric floating-point arrays.","filename":"post167s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Patrick","last_name":"Vogler","affiliation":"University of Stuttgart","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"Rist","affiliation":"University of Stuttgart","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Patrick","last_name":"Vogler","affiliation":"University of Stuttgart","country":"Germany","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"post139","type":"poster","title":"ENG04 - A Low-Mach Simulation of Flow and Heat Transfer in a Motored Internal Combustion Engine Using the Spectral Element Method","begin_time":"20:30","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In this work, a proof-of-concept, wall-resolved implicit large eddy simulation (LES) campaign was performed to capture in-cylinder flow and heat transfer within an internal combustion engine (ICE). Over 20 cycles of the gas-exchange process was simulated, revealing variations in the intake jet impingement angle on the wall and piston surfaces from cycle-to-cycle. This calculation confirms a rising level of turbulent intensity during the compression stroke which is crucial to efficient engine operation. This work marks a milestone achievement in using Nek5000, a highly-scalable computational fluid dynamics (CFD) solver, to capture turbulent flow and thermal fields inside realistic engine geometries. In the context of an arbitrary Lagrangian-Eulerian (ALE) framework, several algorithms have been developed and integrated into Nek5000 in order overcome the computational challenges associated with moving boundaries (i.e. valves and pistons). 
In particular, this simulation makes use of a characteristic-based, time-stepping scheme to overcome Courant (CFL) conditions associated with standard, semi-implicit schemes. In addition, a grid-to-grid algorithm is employed to interpolate field variables from one mesh to another in order maintain spatial resolution requirements. The potential impact of this work is an unmatched, accurate, and highly-scalable tool for researchers to facilitate the efficient design of ICEs with short turn-around time.","filename":"post139s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Saumil","last_name":"Patel","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Georgios","last_name":"Giannakopoulos","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Ananias","last_name":"Tomboulides","affiliation":"Aristotle University of Thessaloniki","country":"Greece","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Paul","last_name":"Fischer","affiliation":"University of Illinois Urbana-Champaign","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Christos","last_name":"Frouzakis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Misun","last_name":"Min","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Konstantinos","last_name":"Boulouchos","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Saumil","last_name":"Patel","affiliation":"Argonne National Laboratory","country":"United States of 
America","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Saumil","last_name":"Patel","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Georgios","last_name":"Giannakopoulos","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Ananias","last_name":"Tomboulides","affiliation":"Aristotle University of Thessaloniki","country":"Greece","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Paul","last_name":"Fischer","affiliation":"University of Illinois Urbana-Champaign","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Christos","last_name":"Frouzakis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Misun","last_name":"Min","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Konstantinos","last_name":"Boulouchos","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false}] } Presentation
ENG05 - OpenFPM for Scalable Particle-Mesh Simulations on Distributed-Memory Computers
Pietro Incardona (Centre of System Biology Dresden, Germany)
20:50 - 21:10, Posters in Engineering, Foyer 2nd Floor
Co-author: Ivo Sbalzarini (Centre of System Biology Dresden, Germany)
Scalable and efficient numerical simulations are of increasing importance in all areas of science and technology. This is fueled by a steady growth in the performance of computing hardware and increasing heterogeneous parallelism. However, efficiently implementing scalable simulation codes on heterogeneous, distributed hardware systems is the current bottleneck. This bottleneck can be relaxed by intermediate software layers that provide abstractions closer to the problem domain, allowing the computational scientist to focus on the simulation algorithm. Here, we present OpenFPM, an open and scalable framework that provides an abstraction layer for numerical simulations using particles and/or meshes. OpenFPM provides transparent and scalable infrastructure for shared-memory and distributed-memory implementations of hybrid particle-mesh simulations of both discrete and continuous models. This infrastructure is complemented with portable implementations of frequently used numerical routines, as well as interfaces to third-party libraries. We present the architecture and design of OpenFPM, detail the underlying abstractions, and benchmark the framework in applications ranging from Smoothed Particle Hydrodynamics (SPH) to Molecular Dynamics (MD), Discrete Element Methods (DEM), high-dimensional Monte Carlo sampling (CMA-ES), and Reaction-Diffusion solvers, comparing it to the current state of the art and existing simulation software frameworks.
ENG06 - Wavelet Based Data Compression Strategies for Exascale CFD Simulations
Patrick Vogler (University of Stuttgart, Germany)
+ Abstract { "session": {"id":"sess147","title":"Posters in Engineering","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Engineering"],"slots":[{"id":"post135","type":"poster","title":"ENG01 - Adaptive Particle Representation: A Novel Framework for Adaptive-Resolution Simulations","begin_time":"19:30","end_time":"19:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"As progress in HPC hardware and software enables the numerical solution of increasingly complex non-linear models, adaptive-resolution and multi-resolution methods continue to gain importance. Models of this sort can only be efficiently simulated using self-adaptive discretization schemes like AMR and wavelets. However, these schemes are poorly suited for distributed-memory HPC systems, as they rely on global data tree structures and global mappings. We\u00a0therefore\u00a0present the Adaptive Particle Representation (APR), a novel and efficient method for adaptive-resolution simulation. The APR is based on local information only and is designed for scalability. It is based on a linear-time algorithm to construct an optimal resolution function for each point in the domain. It also provides point-wise error bounds for the function value and any derivative approximation. Differential operators can then be consistently evaluated on the evolving APR using discretization-corrected operators. While the APR combines ideas from wavelets and AMR, it effectively avoids\u00a0their performance issues. We benchmark this in numerical experiments and test the consistency and convergence of differential operators approximated on APR discretizations. 
These include the numerical solution of advection-diffusion of a Gaussian pulse, the two-dimensional unsteady Burgers equation, and adaptive three-dimensional Taylor-Green vortex simulation and\u00a0scaling experiments for testing performance.","filename":"post135s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Suryanarayana","last_name":"Maddu","affiliation":"Max Planck Institute for Molecular Cell Biology and Genetics","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Bevan L.","last_name":"Cheeseman","affiliation":"Max Planck Institute for Molecular Cell Biology and Genetics","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Pietro","last_name":"Incardona","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ivo","last_name":"Sbalzarini","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Suryanarayana","last_name":"Maddu","affiliation":"Max Planck Institute for Molecular Cell Biology and Genetics","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post147","type":"poster","title":"ENG02 - Assessment of Detached Eddy Simulation in Predicting Separated Flow over Airfoils at a Moderate Reynolds Number","begin_time":"19:50","end_time":"20:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A highly scalable stabilized finite element flow solver is employed in the current study to simulate the flow around the NACA-4412 wing at a moderate chord Reynolds number (\u003Cem\u003ERe_c\u003C\/em\u003E = 400,000) with an angle of attack of 5 degrees. 
The flow under investigation involves a wide range of scales and complicated flow physics induced by the geometry, such as wall-bounded turbulence, flow separation, and turbulent wake flows. Previous DNS investigations have already revealed these complex flow patterns at similar Reynolds numbers. However, industry applications nowadays, targeting higher Reynolds number flows, still heavily depend on RANS or hybrid RANS\/LES approach in aircraft designs. The accurate prediction of separated flows is a cornerstone in successful CFD estimations. A delayed detached eddy simulation (DDES) turbulence model is utilized herein. The RANS model is primarily applied in near wall regions, while it gives way to LES in regions where flow is separated. To assess the readiness of this DDES approach, the obtained simulation results are to be compared with the DNS study conducted at KTH with the same flow configurations. The present research will better inform researchers the strengths and possible limitations of DDES in predicting complicated flow separation phenomenon.","filename":"post147s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jun","last_name":"Fang","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Kenneth","last_name":"Jansen","affiliation":"University of Colorado Boulder","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Philipp","last_name":"Schlatter","affiliation":"KTH Royal Institute of Technology","country":"Sweden","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ricardo","last_name":"Vinuesa","affiliation":"KTH Royal Institute of 
Technology","country":"Sweden","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Michel","last_name":"Rasquin","affiliation":"Cenaero","country":"Belgium","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Ramesh","last_name":"Balakrishnan","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"6","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ramesh","last_name":"Balakrishnan","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"6","is_presenter":true}]},{"id":"post157","type":"poster","title":"ENG03 - Large Eddy Simulation of Tsunami Triggered Coastal Inundation in the Presence of Mitigation Parks","begin_time":"20:10","end_time":"20:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years we have observed a paradigm shift from 2D shallow water models to models based on the 3D incompressible Navier-Stokes equations to simulate tsunami triggered inundation. The shift is due to the limitations of shallow water models for the simulation of onshore turbulent flows. The fully three-dimensional nature of turbulence and entrainment at the scales of coastal flooding requires massive parallel computing and the ability to have properly designed sub-grid scale models for large eddy simulation. In this poster, numerical mathematics, computational fluid dynamics, and tsunami physics are linked to understand the mechanisms that govern the coastal hydrodynamics as tsunami propagates inland. We look at what geometrical features of the coast line most affect the extension and propagation of flooding, in particular so when artificial, mitigation hills are installed at the shore line. Our numerical simulations rely on a massively parallel (hybrid MPI\/OpenMP) finite element solver of the incompressible 3D Navier-Stokes equations. 
Turbulence is modeled via dynamic Large Eddy Simulation.","filename":"post157s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Simone","last_name":"Marras","affiliation":"New Jersey Institute of Technology","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Jenny","last_name":"Suckale","affiliation":"Stanford University","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Yilang","last_name":"Xu","affiliation":"Stanford University","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Beatriz","last_name":"Eguzkitza","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Guillaume","last_name":"Houzeaux","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Mariano","last_name":"V\u00e1zquez","affiliation":"Barcelona Supercomputing Center","country":"Spain","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Simone","last_name":"Marras","affiliation":"New Jersey Institute of Technology","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post139","type":"poster","title":"ENG04 - A Low-Mach Simulation of Flow and Heat Transfer in a Motored Internal Combustion Engine Using the Spectral Element Method","begin_time":"20:30","end_time":"20:50","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In this work, a proof-of-concept, wall-resolved implicit large eddy simulation (LES) campaign was performed to capture in-cylinder flow and heat transfer within an internal combustion engine (ICE). 
Over 20 cycles of the gas-exchange process were simulated, revealing variations in the intake jet impingement angle on the wall and piston surfaces from cycle to cycle. This calculation confirms a rising level of turbulent intensity during the compression stroke, which is crucial to efficient engine operation. This work marks a milestone achievement in using Nek5000, a highly-scalable computational fluid dynamics (CFD) solver, to capture turbulent flow and thermal fields inside realistic engine geometries. In the context of an arbitrary Lagrangian-Eulerian (ALE) framework, several algorithms have been developed and integrated into Nek5000 in order to overcome the computational challenges associated with moving boundaries (i.e. valves and pistons). In particular, this simulation makes use of a characteristic-based, time-stepping scheme to overcome Courant (CFL) conditions associated with standard, semi-implicit schemes. In addition, a grid-to-grid algorithm is employed to interpolate field variables from one mesh to another in order to maintain spatial resolution requirements. 
The potential impact of this work is an unmatched, accurate, and highly-scalable tool for researchers to facilitate the efficient design of ICEs with short turn-around time.","filename":"post139s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Saumil","last_name":"Patel","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Georgios","last_name":"Giannakopoulos","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Ananias","last_name":"Tomboulides","affiliation":"Aristotle University of Thessaloniki","country":"Greece","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Paul","last_name":"Fischer","affiliation":"University of Illinois Urbana-Champaign","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Christos","last_name":"Frouzakis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Misun","last_name":"Min","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Konstantinos","last_name":"Boulouchos","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"7","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Saumil","last_name":"Patel","affiliation":"Argonne National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post134","type":"poster","title":"ENG05 - OpenFPM for Scalable Particle-Mesh Simulations on Distributed-Memory Computers","begin_time":"20:50","end_time":"21:10","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Scalable and efficient numerical simulations are of increasing importance 
in all areas of science and technology. This is fueled by a steady growth in the performance of computing hardware and increasing heterogeneous parallelism. However, efficiently implementing scalable simulation codes on heterogeneous, distributed hardware systems is the current bottleneck. This bottleneck can be relaxed by intermediate software layers that provide abstractions closer to the problem domain, allowing the computational scientist to focus on the simulation algorithm. Here, we present OpenFPM, an open and scalable framework that provides an abstraction layer for numerical simulations using particles and\/or meshes. OpenFPM provides transparent and scalable infrastructure for shared-memory and distributed-memory implementations of hybrid particle-mesh simulations of both discrete and continuous models. This infrastructure is complemented with portable implementations of frequently used numerical routines, as well as interfaces to third-party libraries. We present the architecture and design of OpenFPM, detail the underlying abstractions, and benchmark the framework in applications ranging from Smoothed Particle Hydrodynamics (SPH) to Molecular Dynamics (MD), Discrete Element Methods (DEM), high-dimensional Monte Carlo sampling (CMA-ES), and Reaction-Diffusion solvers, comparing it to the current state of the art and existing simulation software frameworks.","filename":"post134s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Pietro","last_name":"Incardona","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ivo","last_name":"Sbalzarini","affiliation":"Centre of System Biology Dresden","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Pietro","last_name":"Incardona","affiliation":"Centre of System Biology 
Dresden","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"post167","type":"poster","title":"ENG06 - Wavelet Based Data Compression Strategies for Exascale CFD Simulations","begin_time":"21:10","end_time":"21:30","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The steady increase of available computer resources has enabled engineers and scientists to use progressively more complex models to simulate a myriad of fluid flow problems. Yet, whereas modern high performance computers have seen a steady growth in computing power, the same trend has not been mirrored by a significant gain in data transfer rates. Current systems are capable of producing and processing high amounts of data quickly, while the overall performance is often limited by their ability to transfer and store the computed data. Considering that researchers invariably seek to study simulations with increasingly higher spatial and temporal resolution, the imminent move to exascale computing will consequently only exacerbate this problem. One of the major pitfalls of storing \u0027raw\u0027 simulation data lies in the implicit and redundant manner in which it represents the flow physics. Thus, using image compression algorithms to transform a large \u0027raw\u0027 into a compact data format could help to overcome the I\/O bottleneck. 
We therefore propose to adapt the wavelet-based JPEG-2000 compression standard for volumetric floating-point arrays.","filename":"post167s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Patrick","last_name":"Vogler","affiliation":"University of Stuttgart","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ulrich","last_name":"Rist","affiliation":"University of Stuttgart","country":"Germany","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Patrick","last_name":"Vogler","affiliation":"University of Stuttgart","country":"Germany","bio":"","order":"1","is_presenter":true}]}]} } Presentation
LIF01 - AI-GWAPA: Explainable-AI and Genome Wide Association Phytobiome Analysis
Piet Jones (University of Tennessee, United States of America)
+ Abstract { "session": {"id":"sess148","title":"Posters in Life Sciences","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Life Sciences"],"slots":[{"id":"post179","type":"poster","title":"LIF01 - AI-GWAPA: Explainable-AI and Genome Wide Association Phytobiome Analysis","begin_time":"19:30","end_time":"19:47","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The phytobiome consists of the plant, organismal communities and their environment. The interactions between these have significant effects on observable, measurable traits that have potential economic and sustainability implications. A better systems-level understanding of these interactions will enhance our capacity to design higher yield and sustainable plants. We apply a collection of machine learning and deep learning methods to elucidate the interaction between viral, microbial and plant host systems. Metatranscriptome samples from leaf, xylem and root, along with approximately 10 million SNPs called across a population of 1000 \u003Cem\u003EP. trichocarpa\u003C\/em\u003E trees, allow us to associate host genetic variants to phytobiome constituents. Factorization machines are used, in a deep learning framework, to take higher-order interactions into account, producing a set of high-confidence measurements of taxa abundance, which serve as phenotypes in a Genome Wide Association Analysis. In addition, putative host-driven mutualistic\/pathogenic interactions between taxa are estimated. These provide candidate proteins for a capsule network deep learning model to predict putative protein-protein interactions, taking into account the protein\u2019s quantum chemical properties. Capsule networks provide an explainable AI approach to uncover features driving important interactions. 
This machine\/deep learning framework provides a methodology to better understand complex biological systems.","filename":"post179s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Piet","last_name":"Jones","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Benjamin","last_name":"Garcia","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Stephan","last_name":"Irle","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ka Hung","last_name":"Lee","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Udaya","last_name":"Kalluri","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Wellington","last_name":"Muchero","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Jay","last_name":"Chen","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Gerald","last_name":"Tuskan","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"9","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Piet","last_name":"Jones","affiliation":"University of Tennessee","country":"United States of 
America","bio":"","order":"1","is_presenter":true}]},{"id":"post169","type":"poster","title":"LIF02 - The Bromodomain-Peptide (Un)Binding Network","begin_time":"19:47","end_time":"20:04","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Atomistic simulations are a valuable tool for understanding the structural details involved in biological processes. However, extracting kinetic information such as (un)binding rates for complex molecular systems remains a challenge as it requires both the efficient sampling of rare events and scalable analysis tools to cope with large amounts of data. Progress index-guided sampling (PIGS) is an unsupervised and scalable algorithm capable of enhancing the conformational diversification in a focused search space by simply rewarding spontaneous fluctuations. Here we use PIGS to study the binding mode and dissociation of a tripeptide from four different bromodomains, which are protein modules involved in epigenetics. By focusing the sampling enhancement on the two distinct loops forming the peptide binding pocket we are able to observe several unbinding events. Clustering of conformations and subsequent Markov state model analysis elucidate states, kinetics, and pathways involved in the unbinding process and help to shed light on structural aspects of epigenetic regulation. 
Both the PIGS protocol and the analysis tools rely on scalable libraries implemented in the software CAMPARI, which can also be interfaced with other propagation codes to fully exploit the tools available on different HPC infrastructures.","bio":"","contributors":[{"type":"Author","first_name":"Cassiano","last_name":"Langini","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Marco","last_name":"Bacci","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Andreas","last_name":"Vitalis","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Amedeo","last_name":"Caflisch","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Cassiano","last_name":"Langini","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post177","type":"poster","title":"LIF03 - Explainable Machine Learning for Systems Biology: Tensor Iterative Random Forests","begin_time":"20:04","end_time":"20:21","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A novel algorithm, Tensor iterative Random Forest (TiRF), is able to effectively build forests that can be mined for interactions within a multidimensional X matrix, a multidimensional Y matrix, and interactions between multiple dimensions in X and Y, all at the same time. TiRF uses dimension reduction techniques (such as Lasso or a nested call to iRF) on Y for the given subsets of X, thus ensuring that the new subset of Y dimensions is highly connected to the X being split upon. 
The dimension-reduced Y matrix can then be used in measuring node purity, ensuring that noise is reduced and that the given features in X are measured for the ability to split the appropriate dimensions in Y. The resulting trees and forest now contain paths in X, and each node also has an associated set of Y. This means that random intersection trees can be used to find sets of interacting Xs from the forest and sets of interacting Ys conditional on those sets of X. As the data sets are growing exponentially, petascale deployments of TiRFs are being targeted in order to perform systems-wide analyses of biological or other complex systems that can be represented as matrices.","filename":"post177s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jonathon","last_name":"Romero","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ashley","last_name":"Cliff","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Doug","last_name":"Hyatt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ben","last_name":"Brown","affiliation":"Lawrence Berkeley National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jonathon","last_name":"Romero","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post124","type":"poster","title":"LIF04 - iReceptor: A Platform for Exploring and Analyzing Antibody\/B-cell and 
T-cell Receptor Repertoire Data across Federated Repositories","begin_time":"20:21","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Individuals produce an immense repertoire of antibody\/B-cell and T-cell receptor sequences in order to recognize and destroy a diverse array of pathogens. For example, it has been estimated that the number of possible human B-cell sequences is greater than 10^13, and each person produces 10^8 - 10^9 B-cells during an immune response, many of which are unique. In 2009 NGS approaches were used to characterize the Adaptive Immune Receptor Repertoire (AIRR) in exquisite detail. These AIRR-Seq data sets have rapidly become critical to vaccine development, understanding the immune response in autoimmune disease, and in developing novel therapeutics against cancer. The iReceptor system (ireceptor.org) is a platform to integrate and analyze these immense data sets by combining: 1) an international network of AIRR-Seq data repositories; 2) the ability to federate AIRR-Seq data across these distributed repositories; 3) the ability to apply advanced analytical tools using large advanced research computing (ARC) resources; and 4) a scientific gateway that hides the complexity of performing research queries and advanced analyses across these federated data. iReceptor enables these capabilities by building on the standards work being carried out by the AIRR Community (airr-community.org). 
iReceptor manages the staging of data and control of ARC analysis jobs through the use of the AGAVE science-as-a-service platform (agaveapi.co).","filename":"post124s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Brian","last_name":"Corrie","affiliation":"Simon Fraser University","country":"Canada","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Felix","last_name":"Breden","affiliation":"Simon Fraser University","country":"Canada","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Brian","last_name":"Corrie","affiliation":"Simon Fraser University","country":"Canada","bio":"","order":"1","is_presenter":true}]},{"id":"post122","type":"poster","title":"LIF05 - Modeling Biological Networks with Exponential Random Graph Models","begin_time":"20:38","end_time":"20:55","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Much research in biological networks concerns \u0022motifs\u0022, small subgraphs which occur more frequently than by chance, which are considered the building blocks of complex networks. Exponential random graph models (ERGMs), a well-established class of statistical models for network data which are widely used in social network analysis, represent a principled statistical method of determining whether a motif is over (or under) represented in a network. Although the use of ERGMs for analyzing biological networks was introduced into the bioinformatics literature ten years ago, the use of ERGMs in biological network analysis has been very limited since then due to problems with applying existing methods to such networks, their size being typically far larger than those social networks to which ERGMs are usually applied. 
Here we use high performance computing to apply our recently developed new techniques (snowball sampling, improved fixed density ERGM sampling, scalable Equilibrium Expectation algorithm) for ERGM estimation to several biological networks (protein-protein interaction networks, gene regulatory networks, and a neural network), ranging in size from a few hundred to over five thousand nodes.","filename":"post122s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Alex","last_name":"Stivala","affiliation":"Swinburne University of Technology","country":"Australia","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Maksym","last_name":"Byshkin","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Antonietta","last_name":"Mira","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Garry","last_name":"Robins","affiliation":"The University of Melbourne","country":"Australia","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Alessandro","last_name":"Lomi","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Alex","last_name":"Stivala","affiliation":"Swinburne University of Technology","country":"Australia","bio":"","order":"1","is_presenter":true}]},{"id":"post178","type":"poster","title":"LIF06 - Parallel Kraken for Meta-Omic Microbiome and Phytobiome Classification","begin_time":"20:55","end_time":"21:12","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Meta-omic taxa classification allows for the identification of a wide range of micro- and macro-organisms that influence disease, physiological states, and ecological states in both host systems and 
environmental samples. Traditional classification methods often either use markers or subsample genomes to limit search space (and, as such, lose discriminatory power) for identifying large numbers of taxa. However, accurately identifying taxa in meta-omics datasets requires methods that simultaneously identify taxa across all kingdoms of life, utilize whole genome sequences, and are highly parallel. One drawback to utilizing whole genome sequences without subsampling or combining similar sequences is that databases of all known sequences become impractical to store in RAM. To overcome this limitation, we have developed a highly parallel version of Kraken that can run multiple sequences through multiple databases in parallel and resolve each read assignment to identify taxa in transcriptomic, proteomic, and genomic data. Additionally, we have created databases from all known whole genome sequences in NCBI totaling 115k+ genomes and 700k+ viral genomes from JGI. Parallel Kraken has been used to classify thousands of samples and billions of reads in parallel and is part of a pipeline for understanding relationships between host and microbiome\/phytobiome.","filename":"post178s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Benjamin","last_name":"Garcia","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Piet","last_name":"Jones","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Ian","last_name":"Hodge","affiliation":"Stanford University","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Doug","last_name":"Hyatt","affiliation":"Oak Ridge National Laboratory","country":"United States of 
America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Benjamin","last_name":"Garcia","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post170","type":"poster","title":"LIF07 - SAPPHIRE Basin Recognition: An Unsupervised Algorithm to Identify and Project Metastable and Transition States in High-Dimensional Time Series Data","begin_time":"21:12","end_time":"21:29","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The identification of states and pathways in high-dimensional data sets is a daunting task in that the number of states increases exponentially with dimensionality. While simple projection techniques are prone to introduce overlaps and shortcuts, our ability to process information does mandate that the dynamics be reduced to a small number of essential patterns. Here we present a new approach called SAPPHIRE (states and pathways projected at high resolution) Basin Recognition (SBR) and show the automatic detection of metastable and transition states for different applications ranging from molecular dynamics simulations to temporal series of complex systems taken from other scientific domains. The method is based on a reordering of the trajectory according to a short spanning tree, such that the resulting new sequence amounts to a walk along the state space basin by basin. The temporal series is used to highlight recurrence and to provide a kinetic distance. By using the kinetic distance and the original time series, SBR performs an automatic identification of the basins. 
By comparing with other clustering techniques, we demonstrate the suitability of the method to capture the salient slow modes of the system while maintaining a manageable number of states.","filename":"post170s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Francesco","last_name":"Cocina","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Marco","last_name":"Bacci","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Andreas","last_name":"Vitalis","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Amedeo","last_name":"Caflisch","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Francesco","last_name":"Cocina","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}]} } Presentation
LIF03 - Explainable Machine Learning for Systems Biology: Tensor Iterative Random Forests
Jonathon Romero (University of Tennessee, United States of America)
Tuesday, July 3rd 2018, 20:04 - 20:21, Foyer 2nd Floor
A novel algorithm, Tensor iterative Random Forest (TiRF), is able to effectively build forests that can be mined for interactions within a multidimensional X matrix, a multidimensional Y matrix, and interactions between multiple dimensions in X and Y, all at the same time. TiRF uses dimension reduction techniques (such as Lasso or a nested call to iRF) on Y for the given subsets of X, thus ensuring that the new subset of Y dimensions is highly connected to the X being split upon. The dimension-reduced Y matrix can then be used in measuring node purity, ensuring that noise is reduced and that the given features in X are measured for the ability to split the appropriate dimensions in Y. The resulting trees and forest now contain paths in X, and each node also has an associated set of Y. This means that random intersection trees can be used to find sets of interacting Xs from the forest and sets of interacting Ys conditional on those sets of X. As the data sets are growing exponentially, petascale deployments of TiRFs are being targeted in order to perform systems-wide analyses of biological or other complex systems that can be represented as matrices.
Authors: Jonathon Romero (University of Tennessee, United States of America; presenter), Ashley Cliff (University of Tennessee, United States of America), Doug Hyatt (Oak Ridge National Laboratory, United States of America), Ben Brown (Lawrence Berkeley National Laboratory, United States of America), Daniel Jacobson (Oak Ridge National Laboratory, United States of America)
Presentation: post177s2.pdf
LIF04 - iReceptor: A Platform for Exploring and Analyzing Antibody/B-cell and T-cell Receptor Repertoire Data across Federated Repositories
Brian Corrie (Simon Fraser University, Canada)
+ Abstract { "session": {"id":"sess148","title":"Posters in Life Sciences","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Life Sciences"],"slots":[{"id":"post179","type":"poster","title":"LIF01 - AI-GWAPA: Explainable-AI and Genome Wide Association Phytobiome Analysis","begin_time":"19:30","end_time":"19:47","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The phytobiome consists of the plant, organismal communities and their environment. The interactions between these have significant effects on observable measurable traits that have potential economic and sustainability implications. A better systems-level understanding of these interactions will enhance our capacity to design higher yield and sustainable plants. We apply a collection of machine learning and deep learning methods to elucidate the interaction between viral, microbial and the plant host systems. Metatranscriptome samples from leaf, xylem and root, along with approximately 10 million SNPs called across a population of 1000 \u003Cem\u003EP. trichocarpa\u003C\/em\u003E trees, allow us to associate host genetic variants with phytobiome constituents. Factorization machines are used, in a deep learning framework, to take higher-order interactions into account, producing a set of high-confidence measurements of taxa abundance, which serve as phenotypes in a Genome Wide Association Analysis. In addition, putative host-driven mutualistic\/pathogenic interactions between taxa are estimated. These provide candidate proteins for a capsule network deep learning model to predict putative protein-protein interactions, taking into account the protein\u2019s quantum chemical properties. Capsule networks provide an explainable AI approach to uncover features driving important interactions. 
This machine\/deep learning framework provides a methodology to better understand complex biological systems.","filename":"post179s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Piet","last_name":"Jones","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Benjamin","last_name":"Garcia","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Stephan","last_name":"Irle","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ka Hung","last_name":"Lee","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Udaya","last_name":"Kalluri","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Wellington","last_name":"Muchero","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Jay","last_name":"Chen","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Gerald","last_name":"Tuskan","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"9","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Piet","last_name":"Jones","affiliation":"University of Tennessee","country":"United States of 
America","bio":"","order":"1","is_presenter":true}]},{"id":"post169","type":"poster","title":"LIF02 - The Bromodomain-Peptide (Un)Binding Network","begin_time":"19:47","end_time":"20:04","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Atomistic simulations are a valuable tool for understanding the structural details involved in biological processes. However, extracting kinetic information such as (un)binding rates for complex molecular systems remains a challenge as it requires both the efficient sampling of rare events and scalable analysis tools to cope with large amounts of data. Progress index-guided sampling (PIGS) is an unsupervised and scalable algorithm capable of enhancing the conformational diversification in a focused search space by simply rewarding spontaneous fluctuations. Here we use PIGS to study the binding mode and dissociation of a tripeptide from four different bromodomains, which are protein modules involved in epigenetics. By focusing the sampling enhancement on the two distinct loops forming the peptide binding pocket we are able to observe several unbinding events. Clustering of conformations and subsequent Markov state model analysis elucidate states, kinetics, and pathways involved in the unbinding process and help to shed light on structural aspects of epigenetic regulation. 
Both the PIGS protocol and the analysis tools rely on scalable libraries implemented in the software CAMPARI, which can also be interfaced with other propagation codes to fully exploit the tools available on different HPC infrastructures.","bio":"","contributors":[{"type":"Author","first_name":"Cassiano","last_name":"Langini","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Marco","last_name":"Bacci","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Andreas","last_name":"Vitalis","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Amedeo","last_name":"Caflisch","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Cassiano","last_name":"Langini","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post177","type":"poster","title":"LIF03 - Explainable Machine Learning for Systems Biology: Tensor Iterative Random Forests","begin_time":"20:04","end_time":"20:21","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A novel algorithm, Tensor iterative Random Forest (TiRF), is able to effectively build forests that can be mined for interactions within a multidimensional X matrix, a multidimensional Y matrix, and interactions between multiple dimensions in X and Y, all at the same time. TiRF uses dimension reduction techniques (such as Lasso or a nested call to iRF) on Y for the given subsets of X, thus ensuring that the new subset of Y dimensions is highly connected to the X being split upon. 
The dimension-reduced Y matrix can then be used in measuring node purity, ensuring that noise is reduced and that the given features in X are measured for the ability to split the appropriate dimensions in Y. The resulting trees and forest now contain paths in X, and each node also has an associated set of Y. This means that random intersection trees can be used to find sets of interacting Xs from the forest and sets of interacting Ys conditional on those sets of X. As the data sets are growing exponentially, petascale deployments of TiRFs are being targeted in order to perform systems-wide analyses of biological or other complex systems that can be represented as matrices.","filename":"post177s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jonathon","last_name":"Romero","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ashley","last_name":"Cliff","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Doug","last_name":"Hyatt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ben","last_name":"Brown","affiliation":"Lawrence Berkeley National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jonathon","last_name":"Romero","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post124","type":"poster","title":"LIF04 - iReceptor: A Platform for Exploring and Analyzing Antibody\/B-cell and 
T-cell Receptor Repertoire Data across Federated Repositories","begin_time":"20:21","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Individuals produce an immense repertoire of antibody\/B-cell and T-cell receptor sequences in order to recognize and destroy a diverse array of pathogens. For example, it has been estimated that the number of possible human B-cell sequences is greater than 10^13, and each person produces 10^8 - 10^9 B-cells during an immune response, many of which are unique. In 2009 NGS approaches were used to characterize the Adaptive Immune Receptor Repertoire (AIRR) in exquisite detail. These AIRR-Seq data sets have rapidly become critical to vaccine development, understanding the immune response in autoimmune disease, and in developing novel therapeutics against cancer. The iReceptor system (ireceptor.org) is a platform to integrate and analyze these immense data sets by combining: 1) an international network of AIRR-Seq data repositories; 2) the ability to federate AIRR-Seq data across these distributed repositories; 3) the ability to apply advanced analytical tools using large advanced research computing (ARC) resources; and 4) a scientific gateway that hides the complexity of performing research queries and advanced analyses across these federated data. iReceptor enables these capabilities by building on the standards work being carried out by the AIRR Community (airr-community.org). 
iReceptor manages the staging of data and control of ARC analysis jobs through the use of the AGAVE science-as-a-service platform (agaveapi.co).","filename":"post124s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Brian","last_name":"Corrie","affiliation":"Simon Fraser University","country":"Canada","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Felix","last_name":"Breden","affiliation":"Simon Fraser University","country":"Canada","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Brian","last_name":"Corrie","affiliation":"Simon Fraser University","country":"Canada","bio":"","order":"1","is_presenter":true}]},{"id":"post122","type":"poster","title":"LIF05 - Modeling Biological Networks with Exponential Random Graph Models","begin_time":"20:38","end_time":"20:55","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Much research in biological networks concerns \u0022motifs\u0022, small subgraphs which occur more frequently than by chance, which are considered the building blocks of complex networks. Exponential random graph models (ERGMs), a well-established class of statistical models for network data which are widely used in social network analysis, represent a principled statistical method of determining whether a motif is over (or under) represented in a network. Although the use of ERGMs for analyzing biological networks was introduced into the bioinformatics literature ten years ago, the use of ERGMs in biological network analysis has been very limited since then due to problems with applying existing methods to such networks, their size being typically far larger than those social networks to which ERGMs are usually applied. 
Here we use high performance computing to apply our recently developed new techniques (snowball sampling, improved fixed density ERGM sampling, scalable Equilibrium Expectation algorithm) for ERGM estimation to several biological networks (protein-protein interaction networks, gene regulatory networks, and a neural network), ranging in size from a few hundred to over five thousand nodes.","filename":"post122s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Alex","last_name":"Stivala","affiliation":"Swinburne University of Technology","country":"Australia","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Maksym","last_name":"Byshkin","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Antonietta","last_name":"Mira","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Garry","last_name":"Robins","affiliation":"The University of Melbourne","country":"Australia","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Alessandro","last_name":"Lomi","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Alex","last_name":"Stivala","affiliation":"Swinburne University of Technology","country":"Australia","bio":"","order":"1","is_presenter":true}]},{"id":"post178","type":"poster","title":"LIF06 - Parallel Kraken for Meta-Omic Microbiome and Phytobiome Classification","begin_time":"20:55","end_time":"21:12","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Meta-omic taxa classification allows for the identification of a wide range of micro- and macro-organisms that influence disease, physiological states, and ecological states in both host systems and 
environmental samples. Traditional classification methods often either use markers or subsample genomes to limit search space (and, as such, lose discriminatory power) for identifying large numbers of taxa. However, accurately identifying taxa in meta-omics datasets requires methods that simultaneously identify taxa across all kingdoms of life, utilize whole genome sequences, and are highly parallel. One drawback to utilizing whole genome sequences without subsampling or combining similar sequences is that databases of all known sequences become impractical to store in RAM. To overcome this limitation, we have developed a highly parallel version of Kraken that can run multiple sequences through multiple databases in parallel and resolve each read assignment to identify taxa in transcriptomic, proteomic, and genomic data. Additionally, we have created databases from all known whole genome sequences in NCBI totaling 115k+ genomes and 700k+ viral genomes from JGI. Parallel Kraken has been used to classify thousands of samples and billions of reads in parallel and is part of a pipeline for understanding relationships between host and microbiome\/phytobiome.","filename":"post178s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Benjamin","last_name":"Garcia","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Piet","last_name":"Jones","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Ian","last_name":"Hodge","affiliation":"Stanford University","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Doug","last_name":"Hyatt","affiliation":"Oak Ridge National Laboratory","country":"United States of 
America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Benjamin","last_name":"Garcia","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post170","type":"poster","title":"LIF07 - SAPPHIRE Basin Recognition: An Unsupervised Algorithm to Identify and Project Metastable and Transition States in High-Dimensional Time Series Data","begin_time":"21:12","end_time":"21:29","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The identification of states and pathways in high-dimensional data sets is a daunting task in that the number of states increases exponentially with dimensionality. While simple projection techniques are prone to introduce overlaps and shortcuts, our ability to process information does mandate that the dynamics be reduced to a small number of essential patterns. Here we present a new approach called SAPPHIRE (states and pathways projected at high resolution) Basin Recognition (SBR) and show the automatic detection of metastable and transition states for different applications ranging from molecular dynamics simulations to temporal series of complex systems taken from other scientific domains. The method is based on a reordering of the trajectory according to a short spanning tree, such that the resulting new sequence amounts to a walk along the state space basin by basin. The temporal series is used to highlight recurrence and to provide a kinetic distance. By using the kinetic distance and the original time series, SBR performs an automatic identification of the basins. 
By comparing with other clustering techniques, we demonstrate the suitability of the method to capture the salient slow modes of the system while maintaining a manageable number of states.","filename":"post170s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Francesco","last_name":"Cocina","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Marco","last_name":"Bacci","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Andreas","last_name":"Vitalis","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Amedeo","last_name":"Caflisch","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Francesco","last_name":"Cocina","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"post124","type":"poster","title":"LIF04 - iReceptor: A Platform for Exploring and Analyzing Antibody\/B-cell and T-cell Receptor Repertoire Data across Federated Repositories","begin_time":"20:21","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Individuals produce an immense repertoire of antibody\/B-cell and T-cell receptor sequences in order to recognize and destroy a diverse array of pathogens. For example, it has been estimated that the number of possible human B-cell sequences is greater than 10^13, and each person produces 10^8 - 10^9 B-cells during an immune response, many of which are unique. In 2009 NGS approaches were used to characterize the Adaptive Immune Receptor Repertoire (AIRR) in exquisite detail. 
These AIRR-Seq data sets have rapidly become critical to vaccine development, understanding the immune response in autoimmune disease, and in developing novel therapeutics against cancer. The iReceptor system (ireceptor.org) is a platform to integrate and analyze these immense data sets by combining: 1) an international network of AIRR-Seq data repositories; 2) the ability to federate AIRR-Seq data across these distributed repositories; 3) the ability to apply advanced analytical tools using large advanced research computing (ARC) resources; and 4) a scientific gateway that hides the complexity of performing research queries and advanced analyses across these federated data. iReceptor enables these capabilities by building on the standards work being carried out by the AIRR Community (airr-community.org). iReceptor manages the staging of data and control of ARC analysis jobs through the use of the AGAVE science-as-a-service platform (agaveapi.co).","filename":"post124s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Brian","last_name":"Corrie","affiliation":"Simon Fraser University","country":"Canada","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Felix","last_name":"Breden","affiliation":"Simon Fraser University","country":"Canada","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Brian","last_name":"Corrie","affiliation":"Simon Fraser University","country":"Canada","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Brian","last_name":"Corrie","affiliation":"Simon Fraser University","country":"Canada","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Felix","last_name":"Breden","affiliation":"Simon Fraser University","country":"Canada","bio":"","order":"2","is_presenter":false}] } Presentation
LIF05 - Modeling Biological Networks with Exponential Random Graph Models
, Alex Stivala (Swinburne University of Technology, Australia)
+ Abstract { "session": {"id":"sess148","title":"Posters in Life Sciences","date":"Tuesday, July 3rd 2018","begin_time":"19:30","end_time":"21:30","room":"Foyer 2nd Floor","contributors":[],"view_type":"X","view_type_id":"evtt109","tracks":["Life Sciences"],"slots":[{"id":"post179","type":"poster","title":"LIF01 - AI-GWAPA: Explainable-AI and Genome Wide Association Phytobiome Analysis","begin_time":"19:30","end_time":"19:47","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The phytobiome consists of the plant, organismal communities and their environment. The interactions between these have significant effects on observable measurable traits that have potential economic and sustainability implications. A better systems-level understanding of these interactions will enhance our capacity to design higher yield and sustainable plants. We apply a collection of machine learning and deep learning methods to elucidate the interaction between viral, microbial and the plant host systems. Metatranscriptome samples from leaf, xylem and root along with approximately 10 million SNPs called across a population of 1000 \u003Cem\u003EP. trichocarpa\u003C\/em\u003E trees, allow us to associate host genetic variants to phytobiome constituents. Factorization machines are used, in a deep learning framework, to take higher-order interactions into account producing a set of high-confidence measurements of taxa abundance, which serve as phenotypes in a Genome Wide Association Analysis. In addition, putative host driven mutualistic\/pathogenic interactions between taxa are estimated. These provide candidate proteins for a capsule network deep learning model to predict putative protein-protein interactions, taking into account the protein\u2019s quantum chemical properties. Capsule networks provide an explainable AI approach to uncover features driving important interactions. 
This machine\/deep learning framework provides a methodology to better understand complex biological systems.","filename":"post179s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Piet","last_name":"Jones","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Benjamin","last_name":"Garcia","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Stephan","last_name":"Irle","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ka Hung","last_name":"Lee","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Udaya","last_name":"Kalluri","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Wellington","last_name":"Muchero","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Jay","last_name":"Chen","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Gerald","last_name":"Tuskan","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"9","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Piet","last_name":"Jones","affiliation":"University of Tennessee","country":"United States of 
America","bio":"","order":"1","is_presenter":true}]},{"id":"post169","type":"poster","title":"LIF02 - The Bromodomain-Peptide (Un)Binding Network","begin_time":"19:47","end_time":"20:04","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Atomistic simulations are a valuable tool for understanding the structural details involved in biological processes. However, extracting kinetic information such as (un)binding rates for complex molecular systems remains a challenge as it requires both the efficient sampling of rare events and scalable analysis tools to cope with large amounts of data. Progress index-guided sampling (PIGS) is an unsupervised and scalable algorithm capable of enhancing the conformational diversification in a focused search space by simply rewarding spontaneous fluctuations. Here we use PIGS to study the binding mode and dissociation of a tripeptide from four different bromodomains, which are protein modules involved in epigenetics. By focusing the sampling enhancement on the two distinct loops forming the peptide binding pocket we are able to observe several unbinding events. Clustering of conformations and subsequent Markov state model analysis elucidate states, kinetics, and pathways involved in the unbinding process and help to shed light on structural aspects of epigenetic regulation. 
Both the PIGS protocol and the analysis tools rely on scalable libraries implemented in the software CAMPARI, which can also be interfaced with other propagation codes to fully exploit the tools available on different HPC infrastructures.","bio":"","contributors":[{"type":"Author","first_name":"Cassiano","last_name":"Langini","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Marco","last_name":"Bacci","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Andreas","last_name":"Vitalis","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Amedeo","last_name":"Caflisch","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Cassiano","last_name":"Langini","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"post177","type":"poster","title":"LIF03 - Explainable Machine Learning for Systems Biology: Tensor Iterative Random Forests","begin_time":"20:04","end_time":"20:21","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"A novel algorithm, Tensor iterative Random Forest (TiRF), is able to effectively build forests that can be mined for interactions within a multidimensional X matrix, a multidimensional Y matrix, and interactions between multiple dimensions in X and Y, all at the same time. TiRF uses dimension reduction techniques (such as Lasso or a nested call to iRF) on Y for the given subsets of X, thus ensuring that the new subset of Y dimensions is highly connected to the X being split upon. 
The dimension-reduced Y matrix can then be used in measuring node purity, ensuring that noise is reduced and that the given features in X are measured for the ability to split the appropriate dimensions in Y. The resulting trees and forest now contain paths in X, and each node also has an associated set of Y. This means that random intersection trees can be used to find sets of interacting Xs from the forest and sets of interacting Ys conditional on those sets of X. As the data sets are growing exponentially, petascale deployments of TiRFs are being targeted in order to perform systems-wide analyses of biological or other complex systems that can be represented as matrices.","filename":"post177s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Jonathon","last_name":"Romero","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ashley","last_name":"Cliff","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Doug","last_name":"Hyatt","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Ben","last_name":"Brown","affiliation":"Lawrence Berkeley National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jonathon","last_name":"Romero","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post124","type":"poster","title":"LIF04 - iReceptor: A Platform for Exploring and Analyzing Antibody\/B-cell and 
T-cell Receptor Repertoire Data across Federated Repositories","begin_time":"20:21","end_time":"20:38","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Individuals produce an immense repertoire of antibody\/B-cell and T-cell receptor sequences in order to recognize and destroy a diverse array of pathogens. For example, it has been estimated that the number of possible human B-cell sequences is greater than 10^13, and each person produces 10^8 - 10^9 B-cells during an immune response, many of which are unique. In 2009 NGS approaches were used to characterize the Adaptive Immune Receptor Repertoire (AIRR) in exquisite detail. These AIRR-Seq data sets have rapidly become critical to vaccine development, understanding the immune response in autoimmune disease, and in developing novel therapeutics against cancer. The iReceptor system (ireceptor.org) is a platform to integrate and analyze these immense data sets by combining: 1) an international network of AIRR-Seq data repositories; 2) the ability to federate AIRR-Seq data across these distributed repositories; 3) the ability to apply advanced analytical tools using large advanced research computing (ARC) resources; and 4) a scientific gateway that hides the complexity of performing research queries and advanced analyses across these federated data. iReceptor enables these capabilities by building on the standards work being carried out by the AIRR Community (airr-community.org). 
iReceptor manages the staging of data and control of ARC analysis jobs through the use of the AGAVE science-as-a-service platform (agaveapi.co).","filename":"post124s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Brian","last_name":"Corrie","affiliation":"Simon Fraser University","country":"Canada","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Felix","last_name":"Breden","affiliation":"Simon Fraser University","country":"Canada","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Brian","last_name":"Corrie","affiliation":"Simon Fraser University","country":"Canada","bio":"","order":"1","is_presenter":true}]},{"id":"post122","type":"poster","title":"LIF05 - Modeling Biological Networks with Exponential Random Graph Models","begin_time":"20:38","end_time":"20:55","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Much research in biological networks concerns \u0022motifs\u0022, small subgraphs which occur more frequently than by chance, which are considered the building blocks of complex networks. Exponential random graph models (ERGMs), a well-established class of statistical models for network data which are widely used in social network analysis, represent a principled statistical method of determining whether a motif is over (or under) represented in a network. Although the use of ERGMs for analyzing biological networks was introduced into the bioinformatics literature ten years ago, the use of ERGMs in biological network analysis has been very limited since then due to problems with applying existing methods to such networks, their size being typically far larger than those social networks to which ERGMs are usually applied. 
Here we use high performance computing to apply our recently developed new techniques (snowball sampling, improved fixed density ERGM sampling, scalable Equilibrium Expectation algorithm) for ERGM estimation to several biological networks (protein-protein interaction networks, gene regulatory networks, and a neural network), ranging in size from a few hundred to over five thousand nodes.","filename":"post122s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Alex","last_name":"Stivala","affiliation":"Swinburne University of Technology","country":"Australia","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Maksym","last_name":"Byshkin","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Antonietta","last_name":"Mira","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Garry","last_name":"Robins","affiliation":"The University of Melbourne","country":"Australia","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Alessandro","last_name":"Lomi","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Alex","last_name":"Stivala","affiliation":"Swinburne University of Technology","country":"Australia","bio":"","order":"1","is_presenter":true}]},{"id":"post178","type":"poster","title":"LIF06 - Parallel Kraken for Meta-Omic Microbiome and Phytobiome Classification","begin_time":"20:55","end_time":"21:12","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Meta-omic taxa classification allows for the identification of a wide range of micro- and macro-organisms that influence disease, physiological states, and ecological states in both host systems and 
environmental samples. Traditional classification methods often either use markers or subsample genomes to limit search space (and, as such, lose discriminatory power) for identifying large numbers of taxa. However, accurately identifying taxa in meta-omics datasets requires methods that simultaneously identify taxa across all kingdoms of life, utilize whole genome sequences, and are highly parallel. One drawback to utilizing whole genome sequences without subsampling or combining similar sequences is that databases of all known sequences become impractical to store in RAM. To overcome this limitation, we have developed a highly parallel version of Kraken that can run multiple sequences through multiple databases in parallel and resolve each read assignment to identify taxa in transcriptomic, proteomic, and genomic data. Additionally, we have created databases from all known whole genome sequences in NCBI totaling 115k+ genomes and 700k+ viral genomes from JGI. Parallel Kraken has been used to classify thousands of samples and billions of reads in parallel and is part of a pipeline for understanding relationships between host and microbiome\/phytobiome.","filename":"post178s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Benjamin","last_name":"Garcia","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Piet","last_name":"Jones","affiliation":"University of Tennessee","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Ian","last_name":"Hodge","affiliation":"Stanford University","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Doug","last_name":"Hyatt","affiliation":"Oak Ridge National Laboratory","country":"United States of 
America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Jacobson","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Benjamin","last_name":"Garcia","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"post170","type":"poster","title":"LIF07 - SAPPHIRE Basin Recognition: An Unsupervised Algorithm to Identify and Project Metastable and Transition States in High-Dimensional Time Series Data","begin_time":"21:12","end_time":"21:29","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The identification of states and pathways in high-dimensional data sets is a daunting task in that the number of states increases exponentially with dimensionality. While simple projection techniques are prone to introduce overlaps and shortcuts, our ability to process information does mandate that the dynamics be reduced to a small number of essential patterns. Here we present a new approach called SAPPHIRE (states and pathways projected at high resolution) Basin Recognition (SBR) and show the automatic detection of metastable and transition states for different applications ranging from molecular dynamics simulations to temporal series of complex systems taken from other scientific domains. The method is based on a reordering of the trajectory according to a short spanning tree, such that the resulting new sequence amounts to a walk along the state space basin by basin. The temporal series is used to highlight recurrence and to provide a kinetic distance. By using the kinetic distance and the original time series, SBR performs an automatic identification of the basins. 
By comparing with other clustering techniques, we demonstrate the suitability of the method to capture the salient slow modes of the system while maintaining a manageable number of states.","filename":"post170s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Francesco","last_name":"Cocina","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Marco","last_name":"Bacci","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Andreas","last_name":"Vitalis","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Amedeo","last_name":"Caflisch","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Francesco","last_name":"Cocina","affiliation":"University of Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"post122","type":"poster","title":"LIF05 - Modeling Biological Networks with Exponential Random Graph Models","begin_time":"20:38","end_time":"20:55","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Much research in biological networks concerns \u0022motifs\u0022, small subgraphs which occur more frequently than by chance, which are considered the building blocks of complex networks. Exponential random graph models (ERGMs), a well-established class of statistical models for network data which are widely used in social network analysis, represent a principled statistical method of determining whether a motif is over (or under) represented in a network. 
Although the use of ERGMs for analyzing biological networks was introduced into the bioinformatics literature ten years ago, the use of ERGMs in biological network analysis has been very limited since then due to problems with applying existing methods to such networks, their size being typically far larger than those social networks to which ERGMs are usually applied. Here we use high performance computing to apply our recently developed new techniques (snowball sampling, improved fixed density ERGM sampling, scalable Equilibrium Expectation algorithm) for ERGM estimation to several biological networks (protein-protein interaction networks, gene regulatory networks, and a neural network), ranging in size from a few hundred to over five thousand nodes.","filename":"post122s2.pdf","bio":"","contributors":[{"type":"Author","first_name":"Alex","last_name":"Stivala","affiliation":"Swinburne University of Technology","country":"Australia","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Maksym","last_name":"Byshkin","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Antonietta","last_name":"Mira","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Garry","last_name":"Robins","affiliation":"The University of Melbourne","country":"Australia","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Alessandro","last_name":"Lomi","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Alex","last_name":"Stivala","affiliation":"Swinburne University of Technology","country":"Australia","bio":"","order":"1","is_presenter":true}]}, "slotContributors": 
[{"type":"Author","first_name":"Alex","last_name":"Stivala","affiliation":"Swinburne University of Technology","country":"Australia","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Maksym","last_name":"Byshkin","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Antonietta","last_name":"Mira","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Garry","last_name":"Robins","affiliation":"The University of Melbourne","country":"Australia","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Alessandro","last_name":"Lomi","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"5","is_presenter":false}] } Presentation
LIF06 - Parallel Kraken for Meta-Omic Microbiome and Phytobiome Classification
Benjamin Garcia (Oak Ridge National Laboratory, United States of America)
20:55 - 21:12
Co-authors: Piet Jones (University of Tennessee, United States of America), Ian Hodge (Stanford University, United States of America), Doug Hyatt (Oak Ridge National Laboratory, United States of America), Daniel Jacobson (Oak Ridge National Laboratory, United States of America)
Meta-omic taxa classification allows for the identification of a wide range of micro- and macro-organisms that influence disease, physiological states, and ecological states in both host systems and environmental samples. Traditional classification methods often either use markers or subsample genomes to limit search space (and, as such, lose discriminatory power) for identifying large numbers of taxa. However, accurately identifying taxa in meta-omics datasets requires methods that simultaneously identify taxa across all kingdoms of life, utilize whole genome sequences, and are highly parallel. One drawback to utilizing whole genome sequences without subsampling or combining similar sequences is that databases of all known sequences become impractical to store in RAM. To overcome this limitation, we have developed a highly parallel version of Kraken that can run multiple sequences through multiple databases in parallel and resolve each read assignment to identify taxa in transcriptomic, proteomic, and genomic data. Additionally, we have created databases from all known whole genome sequences in NCBI totaling 115k+ genomes and 700k+ viral genomes from JGI. Parallel Kraken has been used to classify thousands of samples and billions of reads in parallel and is part of a pipeline for understanding relationships between host and microbiome/phytobiome.
Presentation
LIF07 - SAPPHIRE Basin Recognition: An Unsupervised Algorithm to Identify and Project Metastable and Transition States in High-Dimensional Time Series Data
Francesco Cocina (University of Zurich, Switzerland)
Abstract: The identification of states and pathways in high-dimensional data sets is a daunting task because the number of states increases exponentially with dimensionality. While simple projection techniques are prone to introduce overlaps and shortcuts, our ability to process information does mandate that the dynamics be reduced to a small number of essential patterns. Here we present a new approach called SAPPHIRE (states and pathways projected at high resolution) Basin Recognition (SBR) and show the automatic detection of metastable and transition states for different applications ranging from molecular dynamics simulations to temporal series of complex systems taken from other scientific domains. The method is based on a reordering of the trajectory according to a short spanning tree, such that the resulting new sequence amounts to a walk along the state space basin by basin. The temporal series is used to highlight recurrence and to provide a kinetic distance. By using the kinetic distance and the original time series, SBR performs an automatic identification of the basins. By comparing with other clustering techniques, we demonstrate the suitability of the method to capture the salient slow modes of the system while maintaining a manageable number of states.
Authors: Francesco Cocina (presenter), Marco Bacci, Andreas Vitalis, Amedeo Caflisch (all University of Zurich, Switzerland)
Session: Posters in Life Sciences, Tuesday, July 3, 2018, 21:12 - 21:29, Foyer 2nd Floor
Presentation: post170s2.pdf
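The trajectory reordering mentioned in the LIF07 abstract can be illustrated with a small sketch. It is not the SBR or CAMPARI implementation: it assumes an exact, brute-force spanning-tree construction (the insertion order of Prim's algorithm) instead of the "short spanning tree" named in the abstract, it uses plain Euclidean distance, and the function name `progress_index` and the toy data are hypothetical.

```python
import math


def progress_index(points, start=0):
    """Return snapshot indices in spanning-tree insertion order.

    Assumption: at each step the unvisited snapshot closest to ANY
    already visited snapshot is appended (Prim's algorithm order).
    This is an O(n^2) illustration, not a scalable implementation.
    """
    n = len(points)
    visited = [False] * n
    # best[i] = distance from snapshot i to its nearest visited snapshot
    best = [math.inf] * n
    order = []
    current = start
    for _ in range(n):
        visited[current] = True
        order.append(current)
        # Update nearest-visited distances using the newly added snapshot.
        for i in range(n):
            if not visited[i]:
                d = math.dist(points[current], points[i])
                if d < best[i]:
                    best[i] = d
        # Pick the closest unvisited snapshot as the next one to add.
        nxt, nxt_d = -1, math.inf
        for i in range(n):
            if not visited[i] and best[i] < nxt_d:
                nxt, nxt_d = i, best[i]
        current = nxt
    return order


# Toy 1-D "trajectory" that hops between two basins near 0 and near 5.
points = [(0.0,), (5.0,), (0.1,), (5.1,), (0.2,), (5.2,)]
print(progress_index(points))  # [0, 2, 4, 1, 3, 5]
```

In the reordered sequence all snapshots of the first basin appear before any snapshot of the second, which is the "basin by basin" walk the abstract describes; the kinetic annotation and the automatic basin detection are not covered by this sketch.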
PHY02 - Hydrodynamical High Performance Simulations of an Accretion Disk surrounding a Supermassive Black Hole and its Interactions
Fabian Klein (University of Heidelberg, Germany)
Abstract: This computational and numerical project aims to improve the modelling of star-gas disk interactions in active galactic nuclei (AGN) disks. In the context of the StarDisk project ([Just et al., 2012]) we consider an AGN system consisting of a central star cluster, a supermassive black hole (SMBH) and an accretion disk (AD). The AD creates dissipative forces acting on stars in the disk, resulting in an increased mass flow to the SMBH and asymmetries in the phase space distribution due to its rotation. We employ a vertically extended Keplerian disk (alpha formulation [Shakura and Sunyaev, 1973]), allowing a fully self-consistent treatment of stellar dynamics including the dissipative force originating from star-gas ram pressure effects. [Kennedy et al., 2016] treat the disk as stationary (no bi-directional feedback, no checks of the disk’s lifetime and stability). We use an ideal monatomic gas, alpha viscosity, the disk's self-gravity and radiation transport in the flux-limited diffusion (FLD) limit. Additionally, force equilibrium-ensuring initial conditions are used. We employ the PLUTO Code ([Mignone et al., 2007]) with modifications and modules (self-gravity and FLD) developed by Rolf Kuiper ([Kuiper et al., 2010a]) in order to carry out the numerical simulations of the resulting Navier-Stokes equations, including massively parallel runs as offered by PLUTO.
Authors: Fabian Klein (presenter), Rainer Spurzem, Andreas Just (all University of Heidelberg, Germany), Rolf Kuiper (Universität Tübingen, Germany)
Session: Posters in Physics, Tuesday, July 3, 2018, 20:10 - 20:50, Foyer 2nd Floor
Presentation: post151s2.pdf
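For readers unfamiliar with the alpha formulation cited in the PHY02 abstract, the standard Shakura-Sunyaev prescription writes the kinematic viscosity as nu = alpha * c_s * H, with H = c_s / Omega_K the scale height of a thin Keplerian disk. The sketch below only evaluates that textbook relation; the function name and the example numbers are hypothetical, and the poster's vertically extended disk model in PLUTO may well differ in detail.

```python
import math

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2


def alpha_viscosity(alpha, sound_speed, radius, central_mass):
    """Kinematic viscosity nu = alpha * c_s * H in SI units.

    Assumption: thin-disk scale height H = c_s / Omega_K, where
    Omega_K = sqrt(G * M / r^3) is the Keplerian angular velocity.
    """
    omega_k = math.sqrt(G * central_mass / radius**3)
    scale_height = sound_speed / omega_k
    return alpha * sound_speed * scale_height


# Illustrative values only: alpha = 0.1, c_s = 10 km/s,
# r = 1e14 m, M = 2e36 kg (roughly a million solar masses).
nu = alpha_viscosity(0.1, 1.0e4, 1.0e14, 2.0e36)
print(f"nu = {nu:.3e} m^2/s")
```

The viscosity scales linearly with alpha and quadratically with the sound speed, which is why a single dimensionless parameter is enough to set the strength of angular momentum transport in such models.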
PHY03 - A Performance Model for Quantum ESPRESSO’s PWscf
Pietro Bonfà (CINECA, Italy)
Performance modelling of applications is essential to the co-design of future exascale architectures. To this end, the parallel execution of PWscf (Plane-Wave Self-Consistent Field), one of the most widely used components of the Quantum ESPRESSO open-source suite, has been carefully profiled and analyzed on modern many-core systems. The results were categorized and grouped into a set of distinct kernels that describe the execution flow of the application and can be used to reproduce the absolute execution time, as a function of the input data, on different architectures. In this poster presentation we will describe the strategy used to define the model, the outcomes it can provide and the future perspectives for this work.
Co-authors: Fabio Affinito (CINECA, Italy), Carlo Cavazzoni (CINECA, Italy)
SED01 - Extreme Scale Global Convection Models for Flow-Induced Topography
, Simon Bauer (Ludwig Maximilian University of Munich, Germany)
The asthenosphere is a mechanically weak layer in the uppermost part of the Earth's mantle, lying beneath the outermost rigid layer, i.e., the lithosphere. While there is robust evidence for the existence of such a weak layer, the thickness of the asthenosphere remains debated. In this study, we investigate the effect of several asthenosphere channel thicknesses on dynamic topography, i.e., topography caused by global mantle flow. We start from a thickness of 1000 km and go down to an extremely narrow channel of 100 km, with viscosity contrasts of up to three orders of magnitude. Additionally, we add lateral viscosity variations using a present-day temperature field. We employ a prototype of a new mantle convection framework, based on a matrix-free finite element implementation. We demonstrate that our framework is capable of simulating such highly complex scenarios and present results with an unprecedented global resolution close to 1 km using more than 60,000 compute cores on a peta-scale system. This yields systems with O(10^12) degrees of freedom. These results will help us gain insight into the current debate on the power spectrum of the observed surface dynamic topography of the Earth.
Co-authors: Hans-Peter Bunge (Ludwig Maximilian University of Munich, Germany), Siavash Ghelichkhan (Ludwig Maximilian University of Munich, Germany), Markus Huber (TU Munich, Germany), Marcus Mohr (Ludwig Maximilian University of Munich, Germany), Ulrich Rüde (Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany), Barbara Wohlmuth (TU Munich, Germany)
Wednesday, July 4, 2018
09:00 - 10:00
CSCS Update
Montreal Room
Chair: Willem Deconinck (ECMWF, United Kingdom)
The European Centre for Medium-Range Weather Forecasts (ECMWF) leads a number of Horizon 2020 activities (ESCAPE) with innovation actions for developing a holistic understanding of energy-efficiency for extreme-scale applications using heterogeneous HPC architectures by: (a) defining and encapsulating the fundamental algorithmic building blocks ("Weather and Climate Dwarfs") underlying weather and climate services; (b) combining frontier research on algorithm development with hardware adaptation using DSLs; (c) developing benchmarks and cross-disciplinary Verification, Validation, and Uncertainty Quantification (VVUQ) for weather and climate applications; and (d) synthesizing the complementary skills of global numerical weather prediction with leading European researchers.
This talk will illustrate the need for and practicality of producing ensembles of km-scale simulations, summarize progress on accelerating state-of-the-art global weather and climate predictions, and discuss outstanding issues and future directions on producing and analysing big weather data while balancing time-critical customer needs with energy- and time-to-solution.
Nils P. Wedi has a PhD from Ludwig Maximilian University of Munich and joined ECMWF in 1995. His career at ECMWF encapsulates a diverse range of work both technical and scientific. He leads ECMWF's Earth System Modelling section that addresses all aspects of scientific and computational performance relating to ECMWF's forecast model and the ensemble forecasting system. He develops strategies to secure the scalability of the model on future high-performance computing systems. He is the scientific coordinator of the European H2020 projects ESCAPE and ESCAPE-2 to address the challenges of rising energy cost for computing towards affordable, exascale high performance simulations of weather and climate, and he is a member of the World Meteorological Organization working group on numerical experimentation (WGNE).
10:50 - 11:15
Coffee Break
Foyer 2nd Floor
11:15 - 13:15
Minisymposia Session V
MS35 - Gravitational-Wave Data Analysis with the Current Generation of Advanced Detectors
Osaka Room
Organizer(s):
Maria Haney (University of Zurich, Switzerland)
, Philippe Jetzer (University of Zurich, Switzerland)
Track(s):
Physics
In the last two years, the field of gravitational-wave astronomy has seen a breakthrough: in February 2016, the ground-based Laser Interferometer Gravitational-Wave Observatory (LIGO) announced the first direct detection of gravitational waves and the first observation of a binary black hole merger. Very recently, the LIGO-Virgo collaborations and their partner observatories announced the first joint observation of a neutron star merger in gravitational waves and the electromagnetic spectrum, marking a new era in multi-messenger astronomy. This minisymposium is intended to give an overview of certain important aspects of gravitational-wave data analysis with the current generation of ground-based detectors, highlighting applications of high-performance computing, machine learning and citizen science. The four presentations of the minisymposium will address the following topics: searches for gravitational waves in the detector data; computational aspects of source parameter estimation and physics implications of the LIGO-Virgo data; numerical relativity and its applications for the modelling of gravitational waves; and aspects of data quality and data cleaning for the LIGO-Virgo data.
11:15 - 11:45
The LIGO/Virgo Search for Gravitational Waves
, Alexander Nitz (Max Planck Institute for Gravitational Physics, Germany)
The LIGO and Virgo detectors have completed a prolific observation run. We are now observing gravitational waves from the mergers of both binary black holes and neutron stars. We'll discuss how these discoveries were made and consider what the near future of searches for gravitational waves from compact binary mergers will look like.
11:45 - 12:15
Methods and Challenges in the Characterization of Gravitational-Wave Sources
, Salvatore Vitale (Massachusetts Institute of Technology, United States of America)
Advanced gravitational-wave detectors have so far detected signals emitted by the coalescence of two neutron stars or two black holes. Their astrophysical (masses, spins) and extrinsic (position, orientation) parameters have been estimated using stochastic samplers that efficiently explore a 15D parameter space. Although theoretically straightforward, parameter estimation can become nontrivial due to the time required to calculate waveform models, the behavior of the noise, and the correlation between parameters. In this talk I will review the main methods, results, and challenges associated with the characterization of compact binaries detected by Advanced LIGO and Virgo. I will also underline which new challenges will arise as the sensitivity of gravitational-wave detectors improves by a factor of a few or a factor of 10.
12:15 - 12:45
Numerical Relativity and its Applications for the Modelling of Gravitational Waves
, Sascha Husa (University of the Balearic Islands, Spain)
The optimally efficient detection of gravitational-wave events and the robust identification of the sources of such events rely on accurate models of the gravitational-wave signals for astrophysically plausible events. Such models are synthesized from numerical solutions of the Einstein equations and perturbative models. This talk will review the current status and open challenges in solving the Einstein equations numerically as a system of partial differential equations, as applied to the coalescence of compact binaries in the context of gravitational source modelling.
12:45 - 13:15
Data Quality for Gravitational-Wave Detectors
, Andrew P. Lundgren (University of Portsmouth, United Kingdom)
Abstract: Gravitational-wave detectors are extremely complicated and unprecedentedly sensitive machines. We monitor many thousands of sensors and control systems to detect any hardware problems, electronics failures, noise from the environment, and all the many other subtle issues which degrade our sensitivity. I'll talk about many of the ways that we mine this large data set, including methods from signal processing and statistics, machine learning, and citizen science, to diagnose and solve the issues obstructing our searches for gravitational waves.
Presentation: msa239s1.pdf
Organizer(s):
Frank Wuerthwein (UC San Diego, United States of America), Kaushik De (The University of Texas at Arlington, United States of America)
Track(s):
Computer Science and Applied Mathematics, Physics
This minisymposium will present the latest advances in using HPC systems worldwide to produce physics results from large experiments in High Energy Physics and Particle Astrophysics. While HPC usage worldwide will be described, the experiences of the Large Hadron Collider (LHC) experiments with leadership-class computing facilities in the US will be highlighted. LHC experiments have traditionally used the infrastructure provided by the Worldwide LHC Computing Grid. In recent years this has been supplemented by incorporating traditional HPC systems into the production and analysis computing systems at the LHC. Primarily CPU-intensive simulation workflows are executed at HPCs, though other workflows are also being tested. The experience gained by the LHC experiments over the past few years has opened HPC usage to other experimental and data-intensive sciences. Four talks at this minisymposium will summarize the state of the art and the future wish list for HPC usage for current and future experiments.
11:45 - 12:15
Big Data on HPC via HEPCloud
Marco Mascheroni (University of California San Diego, United States of America)
Author(s): Dirk Hufnagel (Fermilab, United States of America), Marco Mascheroni (University of California San Diego, United States of America)
Abstract: The higher energy and luminosity from the LHC in Run 2 have put increased pressure on CMS computing resources. Extrapolating to even higher luminosities (and thus higher event complexities and trigger rates) beyond Run 3, it becomes clear that simply scaling up the current model of CMS computing alone will become economically unfeasible. High Performance Computing (HPC) facilities, widely used in scientific computing outside of HEP, have the potential to help fill the gap. Here we describe the USCMS efforts to integrate US HPC resources into CMS Computing via the HEPCloud project at Fermilab. We describe the HEPCloud project, its goal and the driving vision behind it: to function as a portal to an ecosystem of diverse computing resources, commercial or academic. We will then focus on how CMS has been using HPC for CMS workflows, mainly at the NERSC Cori facility. We will discuss our experiences in running data-intensive workflows on HPC, especially how the IO requirements (both local storage and network) have mapped to HPC resources. We will describe past and current challenges and future plans for HPC use in CMS.
Presentation: msa235s1.pdf
12:45 - 13:15
HPC Systems and the Integration Challenges of Large Instruments
Frank Wuerthwein (UC San Diego, United States of America)
Abstract: Large instruments such as the four LHC detectors, LIGO, Virgo, IceCube, LSST, DUNE, Belle II, and so forth, are designed, built, and operated by large international collaborations. Bringing to bear all the resources that these collaborations have access to requires globally distributed cyberinfrastructure, into which leadership-class HPC systems need to integrate seamlessly in order to be maximally useful. We discuss an architectural wish list for future large-scale HPC systems in order to better support science with large instruments. We give some examples of how these features have been integrated into existing machines.
Presentation: msa184s1.pdf
Organizer(s):
Ritabrata Dutta (Università della Svizzera italiana, Switzerland), Nikos Karathanasopoulos (ETH Zurich, Switzerland), Bastien Chopard (University of Geneva, Switzerland)
Track(s):
Life Sciences, Engineering, Emerging Application Domains, Computer Science and Applied Mathematics
The HPUQ minisymposium focuses on uncertainty quantification (UQ) of mechanistic models for the natural sciences (e.g. Engineering, Life and Aquatic Sciences) using high performance computing (HPC). The statistical inference (e.g. calibration) of complex mechanistic models, given an abundance of data arriving from heterogeneous sources, poses a methodological and computational challenge for scientists. In the first session, the minisymposium highlights cutting-edge frameworks for rigorous and robust UQ, such as ABCpy, Π4U, PyMLMC and SPUX, that address these issues, with a focus on optimal algorithmic performance and efficient utilization of HPC resources. In the second session of the minisymposium, we shift the focus to applications of UQ methodologies in several important scientific domains, spanning from Biomedicine and Biomechanics to Aerospace Engineering and Fluid Dynamics.
11:15 - 11:45
ABCpy: Benchmarking ABC Algorithms from HPC Perspective
Ritabrata Dutta (Università della Svizzera italiana, Switzerland)
Co-authors: Marcel Schoengens (ETH Zurich / CSCS, Switzerland), Avinash Ummadisinghu (Università della Svizzera italiana, Switzerland), Nicole Widerman (ETH Zurich, Switzerland), Jukka-Pekka Onnela (Harvard University, United States of America), Antonietta Mira (Università della Svizzera italiana, Switzerland)
Abstract: ABCpy is a highly modular scientific library for Approximate Bayesian Computation (ABC) written in Python. Our main contribution is to illustrate a software engineering effort that enables domain scientists to easily apply ABC (for likelihood-free Bayesian uncertainty quantification of mechanistic models) to their research without being ABC experts; using ABCpy they can run large parallel simulations without much knowledge about parallelization and without much additional effort to parallelize their code. Further, ABCpy enables ABC experts to easily develop new inference schemes, evaluate them in a standardized environment, and extend the library with new algorithms. These benefits come mainly from the modularity of ABCpy. We give an overview of the design of ABCpy and provide a performance evaluation concentrating on parallelization. This points us towards the inherent imbalance in some of the ABC algorithms. We develop a dynamic scheduling MPI implementation to mitigate this issue and classify ABC algorithms according to their adaptability towards high-performance computing.
11:45 - 12:15
The Hierarchical Bayesian Framework Applied to Molecular Dynamics
Georgios Arampatzis (ETH Zurich, Switzerland)
Co-authors: Petros Koumoutsakos (ETH Zurich, Switzerland)
Abstract: The Hierarchical Bayesian (HB) framework for the quantification of uncertainty and model selection in the presence of heterogeneous data will be presented, together with an efficient algorithm for sampling the high-dimensional posterior distribution. The framework is then applied to two problems from Molecular Dynamics. In the first problem we revisit the exponent related to the repulsion force in the Lennard-Jones potential. Using experimental data from the radial distribution function of argon in various thermodynamic conditions, we show that the exponent should be close to 6.5. We show that the proposed 6-p potential applies to a wider range of thermodynamic conditions than the classical 6-12 potential. In the second problem we address the question of the best coarse-grained model for liquid water. Typically, the level of coarse-graining and the model complexity are preselected based on physical intuition. These assumptions are rarely systematically addressed even though the model's accuracy, efficiency and transferability critically depend on them. We propose the HB framework as a means for the rigorous selection of the coarse-graining level and show its validity conditional on macroscopic quantities under various thermodynamic conditions.
12:15 - 12:45
PyMLMC + SPUX: Uncertainty Quantification Using Multi-Level and Particle Filtering Techniques
Jonas Sukys (Swiss Federal Institute of Aquatic Science and Technology, Switzerland)
Abstract: Evolution of complex systems such as hydrodynamic flows and ecological networks can be modeled using differential equations and individual-based models. Examples include Saint-Venant and Navier-Stokes equations for lakes, rivers and tsunamis, multi-phase Euler equations for cavitation, Darcy’s law for porous flows and predator-prey foodwebs for mesocosm dynamics. Many such dynamical systems strongly depend on aleatorically uncertain input data, such as initial data, sources and model coefficients, with additional epistemic uncertainty directly influencing the evolution trajectories. In this talk I will introduce two parallel uncertainty quantification frameworks: PyMLMC and SPUX. PyMLMC propagates uncertainty in model input using optimal fidelity multi-level Monte Carlo sampling, which significantly accelerates the standard Monte Carlo method by clever variance reduction relying on a series of coarse resolution simulations used as control variates. A significantly more challenging task is probabilistic model parameter estimation incorporating prior expert knowledge and observed experimental data. The Python framework SPUX employs Particle Markov Chain Monte Carlo for efficient marginal likelihood approximations by iteratively evolving and adaptively filtering multiple state particles in parallel while completely avoiding heavy filesystem access. Modularity and efficient use of computational resources make SPUX accessible for domain scientists interested in Bayesian inference for complex stochastic models.
12:45 - 13:15
Low-Rank Tensor Approximations for Sensitivity Analysis of Complex Models with High-Dimensional Input
Katerina Konakli (COWI, Denmark)
+ Abstract { "session": {"id":"sess181","title":"MS37 - HPUQ: Current Challenges in Uncertainty Quantification for Mechanistic Models, Part I: Theory, Methods and Tools","date":"Wednesday, July 4th 2018","begin_time":"11:15","end_time":"13:15","room":"Singapore Room","contributors":[{"type":"Session Chair","first_name":"Nikos","last_name":"Karathanasopoulos","affiliation":"ETH Zurich","country":"Switzerland"}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Life Sciences","Engineering","Emerging Application Domains","Computer Science and Applied Mathematics"],"slots":[{"id":"symp116","type":"minisymposia","title":"MS37 - HPUQ: Current Challenges in Uncertainty Quantification for Mechanistic Models, Part I: Theory, Methods and Tools","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"The HPUQ minisymposium focuses on uncertainty quantification (UQ) of mechanistic models for natural sciences (eg. Engineering, Life and Aquatic Sciences) using high performance computing (HPC). The statistical inference (e.g. calibration) of models for complex mechanistic models, in the abundance of data arriving from heterogeneous sources, poses a methodological and computational challenge for scientists. In the first session, the minisymposium highlights cutting edge frameworks for rigorous and robust UQ as ABCpy, \u03a04U, PyMLMC, SPUX to address these issues, with a focus towards optimal algorithmic performance and efficient utilization of HPC resources. 
In the second session of the minisymposium, we shift the focus to the applications of UQ methodologies in several important scientific domains spanning from Biomedicine and Biomechanics to Aerospace Engineering and Fluid Dynamics.","bio":"","contributors":[{"type":"Organizer","first_name":"Ritabrata","last_name":"Dutta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Organizer","first_name":"Nikos","last_name":"Karathanasopoulos","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Organizer","first_name":"Bastien","last_name":"Chopard","affiliation":"University of Geneva","country":"Switzerland","bio":"","order":"3","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Ritabrata","last_name":"Dutta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa225","type":"child","title":"ABCpy: Benchmarking ABC Algorithms from HPC Perspective","begin_time":"11:15","end_time":"11:45","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"ABCpy is a highly modular scientific library for Approximate Bayesian Computation (ABC) written in Python. Our main contribution is to illustrate a software engineering effort that enables domain scientists to easily apply ABC (for likelihood-free Bayesian uncertainty quantification of Mechanistic models) to their research without being ABC experts; using ABCpy they can easily run large parallel simulations without much knowledge about parallelization, even without much additional effort to parallelize their code. Further, ABCpy enables ABC experts to easily develop new inference schemes and evaluate them in a standardized environment and to extend the library with new algorithms. These benefits come mainly from the modularity of ABCpy. 
We give an overview of the design of ABCpy and provide a performance evaluation concentrating on parallelization. This points us towards the inherent imbalance in some of the ABC algorithms. We develop a dynamic scheduling MPI implementation to mitigate this issue and classify ABC algorithms according to their adaptability towards high-performance computing.","filename":"msa225s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Ritabrata","last_name":"Dutta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Marcel","last_name":"Schoengens","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Avinash","last_name":"Ummadisinghu","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Nicole","last_name":"Widerman","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Jukka-Pekka","last_name":"Onnela","affiliation":"Harvard University","country":"United States of America","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Antonietta","last_name":"Mira","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Ritabrata","last_name":"Dutta","affiliation":"Universit\u00e0 della Svizzera italiana","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa243","type":"child","title":"The Hierarchical Bayesian Framework Applied to Molecular Dynamics","begin_time":"11:45","end_time":"12:15","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The Hierarchical Bayesian (HB) framework for the quantification of 
uncertainty and model selection in the presence of heterogeneous data will be presented, as well as an efficient algorithm for sampling the high-dimensional posterior distribution. The framework is then applied to two problems from Molecular Dynamics. In the first problem we revisit the exponent related to the repulsion force in the Lennard-Jones potential. Using experimental data from the radial distribution function of argon under various thermodynamic conditions, we show that the exponent should be close to 6.5. We show that the proposed 6-\u003Cem\u003Ep \u003C\/em\u003Epotential applies to a wider range of thermodynamic conditions than the classical 6-12 potential. In the second problem we address the question of the best coarse-grained model for liquid water. Typically, the level of coarse-graining and the model complexity are preselected based on physical intuition. These assumptions are rarely systematically addressed even though the model\u0027s accuracy, efficiency and transferability critically depend on them. 
We propose the HB framework as a mean for the rigorous selection of coarse-graining level and show its validity conditional on macroscopic quantities on various thermodynamic conditions.","filename":"msa243s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Georgios","last_name":"Arampatzis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Petros","last_name":"Koumoutsakos","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Georgios","last_name":"Arampatzis","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa238","type":"child","title":"PyMLMC + SPUX: Uncertainty Quantification Using Multi-Level and Particle Filtering Techniques","begin_time":"12:15","end_time":"12:45","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Evolution of complex systems such as hydrodynamic flows and ecological networks can be modeled using differential equations and individual based models. Examples include Saint-Venant and Navier-Stokes equations for lakes, rivers and tsunamis, multi-phase Euler equations for cavitation, Darcy\u2019s law for porous flows and predator-prey foodwebs for mesocosm dynamics. Many of such dynamical systems strongly depend on alleatorically uncertain input data, such as initial data, sources and model coefficients, with additional epistemic uncertainty directly influencing the evolution trajectories. In this talk I will introduce two parallel uncertainty quantification frameworks: PyMLMC and SPUX. PyMLMC propagates uncertainty in model input using optimal fidelity multi-level Monte Carlo sampling, which significantly accelerates standard Monte Carlo method by clever variance reduction relying on a series of coarse resolution simulations used as control variates. 
A significantly more challenging task is probabilistic model parameter estimation incorporating prior expert knowledge and observed experimental data. The Python framework SPUX employs Particle Markov Chain Monte Carlo for efficient marginal likelihood approximations by iteratively evolving and adaptively filtering multiple state particles in parallel while completely avoiding heavy filesystem access. Modularity and efficient use of computational resources makes SPUX accessible for domain scientists interested in Bayesian inference for complex stochastic models.","bio":"","contributors":[{"type":"Author","first_name":"Jonas","last_name":"Sukys","affiliation":"Swiss Federal Institute of Aquatic Science and Technology","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Jonas","last_name":"Sukys","affiliation":"Swiss Federal Institute of Aquatic Science and Technology","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa171","type":"child","title":"Low-Rank Tensor Approximations for Sensitivity Analysis of Complex Models with High-Dimensional Input","begin_time":"12:45","end_time":"13:15","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"This work presents a computationally efficient method for conducting sensitivity analysis of complex, expensive-to-evaluate computer models. The focus is set on the so-called Sobol\u2019 sensitivity indices, which represent the fraction of the total variance of a response quantity of interest (QoI) that can be attributed to a random input variable or a group thereof. The proposed method for computing these indices is based on substituting the original model with a low-rank tensor approximation (LRA) meta-model. 
The LRA meta-model provides a statistically equivalent representation of the QoI as a sum of rank-one tensors, the parameters of which can be determined from a relatively small number of runs of the original model. Because the number of unknown parameters in a LRA meta-model grows only linearly with the dimension of the random input, LRA can be particularly efficient in high-dimensional problems. It is demonstrated that the Sobol\u2019 indices can be computed \u003Cem\u003Eanalytically\u003C\/em\u003E in terms of the LRA parameters, thus enabling efficient analysis of computationally heavy models. The accuracy and efficiency of the approach is manifested in example applications related to structural mechanics, heat conduction and hydrogeology.","filename":"msa171s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Katerina","last_name":"Konakli","affiliation":"COWI","country":"Denmark","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Katerina","last_name":"Konakli","affiliation":"COWI","country":"Denmark","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"msa171","type":"child","title":"Low-Rank Tensor Approximations for Sensitivity Analysis of Complex Models with High-Dimensional Input","begin_time":"12:45","end_time":"13:15","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"This work presents a computationally efficient method for conducting sensitivity analysis of complex, expensive-to-evaluate computer models. The focus is set on the so-called Sobol\u2019 sensitivity indices, which represent the fraction of the total variance of a response quantity of interest (QoI) that can be attributed to a random input variable or a group thereof. The proposed method for computing these indices is based on substituting the original model with a low-rank tensor approximation (LRA) meta-model. 
The LRA meta-model provides a statistically equivalent representation of the QoI as a sum of rank-one tensors, the parameters of which can be determined from a relatively small number of runs of the original model. Because the number of unknown parameters in a LRA meta-model grows only linearly with the dimension of the random input, LRA can be particularly efficient in high-dimensional problems. It is demonstrated that the Sobol\u2019 indices can be computed \u003Cem\u003Eanalytically\u003C\/em\u003E in terms of the LRA parameters, thus enabling efficient analysis of computationally heavy models. The accuracy and efficiency of the approach is manifested in example applications related to structural mechanics, heat conduction and hydrogeology.","filename":"msa171s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Katerina","last_name":"Konakli","affiliation":"COWI","country":"Denmark","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Katerina","last_name":"Konakli","affiliation":"COWI","country":"Denmark","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Katerina","last_name":"Konakli","affiliation":"COWI","country":"Denmark","bio":"","order":"1","is_presenter":true}] } Presentation
Organizer(s):
Ivano Tavernelli (IBM Research, Switzerland)
Matthieu Mottet (IBM Research, Switzerland)
Track(s):
Chemistry and Materials
Modeling energy and mass transport phenomena in the solid state accurately and efficiently in computer simulations is of critical importance to the discovery and design of new materials for important applications, such as lithium-ion solid-state electrolytes, proton-conducting fuel cells and better thermoelectrics. Progress in recent years has enabled the modeling community to elucidate transport mechanisms in several materials of importance, but major obstacles need to be tackled to achieve further progress.
Firstly, the computational cost of the current methods limits the time and length scales of the systems that can be modeled. We will present several efforts that are targeted either at reaching larger system sizes and time scales in simulations, or at improving the estimates of transport coefficients from finite simulation lengths using more advanced statistical tools.
Secondly, with increases in computer power and novel methods for modeling and analysis, it has become possible to push for large-scale screening of structural databases for desired transport properties. Such high-throughput approaches require new concepts of automation, data reproducibility and results dissemination in the field of materials simulation, and we will present the developed concepts, applications and results.
12:45 - 13:15
Accurate Thermal Conductivities from Optimally Short Molecular Dynamics Simulations
Loris Ercole (SISSA, Italy)
Abstract
The evaluation of thermal transport coefficients in extended systems is known to require impractically long simulations, thus calling for a paradigm shift that would allow the deployment of state-of-the-art quantum simulation methods. We recently introduced a new method [1] to compute these coefficients from optimally short molecular dynamics simulations, based on the Green-Kubo theory of linear response and the cepstral analysis of time series. Information from the full sample power spectrum of the current for a single and relatively short trajectory is leveraged to evaluate and optimally reduce the noise affecting its zero-frequency value, whose expectation is proportional to the corresponding conductivity. Our method is unbiased and consistent, in that both the resulting bias and statistical error can be made arbitrarily small in the long-time limit. A simple data-analysis protocol is proposed and validated in some paradigmatic cases (liquid Ar and H2O, crystalline MgO and amorphous SiO2), showing that simulation times of one to a few hundred picoseconds are sufficient in these systems to achieve an accuracy of the order of 10% on the estimated thermal conductivities. Finally, we present a first application of this method to the ab initio simulation of heat transport in amorphous silica.
[1] Ercole, Marcolongo, and Baroni, Sci. Rep. 7, 15835 (2017)
Co-authors: Aris Marcolongo (IBM Research, Switzerland), Stefano Baroni (SISSA, Italy)
Presentation: msa241s1.pdf
Session: MS38 - Mass and Energy Transport Phenomena in Solid State
Wednesday, July 4, 2018, 11:15 - 13:15, Boston 3 Room
Chair: Matthieu Mottet (IBM Research, Switzerland)
Other talks in this session:
11:15 - 11:45
The Materials Genome in Action
Seyed Mohamad Moosavi (EPFL, Switzerland)
Co-author: Berend Smit (EPFL, Switzerland)
It is now possible to make an enormous spectrum of different nanoporous materials simply by changing the building blocks in the synthesis of Metal Organic Frameworks (MOFs) or related materials. This unique chemical tunability allows us to tailor-make materials that are optimal for a given application. The promise of finding just the right material seems remote, however: because of practical limitations, we can only ever synthesise, characterise and test a tiny fraction of all possible materials. To take full advantage of this development, therefore, we need to develop alternative techniques, collectively referred to as Materials Genomics, to rapidly screen large numbers of materials and obtain fundamental insights into the chemical nature of the ideal material for a given application. In this lecture we illustrate this approach by suggesting how to obtain optimal materials for gas separations and gas storage.
11:45 - 12:15
High-Throughput Screening for New Solid-State Electrolyte Candidates
Leonid Kahle (EPFL, Switzerland)
Co-authors: Aris Marcolongo (IBM Research, Switzerland), Nicola Marzari (EPFL, Switzerland)
Extensive computational screening of structural databases for solid-state ionic conductors can lead to novel candidate materials for next-generation solid-state lithium-ion batteries, while deepening our understanding of the microscopic processes governing ionic diffusion in the solid state. Such a task is ambitious because no current modeling technique is both unbiasedly predictive for chemically diverse systems and computationally affordable. In order to reach this goal of efficiency and accuracy, we simplify the potential-energy surface that would be provided by density-functional theory with physically motivated approximations. The result is a novel hybrid quantum/empirical model that can be used to perform molecular dynamics simulations of solid-state diffusion. The efficiency of the model allows one to adopt a high-throughput screening approach, which is deployed here using the AiiDA materials informatics platform. In this talk I will present the different screening protocols and show how high-level workflows can automate and streamline the calculation of transport coefficients. Lastly, the full provenance of the calculated data is preserved by AiiDA, allowing one to search the data for atomistic descriptors that are predictive of ionic diffusion.
12:15 - 12:45
Doping Solid-State Electrolytes: Classical Modelling and Insights
Matthieu Mottet (IBM Research, Switzerland)
Co-authors: Aris Marcolongo (IBM Research, Switzerland), Ivano Tavernelli (IBM Research, Switzerland), Teodoro Laino (IBM Research, Switzerland)
Doping and elemental substitution are essential tools for the optimisation of the conductivity of solid-state electrolytes (SSE). Therefore, the ability to model doped structures is an important step towards discovering and optimising new SSEs in silico. The complexity and disorder introduced by doping make it computationally unrealistic to use first-principles frameworks for high-throughput studies. Classical force fields make it possible to overcome this limitation. In this talk, I will present a systematic approach to the training of polarisable force fields for solid-state electrolytes and demonstrate a methodology that accounts for the disorder introduced by doping. This approach is similarly well suited for structures with partial ionic occupancy. In particular, I will showcase detailed results for W-doped garnet-type SSE, with particular attention to the impact of the change in the carrier concentration and in the local potential introduced by the impurities on the dynamics of the system. The study underlines the challenges presented by doping and the complex interplay between its thermodynamic and kinetic effects.
Organizer(s):
Christian Boehm (ETH Zurich, Switzerland)
Václav Hapla (ETH Zurich, Switzerland)
Track(s):
Computer Science and Applied Mathematics, Solid Earth Dynamics, Physics
Nearly all fields in geophysics combine numerical models and data measurements to predict the future state of a dynamical system and/or infer unknown parameters. Such models may produce highly nonlinear systems with extremely large numbers of unknowns.
The ever-increasing power and widespread availability of massively parallel supercomputers offer researchers the opportunity to continually increase both the spatio-temporal resolution and the physical complexity within their numerical models. However, this requires access to solvers that can harness the resources of high-performance computing clusters efficiently and that scale to problem sizes with billions of degrees of freedom.
In this minisymposium, we discuss numerical and algorithmic approaches, scientific libraries and coding practices to develop and maintain scalable solvers for various applications in geophysics. Examples include, but are not limited to, seismic wave propagation and imaging, geodynamics, and hydro-mechanical processes.
11:15 - 11:45
Extreme Scale Seismic Wave Propagation Simulation for Mars
Václav Hapla (ETH Zurich, Switzerland)
+ Abstract { "session": {"id":"sess193","title":"MS39 - Scalable Solvers for Forward and Inverse Problems in Geophysics","date":"Wednesday, July 4th 2018","begin_time":"11:15","end_time":"13:15","room":"Darwin Room","contributors":[{"type":"Session Chair","first_name":"Christian","last_name":"Boehm","affiliation":"ETH Zurich","country":"Switzerland"}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Computer Science and Applied Mathematics","Solid Earth Dynamics","Physics"],"slots":[{"id":"symp114","type":"minisymposia","title":"MS39 - Scalable Solvers for Forward and Inverse Problems in Geophysics","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"Nearly all fields in geophysics combine numerical models and data measurements to predict the future state of a dynamical system and\/or infer unknown parameters. Such models may produce highly nonlinear systems with extremely large numbers of unknowns.\u003Cbr \/\u003E\u003Cbr \/\u003EThe ever-increasing power and widespread availability of massively parallel supercomputers offers researchers the opportunity to continually increase\u00a0both the spatio-temporal resolution and the physical complexity within their numerical models. However, this requires access to solvers that can harness the resources of high-performance computing clusters efficiently and which scale for problem sizes with billions of degrees of freedom.\u003Cbr \/\u003E\u003Cbr \/\u003EIn this minisymposium, we discuss numerical and algorithmic approaches, scientific libraries and coding practices to develop and maintain scalable solvers for various applications in geophysics. 
Examples include, but are not limited to, seismic wave propagation and imaging, geodynamics, and hydro-mechanical processes.","bio":"","contributors":[{"type":"Organizer","first_name":"Christian","last_name":"Boehm","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Organizer","first_name":"V\u00e1clav","last_name":"Hapla","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Christian","last_name":"Boehm","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa196","type":"child","title":"Extreme Scale Seismic Wave Propagation Simulation for Mars","begin_time":"11:15","end_time":"11:45","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In 2018, the NASA InSight mission will place a highly sensitive broadband seismometer on Mars\u0027 surface to investigate its deep interior structure. As Mars has strong 3D features such as topography and crustal thickness, elastic wave propagation simulations are a crucial component for data interpretation. At the highest frequencies that are typically used, the resulting system has trillions of spatial degrees of freedom and requires hundreds of thousands of time steps. As there will only be a single receiver, reciprocity of the wave equation can be used to swap sources and receivers, so a total of three numerical simulations allows computing seismograms for any number of seismic sources.\u00a0To solve problems of that scale, we develop the Salvus software suite for full waveform modelling and inversion. Salvus makes use of the well-known PETSc toolkit. Its DMPlex module represents a mesh by a graph whose vertices represent cells, faces, edges and nodes uniformly. 
Discretization methods can be used unchanged for meshes of different shapes and dimensions.\u00a0For such large simulations, loading the whole mesh onto a single processor must be avoided. Hence, we employ parallel I\/O, on-the-fly partitioning, and load-balancing techniques, working with a distributed DMPlex representation throughout the whole simulation.","filename":"msa196s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"V\u00e1clav","last_name":"Hapla","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Martin","last_name":"van Driel","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael","last_name":"Afanasiev","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Christian","last_name":"Boehm","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Lion","last_name":"Krischer","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"V\u00e1clav","last_name":"Hapla","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa259","type":"child","title":"Seismic Wave Propagation on Complex Topographies Applied in the Alpine Area Using the ExaHyPE Hyperbolic PDE Engine","begin_time":"11:45","end_time":"12:15","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"ExaHyPE is a Horizon 2020 EU project to develop a high-performance engine to solve hyperbolic systems of PDEs using the high-order discontinuous Galerkin finite element method. 
The project goals are to develop an engine with flexible support for various applications which shall be tailored towards expected exascale architectures. The end-user is provided with an abstraction of the complicated algorithms to implement the ADER-DG numerical scheme and of the issues related to scalability and parallel adaptive mesh refinement (AMR), which are handled internally by the Peano framework. In our presentation we will give an introduction on how to implement scalable seismic wave propagation algorithms on complex topographies in the ExaHyPE engine. We will show and compare time-to-solution results for two different approaches for simulations in the alpine area. By remaining internally on a Cartesian mesh both methods allow modeling of the topography without the cumbersome meshing process. First is a newly developed curvilinear mesh approach which we are able to implement by only transforming flux and source terms. Second is a diffuse interface method which extends the PDE by a parameter handling the transition from solid to surface to a non-linear system.","bio":"","contributors":[{"type":"Author","first_name":"Leonhard","last_name":"Rannabauer","affiliation":"TU Munich","country":"Germany","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Kenneth","last_name":"Duru","affiliation":"Ludwig Maximilian University of Munich","country":"Germany","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Alice-Agnes","last_name":"Gabriel","affiliation":"Ludwig Maximilian University of Munich","country":"Germany","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Michael","last_name":"Bader","affiliation":"TU Munich","country":"Germany","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Leonhard","last_name":"Rannabauer","affiliation":"TU 
Munich","country":"Germany","bio":"","order":"1","is_presenter":true}]},{"id":"msa179","type":"child","title":"StagBL: A Scalable, Portable, High-Performance Discretization and Solver Layer for Geodynamic Simulation","begin_time":"12:15","end_time":"12:45","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"StagBL is an open-source parallel solver and discretization library for geodynamic simulation, encapsulating and optimizing operations essential to staggered-grid finite volume Stokes flow solvers. These form the basis for highly-efficient application codes for long-term mantle convection and lithospheric dynamics. StagBL prevents common bottlenecks to improving scalability, swapping solvers, adapting to new architectures, and optimizing performance. The StagBL project addresses these issues by providing a streamlined library to provide a path to performance from toy codes to quality, scalable implementations. It provides a parallel staggered-grid abstraction in C and Fortran, and an interface (DMStag) for PETSc. Tools are available to define boundary conditions, interact with particle systems, and efficiently solve Stokes systems in small (direct solver), medium (simple preconditioners), and large (block factorization and multigrid) model regimes. By implementing common kernels beneath a uniform abstraction layer, StagBL enables optimization for modern hardware, thus reducing community barriers to large-scale parallel simulation on modern architectures, and a platform to develop innovative new tools. 
By working directly with leading application codes StagYY, I3ELVIS, and LaMEM, and providing an API and examples for others, StagBL aims to become a community tool supplying scalable, portable, reproducible performance to novel science in regional- and planet-scale geodynamics.","bio":"","contributors":[{"type":"Author","first_name":"Patrick","last_name":"Sanan","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Patrick","last_name":"Sanan","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa163","type":"child","title":"HPC Solution Methods for Simulation of Hydro-Mechanical Processes in Geo-Environment","begin_time":"12:45","end_time":"13:15","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"First, the contribution describes a hydro-mechanical model which combines Richards\u0027 model for variably saturated flow and nonlinear elasticity. If the flow and deformation are coupled by deformation dependent permeability and retention relation and saturation dependent elastic moduli, then the coupled hydro-mechanical model is suitable e.g. for simulation of processes in bentonite based engineering barriers in deep geological repository for the high-level radioactive waste. A validation of such type model was done within the international DECOVALEX project. The second part of the contribution deals with discussion about suitable iterative solution methods which combine iterations for solving coupled hydro-mechanical system with iterations for solving decoupled problems. 
Finally, we describe HPC solution methods for the systems arising from linearization.","filename":"msa163s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Radim","last_name":"Blaheta","affiliation":"Institute of Geonics CAS","country":"Czech Republic","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Martin","last_name":"Hasal","affiliation":"Institute of Geonics CAS","country":"Czech Republic","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jakub","last_name":"Kruzik","affiliation":"Institute of Geonics CAS","country":"Czech Republic","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Tomas","last_name":"Luber","affiliation":"Institute of Geonics CAS","country":"Czech Republic","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Zdenek","last_name":"Michalec","affiliation":"Institute of Geonics CAS","country":"Czech Republic","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"Jiri","last_name":"Stary","affiliation":"Institute of Geonics CAS","country":"Czech Republic","bio":"","order":"6","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Radim","last_name":"Blaheta","affiliation":"Institute of Geonics CAS","country":"Czech Republic","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"msa196","type":"child","title":"Extreme Scale Seismic Wave Propagation Simulation for Mars","begin_time":"11:15","end_time":"11:45","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In 2018, the NASA InSight mission will place a highly sensitive broadband seismometer on Mars\u0027 surface to investigate its deep interior structure. As Mars has strong 3D features such as topography and crustal thickness, elastic wave propagation simulations are a crucial component for data interpretation. 
At the highest frequencies that are typically used, the resulting system has trillions of spatial degrees of freedom and requires hundreds of thousands of time steps. As there will only be a single receiver, reciprocity of the wave equation can be used to swap sources and receivers, so a total of three numerical simulations allows computing seismograms for any number of seismic sources.\u00a0To solve problems of that scale, we develop the Salvus software suite for full waveform modelling and inversion. Salvus makes use of the well-known PETSc toolkit. Its DMPlex module represents a mesh by a graph whose vertices represent cells, faces, edges and nodes uniformly. Discretization methods can be used unchanged for meshes of different shapes and dimensions.\u00a0For such large simulations, loading the whole mesh onto a single processor must be avoided. Hence, we employ parallel I\/O, on-the-fly partitioning, and load-balancing techniques, working with a distributed DMPlex representation throughout the whole simulation.","filename":"msa196s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"V\u00e1clav","last_name":"Hapla","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Martin","last_name":"van Driel","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael","last_name":"Afanasiev","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Christian","last_name":"Boehm","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Lion","last_name":"Krischer","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"V\u00e1clav","last_name":"Hapla","affiliation":"ETH 
Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"V\u00e1clav","last_name":"Hapla","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Martin","last_name":"van Driel","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Michael","last_name":"Afanasiev","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Christian","last_name":"Boehm","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Lion","last_name":"Krischer","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false}] } Presentation
12:45 - 13:15
HPC Solution Methods for Simulation of Hydro-Mechanical Processes in Geo-Environment
Radim Blaheta (Institute of Geonics CAS, Czech Republic)
Abstract: First, the contribution describes a hydro-mechanical model that combines Richards' model for variably saturated flow with nonlinear elasticity. When flow and deformation are coupled through a deformation-dependent permeability and retention relation and through saturation-dependent elastic moduli, the coupled model is suitable, for example, for simulating processes in bentonite-based engineering barriers in a deep geological repository for high-level radioactive waste. A model of this type was validated within the international DECOVALEX project. The second part of the contribution discusses suitable iterative solution methods that combine iterations for the coupled hydro-mechanical system with iterations for the decoupled problems. Finally, we describe HPC solution methods for the systems arising from linearization.
Authors: Radim Blaheta (presenter), Martin Hasal, Jakub Kruzik, Tomas Luber, Zdenek Michalec, Jiri Stary (all Institute of Geonics CAS, Czech Republic)
11:15 - 13:15
MS40 - Towards Weather and Climate Simulations at 1-km Resolution
Rio Room
Organizer(s):
Peter Dominik Dueben (ECMWF, United Kingdom)
Carlos E. Osuna (MeteoSwiss, Switzerland)
Track(s):
Computer Science and Applied Mathematics, Climate and Weather, Physics
Reliable weather predictions and climate projections are of vital importance for society and for the creation and preservation of prosperity. Increasing the horizontal resolution of global atmospheric simulations to ~1 km would allow large cloud systems and deep convection to be represented explicitly, which would reduce much of the uncertainty in weather and climate predictions. Unfortunately, it is not yet possible to run operational simulations with state-of-the-art global weather and climate models at this resolution on today’s supercomputing facilities. This minisymposium will provide an update on the progress of global weather and climate models running at 1 km horizontal resolution. It will assess the most significant challenges on the way to cloud-resolving simulations (including scalability, performance far below peak, I/O volume and efficient time-stepping schemes) and discuss possible ways to overcome them.
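As a rough illustration of why ~1 km global resolution is so demanding, the back-of-envelope sketch below estimates the number of horizontal grid columns and the relative cost at different grid spacings. It is not taken from any of the talks: the function names are hypothetical, the 9 km reference spacing is an arbitrary choice, and the cost model assumes a uniform grid, a fixed number of vertical levels, and a time step that shrinks in proportion to the grid spacing.

```python
import math

# Surface area of a sphere with the mean Earth radius (6371 km), in km^2.
EARTH_SURFACE_KM2 = 4.0 * math.pi * 6371.0 ** 2


def horizontal_columns(dx_km: float) -> float:
    """Approximate number of grid columns for a uniform spacing of dx_km."""
    return EARTH_SURFACE_KM2 / dx_km ** 2


def relative_cost(dx_km: float, reference_dx_km: float) -> float:
    """Cost relative to a reference spacing.

    Assumption: cost scales with the number of columns times the number of
    time steps, and the time step is proportional to the grid spacing.
    """
    ratio = reference_dx_km / dx_km
    return ratio ** 2 * ratio  # more columns, times more time steps


if __name__ == "__main__":
    for dx in (9.0, 5.0, 1.0):
        print(f"{dx:4.1f} km: {horizontal_columns(dx):.2e} columns, "
              f"{relative_cost(dx, 9.0):7.1f}x the cost of 9 km")
```

Under these simplified assumptions, refining the grid spacing by a factor of nine increases the cost by a factor of 9³; real models deviate from this because of vertical resolution changes, I/O and imperfect scaling.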
11:15 - 11:45
At the Edge of Resolution: Earth System Modelling at ECMWF
Nils P. Wedi (ECMWF, United Kingdom)
+ Abstract { "session": {"id":"sess197","title":"MS40 - Towards Weather and Climate Simulations at 1-km Resolution","date":"Wednesday, July 4th 2018","begin_time":"11:15","end_time":"13:15","room":"Rio Room","contributors":[{"type":"Session Chair","first_name":"Peter Dominik","last_name":"Dueben","affiliation":"ECMWF","country":""}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Computer Science and Applied Mathematics","Climate and Weather","Physics"],"slots":[{"id":"symp111","type":"minisymposia","title":"MS40 - Towards Weather and Climate Simulations at 1-km Resolution","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"Reliable weather predictions and climate projections are of vital importance for society and for the creation and preservation of prosperity. An increase in horizontal resolution in global simulations of the atmosphere to ~1 km would enable us to represent large cloud-systems and deep convection explicitly within simulations. This would reduce much of the uncertainty in predictions of weather and climate. Unfortunately, it is not possible yet to run operational simulations with state-of-the-art global weather and climate models at this level of resolution using today\u2019s supercomputing facilities. This minisymposium will provide an update on the progress of global weather and climate models when running at 1 km horizontal resolution. 
The most significant challenges towards cloud-resolving simulations will be assessed (including scalability, simulations far away from peak-performance, I\/O volume and efficient time-stepping schemes) and possible ways to overcome these barriers will be discussed.","bio":"","contributors":[{"type":"Organizer","first_name":"Peter Dominik","last_name":"Dueben","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Organizer","first_name":"Carlos E.","last_name":"Osuna","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Peter Dominik","last_name":"Dueben","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"msa197","type":"child","title":"At the Edge of Resolution: Earth System Modelling at ECMWF","begin_time":"11:15","end_time":"11:45","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Global km-scale simulations of weather and climate are at the frontier to become technically feasible as a research tool for exploring the added value of resolving rather than parametrising the vertical redistribution of momentum and heat at globally uniform resolutions of O(1km). Handling and processing the global data is still pushing the (design) limits of current infrastructures, but it is expected to inspire the future design process, and pave the road for the next generation of weather and climate services. For example, ECMWF plans to run an ensemble of forecasts at 5km resolution in 2025. 
This talk will discuss and compare simulations at 18km, 9km, 5km, 2.5km and 1.25km resolution, by analysing the statistical differences of horizontal and vertical motions, and expected ensemble statistics for extreme events at 5km, and by limited investigations into the relative effects of different choices for the numerics and (remaining) physical parametrizations. These illustrate that resolution is important but not the only aspect delivering continuous improvements in global numerical weather prediction.","filename":"msa197s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nils P.","last_name":"Wedi","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nils P.","last_name":"Wedi","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]},{"id":"msa263","type":"child","title":"Using Global Cloud-Resolving Models for Weather Predictions and for Studies of Clouds in the Climate System","begin_time":"11:45","end_time":"12:15","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The FV3 group at the NOAA\/GFDL and the RCEC at the Academia Sinica, Taiwan, is developing a new type of Global Cloud-Resolving Model (GCRM) based on an integrated dynamics-physics concept, in which several fast-acting physics (e.g., cloud microphysics) are incorporated into a new FV3 (nu-FV3) framework. This new model is still being actively developed. This new model improves the dynamics-physics interaction and increases in computational efficiency due to the separation of the fast-acting physics from the slow-physics, allowing a near tenfold increase in overall time step. We have also built some of the SubGrid Orographical processes into the nu-FV3 dynamics, which unavoidably breaks the traditional boundary between \u0022dynamics\u0022 and \u0022physics\u0022. 
We believe the boundary between the \u0022dynamics\u0022 and \u0022physics\u0022 set by the traditional modeling framework is one reason that limits modeling advancements. A preliminary version of this new type of GCRM is used for the DYAMOND project. We will carry out several 40-day \u0022convective-parameterization-free\u0022 experiments across the gray-zone at three different horizontal resolutions: 13, 6.5, and 3.25 km. As a potential tool for sub-seasonal predictions, we shall analyze the hindcast skill (first 10 days) as well as the systematic \u0022climate basis\u0022 for the last 30 days.","bio":"","contributors":[{"type":"Author","first_name":"Shian-Jiann","last_name":"Lin","affiliation":"NOAA","country":"United States of America","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Shian-Jiann","last_name":"Lin","affiliation":"NOAA","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"msa271","type":"child","title":"Near-Global RCM Simulations to Establish a Baseline for Global 1 km GCM Simulations","begin_time":"12:15","end_time":"12:45","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Reducing the horizontal resolution of global weather and climate models to the kilometer-scale holds the promise of reducing some of the long-standing biases and uncertainties. At these resolutions, some of the key processes such as deep convection, gravity wave drag and ocean eddies can be resolved explicitly on the model grid and thus much closer to first principles. But how far are we from achieving this goal? The presentation will show results from scaling a regional climate model (COSMO) to cover almost the entire Earth at increasing resolutions of up to 1 km. 
COSMO has been systematically adapted to make use of hybrid compute node designs with accelerators such as graphics processing units (GPUs) and thus can make efficient use of all of Europe\u2019s currently largest supercomputer, Piz Daint. To our knowledge this represents the first complete atmospheric model being run entirely on accelerators at this scale. At a grid spacing of 930\u2009m (1.9\u2009km), we achieve a simulation throughput of 0.043 (0.23) simulated years per day and an energy consumption of 596\u2009MWh per simulated year. We discuss the implications of these simulations as a baseline for what is achievable when systematically adapting our codes to make use of emerging computer architectures.","bio":"","contributors":[{"type":"Author","first_name":"Oliver","last_name":"Fuhrer","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Tarun","last_name":"Chadha","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Torsten","last_name":"Hoefler","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Grzegorz","last_name":"Kwasniewski","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Daniel","last_name":"Luethi","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"5","is_presenter":false},{"type":"Author","first_name":"David","last_name":"Leutwyler","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Xavier","last_name":"Lapillonne","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Christoph","last_name":"Sch\u00e4r","affiliation":"ETH 
Zurich","country":"Switzerland","bio":"","order":"8","is_presenter":false},{"type":"Author","first_name":"Carlos E.","last_name":"Osuna","affiliation":"MeteoSwiss","country":"Switzerland","bio":"","order":"9","is_presenter":false},{"type":"Author","first_name":"Thomas","last_name":"Schulthess","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"10","is_presenter":false},{"type":"Author","first_name":"Hannes","last_name":"Vogt","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"11","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Hannes","last_name":"Vogt","affiliation":"ETH Zurich \/ CSCS","country":"Switzerland","bio":"","order":"11","is_presenter":true}]},{"id":"msa202","type":"child","title":"ESCAPE: Energy-Efficient Scalable Algorithms for Weather Prediction on Exascale Supercomputers","begin_time":"12:45","end_time":"13:15","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In the simulation of complex multi-scale flow problems, such as those arising in weather and climate modelling or in engineering, one of the biggest challenges is to satisfy operational requirements in terms of time-to-solution and available energy without compromising the accuracy and stability of the solution. These two competing factors require extreme computational capabilities in conjunction with state-of-the-art algorithms that can optimally suit the targeted underlying hardware while improving the convergence to the desired solution. The European Centre for Medium Range Weather Forecasts (ECMWF) is leading the H2020 FET-HPC project ESCAPE (Energy-efficient SCalable Algorithms for weather Prediction on Exascale supercomputers). 
The ESCAPE project includes the development of new algorithms that are specifically designed for better energy efficiency, testing and optimisation of different numerical techniques and improved portability through domain specific languages. The project incorporates through ECMWF\u0027s project partners the expertise of leading European regional forecasting consortia, university research, experienced high-performance centres and hardware vendors. This talk gives an overview of the ESCAPE project and summarises some of the key results obtained so far.","filename":"msa202s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Andreas","last_name":"Mueller","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Willem","last_name":"Deconinck","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Nils P.","last_name":"Wedi","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Peter","last_name":"Bauer","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Andreas","last_name":"Mueller","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"msa197","type":"child","title":"At the Edge of Resolution: Earth System Modelling at ECMWF","begin_time":"11:15","end_time":"11:45","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Global km-scale simulations of weather and climate are at the frontier to become technically feasible as a research tool for exploring the added value of resolving rather than parametrising the vertical redistribution of momentum and heat at globally uniform resolutions of O(1km). 
Handling and processing the global data is still pushing the (design) limits of current infrastructures, but it is expected to inspire the future design process, and pave the road for the next generation of weather and climate services. For example, ECMWF plans to run an ensemble of forecasts at 5km resolution in 2025. This talk will discuss and compare simulations at 18km, 9km, 5km, 2.5km and 1.25km resolution, by analysing the statistical differences of horizontal and vertical motions, and expected ensemble statistics for extreme events at 5km, and by limited investigations into the relative effects of different choices for the numerics and (remaining) physical parametrizations. These illustrate that resolution is important but not the only aspect delivering continuous improvements in global numerical weather prediction.","filename":"msa197s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Nils P.","last_name":"Wedi","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Nils P.","last_name":"Wedi","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Nils P.","last_name":"Wedi","affiliation":"ECMWF","country":"United Kingdom","bio":"","order":"1","is_presenter":true}] } Presentation
12:45 - 13:15
ESCAPE: Energy-Efficient Scalable Algorithms for Weather Prediction on Exascale Supercomputers
Andreas Mueller (ECMWF, United Kingdom)
In the simulation of complex multi-scale flow problems, such as those arising in weather and climate modelling or in engineering, one of the biggest challenges is to satisfy operational requirements in terms of time-to-solution and available energy without compromising the accuracy and stability of the solution. These two competing factors require extreme computational capabilities in conjunction with state-of-the-art algorithms that can optimally suit the targeted underlying hardware while improving the convergence to the desired solution. The European Centre for Medium Range Weather Forecasts (ECMWF) is leading the H2020 FET-HPC project ESCAPE (Energy-efficient SCalable Algorithms for weather Prediction on Exascale supercomputers). The ESCAPE project includes the development of new algorithms that are specifically designed for better energy efficiency, testing and optimisation of different numerical techniques and improved portability through domain specific languages. The project incorporates through ECMWF's project partners the expertise of leading European regional forecasting consortia, university research, experienced high-performance centres and hardware vendors. This talk gives an overview of the ESCAPE project and summarises some of the key results obtained so far.
Co-Author(s):
Willem Deconinck (ECMWF, United Kingdom), Nils P. Wedi (ECMWF, United Kingdom), Peter Bauer (ECMWF, United Kingdom)
Organizer(s):
Daniel Jacobson (Oak Ridge National Laboratory, United States of America), Ben Brown (Lawrence Berkeley National Laboratory, United States of America), Georgios Gkoutos (University of Birmingham, United Kingdom)
Track(s):
Life Sciences, Engineering, Emerging Application Domains, Computer Science and Applied Mathematics
The cost of generating biological data is dropping exponentially, a decrease that has far outstripped predictions based on Moore’s Law. This has ushered in a new era of systems biology in which there are unprecedented opportunities to gain insights into complex biological systems. Integrated biological models need to capture the higher order complexity of the interactions among cellular components. Solving such complex combinatorial problems will give us unprecedented levels of understanding of biological systems. However, this leads to a combinatorial explosion in the search space of biological data. These exponentially increasing volumes of data, combined with the desire to model more and more sophisticated sets of relationships within a cell and across an organism (or in some cases even ecosystems), have led to an unmet need for computational resources and sophisticated algorithms that can make use of them. Thus the bottleneck in biological science is no longer data generation but is in fact computational analysis. A full model of all of the higher order interactions of cellular and organismal components is one of the ultimate grand challenges of systems biology. Machine and deep learning algorithms provide some of the methodologies with which to achieve this goal.
13:15 - 14:15
Lunch
Foyer 2nd Floor
14:15 - 16:15
Minisymposia Session VI
Organizer(s):
Willem Deconinck (ECMWF, United Kingdom), Katherine Evans (Oak Ridge National Laboratory, United States of America)
Track(s):
Climate and Weather
The algorithms underlying numerical weather prediction (NWP) and climate models that have been developed in the past few decades face an increasing challenge to adapt to paradigm shifts imposed by new hardware developments. The emerging diverse and complex hardware solutions have a large impact on the programming models traditionally used in NWP and climate modelling software, triggering a rethink of design choices for future software frameworks and how Earth system model (ESM) components interact. Furthermore, there is a drive to increase the model complexity to include ever more processes of the whole Earth system. These complexities will inevitably break NWP modelling infrastructures that did not take these aspects into consideration 30+ years ago. As upcoming hardware solutions push the boundaries of parallel execution ever further, each ESM component may be reaching a limit in parallel scaling efficiency.
Already, coupling infrastructure exists that enables concurrent execution of model components. In order to make use of increasing parallelism, it may be required to take a step backwards, and redesign the ESM components to become more modular and enable more concurrent execution. This minisymposium will provide an update on progress both in infrastructure developments and developments in concurrent model component execution.
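As a rough illustration of what concurrent execution of model components with a periodic coupling exchange can look like, here is a minimal Python sketch. It is a toy under stated assumptions and not the design of any coupler or model named in this programme: the functions `step_atmosphere`, `step_ocean` and `run_coupled`, the two scalar fields and the relaxation coefficients are all hypothetical. Python threads only illustrate the control flow here; they do not give real parallel speed-up for this kind of arithmetic.

```python
from concurrent.futures import ThreadPoolExecutor


def step_atmosphere(temp_atm, sst):
    """Toy atmosphere (hypothetical): relax air temperature toward the sea surface temperature."""
    return temp_atm + 0.5 * (sst - temp_atm)


def step_ocean(sst, temp_atm):
    """Toy ocean (hypothetical): relax sea surface temperature slowly toward the air temperature."""
    return sst + 0.1 * (temp_atm - sst)


def run_coupled(temp_atm, sst, n_intervals):
    """Advance both components concurrently, exchanging fields once per coupling interval."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        for _ in range(n_intervals):
            # Both components start from the fields exchanged at the previous
            # coupling point, so neither has to wait for the other to finish.
            fut_atm = pool.submit(step_atmosphere, temp_atm, sst)
            fut_ocn = pool.submit(step_ocean, sst, temp_atm)
            # Coupling exchange: collect both results before the next interval.
            temp_atm, sst = fut_atm.result(), fut_ocn.result()
    return temp_atm, sst


if __name__ == "__main__":
    print(run_coupled(10.0, 20.0, 5))
```

The design choice shown is that each component reads only the fields from the last exchange, which is what allows the two steps inside one interval to overlap. A sequential coupling, where the ocean waits for the updated atmosphere, would remove that overlap.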
14:45 - 15:15
Comodels: A New Approach for Coupling Models for the [Tera,Exa]Scale
George Mozdzynski (ECMWF, United Kingdom)
ECMWF’s IFS spectral model has been using a hybrid MPI/OpenMP parallelization approach since about 2002, when just 2 OpenMP threads per MPI task (OMP_NUM_THREADS=2) were used on an IBM Power4 system. The number of threads per task has gradually increased over the years to today, where 12 or 18 achieves the best performance on a CRAY XC-40 (Broadwell), depending on the specific IFS model resolution. Further, it is realized that simply increasing the threads per task does not deliver improved performance. Likewise, the MPI communications cost of the IFS spectral model imposes severe demands on the switch fabric, and scaling high resolution model cases beyond O(100k) cores yields minimal performance gains. Given that future supercomputer architectures are expected to support O(100) computational threads per MPI task (or socket), a codesign effort involving ECMWF, EPCC, and CRAY was started in 2015 to explore how such an extreme thread count could be achieved in practice. This effort followed on from this team’s successful collaboration in the EU funded CRESTA project (2012-2014). In this talk we will present an evolutionary OpenMP parallelization approach for IFS, which we call comodels and which is an ongoing development at ECMWF, together with some new results of this work and future plans.
15:15 - 15:45
Modeling Systems at the End of Dennard Scaling
, Venkatramani Balaji (Princeton University, United States of America)
Conventional computational hardware has reached some physical limits: the phenomenon known as 'Dennard scaling' gave rise to Moore's Law, and many cycles of exponential growth in computing capacity. The consequence is that we now anticipate a computing future of increased concurrency and slower arithmetic. Earth system models, which are weak-scaling and memory-bandwidth-bound, face a particular challenge given their complexity in physical-chemical-biological space, to which mapping single algorithms or approaches is not possible. A particular aspect of such 'multi-scale multi-physics' models that is under-appreciated is that they are built using a combination of local process-level and global system-level observational constraints, for which the calibration process itself remains a substantial computational challenge. In this talk, we examine approaches to Earth system modeling in the post-Dennard era. The possibilities include following the industry trend toward machine learning and building models that learn; using stochastic methods and emulators for fast exploration of uncertainty; and using fewer bits of precision, among others. The talk will present ideas and challenges as we prepare for a post-Dennard future.
15:45 - 16:15
Making the Expensive Affordable: Running a Chemistry Model in the UKESM Climate Model
, Richard Hill (Met Office, United Kingdom)
The Met Office/NERC UKESM coupled climate model employs a chemistry model (UKCA) embedded within the Unified Model (UM) atmosphere. At high resolutions this is prohibitively expensive due to the computational cost of the chemistry. Consequently, we have been developing a mechanism to run the chemistry model at a lower resolution than the "main" atmospheric model. Since the chemistry code is embedded within the UM, it cannot be run as a separate stand-alone entity or at a different resolution from the atmosphere code. We believe it would be possible to achieve the necessary performance by creating a "hybrid" coupled model: a configuration featuring two concurrently run copies of the same UM code at different resolutions. One component would run at high resolution without chemistry; the other would run at lower resolution with full chemistry. This approach requires the coupling exchange of 3D fields between the two components. We are putting together a coupled system featuring a full resolution UM atmosphere, a reduced resolution UM atmosphere-UKCA chemistry, a NEMO ocean and a CICE sea ice model coupled using OASIS3-MCT. I aim to give an overview of the plans, progress made and issues which may yet prove to be major stumbling blocks.
Organizer(s):
Hemanth Kolla (Sandia National Laboratories, United States of America)
, Jacqueline Chen (Sandia National Laboratories, United States of America)
Track(s):
Engineering
In the push towards exascale computing, the current paradigm of bulk-synchronous distributed computing is giving way to a more asynchronous paradigm, in which it is increasingly important to exploit the inherent asynchrony of an application for performance and scalability. Whether the asynchrony is enabled algorithmically, expressed explicitly in the user program, or discovered by a programming model and/or runtime, there are critical research questions that need to be addressed. In this paradigm, asynchronous many task (AMT) programming models have made great progress in demonstrating the concept and paving the way forward. While asynchronous task-based programming models for shared-memory systems have been around for a long time, serious challenges remain in extending them to a distributed-memory setting. This minisymposium aims to provide a platform to convey the progress and assess the challenges in distributed asynchronous computing. The talks will cover a range of relevant topics: resilience and fault tolerance for distributed AMT programming models, asynchrony-tolerant numerical schemes and algorithms for stencil-based applications, task-based parallel simulations of multi-phase flows, and task-based runtimes utilizing directed acyclic graphs and a domain-specific language for multi-physics simulations of turbulent reacting flows.
14:15 - 14:45
Towards Exascale Simulations of Particle-Laden Turbulence in a Radiation Environment: The PSAAP Program at Stanford
, Hilario Torres (Stanford University, United States of America)
Authors: Gianluca Iaccarino, Hilario Torres (presenter), both Stanford University, United States of America
Abstract: In the framework of the Predictive Science Academic Alliance Program (PSAAP), the US Department of Energy is funding a Multidisciplinary Simulation Center at Stanford University to explore exascale computing strategies for multiphysics simulations. The Stanford Center's research portfolio blends efforts in computer science, uncertainty quantification, and computational physics to tackle a challenging physical problem: the transfer of radiative energy to a turbulent mixture of air and solid particles. The context is provided by a relatively untested and poorly understood method of harvesting solar energy. The talk will describe the Center's effort to develop and validate a computational environment to simulate this challenging multi-physics problem, emphasizing the strategies employed to carry out high-fidelity simulations and how uncertainty quantification techniques can be used to assess the overall performance of the system. A novel task-based programming system (Legion) is being deployed to tackle heterogeneous compute systems and retain portability and performance. Details of the implementation challenges and results obtained on various architectures will be discussed. The integration of large-scale simulations and multi-level sampling for uncertainty analysis within the Legion framework will also be summarized.
15:15 - 15:45
Fault Tolerance in Asynchronous Many-Task (AMT) Programming Models and Runtimes
Hemanth Kolla (Sandia National Laboratories, United States of America)
Authors: Hemanth Kolla (presenter), Keita Teranishi, Jackson Mayo, Rob Armstrong, Nicole Slattengren, all Sandia National Laboratories, United States of America
Abstract: Among the many challenges faced by asynchronous many-task (AMT) programming models and runtimes, fault tolerance is particularly daunting. The ability to allow asynchronous progress might appear to be favourable from a fault-tolerance perspective, since the overhead of recovering from a failed task could potentially be hidden. However, correctness and coherence requirements can become overwhelming for a poorly designed AMT runtime, more than offsetting any potential advantage of exploiting asynchrony. In this talk we present results from a systematic study of fault tolerance for AMT systems. We establish that graph-based analytical models are not tractable for the task graphs of even the simplest applications. Accordingly, we present the design of, and results from, a task-graph simulator in which various aspects of an AMT system and its fault tolerance are carefully parametrized. Simulator results for a stencil application task graph are presented for various scenarios involving overdecomposition, failure rate, task scheduling and fault-tolerance strategy. The focus is particularly on two fault-tolerance strategies: task replication and task replay. Mock-up implementations of the stencil application, along with task replay and replication, in the shared-memory AMT system HabaneroC++ are studied and compared with the simulator results.
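As a rough illustration of the two strategies named in this talk's abstract, task replay and task replication, the sketch below contrasts them on a single task. It is not the authors' simulator or their HabaneroC++ implementation: the function names, the `failure_rate` parameter, and the failure model (each execution fails independently with a fixed probability) are all assumptions made for illustration.

```python
import random


def run_with_replay(task, rng, failure_rate, max_attempts=10):
    """Task replay (hypothetical sketch): re-execute a task only after it fails.

    Returns (result, number_of_executions_used).
    """
    for attempt in range(1, max_attempts + 1):
        # Assumed failure model: each execution fails with probability failure_rate.
        if rng.random() >= failure_rate:
            return task(), attempt
    raise RuntimeError("task failed on every attempt")


def run_with_replication(task, rng, failure_rate, replicas=2):
    """Task replication (hypothetical sketch): launch copies up front, keep a survivor.

    Returns (result, number_of_executions_used), which is always `replicas`.
    """
    survivors = [task() for _ in range(replicas) if rng.random() >= failure_rate]
    if not survivors:
        raise RuntimeError("all replicas failed")
    return survivors[0], replicas


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs behave the same
    stencil_point = lambda: (1.0 + 2.0 + 3.0) / 3.0  # stand-in for one stencil update
    print(run_with_replay(stencil_point, rng, failure_rate=0.2))
    print(run_with_replication(stencil_point, rng, failure_rate=0.2))
```

Under this toy model, replay costs one execution when nothing fails and pays extra only after a failure, whereas replication always pays for every replica but needs no re-execution when at least one copy survives.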
15:45 - 16:15
Tools and Techniques to Enable Multiphysics Applications on Heterogeneous Architectures
James C. Sutherland (University of Utah, United States of America)
Abstract:
Deploying multiphysics applications on heterogeneous architectures is particularly challenging because of the complexity and volume of code that must be maintained as well as the complex logic associated with the interplay between numerical algorithms and hardware. In this talk, we will explore some of the abstractions that we have found useful in developing simulation tools for turbulent reacting flows. This includes task-based runtime systems that utilize directed acyclic graphs together with a domain-specific language that provides simple syntax while preserving performance.
Presentation:
msa165s1.pdf
Organizer(s):
Irina Paci (University of Victoria, Canada)
Jeffrey Paci (University of Victoria, Canada)
Track(s):
Engineering, Chemistry and Materials, Physics
The miniaturisation of devices towards the molecular scale is a dynamic field of research that has evolved dramatically in recent years. Its applications span a broad field, from the fabrication of smaller, more powerful computer chips, to charge storage, robotics, solar cells or biosensors. Complex functional materials are being envisioned and developed with these applications in mind, in an unprecedented effort towards rational design, through a combination of theory, computation and experiment.
In this context, the challenge for theory is to formulate laws and uncover patterns when the interactions that drive the relevant processes are as varied as the systems being investigated. Potential energy surfaces in nanoscale materials are complex, with multiple competing minima and steep barriers. The properties of the complex materials are also a challenge for computation: quantum effects in nanoscale interactions are reflected in the behaviour of the material as a whole, a statistical entity. This minisymposium will bring together researchers who have made significant contributions to the two essential challenges in materials simulations: sampling complex potential energy surfaces in structural prediction and bridging the relevant length scales in materials fabrication and properties. Current challenges and perspectives in method development and new applications will be discussed.
15:45 - 16:15
(i) Massively-Parallel Simulation of Self-Assembled Diblock-Copolymer Nano-Materials; (ii) Ab-Initio Quantum Monte Carlo Simulations for Single Vacancy Graphene and Isotropically-Strained Graphene
Ludwig Schneider (University of Göttingen, Germany)
Tomonori Shirakawa (SISSA, Italy)
Abstract:
(i) Polymeric materials exhibit a rich equilibrium phase diagram, qualifying them for applications in electronic devices, filters, battery materials. Self-assembly of these materials rarely results in the equilibrium structures. Instead, configurations are trapped in long-lived meta-stable states. The properties of these structures, such as percolation and mechanical stability, can deviate from those of corresponding equilibrium phases. Investigating meta-stable states is challenging, due to finite size effects. SOMA, our massively-parallel implementation of the Single-Chain-in-Mean-Field algorithm, enables study of systems with billions of particles, unravelling the percolation characteristics of self-assembled diblock-copolymers as a function of volume fraction.
(ii) Employing an ab-initio quantum Monte Carlo scheme (QMC), we will discuss the electronic structures of graphene in conditions where strong electron correlation plays an important role. Experimental studies of graphene have uncovered emerging spin-half free moments around vacancies. In QMC simulations of single vacancy graphene, we found a localized spin, composed of dangling sigma-orbitals around the vacancy. A model for the spin structure will be discussed. For isotropically strained graphene, we found evidence of two insulating phases before mechanical failure: a dimer phase where the structural dimerization implies the opening of a charge gap, and an antiferromagnetic phase induced by strong on-site Coulomb repulsion.
Presentation:
msa296s1.pdf
Organizer(s):
Roland Walter (University of Geneva, Switzerland)
Claudio Gheller (ETH Zurich / CSCS, Switzerland)
Track(s):
Physics
The volume of data generated by astrophysics observatories, and the complexity of knowledge management, have increased constantly and are now on the verge of reaching even higher levels, with observatories planned to generate tens to hundreds of PB per year. This increase in volume, mostly driven by ground-based telescopes with high-resolution detectors and/or extreme time sampling, requires new computing infrastructures and new ways for scientists to interact with data and computing.
Observatory interfaces will need to move from data to analysis, interpretation and knowledge, and from analysis to synthesis. Data mining, driven by the needs of science and education, will require an integration of archive, pipeline and interpretation, which are still largely conceived as disconnected services. This evolution will benefit from increased computing power and artificial intelligence, which will help interpret data flows exceeding human insight and will likely transform the astronomy business model.
The goal of this minisymposium is to clarify this evolution by comparing the needs of current and planned observatories and to study how this implementation could take place, both at the technical level and in the way scientists collaborate. The specific opportunities for Switzerland will also be discussed.
Organizer(s):
Kaushik De (The University of Texas at Arlington, United States of America)
Alexei Klimentov (Brookhaven National Laboratory, United States of America)
Torre Wenaus (Brookhaven National Laboratory, United States of America)
Track(s):
Computer Science and Applied Mathematics, Physics
The ATLAS experiment at CERN’s Large Hadron Collider depends on the Worldwide LHC Computing Grid, the WLCG, for its remote computing infrastructure. PanDA, the workload management system used by ATLAS, annually processes over an exabyte of data using an average of 250,000 distributed batch slots, to enable hundreds of new scientific results. An effort was launched to extend PanDA, called BigPanDA, to access HPC resources, funded by the US Department of Energy (DOE-ASCR). Through this successful effort, ATLAS today uses about 20 million hours monthly on the Titan supercomputer at Oak Ridge National Laboratory. Many other supercomputers have also been integrated into ATLAS computing. This minisymposium will explore the software and operational lessons learned in integrating HPCs with traditional grid computing, and describe recent efforts to use BigPanDA for many other scientific domains. Three talks will summarize the state of the art and the future wishlist for HPC usage for current and future experiments, while a concluding expert panel discussion will focus on the future.
14:15 - 14:45
Enabling Biology, Chemistry and Other Sciences on Titan through BigPanDA
Danila Oleynik (The University of Texas at Arlington, United States of America)
+ Abstract { "session": {"id":"sess192","title":"MS46 - HPC beyond HEP: Opening Doors for New Data Intensive Sciences at Leadership Class HPCs Using BigPanDA","date":"Wednesday, July 4th 2018","begin_time":"14:15","end_time":"16:15","room":"Sydney Room","contributors":[{"type":"Session Chair","first_name":"Torre","last_name":"Wenaus","affiliation":"Brookhaven National Laboratory, +1 631 344 4755","country":"United States of America"}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Computer Science and Applied Mathematics","Physics"],"slots":[{"id":"symp151","type":"minisymposia","title":"MS46 - HPC beyond HEP: Opening Doors for New Data Intensive Sciences at Leadership Class HPCs Using BigPanDA","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"The ATLAS experiment at CERN\u2019s Large Hadron Collider depends on the Worldwide LHC Computing Grid, the WLCG, for its remote computing infrastructure. PanDA, the workload management system used by ATLAS, annually processes over an exabyte of data using an average of 250,000 distributed batch slots, to enable hundreds of new scientific results. An effort was launched to extend PanDA, called BigPanDA, to access HPC resources, funded by the US Department of Energy (DOE-ASCR). Through this successful effort, ATLAS today uses about 20 million hours monthly on the Titan supercomputer at Oak Ridge National Laboratory. Many other supercomputers have also been integrated into ATLAS computing. This minisymposium will explore the software and operational lessons learned in integrating HPCs with traditional grid computing, and describe recent efforts to use BigPanDA for many other scientific domains. 
Three talks will summarize the state of the art and the future wishlist for HPC usage for current and future experiments, while a concluding expert panel discussion will focus on the future.","bio":"","contributors":[{"type":"Organizer","first_name":"Kaushik","last_name":"De","affiliation":"The University of Texas at Arlington","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Organizer","first_name":"Alexei","last_name":"Klimentov","affiliation":"Brookhaven National Laboratory","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Organizer","first_name":"Torre","last_name":"Wenaus","affiliation":"Brookhaven National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Torre","last_name":"Wenaus","affiliation":"Brookhaven National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":true}]},{"id":"msa299","type":"child","title":"Enabling Biology, Chemistry and Other Sciences on Titan through BigPanDA","begin_time":"14:15","end_time":"14:45","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The Oak Ridge Leadership Computing Facility (OLCF) is one of the most powerful HPC centers available to researchers from different scientific fields to solve some of the world\u0027s most challenging scientific problems. Small scientific groups often need to develop expertise to optimize their applications for running on Titan, and to fit the usage policies of such big machines. We have installed the BigPanDA workload management system at OLCF to simplify the submission of user tasks to Titan. In this talk we will present results of an R\u0026D project to execute workloads from different scientific groups at OLCF. 
We will describe all steps: starting from deployment of PanDA server as service on demand at OLCF in OpenShift containers, to the adaptation of PanDA client tools for new users. Examples from some of the different scientific fields using this service will include biology\/genomics, molecular dynamics, LQCD, solid-state and neutrino physics, and different data science experiments: nEDM, LSST, and IceCube.","filename":"msa299s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Danila","last_name":"Oleynik","affiliation":"The University of Texas at Arlington","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ruslan","last_name":"Mashinistov","affiliation":"The University of Texas at Arlington","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Pavlo","last_name":"Svirin","affiliation":"Brookhaven National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sergey","last_name":"Panitkin","affiliation":"Brookhaven National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danila","last_name":"Oleynik","affiliation":"The University of Texas at Arlington","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"msa301","type":"child","title":"BigPanDA Experience on Titan for the ATLAS Experiment at the LHC","begin_time":"14:45","end_time":"15:15","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The PanDA software is used for workload management on distributed grid resources by the ATLAS experiment at the LHC. An effort was launched to extend PanDA, called BigPanDA, to access HPC resources, funded by the US Department of Energy (DOE-ASCR). 
Through this successful effort, ATLAS today uses over 25 million hours monthly on the Titan supercomputer at Oak Ridge National Laboratory. Many challenges were met and overcome in using HPCs for ATLAS simulations. ATLAS uses two different operational modes at Titan. The traditional mode uses allocations - which require software innovations to fit the low latency requirements of experimental science. New techniques were implemented to shape large jobs using allocations on a leadership class machine. In the second mode, high priority work is constantly sent to Titan to backfill high priority leadership class jobs. This has resulted in impressive gains in overall utilization of Titan, while benefiting the physics objectives of ATLAS. For both modes, BigPanDA has integrated traditional grid computing with HPC architecture. This talk will summarize the innovations to successfully use Titan for LHC physics goals.","bio":"","contributors":[{"type":"Author","first_name":"Alexei","last_name":"Klimentov","affiliation":"Brookhaven National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Kaushik","last_name":"De","affiliation":"The University of Texas at Arlington","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Wells","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sergey","last_name":"Panitkin","affiliation":"Brookhaven National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Danila","last_name":"Oleynik","affiliation":"The University of Texas at Arlington","country":"United States of 
America","bio":"","order":"5","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Alexei","last_name":"Klimentov","affiliation":"Brookhaven National Laboratory","country":"United States of America","bio":"","order":"1","is_presenter":true}]},{"id":"msa306","type":"child","title":"BigPanDA: Blue Brain and Beyond","begin_time":"15:15","end_time":"15:45","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"BigPanDA is a Workload Management System designed to support the execution of experimental workloads and workflows on distributed resources. In the first part of this talk, we discuss\u00a0 a \u201cproof of concept\u201d \u00a0project that was stated in 2017 and conducted jointly by the BigPanDA team and the Blue Brain Project (BBP) of the Ecole Polytechnique Federal de Lausanne (EPFL). This proof of concept project showed the efficient application of the BigPanDA system to support the complex scientific workflow of the BBP using a mix of desktop, cluster and supercomputers to reconstruct and simulate accurate models of brain tissue.\u00a0 In the second part of this talk, we will discuss how the next generation task execution layer of\u00a0 BigPanda, known as the NGE (Next Generation Executor) is being enhanced for upcoming pre- and full exascale systems. 
We will discuss the design and integration of NGE with the BigPanda system (including Harvester), demonstrate its native support for MPI tasks (and thereby of immediate relevance to the BBP) and characterize its performance.","bio":"","contributors":[{"type":"Author","first_name":"Shantenu","last_name":"Jha","affiliation":"Rutgers University","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Fabien","last_name":"Delalondre","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Shantenu","last_name":"Jha","affiliation":"Rutgers University","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Fabien","last_name":"Delalondre","affiliation":"EPFL","country":"Switzerland","bio":"","order":"2","is_presenter":true}]},{"id":"msa305","type":"child","title":"Panel: BigPanDA Experience at Oak Ridge - Learning from the LHC, Going Far Beyond","begin_time":"15:45","end_time":"16:15","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In recent years, the ATLAS experiment at the Large Hadron Collider has been very successful in using HPCs as an integral part of their computing operations. Projects pioneered on Titan have led the way for a new pattern of computing usage for the LHC. The success of the LHC program has also led to new initiatives to try the same tools and practices for other data sciences. 
A panel of experts in HPC and Distributed Computing will lead a discussion about what the success at Titan means for the future of Exascale HPCs and for data science communities.","bio":"","contributors":[{"type":"Author","first_name":"Kaushik","last_name":"De","affiliation":"The University of Texas at Arlington","country":"United States of America","bio":"","order":"1","is_presenter":false},{"type":"Author","first_name":"Frank","last_name":"Wuerthwein","affiliation":"UC San Diego","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Jack","last_name":"Wells","affiliation":"Oak Ridge National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Alexei","last_name":"Klimentov","affiliation":"Brookhaven National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":false},{"type":"Author","first_name":"Torre","last_name":"Wenaus","affiliation":"Brookhaven National Laboratory","country":"United States of America","bio":"","order":"5","is_presenter":true},{"type":"Author","first_name":"Vladimir","last_name":"Korenkov","affiliation":"JINR","country":"Russia","bio":"","order":"6","is_presenter":false},{"type":"Author","first_name":"Simone","last_name":"Campana","affiliation":"CERN","country":"Switzerland","bio":"","order":"7","is_presenter":false},{"type":"Author","first_name":"Shantenu","last_name":"Jha","affiliation":"Rutgers University","country":"United States of America","bio":"","order":"8","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Torre","last_name":"Wenaus","affiliation":"Brookhaven National Laboratory","country":"United States of America","bio":"","order":"5","is_presenter":true}]}]}, "slot": {"id":"msa299","type":"child","title":"Enabling Biology, Chemistry and Other Sciences on Titan through 
BigPanDA","begin_time":"14:15","end_time":"14:45","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"The Oak Ridge Leadership Computing Facility (OLCF) is one of the most powerful HPC centers available to researchers from different scientific fields to solve some of the world\u0027s most challenging scientific problems. Small scientific groups often need to develop expertise to optimize their applications for running on Titan, and to fit the usage policies of such big machines. We have installed the BigPanDA workload management system at OLCF to simplify the submission of user tasks to Titan. In this talk we will present results of an R\u0026D project to execute workloads from different scientific groups at OLCF. We will describe all steps: starting from deployment of PanDA server as service on demand at OLCF in OpenShift containers, to the adaptation of PanDA client tools for new users. Examples from some of the different scientific fields using this service will include biology\/genomics, molecular dynamics, LQCD, solid-state and neutrino physics, and different data science experiments: nEDM, LSST, and IceCube.","filename":"msa299s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Danila","last_name":"Oleynik","affiliation":"The University of Texas at Arlington","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ruslan","last_name":"Mashinistov","affiliation":"The University of Texas at Arlington","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Pavlo","last_name":"Svirin","affiliation":"Brookhaven National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sergey","last_name":"Panitkin","affiliation":"Brookhaven National Laboratory","country":"United States of 
America","bio":"","order":"4","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Danila","last_name":"Oleynik","affiliation":"The University of Texas at Arlington","country":"United States of America","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Danila","last_name":"Oleynik","affiliation":"The University of Texas at Arlington","country":"United States of America","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Ruslan","last_name":"Mashinistov","affiliation":"The University of Texas at Arlington","country":"United States of America","bio":"","order":"2","is_presenter":false},{"type":"Author","first_name":"Pavlo","last_name":"Svirin","affiliation":"Brookhaven National Laboratory","country":"United States of America","bio":"","order":"3","is_presenter":false},{"type":"Author","first_name":"Sergey","last_name":"Panitkin","affiliation":"Brookhaven National Laboratory","country":"United States of America","bio":"","order":"4","is_presenter":false}] } Presentation
Organizer(s):
Jonas Šukys (Swiss Federal Institute of Aquatic Science and Technology, Switzerland)
Panagiotis Hadjidoukas (ETH Zurich, Switzerland)
Antonietta Mira (Università della Svizzera italiana, Switzerland)
Track(s):
Life Sciences, Engineering, Emerging Application Domains, Computer Science and Applied Mathematics
The HPUQ minisymposium focuses on uncertainty quantification (UQ) of mechanistic models for the natural sciences (e.g. Engineering, Life and Aquatic Sciences) using high performance computing (HPC). Statistical inference (e.g. calibration) for complex mechanistic models, given an abundance of data arriving from heterogeneous sources, poses a methodological and computational challenge for scientists. In the first session, the minisymposium highlights cutting-edge frameworks for rigorous and robust UQ, such as ABCpy, Π4U, PyMLMC and SPUX, that address these issues, with a focus on optimal algorithmic performance and efficient utilization of HPC resources. In the second session, we shift the focus to applications of UQ methodologies in several important scientific domains, spanning from Biomedicine and Biomechanics to Aerospace Engineering and Fluid Dynamics.
Organizer(s):
Wesley P. Petersen (ETH Zurich, Switzerland)
Track(s):
Computer Science and Applied Mathematics
This minisymposium will explore some unconventional methods for numerical simulations of partial differential equations. Often, discretizations of PDEs produce regular error patterns, some of which can be both quantified and smoothed by adding small stochastic terms. In fact, stochastic processes can be intimately connected with PDEs. For example, the heat equation is the PDE that governs the distribution of Brownian motion. More general frameworks can be built around the Feynman-Kac formula (e.g. Mark Freidlin's book on functional integrals, Princeton Univ. Press, 1985). Such formulations are both flexible and intrinsically parallelizable (e.g. W. Petersen and P. Arbenz, Oxford Univ. Press, 2004). In high spatial dimensions, such techniques, when formulated as Monte-Carlo methods, are much more accurate than might be expected and make many difficult high-dimensional simulations possible at all. In, say, 3-D, certain interactions (e.g. foams, bubbles) can be modelled phenomenologically where details about these interactions are not well understood. In addition, we will discuss balancing methods or systems in external fields, and the connections between particle methods and the PDEs which describe the distributions.
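As a rough illustration of the connection between the heat equation and Brownian motion mentioned in the abstract above, the following minimal sketch (not taken from any of the talks) estimates a heat-equation solution by Monte-Carlo sampling. It assumes the equation in the form u_t = (1/2) u_xx with initial data f, for which u(x, t) = E[f(x + W_t)] with W_t normally distributed with mean 0 and variance t. The function name heat_mc, the choice f = cos, the evaluation point and the sample count are all illustrative assumptions.

```python
import math
import random


def heat_mc(f, x, t, n_samples=100_000, seed=0):
    """Monte-Carlo estimate of u(x, t) for u_t = (1/2) u_xx, u(x, 0) = f(x).

    Uses the representation u(x, t) = E[f(x + W_t)], W_t ~ N(0, t).
    Each sample is independent, so the loop is trivially parallelizable.
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    sigma = math.sqrt(t)               # standard deviation of W_t
    total = 0.0
    for _ in range(n_samples):
        total += f(x + rng.gauss(0.0, sigma))
    return total / n_samples


# Illustrative check with f = cos, whose exact solution is exp(-t/2) * cos(x).
x0, t0 = 0.3, 0.5
estimate = heat_mc(math.cos, x0, t0)
exact = math.exp(-t0 / 2.0) * math.cos(x0)
print(f"Monte-Carlo estimate: {estimate:.4f}, exact: {exact:.4f}")
```

The statistical error of such an estimate decreases like one over the square root of the number of samples, independently of the spatial dimension, which is one reason these formulations are attractive in high dimensions.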
14:45 - 15:15
Splitting Methods for ODEs, PDEs, and SDEs - with Examples
Wesley P. Petersen (ETH Zurich, Switzerland)
Abstract: Evolution equations are usually written with the time derivative of a process on the left hand side and multiple additive pieces on the right. Often, each of the additive pieces would permit relatively easy-to-compute approximations if taken alone. Taken altogether, only low order or less stable approximations are possible. Splitting methods use the approximations for the solutions of individual pieces with compositions of these to construct higher order methods with desired stability properties. Such splittings are very general: ordinary differential equations (Yoshida methods), partial differential equations (Godunov dimensional splittings), and Ito stochastic differential equations. This talk will show some formal compositions along with several numerical examples. These examples will be a Trotter-type anharmonic oscillator approximation, the solution of the Fisher/KPP equation on a terrestrial map, and the simulation of an oscillating stochastic differential equation. Two important issues will be emphasized: the connections between stochastic differential equations and diffusion processes, and the parallel computing aspects of these simulations.
Session: MS48 - Unconventional Methods for Partial Differential Equations
Wednesday, July 4th 2018, 14:15 - 16:15, Nairobi Room
Session Chair: Wes Petersen (ETH Zurich, Switzerland)
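The composition idea described in the splitting-methods abstract above can be sketched as follows. This is an illustrative example only, not the speaker's code: it applies the standard second-order symmetric (Strang) composition to an anharmonic oscillator whose right-hand side is split into a kinetic piece and a potential piece, each of which can be advanced exactly on its own. The potential V(q) = q^2/2 + q^4/4, the step sizes and all function names are assumptions chosen for the illustration.

```python
def force(q):
    # -dV/dq for the assumed potential V(q) = q^2/2 + q^4/4
    return -(q + q ** 3)


def energy(q, p):
    # Total energy H(q, p) = p^2/2 + V(q); conserved by the exact flow
    return 0.5 * p * p + 0.5 * q * q + 0.25 * q ** 4


def strang_step(q, p, h):
    """One symmetric composition: half potential, full kinetic, half potential."""
    p += 0.5 * h * force(q)   # exact flow of the potential piece for h/2
    q += h * p                # exact flow of the kinetic piece for h
    p += 0.5 * h * force(q)   # exact flow of the potential piece for h/2
    return q, p


def integrate(q, p, h, n_steps):
    for _ in range(n_steps):
        q, p = strang_step(q, p, h)
    return q, p


# Integrate to time T = 10 with two step sizes and record the energy error.
q0, p0 = 1.0, 0.0
e0 = energy(q0, p0)
q_c, p_c = integrate(q0, p0, 0.01, 1000)
q_f, p_f = integrate(q0, p0, 0.005, 2000)
drift_coarse = abs(energy(q_c, p_c) - e0)
drift_fine = abs(energy(q_f, p_f) - e0)
print(f"energy error, h=0.01:  {drift_coarse:.2e}")
print(f"energy error, h=0.005: {drift_fine:.2e}")
```

Higher-order schemes of the Yoshida type mentioned in the abstract are built by composing several such steps with suitably chosen step-size coefficients.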
15:45 - 16:15
Fight Uncertainty with Randomness: Stochastic Particle Methods for Microfluidics
Lucas Amoudruz (ETH Zurich, Switzerland)
+ Abstract { "session": {"id":"sess201","title":"MS48 - Unconventional Methods for Partial Differential Equations","date":"Wednesday, July 4th 2018","begin_time":"14:15","end_time":"16:15","room":"Nairobi Room","contributors":[{"type":"Session Chair","first_name":"Wes","last_name":"Petersen","affiliation":"ETH Zurich","country":"Switzerland"}],"view_type":"VIII","view_type_id":"evtt110","tracks":["Computer Science and Applied Mathematics"],"slots":[{"id":"symp127","type":"minisymposia","title":"MS48 - Unconventional Methods for Partial Differential Equations","has_begin_end_time":true,"has_just_one_minute":true,"is_parent":true,"abstract":"This minisymposium will explore some unconventional methods for numerical simulations of partial differential equations. Often, discretizations of PDEs produce regular error patterns, some of which can be both quantified and smoothed by adding small stochastic terms. In fact, stochastic processes can be intimately connected with PDEs. For example, the heat equation is just a PDE form for the distribution of Brownian random motion. More general frameworks can be built around the Feynman-Kac formula (e.g. Mark Freidlin\u0027s book on functional integrals, Princeton Univ. Press, 1985). Such formulations are both flexible and intrinsically parallelizable (e.g. W. Petersen and P. Arbenz, Oxford Univ. Press, 2004). In high spacial dimensions, such techniques, when formulated as Monte-Carlo methods, are much more accurate than might be expected and make many difficult high dimensional simulations even possible. In, say, 3-D, certain interactions (e.g. foams, bubbles) can be modelled phenomenologically where details about these interactions are not well understood. 
In addition, we will discuss balancing methods or systems in external fields, and the connections between particle methods and the PDEs which describe the distributions.","bio":"","contributors":[{"type":"Organizer","first_name":"Wesley P.","last_name":"Petersen","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Organizer","first_name":"Wesley P.","last_name":"Petersen","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa250","type":"child","title":"High-Order Well-Balanced Finite Volume Methods for Euler Equations with Gravity","begin_time":"14:15","end_time":"14:45","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In this talk I will present high-order, well-balanced finite volume methods (FVM) for the Euler equations with gravity. The Euler equations are a system of hyperbolic PDEs commonly used to describe inviscid compressible gases. They admit stationary steady-state solutions, known as hydrostatic equilibria. These equilibria arise from a force balance of gravity forces and pressure gradients. A variety of interesting natural phenomena occur approximately in hydrostatic equilibrium. Examples include numerical weather\/climate prediction on earth or exoplanets and convection in starts, among others. Textbook FVM generally do not preserve these equilibria exactly (i.e. to machine precision). 
Motivated by the given examples I will present high-order FVM which preserve the hydrostatic equilibria to machine precision without making any assumptions on the equation of state (EOS), which could be a tabulated EOS, or the gravitational potential, which could be the numerical solution of a Poisson equation and only given at certain point-values.","bio":"","contributors":[{"type":"Author","first_name":"Luc","last_name":"Grosheintz-Laval","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Luc","last_name":"Grosheintz-Laval","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa288","type":"child","title":"Splitting Methods for ODEs, PDEs, and SDEs - with Examples","begin_time":"14:45","end_time":"15:15","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"Evolution equations are usually written with the time derivative of a process on the left hand side and multiple additive pieces on the right. Often, each of the additive pieces would permit relatively easy to compute approximations if taken alone. Taken altogether, only low order or less stable approximations are possible. Splitting methods use the approximations for the solutions of individual pieces with compositions of these to construct higher order methods with desired stability properties. Such splittings are very general: ordinary differential equations (Yoshida methods), partial differential equations (Godunov dimensional splittings), and Ito stochastic differential equations. This talk will show some formal compositions along with several numerical examples. These examples will be a Trotter-type anharmonic oscillator approximation, the solution of the Fisher\/KPP equation on a terrestrial map, and the simulation of an oscillating stochastic differential equation. 
Two important issues will be emphasized: the connections between stochastic differential equations and diffusion processes, and the parallel computing aspects of these simulations.","filename":"msa288s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Wesley P.","last_name":"Petersen","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Wesley P.","last_name":"Petersen","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa258","type":"child","title":"Mutual Impact of Bubbles and Waves Studied with an Efficient Finite Volume Solver","begin_time":"15:15","end_time":"15:45","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"While the finite volume method is a well established technique for solving PDEs, most of the existing implementations suffer from low utilization of computing resources limiting its application for large-scale problems. Finite volume solver Cubism-MPCF implements a modern algorithm for two-phase compressible flows based on a Godunov-type scheme with WENO reconstruction. Designed for high performance and scalability, the solver reached performance of 11 PFLOP\/s on Sequoia supercomputer. Present work combines applications of Cubism-MPCF to various phenomena including cavitation, shock-induced collapse and acoustics of bubbly liquids. Study of cavitation considers the collapse of a cluster of gas bubbles due to increased pressure. The large number of bubbles, about 12000, made it possible for the first time to describe collective behaviour such as propagation of the collapse front together with the evolution of microjets formed near individual bubbles. 
Detailed simulations of the shock-induced collapse cover a wide range of physical parameters revealing cases for which the effects of viscosity and surface tension become significant but ignored in other studies. Another application demonstrates Anderson localization of acoustic waves in bubbly liquids. Moreover, interaction with a standing wave leads to rearrangement and deformation of bubbles. These effects are unavailable for linearized models commonly used in acoustics.","bio":"","contributors":[{"type":"Author","first_name":"Fabian","last_name":"Wermelinger","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true},{"type":"Author","first_name":"Petr","last_name":"Karnakov","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"2","is_presenter":false}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Fabian","last_name":"Wermelinger","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]},{"id":"msa257","type":"child","title":"Fight Uncertainty with Randomness: Stochastic Particle Methods for Microfluidics","begin_time":"15:45","end_time":"16:15","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In a porous medium, a fluid pushed inside a more viscous one propagates in a finger-like structure. This phenomenon is of great interest in the fields of oil recovery or pollutant spreading in ground water. These so-called \u0022fingering instabilities\u0022 depend on the viscosities, densities and surface tensions of the fluids. Such flows have been widely studied inside Hele-Shaw cells, which serve as an experimental platform for studying fundamental flow patterns in constricted geometries. It consists in two parallel plates separated by a small gap, in which the fluids are flowing under gravity or applied pressure gradient. 
Recent experiments exhibited surprising stability conditions in the limit of zero surface tension between the fluids. They report an extended stability region in terms of viscosity ratios, which is not predicted by linear stability analysis. Reproducing these results numerically is a crucial step towards understanding such complex flow patterns. We present numerical simulations of multicomponent fluids flowing inside Hele-Shaw cells. The simulations employ the Dissipative Particle Dynamics method, a stochastic particle method widely used in microfluidic applications.","filename":"msa257s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lucas","last_name":"Amoudruz","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Lucas","last_name":"Amoudruz","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}]}, "slot": {"id":"msa257","type":"child","title":"Fight Uncertainty with Randomness: Stochastic Particle Methods for Microfluidics","begin_time":"15:45","end_time":"16:15","has_begin_end_time":true,"has_just_one_minute":false,"is_parent":false,"abstract":"In a porous medium, a fluid pushed inside a more viscous one propagates in a finger-like structure. This phenomenon is of great interest in the fields of oil recovery or pollutant spreading in ground water. These so-called \u0022fingering instabilities\u0022 depend on the viscosities, densities and surface tensions of the fluids. Such flows have been widely studied inside Hele-Shaw cells, which serve as an experimental platform for studying fundamental flow patterns in constricted geometries. It consists in two parallel plates separated by a small gap, in which the fluids are flowing under gravity or applied pressure gradient. Recent experiments exhibited surprising stability conditions in the limit of zero surface tension between the fluids. 
They report an extended stability region in terms of viscosity ratios, which is not predicted by linear stability analysis. Reproducing these results numerically is a crucial step towards understanding such complex flow patterns. We present numerical simulations of multicomponent fluids flowing inside Hele-Shaw cells. The simulations employ the Dissipative Particle Dynamics method, a stochastic particle method widely used in microfluidic applications.","filename":"msa257s1.pdf","bio":"","contributors":[{"type":"Author","first_name":"Lucas","last_name":"Amoudruz","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}],"has_presenters":true,"presenters":[{"type":"Author","first_name":"Lucas","last_name":"Amoudruz","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}]}, "slotContributors": [{"type":"Author","first_name":"Lucas","last_name":"Amoudruz","affiliation":"ETH Zurich","country":"Switzerland","bio":"","order":"1","is_presenter":true}] } Presentation
16:15 - 16:40
Coffee Break
Foyer 2nd Floor
Chair: Sinéad Ryan (Trinity College Dublin, Ireland)
The main goal of the International Thermonuclear Experimental Reactor (ITER) project is to demonstrate the feasibility of future clean energy sources based on nuclear fusion in magnetically confined plasma. In the era of ITER construction, fusion plasma theory and modelling not only provide a deep understanding of specific phenomena; modelling-based design is also critical for ensuring active plasma control.
The most computationally demanding aspect of the project is first principles fusion plasma modelling, which relies on fluid models – such as magnetohydrodynamics (MHD) – or increasingly often on kinetic models. The challenge stems from the complexity of the 3D magnetic topology, the large difference in time scales from Alfvénic (10⁻⁷ s) to confinement time (hundreds of seconds), the large difference in space scales from micro-instabilities (mm) to the machine size (a few meters), and most importantly, from the strongly non-linear nature of plasma instabilities, which need to be avoided or controlled.
The current status of first principles non-linear modelling of MHD instabilities and active methods of their control in existing machines and ITER will be presented, focusing particularly on the strong synergy between experiment, fusion plasma theory, numerical modelling and computer science in guaranteeing the success of the ITER project.
Biography: Marina Becoulet is a Senior Research Physicist in the Institute of Research in Magnetic Fusion at the French Atomic Energy Commission (CEA/IRFM). She is also a Research Director and an International Expert of CEA, specializing in theory and modelling of magnetic fusion plasmas, in particular non-linear MHD phenomena. After graduating from Moscow State University (Physics Department, Plasma Physics Division) in 1981, she obtained a PhD in Physics and Mathematics from the Institute of Applied Mathematics, Russian Academy of Sciences (1985). She worked at the Russian Academy of Sciences in Moscow, on the Joint European Torus in the UK, and since 1998 has been employed at CEA/IRFM, France.
- Posters & Papers Recognition Ceremony
- PASC18 Wrap-up
- Welcome to PASC19