Title of the talk: Overcoming Data Locality: an In-Memory Runtime File System with Symmetrical Data Distribution
Abstract: In many-task computing (MTC), applications such as scientific workflows or parameter sweeps communicate via intermediate files; application performance strongly depends on the file system in use. The state of the art uses runtime systems providing in-memory file storage that is designed for data locality: files are placed on those nodes that write or read them. With data locality, however, task distribution conflicts with data distribution, leading to application slowdown, and worse, to prohibitive storage imbalance. To overcome these limitations, we develop MemFS, a fully symmetrical, in-memory runtime file system that stripes files across all compute nodes, based on a distributed hash function. Our experiments with Montage and BLAST workflows, using up to 512 cores, show that MemFS has both better performance and better scalability than the state-of-the-art, locality-based file system, AMFS.
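Illustration (not from the talk): the hash-based striping described in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the MemFS implementation: the stripe size, the choice of SHA-1, and the names node_for, stripe_file and reassemble are all hypothetical. It only shows how hashing a (path, stripe index) pair spreads a file's stripes over all nodes, regardless of which node wrote the file.

```python
import hashlib

# Assumed value, tiny on purpose so the example stays readable.
STRIPE_SIZE = 4  # bytes per stripe


def node_for(path, stripe_index, num_nodes):
    """Pick a node for one stripe by hashing (path, stripe index).

    Hypothetical helper: any uniform hash would do; SHA-1 is used here
    only because it is available in the standard library.
    """
    key = f"{path}#{stripe_index}".encode()
    digest = hashlib.sha1(key).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def stripe_file(path, data, num_nodes):
    """Split data into stripes and return {node: [(stripe_index, chunk), ...]}."""
    placement = {node: [] for node in range(num_nodes)}
    for offset in range(0, len(data), STRIPE_SIZE):
        index = offset // STRIPE_SIZE
        chunk = data[offset:offset + STRIPE_SIZE]
        placement[node_for(path, index, num_nodes)].append((index, chunk))
    return placement


def reassemble(placement):
    """Collect the stripes from all nodes and restore the original byte order."""
    stripes = sorted(s for node_stripes in placement.values() for s in node_stripes)
    return b"".join(chunk for _, chunk in stripes)


if __name__ == "__main__":
    data = bytes(range(256)) * 16  # 4096 bytes, i.e. 1024 stripes
    placement = stripe_file("/work/intermediate.fits", data, num_nodes=8)
    for node, stripes in placement.items():
        print(f"node {node}: {len(stripes)} stripes")
    assert reassemble(placement) == data
```

Because placement depends only on the hash, any node can compute where a stripe lives without consulting a central directory, and storage load does not follow task placement.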
Biography: Thilo Kielmann is Associate Professor at the Computer Science Department of Vrije Universiteit (VU University) in Amsterdam, the Netherlands. He investigates large-scale systems performability, focusing on clusters, clouds, and grids. Thilo studied Computer Science at Darmstadt University of Technology, Germany. He received his Ph.D. in Computer Engineering in 1997, and his habilitation in Computer Science in 2001, both from Siegen University, Germany. He has published numerous articles in international conferences and journals. In 2012, he was PC chair for HPDC’12, the 21st ACM International Symposium on High Performance Parallel and Distributed Computing. He is serving as Associate Editor for the Journal of Parallel and Distributed Computing (JPDC), and for Cluster Computing, The Journal of Networks, Software Tools and Applications.
Title of the talk: A principled approach to the performance-vs.-programmability trade-off in large-scale distributed data stores
Joint work with
Marek Zawirski, UPMC-LIP6 & INRIA
Masoud Saeida-Ardekani, UPMC-LIP6 & INRIA
Nuno Preguiça, U. Nova de Lisboa
Abstract: Distributed systems face a fundamental trade-off. Strong consistency is easy to understand, but it is slow and expensive, and it becomes unavailable when the system partitions. Eventual consistency (EC) can be cheaper, faster, and more scalable, but is hard to understand and get correct. We explore the spectrum between those two end-points, from the perspective of parallelism, availability and responsiveness on the one hand, and the ease of programming and maintaining invariants on the other. When concurrent updates to replicated data are allowed, ensuring convergence is complex and often ad hoc. We propose a simple, scalable, but somewhat restrictive programming model, Conflict-Free Replicated Data Types (CRDTs). Some simple sufficient conditions ensure that all replicas of a CRDT object converge to a correct state, without requiring synchronisation or rollback. CRDTs are safe by construction, and remain responsive, available and scalable despite high network latency, faults, or disconnection.
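Illustration (not from the talk): a grow-only counter is a commonly used introductory example of a state-based CRDT, and it shows how replicas can converge without synchronisation. The abstract does not spell out the sufficient conditions; the sketch below assumes the usual state-based formulation, in which updates only move the state upwards and merge is commutative, associative and idempotent. The class name GCounter and its methods are illustrative, not an API from the speakers' work.

```python
class GCounter:
    """Grow-only counter: one non-decreasing entry per replica."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> number of increments seen from it

    def increment(self, amount=1):
        """Local update; touches only this replica's own entry."""
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        """The counter value is the sum over all replicas' entries."""
        return sum(self.counts.values())

    def merge(self, other):
        """Join two states by taking the element-wise maximum.

        max is commutative, associative and idempotent, so merges can be
        applied in any order, and repeated, without changing the result.
        """
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)


if __name__ == "__main__":
    a, b = GCounter("a"), GCounter("b")
    a.increment(2)      # concurrent updates at two replicas,
    b.increment(3)      # with no coordination between them
    a.merge(b)
    b.merge(a)
    b.merge(a)          # a duplicated merge is harmless
    print(a.value(), b.value())  # prints "5 5"
```

Each replica accepts updates locally and exchanges state whenever the network allows; no rollback is needed because a merge never discards an increment.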