Seminar: Distributed Systems
April 3, 2025, 12:15, room 4070
Dominik Strak, Szymon Potrzebowski
Combining Buffered I/O and Direct I/O in Distributed File Systems
This paper explores the performance trade-offs between buffered and direct I/O in HPC environments and introduces a dynamic, transparent approach that switches between them based on workload characteristics. The proposed method considers factors such as I/O size, file lock contention, and memory constraints to optimize performance. The authors implemented the approach in the Lustre file system and claim that it improves throughput significantly, achieving up to a 3x speedup over standard Lustre and up to a 13x improvement over other distributed file systems with direct I/O support.
You are welcome to attend,
Dominik Strak
Netcastle: Network Infrastructure Testing At Scale
Network operators have long struggled to achieve reliability. Increased complexity brings the risk of surprising interactions, more downtime, and person-hours lost to debugging correctness and performance problems in large systems. For these reasons, network operators have also long pushed back on deploying promising network research, fearing the unexpected consequences of increased network complexity. This presentation covers a paper in which the authors use statistics from a large-scale network to identify unique challenges in network testing. To tackle these challenges, they develop Netcastle, a system that provides continuous integration/continuous deployment (CI/CD) network testing. The authors share five years of experience building and running Netcastle at Meta.
You are welcome to attend,
Szymon Potrzebowski