Sailor: Automating Distributed Training over Dynamic, Heterogeneous, and Geo-distributed Clusters

Published in Proceedings of the 31st Symposium on Operating Systems Principles (SOSP ’25), October 13–16, 2025, Seoul, Republic of Korea

Recommended citation: Foteini Strati*, Zhendong Zhang, George Manos, Ixeia Sánchez Périz, Qinghao Hu, Tiancheng Chen, Berk Buzcu, Song Han, Pamela Delgado, Ana Klimovic. "Sailor: Automating Distributed Training over Dynamic, Heterogeneous, and Geo-distributed Clusters." In ACM SIGOPS 31st Symposium on Operating Systems Principles (SOSP ’25), October 13–16, 2025, Seoul, Republic of Korea. /files/2025-sailor.pdf