Towards Practical and Near-Optimal Coflow Scheduling for Data Center Networks

Publisher: IEEE Computer Society
Publication Type: Journal Article
Citation: IEEE Transactions on Parallel and Distributed Systems, 2016, 27, (11), pp. 3366-3380
Issue Date: 2016-11-01
In current data centers, an application (e.g., MapReduce, Dryad, or a search platform) usually generates a group of parallel flows to complete a job. These flows compose a coflow, and the application benefits only when all of them have completed. Accordingly, minimizing the average Coflow Completion Time (CCT) becomes a critical objective of flow scheduling. However, achieving this goal in today's Data Center Networks (DCNs) is quite challenging, not only because the scheduling problem is theoretically NP-hard, but also because practical flow scheduling is difficult in large-scale DCNs. In this paper, we find that minimizing the average CCT of a set of coflows is equivalent to the well-known problem of minimizing the sum of completion times in a concurrent open shop. Because many solutions already exist for concurrent open shop, this equivalence opens up a variety of techniques for coflow scheduling. Inspired by the best known result, we derive a 2-approximation algorithm for coflow scheduling, and further develop a decentralized coflow scheduling system, D-CAS, which avoids the system problems associated with current centralized proposals while addressing the performance challenges of decentralized proposals. Trace-driven simulations indicate that D-CAS achieves performance close to that of Varys, the state-of-the-art centralized method, and significantly outperforms Baraat, the only existing decentralized method.
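The objective described above can be illustrated with a minimal sketch. It is not the paper's 2-approximation algorithm or D-CAS. It only shows how average CCT is computed when coflows are served one after another, under the assumption that each network port plays the role of a machine in a concurrent open shop and each coflow is a job with a processing demand on every port. The function names, the unit-capacity ports, and the toy demands are hypothetical, and the brute-force search is practical only for very small inputs.

```python
from itertools import permutations


def coflow_completion_times(demands, order):
    """Completion time of each coflow when coflows are served in `order`.

    demands[c][p] is the time coflow c needs on port p (assumed unit capacity).
    Each port serves coflows one after another in the given order.
    """
    num_ports = len(next(iter(demands.values())))
    load = [0.0] * num_ports  # cumulative busy time per port
    cct = {}
    for c in order:
        for p in range(num_ports):
            load[p] += demands[c][p]
        # A coflow completes only when its slowest flow completes.
        cct[c] = max(
            (load[p] for p in range(num_ports) if demands[c][p] > 0),
            default=0.0,
        )
    return cct


def average_cct(demands, order):
    """Average coflow completion time for a given service order."""
    cct = coflow_completion_times(demands, order)
    return sum(cct.values()) / len(cct)


def best_order(demands):
    """Exhaustive search over all orders (illustration only, factorial cost)."""
    return min(permutations(demands), key=lambda order: average_cct(demands, order))


# Hypothetical toy instance: three coflows, two ports.
demands = {"A": [3, 1], "B": [1, 1], "C": [0, 4]}
order = best_order(demands)
print(order, average_cct(demands, order))
```

On this toy instance, serving the small coflow B first lowers the average CCT compared with serving the coflows in their listed order, which is the kind of ordering decision a coflow scheduler has to make at scale.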