Taskloaf: Simple Distributed Memory Task-parallelism in Python and C++

Thomas Thompson, Harvard University

Taskloaf is a small, simple Python and C++ distributed futures library. Task-based parallelism, where problems are broken into individual tasks that are scheduled dynamically by a runtime system, is a good fit for a wide class of algorithms. However, existing HPC task-based parallel libraries either exclusively target shared memory settings or are oriented toward solving problems at exascale, resulting in complex systems that are hard to use. In contrast, the driving goal with Taskloaf is to make parallelism exceedingly simple for small-to-medium scales. Taskloaf also aims to be small, with only about 2,000 lines of C++. The interface is based on the concept of monadic futures, a time-tested parallelism primitive consisting of four main operations (async, ready, then, unwrap). Coarse-grained parallel operations like domain decomposition or map/reduce can be composed from these fine-grained primitives. Internally, a work-stealing-based scheduler balances the cost of inactive cores with the cost of data movement. A distributed hash table coordinates the ownership and movement of data through the system, while reference counting makes data lifetime transparent to the user. Taskloaf is available at:
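As an illustration of the four monadic-future operations named above (async, ready, then, unwrap), the following is a minimal single-process sketch built on Python's standard `concurrent.futures` module. It is not Taskloaf's API or implementation: the names `async_task`, `ready`, `then`, `unwrap`, `demo`, and `map_reduce` are hypothetical and chosen only for this example, and a local thread pool stands in for a distributed runtime. It also shows one way a coarse-grained map/reduce might be composed from the fine-grained primitives.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Assumption: a local thread pool stands in for a distributed runtime.
_pool = ThreadPoolExecutor(max_workers=4)


def ready(value):
    """Wrap an already-computed value in a completed future."""
    fut = Future()
    fut.set_result(value)
    return fut


def async_task(fn):
    """Schedule a zero-argument function; return a future for its result."""
    return _pool.submit(fn)


def then(fut, fn):
    """Return a future holding fn(result of fut), run once fut completes."""
    out = Future()

    def _on_done(done):
        try:
            out.set_result(fn(done.result()))
        except Exception as exc:  # propagate failures down the chain
            out.set_exception(exc)

    fut.add_done_callback(_on_done)
    return out


def unwrap(fut):
    """Flatten a future whose result is itself a future."""
    out = Future()

    def _on_inner(inner_done):
        try:
            out.set_result(inner_done.result())
        except Exception as exc:
            out.set_exception(exc)

    def _on_outer(outer_done):
        try:
            outer_done.result().add_done_callback(_on_inner)
        except Exception as exc:
            out.set_exception(exc)

    fut.add_done_callback(_on_outer)
    return out


def demo():
    """Chain the four primitives and return the final value."""
    a = async_task(lambda: 20)
    b = then(a, lambda x: x + 1)
    # The continuation launches a new task, so `nested` is a future of a future.
    nested = then(b, lambda x: async_task(lambda: x * 2))
    return unwrap(nested).result()


def map_reduce(items, mapper, reducer, init):
    """Launch all map tasks eagerly, then fold their results in order."""
    acc = ready(init)
    for item in items:
        mapped = async_task(lambda item=item: mapper(item))
        acc = unwrap(
            then(acc, lambda a, m=mapped: then(m, lambda v, a=a: reducer(a, v)))
        )
    return acc
```

In this sketch, `demo()` computes (20 + 1) * 2 through a nested future, and `map_reduce(range(5), lambda x: x * x, lambda a, b: a + b, 0).result()` sums the squares 0 through 16. The map tasks run concurrently while the reduction is chained sequentially; a tree-shaped reduction would be an equally valid composition.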

Abstract Author(s): T. Ben Thompson