Speeding up Python-based Lagrangian Fluid-Flow Particle Simulations via Dynamic Collection Data Structures
Array-like collection data structures are widely established in Python's scientific computing ecosystem for high-performance computations. The structure maps well to the regular, gridded lattices that are common to computational problems in physics and geosciences. High performance is, however, only guaranteed for static computations with a fixed computational domain. We show that for dynamic computations within an actively changing computational domain, the array-like collections provided by NumPy and its derivatives are a bottleneck for large computations. In response, we describe the integration of naturally dynamic collection data structures (e.g. doubly linked lists) into NumPy simulations and ctypes-based C-bindings. Our benchmarks verify and quantify the performance increase attributed to the change of the collection data structure. Our application scenario, a Lagrangian (oceanic) fluid-flow particle simulation within the Parcels framework, demonstrates the speed-up achieved in a realistic setting and the novel capabilities that optimised collection data structures facilitate.
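The contrast between the two collection types can be illustrated with a minimal sketch. It is not the Parcels implementation: the names `Node`, `ParticleList` and `pid` are hypothetical, and the example only shows why removing a particle from a NumPy array copies the surviving elements, whereas a doubly linked list relinks two neighbours.

```python
import numpy as np


class Node:
    """One particle record in a doubly linked list (hypothetical layout)."""
    __slots__ = ("pid", "prev", "next")

    def __init__(self, pid):
        self.pid = pid
        self.prev = None
        self.next = None


class ParticleList:
    """Minimal doubly linked list; append and remove are O(1) given the node."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.size = 0

    def append(self, pid):
        node = Node(pid)
        if self.tail is None:
            self.head = self.tail = node
        else:
            node.prev = self.tail
            self.tail.next = node
            self.tail = node
        self.size += 1
        return node

    def remove(self, node):
        # Relink the neighbours; no other element is moved or copied.
        if node.prev is None:
            self.head = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self.tail = node.prev
        else:
            node.next.prev = node.prev
        node.prev = node.next = None
        self.size -= 1

    def ids(self):
        # Walk the list from head to tail and collect the particle ids.
        out = []
        node = self.head
        while node is not None:
            out.append(node.pid)
            node = node.next
        return out


# Array-based collection: np.delete returns a new array holding the survivors.
array_ids = np.arange(6)
array_ids = np.delete(array_ids, 2)

# List-based collection: removing the same particle only relinks two nodes.
plist = ParticleList()
nodes = [plist.append(i) for i in range(6)]
plist.remove(nodes[2])
```

Both collections end up holding the same particle ids. The difference is the cost per removal, which grows with collection size for the array and stays constant for the list, at the price of losing contiguous, vectorisable storage.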