HYPPO - Using Equivalences to Optimize Pipelines in Exploratory Machine Learning

Publication
ICDE 2024 - 40th IEEE International Conference on Data Engineering

We propose HYPPO, a novel system to optimize pipelines encountered in exploratory machine learning. HYPPO exploits alternative computational paths of artifacts from past executions to derive better execution plans while reusing materialized artifacts. Adding alternative computations introduces new challenges for exploratory machine learning regarding workload representation, system architecture, and optimal execution plan generation. To this end, we present a novel workload representation based on directed hypergraphs. We formulate the discovery of the optimal execution plan as a search problem over directed hypergraphs, and the selection of artifacts to materialize as an optimization problem.
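To make the hypergraph view concrete, the following is a minimal sketch and not HYPPO's actual algorithm or API. It assumes a workload can be modeled as hyperedges that each consume a set of input artifacts and produce a set of output artifacts at some cost, that loading a materialized artifact is a hyperedge with no inputs, and that equivalent computations appear as competing hyperedges producing the same artifact. All names (`HyperEdge`, `find_plan`, the artifact and task names, the costs) are hypothetical. The search is a simple cost relaxation that counts shared inputs more than once while ranking alternatives, so it is a heuristic rather than an exact optimizer; the reported plan cost counts each task once.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class HyperEdge:
    """A task: consumes every artifact in `tail`, produces every artifact in `head`."""
    name: str
    tail: frozenset  # input artifacts (empty for loads of stored artifacts)
    head: frozenset  # output artifacts (may contain more than one)
    cost: float


def find_plan(edges, target):
    """Heuristic plan search: relax derivation costs until nothing improves."""
    best = {}  # artifact -> (estimated cost, edge chosen to produce it)
    changed = True
    while changed:
        changed = False
        for e in edges:
            if all(a in best for a in e.tail):
                c = e.cost + sum(best[a][0] for a in e.tail)
                for out in e.head:
                    if c < best.get(out, (math.inf, None))[0]:
                        best[out] = (c, e)
                        changed = True
    if target not in best:
        raise ValueError(f"no derivation found for {target!r}")

    # Walk back from the target, emitting each chosen task once, inputs first.
    plan, seen = [], set()

    def visit(artifact):
        edge = best[artifact][1]
        if edge.name in seen:
            return
        seen.add(edge.name)
        for a in sorted(edge.tail):
            visit(a)
        plan.append(edge)

    visit(target)
    return [e.name for e in plan], sum(e.cost for e in plan)


# Hypothetical workload: "cleaned" is already materialized, and
# train_a / train_b are equivalent ways of producing the same model.
edges = [
    HyperEdge("load_raw", frozenset(), frozenset({"raw"}), 5),
    HyperEdge("clean_data", frozenset({"raw"}), frozenset({"cleaned"}), 10),
    HyperEdge("load_clean", frozenset(), frozenset({"cleaned"}), 2),
    HyperEdge("split", frozenset({"cleaned"}), frozenset({"train", "test"}), 1),
    HyperEdge("train_a", frozenset({"train"}), frozenset({"model"}), 30),
    HyperEdge("train_b", frozenset({"train"}), frozenset({"model"}), 8),
    HyperEdge("evaluate", frozenset({"model", "test"}), frozenset({"score"}), 1),
]

plan, cost = find_plan(edges, "score")
print(plan, cost)  # ['load_clean', 'split', 'train_b', 'evaluate'] 12
```

In this toy workload the plan loads the materialized artifact instead of recomputing it from raw data and picks the cheaper of the two equivalent training tasks. The `split` task shows why a hypergraph is used rather than an ordinary graph: a single task yields two artifacts, and `evaluate` needs two artifacts at once.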

Figure: Overview of the pipeline optimization process in HYPPO.