Designing Machine Learning Toolboxes: Concepts, Principles and Patterns
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online Access: | Order full text |
Summary: | Machine learning (ML) and AI toolboxes such as scikit-learn or Weka are
workhorses of contemporary data science practice. Their central role is
enabled by usable yet powerful designs that make it easy to specify, train and
validate complex modeling pipelines. However, despite their universal success,
the key design principles in their construction have never been fully analyzed.
In this paper, we attempt to provide an overview of key patterns in the design
of AI modeling toolboxes, taking inspiration, in equal parts, from the field of
software engineering, implementation patterns found in contemporary toolboxes,
and our own experience from developing ML toolboxes. In particular, we develop
a conceptual model for the AI/ML domain, with a new type system, called
scientific types, at its core. Scientific types capture the scientific meaning
of common elements in ML workflows based on the set of operations that we
usually perform with them (i.e. their interface) and their statistical
properties. From our conceptual analysis, we derive a set of design principles
and patterns. We illustrate that our analysis can not only explain the design
of existing toolboxes, but also guide the development of new ones. We intend
our contribution to be a state-of-the-art reference for future toolbox engineers, a
summary of best practices, a collection of ML design patterns which may become
useful for future research, and, potentially, the first steps towards a
higher-level programming paradigm for constructing AI. |
---|---|
DOI: | 10.48550/arxiv.2101.04938 |