VDSG: Optimizing Multimodal AI Pipelines with Metaxy

Fri, 29 May 2026 18:00:00 +0000

Abstract

The AI era has changed the economics of data pipelines. Multimodal workflows often fan out into transcription, image understanding, embeddings, classification, extraction, review, and downstream analytics. Without precise metadata, a small change can invalidate too much of the pipeline and force costly reruns.

This talk introduces Metaxy, an open-source Python framework for sample-level metadata versioning and field-level provenance. Metaxy acts as a control layer for incremental data pipelines: it records which fields depend on which upstream fields, computes what became stale, and lets the execution layer process only the affected records.

We focus on practical examples across startup, enterprise, and research settings:

avoiding wasteful recomputation in multimodal AI workflows,
using field-level lineage to decide what can be skipped,
keeping provenance queryable across document, audio, image, and tabular data,
connecting Metaxy with orchestrators and compute engines such as Dagster, Ray, and Slurm.

The core idea is simple: if an audio file changes, recompute transcription. Do not rerun face recognition if it only depends on the video stream.
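
The rule above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Metaxy's actual API: the `DEPENDENCIES` map and the helper names (`field_version`, `compute_versions`, `stale_fields`) are hypothetical, and hashing raw bytes stands in for whatever versioning scheme the framework really uses.

```python
import hashlib

# Hypothetical dependency map: downstream field -> upstream fields it reads.
DEPENDENCIES = {
    "transcription": ["audio"],
    "face_recognition": ["video"],
}


def field_version(sample, upstream_fields):
    """Hash only the upstream fields that a downstream field depends on."""
    digest = hashlib.sha256()
    for name in sorted(upstream_fields):
        digest.update(name.encode())
        digest.update(b"\0")
        digest.update(sample[name])
        digest.update(b"\0")
    return digest.hexdigest()


def compute_versions(sample):
    """Compute one version string per downstream field of a sample."""
    return {
        field: field_version(sample, upstream)
        for field, upstream in DEPENDENCIES.items()
    }


def stale_fields(recorded_versions, sample):
    """Return downstream fields whose upstream inputs changed since recording."""
    current = compute_versions(sample)
    return sorted(
        field
        for field, version in current.items()
        if recorded_versions.get(field) != version
    )


# Record versions for the original sample, then change only the audio.
old_sample = {"audio": b"audio-v1", "video": b"video-v1"}
recorded = compute_versions(old_sample)

new_sample = {"audio": b"audio-v2", "video": b"video-v1"}
print(stale_fields(recorded, new_sample))  # ['transcription']
```

Because each downstream field is versioned only from the upstream fields it reads, the audio change marks transcription as stale while face recognition keeps its recorded version and can be skipped.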

