Perfecting the Art of Doing Nothing: Incremental Multimodal AI Pipelines with Metaxy

BRZ, Vienna
Abstract
AI pipelines are now expensive enough that recomputing more than necessary is the dominant cost. Tokens and GPU hours change the economics, and agentic workflows branch and converge in ways the traditional single-state data platform was never designed for.
This talk introduces Metaxy, an open-source metadata control plane that:
- tracks lineage at the level of individual fields per record (not per dataset, not per asset),
- computes a precise diff when something changes — rows to add or recompute, rows to retire,
- hands that diff to whichever orchestrator or compute engine you already run.
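The diff step above can be sketched in a few lines. This is a hypothetical illustration, not Metaxy's actual API: assume each snapshot maps a record id to its per-field version hashes, and the control plane compares two snapshots to find exactly which record/field pairs are stale and which records should be retired.

```python
# Hypothetical sketch (not Metaxy's real API): per-record, per-field
# version hashes let a control plane diff two snapshots and emit only
# the work that is actually stale.

# A snapshot maps record_id -> {field_name: version_hash}.
Snapshot = dict[str, dict[str, str]]

def field_diff(old: Snapshot, new: Snapshot):
    """Return (recompute, retire): record/field pairs whose version
    changed or is new, and record ids that disappeared entirely."""
    recompute = [
        (rid, field)
        for rid, fields in new.items()
        for field, version in fields.items()
        if old.get(rid, {}).get(field) != version  # new record or changed field
    ]
    retire = [rid for rid in old if rid not in new]
    return recompute, retire

old = {"s1": {"audio": "a1", "text": "t1"}, "s2": {"audio": "a2"}}
new = {"s1": {"audio": "a1", "text": "t2"}, "s3": {"audio": "a3"}}
recompute, retire = field_diff(old, new)
# only s1's "text" field and s3's fields need work; s2 is retired
```

Handing `recompute` and `retire` to an orchestrator, rather than materializing data, is the "control plane" part: the expensive compute stays wherever it already runs.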
We walk through field-versioning, field-level dependencies, and selective recompute, then look at two production applications:
- Anam — Cara 3 training-data workflows (face detection and cropping, audio extraction, transcription, embedding generation) over millions of multimodal samples since December 2025.
- Jubust — structured patent intelligence built on Docling for parsing, Metaxy as the incremental control plane, Ray and Dagster for execution, and a reviewer workspace that turns expert corrections into signal for prompts, models, and evaluation.
The second half of the talk zooms out to platform anatomy — building blocks vs. domain products, quality that lives in the graph, executable specifications as the seam between platform and domain teams, and compute flexibility from laptop to HPC cluster.
Two takeaways on what Metaxy actually buys you:
- Topological caching of expensive AI work. Field-level lineage and a precise diff mean GPU and token spend is scoped to what truly changed downstream — the usual incremental-recompute story, but at sample-and-field granularity instead of asset granularity.
- Efficient metadata access over multimodal data enables intelligent routing. Once per-sample, per-field metadata is queryable, the platform can route work conditionally — pick a transcription model by detected language, pick a vision model by document type, send only the slices that need a heavy VLM through the expensive path, and keep the rest on cheap defaults.
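The routing idea in the second takeaway reduces to a metadata lookup per sample. A minimal sketch, with all model names and metadata fields (`language`, `doc_type`) chosen for illustration rather than taken from Metaxy:

```python
# Hypothetical sketch: once per-sample, per-field metadata is queryable,
# routing is a cheap lookup. Model names and fields are illustrative.

def pick_transcriber(meta: dict) -> str:
    # Route by detected language; keep everything else on a cheap default.
    if meta.get("language") in {"de", "fr"}:
        return "large-multilingual-asr"
    return "small-default-asr"

def pick_vision_model(meta: dict) -> str:
    # Send only the hard document types through the expensive VLM path.
    if meta.get("doc_type") == "scanned_table":
        return "heavy-vlm"
    return "cheap-layout-parser"

samples = [
    {"id": "a", "language": "de", "doc_type": "plain"},
    {"id": "b", "language": "en", "doc_type": "scanned_table"},
]
routes = {s["id"]: (pick_transcriber(s), pick_vision_model(s)) for s in samples}
```

The point is not the routing rules themselves but where they live: because the metadata is already per-sample and queryable, conditional dispatch needs no extra scan over the underlying multimodal data.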