Pixi powering Telekom data cloud



Livestream
Pixi is a package management tool that handles dependencies efficiently. It is created by prefix.dev, written in Rust, and very fast. At Magenta Telekom in Austria, Pixi helps us build our new data platform around metadata, strong governance, and an explicit graph of data dependencies. In this talk we share our experience with Pixi and how it empowers our data infrastructure in conjunction with Dagster.
Key principles
Explicit Modeling of Data Dependencies as a Graph
- Graph-Based Dependencies: By representing data dependencies as a graph, the data orchestrator Dagster manages pipelines efficiently, automatically resolving the dependencies between transformations, much as a calculator takes care of the arithmetic.
- Event-Based Notifications: The dependency graph enables immediate notifications when upstream sources change, allowing downstream consumers to update promptly, reducing wait times.
- Comprehensive Integration: Incorporating data ingestion, transformation, BI/reporting, and AI into the dependency graph makes all relationships transparent, fostering cross-tool and cross-departmental collaboration by breaking down silos.
- Strong Governance: We adhere to robust governance principles by collecting metadata during ingestion and propagating it throughout the graph, ensuring data integrity and compliance.
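To make the graph idea concrete, here is a minimal conceptual sketch using only the Python standard library. It is not Dagster's API and not our production code; the asset names are hypothetical. It shows two things described above: deriving a valid execution order from declared dependencies, and finding which downstream assets are affected when an upstream source changes.

```python
from graphlib import TopologicalSorter

# Hypothetical assets: each name maps to the set of upstream assets it depends on.
DEPENDENCIES = {
    "raw_customers": set(),
    "raw_orders": set(),
    "clean_orders": {"raw_orders"},
    "customer_revenue": {"raw_customers", "clean_orders"},
    "revenue_dashboard": {"customer_revenue"},
}


def execution_order(deps):
    """Return the assets in an order where every upstream comes first."""
    return list(TopologicalSorter(deps).static_order())


def downstream_of(changed, deps):
    """Return every asset that transitively depends on `changed`."""
    affected = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for asset, upstream in deps.items():
            if current in upstream and asset not in affected:
                affected.add(asset)
                frontier.append(asset)
    return affected


order = execution_order(DEPENDENCIES)
stale = downstream_of("raw_orders", DEPENDENCIES)
print(order)          # upstream assets appear before their consumers
print(sorted(stale))  # assets to refresh after raw_orders changes
```

An orchestrator such as Dagster builds this kind of graph from the dependencies declared on each asset, so the ordering and the change propagation do not have to be maintained by hand.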
Advantages Over the Previous System (Conda)
- Optimized Lockfiles: Lockfile resolution occurs only during development, streamlining deployment processes.
- Faster Dependency Resolution: Pixi resolves dependencies quickly, which shortens environment setup before pipelines run.
- Integrated Task Runner: Replaces cumbersome Makefiles with a built-in task runner, improving workflow efficiency.
- Environment-Specific Dependencies: Supports feature-based environments (development, production, linting), ensuring appropriate configurations across different stages.
- Multi-Language Support: Facilitates environments that include both Python and Java, catering to diverse development needs.
Dates
- Live stream 2025-01-31 at 11:30
- https://www.youtube.com/live/QM-QTGa4b8U
- To interact during the stream, join directly on Discord: https://discord.gg/MdEYuYhtyd?event=1331572690904944742
- A recording will eventually be available at https://www.youtube.com/@prefix_dev
Further Resources
Enterprise Implementation: Explore the key ideas we implement for enterprises by trying out our Local Data Stack Template, which encompasses all the main elements discussed. Check out this post for more detail.
Upcoming Events: Stay tuned! On April 9th, we will share more details about our architecture at the Vienna Data Engineering Meetup.
Summary
Try It Out: Explore our Local Data Stack Template to enhance your own data platform.
A second recording, possibly with improved audio, is also available.
Connect With Us: We welcome your feedback and questions! Reach out to us to discuss how these principles and the template can add value to your data infrastructure.
Join us for this live stream and learn how to empower your data platform with Pixi!