I’ve written a series of Medium articles on building a data pipeline from scratch using Polars and DeltaTables. The first (linked) is an overview, with links to the GitHub repository and to each of the deeper-dive articles. Those articles then go into the next level of detail, walking through each component.

The articles are paywalled (it took time to build and document), but the link provided is the ‘family & friends’ link, which bypasses the paywall for the Lemmy community.

I hope some of you find this helpful.
