I get questions like this a lot:
- Where did this data come from?
- How do I know I can trust the source?
- What types of QA checks were applied to this data?
Data lineage is a chronic issue in data engineering. This blog post from Airbyte gives a good overview and mentions some interesting products/projects that might help with data lineage.
Unfortunately, I have limited flexibility to purchase or install tools for this in my current role. Has anyone rolled their own solution for this?
Apache NiFi maintains a lineage table for its data movement and transformation.
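If installing a tool isn't an option, the same lineage-table idea can be rolled by hand. Here's a rough sketch using only Python's standard library, not how NiFi does it internally. The table layout, the function names (`init_lineage`, `record_step`, `trace`), and the dataset names and S3 path are all made up for illustration. Each pipeline step writes one row saying what it produced, what it read from, and which QA checks ran.

```python
import sqlite3
from datetime import datetime, timezone


def init_lineage(conn):
    # Hypothetical schema: one row per pipeline step that produces a dataset.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS lineage (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            dataset TEXT NOT NULL,      -- what was produced
            source TEXT NOT NULL,       -- what it was produced from
            step TEXT NOT NULL,         -- name of the job or transformation
            qa_checks TEXT,             -- comma-separated checks that ran
            recorded_at TEXT NOT NULL   -- UTC timestamp, ISO 8601
        )
        """
    )


def record_step(conn, dataset, source, step, qa_checks=()):
    # Call this at the end of each pipeline step.
    conn.execute(
        "INSERT INTO lineage (dataset, source, step, qa_checks, recorded_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (dataset, source, step, ",".join(qa_checks),
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def trace(conn, dataset):
    # Walk upstream from `dataset` and collect every recorded step.
    rows, seen, frontier = [], set(), [dataset]
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue  # guard against cycles and repeated sources
        seen.add(current)
        cursor = conn.execute(
            "SELECT dataset, source, step, qa_checks FROM lineage "
            "WHERE dataset = ? ORDER BY id",
            (current,),
        )
        for row in cursor:
            rows.append(row)
            frontier.append(row[1])  # follow the source one level up
    return rows


# Example usage with made-up dataset names.
conn = sqlite3.connect(":memory:")
init_lineage(conn)
record_step(conn, "staging.orders", "s3://raw/orders.csv", "load_csv",
            ["row_count"])
record_step(conn, "mart.daily_sales", "staging.orders", "aggregate_by_day",
            ["not_null", "unique_key"])

for row in trace(conn, "mart.daily_sales"):
    print(row)
```

Calling `trace` on a dataset covers the three questions from the original post: the `source` column shows where the data came from, `step` shows what touched it, and `qa_checks` shows what was verified along the way. It only knows what your jobs bother to record, so it's only as trustworthy as the discipline around calling `record_step`.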