Hi fellow data engineers,
Currently I’m restructuring a pipeline written with PySpark on Databricks. Since there are a lot of transformations, it results in an extensive DAG, but I think it’s fine to spend some extra processing resources to build a standard dimensional model (on top of the transformations that are necessary anyway).
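For context, this is roughly the kind of split I mean, just a minimal sketch with made-up table and column names:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical wide "one big table" with sales data (names are invented)
obt = spark.table("sales_obt")

# Dimension: one row per customer, with a surrogate key
dim_customer = (
    obt.select("customer_id", "customer_name", "customer_country")
       .dropDuplicates(["customer_id"])
       .withColumn("customer_key", F.xxhash64("customer_id"))
)

# Fact: keep the measures plus the foreign key to the dimension
fct_sales = (
    obt.join(dim_customer.select("customer_id", "customer_key"), "customer_id")
       .select("customer_key", "order_date", "quantity", "amount")
)

# Write out as Delta tables for the Power BI model to pick up
dim_customer.write.format("delta").mode("overwrite").saveAsTable("gold.dim_customer")
fct_sales.write.format("delta").mode("overwrite").saveAsTable("gold.fct_sales")
```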
I was wondering what real benefits you have seen from a star schema design compared to the “one big table” approach that I could preach to my team? (My main goal would be a smaller resulting Power BI model.)
And as a side question, what tools do you use to create a dimensional model such as a star schema with code?
Thanks a lot!