DocGen
An application to normalise and automate a large portion of documentation for Data Engineers.
How To
The hard part first: you need to output your data product in the format of the example data product YAML file.
This step is bespoke per client, as they could be using Oracle/SQL Server/Databricks/Azure/AWS/GCP... The general idea is to capture every table and column that the end product uses as a source, along with all the details of the data product itself.
Now the easy part.
Import the DocGen class and point it at your YAML file.
Each table (Entity) should have a type of Bronze, Silver or Gold, following the medallion architecture that is common in today's data engineering.
- The Gold tables get mapped into an entity relationship diagram (ERD).
- The entire product gets a Markdown table listing every table and its details.
- The entire product gets a lineage flow chart.
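
To make the three outputs above more concrete, here is a minimal sketch of the idea in plain Python. It does not use DocGen and is not its implementation. The `product` dictionary stands in for a parsed YAML file, and its field names (`entities`, `type`, `sources`, `columns`) are assumptions, not the real schema. The Mermaid flowchart syntax is just one possible target for the lineage chart.

```python
# Illustrative only: field names and output formats are assumptions,
# not the schema or behaviour of DocGen itself.
VALID_TYPES = ("Bronze", "Silver", "Gold")

# Hypothetical stand-in for the parsed data product YAML file.
product = {
    "name": "sales_product",
    "entities": [
        {"name": "raw_orders", "type": "Bronze", "sources": [],
         "columns": ["order_id", "amount"]},
        {"name": "clean_orders", "type": "Silver", "sources": ["raw_orders"],
         "columns": ["order_id", "amount_gbp"]},
        {"name": "fact_sales", "type": "Gold", "sources": ["clean_orders"],
         "columns": ["order_id", "amount_gbp", "sale_date"]},
    ],
}


def validate(product):
    """Raise ValueError if any entity lacks a medallion type."""
    for entity in product["entities"]:
        if entity["type"] not in VALID_TYPES:
            raise ValueError(
                f"{entity['name']}: type must be one of {VALID_TYPES}"
            )


def gold_entities(product):
    """Names of the entities that would feed the ERD."""
    return [e["name"] for e in product["entities"] if e["type"] == "Gold"]


def markdown_table(product):
    """One Markdown table row per entity in the product."""
    lines = ["| Entity | Type | Columns |", "| --- | --- | --- |"]
    for e in product["entities"]:
        lines.append(f"| {e['name']} | {e['type']} | {', '.join(e['columns'])} |")
    return "\n".join(lines)


def lineage_flowchart(product):
    """Mermaid flowchart with one edge per source -> entity link."""
    lines = ["flowchart LR"]
    for e in product["entities"]:
        for source in e["sources"]:
            lines.append(f"    {source} --> {e['name']}")
    return "\n".join(lines)


if __name__ == "__main__":
    validate(product)
    print(markdown_table(product))
    print(lineage_flowchart(product))
```

Validating the type up front matters because everything downstream keys off it: only Gold entities reach the ERD, while the table and lineage chart cover the whole product.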