DocGen

An application to normalise and automate a large portion of documentation for Data Engineers

How To

The hard part first: you need to output your data product in the format of the example data product YAML file.

This step is bespoke per client, since they could be using Oracle, SQL Server, Databricks, Azure, AWS, GCP and so on. The general idea is to capture every table and column that the end product uses as a source, along with all the details of the data product itself.

Now the easy part.

Import the DocGen class and point it at your YAML file.

Each table (entity) should have a type of Bronze, Silver or Gold, following the medallion architecture that is a common standard in data engineering today.

  • The Gold tables are mapped into an ERD.
  • The entire product gets a Markdown table listing every table and its details.
  • The entire product gets a lineage flow chart.
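
To make the three outputs above more concrete, here is a minimal sketch of how entities tagged Bronze, Silver or Gold could be turned into a Markdown table, an ERD for the Gold tables and a lineage flow chart. It does not use DocGen or reflect its internals: the `Entity` class, its fields, the helper functions and the sample table names are all hypothetical, and Mermaid is assumed as the diagram syntax purely for illustration.

```python
from dataclasses import dataclass, field

# Medallion layers named in this guide.
VALID_LAYERS = ("Bronze", "Silver", "Gold")


@dataclass
class Entity:
    """Hypothetical in-memory stand-in for one table in the data product."""
    name: str
    layer: str
    columns: list[str] = field(default_factory=list)
    sources: list[str] = field(default_factory=list)  # upstream table names

    def __post_init__(self):
        # Reject anything that is not Bronze, Silver or Gold.
        if self.layer not in VALID_LAYERS:
            raise ValueError(f"{self.name}: unknown type {self.layer!r}")


def markdown_table(entities):
    """One Markdown row per table, covering the whole product."""
    lines = ["| Table | Type | Columns | Sources |", "| --- | --- | --- | --- |"]
    for e in entities:
        cols = ", ".join(e.columns)
        srcs = ", ".join(e.sources) or "-"
        lines.append(f"| {e.name} | {e.layer} | {cols} | {srcs} |")
    return "\n".join(lines)


def gold_erd(entities):
    """Mermaid erDiagram text containing only the Gold tables."""
    lines = ["erDiagram"]
    for e in entities:
        if e.layer != "Gold":
            continue
        lines.append(f"    {e.name} {{")
        for col in e.columns:
            # Column types are not modelled here, so "string" is a placeholder.
            lines.append(f"        string {col}")
        lines.append("    }")
    return "\n".join(lines)


def lineage_flowchart(entities):
    """Mermaid flowchart text with one edge per source-to-table dependency."""
    lines = ["flowchart LR"]
    for e in entities:
        for src in e.sources:
            lines.append(f"    {src} --> {e.name}")
    return "\n".join(lines)


# Made-up sample product: one table per layer.
entities = [
    Entity("raw_orders", "Bronze", ["order_id", "payload"]),
    Entity("clean_orders", "Silver", ["order_id", "amount"], ["raw_orders"]),
    Entity("fact_sales", "Gold", ["order_id", "amount"], ["clean_orders"]),
]

if __name__ == "__main__":
    print(markdown_table(entities))
    print(gold_erd(entities))
    print(lineage_flowchart(entities))
```

Keeping the layer on each entity is what lets the ERD filter down to Gold tables while the table and lineage views still cover the whole product.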