The BI Data Lake architecture uses a flexible structure: Source areas (SRC), a central BI Data Lake with a Business area (BUS), and one or more Target areas (TRG).

‘Powerpack with Python’ generates this software. It consists of several components:

  • Standards & Guidelines – Powerpack; This document describes the selected flexible architecture and the implementation of all ETL and database components. It also sets out the principles for implementing the software.
  • Roadmaps – Powerpack; The Roadmap sets out the steps needed to implement a BI project with ‘Powerpack’. It is designed as a “cookbook”: the developer can simply follow the ‘recipes’.
  • Reference Guide – Powerpack; A detailed reference for the productivity boosters used by ‘Powerpack’. These are Python scripts whose functionality can be tailored to individual needs through parameters. The guide describes the purpose of each script and the effect of its parameters.
  • Productivity boosters – Powerpack; ‘Powerpack’ uses Python scripts to generate ETL and database objects automatically. As a result, the software components of the BI project are uniform, comply with the Standards & Guidelines, and require much less effort from the developer.
  • ETL and database Code templates; ‘Powerpack’ applies optimized code templates. Each template performs a specific task in the ETL process.
  • Database and file Configuration templates; ‘Powerpack’ uses generic ini files to specify the source database and file configurations. These ini files are available for all kinds of database types (Oracle, Snowflake, MySQL, SQL Server, etc.) and file types (CSV, JSON and XML).
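
To illustrate what an ini-based source configuration could look like, here is a minimal sketch that reads one with Python's standard `configparser` module. The section names, keys and values (`source_database`, `source_file`, `host`, `delimiter` and so on) are assumptions made for this example; they are not the actual layout of the ‘Powerpack’ configuration templates.

```python
import configparser

# Hypothetical ini content. Section and key names are illustrative only
# and do not reflect the real Powerpack configuration templates.
EXAMPLE_INI = """
[source_database]
type = oracle
host = dbhost.example.com
port = 1521
schema = SALES

[source_file]
type = csv
delimiter = ,
encoding = utf-8
"""


def load_source_config(ini_text):
    """Parse ini text and return a dict of {section: {key: value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # configparser returns every value as a string; convert where needed.
    return {section: dict(parser[section]) for section in parser.sections()}


config = load_source_config(EXAMPLE_INI)
print(config["source_database"]["type"])
print(config["source_file"]["delimiter"])
```

Keeping connection details in an ini file rather than in the scripts means the same script can, in principle, be pointed at another database type or file type by swapping the configuration file.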

 

The BI Data Lake architecture stores the data supplied by the various Source areas (SRC) on a uniform platform and in a flexible way in the BI Data Lake database, using Python as the ETL tool.

Within the BI Data Lake you can choose which of the following architectures to implement in your organization:

  • 1 area; Staging area (STG), which stores the supplied data from the various Source areas (SRC) on a uniform platform.
  • 2 areas; Staging area (STG) and History area (HIS), which stores the data together with its history.
  • 3 areas; Staging area (STG), History area (HIS) and Business area (BUS), which holds your business-specific transformations and relations.
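
The three options above can be sketched as a small lookup that turns the chosen number of areas into an ordered list of load steps. This is only an illustration: the area codes come from the text, but the names `AREA_LAYOUTS`, `load_path` and `load_steps` are hypothetical, and the sketch assumes that data flows from SRC through the areas in the order STG, HIS, BUS.

```python
from typing import List, Tuple

# Area codes are taken from the text; this mapping and the function names
# are hypothetical and not part of Powerpack itself.
AREA_LAYOUTS = {
    1: ["STG"],
    2: ["STG", "HIS"],
    3: ["STG", "HIS", "BUS"],
}


def load_path(num_areas: int) -> List[str]:
    """Return the areas the data passes through, starting at the source."""
    if num_areas not in AREA_LAYOUTS:
        raise ValueError("num_areas must be 1, 2 or 3")
    # Assumption: data always enters from a Source area (SRC).
    return ["SRC"] + AREA_LAYOUTS[num_areas]


def load_steps(num_areas: int) -> List[Tuple[str, str]]:
    """Return (from_area, to_area) pairs, one per load step."""
    path = load_path(num_areas)
    return list(zip(path, path[1:]))


for source_area, target_area in load_steps(3):
    print(source_area, "->", target_area)
```

With the 3-area layout this yields three load steps; with the 1-area layout only the step from SRC to STG remains.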