High Level Design architecture (HLD)
Link.
Low Level Design architecture (LLD)
Link.
Architecture Data Flow
Here is a suggested template for the data model and data mapping:
https://docs.google.com/spreadsheets/d/1bD8AIgsNUI2sgANoOEKTuHBlkxhsNVTD8cmOPYEloLw
DataPrep Flow
Diagram showing the different steps of the application flow, with the data involved at each step.
Steps descriptions
Describe the data and process involved at each step
DataSource 1 - Accolade DB
Description
Accolade is the application where users (mainly project managers) share project-related data, for example status and forecast.
Tools
Talend collects the data from the source and stores it in Google BigQuery on GCP.
Access rights
Only authorized users can connect to the Accolade application.
Only DA&AI Data Engineers and Data Architects can access the data in Google BigQuery on GCP.
Source
Location
Accolade DB, accessed through a generic Talend account created for this purpose.
Format
MSSQL: direct table reads and procedure execution.
Destination
Location
Extracted data will be stored in Google Cloud Storage and Google BigQuery on GCP.
Format
The format of the data saved in the databank
Sizing
Expected data volume for:
- full process
- incremental process
Assessment
Check the GCP log tables log_tables and run_jobs to confirm that no errors occurred while loading from source to staging/ODS.
Check that the surrogate key is unique in the data mart layer.
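The two assessment checks above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `status` field, the `SUCCESS` value, and the table and column names in the SQL string are hypothetical placeholders, since the actual layout of log_tables, run_jobs, and the data mart tables is not described here.

```python
from collections import Counter
from typing import Iterable, Mapping, Sequence


def failed_log_rows(
    log_rows: Iterable[Mapping[str, str]],
    status_field: str = "status",            # hypothetical column name
    ok_values: Sequence[str] = ("SUCCESS",),  # hypothetical status value
) -> list:
    """Return the log rows whose status is not one of the accepted values."""
    return [row for row in log_rows if row.get(status_field) not in ok_values]


def duplicate_keys(keys: Iterable) -> dict:
    """Return {key: count} for every surrogate key that occurs more than once."""
    return {key: n for key, n in Counter(keys).items() if n > 1}


# A query of the same shape could run directly in BigQuery.
# The table and column names below are placeholders, not real objects.
DUPLICATE_KEY_SQL = """
SELECT surrogate_key, COUNT(*) AS n
FROM `project.dataset.data_mart_table`
GROUP BY surrogate_key
HAVING COUNT(*) > 1
"""


if __name__ == "__main__":
    sample_log = [
        {"job": "load_staging", "status": "SUCCESS"},
        {"job": "load_ods", "status": "ERROR"},
    ]
    print(failed_log_rows(sample_log))
    print(duplicate_keys([101, 102, 102, 103]))
```

An empty result from both functions would mean the load completed without logged errors and every surrogate key is unique.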
Scheduling
Is there an automatic schedule? Yes, it is still to be configured.
At what frequency? Data is to be collected 5 times/day on working days (Monday to Friday).
What is the trigger?
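As an illustration of a schedule of five runs per working day (Monday to Friday), the sketch below builds a standard five-field cron expression and checks whether a given moment falls on a run slot. The run hours are purely hypothetical, since the actual run times and the trigger are not defined in this document.

```python
from datetime import datetime

# Hypothetical run hours: only "5 runs per working day" is specified.
RUN_HOURS = (8, 10, 12, 14, 16)


def cron_expression(hours=RUN_HOURS) -> str:
    """Build a five-field cron expression: minute 0 of each listed hour, Monday to Friday."""
    return "0 {} * * 1-5".format(",".join(str(h) for h in hours))


def is_scheduled(moment: datetime, hours=RUN_HOURS) -> bool:
    """Return True if `moment` is a weekday, at a listed hour, at minute 0."""
    return moment.weekday() < 5 and moment.hour in hours and moment.minute == 0


if __name__ == "__main__":
    print(cron_expression())
    print(is_scheduled(datetime(2024, 1, 1, 8, 0)))  # 2024-01-01 is a Monday
```

Whichever scheduler is chosen, the expression would need to be adapted to its own syntax and time zone handling.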
Timing
The average time expected per run (5 runs/day on working days, Monday to Friday) for:
- full process
- incremental process
Criticality
High / Medium / Low
Logging
Logging location
DataSource 2
Same questions as for DataSource 1.
