High Level Design architecture (HLD)
Link.
Low Level Design architecture (LLD)
Link.
Architecture Data Flow
Here is a suggested template for Data Model + Data Mapping :
DataPrep Flow
Schema showing the different STEPS of the application flow - with the data involved at each step
Accolade --> GCP Data Ocean DT (prj-data-dm-dt-[env]) --> GCP Product PMO (prj-data-pmo-dash-[env]) --> Output Tableau
Steps descriptions
Source table in Accolade
Flow name (Source to ODS)
Update Table in GCP
Flow name (ODS to DM)
Update Table in GCP
crv_initiative_v3
(Project Cost Budget&Cost)
F010_CRV_INITIATIVE_V3
STG_ACC_0000_0000_F001_F_H_crv_initiative_v3
ODS_ACC_0000_F001_F_H_crv_initiative_v3
F211_DIM_acc_initative
DIM_acc_initative
F212_DIM_acc_linked_prog
DIM_acc_linked_prog
F213_FACT_acc_initiative_cost
FACT_acc_initiative_cost
CRV_ZGetProjectDetails
(Main Project Data from standard fields)
F002_CRV_ZGetProjectDetails
STG_ACC_0000_0000_F001_F_H_crv_zgetprojectdetails
ODS_ACC0_0000_F001_F_H_crv_zgetprojectdetails
F202_DIM_CRV_ZGetProjectDetails
DIM_acc_proj_detail (project_id_sys)
F003_SGM_PROJECT_METRICS_15
STG_ACC_0000_0000_F001_I_H_sgm_projectmetrics_15
ODS_ACC_0000_F001_I_H_sgm_projectmetrics_15
Execute procedure MetricsForGroupID 15
(Status Indicators)
F203_DIM_acc_status_indicators
DIM_acc_status_indicators
F209_DIM_acc_impacted_teams
DIM_acc_impacted_teams
Execute procedure MetricsForGroupID 62
(Project Leader Estimates)
F004_SGM_PROJECT_METRICS_62
STG_ACC_0000_0000_F001_F_H_rsp_qv_getprojectmetricsforgroupid_62
ODS_ACC_0000_F001_F_H_rsp_qv_getprojectmetricsforgroupid_62
F204_FACT_acc_proj_leader_estimates
FACT_acc_proj_leader_estimates
Execute procedure MetricsForGroupID 94
(Risk Analysis)
F005_SGM_PROJECT_METRICS_94
STG_ACC_0000_0000_F001_F_H_rsp_qv_getprojectmetricsforgroupid_94
ODS_ACC_0000_F001_F_H_rsp_qv_getprojectmetricsforgroupid_94
F205_DIM_acc_mx_risk_analysis
DIM_acc_mx_risk_analysis
Execute procedure MetricsForGroupID 103
(Identification & Classification)
F006_SGM_PROJECT_METRICS_103
STG_ACC_0000_0000_F001_F_H_rsp_qv_getprojectmetricsforgroupid_103
ODS_ACC_0000_F001_F_H_rsp_qv_getprojectmetricsforgroupid_103
F206_DIM_acc_ident_classification
DIM_acc_ident_classification
F207_DIM_acc_impacted_bus
DIM_acc_impacted_bus
F208_DIM_acc_impacted_zones
DIM_acc_impacted_zones
F210_DIM_acc_impacted_functions
DIM_acc_impacted_functions
DataSource 1 - Accolade DB
Description
Accolade is the application where users (mainly project managers) share data (status, forecast as examples) related to a project.
Tools
Talend to collect the data from the source and store on GCP/Google Big Query.
Access rights
Only authorized users can connect into Accolade application.
Only DA&AI DataEng's and Data Architects can access data on GCP/Google Big Query.
Source
Location
Accolade DB via Talend generic account created for this purpose.
Format
MSSQL direct table and run procedure.
Destination
Location
Extracted data will be stored on GCP/Cloud storage and Google Big Query.
Format
The format of the data saved in the databank
Sizing
Expected data volume for :
- full process from source to staging (as of 6 Dec 2023)
- incremental process from ODS to DM (as of 6 Dec 2023)
Assessment
Check the log tables in GCP on table log_tables and run_jobs to check that there is no error loading from source to staging/ods
Check the surrogate key must be unique in the data mart layer
Scheduling
Is there an automatic schedule ? Yes, to be scheduled.
At what frequency ? to collect data 5 times/day.
What is the trigger ? TMC
Timing
The average time expected for :
- 5 times/day (working days: from Monday to Friday) : TBC
- full process (source to ODS)
- incremental process (ODS to DM)
Criticality
High/ Medium /Low
Logging
Table table log_tables, run_jobs, log_files, and reject_files in `prj-data-dm-dt-[environment].STG.[table]`


