High Level Design architecture (HLD)

Link.


Low Level Design architecture (LLD)

Link.

Architecture Data Flow

Here is a suggested template for Data Model + Data Mapping :

DA&DA - Domain Mapping DT


DataPrep Flow

Schema showing the different STEPS of the application flow - with the data involved at each step

Accolade --> GCP Data Ocean DT (prj-data-dm-dt-[env])  --> GCP Product PMO (prj-data-pmo-dash-[env]) --> Output Tableau

prj-data-dm-dt-dev.STG.STG_ACC_0000_0000_F001_F_H_crv_initiative_v3
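The `[env]` placeholder in the project names above can be resolved programmatically. The following is a minimal sketch, not the project's actual tooling: the helper name `qualified_table` is hypothetical, and only the `dev` environment is confirmed by this page (`test` and `prod` are assumptions).

```python
# Project name templates as written in the flow above.
DATA_OCEAN_PROJECT = "prj-data-dm-dt-{env}"
PMO_PROJECT = "prj-data-pmo-dash-{env}"

# Assumption: only "dev" appears on this page; "test" and "prod" are guesses.
KNOWN_ENVS = ("dev", "test", "prod")


def qualified_table(project_template: str, env: str, dataset: str, table: str) -> str:
    """Return a fully qualified BigQuery table name: project.dataset.table."""
    if env not in KNOWN_ENVS:
        raise ValueError(f"unknown environment: {env!r}")
    return f"{project_template.format(env=env)}.{dataset}.{table}"


staging_table = qualified_table(
    DATA_OCEAN_PROJECT, "dev", "STG", "STG_ACC_0000_0000_F001_F_H_crv_initiative_v3"
)
print(staging_table)
```

Keeping the environment as a parameter lets the same flow definition point at each environment without editing table names by hand.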

Steps descriptions


Each source below lists the table columns in order: Source table in Accolade, Flow name (Source to ODS), Update Table in GCP, Flow name (ODS to DM) with its Update Table in GCP, and Flow name (Join tables) with its Update Table in GCP.

Source table in Accolade: crv_initiative_v3 (Project Cost Budget&Cost)
Flow name (Source to ODS): F010_CRV_INITIATIVE_V3
Update Table in GCP:
  • STG.STG_ACC_0000_0000_F001_F_H_crv_initiative_v3
  • ODS.ODS_ACC_0000_F001_F_H_crv_initiative_v3
Flow name (ODS to DM) --> Update Table in GCP:
  • F211_DIM_acc_initative --> DIM_acc_initative
  • F212_DIM_acc_linked_prog --> DIM_acc_linked_prog
  • F213_FACT_acc_initiative_cost --> FACT_acc_initiative_cost
Flow name (Join tables) --> Update Table in GCP:
  • F202_DIM_CRV_ZGetProjectDetails --> DIM_acc_project_detail

Source table in Accolade: CRV_ZGetProjectDetails (Main Project Data from standard fields)
Flow name (Source to ODS): F002_CRV_ZGetProjectDetails
Update Table in GCP:
  • STG.STG_ACC_0000_0000_F001_F_H_crv_zgetprojectdetails
  • ODS.ODS_ACC0_0000_F001_F_H_crv_zgetprojectdetails
Flow name (ODS to DM) --> Update Table in GCP:
  • F202_DIM_CRV_ZGetProjectDetails --> DIM_acc_proj_detail

Source table in Accolade: Execute procedure MetricsForGroupID 15 (Status Indicators)
Flow name (Source to ODS): F003_SGM_PROJECT_METRICS_15
Update Table in GCP:
  • STG.STG_ACC_0000_0000_F001_I_H_sgm_projectmetrics_15
  • ODS.ODS_ACC_0000_F001_I_H_sgm_projectmetrics_15
Flow name (ODS to DM) --> Update Table in GCP:
  • F203_DIM_acc_status_indicators --> DIM_acc_status_indicators
  • F209_DIM_acc_impacted_teams --> DIM_acc_impacted_teams

Source table in Accolade: Execute procedure MetricsForGroupID 62 (Project Leader Estimates)
Flow name (Source to ODS): F004_SGM_PROJECT_METRICS_62
Update Table in GCP:
  • STG.STG_ACC_0000_0000_F001_F_H_rsp_qv_getprojectmetricsforgroupid_62
  • ODS.ODS_ACC_0000_F001_F_H_rsp_qv_getprojectmetricsforgroupid_62
Flow name (ODS to DM) --> Update Table in GCP:
  • F204_FACT_acc_proj_leader_estimates --> FACT_acc_proj_leader_estimates

Source table in Accolade: Execute procedure MetricsForGroupID 94 (Risk Analysis)
Flow name (Source to ODS): F005_SGM_PROJECT_METRICS_94
Update Table in GCP:
  • STG.STG_ACC_0000_0000_F001_F_H_rsp_qv_getprojectmetricsforgroupid_94
  • ODS.ODS_ACC_0000_F001_F_H_rsp_qv_getprojectmetricsforgroupid_94
Flow name (ODS to DM) --> Update Table in GCP:
  • F205_DIM_acc_mx_risk_analysis --> DIM_acc_mx_risk_analysis

Source table in Accolade: Execute procedure MetricsForGroupID 103 (Identification & Classification)
Flow name (Source to ODS): F006_SGM_PROJECT_METRICS_103
Update Table in GCP:
  • STG.STG_ACC_0000_0000_F001_F_H_rsp_qv_getprojectmetricsforgroupid_103
  • ODS.ODS_ACC_0000_F001_F_H_rsp_qv_getprojectmetricsforgroupid_103
Flow name (ODS to DM) --> Update Table in GCP:
  • F206_DIM_acc_ident_classification --> DIM_acc_ident_classification
  • F207_DIM_acc_impacted_bus --> DIM_acc_impacted_bus
  • F208_DIM_acc_impacted_zones --> DIM_acc_impacted_zones
  • F210_DIM_acc_impacted_functions --> DIM_acc_impacted_functions



DataSource 1 - Accolade DB

Description

Accolade is the application in which users (mainly project managers) share project data such as status and forecasts.

Tools

Talend collects the data from the source and stores it on GCP (Google BigQuery).

Access rights

Only authorized users can connect to the Accolade application.

Only DA&AI Data Engineers and Data Architects can access the data on GCP (Google BigQuery).

Source

Location

Accolade DB, accessed through a generic Talend account created for this purpose.

Test = acew1twegodb01.nonprod.aws.cloud.solvay.com

Prod = acew1pwegodb01.prod.aws.cloud.solvay.com

Format

MSSQL: direct table reads and stored procedure execution.

Destination

Location

Extracted data will be stored on GCP, in Cloud Storage and Google BigQuery.

Format

The format of the data saved in the databank

Sizing

Expected data volume for :

  • full process from source to staging (as of 6 Dec 2023)

  • incremental process from ODS to DM (as of 6 Dec 2023)

Assessment

Check the log tables log_tables and run_jobs in GCP to confirm that no errors occurred while loading from source to staging/ODS.

Check that the surrogate key is unique in the data mart layer.
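The surrogate key uniqueness check above could be automated along these lines. This is a sketch under stated assumptions: the function names, the `DM` dataset name, and the `sk_initiative` column are hypothetical, and the generated SQL is ordinary BigQuery standard SQL that would still need to be run by whatever client the team uses.

```python
from collections import Counter


def duplicate_key_query(table: str, key_column: str) -> str:
    """Build a SQL statement listing surrogate keys that occur more than once."""
    return (
        f"SELECT {key_column}, COUNT(*) AS occurrences\n"
        f"FROM `{table}`\n"
        f"GROUP BY {key_column}\n"
        f"HAVING COUNT(*) > 1"
    )


def duplicate_keys(keys):
    """Return the keys that appear more than once among already-fetched rows."""
    counts = Counter(keys)
    return sorted(k for k, n in counts.items() if n > 1)


# Hypothetical dataset and column names, for illustration only.
print(duplicate_key_query("prj-data-dm-dt-dev.DM.DIM_acc_initative", "sk_initiative"))
print(duplicate_keys([101, 102, 102, 103, 101]))
```

An empty result from either function means the key is unique in the checked table.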

Scheduling

Is there an automatic schedule? Yes

At what frequency? Data is collected 4 times/day (every 6 hours).

What is the trigger? TMC

Timing

The average time expected for :

  • 4 times/day on working days (Monday to Friday), to be scheduled at 2:00, 8:00, 14:00 and 20:00 CET (monitored by DataOps only at 8:00 and 14:00)
  • full process (source to ODS)
  • incremental process (ODS to DM)
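The run times listed above can be expressed as a small schedule helper, for example to tell which run comes next and whether DataOps monitors it. This is an illustrative sketch only: the function names are hypothetical, datetimes are naive values read as CET, and the actual scheduling is done by the trigger named above, not by this code.

```python
from datetime import datetime, time, timedelta

# Run times in CET, taken from the schedule above.
RUN_TIMES = [time(2, 0), time(8, 0), time(14, 0), time(20, 0)]
# Runs monitored by DataOps.
MONITORED = {time(8, 0), time(14, 0)}


def next_run(now: datetime) -> datetime:
    """Return the first scheduled run strictly after `now` on a working day."""
    day = now.date()
    while True:
        if day.weekday() < 5:  # Monday=0 .. Friday=4
            for run_time in RUN_TIMES:
                candidate = datetime.combine(day, run_time)
                if candidate > now:
                    return candidate
        day += timedelta(days=1)


def is_monitored(run: datetime) -> bool:
    """Return True if the run falls on a slot monitored by DataOps."""
    return run.time() in MONITORED


upcoming = next_run(datetime(2023, 12, 6, 9, 0))  # a Wednesday morning
print(upcoming, is_monitored(upcoming))
```

Runs requested after the last Friday slot roll over to Monday at 2:00, since weekends are skipped.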

Criticality

High / Medium / Low

Logging

Tables log_tables, run_jobs, log_files, and reject_files in `prj-data-dm-dt-[environment].STG.[table]`

