1.0 Overview




Business Context and Application Overview

Business Context: 

The Data Quality Dashboard is implemented on Qlik Sense and is designed to serve multiple organizational domains, including Human Resources (HR), Marketing and Sales, Structured and Shared Services, Finance, Supply Chain, and Procurement.

Key Processes:

The dashboard supports critical data quality management processes across the involved domains. It includes:

  • Data Quality Monitoring: Enables users to continuously track data quality metrics, ensuring the integrity, accuracy, consistency, timeliness, conformity, uniqueness and completeness of data.

  • Failed Data View: Provides users with a centralized view of failed data records, allowing them to identify and review data quality issues that need cleansing (see the sketch below).
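
Both processes reduce, at their core, to executing a rule against a dataset: the monitoring view reports the share of records that pass, and the failed data view lists the records that do not. A minimal, illustrative sketch — the field names and data are hypothetical, not the actual Qlik Sense implementation:

  import pandas as pd

  # Hypothetical customer extract; real data comes from the source systems.
  customers = pd.DataFrame({
      "customer_id": [1, 2, 3, 4],
      "vat_number": ["BE0123", None, "FR0456", None],
  })

  # Data Quality Monitoring: completeness of the VAT number as a percentage.
  passed = customers["vat_number"].notna()
  print(f"VAT completeness: {100 * passed.mean():.1f}%")  # 50.0%

  # Failed Data View: the records that need cleansing.
  print(customers[~passed])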



Application User Profile

The following user profiles have access to the Data Quality Dashboard to monitor data quality metrics and view failed data records:

The Data Steward leads the DQ process, participating in DQ rule creation as well as DQ issue identification and resolution:

  • Defines the DQ rules functionally from the gathered requirements and the profiled data, and completes the rule attributes. If the rule is simple, the steward can also implement it.
  • Continuously tracks data quality metrics across the DQ dimensions: integrity, accuracy, consistency, timeliness, conformity, uniqueness, and completeness.
  • Accesses a centralized view of failed data records to identify data quality issues.
  • Reviews the failed data to proceed with the cleansing process.
  • Refines existing issues or logs new ones, and prioritizes them based on impact/severity.
  • Leads the DQ issue resolution and aligns with stakeholders on the solving strategy; prepares the fixing plan and follows the cleansing/remediation process.
  • Monitors the DQ metrics and assesses impact.

The Rule Owner is responsible for DQ within their scope, ensuring the conformity of the DQ rules and validating the root cause of DQ issues:

  • Reviews the DQ rule: approves or rejects the proposed DQ rule already reviewed by the business.
  • Validates the issue root cause found by the steward, as the person responsible for ensuring that data quality meets the identified targets within their scope.


For more information refer to the data governance model here.

Target Users:

Domain data stewards, data owners, data governance teams, and other stakeholders.

VERSION | DATE | MODIFIED BY | DESCRIPTION
0.01 | dd.mm.yyyy | <Insert name> | Initial draft

Application Type

 

Data Product Type 
  •  Dashboard
  •  Report
  •  Advanced analytics
  •  AI 
  •  Others <specify which one>
Technologies
  •  BW
  •  Tableau
  •  Qliksense
  •  Talend
  •  Dataiku
  •  Others <specify which one>

Data Sources 

Note: list all source applications and their environments.

  •  SAP PF1 (Production environment)
  •  SAP WP1
  •  SAP PI1
  •  BW (versions)
  •  iCare CRM 
  •  CORE CRM
  •  Others: SAP SuccessFactors



2.0 Business Data Quality Process


The Data Quality process and its key activities can be found here.


2.1 Challenge/Opportunities

Data Quality Dashboard Objective/Opportunities 

The primary objective of the Data Quality Dashboard is to empower data stewards and other stakeholders within each domain to maintain high standards of data quality. By implementing automated data quality rules and offering a centralized dashboard for monitoring and reviewing failed data, the dashboard provides data stewards with an opportunity to ensure that data across all domains is accurate, up-to-date, and consistent. This, in turn, supports informed decision-making and operational efficiency across the organization.



3.0 Application Feature Overview


Information about the existing features in the application.

Feature | Description | Latest update in production (DD/MM/YYYY)
N/A | |


4.0 Business Objects


This section should contain a table with the business objects used in the reports with links to the business object definition in LeanIX.  The purpose is to ensure that all DA&AI Products adhere to a centrally maintained list of business objects and definitions to allow us to achieve our digital ambitions.  For any questions about business objects and LeanIX, contact Data Governance or the Enterprise Information Architect.

Data Domain | Business Object (in LeanIX) | Business Object Definition (only use when the object is not yet in LeanIX)
ex: Marketing & Sales | ex: Customer |


5.0 Functional Specification



5.1 Dashboard 

The scope, reload frequency, screens, filters and KPIs are documented in the Wiki Page for DQ QlikSense Documentation.

5.2 Dashboard Access

 

To request access to the application:

  1. Submit a Corporate Dashboard Access Request Form.
  2. Select "DT- Data Quality Monitoring Dashboard" as the dashboard name.
  3. Submit the request.

To access the application:

  1. Once access is granted, go to https://qliksense.solvay.com/hub, then select SBS from the streams menu on the left side.
  2. Select Data Quality Monitoring Tool.

5.3 Rules Definitions & Data Input

Overview:
 

The Key Performance Indicators (KPIs) in the Data Quality Dashboard are defined from data quality rules specified by the data stewards of each domain. The rules define the criteria for evaluating data quality and are used to calculate the KPIs displayed in the dashboard. They are categorized under the data quality dimensions described below to systematically monitor and enhance data quality, and they help identify data quality issues, providing actionable insights for maintaining high data quality standards.
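
As a hedged illustration of how rule results roll up into dashboard KPIs — the counts below are placeholders, and the real calculations live in the Qlik Sense app:

  import pandas as pd

  # Placeholder per-rule results; in the dashboard these come from executing
  # the domain rules listed below against the source systems.
  rule_results = pd.DataFrame({
      "rule_id":   ["MRK-3", "SSR-1", "FIN-5", "HRS-1"],
      "dimension": ["Uniqueness", "Consistency", "Conformity", "Conformity"],
      "failed":    [120, 8, 40, 15],
      "total":     [50000, 3200, 7800, 21000],
  })

  # KPI per rule: share of records passing the rule.
  rule_results["pass_rate_pct"] = 100 * (1 - rule_results["failed"] / rule_results["total"])

  # Roll-up: one KPI per data quality dimension.
  by_dim = rule_results.groupby("dimension")[["failed", "total"]].sum()
  by_dim["pass_rate_pct"] = 100 * (1 - by_dim["failed"] / by_dim["total"])
  print(by_dim)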

The following rules are currently present in the dashboard. 

  1. Marketing & Sales

Rule ID | DQ dimension | Business Name | Functional description | Source Systems | Tables
MRK-3 | Uniqueness | Duplicate customer | Customers with the same name, address, VAT and Account Group. | SAP PF1, SAP WP1 | KNA1
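
A minimal sketch of how a uniqueness rule like MRK-3 can be expressed, assuming a simplified KNA1-like extract (column names are illustrative; the production rule runs on SAP PF1/WP1):

  import pandas as pd

  kna1 = pd.DataFrame({
      "customer_id":   ["C001", "C002", "C003"],
      "name":          ["Acme", "Acme", "Globex"],
      "address":       ["1 Main St", "1 Main St", "9 Side Rd"],
      "vat":           ["BE01", "BE01", "FR02"],
      "account_group": ["Z001", "Z001", "Z002"],
  })

  key = ["name", "address", "vat", "account_group"]
  # keep=False flags every member of a duplicate group, not only the repeats.
  duplicates = kna1[kna1.duplicated(subset=key, keep=False)]
  print(duplicates.sort_values(key))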


  2. Structures & Shared

Rule ID | DQ dimension | Business Name | Functional description | Source Systems | Tables
SSR-1 | Consistency | No active plants linked to obsolete companies | # of active plants linked to obsolete companies / total number of active plants | SAP PF1, SAP WP1 | T001W, T001K
SSR-2 | Consistency | No active material codes connected to obsolete plants | # of active materials in obsolete plants / total number of active materials | SAP PF1, SAP WP1 | MARC, T001W, T001K
SSR-9 | Consistency | No active materials linked to obsolete sales orgs | # of obsolete sales organizations linked to active material(s) / total number of sales organizations linked to active materials in material sales views | SAP PF1, SAP WP1 | T001W, T001K, MVKE, TVKOT
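
The SSR rules are ratio-style consistency checks. A sketch of SSR-1 under simplified assumptions — the real join between T001W and T001K and the actual status flags are defined in the rule's technical specification:

  import pandas as pd

  plants = pd.DataFrame({
      "plant":   ["P01", "P02", "P03"],
      "company": ["C1", "C2", "C1"],
      "active":  [True, True, False],
  })
  companies = pd.DataFrame({
      "company":  ["C1", "C2"],
      "obsolete": [False, True],
  })

  # Numerator: active plants linked to obsolete companies;
  # denominator: all active plants.
  active = plants[plants["active"]].merge(companies, on="company")
  print(f"SSR-1: {100 * active['obsolete'].mean():.1f}% of active plants link to obsolete companies")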


  3. Finance

Rule ID | DQ dimension | Business Name | Functional description | Source Systems | Tables
FIN-1 | Consistency | Active CCs at L4 in the ZCBS hierarchy | The rule checks if all "Active" Cost Centers are at "Level 4" in the ZCBS hierarchy, EXCEPT the Cost Centers in EDISCXX, which are at Level 4 but should be blocked and inactive. | SAP BW | BW_QRY_C_COSTCTR_0001
FIN-3 | Consistency | Accuracy of assigning "Inactive" Cost Centers to the EDISCXX node | The rule checks if all Cost Centers in the EDISCXX node are inactive. | SAP BW | BW_QRY_C_COSTCTR_0001
FIN-4 | Consistency | All cost centers are assigned to an active GBU Cluster | The rule checks if all Cost Centers are in an active GBU Cluster. | SAP BW | BW_QRY_C_COSTCTR_0001
FIN-5 | Conformity | SRM7 responsible codification | The rule checks if the position responsible field of the cost center has 8 digits, with the first 3 digits (left to right) starting with "500". | SAP BW | BW_QRY_C_COSTCTR_0001
FIN-6 | Consistency | All cost centers are assigned to a BSA | The rule checks if a BSA group is assigned to the cost center. | SAP BW | BW_QRY_C_COSTCTR_0001
FIN-7 | Consistency | Cost Centers with Profit Centers | The rule checks if all cost centers have an associated profit center. | SAP BW | BW_QRY_C_COSTCTR_0001
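
FIN-5 is a pattern check that can be captured with a regular expression. A sketch assuming the field arrives as a string (the field name is illustrative):

  import re

  # FIN-5 as stated: exactly 8 digits, the first three being "500".
  SRM7_PATTERN = re.compile(r"500\d{5}")

  def conforms(position_responsible: str) -> bool:
      return bool(SRM7_PATTERN.fullmatch(position_responsible or ""))

  print(conforms("50012345"))  # True: 8 digits starting with 500
  print(conforms("60012345"))  # False: wrong prefix
  print(conforms("5001234"))   # False: only 7 digits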


  4. Human Resources

Rule ID | DQ dimension | Business Name | Functional description | Source Systems | Tables
HRS-1 | Conformity | Legal Entity is active | The status of the object "Legal Entity" to which an active employee is allocated is active. | SuccessFactors | EmpJob, FO.Company
HRS-2 | Conformity | Business Unit is active | The status of the object "Business Unit" to which an active (Employee Status is Active or Active leave) employee is allocated is active. | SuccessFactors | EmpJob, FO.BusinessUnit
HRS-3 | Conformity | Location Group (site) is active | The status of the object "Location Group" to which an active employee is allocated is active. | SuccessFactors | EmpJob, FO.LocationGroup
HRS-4 | Conformity | Location (PA/PSA) is active | The status of the object "Location" to which an active (Employee Status is Active or Active leave) employee is allocated is active. | SuccessFactors | EmpJob, FO.Location
HRS-5 | Conformity | Cost Centre is active | The status of the object "Cost Centre" to which an active (Employee Status is Active or Active leave) employee is allocated is active. | SuccessFactors | EmpJob, FO.CostCenter
HRS-6 | Conformity | Incumbent's Position is active | The status of the object "Position" to which an active employee is allocated is active. | SuccessFactors | EmpJob, Position
HRS-13 | Completeness | Supervisor is assigned | All active people have a supervisor assigned. | SuccessFactors | EmpJob, Position
HRS-14 | Consistency | Cadres have Pay Grade between S15 and S27 | For cadres there is consistency between the Employment Type and the Pay Grade. | SuccessFactors | EmpJob
HRS-18 | Consistency | Cost Center is the same at Position and Job Info | Cost Center is the same at Position and Job Info. | SuccessFactors | EmpJob, Position
HRS-19 | Consistency | Business Unit is the same at Position and Job Info | Business Unit is the same at Position and Job Info. | SuccessFactors | EmpJob, Position
HRS-23 | Accuracy | External workforce have Position Grade "Not applicable" | External Employees, Trainees and Apprentices have Position Grade "Not applicable". | SuccessFactors | EmpJob
HRS-24 | Accuracy | External workforce have Pay Grade "Not applicable" | External Employees, Trainees and Apprentices have Pay Grade "Not applicable". | SuccessFactors | EmpJob
HRS-25 | Accuracy | Sales Cadre have SIP Bonus Plan | Sales Cadre must have a SIP Bonus Plan. | SuccessFactors | EmpJob, EmpCompensation
HRS-28 | Completeness | Cadres always have a bonus plan assigned | Cadres have a bonus plan assigned. | SuccessFactors | EmpJob, EmpCompensation
HRS-29 | Completeness | Cadres have an annual indicative salary | All Cadre employees must have an annual indicative salary in EC. | SuccessFactors | EmpJob, EmpCompensation
HRS-30 | Completeness | Cadres have Annual Salary (9ANS) | All Cadre employees must have Annual Salary (9ANS) recorded in EC. | SuccessFactors | EmpJob, EmpCompensation
HRS-34 | Consistency | Indicative Salary amount is the same as the Annual Salary (9ANS) | For all Cadre employees the Indicative Salary amount must be the same as the Annual Salary (9ANS). | SuccessFactors | EmpJob, EmpCompensation
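
Several HR rules (e.g. HRS-18/HRS-19) compare the same attribute across two SuccessFactors objects. A sketch under simplified assumptions — the field names are illustrative, not the actual EmpJob/Position schemas:

  import pandas as pd

  emp_job = pd.DataFrame({
      "employee_id": ["E1", "E2"],
      "position_id": ["P1", "P2"],
      "cost_center": ["CC10", "CC20"],
  })
  position = pd.DataFrame({
      "position_id": ["P1", "P2"],
      "cost_center": ["CC10", "CC99"],
  })

  merged = emp_job.merge(position, on="position_id", suffixes=("_job", "_pos"))
  # HRS-18 failures: Cost Center differs between Job Info and Position.
  print(merged[merged["cost_center_job"] != merged["cost_center_pos"]])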


Detailed information on the rules, with their functional and technical specifications and the data inputs, is documented in a centralized Google Sheet.


Data Quality Dimensions:


The following are the data quality dimensions, with their definitions, under which the KPIs are grouped to assess the quality of data within Solvay.

Dimension | Definition
Accuracy | Degree to which data correctly reflects the real world
Completeness | Achieved when all the data required for a particular use is present and available to be used
Conformity | Achieved when the data conforms to a pre-defined business rule/syntax (e.g. format, type or range)
Consistency | Achieved when data values do not conflict with other values within a record or across different data sets and sources
Integrity | Ensures that all the data in a database can be traced and connected to other data; degree to which a defined relational constraint is implemented between two data sets
Timeliness | Indicates whether the data is available when expected and needed, and represents reality from the required point in time (degree to which specified data values are up to date between data change and processing)
Uniqueness | Measures the number of unique values and highlights if there are any data duplicates
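
To make the dimensions concrete, a toy sketch with one miniature predicate per selected dimension (single-record checks for illustration only; the real rules run on full source extracts):

  from datetime import date, timedelta

  # Hypothetical record; field names are illustrative.
  record = {
      "vat": "BE0123456789",
      "pay_grade": "S16",
      "last_updated": date.today() - timedelta(days=3),
  }

  checks = {
      "Completeness": record["vat"] is not None,                 # required value is present
      "Conformity": record["pay_grade"].startswith("S"),         # matches the expected syntax
      "Timeliness": date.today() - record["last_updated"] <= timedelta(days=7),
  }
  for dimension, passed in checks.items():
      print(f"{dimension}: {'pass' if passed else 'fail'}")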

6.0 System view (Architecture)


The system view (architecture) can be found in the technical documentation.



7.0 Non-functional Descriptions 



7.1 Security

The dashboard is secured against unauthorized access; access is granted only to authorized users.

7.2 Refresh of the Data

The data is refreshed weekly, every Monday.
