1.0 Overview




Business Context: 

The Data Quality Dashboard is implemented in Qlik Sense and is designed to serve multiple organizational domains, including Human Resources (HR), Marketing and Sales, Structured and Shared Services, Finance, Supply Chain, and Procurement.

Key Processes:

The dashboard supports critical data quality management processes across the involved domains. It includes:

  • Data Quality Monitoring: Enables users to continuously track data quality metrics, ensuring the integrity, accuracy, consistency, timeliness, conformity, uniqueness and completeness of data.

  • Failed Data View: Provides users with a centralized view of failed data records, allowing them to identify and review data quality issues that need cleansing.


User profiles with access to the Data Quality Dashboard to monitor data quality metrics and view failed data records:

Data Stewards lead the DQ process by participating in DQ rule creation as well as DQ issue identification and resolution. They:

  • Define the DQ rules functionally from the gathered requirements and the profiled data, and complete the rule attributes. If a rule is simple, the steward can also implement it.
  • Continuously track data quality metrics across various DQ dimensions such as integrity, accuracy, consistency, timeliness, conformity, uniqueness, and completeness.
  • Access a centralized view of failed data records to identify data quality issues.
  • Review the failed data to proceed with the cleansing process.
  • Refine existing issues or log new ones where applicable, and prioritize them based on impact/severity.
  • Lead DQ issue resolution and align with stakeholders on the resolution strategy. Prepare the fixing plan and follow the cleansing/remediation process.

  • Monitor the DQ metrics and assess impact.

The Rule Owner is responsible for DQ within their scope, ensuring the conformity of the DQ rules and validating the root cause of DQ issues. They:

  • Review the DQ rule: approve or reject the proposed DQ rule already reviewed by the business.

  • Validate the issue root cause found by the steward, being responsible for ensuring that data quality meets the identified targets within their scope.

Target Users:

Domain data stewards, data governance teams, and other stakeholders.

 

Data Product Type 
  • Dashboard
  • Report
  • Advanced analytics
  • AI 
  • Others <specify which one>
Technologies
  • BW
  • Tableau
  • Qlik Sense
  • Talend
  • Dataiku
  • Others <specify which one>

Data Sources 

Note: list of all source applications and their environments.

  • SAP PF1 (Production environment)
  • SAP WP1
  • SAP PI1
  • BW (versions)
  • iCare CRM 
  • CORE CRM
  • Others (SuccessFactors)



2.0 Data Quality Process


The Metro Map provides a streamlined overview of the key activities involved in creating and operating a DQ improvement initiative, using analytical capabilities and an enabling DQ monitoring dashboard.




2.1 Data Quality Dashboard Objective/Opportunities 

The primary objective of the Data Quality Dashboard is to empower data stewards and other stakeholders within each domain to maintain high standards of data quality. By implementing automated data quality rules and offering a centralized dashboard for monitoring and reviewing failed data, the dashboard provides data stewards with an opportunity to ensure that data across all domains is accurate, up-to-date, and consistent. This, in turn, supports informed decision-making and operational efficiency across the organization.


3.0 Application Feature Overview


Information about the existing features in the application.


Feature | Description | Latest update in production (DD/MM/YYYY)










4.0 Business Objects


This section should contain a table with the business objects used in the reports with links to the business object definition in LeanIX.  The purpose is to ensure that all DA&AI Products adhere to a centrally maintained list of business objects and definitions to allow us to achieve our digital ambitions.  For any questions about business objects and LeanIX, contact Data Governance or the Enterprise Information Architect.

Data Domain | Business Object (in LeanIX) | Business Object Definition (only use when the object is not yet in LeanIX)
ex: Marketing & Sales | ex: Customer |







5.0 Functional Specification


5.1 Dashboard 

See the Wiki page for the DQ Qlik Sense documentation.

5.2 KPI Definitions & Data Input

Overview:
 

The Key Performance Indicators (KPIs) within the Data Quality Dashboard are defined based on data quality rules specified by data stewards from each domain. The rules define the criteria for evaluating the quality of data and are used to calculate the KPIs displayed in the dashboard. These rules are categorized under various data quality dimensions to systematically monitor and enhance data quality, and help in identifying data quality issues, thereby providing actionable insights to maintain high data quality standards.

The rules, with their functional and technical specifications and the data inputs, are documented in a centralized Google Sheet.

Data Quality Dimensions:



The following data quality dimensions, with their definitions, are used to group the KPIs that assess the quality of data within Solvay.
Dimension | Definition
Accuracy | Degree to which data correctly reflects the real world
Completeness | Achieved when all the data required for a particular use is present and available to be used
Conformity | Achieved when the data conforms to a pre-defined business rule/syntax (e.g. format, type, or range)
Consistency | Achieved when data values do not conflict with other values within a record or across different data sets and sources
Integrity | Ensures that all the data in a database can be traced and connected to other data; degree to which a defined relational constraint is implemented between two data sets
Timeliness | Indicates whether the data is available when expected and needed and represents reality from the required point in time (degree to which specified data values are up to date between data change and processing)
Uniqueness | Measures the number of unique values and highlights if there are any data duplicates
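As an illustration of how a DQ rule maps to a KPI, the sketch below expresses rules as per-record predicates and a KPI as the pass rate over the checked records, with failing records feeding a failed-data view. This is a minimal Python sketch, not the production Qlik Sense/Talend implementation; the field names, records, and rules are hypothetical.

```python
# Illustrative only: hypothetical records and rules, not Solvay's actual data.
import re

records = [
    {"customer_id": 1, "email": "a@example.com", "country": "BE"},
    {"customer_id": 2, "email": "not-an-email",  "country": "FR"},
    {"customer_id": 2, "email": None,            "country": "XX"},
]

# Conformity: value matches a pre-defined syntax (here, a simple e-mail pattern).
def conforms_email(rec):
    return bool(rec["email"]) and re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", rec["email"]) is not None

# Completeness: required field is present and non-empty.
def complete_email(rec):
    return rec["email"] not in (None, "")

# Uniqueness: share of key values that occur exactly once in the data set.
def unique_ids(recs):
    ids = [r["customer_id"] for r in recs]
    return sum(1 for i in ids if ids.count(i) == 1) / len(ids)

# KPI = pass rate of a rule over the checked records.
def pass_rate(rule, recs):
    return sum(1 for r in recs if rule(r)) / len(recs)

kpis = {
    "conformity_email": pass_rate(conforms_email, records),
    "completeness_email": pass_rate(complete_email, records),
    "uniqueness_customer_id": unique_ids(records),
}
# Records failing a rule would populate the Failed Data View.
failed = [r for r in records if not conforms_email(r)]
```

The same pattern generalizes to any dimension: define the rule as a predicate, compute the pass rate as the KPI, and expose the failing records for cleansing.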

5.3 Visualization

Graph Name | Description | Calculations/Measures/Rules (if applicable) | Scope / Filters | Graph Picture






  • Additional Information


5.4 Advanced Analytics

If it already exists, add the link to the wiki page of the user documentation.

5.4.1 Data Input

  • Data Sources (including the nature of the data and what it is needed for)
    • Business data (actuals, facts, customer data, transactions…)
    • Reference data (hierarchies, lookup tables… )
  • Transformation Rules (for each of the data source in the previous point)
    • Extraction rules and filters
    • Exception handling rules (how do we handle when data does not come in the format we need) 
    • Enrichments (normally joins) 
    • Aggregation rules

  • If data is sourced from the data ocean, the Multidimensional modeling of the data at conceptual level including fact and dimension tables to be captured or the link to the documentation of the corresponding data mart to be provided. 
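The four transformation-rule types listed above (extraction/filters, exception handling, enrichment joins, aggregation) can be sketched end to end as follows. This is an illustrative Python example on made-up records; the source fields, reference data, and default-to-zero exception policy are hypothetical, and the real pipelines run on the technologies listed earlier.

```python
# Hypothetical source and reference data, for illustration only.
raw = [
    {"plant": "BE01", "amount": "100.5"},
    {"plant": "BE01", "amount": "n/a"},   # malformed: exception handling applies
    {"plant": "FR02", "amount": "40"},
    {"plant": None,   "amount": "10"},    # dropped by the extraction filter
]

ref_plants = {"BE01": "Belgium", "FR02": "France"}  # reference data (lookup table)

# 1. Extraction rule / filter: keep only records with a known plant key.
extracted = [r for r in raw if r["plant"] in ref_plants]

# 2. Exception handling: unparsable amounts default to 0.0 (and would be logged).
def to_float(value):
    try:
        return float(value)
    except (TypeError, ValueError):
        return 0.0

# 3. Enrichment (join): attach the country name from the reference data.
enriched = [
    {**r, "amount": to_float(r["amount"]), "country": ref_plants[r["plant"]]}
    for r in extracted
]

# 4. Aggregation rule: total amount per country.
totals = {}
for r in enriched:
    totals[r["country"]] = totals.get(r["country"], 0.0) + r["amount"]
```

Documenting each source against these four rule types makes the pipeline reproducible regardless of the tool used to implement it.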

5.4.2 Pre-processing

Includes the details of the operations performed to turn the raw data into a data set that is usable for analytics. Data cleaning, data reduction, data transformation, and data integration are types of preparation tasks.
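A minimal sketch of those four preparation task types, on hypothetical records from two made-up sources; real pre-processing would run in the ETL layer, not in application code like this.

```python
# Hypothetical inputs, for illustration only.
source_a = [{"id": "  001", "name": "ACME ", "legacy_code": "X1"}]
source_b = [{"id": "001", "segment": "Chemicals"}]

# Data cleaning: trim whitespace and normalize casing.
cleaned = [{"id": r["id"].strip(), "name": r["name"].strip().title(),
            "legacy_code": r["legacy_code"]} for r in source_a]

# Data reduction: drop attributes not needed for the analysis.
reduced = [{k: v for k, v in r.items() if k != "legacy_code"} for r in cleaned]

# Data transformation: cast the identifier to a canonical form (no leading zeros).
transformed = [{**r, "id": r["id"].lstrip("0") or "0"} for r in reduced]

# Data integration: combine records from the two sources on the shared key.
index_b = {r["id"].lstrip("0") or "0": r for r in source_b}
integrated = [
    {**r, **{k: v for k, v in index_b.get(r["id"], {}).items() if k != "id"}}
    for r in transformed
]
```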

5.4.3 Modelling

Analytics type: identify and list the key features or functionalities that the application offers (descriptive, predictive, prescriptive, ...). These could include recommendation systems, predictive analytics, natural language processing, image recognition, etc.

End-to-end pipelines: procedures and calculation steps that make up the advanced analytics pipelines.

Algorithm: machine learning algorithms, data mining techniques, or mathematical modeling used for each step (high level). It can include the type of model (e.g., regression, classification, neural network) and the algorithm used.

Data flow: Input and output of each step 

5.4.4 Results

User flow:

If applicable, use diagrams or flowchart tools to create visual representations of the user flows. Start with a high-level overview of the flow and then drill down into more detailed flows for specific tasks or features.

If applicable, indicate where users provide input to the machine learning model. This could involve uploading data, entering text, making selections, or interacting with visualizations.

If applicable, consider how user feedback or actions can impact the machine learning model or the user flow. For example, if users can provide feedback on model predictions, include this in the flow.

Output:

 Capture where the machine learning model provides outputs or results to users. This might involve displaying recommendations, predictions, insights, or visualizations generated by the model.


5.5 Artificial Intelligence

If it already exists, add the link to the wiki page of the user documentation.


6.0 System view (Architecture)


The purpose of this part is to describe the physical components that support the functionalities of the product. From that point of view, this part should capture and visualize the physical components of the data product, such as the backend, frontend, data providers, libraries for ML models, etc.



7.0 Non-functional Descriptions 


Please populate the relevant section and delete those that are not applicable.

7.1 Usability

Usability is about the ease with which a User can learn to start using the solution and the ease with which they can use the system. In addition to ease of learning and ease of use, usability also includes areas such as ease of recall, error avoidance and handling, and accessibility, among others; e.g., 99% of metadata entry Users who use the Maintenance Dashboard should be able to change filters, extract data, etc., when required. Maintenance data will be centrally stored in the Google Cloud platform, which will be available to other applications, e.g., dashboards, if needed.

7.2 Regulatory Compliance

Software systems must comply with legal and regulatory requirements, e.g., GDPR; these can change depending on country, organisation, industry, and/or region. The software systems must be secure from unauthorized access. The Maintenance Dashboard will comply with Solvay's regulations and compliance requirements, e.g., access only granted to authorized Users.

7.3 Security

Security refers to essential aspects that assure a solution and its components will be protected against unauthorized access or malware attacks. Important considerations related to the security aspects of a system are User authentication, User authorization or User access privileges, data theft, malware attacks, data encryption, and maintaining audit trails, e.g., only Users with administrator access shall be able to create new accounts and assign data access privileges to the new accounts. For example:

  • All data will be encrypted in the dashboard
  • Only authorised Users / Administrative Users will be able to access data.
  • Maintenance data will be split between either SCO or ECO, and Users will only have authority over one Entity's data.

7.4 Performance

Performance defines how fast a software system, or a particular section of it, responds to certain User actions under a certain workload. In most cases, this metric explains how long a User must wait before the target operation happens, e.g., the page renders or a transaction is processed, given the overall number of Users at that moment. Performance requirements may also describe background processes invisible to Users, e.g., backups and the speed of data transfers.

7.5 Reliability

Reliability is the ability of a solution or its component to perform its required functions without failure under predefined conditions for a specified time/period. Reliability can be specified in terms of the average time the system runs before a failure occurs, the percentage of operations completed successfully within a time/period, the maximum acceptable failure probability, or the number of failures within a period. Reliability aspects are in reference to (but not limited to) the evaluation of the system to be considered reliable, the classification of reliability-defining failures vs. regular failures, and the impact of failure on business operations. The Maintenance Dashboard will display data from the previous refresh of data.

7.6 Scalability

Scalability refers to the degree to which a solution can evolve to handle increased amounts of work.  The increased amount of work could be in terms of the user base, transactions, data, network traffic, or other factors e.g., the system should be able to handle an additional load of a maximum of 5,000 Users every month for the next 6 months without any noticeable performance impacts.  

7.7 Compatibility

Interoperability is the degree to which the solution is compatible with other components.  It is a measure of how effectively the system interoperates with other software systems and how easily it integrates with external hardware devices.

Interoperability aspects to be discussed during elicitation are in reference to (but not limited to) software systems to be interfaced with, along with data/messages to be exchanged and any standard data formats, hardware components to be integrated with, and any standard communication protocols to be followed; e.g., the Order Management system will push the order file onto a secured file transfer protocol server, from where it will be loaded into the system through a daily job. To guarantee interoperability between the Google Cloud platform and SAP BW queries, e.g., BW_QRY_MVPMOR01_0002, Solvay has introduced a new tool called Xtract.

7.8 Availability

Availability is the degree to which the solution is operable and accessible when required. It is a measure of the time during which the system is fully operational, i.e., available for use, and is sometimes included in a Service Level Agreement (SLA) considering its criticality to the business, e.g., the system shall be at least 99% available on weekdays between 09:00 and 18:30 Central European Time (CET).
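As a worked example of such an SLA, a 99% availability target over a 09:00 to 18:30 weekday window translates into the following weekly outage budget (the window and percentage are taken from the example above; the budget calculation itself is illustrative):

```python
# Example SLA: 99% availability on weekdays between 09:00 and 18:30 CET.
minutes_per_day = 9.5 * 60            # 09:00-18:30 is 9.5 hours = 570 minutes
weekly_window = minutes_per_day * 5   # 5 weekdays -> 2850 minutes per week
weekly_downtime_budget = weekly_window * (1 - 0.99)  # 1% outage budget
# roughly 28.5 minutes of tolerated unavailability per working week
```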

7.9 Refresh of the Data

Frequency, date, and time of the data refresh in the data product.