Business Context: The dashboard supports critical data quality management processes across the involved domains. It includes:
User Profiles: the following profiles can access the Data Quality Dashboard to monitor data quality metrics and view failed data records:
Rule Owner: responsible for DQ within their scope, ensuring the conformity of the DQ rules and identifying the root cause of DQ issues.
Target Users: Domain data stewards, data governance teams, and other stakeholders.
The Data Quality process and its key activities can be found here.
The primary objective of the Data Quality Dashboard is to empower data stewards and other stakeholders within each domain to maintain high standards of data quality. By implementing automated data quality rules and offering a centralized dashboard for monitoring and reviewing failed data, the dashboard provides data stewards with an opportunity to ensure that data across all domains is accurate, up-to-date, and consistent. This, in turn, supports informed decision-making and operational efficiency across the organization.
Information about the existing features in the application.
| Feature | Description | Latest update in production (DD/MM/YYYY) |
|---|---|---|
This section should contain a table with the business objects used in the reports with links to the business object definition in LeanIX. The purpose is to ensure that all DA&AI Products adhere to a centrally maintained list of business objects and definitions to allow us to achieve our digital ambitions. For any questions about business objects and LeanIX, contact Data Governance or the Enterprise Information Architect.
| Data Domain | Business Object (in LeanIX) | Business Object Definition (only use when the object is not yet in LeanIX) |
|---|---|---|
| ex: Marketing & Sales | ex: Customer | |
The scope, reload frequency, screens, filters, and KPIs are documented in the Wiki Page for DQ QlikSense Documentation.
The Key Performance Indicators (KPIs) within the Data Quality Dashboard are defined based on data quality rules specified by data stewards from each domain. The rules define the criteria for evaluating the quality of data and are used to calculate the KPIs displayed in the dashboard. These rules are categorized under various data quality dimensions to systematically monitor and enhance data quality, and help in identifying data quality issues, thereby providing actionable insights to maintain high data quality standards.
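As an illustration of how a rule-based KPI could be derived, the sketch below assumes a KPI is simply the percentage of checked records that pass a rule, optionally aggregated per data quality dimension. The `RuleResult` structure, its field names, and the pass-rate formula are assumptions made for this example; the formulas actually used in the dashboard are those documented in the QlikSense wiki page.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class RuleResult:
    """Outcome of one rule run (hypothetical structure for this sketch)."""
    rule_id: str
    dimension: str
    records_checked: int
    records_failed: int


def pass_rate(checked: int, failed: int) -> Optional[float]:
    """Percentage of checked records that passed, or None if nothing was checked."""
    if checked == 0:
        return None
    return round(100.0 * (checked - failed) / checked, 2)


def pass_rate_by_dimension(results: Iterable[RuleResult]) -> Dict[str, Optional[float]]:
    """Sum checked and failed records per dimension, then compute one pass rate each."""
    totals: Dict[str, list] = {}
    for r in results:
        checked_failed = totals.setdefault(r.dimension, [0, 0])
        checked_failed[0] += r.records_checked
        checked_failed[1] += r.records_failed
    return {dim: pass_rate(c, f) for dim, (c, f) in totals.items()}


if __name__ == "__main__":
    # Made-up figures, for illustration only.
    sample = [
        RuleResult("MRK-3", "Uniqueness", 200, 5),
        RuleResult("EX-1", "Uniqueness", 100, 10),  # hypothetical rule ID
    ]
    print(pass_rate_by_dimension(sample))
```

Aggregating the raw counts before dividing weights each rule by the number of records it checked; averaging the per-rule percentages instead would be an equally plausible design and gives different numbers.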
The following rules are currently present in the dashboard:
| Rule ID | DQ dimension | Rule Business Name | Rule Functional description |
|---|---|---|---|
| MRK-3 | Uniqueness | Duplicate customer | Customer with the same name and same address and same VAT |
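To make the functional description of MRK-3 concrete, the following sketch groups customer records by name, address, and VAT and reports every group that contains more than one record. The field names (`customer_id`, `name`, `address`, `vat`) and the case- and whitespace-insensitive comparison are assumptions for illustration; the authoritative technical specification of the rule is the one kept in the rule documentation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def normalize(value: object) -> str:
    """Collapse whitespace and ignore case (an assumed matching policy)."""
    return " ".join(str(value).split()).casefold()


def find_duplicate_customers(customers: Iterable[Dict[str, str]]) -> List[List[str]]:
    """Return groups of customer IDs sharing the same name, address, and VAT."""
    groups = defaultdict(list)
    for c in customers:
        key = (normalize(c["name"]), normalize(c["address"]), normalize(c["vat"]))
        groups[key].append(c["customer_id"])
    # Only groups with two or more records violate the uniqueness rule.
    return [ids for ids in groups.values() if len(ids) > 1]


if __name__ == "__main__":
    # Made-up records, for illustration only.
    sample = [
        {"customer_id": "C1", "name": "Acme SA", "address": "1 Main St", "vat": "BE0123"},
        {"customer_id": "C2", "name": "ACME  SA", "address": "1 main st", "vat": "be0123"},
        {"customer_id": "C3", "name": "Acme SA", "address": "2 Main St", "vat": "BE0123"},
    ]
    print(find_duplicate_customers(sample))
```

Records matching on only one or two of the three attributes (such as `C3` above) are not flagged, since the rule requires all three to be identical.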
The rules, with their functional and technical specifications and their data inputs, are documented in a centralized Google Sheet.
The following are the data quality dimensions with their definitions under which the KPIs are grouped to assess the quality of data within Solvay.
| Dimension | Definition |
|---|---|
| Accuracy | Degree to which data correctly reflects the real world |
| Completeness | Achieved when all the data required for a particular use is present and available to be used |
| Conformity | Achieved when the data conforms to a pre-defined business rule/syntax (e.g. format, type or range) |
| Consistency | Achieved when data values do not conflict with other values within a record or across different data sets and sources |
| Integrity | Ensures that all the data in a database can be traced and connected to other data (degree to which a defined relational constraint is implemented between two data sets) |
| Timeliness | Indicates whether the data is available when expected and needed and represents reality from the required point in time (degree to which specified data values are up to date between data change and processing) |
| Uniqueness | Measures the number of unique values and highlights if there are any data duplicates |
| Graph name | Description | Calculations/Measures/Rules (if applicable) | Scope / Filters | Graph picture |
|---|---|---|---|---|
The purpose of this part is to describe the physical components that support the functionalities of the product. From that point of view, this part should capture and visualize the physical components of the data product, such as backend, front end, data providers, libraries for ML models, etc.
Usability is about the ease with which a User can learn to start using the solution and the ease with which they can use the system. In addition to ease of learning and ease of use, usability also includes areas such as ease of recall, error avoidance and handling, and accessibility, among others; e.g., 99% of metadata entry Users who have used the Maintenance Dashboard should be able to change filters, extract, etc., when required. Maintenance data will be centrally stored in the Google Cloud platform, where it will be available to other applications and dashboards if needed.
Software systems must comply with legal and regulatory requirements (e.g., GDPR), which can vary by country, organisation, industry, and/or region. The software systems must be secure from unauthorized access. The Maintenance Dashboard will comply with Solvay’s regulations and compliance requirements, e.g., access is only granted to authorized Users.
Security refers to essential aspects that assure a solution and its components will be protected against unauthorized access or malware attacks. Important considerations related to the security of a system are User authentication, User authorization or User access privileges, data theft, malware attacks, data encryption, and maintaining audit trails; e.g., only Users with administrator access shall be able to create new accounts and assign data access privileges to the new accounts.
Performance defines how fast a software system or a particular section of it responds to certain User actions under a certain workload. In most cases, this metric explains how long a User must wait before the target operation happens (e.g., the page renders, a transaction is processed, etc.), given the current overall number of Users. Performance requirements may also describe background processes invisible to Users, e.g., backups and the speed of data transfers.
Reliability is the ability of a solution or its component to perform its required functions without failure under predefined conditions for a specified time period. Reliability can be specified in terms of the average time the system runs before a failure occurs, the percentage of operations completed successfully within a time period, the maximum acceptable failure probability, or the number of failures within a period. Reliability aspects are in reference to (but not limited to) the criteria under which the system is considered reliable, the classification of reliability-defining failures vs. regular failures, and the impact of failure on business operations. The Maintenance Dashboard will display data from the previous refresh of data.
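Two of the reliability measures mentioned above can be expressed as simple ratios. The sketch below is illustrative only: the function names and the sample figures are assumptions for this example, not measured values for the dashboard.

```python
from typing import Optional


def mean_time_between_failures(operating_hours: float, failure_count: int) -> Optional[float]:
    """Average operating time per failure; None when no failure was observed."""
    if failure_count == 0:
        return None
    return operating_hours / failure_count


def success_percentage(completed_ok: int, attempted: int) -> Optional[float]:
    """Share of operations completed successfully, as a percentage."""
    if attempted == 0:
        return None
    return 100.0 * completed_ok / attempted


if __name__ == "__main__":
    # Made-up figures: 720 operating hours with 3 failures,
    # and 9,990 successful operations out of 10,000 attempted.
    print(mean_time_between_failures(720, 3))
    print(success_percentage(9990, 10000))
```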
Scalability refers to the degree to which a solution can evolve to handle increased amounts of work. The increased amount of work could be in terms of the user base, transactions, data, network traffic, or other factors e.g., the system should be able to handle an additional load of a maximum of 5,000 Users every month for the next 6 months without any noticeable performance impacts.
Interoperability is the degree to which the solution is compatible with other components. It is a measure of how effectively the system interoperates with other software systems and how easily it integrates with external hardware devices.
Interoperability aspects to be discussed during elicitation are in reference to (but not limited to) software systems to be interfaced with, along with the data / messages to be exchanged and any standard data formats, hardware components to be integrated with, and any standard communication protocols to be followed; e.g., the Order Management system will push the order file to a secured file transfer protocol server, from where it will be loaded into the system through a daily job. To guarantee interoperability between the Google Cloud platform and SAP BW Queries (e.g., BW_QRY_MVPMOR01_0002), Solvay has introduced a new tool called Xtract.
Availability is the degree to which the solution is operable and accessible when required. It is a measure of the time during which the system is fully operational, i.e., available for use, and is sometimes included in a Service Level Agreement (SLA) given its criticality to the business; e.g., the system shall be at least 99% available on weekdays between 09:00 and 18:30 Central European Time (CET).
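An availability target of this kind translates directly into a downtime budget. As a worked example, the sketch below computes the downtime allowed within the service window for the illustrative 99% target above, assuming the window covers five weekdays per week; the function name and parameters are hypothetical.

```python
from datetime import time


def allowed_downtime_minutes(availability_pct: float,
                             window_start: time,
                             window_end: time,
                             days: int) -> float:
    """Downtime budget, in minutes, inside the service window over `days` days."""
    start_minutes = window_start.hour * 60 + window_start.minute
    end_minutes = window_end.hour * 60 + window_end.minute
    window_minutes = end_minutes - start_minutes  # minutes of service per day
    total_minutes = window_minutes * days
    return total_minutes * (100 - availability_pct) / 100


if __name__ == "__main__":
    # 09:00 to 18:30 is 570 minutes per day, so 2,850 minutes per 5-day week.
    # A 99% target leaves 1% of that window as downtime budget.
    print(allowed_downtime_minutes(99, time(9, 0), time(18, 30), 5))  # 28.5
```

In other words, a 99% weekday target tolerates at most 28.5 minutes of unavailability per week within the stated hours; outages outside the window do not count against the budget.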
Frequency, date, and time of the data refresh in the data product.