Jira: SYSM-353 (https://syensqo.atlassian.net/browse/SYSM-353)

Info
  • Naming Conventions for Tables, Columns and Data Structures
  • Platform Context: Microsoft Fabric (Lakehouse / Warehouse)

...

  • application contains raw ingested data.
  • platform contains cleaned and standardized data.
  • entreprise contains business-ready datasets optimized for analytics.
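Assuming the three layer names double as schema prefixes (a sketch; the page defines the layers but not this helper, and the function name is hypothetical), the convention can be made checkable with a few lines of Python:

```python
# Layer names come from this convention; the helper itself is illustrative.
LAYERS = ("application", "platform", "entreprise")

def qualified_name(layer: str, table: str) -> str:
    """Return '<layer>.<table>', rejecting layers outside the convention."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    return f"{layer}.{table}"

print(qualified_name("platform", "customer"))  # platform.customer
```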

 2.2 One Object = One Concept

...

  • _country_cd → Country code based on a standard reference (e.g., ISO).
    Example: country_cd = "FR"
  • _region_cd → Region or state code.
    Example: region_cd = "IDF"
  • _city → City name.
    Example: city = "Paris"
  • _postal_cd → Postal or ZIP code.
    Example: postal_cd = "75001"

...

7. Column Type Standards


Suffix / Prefix      Expected Type
_id                  string or integer
_key                 integer
_ref                 string
_cd                  string
_type                string
_category            string
_segment             string
_level_cd            string
_name                string
_desc                string
_label               string
_dt                  date
_ts                  timestamp
_time                time
_amt                 decimal
_cost_amt            decimal
_price_amt           decimal
_rev_amt             decimal
_tax_amt             decimal
_disc_amt            decimal
_qty                 decimal or integer
_cnt                 integer
_vol                 decimal
_wt                  decimal
_len                 decimal
_pct                 decimal
_rate                decimal
_ratio               decimal
_idx                 decimal or integer
is_                  boolean
has_                 boolean
_status_cd           string
_status_desc         string
_hash                string
_version             string
_seq                 integer
_event               string
_country_cd          string
_region_cd           string
_city                string
_postal_cd           string
created_ts           timestamp
updated_ts           timestamp
load_ts              timestamp
pipeline_run_id      string
record_source        string
row_hash             string
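A subset of these rules can be encoded as data and checked automatically. The sketch below is illustrative (it covers only a handful of the documented suffixes, and the matching logic is an assumption, not part of the standard):

```python
# Illustrative suffix/prefix-to-type rules taken from the table above (subset).
SUFFIX_TYPES = {
    "_id": {"string", "integer"},
    "_key": {"integer"},
    "_cd": {"string"},
    "_name": {"string"},
    "_amt": {"decimal"},
    "_qty": {"decimal", "integer"},
    "_cnt": {"integer"},
    "_pct": {"decimal"},
    "_dt": {"date"},
    "_ts": {"timestamp"},
}
PREFIX_TYPES = {"is_": {"boolean"}, "has_": {"boolean"}}

def expected_types(column: str):
    """Return the allowed types for a column name, or None if no rule matches."""
    for prefix, types in PREFIX_TYPES.items():
        if column.startswith(prefix):
            return types
    # Longest suffix first, so compound suffixes like _status_cd would win.
    for suffix in sorted(SUFFIX_TYPES, key=len, reverse=True):
        if column.endswith(suffix):
            return SUFFIX_TYPES[suffix]
    return None

print(expected_types("order_amt"))  # {'decimal'}
print(expected_types("is_active"))  # {'boolean'}
```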


...

8. Mandatory Metadata Columns

...

  • _ingestion_ts (timestamp): Timestamp when the record was ingested into the platform. Used to track ingestion time and support replay or debugging of ingestion pipelines. Example: 2025-03-12 14:25:32
  • _batch_id (string): Identifier of the ingestion batch or pipeline execution that loaded the record. Useful for pipeline monitoring and troubleshooting. Example: batch_20250312_01
  • _source_system (string): Name of the source system from which the record originated. Helps identify data lineage and upstream systems. Example: sap
  • _source_file (string): Name or path of the file (or source object) used to ingest the record, when file-based ingestion is used. Enables traceability back to the original data file. Example: sap_sales_20250312.csv
  • _record_hash (string): Hash value representing the content of the record. Used to detect changes between loads or identify duplicates efficiently. Example: a9f4b12c89d2
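One way these columns might be populated in an ingestion step is sketched below. SHA-256 via hashlib and the helper name are assumptions; the page mandates the columns, not the implementation:

```python
import hashlib
from datetime import datetime, timezone

def with_ingestion_metadata(record: dict, batch_id: str, source_system: str,
                            source_file: str) -> dict:
    """Return a copy of the record with the mandatory ingestion columns added."""
    # Deterministic serialization of the record content for hashing.
    payload = "|".join(f"{k}={record[k]}" for k in sorted(record))
    return {
        **record,
        "_ingestion_ts": datetime.now(timezone.utc),
        "_batch_id": batch_id,
        "_source_system": source_system,
        "_source_file": source_file,
        "_record_hash": hashlib.sha256(payload.encode()).hexdigest(),
    }

row = with_ingestion_metadata({"customer_id": 42}, "batch_20250312_01",
                              "sap", "sap_sales_20250312.csv")
```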

...

Platform Layer

  • created_ts (timestamp): Timestamp when the record was first created in the platform layer. Helps identify when the data became available in the curated layer. Example: 2025-03-12 14:25:32
  • updated_ts (timestamp): Timestamp of the latest update applied to the record in the platform layer. Useful for incremental processing and debugging transformations. Example: 2025-03-12 16:02:01
  • row_hash (string): Hash calculated from the business columns of the row. Used to detect if the data has changed between loads. Example: 8fa21a3b5d
  • record_source (string): Source system or pipeline that generated the record in the platform layer. Used for lineage and audit. Example: sap_sales_pipeline
  • is_deleted (boolean): Logical deletion flag used when the source record is removed or marked as inactive. Enables soft deletion instead of physical removal. Example: false
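The row_hash mechanic described above (hash only the business columns, ignore technical ones) can be sketched like this; the algorithm and the column names are assumptions for illustration:

```python
import hashlib

def row_hash(record: dict, business_columns: list) -> str:
    """Hash only the business columns, ignoring technical/metadata columns."""
    payload = "|".join(str(record.get(c)) for c in business_columns)
    return hashlib.sha256(payload.encode()).hexdigest()

BUSINESS_COLS = ["customer_id", "customer_name"]  # hypothetical columns
old = {"customer_id": 1, "customer_name": "Acme", "updated_ts": "2025-03-12"}
new = {"customer_id": 1, "customer_name": "Acme Corp"}
# A differing hash signals a real business change, so updated_ts would be set.
changed = row_hash(old, BUSINESS_COLS) != row_hash(new, BUSINESS_COLS)
```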

...

Entreprise Layer

  • created_ts (timestamp): Timestamp when the record was first created in the entreprise dataset. Useful for lineage and auditing. Example: 2025-03-12 17:30:10
  • updated_ts (timestamp): Timestamp of the latest modification applied to the record. Helps identify refresh cycles of the dataset. Example: 2025-03-13 02:00:00

...

Other Technical Standards

  • load_ts (timestamp): Timestamp when the record was loaded into the table. Example: 2025-03-13 02:00:00
  • pipeline_run_id (string): Identifier of the pipeline or job execution that produced the object. Example: pipeline_run_84721

...

9. Error and Data Quality Tables

Error and data quality tables are used to manage invalid, incomplete, or inconsistent data during ingestion and transformation processes.
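A minimal sketch of that routing idea, assuming a required-column check and an error_reason column (both hypothetical; the page does not prescribe the error-table schema):

```python
def split_valid_invalid(rows, required):
    """Route rows missing required columns to an error set, keeping the rest."""
    valid, errors = [], []
    for row in rows:
        missing = [c for c in required if row.get(c) in (None, "")]
        if missing:
            # Keep the offending row plus a reason, for the error/DQ table.
            errors.append({**row, "error_reason": f"missing: {', '.join(missing)}"})
        else:
            valid.append(row)
    return valid, errors

valid, errors = split_valid_invalid(
    [{"customer_id": 1, "country_cd": "FR"},
     {"customer_id": 2, "country_cd": ""}],
    required=["customer_id", "country_cd"],
)
```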

...

  • lower layers are source-system oriented
  • upper layers are business-domain oriented

Naming Pattern

map_<source_object>_<business_object>

Example:

  • map_sap_customer_enterprise_customer

...

Mapping tables are used to:

  • align source system values with enterprise standards
  • resolve inconsistencies between systems
  • standardize business definitions
  • enable cross-system integration



When to Use Mapping Tables

Use mapping tables when:

  • multiple systems use different identifiers or codes
  • business definitions differ across systems
  • values need to be harmonized for analytics

...

  • source_system → Origin of the data
    • Example: source_system = "sap"
  • source_value → Value from the source system
    • Example: source_value = "ELEC"
  • target_value → Standardized enterprise value
    • Example: target_value = "ELECTRONICS"
  • mapping_type → Type of mapping applied
    • Example: mapping_type = "CATEGORY_MAPPING"
  • valid_from_dt → Start date of the mapping validity
    • Example: valid_from_dt = "2024-01-01"
  • valid_to_dt → End date of the mapping validity
    • Example: valid_to_dt = "9999-12-31"
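Putting these columns together, a lookup that respects the validity window might look like the following sketch (the mapping rows reuse the examples above; the function itself is illustrative):

```python
from datetime import date

# Hypothetical mapping rows following the columns described above.
MAPPINGS = [
    {"source_system": "sap", "source_value": "ELEC",
     "target_value": "ELECTRONICS", "mapping_type": "CATEGORY_MAPPING",
     "valid_from_dt": date(2024, 1, 1), "valid_to_dt": date(9999, 12, 31)},
]

def map_value(source_system, source_value, mapping_type, as_of):
    """Return the enterprise value valid on `as_of`, or None if unmapped."""
    for m in MAPPINGS:
        if (m["source_system"] == source_system
                and m["source_value"] == source_value
                and m["mapping_type"] == mapping_type
                and m["valid_from_dt"] <= as_of <= m["valid_to_dt"]):
            return m["target_value"]
    return None
```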