Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Jira
serverSyensqo's Jira
serverIdd8efc1ef-48bd-3b4e-8714-ad827f4f059b
keySYSM-353

https://syensqo.atlassian.net/browse/SYSM-353
Info
Info
  • Naming Conventions for Tables, Columns and Data Structures
  • Platform Context: Microsoft Fabric (Lakehouse / Warehouse)

...

  • application contains raw ingested data.
  • platform contains cleaned and standardized data.
  • entreprise contains business-ready datasets optimized for analytics.

...

2.2 One Object = One Concept

...

The general naming structure of the platform is:<layer>. <grouping>.<object>

Examples:

  • application .: sap.vbak
  • platform .: sap.sales_order_header
  • entreprise .: sales.fact_sales_order

...

4. Table Naming Standards

...

application tables store raw source data exactly as ingested.

Naming pattern:application. <source_system>.<source_object>

Examples:

  • application.sap.vbak
  • application.sap.vbap
  • application.salesforce.account
  • application.workday.worker

Rules:

  • table names follow the source system object names
  • minimal transformationappend-only ingestion preferred

4.2 Platform - Tables

Platform tables contain cleaned and standardized datasets aligned to source systems.

Naming pattern:platform. <source_system>.<object_name>

Examples:

  • platform.sap.sales_order_header
  • platform.sap.sales_order_item
  • platform.salesforce.account

Rules:

  • normalized column naming
  • deduplicated records
  • standardized data types
  • still aligned with source system entities

...

Entreprise tables contain enterprise analytics datasets organized by business domain.

Naming pattern:entreprise. <domain>.<object>

Examples:

  • entreprise.sales.fact_sales_order
  • entreprise.sales.dim_customer
  • entreprise.finance.fact_invoice

...

5. Entreprise Table - Types

...

  • _country_cd → Country code based on a standard reference (e.g., ISO).
    Example: country_cd = "FR"
  • _region_cd → Region or state code.
    Example: region_cd = "IDF"
  • _city → City name.
    Example: city = "Paris"
  • _postal_cd → Postal or ZIP code.
    Example: postal_cd = "75001"

 

...

7. Column

...

Type Standards


Suffix

...

Expected Type
_id

...

string or integer
_

...

key

...

integer

...

_

...

refstring

...

_

...

cdstring

...

_

...

type

 

...

string
_

...

category

...

string

...

_

...

segmentstring
_level_cd

...

string
_namestring
_desc

...

string

...

_

...

labelstring

...

_

...

dt

...

date
_

...

tstimestamp
_

...

time

...

time

...

_

...

amtdecimal

...

_

...

cost_

...

amt

...

decimal
_price_amt

...

decimal
_

...

rev_amtdecimal
_

...

tax_amtdecimal
_disc_amtdecimal
_qtydecimal or integer
_cntinteger
_voldecimal
_wtdecimal
_lendecimal
_pctdecimal
_ratedecimal
_ratiodecimal
_idxdecimal or integer
is_boolean
has_boolean
_status_cdstring
_status_descstring
_hashstring
_versionstring
_seqinteger
_eventstring
_country

Examples:

  • sales_amt
  • order_qty
  • item_cnt
  • margin_pct

 

Boolean Fields

Preferred format: is_<condition>

Examples:

  • is_active
  • is_deleted
  • is_returned

7. Column Type Standards

Suffix

Expected Type

_id

string or integer

_key

integer

_cdstring
_nameregion_cdstring
_desccitystring
_postal_amtcddecimalstring
created_qtytsDecimal or integertimestamp
updated_cnttsintegertimestamp
load_pcttsdecimaltimestamp
pipeline_run_dtiddatestring
record_tssourcetimestampstring
isrow_hashbooleanstring


...

8. Mandatory Metadata Columns

...

Column Name

Type

Description

Example

_ingestion_ts

timestamp

Timestamp when the record was ingested into the platform. Used to track ingestion time and support replay or debugging of ingestion pipelines.

2025-03-12 14:25:32

_batch_id

string

Identifier of the ingestion batch or pipeline execution that loaded the record. Useful for pipeline monitoring and troubleshooting.

batch_20250312_01

_source_system

string

Name of the source system from which the record originated. Helps identify data lineage and upstream systems.

sap

_source_file (or source object)

string

Name or path of the file used to ingest the record (when file-based ingestion is used). Enables traceability back to the original data file.

Sap_sales_20250312.csv 

_record_hash

string

Hash value representing the content of the record. Used to detect changes between loads or identify duplicates efficiently.

a9f4b12c89d2

...

Platform Layer

Column Name

Type

Description

Example

created_ts

timestamp

Timestamp when the record was first created in the platform layer. Helps identify when the data became available in the curated layer.

2025-03-12 14:25:32

updated_ts

timestamp

Timestamp of the latest update applied to the record in the platform layer. Useful for incremental processing and debugging transformations.

2025-03-12 16:02:01

row_hash

string

Hash calculated from the business columns of the row. Used to detect if the data has changed between loads.

8fa21a3b5d

record_source

string

Source system or pipeline that generated the record in the platform layer. Used for lineage and audit.

sap_sales_pipeline

is_deleted

boolean

Logical deletion flag used when the source record is removed or marked as inactive. Enables soft deletion instead of physical removal.

false

...

Entreprise Layer

Column Name

Type

Description

Example

created_ts

timestamp

Timestamp when the record was first created in the entreprise dataset. Useful for lineage and auditing.

2025-03-12 17:30:10

updated_ts

timestamp

Timestamp of the latest modification applied to the record. Helps identify refresh cycles of the dataset.

2025-03-13 02:00:00

 

Others Technical standards

Column Name

Type

Description

Example

load_ts

timestamp

Timestamp when the record was loaded into the table.

2025-03-13 02:00:00

pipeline_run_id

string

Identifier of the pipeline or job execution that produced the object.

pipeline_run_84721

 

...

9. Error and Data Quality Tables

...

  • lower layers are source-system oriented
  • upper layers are business-domain oriented

...

map_<source_object>_<business_object>

Example:

  • map_sap_customer_enterprise_customer

Purpose

Mapping tables are used to:

  • align source system values with enterprise standards
  • resolve inconsistencies between systems
  • standardize business definitions
  • enable cross-system integration

Naming Pattern

map_<source_object>_<business_object>

Example:

  • map_sap_customer_enterprise_customer


When to Use Mapping Tables

Use mapping tables when:

  • multiple systems use different identifiers or codes
  • business definitions differ across systems
  • values need to be harmonized for analytics

...

  • source_system → Origin of the data
    • Example: source_system = "sap"
  • source_value → Value from the source system
    • Example: source_value = "ELEC"
  • target_value → Standardized enterprise value
    • Example: target_value = "ELECTRONICS"
  • mapping_type → Type of mapping applied
    • Example: mapping_type = "CATEGORY_MAPPING"
  • valid_from_dt → Start date of the mapping validity
    • Example: valid_from_dt = "2024-01-01"
  • valid_to_dt → End date of the mapping validity
    • Example: valid_to_dt = "9999-12-31"