• Naming Conventions for Tables, Columns and Data Structures
  • Platform Context: Microsoft Fabric (Lakehouse / Warehouse)


| Version | Date | Description | Contributor |
|---|---|---|---|
| V0.1 | | Initial document | COLOMBANI Théo |
| V0.2 | | Revised document: added new suffixes for column naming; added sections 9 and 10 | COLOMBANI Théo |
| V0.3 | | Added to the wiki | COLOMBANI Théo |
| V0.4 | | | |

1. Purpose and Scope

This document defines the enterprise standards for naming tables, columns, schemas and data structures within the organization’s Data Platform implemented on Microsoft Fabric.

The goal of this standard is to ensure that every object name is consistent, unambiguous, and self-describing across all layers of the platform.

This standard focuses specifically on naming conventions for tables, columns, schemas, and other data structures.


2. Core Platform Principles

2.1 Medallion Data Architecture

The data platform follows a layered Medallion architecture in which data quality increases progressively across three layers:

application → platform → enterprise

2.2 One Object = One Concept

Each table must represent one logical business concept or process.

Examples:

Tables must never combine multiple unrelated concepts, regardless of the layer.

2.3 Naming Style

All names must follow the same style: lowercase snake_case, with words separated by underscores (no spaces, accents, or camelCase).

Example: created_ts, pipeline_run_id
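As a sketch of how this style can be enforced, the validator below assumes the rule is lowercase snake_case (lowercase letters and digits, single underscores, starting with a letter); adjust the pattern if the standard allows more:

```python
import re

# Assumed rule: lowercase snake_case -- words of lowercase letters/digits
# separated by single underscores, starting with a letter.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")

def is_valid_name(name: str) -> bool:
    """Return True if a table or column name follows the assumed style."""
    return bool(SNAKE_CASE.match(name))
```

For instance, `is_valid_name("created_ts")` passes, while camelCase names or double underscores are rejected.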


3. Platform Naming Structure

The general naming structure of the platform is: <grouping>.<object>

Examples:
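A minimal helper for splitting a two-part platform name into its grouping and object; the function name and error handling are illustrative, not part of the standard:

```python
def parse_name(full_name: str) -> tuple[str, str]:
    """Split '<grouping>.<object>' into its two parts.

    Raises ValueError when the name does not have exactly two parts.
    """
    grouping, sep, obj = full_name.partition(".")
    if not sep or not grouping or not obj or "." in obj:
        raise ValueError(f"expected '<grouping>.<object>', got {full_name!r}")
    return grouping, obj
```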


4. Table Naming Standards

4.1 Application - Tables

Application tables store raw source data exactly as ingested.

Naming pattern: <source_system>.<source_object>

Examples:

Rules:

4.2 Platform - Tables

Platform tables contain cleaned and standardized datasets aligned to source systems.

Naming pattern: <source_system>.<object_name>

Examples:

Rules:

4.3 Enterprise - Tables

Enterprise tables contain enterprise analytics datasets organized by business domain.

Naming pattern: <domain>.<object>

Examples:


5. Enterprise Table - Types

Dimension Tables

Pattern: dim_<entity>

Examples:

Fact Tables

Pattern: fact_<xxxx>

Examples:

Bridge Tables

Pattern: brg_<entity>_<entity>

Examples:

Reference Tables

Pattern: ref_<reference_object>

Examples:

Aggregate Tables

Pattern: agg_<metric>_<grain>

Examples:
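The prefixes above can drive a simple table-type classifier; the sample names used in the test are hypothetical:

```python
# Enterprise table type by naming prefix (section 5); map_ comes from section 10.
TYPE_PREFIXES = {
    "dim_":  "dimension",
    "fact_": "fact",
    "brg_":  "bridge",
    "ref_":  "reference",
    "agg_":  "aggregate",
    "map_":  "mapping",
}

def table_type(table_name: str) -> str:
    """Classify an enterprise table by its prefix; 'unknown' if none match."""
    for prefix, kind in TYPE_PREFIXES.items():
        if table_name.startswith(prefix):
            return kind
    return "unknown"
```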


6. Column Naming Standards

Column names must be explicit, consistent, and semantically meaningful.
All columns must follow this structure:

<business_term>_<suffix>

Boolean fields are the one exception: they use an is_ or has_ prefix instead of a suffix (see section 6.8).

Examples:

6.1 Identity & Keys

Suffixes:

Decision Rules: _id vs _key vs _cd

Use _id when the column carries an identifier assigned by the source system or a business process.

Use _key when the column carries a surrogate key generated inside the platform, typically an integer used to join facts and dimensions.

Use _cd when the column carries a short coded value drawn from a controlled list of codes.

Summary: _id identifies a record in its source, _key joins tables inside the enterprise model, and _cd classifies a record against a code list.
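As a hedged illustration, a single dimension row might carry all three kinds of column side by side; every name and value below is hypothetical:

```python
# Hypothetical customer dimension row showing _key vs _id vs _cd side by side.
customer_row = {
    "customer_key":  10234,       # _key: surrogate integer generated in the platform
    "customer_id":   "C-000482",  # _id: identifier carried over from the source system
    "country_cd":    "FR",        # _cd: short value from a controlled code list
    "customer_name": "Acme SARL", # _name: descriptive text
    "is_active":     True,        # is_: boolean flag (prefix, not suffix)
}
```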

6.2 Codes & Classifications

Suffixes:

6.3 Names & Descriptions

Suffixes:

6.4 Dates & Time

Suffixes:

6.5 Financial & Monetary

Suffixes:

6.6 Quantities & Measures

Suffixes:

6.7 Percentages & Ratios

Suffixes:

6.8 Boolean Fields

Suffixes:

Rule: boolean fields use the is_ or has_ prefix rather than a suffix, and hold only true/false values (e.g. is_deleted).

6.9 Status & Lifecycle

Suffixes:

6.10 Audit & Metadata

Columns:

6.11 Technical Fields

Suffixes:

6.12 Event / Log Data

Suffixes:

6.13 Geography

Suffixes:


7. Column Type Standards


| Suffix | Expected Type |
|---|---|
| _id | string or integer |
| _key | integer |
| _ref | string |
| _cd | string |
| _type | string |
| _category | string |
| _segment | string |
| _level_cd | string |
| _name | string |
| _desc | string |
| _label | string |
| _dt | date |
| _ts | timestamp |
| _time | time |
| _amt | decimal |
| _cost_amt | decimal |
| _price_amt | decimal |
| _rev_amt | decimal |
| _tax_amt | decimal |
| _disc_amt | decimal |
| _qty | decimal or integer |
| _cnt | integer |
| _vol | decimal |
| _wt | decimal |
| _len | decimal |
| _pct | decimal |
| _rate | decimal |
| _ratio | decimal |
| _idx | decimal or integer |
| is_ | boolean |
| has_ | boolean |
| _status_cd | string |
| _status_desc | string |
| _hash | string |
| _version | string |
| _seq | integer |
| _event | string |
| _country_cd | string |
| _region_cd | string |
| _city | string |
| _postal_cd | string |
| created_ts | timestamp |
| updated_ts | timestamp |
| load_ts | timestamp |
| pipeline_run_id | string |
| record_source | string |
| row_hash | string |
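The suffix-to-type table lends itself to an automated check. The lookup below is a sketch built from the table in section 7; the longest matching suffix wins so compounds like _status_cd resolve before _cd:

```python
# Expected type by suffix, taken from the section 7 table.
SUFFIX_TYPES = {
    "_id": "string or integer", "_key": "integer", "_ref": "string",
    "_cd": "string", "_type": "string", "_category": "string",
    "_segment": "string", "_level_cd": "string",
    "_name": "string", "_desc": "string", "_label": "string",
    "_dt": "date", "_ts": "timestamp", "_time": "time",
    "_amt": "decimal", "_cost_amt": "decimal", "_price_amt": "decimal",
    "_rev_amt": "decimal", "_tax_amt": "decimal", "_disc_amt": "decimal",
    "_qty": "decimal or integer", "_cnt": "integer", "_vol": "decimal",
    "_wt": "decimal", "_len": "decimal",
    "_pct": "decimal", "_rate": "decimal", "_ratio": "decimal",
    "_idx": "decimal or integer",
    "_status_cd": "string", "_status_desc": "string",
    "_hash": "string", "_version": "string", "_seq": "integer",
    "_event": "string",
    "_country_cd": "string", "_region_cd": "string",
    "_city": "string", "_postal_cd": "string",
}

def expected_type(column):
    """Expected type for a column name; None when no convention applies."""
    if column.startswith(("is_", "has_")):
        return "boolean"
    # Match the most specific (longest) suffix first.
    for suffix in sorted(SUFFIX_TYPES, key=len, reverse=True):
        if column.endswith(suffix):
            return SUFFIX_TYPES[suffix]
    return None
```

A check like this can run as a data quality rule comparing each column's declared type against its suffix.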



8. Mandatory Metadata Columns

Application Layer

The application layer captures raw ingestion information to ensure traceability and replay capability.

| Column Name | Type | Description | Example |
|---|---|---|---|
| _ingestion_ts | timestamp | Timestamp when the record was ingested into the platform. Used to track ingestion time and support replay or debugging of ingestion pipelines. | 2025-03-12 14:25:32 |
| _batch_id | string | Identifier of the ingestion batch or pipeline execution that loaded the record. Useful for pipeline monitoring and troubleshooting. | batch_20250312_01 |
| _source_system | string | Name of the source system from which the record originated. Helps identify data lineage and upstream systems. | sap |
| _source_file (or source object) | string | Name or path of the file used to ingest the record (when file-based ingestion is used). Enables traceability back to the original data file. | sap_sales_20250312.csv |
| _record_hash | string | Hash value representing the content of the record. Used to detect changes between loads or identify duplicates efficiently. | a9f4b12c89d2 |

Platform Layer

| Column Name | Type | Description | Example |
|---|---|---|---|
| created_ts | timestamp | Timestamp when the record was first created in the platform layer. Helps identify when the data became available in the curated layer. | 2025-03-12 14:25:32 |
| updated_ts | timestamp | Timestamp of the latest update applied to the record in the platform layer. Useful for incremental processing and debugging transformations. | 2025-03-12 16:02:01 |
| row_hash | string | Hash calculated from the business columns of the row. Used to detect if the data has changed between loads. | 8fa21a3b5d |
| record_source | string | Source system or pipeline that generated the record in the platform layer. Used for lineage and audit. | sap_sales_pipeline |
| is_deleted | boolean | Logical deletion flag used when the source record is removed or marked as inactive. Enables soft deletion instead of physical removal. | false |

Enterprise Layer

| Column Name | Type | Description | Example |
|---|---|---|---|
| created_ts | timestamp | Timestamp when the record was first created in the enterprise dataset. Useful for lineage and auditing. | 2025-03-12 17:30:10 |
| updated_ts | timestamp | Timestamp of the latest modification applied to the record. Helps identify refresh cycles of the dataset. | 2025-03-13 02:00:00 |

Other Technical Standards

| Column Name | Type | Description | Example |
|---|---|---|---|
| load_ts | timestamp | Timestamp when the record was loaded into the table. | 2025-03-13 02:00:00 |
| pipeline_run_id | string | Identifier of the pipeline or job execution that produced the object. | pipeline_run_84721 |
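Attaching the platform-layer metadata columns can be sketched as a small function. The hashing scheme below (sha256 over sorted business columns) is an assumption; any stable scheme that excludes the metadata columns themselves from the hash would satisfy the intent of row_hash:

```python
import hashlib
from datetime import datetime, timezone

# Mandatory platform-layer metadata columns (section 8).
METADATA_COLUMNS = {"created_ts", "updated_ts", "row_hash",
                    "record_source", "is_deleted"}

def add_platform_metadata(record: dict, record_source: str) -> dict:
    """Attach the mandatory platform-layer columns to a record.

    row_hash is computed over the business columns only, so the metadata
    columns never perturb change detection between loads.
    """
    business = {k: v for k, v in record.items() if k not in METADATA_COLUMNS}
    payload = "|".join(f"{k}={business[k]}" for k in sorted(business))
    now = datetime.now(timezone.utc)
    return {
        **record,
        "created_ts": now,                                          # first creation
        "updated_ts": now,                                          # latest update
        "row_hash": hashlib.sha256(payload.encode()).hexdigest(),   # change detection
        "record_source": record_source,                             # lineage / audit
        "is_deleted": False,                                        # soft-delete flag
    }
```

Because the payload is built from sorted business columns, the same business content always yields the same row_hash regardless of column order.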


9. Error and Data Quality Tables

Error and data quality tables are used to manage invalid, incomplete, or inconsistent data during ingestion and transformation processes.

They are critical to preserving failed records for audit, enabling reprocessing once upstream issues are fixed, and monitoring data quality over time.

Naming Patterns

  • reject_<object>
  • quarantine_<object>
  • error_<object>

Key Differences

When to Use Each Type

reject_<object>

Use this table when:

Typical cases:

Example:

quarantine_<object>

Use this table when:

Typical cases:

Example:

error_<object>

Use this table when:

Typical cases:

Example:

Columns

Typical structure: the original record (or its key columns), the reason the record failed, and the standard technical columns (load_ts, pipeline_run_id).
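The three table types can be tied together in a routing sketch. The criteria shown here (missing mandatory field → reject, suspicious value → quarantine, technical failure → error) and the object name sales_order are assumptions for illustration only:

```python
# Illustrative routing between the three data quality tables.
def route_record(record: dict, obj: str = "sales_order") -> str:
    """Return the table an incoming record should land in."""
    try:
        amount = record.get("amount_amt")
        if amount is None:
            return f"reject_{obj}"      # fails a hard validation rule
        if amount < 0:
            return f"quarantine_{obj}"  # plausible but held for manual review
        return obj                      # clean record, normal target table
    except TypeError:
        return f"error_{obj}"           # processing failure (e.g. bad type)
```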


10. Mapping Tables

Mapping tables are used to translate, standardize, or align data between source-oriented datasets and enterprise business models.

This is especially relevant in architectures where multiple source systems describe the same business concepts with different codes, entities, categories, or hierarchies.

Naming Pattern

map_<source_object>_<business_object>

Example:


When to Use Mapping Tables

Use mapping tables when:

Common Use Cases

Code Standardization

Different systems use different formats for the same concept.

Example:

Entity Alignment

Different systems represent the same entity differently.

Example:

Category Harmonization

Source categories do not match enterprise categories.

Example:

Hierarchy Standardization

Different hierarchy structures across systems.

Example:

Typical Structure

Mapping tables usually contain the source value, the standardized enterprise value, and the standard audit columns (created_ts, updated_ts, record_source).
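A mapping table and its lookup can be sketched as follows; the table name follows the map_<source_object>_<business_object> pattern, and all codes shown are invented for illustration:

```python
# Hypothetical mapping table aligning source country codes with the
# enterprise standard (pattern: map_<source_object>_<business_object>).
map_sap_country_country = [
    {"source_cd": "DE1", "country_cd": "DE"},
    {"source_cd": "FRA", "country_cd": "FR"},
]

def standardize_country(source_cd, mapping=map_sap_country_country):
    """Translate a source code to the enterprise country_cd (None if unmapped)."""
    for row in mapping:
        if row["source_cd"] == source_cd:
            return row["country_cd"]
    return None  # unmapped codes surface as None for data quality follow-up
```

Returning None for unmapped codes lets downstream checks route those records to the data quality tables instead of silently dropping them.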