Introduction

Microsoft Fabric provides native support for Delta Lake tables within Lakehouses, including time travel and versioning capabilities. However, when working with mirrored tables (CDC via Mirroring), these capabilities are not directly exposed in the Fabric UI as they are for standard Lakehouse tables.

This document summarizes how to access the current data, the underlying storage (metadata and files), and the historical versions (time travel) of a mirrored table using Spark notebooks.

0. Context of the test

Is there a way to see those 44 rows (including updates and deletions) instead of the 42 currently visible?

1. Reading the Delta Table (Current Version)

Even for mirrored tables, the data is stored as a standard Delta table in OneLake and can be accessed via its ABFSS path.

Delta table - data

path = "abfss://69300957-941f-4c8a-970f-49b3fce16e0d@onelake.dfs.fabric.microsoft.com/f73eb213-d657-4e43-8856-30a30f7cb1bf/Tables/dbo/RefSujetsIAAURA"
df = spark.read.format("delta").load(path)

display(df)

This returns the latest state of the table, equivalent to what is visible in Fabric after synchronization.

2. Exploring Underlying Storage (Metadata & Files)

The Delta table is physically composed of Parquet data files, the _delta_log transaction log, deletion vector files, and additional metadata folders.

You can list these files using:

path = "abfss://69300957-941f-4c8a-970f-49b3fce16e0d@onelake.dfs.fabric.microsoft.com/f73eb213-d657-4e43-8856-30a30f7cb1bf/Tables/dbo/RefSujetsIAAURA"
# Get the Hadoop FileSystem associated with the OneLake path and list its contents
hadoop_path = spark._jvm.org.apache.hadoop.fs.Path(path)
fs = hadoop_path.getFileSystem(spark._jsc.hadoopConfiguration())

for f in fs.listStatus(hadoop_path):
    print(f.getPath().toString())

Example output:

_delta_log/
_index_bin/
deletion_vector_*.bin
part-*.parquet
metadata/
Key components:

_delta_log/ : the Delta transaction log (commit history).

part-*.parquet : the Parquet data files holding the table rows.

deletion_vector_*.bin : deletion vectors marking rows removed by updates and deletes.

_index_bin/ and metadata/ : internal folders that appear to be maintained by the mirroring process.
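
To look at the commit history at file level, the JSON entries in _delta_log/ can also be read directly with Spark. This is a minimal sketch, assuming path is the same ABFSS table path defined above; the exact columns returned depend on the operations recorded in each commit.

# Read the raw Delta transaction log (one newline-delimited JSON file per commit).
# Sketch only: `path` is the ABFSS table path defined above; columns such as
# "add", "remove", "commitInfo" or "metaData" appear depending on the commits.
log_df = spark.read.json(f"{path}/_delta_log/*.json")

display(log_df)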

3. Accessing Historical Versions (Time Travel)

Although not exposed in the Fabric UI for mirrored tables, Delta Lake versioning is still fully available via Spark.

from delta.tables import DeltaTable
from pyspark.sql import functions as F

path = "abfss://69300957-941f-4c8a-970f-49b3fce16e0d@onelake.dfs.fabric.microsoft.com/f73eb213-d657-4e43-8856-30a30f7cb1bf/Tables/dbo/RefSujetsIAAURA"


# 1) Delta table history
delta_table = DeltaTable.forPath(spark, path)
history_df = delta_table.history()  # reverse chronological order

display(history_df)

# 2) List the available versions, sorted in ascending order
versions = [
    row["version"]
    for row in history_df.select("version").distinct().orderBy("version").collect()
]

print("Available versions:", versions)

# 3) Create one DataFrame per version
dfs_by_version = {}

for v in versions:
    df_v = (
        spark.read
        .format("delta")
        .option("versionAsOf", v)
        .load(path)
        .withColumn("_delta_version", F.lit(v))
    )
    dfs_by_version[v] = df_v


print(f"{len(dfs_by_version)} DataFrames créés")
print("Exemple version la plus récente :", max(dfs_by_version.keys()))

 

Example 


Version 2:

display(dfs_by_version[2])



Version 3: an update was made on one field (this is the most recent version, so it matches the current state of the synchronized table).

display(dfs_by_version[3])
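
Time travel also works with a timestamp instead of a version number. A minimal sketch, assuming the table had a committed version at the requested time (the timestamp below is purely illustrative):

# Read the table as it existed at a given point in time (illustrative timestamp).
df_at_time = (
    spark.read
    .format("delta")
    .option("timestampAsOf", "2024-01-01 00:00:00")
    .load(path)
)

display(df_at_time)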

4. Key Observations

The Fabric UI exposes only the latest version of a mirrored table (SCD Type 1 behaviour: updated values overwrite the previous ones).

The commit logs appear to be accessible through the Delta history and the _delta_log files.

However, further testing is needed to determine which rows were added or deleted in each commit.
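
One possible first step is to compare two of the version DataFrames built above. A minimal sketch, assuming versions 2 and 3 exist and all columns are comparable; exceptAll is a standard Spark operation, and the _delta_version column added earlier must be dropped before comparing:

# Rows present in version 3 but not in version 2: inserted rows or new values of updated rows.
added_or_updated = (
    dfs_by_version[3].drop("_delta_version")
    .exceptAll(dfs_by_version[2].drop("_delta_version"))
)

# Rows present in version 2 but not in version 3: deleted rows or old values of updated rows.
deleted_or_old = (
    dfs_by_version[2].drop("_delta_version")
    .exceptAll(dfs_by_version[3].drop("_delta_version"))
)

display(added_or_updated)
display(deleted_or_old)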