Data retrieval strategy from SharePoint:
- Files, Lists, custom applications, OneLake File Explorer
- security, best practices

Version	Date	Description	Contributor
V0.1	30 Mar 2026	Initial document	COLOMBANI Théo
V0.2	07 Apr 2026	Added to the wiki	COLOMBANI Théo
V0.3

1. Axis — Load into Lakehouse Files

1.1 OneLake Shortcut (SharePoint / OneDrive)

Description
Logical link exposing SharePoint folders in OneLake without data duplication.

How it works

Shortcut points to a SharePoint folder (folder-level only)
Data remains in SharePoint and is accessed virtually
Accessible across Fabric workloads

Key capabilities

Data virtualization (no physical copy)
Automatic synchronization with source changes
Unified access through OneLake

Advantages

No pipelines or ETL required
No data duplication
Fast implementation
Unified access layer

Limitations (decision drivers)

Folder-level granularity only
Performance dependent on SharePoint (latency, throttling)
No control over ingestion (no filtering, no incremental logic)
Runtime dependency on source availability
Not suitable when strong data isolation or historization is required
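Because a shortcut offers no filtering at ingestion time, any selection has to happen when the files are read. The following is a minimal sketch of that idea, assuming the shortcut is visible as an ordinary folder on the notebook's file system. The helper name `list_shortcut_files` and the mount path in the usage comment are illustrative assumptions, not part of the design above.

```python
from pathlib import Path
from typing import Iterable, List


def list_shortcut_files(root: str, suffixes: Iterable[str] = (".csv",)) -> List[Path]:
    """Return files under `root` whose extension matches, sorted by path.

    The shortcut exposes the whole SharePoint folder, so filtering is
    applied here, at read time, rather than at ingestion time.
    """
    wanted = {s.lower() for s in suffixes}
    base = Path(root)
    return sorted(
        p for p in base.rglob("*")
        if p.is_file() and p.suffix.lower() in wanted
    )


# Hypothetical usage in a notebook (the path is an assumption):
# files = list_shortcut_files("/lakehouse/default/Files/sharepoint_docs", [".csv", ".xlsx"])
```

Every call walks the folder again, so read performance still depends on SharePoint, as noted in the limitations.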

1.2 Custom ingestion — API (Notebook or Pipeline) → Files

Description
Extraction via Microsoft Graph or SharePoint REST API and storage in Lakehouse Files.

Execution models

Notebook (Spark / Python)
Data Pipeline:

Web Activity (REST calls)
Copy Activity with API source

How it works

API calls to retrieve files or metadata
Data written into OneLake Files

Key capabilities

Supports full SharePoint surface (files, folders, metadata)
Custom ingestion logic (filtering, incremental, structuring)
Can be orchestrated via pipelines

Advantages

Full flexibility on ingestion logic
Ability to implement incremental loads (delta, watermark) at bronze layer
Can handle complex folder structures and edge cases
Works even when no native connector exists
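The watermark-based incremental load mentioned above can be sketched as follows. This assumes each item carries a `lastModifiedDateTime` ISO 8601 timestamp, as Microsoft Graph drive items do. The function names and the choice of where the watermark is persisted between runs are hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, List, Tuple


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def select_changed(items: List[Dict], watermark: datetime) -> Tuple[List[Dict], datetime]:
    """Keep items modified after `watermark` and return the new watermark.

    If nothing changed, the watermark is returned unchanged.
    """
    changed = [i for i in items if parse_ts(i["lastModifiedDateTime"]) > watermark]
    new_mark = max(
        (parse_ts(i["lastModifiedDateTime"]) for i in changed),
        default=watermark,
    )
    return changed, new_mark


# Illustrative run with made-up items:
last_run = datetime(2026, 4, 1, tzinfo=timezone.utc)
sample = [
    {"name": "old.csv", "lastModifiedDateTime": "2026-03-30T08:00:00Z"},
    {"name": "new.csv", "lastModifiedDateTime": "2026-04-07T09:30:00Z"},
]
to_load, next_mark = select_changed(sample, last_run)
```

Only `new.csv` is selected here, and `next_mark` would be stored for the following run.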

Limitations (decision drivers)

Requires handling:

authentication (OAuth / Service Principal)
pagination (@odata.nextLink)
API rate limits / throttling

More complex error handling and retry logic
Development and maintenance effort
Pipeline Web Activity is stateless (no built-in transformation)
Copy Activity / Web Activity require manual schema handling
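The pagination and throttling points above can be handled with a small loop. This is a sketch under stated assumptions: the HTTP transport is injected as a `send` callable (hypothetical, so the logic stays independent of any HTTP library), the server signals throttling with status 429 or 503, and `Retry-After` is expressed in seconds.

```python
import time
from typing import Callable, Dict, Iterator, Optional, Tuple

# A transport returns (status_code, headers, parsed_json_body).
Response = Tuple[int, Dict[str, str], Dict]
Send = Callable[[str], Response]


def get_with_retry(send: Send, url: str, max_attempts: int = 5,
                   sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Call `send(url)`, retrying on 429/503 and honouring Retry-After."""
    for attempt in range(max_attempts):
        status, headers, body = send(url)
        if status == 200:
            return body
        if status in (429, 503) and attempt < max_attempts - 1:
            # Prefer the server's hint; otherwise back off exponentially.
            delay = float(headers.get("Retry-After", 2 ** attempt))
            sleep(delay)
            continue
        raise RuntimeError(f"request failed with HTTP {status}: {url}")
    raise RuntimeError("max_attempts must be at least 1")


def iter_items(send: Send, first_url: str, **retry_kwargs) -> Iterator[Dict]:
    """Yield every item of a paged response by following @odata.nextLink."""
    url: Optional[str] = first_url
    while url:
        page = get_with_retry(send, url, **retry_kwargs)
        yield from page.get("value", [])
        url = page.get("@odata.nextLink")
```

Injecting `send` and `sleep` also makes the retry behaviour testable without network access.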

2. Axis — Load into Lakehouse Tables

2.1 Shortcut with transformation → Delta Tables

Description
Use of SharePoint shortcut with transformation to project files into Delta tables.

How it works

Shortcut exposes files
Transformation step converts them into structured tables
Tables remain synchronized with source

Key capabilities

Automatic file-to-table conversion
Continuous synchronization
Direct consumption in SQL / BI

Advantages

No pipeline required
Direct analytical usability
Integrated with OneLake

Limitations (decision drivers)

Strong dependency on source file structure and quality
Limited transformation capabilities compared to ETL
Limited control over schema evolution
Debugging and lineage less explicit than pipeline-based ingestion
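Since this option depends heavily on the structure of the source files, a lightweight header check before relying on the projected table may help. The sketch below is an assumption about how such a check could look for CSV files. It is not a feature of the shortcut transformation itself, and `check_csv_header` is a hypothetical helper.

```python
import csv
import io
from typing import List, Sequence


def check_csv_header(text: str, expected: Sequence[str]) -> List[str]:
    """Return a list of problems found in the CSV header (empty if none)."""
    reader = csv.reader(io.StringIO(text))
    header = [h.strip() for h in next(reader, [])]
    problems = []
    missing = [c for c in expected if c not in header]
    extra = [c for c in header if c not in expected]
    if missing:
        problems.append(f"missing columns: {missing}")
    if extra:
        problems.append(f"unexpected columns: {extra}")
    if len(set(header)) != len(header):
        problems.append("duplicate column names")
    return problems
```

An empty result means the header matches the expected columns. Any other result can be logged or used to block downstream consumption.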

2.2 Mirroring (SharePoint Lists)

Description
Replication of SharePoint Lists into OneLake as Delta tables.

How it works

Connection to SharePoint list
Continuous replication into Fabric tables
Automatic synchronization

Key capabilities

Near real-time data replication
Native Delta format
No ETL required

Advantages

Continuous synchronization
Simplified ingestion architecture
Direct usability for analytics

Limitations (decision drivers)

Limited to structured data (lists only)
Limited transformation capabilities during ingestion
Dependency on mirroring feature availability and scope
Limited control over ingestion logic (filters, enrichment)
Schema evolution handled automatically but with limited customization

2.3 Custom ingestion — API (Notebook or Pipeline) → Tables

Description
API-based extraction with transformation and direct load into Delta tables.

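The execution models, capabilities, advantages and limitations described in Section 1.2 apply here as well; the only difference is that the data is transformed and written to Delta tables instead of Files.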
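For SharePoint Lists, the transformation step often amounts to flattening API items into rows. The sketch below assumes items shaped like Microsoft Graph list items retrieved with `expand=fields`, where column values sit under a `fields` object. The column names and the helper `list_items_to_rows` are illustrative.

```python
from typing import Dict, Iterable, List, Sequence


def list_items_to_rows(items: Iterable[Dict], columns: Sequence[str]) -> List[Dict]:
    """Flatten list items into rows with a fixed set of columns."""
    rows = []
    for item in items:
        fields = item.get("fields", {})
        row = {"item_id": item.get("id")}
        for col in columns:
            row[col] = fields.get(col)  # missing fields become None
        rows.append(row)
    return rows


# Illustrative input with made-up values:
sample_items = [
    {"id": "1", "fields": {"Title": "A", "Status": "Open"}},
    {"id": "2", "fields": {"Title": "B"}},
]
rows = list_items_to_rows(sample_items, ["Title", "Status"])
# In a Fabric notebook, such rows could then be turned into a Spark
# DataFrame and saved as a Delta table (assumption, not shown here).
```

Fixing the column list up front keeps the table schema stable even when a list item omits a field.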

3. Considerations

API usage (Notebook vs Pipeline)

Notebook

Better suited for:

complex transformations
large data processing
advanced logic (joins, enrichment)

Pipeline (Web / Copy Activity)

Better suited for:

orchestration
simple ingestion patterns
metadata-driven ingestion
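Metadata-driven ingestion can be illustrated by deriving the API request from a configuration record instead of hard-coding it. This sketch assumes Microsoft Graph v1.0 and its site list-items and drive-children endpoints. The record layout (`kind`, `site_id`, `list_id`, `folder_path`) is a hypothetical convention.

```python
from typing import Dict
from urllib.parse import quote

GRAPH = "https://graph.microsoft.com/v1.0"


def build_request_url(source: Dict[str, str]) -> str:
    """Translate one metadata record into the Graph URL to call."""
    site = source["site_id"]
    if source["kind"] == "list":
        return f"{GRAPH}/sites/{site}/lists/{source['list_id']}/items?expand=fields"
    if source["kind"] == "folder":
        # Encode spaces and special characters, keep '/' separators.
        path = quote(source["folder_path"].strip("/"))
        return f"{GRAPH}/sites/{site}/drive/root:/{path}:/children"
    raise ValueError(f"unknown source kind: {source['kind']}")
```

A pipeline or a notebook can then loop over the metadata records and call the resulting URLs, so adding a source becomes a configuration change.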

Security

Authentication methods:

Organizational account
Workspace identity
Service principal (recommended)

API-based approaches require:

token management
permission configuration (e.g. Sites.Read.All)
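Token management for a service principal can be sketched as below. It assumes the OAuth 2.0 client credentials flow of the Microsoft identity platform with the Graph `.default` scope. The `TokenCache` class and the injected `fetch` callable are hypothetical, and in practice the client secret would come from a secret store rather than from code.

```python
import time
from typing import Callable, Dict, Optional, Tuple
from urllib.parse import urlencode


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Tuple[str, bytes]:
    """Return the URL and form-encoded body of a client credentials request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "https://graph.microsoft.com/.default",
    }).encode()
    return url, body


class TokenCache:
    """Reuse a token until shortly before it expires."""

    def __init__(self, fetch: Callable[[], Dict],
                 now: Callable[[], float] = time.time, margin: float = 60.0):
        self._fetch = fetch          # performs the HTTP POST, returns the JSON payload
        self._now = now
        self._margin = margin        # refresh this many seconds before expiry
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        if self._token is None or self._now() >= self._expires_at - self._margin:
            payload = self._fetch()
            self._token = payload["access_token"]
            self._expires_at = self._now() + float(payload["expires_in"])
        return self._token
```

The cache avoids requesting a new token for every API call, which matters for long-running paged extractions.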

4. Matrices

Synthesis

Data type	Load target	Options
Files	Files	Shortcut / API (Notebook or Pipeline)
Files	Tables	Shortcut + transformation / API (Notebook or Pipeline)
SharePoint Lists	Tables	Mirroring / API (Notebook or Pipeline)
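The synthesis table can also be expressed as a simple lookup, which is convenient when the choice has to be made programmatically (for example in a metadata-driven framework). The keys and the function name are illustrative; the values are copied from the table above.

```python
from typing import Dict, List, Tuple

# (data type, load target) -> options, as listed in the synthesis table.
OPTIONS: Dict[Tuple[str, str], List[str]] = {
    ("files", "files"): ["Shortcut", "API (Notebook or Pipeline)"],
    ("files", "tables"): ["Shortcut + transformation", "API (Notebook or Pipeline)"],
    ("sharepoint lists", "tables"): ["Mirroring", "API (Notebook or Pipeline)"],
}


def options_for(data_type: str, target: str) -> List[str]:
    """Return the ingestion options for a combination, or [] if none is listed."""
    return OPTIONS.get((data_type.strip().lower(), target.strip().lower()), [])
```

A combination that is absent from the table, such as SharePoint Lists loaded into Files, returns an empty list.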

Criteria

Criteria	Shortcut (Files)	Shortcut + Transform (Tables)	Mirroring (Lists)	API via Notebook	API via Pipeline (Web / Copy)
Data movement	No copy (virtual access)	No copy (virtual + projection)	Physical copy (replication)	Physical copy	Physical copy
Latency / freshness	Near real-time (source-driven)	Near real-time	Near real-time sync (incremental)	Depends on orchestration	Depends on orchestration
Transformation capabilities	None	Limited	Limited	Full (Spark / code)	Limited (mapping / chaining)
Incremental / CDC logic	Not supported	Limited / implicit	Built-in incremental sync	Fully customizable	Manual implementation required
Handling complex structures	Limited (folder-based only)	Limited	Not applicable (structured only)	Strong capability	Moderate (complex via chaining)
Control over ingestion logic	None	Low	Low	Full	Medium
Operational complexity	Very low	Low	Low	High	Medium
Dependency on source availability	High	High	Low	Low (after ingestion)	Low (after ingestion)
Schema control / evolution	None	Limited	Limited	Full control	Medium control
Cost (compute / storage)	Low	Low	Free	Higher (compute + dev)	Medium (pipeline runs)
Supported data types	Files only	Structured files (JSON, CSV, Parquet, Excel)	SharePoint Lists only	All (files + lists)	All (files + lists via API)
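To shortlist options against hard requirements, a few rows of the criteria matrix can be encoded and filtered. The sketch below copies three rows (control over ingestion logic, dependency on source availability, operational complexity) and simplifies "Low (after ingestion)" to "Low". The structure and names are assumptions, and the filter ignores which data types each option supports.

```python
from typing import Dict, List

MATRIX: Dict[str, Dict[str, str]] = {
    "Shortcut (Files)": {
        "control": "None", "source_dependency": "High", "operational_complexity": "Very low"},
    "Shortcut + Transform (Tables)": {
        "control": "Low", "source_dependency": "High", "operational_complexity": "Low"},
    "Mirroring (Lists)": {
        "control": "Low", "source_dependency": "Low", "operational_complexity": "Low"},
    "API via Notebook": {
        "control": "Full", "source_dependency": "Low", "operational_complexity": "High"},
    "API via Pipeline (Web / Copy)": {
        "control": "Medium", "source_dependency": "Low", "operational_complexity": "Medium"},
}


def shortlist(**required: str) -> List[str]:
    """Return the options whose matrix values equal every required value."""
    return [
        name for name, row in MATRIX.items()
        if all(row.get(criterion) == value for criterion, value in required.items())
    ]
```

For example, `shortlist(source_dependency="Low")` keeps Mirroring and both API options, which matches the matrix: only the shortcut-based options depend on SharePoint at read time.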

Technical solutions (Fabric-only solutions recommended)

P1: SharePoint shortcuts

Directly to Silver tables in the Lakehouse, with automatic transformation to Delta (see Reference) -> newly working for .xlsx to Delta table (only CSV is working)
or to the Lakehouse Files zone (CSV shortcut), followed by a transformation to Silver tables

Triggers on OneLake events? -> OneLake event triggers do not work for shortcuts.

Prerequisites: a folder hierarchy for the files and a Service Principal (or Workspace Identity). One shortcut = one folder.
See also Limitations: https://learn.microsoft.com/en-us/fabric/onelake/create-onedrive-sharepoint-shortcut#limitations

P2: Power Query code through notebooks
P3: Dataflow Gen2
P4: Pipelines via API (doable in Azure)