This page defines the capacity-level configuration options that must be evaluated for our Microsoft Fabric platform, with a focus on:

  • production stability
  • workload isolation
  • controlled scaling
  • governance of shared resources
  • independence of the IT production workspace

Our target operating model uses Fabric primarily as a data storage and exposure platform based on Lakehouse and Warehouse, serving both BI consumption and external data exposure.

Fabric capacity configuration is a platform governance topic, not only an infrastructure topic.

The primary control for protecting IT production is capacity isolation.

Shared capacity between IT production and Domain production creates shared operational risk.

A dedicated capacity for IT production is the recommended baseline for this architecture.


| Version | Date | Description      | Contributor     |
|---------|------|------------------|-----------------|
| V0.1    |      | Initial document | COLOMBANI Théo  |








Fabric Capacity Configuration for Our Data Platform

1. Objective

This page defines the capacity-level configuration options that must be assessed for our Microsoft Fabric platform.

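The objective is to ensure:

  • production stability
  • workload isolation
  • controlled scaling
  • governance of shared resources
  • independence of the IT production workspace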

In our architecture, Microsoft Fabric is primarily used as a storage and exposure platform, based on Lakehouse and Warehouse, for both BI consumption and external data exposure.


2. Platform Context

Our target operating model is structured as follows:

IT workspace

Domain workspaces

Key requirement

The IT production workspace must remain operational independently of Domain workspaces, including when Domain workloads generate higher or less predictable compute consumption.


3. Design Principle

Recommendation

Capacity design must be driven by isolation first, then by optimization.

Rationale

In our context, the main purpose of capacity governance is not only to size compute correctly. It is primarily to:

Decision statement

For our platform, capacity is an architecture boundary, not only a billing or administration object.


4. Recommended Target Model

Target architecture

Capacity A — IT Production

Used only for:

Capacity B — Domain Production

Used for:

Capacity C — Non-Production

Used for:


Recommendation

Do not place IT production and Domain production on the same capacity if IT production must remain operational independently.

Why this matters

A shared capacity creates a shared risk envelope. Even if workspaces are logically separated, they still depend on the same underlying capacity behavior.


5. Capacity-Level Settings to Document

5.1 Workspace-to-capacity assignment

What it is

The assignment of workspaces to specific Fabric capacities.

Why it matters

This is the most important configuration decision in our model because it determines whether workloads share the same compute risk domain.

Recommendation

Confluence panel text

Recommendation
Workspace assignment is the primary mechanism used to guarantee production isolation and operational independence.
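
The isolation rule can be checked mechanically against an assignment plan. The sketch below is illustrative only and does not call any Fabric API: the workspace names, capacity names, and tier labels are hypothetical, and the plan is assumed to come from whatever inventory the platform team maintains.

```python
from collections import defaultdict

# Hypothetical assignment plan: workspace -> (capacity, tier).
# Names and tiers are illustrative assumptions, not real Fabric objects.
ASSIGNMENTS = {
    "ws-it-prod": ("capacity-a", "it_production"),
    "ws-sales-prod": ("capacity-b", "domain_production"),
    "ws-finance-prod": ("capacity-b", "domain_production"),
    "ws-sales-dev": ("capacity-c", "non_production"),
}


def find_isolation_violations(assignments):
    """Return capacities that host IT production alongside any other tier."""
    tiers_by_capacity = defaultdict(set)
    for capacity, tier in assignments.values():
        tiers_by_capacity[capacity].add(tier)

    return sorted(
        capacity
        for capacity, tiers in tiers_by_capacity.items()
        if "it_production" in tiers and len(tiers) > 1
    )


if __name__ == "__main__":
    # An empty list means IT production shares its capacity with no other tier.
    print(find_isolation_violations(ASSIGNMENTS))
```

A check of this kind could run whenever the assignment plan changes, so that a violation is caught before a workspace is actually moved.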


5.2 Capacity administration and reassignment governance

What it is

The set of permissions allowing administrators to manage a capacity and move workspaces into or out of it.

Why it matters

Even with a good target architecture, weak governance can reintroduce risk if workspaces are moved without control.

Recommendation

Confluence panel text

Warning
A dedicated production capacity loses most of its value if workspace assignment is not tightly governed.
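
One way to make reassignment governance verifiable is to compare the observed workspace assignments with an approved baseline. The following is a minimal sketch under stated assumptions: both mappings are plain dictionaries with hypothetical names, and how the observed state is collected (manual export, admin tooling, or otherwise) is deliberately left out.

```python
def detect_unapproved_moves(approved, observed):
    """Compare observed workspace assignments with the approved baseline.

    Both arguments map workspace name -> capacity name.
    Returns a list of (workspace, approved_capacity, observed_capacity) for
    every workspace whose capacity differs from the baseline. A workspace
    missing from the baseline is reported with approved_capacity set to None.
    """
    drift = []
    for workspace, capacity in sorted(observed.items()):
        expected = approved.get(workspace)
        if expected != capacity:
            drift.append((workspace, expected, capacity))
    return drift


if __name__ == "__main__":
    # Hypothetical data: one workspace moved, one workspace never approved.
    approved = {"ws-it-prod": "capacity-a", "ws-sales-prod": "capacity-b"}
    observed = {
        "ws-it-prod": "capacity-a",
        "ws-sales-prod": "capacity-a",
        "ws-new": "capacity-a",
    }
    for item in detect_unapproved_moves(approved, observed):
        print(item)
```

Any non-empty result would indicate that a workspace was moved or created outside the approved change process.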


5.3 Surge protection

What it is

A capacity setting that limits the compute consumed by background operations. When background usage exceeds a configured rejection threshold, new background jobs are rejected until usage falls back below a recovery threshold.

Why it matters

It can help protect shared capacities, especially where Domain workspaces may generate bursty or uneven usage patterns.

Recommendation

Position

Surge protection is a supporting control, not a substitute for proper isolation.

Confluence panel text

Recommendation
Use surge protection on shared capacities.
Do not use it as a replacement for dedicated capacity when a workspace is mission-critical.
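
To reason about how surge protection would behave on a shared capacity, a simplified model can help. The sketch below assumes the mechanism works as a rejection threshold paired with a lower recovery threshold; the threshold values and usage samples are hypothetical, and the exact comparison semantics should be confirmed against the current Microsoft documentation.

```python
def simulate_surge_protection(background_usage_pct, rejection_pct, recovery_pct):
    """Return, per usage sample, whether new background jobs would be rejected.

    Simplified model (an assumption, not the documented algorithm):
    rejection starts when usage reaches rejection_pct and stops only once
    usage has dropped to recovery_pct or below.
    """
    if recovery_pct > rejection_pct:
        raise ValueError("recovery threshold must not exceed rejection threshold")

    rejecting = False
    states = []
    for usage in background_usage_pct:
        if not rejecting and usage >= rejection_pct:
            rejecting = True
        elif rejecting and usage <= recovery_pct:
            rejecting = False
        states.append(rejecting)
    return states


if __name__ == "__main__":
    # Hypothetical background usage samples, in percent of capacity.
    print(simulate_surge_protection([50, 85, 70, 55, 60], 80, 60))
```

The model shows why this is a supporting control: while rejection is active, background jobs from every workspace on the capacity are affected, including the critical ones.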


5.4 Capacity sizing and scaling

What it is

The sizing of Fabric capacity and the ability to adjust it as workload volume evolves.

Why it matters

Even a well-isolated architecture can fail operationally if the capacity is persistently undersized.

Recommendation

Practical interpretation

Confluence panel text

Decision
IT production capacity sizing must prioritize service continuity over cost minimization.
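
A sizing review needs a repeatable signal for "persistently undersized". The following is one possible sketch: it flags a capacity when too many utilization samples in the review window are high. The 80% level and the 10% tolerated share are assumed values chosen for illustration, not recommendations from this page or from Microsoft.

```python
def sizing_signal(utilization_pct, high_pct=80.0, max_high_share=0.10):
    """Classify a capacity as 'undersized' or 'ok' from utilization samples.

    utilization_pct: utilization samples (percent of capacity) for the window.
    high_pct: utilization level treated as high (assumed value).
    max_high_share: tolerated share of samples at or above high_pct (assumed).
    Returns a (label, share_of_high_samples) tuple.
    """
    if not utilization_pct:
        raise ValueError("at least one utilization sample is required")

    high = sum(1 for u in utilization_pct if u >= high_pct)
    share = high / len(utilization_pct)
    label = "undersized" if share > max_high_share else "ok"
    return label, share


if __name__ == "__main__":
    # Hypothetical samples for one review window.
    print(sizing_signal([40, 50, 90, 95, 60, 85, 30, 20, 45, 55]))
```

For IT production, a stricter tolerance than for Domain production would be consistent with prioritizing service continuity over cost.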


5.5 Capacity overage

What it is

A mechanism that allows excess usage beyond the purchased capacity threshold, subject to billing and governance.

Why it matters

It can reduce the risk of operational disruption during rare peaks.

Recommendation

Position

Overage is a safety net, not a sizing strategy.

Confluence panel text

Warning
Do not use overage to compensate for structural under-sizing.


5.6 Monitoring and operational visibility

What it is

The monitoring of capacity usage, saturation patterns, top consumers, and operational degradation signals.

Why it matters

Capacity governance is only effective if usage and saturation can be observed and acted upon.

Recommendation

For each production capacity, define:

Minimum baseline

Confluence panel text

Recommendation
Capacity monitoring must be part of normal run operations, not only incident management.
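
Identifying top consumers is a simple aggregation once usage data is available. The sketch below assumes usage records have already been exported as (workspace, capacity units) pairs; that record shape and the workspace names are hypothetical and would need to be adapted to the actual monitoring source.

```python
from collections import Counter


def top_consumers(usage_records, n=3):
    """Sum capacity units per workspace and return the n largest consumers.

    usage_records: iterable of (workspace, capacity_units) tuples.
    Returns a list of (workspace, total_units), largest first.
    """
    totals = Counter()
    for workspace, capacity_units in usage_records:
        totals[workspace] += capacity_units
    return totals.most_common(n)


if __name__ == "__main__":
    # Hypothetical usage records for one capacity.
    records = [("ws-a", 10), ("ws-b", 30), ("ws-a", 25), ("ws-c", 5)]
    print(top_consumers(records, n=2))
```

Reviewing this ranking on a fixed cadence, rather than only during incidents, keeps the monitoring proactive.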


5.7 Disaster recovery

What it is

The capacity-level disaster recovery posture associated with production data continuity.

Why it matters

The IT production workspace supports bronze and silver foundations, which makes it a core dependency for downstream exposure.

Recommendation

Position

For IT production, DR should never be left undocumented.

Confluence panel text

Decision
Disaster recovery for IT production must be assessed explicitly and recorded as an approved architecture choice.


5.8 Notifications and alerting

What it is

The definition of who is informed when capacity issues occur and how operational response is triggered.

Why it matters

Without alert ownership, capacity incidents tend to be handled too late or inconsistently.

Recommendation

Define:

Confluence panel text

Recommendation
Every production capacity must have a clearly assigned operational owner and alerting path.
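
The ownership rule can be enforced with a small inventory check. This is a sketch under stated assumptions: the inventory format, the field names `environment`, `owner`, and `alert_channel`, and the example values are all hypothetical.

```python
REQUIRED_FIELDS = ("owner", "alert_channel")


def capacities_missing_alerting(capacities):
    """Return production capacities lacking an owner or an alerting path.

    capacities: dict mapping capacity name -> dict with the keys
    "environment", "owner" and "alert_channel" (hypothetical format).
    Returns a dict mapping capacity name -> list of missing fields.
    """
    missing = {}
    for name, info in sorted(capacities.items()):
        if info.get("environment") != "production":
            continue
        gaps = [field for field in REQUIRED_FIELDS if not info.get(field)]
        if gaps:
            missing[name] = gaps
    return missing


if __name__ == "__main__":
    inventory = {
        "capacity-a": {"environment": "production", "owner": "it-run-team",
                       "alert_channel": "it-oncall"},
        "capacity-b": {"environment": "production", "owner": "",
                       "alert_channel": None},
        "capacity-c": {"environment": "non_production"},
    }
    print(capacities_missing_alerting(inventory))
```

An empty result means every production capacity has both an operational owner and an alerting path recorded.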


5.9 Data Engineering and Spark-related settings

What it is

Capacity-level settings related to Spark and Data Engineering workloads.

Why it matters

These settings are relevant if Spark-based processing is materially used in the IT workspace.

Recommendation

Position

This is a secondary topic in our model unless Spark becomes a major production dependency.


6. Recommended Configuration Matrix

| Setting                       | IT Production       | Domain Production | Non-Production      | Recommendation                          |
|-------------------------------|---------------------|-------------------|---------------------|-----------------------------------------|
| Dedicated capacity            | Yes                 | Preferred         | Separate            | Mandatory for IT production             |
| Shared with IT production     | No                  | No                | No                  | Not allowed                             |
| Workspace reassignment rights | Very restricted     | Restricted        | Controlled          | Govern centrally                        |
| Surge protection              | Optional complement | Recommended       | Optional            | Primarily for shared/variable workloads |
| Capacity overage              | Optional, capped    | Optional, capped  | Usually not required | Safety net only                        |
| Monitoring                    | Mandatory           | Mandatory         | Recommended         | Standard operating baseline             |
| DR assessment                 | Mandatory           | Case by case      | Not priority        | Explicit decision required              |
| Spark governance              | Case by case        | Case by case      | Flexible            | Only where relevant                     |
| Scaling review cadence        | Regular             | Regular           | Periodic            | Metrics-driven                          |

7. Operational Rules

Rule 1

Protect IT production by design.
Critical IT workloads must not depend on the same shared capacity behavior as variable domain workloads.

Rule 2

Use isolation before optimization.
Do not try to solve structural contention only with reactive tuning or protection features.

Rule 3

Treat overage as an exception mechanism.
It may improve resilience, but it must not become the default operating mode.

Rule 4

Make monitoring part of standard operations.
Capacity review must be proactive and periodic.

Rule 5

Separate production from experimentation.
Development and testing workloads must not compete with critical production capacity.


8. Proposed Architecture Decision

Recommended decision

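The recommended target state for our platform is:

  • Capacity A dedicated to IT production
  • Capacity B for Domain production
  • Capacity C for non-production workloads
  • no capacity shared between IT production and Domain production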

Architecture conclusion

This is the most coherent model for a Fabric platform used primarily as a storage and exposure layer, where the IT production workspace must remain stable independently of Domain activity.


9. Configuration Decisions to Validate

Checklist


