
How Does Data Factory in Microsoft Fabric Work (Pipelines, Dataflow Gen2)?

4 March 2026


Solv. Systems


In Short: What actually is Microsoft Fabric Data Factory?

Data Factory in Microsoft Fabric is the data integration layer that provides cloud-scale data movement and data transformation services for ETL and orchestration inside Fabric. In practice, most teams use it through two primary building blocks:

Pipelines: Orchestration workflows that group activities (copy, transform, control flow) and run on demand, on schedules, or via event triggers.
Dataflow Gen2: A Power Query-based, low-code ETL experience for ingesting from many sources, applying transformations, and loading to Fabric destinations like Lakehouse and Warehouse.

The real story is that Fabric Data Factory is Microsoft’s attempt to make ingestion, transformation, and orchestration feel like one product experience instead of a stitched-together toolchain.

Which data challenges is Fabric Data Factory designed to solve?

Most organizations don’t struggle because they can’t connect to data; they struggle because they can’t repeatably move and shape data without creating fragility. Common issues include manual refresh processes, "one-off" ETL logic trapped in personal workspaces, and inconsistent transformations.

Fabric Data Factory targets this by offering a governed place to build repeatable pipelines and standardized transformations.

What are pipelines in Microsoft Fabric Data Factory?

A pipeline is a logical grouping of activities that together perform a task, so you deploy and schedule the workflow as a unit rather than managing each step separately.

What do Fabric pipelines typically do?

Fabric pipelines commonly orchestrate end-to-end workflows including:

Ingestion: Moving data from sources via the Copy activity.
Validation: Shaping and checking data quality.
Control Flow: Running logic using conditions, loops, and error handling.
Downstream Triggers: Kicking off a Dataflow Gen2, notebook, or notifying teams.
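The four roles above can be pictured with a small Python sketch. This is only an illustration of "a pipeline is an ordered group of activities": the `Pipeline` class, the step functions, and the sample rows are all hypothetical and are not part of any Fabric SDK.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical model: a pipeline is an ordered group of named activities.
# None of these names come from Fabric; they only illustrate the flow.
@dataclass
class Pipeline:
    name: str
    activities: list = field(default_factory=list)

    def add(self, label: str, step: Callable[[dict], dict]) -> "Pipeline":
        self.activities.append((label, step))
        return self

    def run(self, context: dict) -> dict:
        """Run activities in order; stop at the first failure and record it."""
        completed = []
        failed = None
        for label, step in self.activities:
            try:
                context = step(context)
                completed.append(label)
            except Exception as exc:  # control flow: basic error handling
                failed = (label, str(exc))
                break
        return {**context, "completed": completed, "failed": failed}


def ingest(ctx: dict) -> dict:
    # Stand-in for a Copy activity: pretend these rows came from a source.
    rows = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": None}]
    return {**ctx, "rows": rows}


def validate(ctx: dict) -> dict:
    # Simple data quality check: reject rows with a missing amount.
    good = [r for r in ctx["rows"] if r["amount"] is not None]
    return {**ctx, "rows": good, "rejected": len(ctx["rows"]) - len(good)}


def notify(ctx: dict) -> dict:
    # Stand-in for a downstream trigger or team notification.
    message = f"{len(ctx['rows'])} rows loaded, {ctx['rejected']} rejected"
    return {**ctx, "message": message}


result = (
    Pipeline("daily_sales")
    .add("ingest", ingest)
    .add("validate", validate)
    .add("notify", notify)
    .run({})
)
print(result["completed"], result["message"])
```

In Fabric itself these steps are configured as activities in the pipeline designer rather than written as functions; the sketch only shows why deploying and scheduling the group as one unit is convenient.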

How do pipelines run in Fabric?

Pipelines can be executed on demand, on a schedule, or via event-based triggers. Microsoft categorizes activities into three main buckets: data movement, data transformation, and control flow.
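For the on-demand case, a run can also be requested programmatically. The sketch below only builds the HTTP request and does not send it. It assumes the endpoint shape of Fabric's "run on demand item job" REST API (`/jobs/instances?jobType=Pipeline`) and an `executionData.parameters` request body; check both against the current Microsoft documentation before relying on them. The IDs, token, and `load_date` parameter are placeholders.

```python
import json
import urllib.request
from typing import Optional

# Assumed base URL for the Fabric REST API; verify against current docs.
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_run_request(
    workspace_id: str,
    pipeline_id: str,
    token: str,
    parameters: Optional[dict] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST that asks Fabric to run a pipeline."""
    url = (
        f"{FABRIC_API}/workspaces/{workspace_id}/items/{pipeline_id}"
        "/jobs/instances?jobType=Pipeline"
    )
    # Assumed body shape: pipeline parameters nested under executionData.
    body = {"executionData": {"parameters": parameters or {}}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; a real call needs real IDs and a valid access token.
req = build_run_request(
    "workspace-id", "pipeline-id", "access-token", {"load_date": "2026-03-04"}
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would submit the run. Schedules and event triggers are configured on the pipeline itself, so they need no code like this.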

What is Dataflow Gen2 and why do teams use it?

Dataflow Gen2 is a cloud-based, low-code ETL tool built on the familiar Power Query experience. It is designed to connect to many sources, apply transformations, and load to multiple destinations including Lakehouse and Warehouse.

Teams lean on Dataflow Gen2 because it provides a low-code surface that BI and analytics teams can own, it supports multiple Fabric destinations, and it integrates tightly with pipelines for larger workflows.

How do pipelines and Dataflow Gen2 work together?

Fabric encourages a specific pattern: Pipelines handle the orchestration (the when, why, and how of the workflow), while Dataflow Gen2 handles the transformations (the how of the data shaping). The outputs then land in Fabric destinations for downstream analytics.

What are the common limitations of Fabric Data Factory?

While powerful, the platform is still evolving. Treat Microsoft’s published limitations list as required reading before any production rollout.

What are the current pipeline limitations?

Some notable constraints include:

No Tumbling Window Triggers: A common pattern in Azure Data Factory (ADF) that is not yet natively supported.
Credential Storage: Pipeline connectors don't currently support OAuth or Azure Key Vault.
Managed System Identity (MSI): Support is currently limited primarily to Azure Blob Storage, with broader support coming soon.
Missing Activities: Mapping Data Flow and SSIS integration runtime are not yet available.
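Where a workload depended on tumbling windows, one possible workaround is to compute the windows yourself and pass each one to a scheduled pipeline as parameters. The sketch below shows only the window arithmetic (fixed-size, contiguous, non-overlapping intervals). The function name and the idea of passing the bounds as pipeline parameters are assumptions for illustration; the sketch does not reproduce the retry, backfill, or dependency behavior of ADF tumbling window triggers.

```python
from datetime import datetime, timedelta


def tumbling_windows(start: datetime, end: datetime, size: timedelta):
    """Yield contiguous, non-overlapping [window_start, window_end) intervals.

    Only complete windows that fit before `end` are produced.
    """
    if size <= timedelta(0):
        raise ValueError("size must be positive")
    cursor = start
    while cursor + size <= end:
        yield cursor, cursor + size
        cursor += size


# Hypothetical example: three hourly windows for one morning's load.
windows = list(
    tumbling_windows(
        datetime(2026, 3, 4, 0, 0),
        datetime(2026, 3, 4, 3, 0),
        timedelta(hours=1),
    )
)
for window_start, window_end in windows:
    print(window_start.isoformat(), "->", window_end.isoformat())
```

Each `(window_start, window_end)` pair could then drive a filtered copy, so that every run covers exactly one slice of time and no slice is read twice.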

What are the practical scale boundaries for pipelines?

Microsoft publishes workspace and pipeline limits that you must design around, including the maximum number of pipelines per workspace and concurrency limits. These matter most when trying to centralize all ingestion into "mega-pipelines."
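As a rough illustration of designing around such limits, the sketch below splits a list of source tables into smaller batches, each of which could be handled by its own modular pipeline run instead of one mega-pipeline. The limit value, table names, and pipeline naming scheme are hypothetical; use the figures Microsoft currently publishes for your capacity.

```python
# Hypothetical per-pipeline cap; replace with a value derived from the
# limits Microsoft publishes for your workspace.
MAX_TABLES_PER_PIPELINE = 4


def chunk(items: list, size: int) -> list:
    """Split items into consecutive batches of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


# Ten made-up source tables to ingest.
tables = [f"table_{n:02d}" for n in range(1, 11)]

# One small pipeline per batch, named by position.
plan = {
    f"ingest_batch_{index}": batch
    for index, batch in enumerate(chunk(tables, MAX_TABLES_PER_PIPELINE), start=1)
}

for pipeline_name, batch in plan.items():
    print(pipeline_name, batch)
```

With ten tables and a cap of four, the plan holds three batches of four, four, and two tables, each small enough to monitor and rerun on its own.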

What are the Dataflow Gen2 "gotchas"?

Common issues we see in projects include:

Lakehouse Naming Rules: Spaces and special characters are not supported in column or table names.
Unsupported Types: Duration and binary columns are not supported for Lakehouse destinations.
Gateway Requirements: You must maintain a supported gateway version (one of the last six releases).
Refresh Limits: Token refresh limitations can cause refreshes longer than one hour to fail.
Validation Timeouts: There is a 10-minute publish and validation limit per query.
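The naming and type constraints above lend themselves to a pre-flight check before a dataflow is published. The following is a minimal sketch, assuming a conservative rule that only letters, digits, and underscores are safe in names; the exact set of characters the Lakehouse destination rejects is not listed here, so treat the pattern as an assumption. The function names and the sample schema are hypothetical.

```python
import re

# Types named above as unsupported for Lakehouse destinations.
UNSUPPORTED_TYPES = {"duration", "binary"}


def sanitize_name(name: str) -> str:
    """Replace runs of spaces and special characters with one underscore."""
    cleaned = re.sub(r"[^0-9A-Za-z_]+", "_", name.strip())
    return cleaned.strip("_") or "column"


def check_schema(columns: dict) -> tuple:
    """Return (renamed columns, list of problems) for a name -> type mapping."""
    renamed = {}
    problems = []
    for name, dtype in columns.items():
        safe = sanitize_name(name)
        if safe in renamed:
            problems.append(f"name collision after sanitizing: {name!r} -> {safe!r}")
        if dtype.lower() in UNSUPPORTED_TYPES:
            problems.append(f"unsupported type for Lakehouse destination: {name!r} ({dtype})")
        renamed[safe] = dtype
    return renamed, problems


# Made-up schema with two risky names and one unsupported type.
schema = {"Order Date (UTC)": "datetime", "Net %": "decimal", "Elapsed": "duration"}
renamed, problems = check_schema(schema)
print(renamed)
print(problems)
```

Running a check like this in a notebook or CI step before publishing surfaces naming and type issues early, rather than at publish time.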

Who is Fabric Data Factory for?

It is a strong fit when you want a standard ingestion and transformation approach that is visible, schedulable, and monitorable. It provides a clear path from raw ingestion to curated datasets, especially for organizations trying to reduce "ETL sprawl."

What is the strategic point most organizations miss?

Success comes down to the operating model more than the tool. You must define who owns transformations, how to separate reusable patterns from one-off logic, and how to design around known platform limits.

Why work with Solv Systems on Fabric Data Factory?

At Solv Systems, we implement Fabric Data Factory as a repeatable ingestion product, not just a collection of pipelines.

Strategy Before Build

We start with business outcomes and data SLAs to determine the right split between pipelines and Gen2 transformations.

Patterns That Scale

We design modular pipeline patterns that respect platform limits and avoid the fragility of "mega-pipelines."

Proactive Engineering

We proactively engineer around common Gen2 constraints like naming, gateway refresh behavior, and publish limits so you don't hit blockers in production.

Governance and Adoption

We establish clear ownership of transformations so your data foundation becomes reusable across the organization.


Frequently Asked Questions

Quick answers to your questions about Microsoft Fabric.

Pipelines are primarily for orchestration and data movement, while Dataflow Gen2 is optimized for low-code data transformation using the Power Query interface.

Dataflow Gen2 is designed to load directly into Fabric destinations, including the Lakehouse and Warehouse, enabling unified analytics foundations.

When loading to a Lakehouse destination, Gen2 currently does not support spaces or special characters in column or table names, which can lead to publish errors if not planned for.

Use Spark Notebooks for complex, code-heavy transformations at massive scale, while Dataflow Gen2 is better for low-code, Power Query-based ETL tasks.

Through an On-Premises Data Gateway, you can securely connect Fabric Pipelines and Dataflow Gen2 to data residing behind firewalls.

