    How Does Data Factory in Microsoft Fabric Work (Pipelines, Dataflow Gen2)?

    4 March 2026 · 5–7 minute read · Solv. Systems
    High-tech visual of Microsoft Fabric Data Factory, pipelines, and Dataflow Gen2.

    In Short: What actually is Microsoft Fabric Data Factory?

    Data Factory in Microsoft Fabric is the data integration layer that provides cloud-scale data movement and data transformation services for ETL and orchestration inside Fabric. In practice, most teams use it through two primary building blocks:

    • Pipelines: Orchestration workflows that group activities (copy, transform, control flow) and run on demand, on schedules, or via event triggers.
    • Dataflow Gen2: A Power Query-based, low-code ETL experience for ingesting from many sources, applying transformations, and loading to Fabric destinations like Lakehouse and Warehouse.

    The real story is that Fabric Data Factory is Microsoft’s attempt to make ingestion, transformation, and orchestration feel like one product experience instead of a stitched-together toolchain.

    Which data challenges is Fabric Data Factory designed to solve?

    Most organizations don’t struggle because they can’t connect to data; they struggle because they can’t move and shape data repeatably without creating fragility. Common issues include manual refresh processes, "one-off" ETL logic trapped in personal workspaces, and inconsistent transformations.

    Fabric Data Factory targets this by offering a governed place to build repeatable pipelines and standardized transformations.

    What are pipelines in Microsoft Fabric Data Factory?

    A pipeline is a logical grouping of activities that together perform a task, so you deploy and schedule the workflow as a unit rather than managing each step separately.

    What do Fabric pipelines typically do?

    Fabric pipelines commonly orchestrate end-to-end workflows including:

    • Ingestion: Moving data from sources via the Copy activity.
    • Validation: Shaping and checking data quality.
    • Control Flow: Running logic using conditions, loops, and error handling.
    • Downstream Triggers: Kicking off a Dataflow Gen2, notebook, or notifying teams.
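    The idea of a pipeline as one deployable unit of ordered activities can be sketched in plain Python. This is only a conceptual model, not Fabric's API: the `Activity` and `Pipeline` classes, the `stage` labels, and the three sample functions are all hypothetical names invented for illustration. The stop-on-first-failure loop stands in for the control-flow behavior described above.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical in-memory model. None of these names come from a Fabric SDK.


@dataclass
class Activity:
    name: str
    stage: str  # e.g. "ingestion", "validation", "downstream trigger"
    run: Callable[[dict], bool]  # returns True on success


@dataclass
class Pipeline:
    name: str
    activities: list = field(default_factory=list)

    def execute(self, context: dict) -> list:
        """Run activities in order and stop at the first failure."""
        log = []
        for activity in self.activities:
            ok = activity.run(context)
            log.append(f"{activity.name}: {'succeeded' if ok else 'failed'}")
            if not ok:
                break  # simple control flow: skip everything downstream
        return log


def copy_orders(ctx: dict) -> bool:
    # Stand-in for a Copy activity: pretend two rows were ingested.
    ctx["rows"] = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": None}]
    return True


def validate_orders(ctx: dict) -> bool:
    # Keep only rows with an amount; fail the run if nothing survives.
    ctx["valid_rows"] = [r for r in ctx["rows"] if r["amount"] is not None]
    return len(ctx["valid_rows"]) > 0


def notify_team(ctx: dict) -> bool:
    # Stand-in for a notification step at the end of the workflow.
    ctx["message"] = f"{len(ctx['valid_rows'])} valid rows loaded"
    return True


pipeline = Pipeline(
    "daily_orders",
    [
        Activity("Copy orders", "ingestion", copy_orders),
        Activity("Validate orders", "validation", validate_orders),
        Activity("Notify team", "downstream trigger", notify_team),
    ],
)

context = {}
run_log = pipeline.execute(context)
print(run_log)
```

    In a real Fabric workspace these steps would be configured as pipeline activities rather than Python functions. The sketch only shows why grouping them matters: the whole sequence is scheduled, run, and monitored as one unit.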

    How do pipelines run in Fabric?

    Pipelines can be executed on demand, on a schedule, or via event-based triggers. Microsoft categorizes activities into three main buckets: data movement, data transformation, and control flow.

    What is Dataflow Gen2 and why do teams use it?

    Dataflow Gen2 is a cloud-based, low-code ETL tool built on the familiar Power Query experience. It is designed to connect to many sources, apply transformations, and load to multiple destinations including Lakehouse and Warehouse.

    Teams lean on Dataflow Gen2 because it provides a low-code surface that BI and analytics teams can own, it supports multiple Fabric destinations, and it integrates tightly with pipelines for larger workflows.

    How do pipelines and Dataflow Gen2 work together?

    Fabric encourages a specific pattern: Pipelines handle the orchestration (the when, why, and how of the workflow), while Dataflow Gen2 handles the transformations (the how of the data shaping). The outputs then land in Fabric destinations for downstream analytics.

    What are the common limitations of Fabric Data Factory?

    While powerful, the platform is still evolving. Treat Microsoft’s published limitations list as required reading before any production rollout.

    What are the current pipeline limitations?

    Some notable constraints include:

    • No Tumbling Window Triggers: A common pattern in older ADF that is not yet natively supported.
    • Credential Storage: Connectors don't currently support OAuth and Azure Key Vault for pipelines in older patterns.
    • Managed System Identity (MSI): Support is currently limited primarily to Azure Blob Storage, with broader support coming soon.
    • Missing Activities: Mapping Data Flow and SSIS integration runtime are not yet available.

    What are the practical scale boundaries for pipelines?

    Microsoft publishes workspace and pipeline limits that you must design around, including the maximum number of pipelines per workspace and concurrency limits. These matter most when trying to centralize all ingestion into "mega-pipelines."

    What are the Dataflow Gen2 "gotchas"?

    Common issues we see in projects include:

    • Lakehouse Naming Rules: Spaces and special characters are not supported in column or table names.
    • Unsupported Types: Duration and binary columns are not supported for Lakehouse destinations.
    • Gateway Requirements: You must maintain a supported gateway version (one of the last six releases).
    • Refresh Limits: Token refresh limitations can cause refreshes longer than one hour to fail.
    • Validation Timeouts: There is a 10-minute publish and validation limit per query.
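    The naming and type constraints in the list above lend themselves to a pre-flight check before a schema is pointed at a Lakehouse destination. The following is a minimal sketch under stated assumptions: `sanitize_name` and `check_schema` are hypothetical helpers, the "letters, digits, and underscores only" rule is a deliberately conservative reading of "no spaces or special characters," and the type labels are illustrative rather than Fabric's internal type names.

```python
import re

# Assumption: illustrative type labels, not Fabric's internal type names.
UNSUPPORTED_TYPES = {"duration", "binary"}


def sanitize_name(name: str) -> str:
    """Replace each run of characters other than letters, digits, or
    underscores with a single underscore, then trim edge underscores."""
    cleaned = re.sub(r"[^0-9A-Za-z_]+", "_", name.strip())
    return cleaned.strip("_")


def check_schema(columns: dict) -> tuple:
    """Return (renamed supported columns, list of problem descriptions)."""
    renamed = {}
    problems = []
    for name, col_type in columns.items():
        if col_type.lower() in UNSUPPORTED_TYPES:
            problems.append(f"{name}: type '{col_type}' is not supported")
            continue
        renamed[sanitize_name(name)] = col_type
    return renamed, problems


schema = {
    "Order ID": "int64",
    "Net Amount (USD)": "decimal",
    "Processing Time": "duration",
}
safe_schema, issues = check_schema(schema)
print(safe_schema)  # {'Order_ID': 'int64', 'Net_Amount_USD': 'decimal'}
print(issues)       # ["Processing Time: type 'duration' is not supported"]
```

    A production version would also need to handle two source names that collapse to the same sanitized name, and would decide whether to drop or convert unsupported columns instead of only reporting them.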

    Who is Fabric Data Factory for?

    It is a strong fit when you want a standard ingestion and transformation approach that is visible, schedulable, and monitorable. It provides a clear path from raw ingestion to curated datasets, especially for organizations trying to reduce "ETL sprawl."

    What is the strategic point most organizations miss?

    Success comes down to the operating model more than the tool. You must define who owns transformations, how to separate reusable patterns from one-off logic, and how to design around known platform limits.

    Why work with Solv Systems on Fabric Data Factory?

    At Solv Systems, we implement Fabric Data Factory as a repeatable ingestion product, not just a collection of pipelines.

    Strategy Before Build

    We start with business outcomes and data SLAs to determine the right split between pipelines and Gen2 transformations.

    Patterns That Scale

    We design modular pipeline patterns that respect platform limits and avoid the fragility of "mega-pipelines."

    Proactive Engineering

    We proactively engineer around common Gen2 constraints like naming, gateway refresh behavior, and publish limits so you don't hit blockers in production.

    Governance and Adoption

    We establish clear ownership of transformations so your data foundation becomes reusable across the organization.

    FAQ

    Frequently Asked Questions

    Quick answers to your questions about Microsoft Fabric.

    Pipelines are primarily for orchestration and data movement, while Dataflow Gen2 is optimized for low-code data transformation using the Power Query interface.

    Yes, Dataflow Gen2 is designed to load directly into Fabric destinations, including the Lakehouse and Warehouse, enabling unified analytics foundations.

    When loading to a Lakehouse destination, Gen2 currently does not support spaces or special characters in column or table names, which can lead to publish errors if not planned for.

    Use Spark Notebooks for complex, code-heavy transformations at massive scale, while Dataflow Gen2 is better for low-code, Power Query-based ETL tasks.

    Yes, through the use of an On-Premises Data Gateway, you can securely connect Fabric Pipelines and Dataflow Gen2 to data residing behind firewalls.

    Accelerate Your Data Ingestion

    We design modular pipeline patterns that respect platform limits. Solv Systems ensures your Fabric Data Factory implementation is scalable and avoids common 'gotchas'.

    Get in Touch
    Solv.

    Experts in Power BI & Microsoft Fabric Implementation. Empowering businesses through data excellence.

    Office

    1 Crane Ave, Greenshields Park, Gqeberha, South Africa

    info@solv-systems.com

    © 2026 Solv Systems. All rights reserved.