Microsoft Fabric | Power BI | Data Analytics & AI

Quick Tip : Validate runMultiple DAG In Fabric

Validate the DAG before execution

Published August 23, 2024

Principal Program Manager, Microsoft Fabric CAT helping users and organizations build scalable, insightful, secure solutions. Blogs, opinions are my own and do not represent my employer.

First, if you haven't noticed, mssparkutils has been officially renamed to notebookutils. Check the official documentation for details, and update your notebooks to use notebookutils.

I have written about runMultiple before. It lets you run multiple notebooks in parallel with a defined orchestration pattern, including dependencies. notebookutils now also has a .validateDAG method that checks whether the DAG follows the expected JSON structure. It is a helpful check to run before executing runMultiple.

Example:

I will use the same DAG I used in my previous blog.

DAG = {
    "activities": [
        {
            "name": "extract_customers", 
            "path": "extract_customers", 
            "timeoutPerCellInSeconds": 120,
            "args": {"rows": 1000},
        },
        {
            "name": "extract_products", 
            "path": "extract_products", 
            "timeoutPerCellInSeconds": 120,
            "args": {"rows": 5000},
        },
        {
            "name": "extract_offers", 
            "path": "extract_offers", 
            "timeoutPerCellInSeconds": 120,
            "args": {"rows": 1000},
        },
        {
            "name": "extract_leads", 
            "path": "extract_leads", 
            "timeoutPerCellInSeconds": 120,
            "args": {"rows": 100000},
        },
        {
            "name": "customer_table",
            "path": "customer_table",
            "timeoutPerCellInSeconds": 90,
            "retry": 1,
            "retryIntervalInSeconds": 10,
            "dependencies": ["extract_customers"]
        },
        {
            "name": "products_table",
            "path": "products_table",
            "timeoutPerCellInSeconds": 90,
            "retry": 1,
            "retryIntervalInSeconds": 10,
            "dependencies": ["extract_products"]
        },
        {
            "name": "leads_table",
            "path": "leads_table",
            "timeoutPerCellInSeconds": 90,
            "retry": 1,
            "retryIntervalInSeconds": 10,
            "dependencies": ["extract_leads","customer_table", "products_table"]
        },
        {
            "name": "offers_table",
            "path": "offers_table",
            "timeoutPerCellInSeconds": 90,
            "retry": 1,
            "retryIntervalInSeconds": 10,
            "dependencies": ["extract_offers","customer_table", "products_table"]
        },
        {
            "name": "refresh_dataset",
            "path": "refresh_dataset",
            "timeoutPerCellInSeconds": 90,
            "retry": 1,
            "retryIntervalInSeconds": 10,
            "dependencies": ["customer_table","products_table","leads_table","offers_table"]
        }

    ],
    "timeoutInSeconds": 3600, # max 1 hour for the entire pipeline
    "concurrency": 5 # max 5 notebooks in parallel
}

notebookutils.notebook.validateDAG(DAG)
# Output: True
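
A natural next step is to gate runMultiple on the validation result. The sketch below is one way to do that. run_if_valid is a hypothetical helper, not part of notebookutils, and it takes the validator and runner as arguments so it does not depend on the Fabric runtime. Because this post does not show whether a failed validation returns False or raises, the helper treats both as a failure.

```python
from typing import Any, Callable, Optional


def run_if_valid(
    dag: dict,
    validate: Callable[[dict], Any],
    run: Callable[[dict], Any],
) -> Optional[Any]:
    """Run the DAG only when validation succeeds; otherwise return None.

    `validate` and `run` are injected (hypothetical parameter names).
    A falsy return value and a raised exception are both treated as a
    validation failure, since this sketch does not assume which one
    the validator uses.
    """
    try:
        is_valid = bool(validate(dag))
    except Exception as exc:
        print(f"DAG validation failed: {exc}")
        return None
    if not is_valid:
        print("DAG validation returned a falsy result; skipping execution.")
        return None
    return run(dag)
```

In a Fabric notebook, the call would look like run_if_valid(DAG, notebookutils.notebook.validateDAG, notebookutils.notebook.runMultiple).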

If I add a dependency that doesn't exist, validation will fail.

INVALID_DAG = {
    "activities": [
        {
            "name": "extract_customers", 
            "path": "extract_customers", 
            "timeoutPerCellInSeconds": 120,
            "args": {"rows": 1000},
        },
        {
            "name": "extract_products", 
            "path": "extract_products", 
            "timeoutPerCellInSeconds": 120,
            "args": {"rows": 5000},
        },
        {
            "name": "extract_offers", 
            "path": "extract_offers", 
            "timeoutPerCellInSeconds": 120,
            "args": {"rows": 1000},
        },
        {
            "name": "extract_leads", 
            "path": "extract_leads", 
            "timeoutPerCellInSeconds": 120,
            "args": {"rows": 100000},
        },
        {
            "name": "customer_table",
            "path": "customer_table",
            "timeoutPerCellInSeconds": 90,
            "retry": 1,
            "retryIntervalInSeconds": 10,
            "dependencies": ["THIS_NOTEBOOK_DOES_NOT_EXIST"]  # INVALID: no activity has this name
        },
        {
            "name": "products_table",
            "path": "products_table",
            "timeoutPerCellInSeconds": 90,
            "retry": 1,
            "retryIntervalInSeconds": 10,
            "dependencies": ["extract_products"]
        },
        {
            "name": "leads_table",
            "path": "leads_table",
            "timeoutPerCellInSeconds": 90,
            "retry": 1,
            "retryIntervalInSeconds": 10,
            "dependencies": ["extract_leads","customer_table", "products_table"]
        },
        {
            "name": "offers_table",
            "path": "offers_table",
            "timeoutPerCellInSeconds": 90,
            "retry": 1,
            "retryIntervalInSeconds": 10,
            "dependencies": ["extract_offers","customer_table", "products_table"]
        },
        {
            "name": "refresh_dataset",
            "path": "refresh_dataset",
            "timeoutPerCellInSeconds": 90,
            "retry": 1,
            "retryIntervalInSeconds": 10,
            "dependencies": ["customer_table","products_table","leads_table","offers_table"]
        }

    ],
    "timeoutInSeconds": 3600, # max 1 hour for the entire pipeline
    "concurrency": 5 # max 5 notebooks in parallel
}

notebookutils.notebook.validateDAG(INVALID_DAG)
# Returns an error
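
In a large DAG it can help to see exactly which dependency is missing. The pure-Python sketch below lists every dependency that matches no activity name. It is a separate check, not a description of how validateDAG works internally. find_unknown_dependencies is a hypothetical helper, and it assumes that dependencies refer to the "name" field of other activities.

```python
def find_unknown_dependencies(dag: dict) -> dict:
    """Map each activity name to the dependencies that match no activity name."""
    activities = dag.get("activities", [])
    # Assumption: dependencies reference the "name" of another activity.
    known = {activity["name"] for activity in activities}
    problems = {}
    for activity in activities:
        missing = [d for d in activity.get("dependencies", []) if d not in known]
        if missing:
            problems[activity["name"]] = missing
    return problems


# Trimmed-down version of the invalid DAG, for illustration only.
sample = {
    "activities": [
        {"name": "extract_customers", "path": "extract_customers"},
        {
            "name": "customer_table",
            "path": "customer_table",
            "dependencies": ["THIS_NOTEBOOK_DOES_NOT_EXIST"],
        },
    ]
}
print(find_unknown_dependencies(sample))
# {'customer_table': ['THIS_NOTEBOOK_DOES_NOT_EXIST']}
```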

Note that validation is not exhaustive. For example, you could set concurrency to -5, which is invalid because it has to be a positive number, but validateDAG will not flag it as an error. It is still a very handy check.
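
Gaps like this can be covered with a few extra checks of your own before calling runMultiple. The sketch below is a hypothetical helper, extra_dag_checks, that flags non-positive values. Only the rule that concurrency must be positive comes from this post. Applying the same rule to the timeout fields is my assumption.

```python
def extra_dag_checks(dag: dict) -> list:
    """Return human-readable problems with numeric settings in the DAG."""
    problems = []

    def check_positive(owner: str, key: str, container: dict) -> None:
        if key not in container:
            return
        value = container[key]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            problems.append(
                f"{owner}: {key} must be a positive integer, got {value!r}"
            )

    # From the post: concurrency has to be a positive number.
    check_positive("DAG", "concurrency", dag)
    # Assumption: timeouts should be positive as well.
    check_positive("DAG", "timeoutInSeconds", dag)
    for activity in dag.get("activities", []):
        name = activity.get("name", "<unnamed>")
        check_positive(name, "timeoutPerCellInSeconds", activity)
    return problems


print(extra_dag_checks({"activities": [], "concurrency": -5}))
# ['DAG: concurrency must be a positive integer, got -5']
```

An empty list means none of these extra checks found a problem. It does not replace validateDAG, which still checks the overall structure.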

#notebookutils #runmultiple #mssparkutils #microsoftfabric #microsoft-fabric #notebook

