Refreshing Individual Tables and Partitions With Semantic Link

Using Semantic Link to refresh tables and partitions in a dataset

The latest version of Semantic Link (0.4.0) has many methods that provide a convenient abstraction for calling Fabric/Power BI REST APIs. In this blog, I will show how to use .refresh_dataset(), which calls the Enhanced Refresh API to refresh Power BI semantic models, tables and partitions.

Prerequisites: You need Microsoft Fabric capacity to run Semantic Link in a notebook. The semantic model must be in a workspace backed by an F or P capacity or a PPU license.

Method 1: Using the Enhanced Refresh API

To refresh the entire semantic model, use fabric.refresh_dataset(dataset, workspace). You can also pass a list of dictionaries, each naming a table and optionally a partition, to refresh only those objects. This is particularly helpful for large semantic models and models with incremental refresh defined.

In the example below, I refresh only the Order_Details table and the Customers-ROW partition of the Customers table.

You can use the REST API method as well, but this is a more convenient way in my opinion.

#!pip install semantic-link --q
# Define the dataset and workspace
import sempy.fabric as fabric 
dataset = "SL-Refresh"
workspace = "Sales"

# Objects to refresh, defined as a list of dictionaries
objects_to_refresh = [
    {
        "table": "Customers",
        "partition": "Customers-ROW"
    },
    {
        "table": "Order_Details"
    }
]

# Refresh the dataset
fabric.refresh_dataset(
    workspace=workspace,
    dataset=dataset, 
    objects=objects_to_refresh,
)

# List the refresh requests
fabric.list_refresh_requests(dataset=dataset, workspace=workspace)
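For comparison, here is a rough sketch of what the REST API route mentioned earlier involves. The helper name build_enhanced_refresh_request is hypothetical, and the URL and body keys are my reading of the public Power BI enhanced refresh endpoint, so check them against the official documentation before relying on them. Note that the REST API identifies the workspace and dataset by ID rather than by name, and you still have to obtain a token and send the request yourself.

```python
import json

# Assumption: base URL and body keys follow the public Power BI
# "Refresh Dataset In Group" (enhanced refresh) REST endpoint.
BASE_URL = "https://api.powerbi.com/v1.0/myorg"


def build_enhanced_refresh_request(group_id, dataset_id, objects, refresh_type="Full"):
    """Hypothetical helper: return (url, body) for an enhanced refresh POST."""
    url = f"{BASE_URL}/groups/{group_id}/datasets/{dataset_id}/refreshes"
    body = {
        "type": refresh_type,
        "commitMode": "transactional",
        # Copy each entry so the caller's list is not mutated later
        "objects": [dict(obj) for obj in objects],
    }
    return url, body


# Placeholders: the REST API expects workspace and dataset IDs, not names
url, body = build_enhanced_refresh_request(
    group_id="<workspace_id>",
    dataset_id="<dataset_id>",
    objects=[
        {"table": "Customers", "partition": "Customers-ROW"},
        {"table": "Order_Details"},
    ],
)
print(url)
print(json.dumps(body, indent=2))
```

With Semantic Link, all of this collapses into the single fabric.refresh_dataset() call shown above.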

To confirm the partitions have been refreshed successfully, we can use .get_tmsl() to get the details of each partition in every table.

import pandas as pd
import json
import sempy.fabric as fabric 

def get_partition_refreshes(dataset, workspace):
    """
    Sandeep Pawar  |   fabric.guru
    Returns a pandas dataframe with three columns - table_name, partition_name, partition_refreshedTime
    """
    tmsl_data = json.loads(fabric.get_tmsl(dataset=dataset, workspace=workspace))

    df = pd.json_normalize(
        tmsl_data, 
        record_path=['model', 'tables', 'partitions'], 
        meta=[
            ['model', 'name'], 
            ['model', 'tables', 'name']
        ],
        errors='ignore',
        record_prefix='partition_'
    )

    df = df.rename(columns={'model.tables.name': 'table_name'})
    return df[['table_name', 'partition_name', 'partition_refreshedTime']]

df = get_partition_refreshes(dataset=dataset, workspace=workspace)
df

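The resulting dataframe has one row per partition, showing the table name, the partition name and the time the partition was last refreshed.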
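If you want to see what the function above is doing without pandas, the same walk over the TMSL can be sketched in plain Python. This is only an illustration: partition_refresh_times is a hypothetical helper, and the sample TMSL below is hand-written with made-up partition names and timestamps, not real output. It also assumes that a partition which has never been refreshed may lack the refreshedTime key, in which case None is returned for it.

```python
import json


def partition_refresh_times(tmsl_json):
    """Return a list of (table, partition, refreshedTime or None) tuples."""
    model = json.loads(tmsl_json).get("model", {})
    rows = []
    for table in model.get("tables", []):
        for partition in table.get("partitions", []):
            # .get() tolerates partitions without a refreshedTime key
            rows.append(
                (table["name"], partition["name"], partition.get("refreshedTime"))
            )
    return rows


# Hand-written sample in the shape of a TMSL document (values are invented)
sample_tmsl = json.dumps({
    "name": "SL-Refresh",
    "model": {
        "tables": [
            {"name": "Customers", "partitions": [
                {"name": "Customers-ROW", "refreshedTime": "2024-01-01T10:00:00"},
                {"name": "Customers-Other"},
            ]},
            {"name": "Order_Details", "partitions": [
                {"name": "Order_Details", "refreshedTime": "2024-01-01T10:00:05"},
            ]},
        ]
    },
})

rows = partition_refresh_times(sample_tmsl)
for row in rows:
    print(row)
```

In a Fabric notebook you would pass the string returned by fabric.get_tmsl(dataset=dataset, workspace=workspace) instead of the sample.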

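refresh_dataset() also accepts several other options that mirror the Enhanced Refresh API, such as refresh_type, commit_mode, max_parallelism, retry_count and apply_refresh_policy, so you can control refreshes at a granular level. I highly recommend checking its documentation.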

Method 2: Using TMSL

Alternatively, you can use .execute_tmsl() to refresh specific tables and partitions. This is a far more flexible and powerful method because, besides refreshing tables, you can also alter table and partition properties.

#define TMSL
#In the below example, I am refreshing a table called Order_Details and 
#partition named Customers-ROW from Customers table
#database is your semantic model name

tmsl_script = {
  "refresh": {
    "type": "full",
    "objects": [
      {
        "database": "SL-Refresh",
        "table": "Order_Details"
      },
      {
        "database": "SL-Refresh",
        "table": "Customers",
        "partition": "Customers-ROW"
      }
    ]
  }
}

fabric.execute_tmsl(workspace="<workspace_name>", script=tmsl_script)
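Writing the TMSL by hand gets repetitive when the list of objects changes often. As a minimal sketch, the script can be generated from the same list-of-dictionaries format used in Method 1. build_refresh_tmsl is a hypothetical helper of my own, not part of Semantic Link, and it only covers the refresh command shown above.

```python
def build_refresh_tmsl(database, objects, refresh_type="full"):
    """Hypothetical helper: build a TMSL refresh command as a dictionary.

    database is the semantic model name; objects is a list of dictionaries
    with a "table" key and an optional "partition" key.
    """
    tmsl_objects = []
    for obj in objects:
        if "table" not in obj:
            raise ValueError(f"Missing 'table' key in {obj!r}")
        entry = {"database": database, "table": obj["table"]}
        if "partition" in obj:
            entry["partition"] = obj["partition"]
        tmsl_objects.append(entry)
    return {"refresh": {"type": refresh_type, "objects": tmsl_objects}}


generated_script = build_refresh_tmsl(
    database="SL-Refresh",
    objects=[
        {"table": "Order_Details"},
        {"table": "Customers", "partition": "Customers-ROW"},
    ],
)
print(generated_script)
```

The generated dictionary has the same shape as tmsl_script above, so it can be passed to fabric.execute_tmsl() in the same way.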

With Semantic Link, you can define your own custom refresh schedules, make refreshes conditional (e.g. based on data quality checks, arrival of data etc.) and much more.
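To make the conditional refresh idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: refresh_if_ready is a hypothetical helper, the row counts stand in for whatever data quality check you actually run, and the refresh step is a stub so the snippet runs outside Fabric.

```python
def refresh_if_ready(objects, is_ready, refresh):
    """Refresh only the objects whose check passes and return them."""
    ready = [obj for obj in objects if is_ready(obj)]
    if ready:
        refresh(ready)
    return ready


# Made-up row counts from an upstream check; 0 means no new data arrived
source_row_counts = {"Customers": 1200, "Order_Details": 0}


def has_new_rows(obj):
    return source_row_counts.get(obj["table"], 0) > 0


refresh_calls = []


def fake_refresh(objs):
    # Stub. In a Fabric notebook this could instead call:
    # fabric.refresh_dataset(dataset=dataset, workspace=workspace, objects=objs)
    refresh_calls.append(objs)


refreshed = refresh_if_ready(
    objects=[
        {"table": "Customers", "partition": "Customers-ROW"},
        {"table": "Order_Details"},
    ],
    is_ready=has_new_rows,
    refresh=fake_refresh,
)
print(refreshed)
```

Passing the check and the refresh step in as functions keeps the decision logic separate from the Semantic Link call, so either can be swapped out or tested on its own.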
