# Sync Multiple Fabric Workspaces With GitHub Using Semantic Link Labs

I've been living on the edge. I use my personal Fabric trial capacity *a lot* for testing and learning. As the number of days left in the trial dwindles to single digits, I start praying to the Fabric gods to renew the trial. :D So far they have been renewing my capacity, so thank you for that. I have 80+ workspaces and hundreds, if not thousands, of items, so I wanted to automatically create a private GitHub repo for each workspace, connect the workspace to the repo's main branch, and commit the items. I chose GitHub because that’s where I keep my personal projects. The process should be similar for Azure DevOps repos.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1739396747292/e32141f8-bac7-4a30-be58-e29220ccdc93.png align="center")

### Pre-requisites:

* ### GitHub Personal Access Token
    
    You will need a PAT to create a connection to the repos. I created a classic token so that I could use it for all repos. Choose whatever is best for your scenario and limit the scope as required.
    
    ![](https://cdn.hashnode.com/res/hashnode/image/upload/v1739393962705/abd78ae0-073d-4e83-a32d-cf4b9d5f8ce4.png align="center")
    
* ### Connection Id
    
    Once you have the PAT, create a cloud connection in Fabric to generate a connection Id. Choose GitHub - Source control as the connection type. This is under Settings > Manage Connections in Fabric.
    
    ![](https://cdn.hashnode.com/res/hashnode/image/upload/v1739394262370/a42b94a4-5f34-416c-9297-a7b95010b948.png align="center")

* You need to be an Admin of the workspaces you want to sync.
    
* Enable Git integration for GitHub and Azure DevOps (ADO) in the tenant settings.
    
* Install `PyGithub` and `Semantic Link Labs` in a Fabric Python notebook.
    

## Code:

Here is the logic; change it as needed:

* Get a list of Fabric workspaces
    
* Get a list of workspaces that are already git enabled
    
* Make a list of workspaces that need to be synced
    
* For each of the above workspaces:
    
    * create and initialize a repo in GitHub with a `README.md`. I name the repo `fabric_lower_case_workspace_name`
        
    * create a connection to the above repo in Fabric using Semantic Link Labs
        
    * wait 60s
        
    * get the latest commit hash
        
    * get the ids of the items you want to commit. In my case, I only want to sync notebooks. Change this to include other supported items.
        
    * commit the items if items exist using Semantic Link Labs
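
Workspace names can contain characters such as parentheses or slashes that may not survive as-is in a GitHub repo name. The sketch below is one possible way to build the `fabric_lower_case_workspace_name` style of name more defensively. The helper `to_repo_name` is hypothetical, and the set of allowed characters (ASCII letters, digits, `.`, `-`, `_`) is an assumption you should check against GitHub's current rules. Unlike a plain `replace(" ", "_")`, it collapses runs of unsupported characters into a single underscore.

```python
import re

def to_repo_name(workspace_name: str, prefix: str = "fabric_") -> str:
    """Build a repo name from a workspace name (hypothetical helper)."""
    # Assumption: only ASCII letters, digits, '.', '-' and '_' are kept as-is;
    # any run of other characters becomes a single underscore.
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", workspace_name.strip())
    # Drop underscores left over at either end, e.g. from a trailing ")"
    cleaned = cleaned.strip("_")
    return (prefix + cleaned).lower()

print(to_repo_name("Sales Analytics"))     # fabric_sales_analytics
print(to_repo_name("Finance (EU) / Dev"))  # fabric_finance_eu_dev
```

If you use something like this, pass the same cleaned name to both the repo creation step and the workspace connection step so the two stay in sync.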
        

Note that there can be a delay in syncing depending on the number of items, item types, etc., so adjust the `time.sleep(n)` wait period based on your scenario. You can customize this further and make the logic more granular, but for my case this works well. I just need it for backup more than anything else.
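
If tuning a fixed wait gets tedious, one alternative is to poll for a condition with a timeout. This is only a minimal sketch: `wait_until` is a hypothetical helper, and the stand-in `fake_check` takes the place of whatever call you would actually use to check readiness in Fabric.

```python
import time

def wait_until(check, timeout=120.0, interval=5.0):
    """Poll check() until it returns True; give up once `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Stand-in check that succeeds on the third call.
# A real check would query Fabric, e.g. for the workspace's Git connection.
attempts = {"count": 0}

def fake_check():
    attempts["count"] += 1
    return attempts["count"] >= 3

print(wait_until(fake_check, timeout=10, interval=0.01))  # True
```

The function returns `False` rather than raising on timeout, so the caller can decide whether to skip the workspace or stop the run.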

```python
# !pip install PyGithub semantic-link-labs --q

from github import Github, UnknownObjectException
import sempy.fabric as fabric
import time
import sempy_labs as labs

# GitHub PAT and Fabric connection Id (placeholders, replace with your own)
g = Github("github_pat_xxxxxx")
conn_id = "xxxxxxxxxxxxxx"

def create_and_initialize_github_repo(repo_name):
    """ Return the repo if it exists, otherwise create it as private with a README.md """
    user = g.get_user()
    try:
        repo = user.get_repo(repo_name)
        print(f"Repository '{repo_name}' already exists.")
    except UnknownObjectException:
        # GitHub returned 404, so the repo does not exist yet
        repo = user.create_repo(
            name=repo_name,
            description="Repository for workspace " + repo_name,
            private=True
        )
        repo.create_file("README.md", "Initial commit", "# " + repo_name)
        print(f"Repository '{repo_name}' created and initialized with README.md.")
    return repo

def get_latest_commit_hash(repo_name):
    """ Fetch the latest commit hash from GitHub """
    owner = g.get_user().login
    repo = g.get_repo(f"{owner}/{repo_name}")
    commits = repo.get_commits()
    return commits[0].sha if commits.totalCount > 0 else None

# only Fabric workspaces
workspaces_df = fabric.list_workspaces().query('Type!="AdminInsights" and `Is On Dedicated Capacity`==True')
git_connections_df = labs.admin.list_git_connections()
# workspaces without git enabled
workspaces_without_git = workspaces_df[~workspaces_df["Id"].isin(git_connections_df["Workspace Id"])]

for index, row in workspaces_without_git.iterrows():
    workspace_name = row["Name"]
    formatted_repo_name = "fabric_" + workspace_name.replace(" ", "_").lower()

    # Create (or reuse) GitHub repository
    repo = create_and_initialize_github_repo(formatted_repo_name)

    # Resolve workspace details
    resolved_workspace_name= fabric.resolve_workspace_name(workspace_name)
    workspace_id = fabric.resolve_workspace_id(workspace_name)

    # Connect the workspace to GitHub
    labs.connect_workspace_to_github(
        owner_name=g.get_user().login,
        repository_name=formatted_repo_name,
        branch_name="main",
        directory_name="/",
        connection_id=conn_id,
        workspace=workspace_id
    )
    print(f"🟢 The '{resolved_workspace_name}' workspace has been connected to the '{formatted_repo_name}' GitHub repository.")

    # Wait until the Git connection is detected
    max_retries = 10
    retry_interval = 5  # seconds
    connection_ready = False
    for attempt in range(max_retries):
        current_connections = labs.admin.list_git_connections()
        if workspace_id in current_connections["Workspace Id"].values:
            print(f"Git connection detected for workspace '{resolved_workspace_name}'.")
            connection_ready = True
            break
        else:
            print(f"Waiting for Git connection to initialize for '{resolved_workspace_name}' (Attempt {attempt+1}/{max_retries})...")
            time.sleep(retry_interval)
    if not connection_ready:
        print(f"Git connection not initialized for workspace '{resolved_workspace_name}'. Skipping sync.")
        continue

    time.sleep(60)
    remote_commit_hash = labs.initialize_git_connection(workspace=workspace_id)

    print(f"🟢 Git connection initialized. Remote commit hash: {remote_commit_hash}")

    time.sleep(60)  # Adjust timing as needed

    # Get item ids and commit only the Notebooks
    print("Fetching all item IDs for commit...")
    workspace_items = fabric.list_items(workspace=workspace_id).query('Type=="Notebook"')
    item_ids = list(workspace_items["Id"])

    if not item_ids:
        print(f"❌ No items found in workspace '{resolved_workspace_name}'. Skipping commit.")
        continue

    print(f"🟢 Found {len(item_ids)} items to commit.")

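    # Commit specified items to Git
    try:
        labs.commit_to_git(
            comment="Sync workspace with main branch",
            item_ids=item_ids,
            workspace=workspace_id
        )
    except Exception as e:
        # Report the failure and move on to the next workspace
        print(f"❌ Commit failed for workspace '{resolved_workspace_name}': {e}")
    else:
        # Only report success when the commit call did not raise
        print(f"✅ Workspace '{resolved_workspace_name}' has been successfully committed to Git.")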
```
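
The script above commits only notebooks. To include other item types, one option is to build the filter string for `.query(...)` from a list of types. This is a small sketch: `build_type_query` is a hypothetical helper, and the type names you pass in are assumptions that must match the values in the `Type` column returned by `fabric.list_items` and be supported by Fabric Git integration.

```python
def build_type_query(item_types):
    """Return a pandas query string that keeps rows whose Type is in item_types."""
    # De-duplicate and sort so the generated string is stable
    types = sorted(set(item_types))
    if not types:
        raise ValueError("Provide at least one item type")
    return f"Type in {types!r}"

# "Report" is only an example; check the Type values in your own workspace
print(build_type_query(["Notebook", "Report"]))  # Type in ['Notebook', 'Report']
```

In the loop, this would replace the hard-coded filter, for example `fabric.list_items(workspace=workspace_id).query(build_type_query(["Notebook", "Report"]))`.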

**Result:**

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1739395814932/c079fb7a-f904-42e1-96f6-6ff77c0d4d92.png align="center")

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1739395858632/2b47bac3-a456-49f4-938e-b3365943909e.png align="center")

Admittedly, getting this to work wasn’t very straightforward, so I won’t be surprised if you run into issues. If you do improve the code, I would love to know. Thanks!
