Getting A List of Folders and Delta Tables in the Fabric Lakehouse
A Python function to get a list of files and tables
When working with files and folders in Fabric, you will often need a list of the file folders and tables for parameterization, data validation, DataOps, pipeline orchestration, and so on. Below is a quick helper function I have been using to get a list of all the folders and tables as a pandas DataFrame, along with their abfss paths. It is very handy when working on large projects. You can extend it further if you need to reach the sub-folders in the Files section; a rough sketch of that follows the output below.
import pandas as pd
from notebookutils import mssparkutils

def get_file_table_list() -> pd.DataFrame:
    '''
    Get a list of the Folders and Tables in the mounted Lakehouse.

    Returns a pandas DataFrame containing the name and abfss path of each folder in the Files section
    and each table in the managed Tables section of the mounted or default Lakehouse the notebook is attached to.
    '''
    # Workspace and Lakehouse IDs come from the notebook's Spark session configuration
    LH_ID = spark.conf.get("trident.lakehouse.id")
    WS_ID = spark.conf.get("trident.workspace.id")
    base_path = f'abfss://{WS_ID}@onelake.dfs.fabric.microsoft.com/{LH_ID}'
    data_types = ['Files', 'Tables']

    # One DataFrame per section with name, type ('file' or 'table') and abfss path, stacked together
    df = pd.concat([
        pd.DataFrame({
            'name': [item.name for item in mssparkutils.fs.ls(f'{base_path}/{data_type}/')],
            'type': data_type[:-1].lower(),
            'path': [item.path for item in mssparkutils.fs.ls(f'{base_path}/{data_type}/')],
        }) for data_type in data_types], ignore_index=True)
    return df

get_file_table_list()
Here is the returned dataframe:
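If you also need the nested sub-folders under Files (as mentioned above), a minimal recursive sketch along the same lines could look like the following. list_files_recursive is a hypothetical helper name, and it assumes the isDir attribute that mssparkutils.fs.ls exposes on each listed item:

def list_files_recursive(path: str) -> list:
    '''Hypothetical helper: walk a Files path recursively and return one dict per item.'''
    rows = []
    for item in mssparkutils.fs.ls(path):
        rows.append({'name': item.name, 'is_dir': item.isDir, 'path': item.path})
        if item.isDir:
            # Descend into sub-folders; item.path is itself an abfss path
            rows.extend(list_files_recursive(item.path))
    return rows

# Example: everything under the Files section of the mounted Lakehouse as a DataFrame
LH_ID = spark.conf.get("trident.lakehouse.id")
WS_ID = spark.conf.get("trident.workspace.id")
files_df = pd.DataFrame(list_files_recursive(
    f'abfss://{WS_ID}@onelake.dfs.fabric.microsoft.com/{LH_ID}/Files/'))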
Please note that the Spark configuration keys use trident, which was Fabric's internal code name, rather than fabric. This might change in the future, so refer to the official documentation.
There is one more method, using Spark SQL's SHOW TABLES. It doesn't return the path, though.
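A minimal sketch, assuming the notebook is attached to the Lakehouse so its managed tables are in the default catalog:

spark.sql('SHOW TABLES').show()

# Or pull the result into pandas; it has columns such as namespace, tableName and isTemporary,
# but no abfss path
tables_df = spark.sql('SHOW TABLES').toPandas()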
This will return the files and tables in the mounted Lakehouse. You can identify the mounted Lakehouse by reading this blog. If the notebook is not attached to a Lakehouse, the function above will raise an error.
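If you prefer to fail with a clearer message, a minimal sketch is to check for the Lakehouse id before calling the function. This assumes spark.conf.get returns the supplied fallback when the key is missing; depending on the runtime the value may also come back empty, so the check covers both cases:

# Returns the fallback (None) instead of raising when no Lakehouse id is set
lh_id = spark.conf.get("trident.lakehouse.id", None)
if not lh_id:
    raise RuntimeError("No Lakehouse is attached to this notebook. Attach or mount one first.")

df = get_file_table_list()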