# Using Fabric OrgApps + Notebooks For Geospatial Data Exploration

[Simon Willison](https://simonwillison.net/) is one of my favorite bloggers. In fact, what I blog about, and how I blog and test, is inspired by him. A couple of weeks ago he wrote a post about the [Foursquare Places data](https://opensource.foursquare.com/os-places/) that has been open-sourced. I was exploring this dataset and ended up creating a few maps. I love [OrgApps](https://learn.microsoft.com/en-us/power-bi/consumer/org-app-items/org-app-items) in Fabric, and I truly believe that as it matures, it will be THE way for analysts & data scientists to provide rich insights + traditional reports to business users. Notebooks can augment Power BI reports to provide insights that are otherwise not possible. I have submitted a session on this topic to FabCon ’25, so let’s see. If it is selected, I hope to show how transformational it is and how businesses can use it.

I won’t go into much detail about the code below, but a few things to note:

* I used [daft](https://www.getdaft.io/) to scan 104M rows from an S3 bucket in a Fabric Python notebook without downloading the entire dataset. Why daft? Because it’s [optimized for reading S3 data](https://blog.getdaft.io/p/announcing-daft-02-10x-faster-io). If you run the notebook below, you will see that memory & CPU consumption is minimal. In Simon’s post above, he used DuckDB. I cleaned and transformed the data lazily using daft.
    
* I also used Polars because it has a nice Altair integration.
    
* I used Folium for the interactive maps and Plotly for the time series.
    
* The notebook is embedded in OrgApps so users can explore the data. You can also embed a Power BI report using `QuickVisualize` for the same purpose (as long as it is a small dataset).
    

%[https://www.youtube.com/watch?v=env-6iQFbGE] 

#### Steps:

Just [download this notebook](https://github.com/pawarbi/snippets/blob/main/foursquare_coffee_daft_sandeeppawar.ipynb), import it into your Fabric workspace, and execute it.

To get a list of files at this S3 location:

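```python
## list of files
from pyarrow import fs

# Region of the Foursquare bucket
s3 = fs.S3FileSystem(region='us-east-1')
# Glob for the places Parquet files (not needed for the listing itself)
path = "s3://fsq-os-places-us-east-1/release/dt=2024-11-19/places/*.parquet"

# Recursively list everything under the release prefix
file_info = s3.get_file_info(fs.FileSelector(
    "fsq-os-places-us-east-1/release/dt=2024-11-19/",
    recursive=True
))
for info in file_info:
    print(info.path)
```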

#### References

* [Foursquare Open Source Places: A new foundational dataset for the geospatial community | Foursquare](https://location.foursquare.com/resources/blog/products/foursquare-open-source-places-a-new-foundational-dataset-for-the-geospatial-community/)
    
* [“foursquare” in items](https://simonwillison.net/search/?q=foursquare)
