RAG in Fabric Notebook Using Microsoft Harrier Multilingual Text Embedding Model

Last week Microsoft released an open-source text embedding model called Harrier in three sizes: 270M, 0.6B, and 27B. I have been testing it in my RAG pipeline and so far it has crushed all my metrics. It's currently the number 1 model on the MTEBv2 leaderboard.

The 27B model is obviously too big for a Fabric notebook, but the 270M and 0.6B variants, despite being small, are extremely capable and fast even with CPU inferencing in a Fabric Python notebook. For comparison, I have been using the text-embedding-ada-002 model in my pipeline with an R1 score of 0.68; the 270M model topped that easily with an R1 of 0.76 and lower latency. Bonus: no Fabric CU consumption (other than Python execution time). In a Fabric notebook, you can save the model in a lakehouse and load it for inferencing. The initial load, especially for the 0.6B model, may take a while, but retrieval is fast.
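For readers unfamiliar with the R1 metric: it is recall@1, the fraction of queries whose gold passage is retrieved at rank 1. A minimal sketch of computing it, with made-up passage IDs (not from the notebook):

```python
def recall_at_k(ranked_ids, gold_id, k=1):
    """1 if the gold passage appears in the top-k retrieved results, else 0."""
    return int(gold_id in ranked_ids[:k])

# Toy evaluation: (ranked retrieval results, gold passage id) per query.
results = [
    (["p3", "p1", "p7"], "p3"),  # hit at rank 1
    (["p2", "p5", "p9"], "p5"),  # gold is at rank 2 -> miss for R1
    (["p4", "p8", "p6"], "p4"),  # hit at rank 1
]
r1 = sum(recall_at_k(ranked, gold, k=1) for ranked, gold in results) / len(results)
print(round(r1, 2))  # → 0.67 (2 of 3 queries hit at rank 1)
```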

Take a look at this sample notebook, which shows how to operationalize it in Fabric.

snippets/harrier-ragbench-techqa-fabricguru.ipynb at main · pawarbi/snippets

  • Use %%configure to upgrade the compute and attach a lakehouse

  • Download the model

  • Instantiate it

  • Load the data

  • Embed and retrieve

  • Just for the sake of testing, I also added a Harrier + BM25 pipeline to improve retrieval; the R3 score improved to 99%
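For the first bullet, a %%configure cell at the top of the notebook can bump the compute and attach a lakehouse. The keys below are an assumption (the lakehouse name is a placeholder); verify the exact schema against the Fabric notebook documentation for your workspace:

```
%%configure
{
    "vCores": 8,
    "defaultLakehouse": {
        "name": "MyLakehouse"
    }
}
```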
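The "embed and retrieve" step boils down to cosine similarity over embedding vectors. A self-contained sketch of that loop is below; the embed function here is a deterministic bag-of-words stand-in, not Harrier — in the notebook you would replace it with the loaded model's encode call:

```python
import math

DIM = 64

def _bucket(token):
    # Deterministic toy hash so results are reproducible across runs.
    return sum(ord(c) for c in token) % DIM

def embed(text):
    """Stand-in embedder: hashed bag-of-words vector.
    Swap this for the real embedding model's encode() in practice."""
    vec = [0.0] * DIM
    for tok in text.lower().split():
        vec[_bucket(tok)] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, top_k=2):
    """Rank corpus documents by cosine similarity to the query embedding."""
    qv = embed(query)
    scored = sorted(((cosine(qv, embed(d)), d) for d in corpus), reverse=True)
    return [d for _, d in scored[:top_k]]

corpus = [
    "configure lakehouse capacity in fabric",
    "embed documents with a text embedding model",
    "bm25 keyword scoring for search",
]
print(retrieve("embedding model for documents", corpus, top_k=1))
# → ['embed documents with a text embedding model']
```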
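The hybrid in the last bullet pairs dense embeddings with keyword scoring. A tiny BM25 implementation is sketched below (corpus and parameters are illustrative, not the notebook's actual data); in a hybrid pipeline you would normalize these scores and blend them with the embedding cosine scores, e.g. a weighted sum:

```python
import math
from collections import Counter

def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score every document in corpus against the query with classic Okapi BM25."""
    docs = [doc.lower().split() for doc in corpus]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for dd in docs if term in dd)  # document frequency
            if df == 0:
                continue
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
            freq = tf[term]
            # Term-frequency saturation (k1) and length normalization (b).
            score += idf * freq * (k1 + 1) / (freq + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

corpus = [
    "fabric lakehouse stores delta tables",
    "bm25 ranks documents by keyword overlap",
    "embeddings capture semantic similarity",
]
scores = bm25_scores("bm25 keyword ranking", corpus)
best = corpus[max(range(len(corpus)), key=scores.__getitem__)]
print(best)  # → bm25 ranks documents by keyword overlap
```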