Search Engine Workshop

About

Hands-on workshop for building a semantic search engine.

Setup

If you came to this repo during a workshop, visit this custom JupyterHub, which has all the dependencies already set up.

The repo is located at npatta01/search-engine-workshop

To use this repo outside a workshop, please use Binder.

Content (Notebooks)

Data Fetching

setup notebook
stats notebook
sample image notebook

These notebooks download the Unsplash dataset and save it in the Hugging Face dataset format.

Non Deep Learning Retrieval

BM25 retrieval with Elasticsearch: notebook
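
For readers who want a feel for what BM25 scoring does before opening the notebook, here is a minimal pure-Python sketch. It is not the notebook's code: Elasticsearch computes these scores internally, and the helper names `tokenize` and `bm25_scores` and the three sample captions are assumptions made up for illustration. The `k1=1.2` and `b=0.75` values mirror the usual Elasticsearch defaults.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1 and is length-normalized via b.
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


# Made-up captions standing in for image descriptions.
docs = [
    "a dog running on the beach",
    "city skyline at night",
    "a cat sleeping on a sofa",
]
scores = bm25_scores("dog beach", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Documents that share no terms with the query score zero, which is the main limitation that the deep learning notebooks address.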

Deep Learning Retrieval (text)

Text Deep Learning retrieval: Link
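
As a rough sketch of the idea behind embedding-based retrieval: documents and the query are mapped to vectors, and results are ranked by cosine similarity. The three-dimensional vectors below are made up by hand purely for illustration; in practice an embedding model would produce them, and this is not the notebook's code.

```python
import math


def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings; a real model would output hundreds of dimensions.
doc_vectors = {
    "dog on a beach": [0.9, 0.1, 0.0],
    "city skyline at night": [0.0, 0.2, 0.9],
    "puppy playing in sand": [0.8, 0.3, 0.1],
}
# Pretend embedding for a query such as "dog at the seaside".
query_vector = [1.0, 0.2, 0.0]

ranked = sorted(
    doc_vectors,
    key=lambda doc: cosine(query_vector, doc_vectors[doc]),
    reverse=True,
)
print(ranked)
```

Unlike keyword matching, the "puppy playing in sand" caption can rank highly even though it shares no words with the query, because its vector points in a similar direction.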

Deep Learning Retrieval (image)

CLIP retrieval: Link

ANN

Shows how to speed up deep learning retrieval by exploring different approximate nearest neighbor (ANN) indexes: Link
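
To illustrate why an ANN index is faster than comparing the query against every vector, here is a small pure-Python sketch of one ANN family, random-hyperplane locality-sensitive hashing. It is an assumption chosen for illustration, not necessarily one of the indexes the notebook explores, and the `HyperplaneLSH` class and the random data are hypothetical.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


class HyperplaneLSH:
    """Toy ANN index: vectors on the same side of every random plane share a bucket."""

    def __init__(self, dim, n_planes=8, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
        self.buckets = defaultdict(list)
        self.vectors = []

    def _key(self, vec):
        # One sign bit per hyperplane forms the bucket key.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in self.planes
        )

    def add(self, vec):
        self.buckets[self._key(vec)].append(len(self.vectors))
        self.vectors.append(vec)

    def search(self, query, k=3):
        # Only the query's bucket is scanned, so some true neighbors may be missed.
        candidates = self.buckets.get(self._key(query), [])
        ranked = sorted(
            candidates, key=lambda i: cosine(query, self.vectors[i]), reverse=True
        )
        return ranked[:k]


# Random stand-in vectors; real embeddings would come from a model.
rng = random.Random(42)
data = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(1000)]

index = HyperplaneLSH(dim=16)
for vec in data:
    index.add(vec)

hits = index.search(data[7], k=3)
n_scanned = len(index.buckets[index._key(data[7])])
print(hits, "scanned", n_scanned, "of", len(data))
```

The trade-off is the one the ANN notebook is about: scanning a single bucket is much cheaper than scanning all vectors, at the cost of occasionally missing a true nearest neighbor that landed in a different bucket.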

Slides

PyData Seattle 2022

PyData NYC 2022

ODSC 2022

Contact

For help or feedback, please reach out to:

Acknowledgments

This workshop uses the Unsplash Lite Dataset 1.2.0 (link).

The hands-on portion of the workshop was made possible by the JupyterHub Helm chart.

Changelog

v1.1

  • setup for PyData NYC
  • replaced Stack Overflow data with Unsplash data

v1.0

  • setup for ODSC
  • used Stack Overflow data