Semantic Search Swiftly: Thirty-Six minutes to Transform Sierra Data for Enhanced Data Discovery

derekbrownrhpl · April 3, 2024, 3:57pm

Program Title: Semantic Search Swiftly: Thirty-Six minutes to Transform Sierra Data for Enhanced Data Discovery
ILS: General
Program Description: There’s no avoiding AI, so you may as well understand it a bit better! This presentation will cover basic concepts of Natural Language Processing (NLP), a subset of artificial intelligence and machine learning, and will briefly explore how semantic search differs from traditional keyword searching. The presentation will also cover a rapid process that extracts, transforms, and loads (ETL) data for the purpose of creating a semantic search platform.
In this session, we’ll use the ‘sierra-ils-utils’ Python library to extract ILS data, the Google Colab platform and a transformer model from the Hugging Face platform to transform the data, and Qdrant (an easy-to-use, free, and open source vector database) to load the resulting vector embeddings. This combination of a vector database and data modeling enables a powerful semantic search capability. All this within just 36 minutes (hopefully)!
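
To make the keyword-versus-semantic distinction in the description concrete, here is a minimal toy sketch using only the Python standard library. It is not the session's actual pipeline. The titles, the three-dimensional vectors, and the function names `keyword_search` and `semantic_search` are all invented for illustration. In a real workflow the vectors would come from a transformer model, and a vector database such as Qdrant would handle the similarity ranking.

```python
import math

# Hypothetical catalog titles, standing in for records extracted from an ILS.
TITLES = [
    "Caring for your new puppy",
    "A history of the automobile",
    "Feline behavior explained",
]

# Hand-made 3-dimensional "embeddings" (rough axes: pets, vehicles, history).
# These numbers are invented; a transformer model would produce real ones.
VECTORS = {
    "Caring for your new puppy": [0.9, 0.0, 0.1],
    "A history of the automobile": [0.0, 0.8, 0.6],
    "Feline behavior explained": [0.7, 0.2, 0.0],
}


def keyword_search(query, titles):
    """Return titles that share at least one exact word with the query."""
    words = set(query.lower().split())
    return [t for t in titles if words & set(t.lower().split())]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_search(query_vector, vectors, top_k=2):
    """Rank titles by cosine similarity to the query vector."""
    ranked = sorted(
        vectors,
        key=lambda title: cosine(query_vector, vectors[title]),
        reverse=True,
    )
    return ranked[:top_k]


# The query "dog training" shares no exact word with any title.
keyword_hits = keyword_search("dog training", TITLES)

# Invented query vector that points along the "pets" axis.
semantic_hits = semantic_search([1.0, 0.0, 0.0], VECTORS)

print(keyword_hits)
print(semantic_hits)
```

The keyword search finds nothing because no title contains "dog" or "training", while the vector ranking surfaces the two pet-related titles because their vectors point in a similar direction to the query's.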

Speaker Information: Ray Voelker (ray.voelker@cincinnatilibrary.org) - ILS Admin, Cincinnati & Hamilton County Public Library (CHPL)