Semantic Movie Search with Azure OpenAI & Elasticsearch
A full-stack application that enables semantic search over movies using Azure OpenAI embeddings and Elasticsearch vector search.
The project lets users run natural language queries such as:
- "Movies about space exploration"
- "Thrillers similar to Inception"
- "Feel-good romantic comedies"
The system converts the query into a vector embedding, then performs a kNN search in Elasticsearch (Elastic Cloud) to return semantically relevant results.
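The two steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `embedQuery` and `buildKnnSearchBody`, the default `k`, and the `num_candidates` heuristic are assumptions. The index field name `embedding` and the environment variable names come from the setup section of this README.

```javascript
// Illustrative sketch; embedQuery and buildKnnSearchBody are hypothetical names.

// Step 1: ask Azure OpenAI for an embedding of the query text.
// Assumes Node 18+ (global fetch) and that AZURE_OPENAI_ENDPOINT holds the
// full embeddings URL, including the deployment name and api-version.
async function embedQuery(text, env = process.env) {
  const res = await fetch(env.AZURE_OPENAI_ENDPOINT, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "api-key": env.AZURE_OPENAI_API_KEY,
    },
    body: JSON.stringify({ input: text }),
  });
  if (!res.ok) {
    throw new Error(`Embedding request failed with status ${res.status}`);
  }
  const json = await res.json();
  return json.data[0].embedding; // array of floats
}

// Step 2: build the body of an Elasticsearch kNN search request.
// k and num_candidates are assumed values; tune them for your data.
function buildKnnSearchBody(queryVector, k = 5) {
  if (!Array.isArray(queryVector) || queryVector.length === 0) {
    throw new TypeError("queryVector must be a non-empty array of numbers");
  }
  return {
    knn: {
      field: "embedding",
      query_vector: queryVector,
      k,
      num_candidates: Math.max(k * 10, 50),
    },
    _source: ["title", "overview", "vote_average"],
  };
}
```

The resulting object can be sent as the body of `POST /movies_with_embeddings/_search`. Larger `num_candidates` values generally trade speed for recall.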
Features
- Semantic Search: Powered by Azure OpenAI embeddings
- Vector Indexing: Using Elasticsearch's dense_vector field
- Fast kNN Search: Based on AI-generated embeddings
- Movie Dataset: Easily extendable to products, articles, or any text-based data
- REST API: For indexing and searching
- React Frontend: (Optional) For demo UI
Folder Structure
/copliot-search
│
├── .env                    # Environment variables
├── .gitignore              # Git ignore rules
├── README.md               # Project documentation
├── package.json            # Project metadata and scripts
├── pnpm-lock.yaml          # pnpm lockfile
│
├── data/                   # Data files (e.g., movies.json)
│
├── node_modules/           # Installed dependencies
│
├── public/                 # Static assets (for frontend)
│
├── src/                    # Source code
│   ├── config/             # Configuration folder
│   │   ├── elastic.js      # Elasticsearch client configuration
│   ├── seed.js             # Script to seed movie data
│   ├── server.js           # Express server entry point
│
Layer | Technology |
---|---|
Backend | Node.js + Express |
Embeddings | Azure OpenAI (text-embedding-ada-002) |
Vector DB | Elasticsearch (Elastic Cloud) |
Frontend | React (optional) |
Deployment | Docker-ready |
Requirements
Before running the project, make sure you have:
1. Azure OpenAI
- Access to Azure OpenAI Studio
- Deployed model: text-embedding-ada-002
- Save: AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT
2. Elastic Cloud
- Cluster created at https://cloud.elastic.co
- Save: ELASTIC_CLOUD_ID, ELASTIC_USERNAME, ELASTIC_PASSWORD
Setup Instructions
1. Clone the repo
git clone https://github.com/yourusername/semantic-movie-search.git
cd semantic-movie-search
2. Install dependencies
npm install
3. Set up environment variables
Create a .env file in the project root:
# Azure OpenAI
AZURE_OPENAI_API_KEY=your_azure_openai_api_key
AZURE_OPENAI_ENDPOINT=https://<your-resource>.openai.azure.com/openai/deployments/embedding-model/embeddings?api-version=2023-05-15
# Elastic Cloud
ELASTIC_CLOUD_ID=your-cloud-id-from-elastic-cloud
ELASTIC_USERNAME=elastic
ELASTIC_PASSWORD=your-elastic-password
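A small startup check can catch a missing variable before the first request fails. The sketch below is an assumption about how such a check might look, not code from this repository; `loadConfig` and `REQUIRED_VARS` are hypothetical names. The `cloud` and `auth` fields follow the option names accepted by the official `@elastic/elasticsearch` client.

```javascript
// Hypothetical helper: validates the variables listed in .env and groups them.
const REQUIRED_VARS = [
  "AZURE_OPENAI_API_KEY",
  "AZURE_OPENAI_ENDPOINT",
  "ELASTIC_CLOUD_ID",
  "ELASTIC_USERNAME",
  "ELASTIC_PASSWORD",
];

function loadConfig(env = process.env) {
  // Collect every missing name so the error reports all of them at once.
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    azure: {
      apiKey: env.AZURE_OPENAI_API_KEY,
      endpoint: env.AZURE_OPENAI_ENDPOINT,
    },
    // Shaped so it could be passed to `new Client(config.elastic)`.
    elastic: {
      cloud: { id: env.ELASTIC_CLOUD_ID },
      auth: {
        username: env.ELASTIC_USERNAME,
        password: env.ELASTIC_PASSWORD,
      },
    },
  };
}
```

Calling `loadConfig()` once at the top of the server and seed scripts keeps the failure close to its cause.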
4. Create Elasticsearch index
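Use Kibana Dev Tools. The embedding field sets index and similarity explicitly, which Elasticsearch 8.x versions before 8.11 require for kNN search (newer versions apply these as defaults):
PUT /movies_with_embeddings
{
  "mappings": {
    "properties": {
      "id": { "type": "keyword" },
      "title": { "type": "text" },
      "overview": { "type": "text" },
      "release_date": { "type": "date" },
      "vote_average": { "type": "float" },
      "vote_count": { "type": "integer" },
      "embedding": {
        "type": "dense_vector",
        "dims": 1536,
        "index": true,
        "similarity": "cosine"
      }
    }
  }
}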
5. Seed movie data
Make sure data/movies.json exists, then run the seed script from the project root (it lives under src/ in the folder structure above):
node src/seed.js
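For orientation, the core of a seed script usually pairs each movie with its embedding and turns the result into bulk index operations. The sketch below is an assumption about that shape, not the contents of the actual seed script; `movieToText` and `toBulkOperations` are hypothetical names, and embedding the title together with the overview is one possible choice among several.

```javascript
// Hypothetical helpers illustrating the seeding step.

// Text sent to the embedding model for one movie (an assumed choice).
function movieToText(movie) {
  return `${movie.title}. ${movie.overview}`;
}

// Build the alternating action/document array expected by the bulk API.
// embeddings[i] must be the vector computed for movies[i].
function toBulkOperations(movies, embeddings, index = "movies_with_embeddings") {
  if (movies.length !== embeddings.length) {
    throw new Error("movies and embeddings must have the same length");
  }
  return movies.flatMap((movie, i) => [
    { index: { _index: index, _id: String(movie.id) } },
    { ...movie, embedding: embeddings[i] },
  ]);
}
```

With version 8 of the official JavaScript client, the returned array could be passed as `client.bulk({ operations })`. Sending movies in batches keeps each request small.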
6. Start the server
node src/server.js
Available Endpoints
Method | Endpoint | Description |
---|---|---|
POST | /index-movie | Index one movie with embedding |
POST | /search | Perform semantic search |
GET | /health | Check if service is running |
Example Request Body for /search
{
  "query": "space exploration"
}
Returns:
[
  {
    "title": "Interstellar",
    "overview": "A team of explorers travel through a wormhole...",
    "vote_average": 8.6
  },
  ...
]
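To show how the response above could be derived from an Elasticsearch result, here is a minimal sketch. `formatHits` is a hypothetical name, and the code assumes the response shape returned by version 8 of the official JavaScript client, where hits are available at `response.hits.hits`.

```javascript
// Hypothetical helper: reduce an Elasticsearch search response to the
// fields shown in the example response for /search.
function formatHits(response) {
  // Fall back to an empty list if the response has no hits section.
  const hits = response?.hits?.hits ?? [];
  return hits.map(({ _source }) => ({
    title: _source.title,
    overview: _source.overview,
    vote_average: _source.vote_average,
  }));
}
```

Dropping the stored embedding from the output keeps responses small, since each vector has 1536 values.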
Suggested Use Cases
- Movie recommendation engine
- E-commerce product search
- Article/document similarity search
- Chatbot-integrated search UI
Prompt
Understand natural language movie queries and return a meaningful explanation of the movie that aligns with the user's request.
Use the provided query to infer the user's intent, preferences, or desired movie characteristics. Match their description to a movie and provide a concise explanation that includes relevant information such as the plot, genre, theme, notable cast, or why it matches the query.
# Steps
1. Parse the query and determine the key themes, preferences, or traits being described (e.g., genre, story elements, mood, actors, release era).
2. Identify a movie that best matches the parsed details.
3. Provide a concise explanation of the selected movie, including:
- Title
- A brief description of the plot or theme.
- Noteworthy features that align with the request (e.g., genre, actors, mood, historical significance).
- A clear connection to the user's query.
# Output Format
The output should be a **concise paragraph** structured as follows:
1. The movie title in **bold**.
2. A short description of the plot or core theme.
3. Reference the key elements that match the query, explaining *why* it suits the request.
# Examples
### Input 1:
*"A feel-good movie about a father overcoming hardship"*
### Output 1:
**The Pursuit of Happyness**
This inspiring movie follows the life of Chris Gardner, a struggling salesman who faces numerous adversities while raising his young son. With its focus on resilience and the triumph of the human spirit, it is a perfect feel-good movie that showcases determination and hope.
---
### Input 2:
*"A sci-fi movie about space exploration and the meaning of life."*
### Output 2:
**Interstellar**
This thought-provoking sci-fi epic explores humanity's survival as a team of astronauts travels beyond our galaxy to find a habitable planet. Directed by Christopher Nolan, the movie delves deeply into themes of space exploration, love, and sacrifice, making it an ideal choice for fans of existential sci-fi rooted in space discovery.
# Notes
- If multiple movies could be good matches, select the **most iconic or suitable choice**.
- Avoid spoilers in explanations unless they're critical to understanding why the movie matches the query.
- Remain concise but informative, sticking to describing just one movie per query.
Demo Screenshot (Optional)
License
MIT License: see LICENSE
Contributing
Contributions are welcome! Please open an issue or submit a PR if you'd like to improve this project.