AI & Machine Learning
Completed

AI Book Discovery Platform

Local RAG-based book recommendation system with semantic search.

Oct – Nov 2025 · Team: Personal Project · Role: AI & Software Developer

About this Project

The AI Book Discovery Platform is an intelligent recommendation system that moves beyond simple keyword matching to understand the semantic context of user queries. Powered by Ollama, it leverages the "nomic-embed-text" model for vector-based search and "llama3.2:1b" for expert-level book analysis. The entire system runs 100% locally, ensuring complete user privacy while providing deep insights into search intent and reader profiling.

Tech Stack

Python
Streamlit
Ollama
LangChain
Nomic Embed
Llama 3.2
Pandas
Scikit-learn

Tools Used

VS Code
Ollama Runtime
Streamlit Cloud
Jupyter Notebook

Key Features

Search Architecture

  • Semantic Discovery: Natural language understanding using vector embeddings to find books by theme and intent.
  • Vector Search: Implementation of Cosine Similarity algorithms to rank matches by semantic relevance.
  • Contextual Logic: Ability to find relevant titles even without matching exact keywords in descriptions.
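The ranking step described above can be sketched as a plain cosine-similarity search over precomputed embeddings. This is a minimal illustration, not the repository's code; function names are invented for clarity, and in practice the vectors would come from `nomic-embed-text` via Ollama.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def rank_books(query_vec, book_vecs, top_n=3):
    """Return (index, score) pairs for the top_n most similar books."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(book_vecs)]
    scored.sort(key=lambda t: t[1], reverse=True)
    return scored[:top_n]
```

Because matches are ranked by vector angle rather than term overlap, a query like "lonely astronaut survival story" can surface a book whose description never uses any of those words.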

Local AI Intelligence

  • LLM Analysis: Leveraging Llama 3.2 to generate expert book reviews, summaries, and audience insights.
  • Query Decomposition: Intelligent breakdown of user search descriptions to identify hidden reading preferences.
  • Reader Profiling: AI-generated reports on the user's reading style based on their search history.
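The profiling feature boils down to prompt assembly before the LLM call. The sketch below shows one plausible shape for that prompt builder; the function name and wording are assumptions, not the repository's actual API, and the returned string would be sent to `llama3.2:1b` through Ollama.

```python
def build_profile_prompt(search_history):
    """Assemble a prompt asking the LLM to infer reading preferences
    from past searches (illustrative, not the repo's exact prompt)."""
    history = "\n".join(f"- {q}" for q in search_history)
    return (
        "You are a literary analyst. Based on the searches below, "
        "describe this reader's style, recurring themes, and likely "
        "preferences in two short paragraphs.\n\n"
        f"Searches:\n{history}"
    )
```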

Data Management

  • Curated Datasets: Pre-loaded collection of diverse literature for immediate discovery.
  • Custom Ingestion: Support for user-uploaded CSV datasets to enable search across personal libraries.
  • Dynamic Tabulation: Real-time processing and ranking of large book catalogs using optimized Pandas pipelines.
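Custom CSV ingestion typically needs one preprocessing pass before embedding. A minimal Pandas sketch, assuming `title` and `description` columns (the repository's actual schema may differ):

```python
import pandas as pd

def prepare_catalog(df):
    """Drop rows without a title and merge title + description into a
    single text field suitable for embedding. Column names are assumed."""
    df = df.dropna(subset=["title"]).copy()
    df["doc"] = df["title"].str.cat(df["description"].fillna(""), sep=". ")
    return df
```

Each resulting `doc` string is what gets embedded, so a user-uploaded personal library becomes searchable with the same pipeline as the pre-loaded dataset.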

Privacy & Performance

  • Edge Computing: 100% local inference via Ollama, ensuring no data ever leaves the user's machine.
  • Inference Optimization: Fine-tuned model parameters (top_k, temperature) to balance generation speed and analytical depth.
  • Responsive UI: Instant Streamlit feedback on embedding progress and LLM generation status.
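The tuning mentioned above maps onto Ollama's per-request `options` object. The values below are illustrative, not the project's exact settings:

```python
# Illustrative generation options for a small local model: lower temperature
# keeps reviews grounded, a modest top_k trims the sampling pool, and
# num_predict caps response length to limit latency. Values are assumptions.
GEN_OPTIONS = {
    "temperature": 0.3,   # near-deterministic, metadata-faithful output
    "top_k": 40,          # sample only from the 40 most likely tokens
    "num_predict": 256,   # hard cap on generated tokens
}
# Passed per request, e.g.:
# ollama.generate(model="llama3.2:1b", prompt=prompt, options=GEN_OPTIONS)
```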

Highlights

Semantic Search Engine
100% Local LLM Inference
Privacy-First Architecture

Installation

Ollama Setup

ollama pull nomic-embed-text
ollama pull llama3.2:1b
# Ensure the Ollama server is running (ollama serve)

Environment Launch

git clone https://github.com/Arfazrll/OllamaLLM-RecomendationSystem.git
cd OllamaLLM-RecomendationSystem
pip install -r requirements.txt
streamlit run OllamaLLM.py

Challenges & Solutions

Challenge

Vector Storage Overhead

Solution

Developed an on-demand embedding generation and caching strategy for local datasets to eliminate the need for an external vector database.
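One way to realize such a cache is to key each embedding by a hash of its text and persist it to disk, so a vector is computed at most once. A sketch under assumed names (`CACHE_DIR`, `embed_fn` wrapping the `nomic-embed-text` call), not the repository's implementation:

```python
import hashlib
import json
import os

CACHE_DIR = ".emb_cache"  # illustrative local cache directory

def embed_with_cache(text, embed_fn):
    """Compute an embedding once and reuse it from disk afterwards,
    avoiding an external vector database. embed_fn is any callable
    returning a list of floats (e.g. an Ollama embedding wrapper)."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    path = os.path.join(CACHE_DIR, key + ".json")
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    vec = embed_fn(text)
    with open(path, "w") as f:
        json.dump(vec, f)
    return vec
```

The second search over the same catalog then skips embedding entirely and pays only the cosine-ranking cost.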

Challenge

LLM Hallucination Control

Solution

Implemented strict system prompts and few-shot examples within the RAG pipeline to ensure generated reviews stay grounded in the provided book metadata.
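Grounding of this kind usually lives in the prompt template: the system message forbids invention, and the user message contains only the retrieved metadata. A hedged sketch with invented field names, not the repository's actual prompts:

```python
def grounded_review_prompt(book):
    """Build a (system, user) prompt pair that pins the model to the
    supplied book metadata. Field names are illustrative."""
    system = (
        "You are a book critic. Use ONLY the metadata provided. "
        "If a detail is not in the metadata, say it is unknown. "
        "Do not invent plot points, awards, or publication facts."
    )
    user = (
        f"Title: {book['title']}\n"
        f"Author: {book.get('author', 'unknown')}\n"
        f"Description: {book.get('description', '')}\n\n"
        "Write a three-sentence review grounded strictly in the above."
    )
    return system, user
```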

Challenge

System Resource Management

Solution

Optimized the transition between the embedding model and generative model to prevent VRAM spikes on machines with limited hardware.
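Ollama's REST API exposes a `keep_alive` parameter that controls how long a model stays resident after a request; setting it to 0 asks the server to evict the model immediately, so the embedder and the generator never occupy VRAM at the same time. The sketch below shows that pattern against the local API; it assumes a running server and is not the repository's code:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def _post(endpoint, payload):
    """POST a JSON payload to the local Ollama API and decode the reply."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def embed_then_generate(text, prompt):
    """Run the embedding model, let Ollama evict it (keep_alive=0),
    then run the 1B generator, so the two never share VRAM."""
    emb = _post("/api/embeddings",
                {"model": "nomic-embed-text", "prompt": text, "keep_alive": 0})
    gen = _post("/api/generate",
                {"model": "llama3.2:1b", "prompt": prompt,
                 "stream": False, "keep_alive": 0})
    return emb["embedding"], gen["response"]
```

The trade-off is a reload cost on every call; a longer `keep_alive` would be preferable on machines with headroom.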
