Transforming YouTube Learning with YTBuddy Extension
Industry
EdTech / AI Tools
Function
AI R&D & User Experience
Technology
Chrome Extension (Manifest V3), FastAPI, LangChain, sentence-transformers, FAISS, Ollama (Phi-3)
Location
Global
Client Overview
YTBuddy is a privacy-first AI browser extension that changes how users interact with YouTube content. Designed for students, researchers, and professionals, it lets users ask natural language questions about any YouTube video and get precise answers with timestamped references.
The tool uses Retrieval-Augmented Generation (RAG) with the Phi-3 model, run through Ollama, and operates entirely locally, so transcripts and questions stay on the user's machine. Users can semantically search within videos and retrieve contextual answers instantly, turning passive video consumption into an active learning experience.
The Challenges
- Information Retrieval Inefficiency: Users wasted hours skimming or rewatching videos to locate specific concepts or statements.
- Limited Searchability: YouTube’s native search couldn’t index spoken words, making in-video concept retrieval impossible.
- Loss of Context: Viewers couldn’t cite or revisit exact moments, weakening research and knowledge-sharing workflows.
- Privacy Concerns: Existing AI tools required transcript or user data to be uploaded to external servers, posing privacy risks.
Solutions

AI-Powered Browser Extension
- Developed a Chrome extension using Manifest V3 with a draggable overlay chatbot.
- Integrated content scripts to seamlessly interact with YouTube videos.
- Enabled contextual Q&A with timestamped citations directly in-video.
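
As an illustration of the timestamped citations described above, the following is a minimal Python sketch and not YTBuddy's actual code. The function names `format_timestamp` and `citation_link` are hypothetical. The sketch relies on the standard YouTube watch URL, where the `t` query parameter sets the playback start offset.

```python
from urllib.parse import urlencode


def format_timestamp(seconds: float) -> str:
    """Render an offset in seconds as M:SS, or H:MM:SS past one hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def citation_link(video_id: str, seconds: float) -> str:
    """Build a watch URL that opens the video at the cited moment."""
    query = urlencode({"v": video_id, "t": f"{int(seconds)}s"})
    return f"https://www.youtube.com/watch?{query}"


if __name__ == "__main__":
    # "abc123" is a placeholder video id used only for illustration.
    print(format_timestamp(95), citation_link("abc123", 95))
```

An overlay could render the label as clickable text and either open the link or seek the player that is already on the page.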
Local Semantic Search & Caching
- Used LangChain with sentence-transformers to embed transcript chunks and stored the resulting vectors in a local FAISS index.
- Employed a FastAPI backend to manage transcript parsing and chat processing.
- Implemented local vector storage and intelligent caching for sub-second response time.
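
The retrieval flow above can be sketched in a few lines of standard-library Python. This is a simplified stand-in under stated assumptions, not the production pipeline: a word-count vector replaces the sentence-transformers embedding, a plain list replaces the FAISS index, and `functools.lru_cache` stands in for the caching layer. All names (`Chunk`, `chunk_transcript`, `TranscriptIndex`) are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Chunk:
    start: float  # seconds into the video where the chunk begins
    text: str


def chunk_transcript(segments, max_words=40):
    """Merge (start_seconds, text) caption segments into larger chunks."""
    chunks, buffer, start = [], [], None
    for seg_start, text in segments:
        if start is None:
            start = seg_start  # a chunk keeps the timestamp of its first segment
        buffer.extend(text.split())
        if len(buffer) >= max_words:
            chunks.append(Chunk(start, " ".join(buffer)))
            buffer, start = [], None
    if buffer:
        chunks.append(Chunk(start, " ".join(buffer)))
    return chunks


def embed(text):
    """Toy embedding (word counts); an assumption standing in for a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TranscriptIndex:
    """In-memory index over one video's chunks with cached lookups."""

    def __init__(self, chunks):
        self.chunks = list(chunks)
        self.vectors = [embed(c.text) for c in self.chunks]
        # Repeated questions are answered from the cache instead of re-ranking.
        self.search = lru_cache(maxsize=256)(self._search)

    def _search(self, question, k=2):
        q = embed(question)
        ranked = sorted(
            zip(self.chunks, self.vectors),
            key=lambda pair: cosine(q, pair[1]),
            reverse=True,
        )
        return tuple(chunk for chunk, _ in ranked[:k])
```

Because each chunk carries the start time of its first caption segment, any retrieved passage can be cited with a timestamp. Swapping `embed` for a dense model and the list scan for a vector index changes the quality and speed of ranking but not the overall shape of the flow.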
Full Privacy via Local Processing
- All AI inference (Phi-3 running through Ollama) and transcript processing are performed locally.
- No data is sent to external servers, meeting high standards for privacy and data security.
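
To make the local-only inference path concrete, here is a hedged Python sketch of a RAG-style prompt sent to an Ollama server on the same machine. It assumes Ollama's default local endpoint (`http://localhost:11434/api/generate`) and the `phi3` model tag. The helper names, the prompt wording, and the host check are illustrative assumptions rather than YTBuddy's implementation.

```python
import json
import urllib.request
from urllib.parse import urlparse

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def build_prompt(question, passages):
    """Combine retrieved (timestamp_label, text) passages with the question."""
    context = "\n".join(f"[{label}] {text}" for label, text in passages)
    return (
        "Answer using only the transcript excerpts below and cite timestamps.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


def build_request(prompt, model="phi3", url=OLLAMA_URL):
    """Create the POST request, refusing any host that is not the local machine."""
    host = urlparse(url).hostname
    if host not in LOCAL_HOSTS:
        raise ValueError(f"refusing to send transcript data to non-local host: {host}")
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt):
    """Send the prompt to a locally running Ollama server and return its reply."""
    with urllib.request.urlopen(build_request(prompt), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask_local_model` requires a running Ollama instance with the model already pulled. The host check in `build_request` is one simple way to enforce in code that prompts never go to a remote endpoint.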
The Results
- Improved Learning Efficiency: Users saved up to 70% of the time spent finding insights in instructional videos.
- Enhanced Content Accessibility: Semantic search uncovered in-video answers that were previously unsearchable.
- Privacy-First Design: Eliminated concerns around personal or professional data leaving the local machine.
- Contextual Referencing: Timestamp-linked responses improved note-taking, team collaboration, and content review.