Scholar AI - LLM-based RAG System
An LLM-based RAG system for academic literature retrieval, with intelligent document parsing and real-time streaming responses.
GPT 4o
React
FastAPI
Python
TypeScript
Ant Design
Elasticsearch
Hybrid Search BM25+Vector
BGE m3 embedding
BGE Reranker
Overview
Scholar AI is a comprehensive academic literature retrieval system that leverages large language models and retrieval-augmented generation (RAG) to help researchers find and understand relevant academic papers. The system combines traditional keyword search with semantic understanding to provide more accurate and contextual results.
Key Features
- Intelligent document parsing (PDF/Word/PPT): Seamless extraction and understanding of content from multiple document formats.
- Real-time streaming responses: Dynamic LLM integration with GPT-4 for instant, conversational AI interactions.
- Multi-language support (Chinese/English/Japanese): Cross-lingual capabilities for global accessibility.
- Intelligent question recommendations: Context-aware suggestions to guide user inquiries and exploration.
- Hybrid search combining BM25 and vector similarity: Advanced retrieval combining keyword matching and semantic understanding.
- BGE reranking for improved relevance: Enhanced result ordering for more accurate information retrieval.
Technical Architecture
Core Technical Implementations
- Layout-Aware PDF Parsing: Preserves reading order and structure.
- Hybrid Retrieval System: Keyword search misses semantics; vector search misses exact terms (model names, authors). Combining BM25 with vector similarity covers both cases.
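
The page does not say how the BM25 and vector result lists are merged, so the following is only a minimal sketch of one common option, reciprocal rank fusion (RRF). The function name, the constant `k=60`, and the document IDs are illustrative assumptions, not details of Scholar AI itself. In a pipeline like the one described, the fused list would then be passed to the reranker.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Hypothetical helper: each document earns 1 / (k + rank) from every
    list it appears in, so documents ranked well by both retrievers rise.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Illustrative inputs (made-up IDs): one list per retriever, best hit first.
bm25_hits = ["paper_bert", "paper_gpt", "paper_t5"]
vector_hits = ["paper_gpt", "paper_roberta", "paper_bert"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

RRF uses only rank positions, which sidesteps the fact that BM25 scores and cosine similarities are on different scales. A weighted sum of normalized scores is another reasonable choice.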