Vector Retrieval

This chapter describes the core retrieval principles of the GC-QA-RAG intelligent Q&A system: the vectorization strategy, the hybrid retrieval mechanism, RRF fusion ranking, and other key implementation details.

1. Retrieval Process Overview

The system follows a typical three-stage RAG (Retrieval-Augmented Generation) architecture. The retrieval stage has one goal: when a user asks a question, combine keyword matching with semantic understanding to quickly locate the most relevant knowledge points, which then support the generation of a high-quality answer.

The retrieval process is as follows:

The user inputs a question;
The system vectorizes the question (dense + sparse);
The "question" and "answer" fields of the knowledge base are searched in parallel;
The RRF (Reciprocal Rank Fusion) algorithm fuses the multi-path retrieval results and returns the TopK best answers.

2. Hybrid Retrieval Mechanism

2.1 Multi-channel Retrieval

The system uses hybrid search: both sparse vectors (BM25) and dense vectors (dense embeddings) are used to search the "question" and "answer" fields:

Sparse retrieval (BM25): suited to queries with clear keywords; recalls entries that contain the exact query terms;
Dense retrieval (dense vector): based on semantic similarity; suited to complex phrasing and fuzzy queries where the wording differs from the stored text.

Each retrieval path returns up to 40 candidate results (TopK=40).
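The difference between the two channels can be seen in a toy example. The sketch below is illustrative only: it uses raw term counts instead of real BM25 weights, and the helper names (`sparse_vector`, `sparse_score`, `cosine`) are hypothetical, not taken from the project.

```python
import math
from collections import Counter


def sparse_vector(text: str) -> dict[str, float]:
    """Toy sparse vector: raw term counts per token (BM25 would reweight these)."""
    return dict(Counter(text.lower().split()))


def sparse_score(query: dict[str, float], doc: dict[str, float]) -> float:
    """Dot product over shared tokens; zero when no keyword overlaps."""
    return sum(weight * doc[token] for token, weight in query.items() if token in doc)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


query = sparse_vector("export pdf")
doc_keyword = sparse_vector("how to export a report as pdf")
doc_paraphrase = sparse_vector("saving documents in portable format")

# Keyword overlap gives a positive sparse score; a paraphrase with no
# shared token scores zero and can only be found through the dense channel.
print(sparse_score(query, doc_keyword))
print(sparse_score(query, doc_paraphrase))
```

A dense model would place the paraphrase close to the query in embedding space, which is what `cosine` would measure on real embeddings. This is why the two channels are run side by side.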

2.2 Retrieval Fields

Each knowledge entry contains four types of vector features:

Prefix_Question_Dense
Prefix_Answer_Dense
Prefix_Question_Sparse
Prefix_Answer_Sparse

During retrieval, the user question is matched against the dense and sparse vectors of both the "preset question" and "answer" fields, so an entry can be recalled even when the query resembles only its question or only its answer.

2.3 RRF Fusion Ranking

The multi-path retrieval results are fused and ranked with the RRF (Reciprocal Rank Fusion) algorithm, and the top 8 results (TopK=8) are returned. RRF scores each candidate by its rank in each path rather than by its raw score, so it can combine BM25 scores and dense similarities that are not on a common scale, and candidates ranked highly by several paths rise to the top.
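As a minimal sketch of the idea, RRF gives each candidate the score sum(1 / (k + rank)) over every path in which it appears. The constant k=60 below is the value commonly used in the literature and is an assumption here; the constant used by the vector database may differ. The function name `rrf_fuse` and the sample IDs are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60, top_k: int = 8) -> list[tuple[str, float]]:
    """Fuse ranked ID lists with Reciprocal Rank Fusion and keep the top_k."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the best hit of a path contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID so the order is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_k]


paths = [
    ["a", "b", "c"],  # question dense
    ["b", "a", "d"],  # answer dense
    ["b", "c", "a"],  # question sparse
    ["d", "b", "a"],  # answer sparse
]
fused = rrf_fuse(paths)
for doc_id, score in fused:
    print(doc_id, round(score, 5))
```

In this sample, "b" is ranked first or second by every path and therefore ends up on top, even though no raw scores were compared.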

3. Retrieval Implementation Details

3.1 Vectorization and Querying

The user question is first converted into a dense vector and a sparse vector (such as BM25 weights) by the embedding models;
The query is then run against four vector paths ("question dense", "answer dense", "question sparse", "answer sparse") through the multi-path prefetch interface of the vector database (such as Qdrant);
The retrieval results are fused with RRF, deduplicated, and returned.
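The steps above could translate into a single request of roughly the following shape. This is a sketch under stated assumptions: the body follows the request format of Qdrant's Query API (`prefetch`, `using`, `limit`, and a fusion `query`) and should be checked against the Qdrant version in use; treating the four feature names as named vectors of the collection is an assumption; `build_hybrid_query` is a hypothetical helper, not the project's code.

```python
import json

PREFETCH_LIMIT = 40  # candidates per path, as stated in the text
FINAL_LIMIT = 8      # results kept after fusion, as stated in the text


def build_hybrid_query(dense: list[float], sparse: dict[int, float]) -> dict:
    """Build a request body with four prefetch paths fused by RRF."""
    # Sparse queries are sent as parallel lists of token indices and weights.
    sparse_query = {"indices": list(sparse.keys()), "values": list(sparse.values())}
    prefetch = [
        {"query": dense, "using": "Prefix_Question_Dense", "limit": PREFETCH_LIMIT},
        {"query": dense, "using": "Prefix_Answer_Dense", "limit": PREFETCH_LIMIT},
        {"query": sparse_query, "using": "Prefix_Question_Sparse", "limit": PREFETCH_LIMIT},
        {"query": sparse_query, "using": "Prefix_Answer_Sparse", "limit": PREFETCH_LIMIT},
    ]
    return {
        "prefetch": prefetch,
        "query": {"fusion": "rrf"},  # fuse the four candidate lists server-side
        "limit": FINAL_LIMIT,
        "with_payload": True,        # return the stored fields with each hit
    }


# Made-up vectors for illustration; real ones come from the embedding step.
body = build_hybrid_query([0.1, 0.2, 0.3], {7: 0.8, 42: 0.5})
print(json.dumps(body, indent=2))
```

Letting the database fuse the prefetch results keeps the client to one round trip per collection.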

3.2 Code Implementation Key Points

Taking search.py as an example, the core retrieval logic is as follows:

get_embedding_pair: generates the dense and sparse vectors for the input question;
search_sementic_hybrid_single: for a single knowledge base collection, runs prefetch retrieval over the four vector paths and fuses the rankings with RRF;
search_sementic_hybrid: runs the retrieval in parallel across all knowledge bases (such as documents, forum Q&A, tutorials) and merges the results;
distinct_search_hits: deduplicates the retrieval results so that each knowledge point appears only once.
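The fan-out and deduplication steps might look like the sketch below. It does not reproduce search.py: `search_single` is a hypothetical stand-in that returns canned hits, and deduplicating on the (Question, Answer) pair while keeping the highest score is an assumption about what "unique" means here.

```python
from concurrent.futures import ThreadPoolExecutor


def search_single(collection: str, question: str) -> list[dict]:
    """Hypothetical stand-in for a per-collection hybrid search; returns canned hits."""
    canned = {
        "docs": [{"Question": "How to export PDF?", "Answer": "Use Export.", "score": 0.9}],
        "forum": [
            {"Question": "How to export PDF?", "Answer": "Use Export.", "score": 0.7},
            {"Question": "PDF fonts missing", "Answer": "Embed fonts.", "score": 0.6},
        ],
    }
    return [dict(hit, collection=collection) for hit in canned.get(collection, [])]


def search_all(collections: list[str], question: str) -> list[dict]:
    """Query every collection in parallel threads and flatten the hits."""
    with ThreadPoolExecutor(max_workers=max(1, len(collections))) as pool:
        per_collection = pool.map(lambda name: search_single(name, question), collections)
        return [hit for hits in per_collection for hit in hits]


def dedupe(hits: list[dict]) -> list[dict]:
    """Keep the highest-scoring hit for each (Question, Answer) pair."""
    best: dict[tuple[str, str], dict] = {}
    for hit in hits:
        key = (hit["Question"], hit["Answer"])
        if key not in best or hit["score"] > best[key]["score"]:
            best[key] = hit
    return sorted(best.values(), key=lambda hit: hit["score"], reverse=True)


results = dedupe(search_all(["docs", "forum"], "How do I export a PDF?"))
for hit in results:
    print(hit["collection"], hit["Question"], hit["score"])
```

Threads fit this workload because each per-collection search is an I/O-bound call to the vector database.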

3.3 Retrieval Process Diagram

User Question
   │
   └─> Generate dense/sparse vectors
         │
         ├─> [Question Dense]  ─┐
         ├─> [Answer Dense]    ─┼─> Multi-path retrieval (TopK=40)
         ├─> [Question Sparse] ─┤
         └─> [Answer Sparse]   ─┘
                                │
                                └─> RRF fusion ranking → TopK=8
                                      │
                                      └─> Return retrieval results

4. Structure and Usage of Retrieval Results

Each retrieval result contains:

Question: Preset question
Answer: Standard answer
FullAnswer: Detailed explanation
Summary: Context summary
Url, Title, Category, Date and other metadata

This information is used for direct display and also serves as context for the large language model in the subsequent answer generation step.
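One way to carry these fields into the generation step is sketched below. The field names mirror the list above; the `SearchHit` class, the `build_context` helper, the layout of the context string, and the sample values (including the example.com URL) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SearchHit:
    """One retrieval result; attribute names mirror the fields listed above."""
    Question: str
    Answer: str
    FullAnswer: str
    Summary: str
    Url: str
    Title: str
    Category: str
    Date: str


def build_context(hits: list[SearchHit]) -> str:
    """Format hits as numbered blocks that can be placed in an LLM prompt."""
    blocks = []
    for index, hit in enumerate(hits, start=1):
        blocks.append(
            f"[{index}] {hit.Title} ({hit.Category}, {hit.Date})\n"
            f"Q: {hit.Question}\n"
            # Prefer the detailed explanation, fall back to the short answer.
            f"A: {hit.FullAnswer or hit.Answer}\n"
            f"Source: {hit.Url}"
        )
    return "\n\n".join(blocks)


hit = SearchHit(
    Question="How to export PDF?",
    Answer="Use Export.",
    FullAnswer="",
    Summary="Exporting reports",
    Url="https://example.com/docs/export",
    Title="Export Guide",
    Category="Docs",
    Date="2024-01-01",
)
context = build_context([hit])
print(context)
```

Numbering the blocks lets the generated answer cite its sources by index, and the URL can be shown to the user next to the answer.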

5. Technical Advantages and Optimization Points

Multi-path hybrid retrieval: combines keyword matching with semantic understanding, improving both recall and accuracy;
RRF fusion ranking: fuses the results of all channels by rank, improving the diversity and relevance of the final list;
Prefix mechanism: adds document category/title prefixes to the text, which keeps entries that read alike but belong to different documents from being confused with one another and improves retrieval precision;
Efficient deduplication: Ensures each knowledge point is unique, avoiding interference from duplicate information.
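To illustrate the prefix mechanism, the sketch below prepends the category and title to a text before it is embedded. The exact prefix format used by the system is not specified in this chapter, so the bracketed layout and the `with_prefix` helper are assumptions.

```python
def with_prefix(text: str, category: str, title: str) -> str:
    """Prepend category and title so similar sentences from different documents embed differently."""
    # Skip empty parts so a missing category or title leaves no stray separator.
    parts = [part.strip() for part in (category, title) if part and part.strip()]
    prefix = " / ".join(parts)
    return f"[{prefix}] {text}" if prefix else text


question = "How do I set the page size?"
prefixed_a = with_prefix(question, "Docs", "Report Designer")
prefixed_b = with_prefix(question, "Docs", "Spreadsheet Export")

# The same question now yields two distinct strings, and so two distinct
# embeddings, one per source document.
print(prefixed_a)
print(prefixed_b)
```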

6. Summary

Through multi-channel hybrid retrieval, RRF fusion ranking, and its vectorization and metadata design, the system retrieves knowledge efficiently and precisely, providing a solid foundation for the intelligent Q&A system.