Which LLM wins at RAG Q&A?

Retrieval-Augmented Generation (RAG) reduces hallucinations by grounding answers in retrieved source documents. This study benchmarked five 7B-class LLMs on computer science literature Q&A, comparing answer accuracy and latency.
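To make the setup concrete, here is a minimal RAG sketch in Python. It assumes FAISS for vector search and a sentence-transformers embedding model; the paper's exact retrieval stack, embedding model, and prompt template are not given here, so treat names like all-MiniLM-L6-v2 as placeholders.

```python
# Minimal RAG sketch (assumptions: FAISS for retrieval, a sentence-transformers
# embedder; the paper's actual pipeline may differ).
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Mistral-7B-Instruct is a 7B-parameter open-source instruction-tuned LLM.",
    "RAG retrieves relevant passages and conditions generation on them.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

# Inner product on unit-normalized vectors is cosine similarity.
index = faiss.IndexFlatIP(doc_vecs.shape[1])
index.add(doc_vecs)

def retrieve(question: str, k: int = 1) -> list[str]:
    q_vec = embedder.encode([question], normalize_embeddings=True)
    _, ids = index.search(q_vec, k)
    return [docs[i] for i in ids[0]]

question = "What does RAG do?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# `prompt` would then be sent to the LLM under test (e.g., Mistral-7B-Instruct).
print(prompt)
```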

  • GPT‑3.5 + RAG answered both yes/no and long-form questions effectively.
  • Mistral‑7B‑Instruct + RAG led the open-source pack on both question types.
  • Orca‑mini‑v3‑7B was fastest (lowest average latency); Llama‑2‑7B‑Chat was slowest.

How they measured it: accuracy and precision for binary (yes/no) questions; rankings from a human expert and from Gemini; and cosine similarity between generated and reference answers for long-form questions.
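For the long-answer metric, a minimal sketch of embedding-based cosine similarity, assuming a sentence-transformers model (the specific embedding model is an assumption, not the paper's stated choice):

```python
# Hypothetical sketch: scoring a long-form answer against a reference answer
# via cosine similarity over sentence embeddings.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

reference = "RAG grounds model answers in retrieved source passages."
candidate = "Retrieval-augmented generation anchors answers to retrieved documents."

ref_vec, cand_vec = model.encode([reference, candidate])
print(f"cosine similarity: {cosine_similarity(ref_vec, cand_vec):.3f}")
```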

Big picture: With the right RAG setup and infrastructure, open-source LLMs can stand shoulder to shoulder with proprietary models like GPT‑3.5.

Paper by Ranul Dayarathne, Uvini Ranaweera, and Upeksha Ganegoda.

Paper: http://arxiv.org/abs/2511.03261v1

Register: https://www.AiFeta.com

#RAG #LLMs #QA #GenerativeAI #Mistral #GPT35 #OpenSourceAI #AIResearch #NLP #ComputerScience
