When Bias Pretends to Be Truth: How Spurious Correlations Undermine Hallucination Detection in LLMs
Why do AI models sometimes sound sure while being wrong? This study spotlights a subtle culprit: spurious correlations, strong but misleading patterns in training data (like linking certain surnames to a nationality).

* These shortcuts make LLMs produce confident, wrong answers.
* Making models bigger doesn't fix it.
* Popular detectors—confidence