Research - AI Feta, the news about scientific AI research

AI

Escaping the Verifier: Learning to Reason via Demonstrations

LLMs can learn to reason without task verifiers. Many real-world problems don’t have automatic checkers to grade answers, even though we have lots of expert solutions. RARO (Relativistic Adversarial Reasoning Optimization) shows how to train reasoning skills from those examples alone. How it works: * A policy (the model) tries

Robotics

VacuumVLA: A Two-in-One Robot Hand That Grips and Suctions

Robots guided by Vision-Language-Action (VLA) AI are getting better at everyday tasks—but most still use simple two-finger grippers. That limits them on smooth, flat, or handleless objects. VacuumVLA is a low-cost robot hand that merges a standard two-finger gripper with a vacuum suction cup. The robot can switch between

AI

ToolOrchestra: Small AI that smartly manages tools

Large models are powerful, but solving deep, multi-step problems is pricey. ToolOrchestra trains a small “orchestrator” that decides which tools to use, when, and how—like a smart project manager for AI. It uses reinforcement learning with rewards for outcomes, efficiency, and user preferences. The result: an 8B Orchestrator that

AI

ToolOrchestra: Small Maestros, Big Intelligence

Small model, big wins. Large language models are great generalists, but really tough, multi-step problems still strain both brains and budgets. ToolOrchestra flips the script: instead of one giant model, a small “orchestrator” coordinates other models and specialized tools. Trained with reinforcement learning that rewards outcomes, efficiency, and user preferences,

AI

BAMAS: Budget-Aware AI Teams That Deliver

What’s new: As AI "teams" of LLM agents grow, the cloud bill can explode. BAMAS is a framework that designs these teams with a dollar cap in mind. How it works: * Picks the right mix of models: Uses integer linear programming to balance accuracy vs. price. * Plans

Robotics

VacuumVLA: Two skills, one robot hand

Robots guided by Vision-Language-Action (VLA) models are getting good at everyday tasks — but most still grab with simple two-finger claws. That limits them on flat, slippery, or handle-less surfaces. VacuumVLA adds a low-cost twist: a single end-effector that combines a standard gripper with a vacuum suction tool. It can switch

AI

Universe of Thoughts: Can AI Think Creatively?

Can AI think creatively? Large language models are great at step-by-step logic, but real-world breakthroughs need more than routine problem-solving. This paper proposes a creative reasoning framework, rooted in cognitive science, and turns it into practice with "Universe of Thoughts" (UoT): methods that help LLMs search wider and

AI

Can “Vibe Coding” Beat Grad Students? Not Yet.

LLMs can write tidy code that passes unit tests—but can they build money-making agents that plan, bid, and deliver under pressure? This study pitted 40 LLM-coded agents (prompted with methods including “vibe coding,” i.e., high-level guidance) against 17 human-coded agents from grad CS students in a real-world-style challenge:

AI

Teaching AI to Smell: Meet the New York Smells Dataset

Computers can see and hear, but smelling the world has been out of reach. New research introduces New York Smells: a large, real-world dataset that pairs odors with images. * 7,000 smell-image pairs from 3,500 objects, indoors and outdoors — about 70x more objects than past datasets. * Benchmark tasks: match

AI

New Open‑Source Tool to Check If AI Trained on Your Work

If you’re a writer, artist, or publisher, you can now verify whether your content was used to train large language models—without a data center or a PhD. In “Copyright Detection in Large Language Models,” David Szczecina, Senan Gaffori, and Edmond Li introduce an open-source platform that makes copyright

AI

Fighting AI with AI: Safer Planes and Self-Driving Cars

Fighting AI with AI: How do we trust AI inside airplanes and self-driving cars? Deep neural networks are powerful but opaque, and that makes traditional safety checks hard. This paper proposes using foundation models to make AI-enabled systems safer, from requirements to deployment. * REACT: uses Large Language Models to translate

AI

Wearables + Words: Estimating Calories with NPLM

What if your watch could help estimate your daily calories—not just count steps? Researchers introduce the Nutrition Photoplethysmography Language Model (NPLM), which blends heart-signal data from consumer wearables (PPG) with meal descriptions so AI can reason about both physiology and food. Trained on 19,340 people and 1.1