Beyond URLs: Metadata That Makes LLMs Train Faster
**Smarter LLMs, faster, thanks to metadata.** What if training a large language model didn't just rely on the text itself, but also on the context around it? This study shows that adding fine-grained metadata, not just URLs, can meaningfully speed up pretraining and improve model quality.

* Beyond URLs: detailed quality signals (e.