Kari Jaaskelainen - AI Feta, the news about scientific AI research (Page 65)

TAMAS: Stress-testing Multi‑Agent AI for Safety

AI agents are starting to work in teams. That unlocks power, along with new ways things can go wrong. TAMAS is a benchmark that stress‑tests multi‑agent LLM systems against adversarial tricks and coordination failures.
* 5 realistic scenarios, 300 attack instances across 6 attack types
* 211 tools, plus 100 harmless

AI

DeepEyesV2: Teaching AI to Use Tools

AI that sees, thinks, and uses tools. Meet DeepEyesV2, a multimodal “agentic” model that doesn’t just read text and look at images: it can call external tools like code runners and web search, then weave the results into its reasoning. Key ideas:
* Two-stage training: a cold-start phase

AI

Enhancing Public Speaking Skills in Engineering Students Through AI

Meet your AI public-speaking coach. Engineers need to explain complex ideas clearly, but personalized practice is hard to get. This research builds a multimodal AI trainer that analyzes both what you say and how you say it, then delivers targeted feedback at scale.
* Verbal: pitch, loudness, pacing, intonation
* Non-

Can AI follow your company’s rules in long chats? Meet the Pluralistic Behavior Suite

AI assistants are not used in a vacuum: they operate inside hospitals, banks, classrooms, and brands, each with its own rules. This paper introduces the Pluralistic Behavior Suite (PBSUITE), a testbed to see whether language models can stick to your custom policies over multi-turn conversations.
* 300 realistic behavioral policies across

AI

LiveStar: An AI Co-Host for Live Streams

Meet LiveStar, an AI co‑host for live streams that understands what’s happening on screen and speaks up at the right moment. Most video AIs do great offline, but stumble live: they process frames slowly and interrupt or fall behind. LiveStar fixes that with:
* Adaptive streaming decoding: keeps a

AI

AI Agents Clean Maintenance Logs for Smarter Predictive Maintenance

Predictive maintenance means fixing equipment before it fails. But in the auto industry, messy maintenance logs—full of typos, missing fields, near-duplicates, and wrong dates—can derail the machine-learning models that power those predictions. This study tests large language model (LLM) agents as smart cleaners for those logs.

AI

AI that finds the facts hidden in court verdicts

Criminal justice data often record charges and outcomes—but not what actually happened. Yet many continental European verdicts spell out the facts. This study shows we can automatically pull those fact sections from public Slovak court decisions. The team tested two tools: 1) smart pattern‑matching that recognizes how courts

AILiteracy

Hands-On, No-Code AI for Community Colleges

Artificial intelligence is now part of everyday tools. How should community colleges teach it to non-STEM students? Researchers tested AI User, a no-code, scenario-based online curriculum, with four focus groups of community college instructors.
* Instructors loved exploratory, real-world tasks that let students experiment safely with AI.

UrbanPlanning

Reasoning Is All You Need for Urban Planning AI

Cities need AI that can explain itself. Most planning AI spots patterns in data. The next leap is AI that helps choose sites, allocate budgets, and balance trade-offs, all while showing its work. This paper introduces the Agentic Urban Planning AI Framework built on new reasoning methods like Chain-of-

AudioAI

Stronger Song IDs for the TikTok Era

Remixes, sped-up clips, and heavy compression make it hard to trace where music on social platforms comes from. This study introduces more robust neural audio fingerprints (compact “IDs” for audio) that can still recognize a song after it’s been distorted.
* Music-first brains: Instead of training from scratch,

AI

Poke Around and Learn: No‑Code AI Projects Boost Critical Thinking

As AI shapes decisions everywhere, learners beyond computer science need practical AI literacy. But too often, lessons are code-heavy or lecture-only, which makes them hard to access and easy to forget. A new study of AI User, a modular, no-code web curriculum, shows another path. Researchers reviewed Projects 5–8

Rethinking AI Literacy: Test What Workers Actually Do

Most AI 'literacy' tests still grade you on math and code. But in real jobs, success looks more like choosing the right tool, interpreting model output, and flagging ethical risks. A team working with a US Navy robotics training program built a task-oriented assessment: scenario questions that