REMA: A Unified Reasoning Manifold Framework for Interpreting Large Language Models
Diagnosing LLM reasoning failures as geometric deviations from a low-dimensional manifold.
Why do large language models succeed—or fail—on complex reasoning? REMA reframes the question geometrically. It posits a Reasoning Manifold: a low-dimensional structure formed by internal representations associated with correct reasoning trajectories. Errors, then, are measurable deviations from this manifold.
REMA operationalizes this idea in two steps. First, it approximates the manifold using representations from correct samples and computes a unified failure signal: the k-nearest-neighbor (kNN) distance from erroneous representations to that manifold. Second, it localizes divergence points by tracking this deviation layer by layer and contrasting it with the natural internal variability of correct examples, pinpointing where a chain of thought begins to derail.
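A minimal sketch of these two steps, assuming per-layer hidden states (e.g., mean-pooled residual-stream activations) have already been extracted for correct and erroneous samples. Function names, pooling choice, and the z-score threshold are illustrative assumptions, not the paper's actual API.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_deviation(correct_reps, query_reps, k=10):
    """Mean distance from each query representation to its k nearest
    correct-sample representations: the unified failure signal."""
    nn = NearestNeighbors(n_neighbors=k).fit(correct_reps)
    dists, _ = nn.kneighbors(query_reps)           # shape (n_query, k)
    return dists.mean(axis=1)

def baseline_spread(correct_reps, k=10):
    """kNN distance among correct samples (self-match excluded):
    the natural internal variability of the approximated manifold."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(correct_reps)
    dists, _ = nn.kneighbors(correct_reps)
    return dists[:, 1:].mean(axis=1)               # drop the zero self-distance

def localize_divergence(correct_by_layer, wrong_by_layer, k=10, z_thresh=2.0):
    """Track deviation layer by layer; flag layers where erroneous samples
    deviate well beyond the variability of correct ones."""
    report = []
    for layer, (C, W) in enumerate(zip(correct_by_layer, wrong_by_layer)):
        base = baseline_spread(C, k)
        dev = knn_deviation(C, W, k)
        z = (dev.mean() - base.mean()) / (base.std() + 1e-8)
        report.append((layer, float(z), bool(z > z_thresh)))
    return report
```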
Across diverse models and tasks—language and multimodal—REMA finds strong separability between correct and incorrect reasoning states and consistent low-dimensional structure in successful trajectories. Practically, this makes it a powerful tool for diagnosis, curriculum design, and targeted interventions (e.g., layer-wise constraints, selective self-consistency, or process supervision precisely where deviations emerge).
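A hypothetical way to probe both findings under the same assumptions as the sketch above: separability via the AUROC of the deviation score, and low-dimensionality via the PCA variance profile of correct-state representations. The paper's exact metrics may differ.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import roc_auc_score
from sklearn.neighbors import NearestNeighbors

def separability_auroc(correct_reps, wrong_reps, k=10):
    """AUROC of the kNN-deviation score at telling erroneous states from correct ones."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(correct_reps)
    d_correct, _ = nn.kneighbors(correct_reps)                 # first column is the zero self-distance
    d_wrong, _ = nn.kneighbors(wrong_reps, n_neighbors=k)
    scores = np.concatenate([d_correct[:, 1:].mean(axis=1), d_wrong.mean(axis=1)])
    labels = np.concatenate([np.zeros(len(correct_reps)), np.ones(len(wrong_reps))])
    return roc_auc_score(labels, scores)

def effective_dimension(correct_reps, var_threshold=0.9):
    """Number of principal components needed to capture var_threshold of the
    variance: a rough proxy for the dimensionality of the reasoning manifold."""
    cum = np.cumsum(PCA().fit(correct_reps).explained_variance_ratio_)
    return int(np.searchsorted(cum, var_threshold) + 1)
```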
What’s new: a unified, quantitative interpretability lens for reasoning, bridging abstract failure narratives with concrete geometric signatures. Why it matters: model developers can move beyond outcome-only metrics, instrument training and inference with deviation-based monitoring, and study generalization and brittleness through manifold topology.
Expect REMA to inform better evaluation suites, model debugging pipelines, and potentially new training objectives that explicitly regularize toward stable reasoning subspaces.
Paper: http://arxiv.org/abs/2509.22518v1
Register: https://www.AiFeta.com
#AI #LLM #Interpretability #Reasoning #RepresentationLearning #Manifold