CALM: Faster AI by predicting vectors, not tokens

Large language models typically generate text one token at a time, a core bottleneck for speed and cost. CALM (Continuous Autoregressive Language Models) flips the script.

Instead of predicting the next token, CALM predicts the next continuous vector. A high-fidelity autoencoder packs a chunk of K tokens into a single vector and reconstructs the original text with over 99.9% accuracy, so generation takes roughly K× fewer autoregressive steps, making it faster and cheaper.
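
In practice, the model autoregresses over latent vectors and only maps back to tokens at the end. Below is a minimal PyTorch sketch of that loop; the module names, the chunk size K, and the latent dimension are illustrative assumptions, not the paper's actual code.

```python
# Illustrative sketch of a CALM-style generation loop (not the paper's implementation).
import torch
import torch.nn as nn

K = 4           # tokens compressed per vector (assumed chunk size)
D_LATENT = 128  # dimensionality of each continuous latent vector (assumed)

class Autoencoder(nn.Module):
    """Packs a chunk of K token ids into one latent vector and back (hypothetical)."""
    def __init__(self, vocab_size: int, d_model: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encode_proj = nn.Linear(K * d_model, D_LATENT)
        self.decode_proj = nn.Linear(D_LATENT, K * d_model)
        self.to_logits = nn.Linear(d_model, vocab_size)

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, K) -> latent vector (batch, D_LATENT)
        x = self.embed(token_ids).flatten(1)
        return self.encode_proj(x)

    def decode(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, D_LATENT) -> per-token logits (batch, K, vocab)
        x = self.decode_proj(z).unflatten(1, (K, -1))
        return self.to_logits(x)

def generate(predictor: nn.Module, autoencoder: Autoencoder,
             prefix_vectors: torch.Tensor, steps: int) -> torch.Tensor:
    """Autoregress in vector space: each step emits one vector, i.e. ~K tokens.

    `predictor` is any sequence model mapping (batch, T, D_LATENT) -> (batch, T, D_LATENT).
    """
    vectors = prefix_vectors  # (batch, T, D_LATENT)
    for _ in range(steps):
        next_vec = predictor(vectors)[:, -1]                # predict the next continuous vector
        vectors = torch.cat([vectors, next_vec[:, None]], dim=1)
    token_logits = autoencoder.decode(vectors.flatten(0, 1))  # map every vector back to K tokens
    return token_logits.argmax(-1)
```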

  • Models language as a sequence of continuous vectors
  • Likelihood‑free training, evaluation, and controllable sampling in the continuous domain (see the sketch after this list)
  • Matches strong discrete baselines at significantly lower compute
  • A scalable pathway toward ultra‑efficient LLMs
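
Because the targets are continuous vectors rather than vocabulary indices, training cannot rely on a softmax likelihood. As a rough illustration only, a strictly proper scoring rule such as the energy score needs nothing from the model except samples; this is a generic likelihood-free objective for continuous outputs, not necessarily CALM's exact loss.

```python
import torch

def energy_score_loss(samples: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Monte-Carlo energy score (lower is better); a generic likelihood-free objective.

    samples: (batch, n_samples, d) vectors drawn from the model's generative head
    target:  (batch, d) ground-truth latent vector produced by the autoencoder
    """
    # Term 1: expected distance from model samples to the target vector (accuracy).
    to_target = (samples - target[:, None, :]).norm(dim=-1).mean(dim=1)
    # Term 2: expected distance between pairs of model samples (diversity);
    # the zero diagonal slightly biases this simple estimator, which is fine for a sketch.
    pairwise = (samples[:, :, None, :] - samples[:, None, :, :]).norm(dim=-1).mean(dim=(1, 2))
    return (to_target - 0.5 * pairwise).mean()
```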

Paper: http://arxiv.org/abs/2510.27688v1

Code: https://github.com/shaochenze/calm

Project: https://shaochenze.github.io/blog/2025/CALM

Authors: Chenze Shao, Darren Li, Fandong Meng, Jie Zhou

#AI #LLM #NLP #MachineLearning #DeepLearning #LanguageModels #Efficiency #Research #OpenSource
