
MathyAIwithMike
Discover the Compress & Attend Transformer (CAT), a novel AI architecture whose quality-compute trade-off can be adjusted *after* training. Unlike inflexible models locked into a single fixed operating point, CAT lets users choose their desired performance profile at test time. A compressor turns chunks of the input sequence into learned vector representations, and the decoder attends both to nearby tokens and to the compressed representations of past chunks. This design enables parallelized training and efficient generation with a rolling memory system. CAT offers a path toward more flexible, resource-aware AI.
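
To make the compress-then-attend idea concrete, here is a minimal PyTorch sketch of the rolling-memory pattern described above: a compressor pools each finished chunk into a few summary vectors, and the decoder block attends to local tokens plus the accumulated summaries of past chunks. This is an illustrative sketch, not the authors' implementation; the module names (`ChunkCompressor`, `CATBlock`), chunk size, and number of memory vectors per chunk are all assumed toy values.

```python
# Minimal sketch of the compress-and-attend pattern (not the authors' code).
# All sizes and module names are illustrative assumptions.
import torch
import torch.nn as nn


class ChunkCompressor(nn.Module):
    """Compresses a chunk of token states into a few learned memory vectors."""

    def __init__(self, d_model: int, n_mem: int):
        super().__init__()
        # Learned queries that pool a chunk into n_mem summary vectors.
        self.mem_queries = nn.Parameter(torch.randn(n_mem, d_model) * 0.02)
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

    def forward(self, chunk: torch.Tensor) -> torch.Tensor:
        # chunk: (batch, chunk_len, d_model) -> (batch, n_mem, d_model)
        q = self.mem_queries.unsqueeze(0).expand(chunk.size(0), -1, -1)
        summary, _ = self.attn(q, chunk, chunk)
        return summary


class CATBlock(nn.Module):
    """Decoder block: causal self-attention over local tokens plus
    cross-attention to the compressed memories of past chunks."""

    def __init__(self, d_model: int):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))
        self.n1, self.n2, self.n3 = (nn.LayerNorm(d_model) for _ in range(3))

    def forward(self, x: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        # x: (batch, chunk_len, d_model); memory: (batch, n_past_mem, d_model)
        L = x.size(1)
        causal = torch.triu(torch.ones(L, L, dtype=torch.bool, device=x.device), 1)
        h, _ = self.self_attn(self.n1(x), self.n1(x), self.n1(x), attn_mask=causal)
        x = x + h
        if memory.size(1) > 0:  # the first chunk has no past memory yet
            h, _ = self.cross_attn(self.n2(x), memory, memory)
            x = x + h
        return x + self.ff(self.n3(x))


# Rolling-memory loop: process the sequence chunk by chunk, compressing each
# finished chunk and appending its summary vectors to the memory.
d_model, chunk_len, n_mem = 64, 16, 2            # assumed toy sizes
compressor = ChunkCompressor(d_model, n_mem)
block = CATBlock(d_model)

tokens = torch.randn(1, 4 * chunk_len, d_model)  # stand-in for embedded tokens
memory = torch.zeros(1, 0, d_model)              # starts empty

for start in range(0, tokens.size(1), chunk_len):
    chunk = tokens[:, start:start + chunk_len]
    out = block(chunk, memory)                        # local + compressed-past attention
    memory = torch.cat([memory, compressor(out)], 1)  # append n_mem vectors per chunk

print(memory.shape)  # (1, 8, 64): 4 chunks x 2 memory vectors each
```

In this sketch the ratio `chunk_len / n_mem` is the compression knob: a larger chunk or fewer memory vectors per chunk means less compute and memory at attention time, at the cost of a coarser summary of the past, which is the kind of quality-compute trade-off the description refers to.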