One-Line Summary
A systematic, multilingual evaluation showing that large language models suffer from the reversal curse when processing idioms: they can complete an idiom from its beginning but consistently fail to retrieve the beginning from its ending, exposing a fundamental directional bias in how autoregressive models store memorized linguistic expressions.
Background & Motivation
The reversal curse is a recently identified phenomenon in large language models: if a model learns that "A is B," it cannot reliably infer that "B is A." While this has been demonstrated for factual knowledge (e.g., "Tom Cruise's mother is Mary Lee Pfeiffer" vs. "Mary Lee Pfeiffer's son is Tom Cruise"), its manifestation in linguistic knowledge, specifically idiom processing, has remained unexplored.
Why Idioms Are a Unique Testbed:
- Fixed expressions: Idioms are memorized as whole units (e.g., "break the ice," "spill the beans"), so their retrieval depends on sequential memory rather than compositional reasoning.
- Non-compositional meaning: The meaning of an idiom cannot be derived from its parts, making it a purer test of whether models store bidirectional associations or merely left-to-right sequences.
- Cross-lingual universality: Idioms exist in every language with varying structural properties, enabling cross-lingual investigation of the reversal curse beyond English-centric studies.
- Gap in existing research: Prior reversal curse studies focused exclusively on factual triplets (entity-relation-entity), leaving open whether the phenomenon extends to linguistic knowledge stored during pretraining.
This paper asks a simple but fundamental question: if an LLM can complete "break the ___" with "ice," can it also produce "break the" when given "___ ice"? The answer reveals deep insights about how autoregressive models encode memorized language patterns.
Reversal Curse: From Facts to Language
The original reversal curse phenomenon was demonstrated in the domain of factual knowledge. This work extends the investigation to an entirely different type of stored knowledge: linguistic patterns.
| Aspect | Factual Reversal Curse | Idiomatic Reversal Curse (This Work) |
| --- | --- | --- |
| Knowledge type | World knowledge (entity relations) | Linguistic knowledge (fixed expressions) |
| Example | "Tom Cruise's mother is ___" vs. "Mary Lee Pfeiffer's son is ___" | "Break the ___" vs. "___ ice" |
| Storage mechanism | Fact memorization | Sequence memorization |
| Compositional? | Partially (entity-relation pairs compose) | No (idiom meaning is non-compositional) |
| Cross-lingual? | Mostly English-focused studies | Multi-language evaluation |
Proposed Method
The study designs a controlled experimental framework to measure the reversal curse in idiom processing, consisting of three key components:
1. Bidirectional Idiom Task Design
Two complementary evaluation tasks are constructed: Forward Completion, where the model receives the first half of an idiom and must predict the remainder, and Reverse Retrieval, where the model receives the latter half and must recover the beginning. By comparing performance on these symmetric tasks, the degree of directional asymmetry (i.e., the reversal curse) can be precisely quantified. The task design ensures that both directions require the same underlying knowledge — only the direction of retrieval differs.
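A minimal sketch of how such a task pair and an asymmetry score might be constructed. The helper names (`make_task_pair`, `asymmetry`), the example idiom, and the gap metric are illustrative assumptions, not the paper's exact protocol:

```python
# One idiom is split once; the same (head, tail) pair serves both
# directions, so only the direction of retrieval differs.

def make_task_pair(idiom: str, split_index: int) -> dict:
    """Split an idiom at a word boundary to build forward/reverse tasks."""
    words = idiom.split()
    head = " ".join(words[:split_index])
    tail = " ".join(words[split_index:])
    return {
        "forward": {"prompt": head, "target": tail},  # given beginning, predict end
        "reverse": {"prompt": tail, "target": head},  # given ending, recover beginning
    }

def asymmetry(forward_acc: float, reverse_acc: float) -> float:
    """One simple way to quantify the reversal gap (an assumed metric)."""
    return forward_acc - reverse_acc

pair = make_task_pair("break the ice", split_index=2)
print(pair["forward"])  # {'prompt': 'break the', 'target': 'ice'}
print(pair["reverse"])  # {'prompt': 'ice', 'target': 'break the'}
```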
2. Multilingual Idiom Dataset Construction
Idiom datasets are collected across multiple languages, ensuring diversity in both linguistic families and idiom structure. Each idiom is split at a natural boundary to create forward and reverse test pairs. This cross-lingual design enables the study to determine whether the reversal curse is a universal property of autoregressive models or varies by language characteristics such as word order and morphological complexity. Languages are selected to cover different typological profiles, providing robust evidence beyond any single language.
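A hypothetical record schema for such a multilingual test set; the field names, the example entries, and whitespace-based splitting are assumptions for illustration (languages without word spacing would need a different splitter):

```python
from dataclasses import dataclass

@dataclass
class IdiomItem:
    language: str      # ISO code, e.g. "en", "de"
    idiom: str         # full surface form of the idiom
    split_index: int   # natural word boundary between head and tail

ITEMS = [
    IdiomItem("en", "spill the beans", split_index=1),
    IdiomItem("de", "jemandem die Daumen drücken", split_index=2),
]

# Each record expands into one forward and one reverse test pair,
# using the same split point in both directions.
for item in ITEMS:
    words = item.idiom.split()
    head = " ".join(words[:item.split_index])
    tail = " ".join(words[item.split_index:])
    print(f"{item.language} | forward: '{head} ___' | reverse: '___ {tail}'")
```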
3. Multi-Model Systematic Evaluation
Multiple LLM architectures are evaluated under consistent conditions, with analysis across several dimensions: model size and family, idiom frequency in training corpora, structural properties of idioms (length, compositionality), and language-specific effects. This systematic approach isolates which factors contribute to or mitigate the reversal curse in linguistic knowledge.
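A sketch of what a consistent evaluation loop over several causal LMs could look like, using the Hugging Face `transformers` API. The checkpoint names are placeholders, and greedy decoding with containment-based scoring is an assumption, not necessarily the paper's setup:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODELS = ["gpt2", "EleutherAI/pythia-1.4b"]  # placeholder checkpoints

def generate(model, tok, prompt: str, max_new_tokens: int = 8) -> str:
    """Greedy-decode a short continuation for one prompt."""
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=max_new_tokens,
                             do_sample=False, pad_token_id=tok.eos_token_id)
    # Keep only the newly generated tokens, not the echoed prompt.
    return tok.decode(out[0][inputs["input_ids"].shape[1]:],
                      skip_special_tokens=True).strip()

def accuracy(model, tok, tasks: list[dict]) -> float:
    """Score a task list: the target string must appear in the output."""
    hits = sum(t["target"] in generate(model, tok, t["prompt"]) for t in tasks)
    return hits / len(tasks)

for name in MODELS:
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)
    # Build forward_tasks / reverse_tasks with make_task_pair (see above),
    # then compare accuracy(model, tok, forward_tasks) against
    # accuracy(model, tok, reverse_tasks) to quantify the reversal gap.
```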
Controlled Experimental Design:
The experimental framework is carefully designed to isolate the directional bias from confounding factors:
- Same knowledge, different direction: Forward and reverse tasks test the same idiom, so any performance gap is attributable to directional bias rather than knowledge gaps.
- Natural split points: Idioms are divided at linguistically natural boundaries (e.g., between clauses or phrases), avoiding artificial splits that might introduce artifacts.
- Controlled prompting: Both tasks use parallel prompt structures to ensure that differences in performance reflect the reversal curse rather than prompt engineering effects.
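To make the prompting parallelism concrete, here is a hypothetical pair of templates; the exact wording is an assumption, but the point is that both directions share identical scaffolding:

```python
# Hypothetical parallel templates: the scaffolding is identical, so any
# performance gap reflects retrieval direction, not prompt wording.
FORWARD_TEMPLATE = 'Complete the idiom: "{head} ___"'
REVERSE_TEMPLATE = 'Complete the idiom: "___ {tail}"'

print(FORWARD_TEMPLATE.format(head="break the"))  # Complete the idiom: "break the ___"
print(REVERSE_TEMPLATE.format(tail="ice"))        # Complete the idiom: "___ ice"
```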
Experimental Results
The experiments reveal a consistent and significant directional asymmetry across all evaluated models and languages:
Forward vs. Reverse Performance
| Factor | Forward Completion | Reverse Retrieval | Asymmetry |
| --- | --- | --- | --- |
| Overall (all models, all languages) | High accuracy | Substantially lower | Consistent gap |
| High-frequency idioms | Very high | Moderately lower | Reduced but present |
| Low-frequency idioms | Moderate | Very low | Large gap |
| Short/formulaic idioms | High | Moderate | Smaller gap |
| Long/opaque idioms | Moderate | Very low | Largest gap |
Key Findings
- Clear reversal curse in idioms: Forward idiom completion achieves substantially higher accuracy than reverse retrieval across all evaluated models, confirming that the reversal curse extends beyond factual knowledge to memorized linguistic expressions.
- Cross-lingual consistency: The forward-reverse performance gap is observed across all tested languages, demonstrating that the reversal curse in idiom processing is not language-specific but a fundamental property of autoregressive left-to-right training.
- Frequency effect: More frequent idioms exhibit a slightly reduced reversal curse, suggesting that repeated exposure during pretraining partially compensates for the directional bias — but even high-frequency idioms show a significant forward-reverse gap.
- Model size effect: Larger models show improved performance on both forward and reverse tasks, but the relative asymmetry between the two directions persists, indicating that scaling alone does not resolve the reversal curse.
- Structural factors: Shorter and more formulaic idioms tend to have a smaller reversal gap, while longer and more compositionally opaque idioms exhibit a stronger directional bias, suggesting that structural complexity amplifies the effect.
- Not a tokenization artifact: The reversal curse persists across different tokenization schemes and languages with different scripts, ruling out tokenization as the primary cause and pointing to the autoregressive training objective itself.
Why It Matters
This work makes several important contributions to our understanding of knowledge representation in large language models:
- Extends reversal curse to linguistic knowledge: By moving beyond factual triplets to idioms, this study demonstrates that the reversal curse is not limited to world knowledge but also affects memorized sequential patterns more broadly, including language itself.
- Reveals directional storage of fixed expressions: The findings suggest that autoregressive LLMs encode idioms as directional sequences rather than bidirectional associations, meaning they fundamentally lack symmetric access to memorized linguistic patterns.
- Cross-lingual evidence of an architectural limitation: The consistency of results across languages provides strong evidence that the reversal curse stems from the autoregressive training objective itself, not from language-specific data artifacts or tokenization effects.
- Implications for downstream applications: Tasks that require reverse lookup of linguistic patterns — such as paraphrase retrieval, idiom-based search, or figurative language understanding from partial cues — may be systematically disadvantaged by this directional bias in current LLM architectures.
Fundamental Insight: The reversal curse in idioms reveals something deeper than a task-specific limitation: autoregressive models do not form symmetric associations between tokens, even for highly memorized sequences. This has profound implications for how we understand knowledge storage in LLMs. While these models appear to "know" idioms, their knowledge is fundamentally directional — they can traverse memorized sequences forward but cannot flexibly access them from arbitrary positions, suggesting that true bidirectional understanding may require architectural innovations beyond the standard autoregressive paradigm.