Alignment Faking in Large Language Models
https://arxiv.org/html/2412.14093v2
The paper is from December 2024.
A new paper sheds a lot of light on LLM "reasoning" - or the lack thereof - and I posted this thread about it last month:
A very pro-AI account on both Bluesky and X posted about a "disturbing" Stanford paper on LLMs' failures at reasoning
https://www.democraticunderground.com/100221009224
See the replies there as well, especially reply 28, which links to Gary Marcus's Substack post from the next day about that new study.
Direct link to what Gary wrote:
BREAKING: LLM reasoning continues to be deeply flawed
https://garymarcus.substack.com/p/breaking-llm-reasoning-continues
Link to that new paper, published less than a month ago:
Large Language Model Reasoning Failures
https://arxiv.org/abs/2602.06176
And links on that page will let you choose the PDF or HTML version.
Large Language Models (LLMs) have exhibited remarkable reasoning capabilities, achieving impressive results across a wide range of tasks. Despite these advances, significant reasoning failures persist, occurring even in seemingly simple scenarios. To systematically understand and address these shortcomings, we present the first comprehensive survey dedicated to reasoning failures in LLMs. We introduce a novel categorization framework that distinguishes reasoning into embodied and non-embodied types, with the latter further subdivided into informal (intuitive) and formal (logical) reasoning. In parallel, we classify reasoning failures along a complementary axis into three types: fundamental failures intrinsic to LLM architectures that broadly affect downstream tasks; application-specific limitations that manifest in particular domains; and robustness issues characterized by inconsistent performance across minor variations. For each reasoning failure, we provide a clear definition, analyze existing studies, explore root causes, and present mitigation strategies. By unifying fragmented research efforts, our survey provides a structured perspective on systemic weaknesses in LLM reasoning, offering valuable insights and guiding future research towards building stronger, more reliable, and robust reasoning capabilities. We additionally release a comprehensive collection of research works on LLM reasoning failures, as a GitHub repository at this https URL, to provide an easy entry point to this area.
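For anyone who wants the gist of that categorization framework at a glance, here's a rough sketch of the two axes the abstract describes, written as a small Python data model. The class names, field names, and the example entry are my own illustration for this post, not anything taken from the paper itself.

```python
from dataclasses import dataclass
from enum import Enum

# First axis from the abstract: reasoning type.
# Embodied vs. non-embodied, with non-embodied split into
# informal (intuitive) and formal (logical) reasoning.
class ReasoningType(Enum):
    EMBODIED = "embodied"
    NON_EMBODIED_INFORMAL = "non-embodied / informal (intuitive)"
    NON_EMBODIED_FORMAL = "non-embodied / formal (logical)"

# Second axis from the abstract: failure type.
class FailureType(Enum):
    FUNDAMENTAL = "fundamental (intrinsic to LLM architectures)"
    APPLICATION_SPECIFIC = "application-specific (particular domains)"
    ROBUSTNESS = "robustness (inconsistent across minor variations)"

# One catalogued failure sits at the intersection of the two axes.
# This class is invented for illustration only.
@dataclass
class ReasoningFailure:
    name: str
    reasoning_type: ReasoningType
    failure_type: FailureType
    notes: str = ""

# Hypothetical example entry, not a claim from the survey:
example = ReasoningFailure(
    name="answer flips when a premise is paraphrased",
    reasoning_type=ReasoningType.NON_EMBODIED_FORMAL,
    failure_type=FailureType.ROBUSTNESS,
    notes="performance varies across semantically equivalent prompts",
)
print(example.reasoning_type.value, "|", example.failure_type.value)
```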