Sustainable Green Computing and Carbon-Aware Artificial Intelligence
June 10, 2026 · OpenAlex
Language models frequently fail at multi-step reasoning tasks that require logical consistency, mathematical precision, or compositional generalization. Chain-of-thought prompting improves surface performance but does not guarantee faithful internal reasoning: a model may produce a correct-looking trace while actually relying on shortcuts. Process reward models and tree-of-thoughts approaches show promise, but they add significant inference cost because they score or explore many intermediate steps for each query. Achieving reliable, verifiable reasoning across diverse domains without exponential compute overhead remains an open problem.