Roadblock · Artificial Intelligence · Partial

Reliable multi-step reasoning

Language models frequently fail at multi-step reasoning tasks that require logical consistency, mathematical precision, or compositional generalization. Chain-of-thought prompting improves surface performance but does not guarantee faithful internal reasoning: a model may produce a correct-looking trace while relying on shortcuts. Process reward models and tree-of-thoughts approaches show promise but add significant inference cost. Achieving reliable, verifiable reasoning across diverse domains without exponential compute overhead remains an open problem.
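
To make the inference-cost point concrete, here is a rough cost model in Python. It is only a sketch under assumed parameters: the function names, the choice of step-level beam search as a stand-in for tree-of-thoughts style search, and the idea of one scorer call per candidate step (as a process reward model might require) are illustrative assumptions, not details from any specific system.

```python
from typing import Tuple


def chain_calls(depth: int) -> int:
    """Generator calls for a single linear chain of `depth` reasoning steps."""
    return depth


def full_tree_calls(depth: int, branching: int) -> int:
    """Generator calls when every node is expanded into `branching` children.

    Level k of the tree holds branching**k candidate steps, so the total
    grows exponentially with depth.
    """
    return sum(branching ** k for k in range(1, depth + 1))


def beam_search_calls(depth: int, branching: int, beam_width: int) -> Tuple[int, int]:
    """Return (generator_calls, scorer_calls) for a step-level beam search.

    Assumption: each kept partial trace proposes `branching` next steps,
    a scorer rates every proposal once, and only the top `beam_width`
    traces survive to the next level.
    """
    frontier = 1  # start from the bare problem statement
    generator_calls = 0
    scorer_calls = 0
    for _ in range(depth):
        candidates = frontier * branching
        generator_calls += candidates
        scorer_calls += candidates
        frontier = min(beam_width, candidates)
    return generator_calls, scorer_calls


if __name__ == "__main__":
    depth, branching, beam_width = 4, 3, 2
    print("single chain:", chain_calls(depth))
    print("beam search :", beam_search_calls(depth, branching, beam_width))
    print("full tree   :", full_tree_calls(depth, branching))
```

Under these assumptions, pruning keeps the search linear in depth rather than exponential, but each step still costs several generator and scorer calls where a single chain costs one.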

Knowing the Self, Understanding the World: A Dual-Cognition Benchmark for UAV Spatio-temporal Reasoning with MLLMs

FVAttn: Adaptive Sparse Attention with Runtime Load Balancing for Video Generation

PagedWeight: Efficient MoE LLM Serving with Dynamic Quality-Aware Weight Quantization

A Blueprint for Equilibrium-Based Differentiable Continuous-Variable Thermodynamic Computing

Vision-Language Assistant for Emotional Reactions to Risky Driving

Cluster-Aware Matching via Laplacian Optimal Transport

Physics-enhanced reinforcement learning for real-time optimal control of dynamical systems

Evaluating Open-Weight LLMs for Generating Structured Threat Information for Autonomous Vehicle Vulnerabilities

Vision-Language-Motion Maps: An Open-Vocabulary, Uncertainty-Aware, Queryable Motion Attribute for 3D Scene Maps

When Does Muon Help Agentic Reinforcement Learning?