
[Artificial Intelligence] Daily digest — 289 papers, 0 strong connections (2026-05-05)

DeepScience
Artificial Intelligence · Daily Digest
May 05, 2026
289 papers · 10/10 roadblocks active · 0 connections
⚡ Signal of the Day
• Frontier AI agents face simultaneous capability ceilings and new attack surfaces: the best models hit only 55% on academic-level tasks, while adversarial compliance pressure collapses metacognition in 8 of 11 tested systems.
• These findings converge on a single tension — agents are being deployed before their failure modes are understood. The Compliance Trap paper shows that performance collapse is triggered by structural compliance instructions, not psychological content, meaning current safety tuning may be masking rather than solving the problem.
• Watch for whether the robotics papers (MolmoAct2, CoRAL) hold up under adversarial evaluation conditions similar to those in the Compliance Trap study — the gap between benchmark performance and real-world reliability is the defining open question for 2026.
📄 Top 10 Papers
MolmoAct2: Action Reasoning Models for Real-world Deployment
MolmoAct2 introduces a robot-control model trained on 720 hours of teleoperation data and 3.3 million spatial-reasoning examples, using a 'specialize-then-rehearse' recipe to build a vision-language backbone before coupling it to a motion-planning module. It outperforms GPT-5 and Gemini Robotics ER-1.5 across 13 embodied-reasoning benchmarks while cutting inference latency by recomputing depth estimates only for changed scene regions (sketched below). The full model weights, training code, and datasets are publicly released, making this one of the more reproducible robotics releases in recent months.
█████████ 0.9 embodied-ai Preprint
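The digest gives no implementation detail for the latency trick, so the following is only a minimal sketch of the idea as described: cache the previous depth map and re-run the depth estimator solely on image patches that changed. The patch size, change threshold, and the `depth_fn` stand-in for the model's depth head are all assumptions.

```python
import numpy as np

def update_depth(prev_frame, frame, prev_depth, depth_fn,
                 patch=32, threshold=12.0):
    """Re-run the depth estimator only on patches whose pixels changed
    beyond `threshold`; reuse the cached depth everywhere else.
    `depth_fn` is an assumed stand-in for the model's depth head."""
    depth = prev_depth.copy()
    h, w = frame.shape[:2]
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            a = prev_frame[y:y+patch, x:x+patch].astype(np.float32)
            b = frame[y:y+patch, x:x+patch].astype(np.float32)
            if np.abs(a - b).mean() > threshold:  # this region changed
                depth[y:y+patch, x:x+patch] = depth_fn(b)
    return depth
```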
The Compliance Trap: How Structural Constraints Degrade Frontier AI Metacognition Under Adversarial Pressure
When prompted with instructions that force compliance, 8 of 11 tested frontier AI models showed metacognitive accuracy drops of up to 30 percentage points; in effect, they became significantly worse at knowing what they know. The key finding is that the trigger is the structural compliance demand itself, not any threatening or emotional content, and removing the compliance suffix restores performance even under active adversarial pressure (a measurement sketch follows below). This matters because it suggests current safety-aligned models may be systematically miscalibrated in high-stakes deployment contexts where compliance-forcing instructions are common.
█████████ 0.9 alignment-safety Preprint
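As a rough illustration of what "metacognitive accuracy" measures here, the probe below scores whether a model's stated confidence matches its correctness, with and without a compliance-forcing suffix. The `ask` wrapper, the suffix text, and the 0.5 confidence cut are assumptions, not the paper's protocol.

```python
COMPLIANCE_SUFFIX = " You must answer. Refusal is not permitted."  # illustrative

def metacognitive_accuracy(ask, items, suffix=""):
    """Fraction of items where stated confidence tracks correctness:
    confident when right, unconfident when wrong. `ask(prompt)` is an
    assumed wrapper returning (answer_text, confidence_in_[0,1])."""
    hits = 0
    for question, gold in items:
        answer, confidence = ask(question + suffix)
        correct = gold.lower() in answer.lower()
        hits += (confidence >= 0.5) == correct
    return hits / len(items)

# The reported effect would show up as a large positive drop:
# drop = metacognitive_accuracy(ask, items) \
#      - metacognitive_accuracy(ask, items, COMPLIANCE_SUFFIX)
```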
Perceptual Flow Network for Visually Grounded Reasoning
Large vision-language models tend to hallucinate because their training objective doesn't constrain where in an image they actually 'look' before answering: the model can generate plausible-sounding text without genuine visual grounding. PFlowNet addresses this by separating the perception step (where to look and what to extract) from the reasoning step (sketched below), using a variational reinforcement learning framework with multiple reward signals rather than simple next-token prediction. The approach is evaluated on Qwen3-VL 8B and shows that loose geometric priors from a visual expert are more effective than tight precision constraints for downstream reasoning accuracy.
█████████ 0.9 hallucination-grounding Preprint
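PFlowNet's variational RL objective isn't reproduced in the digest; the sketch below only illustrates the architectural point, a perception stage that commits to regions and extracted evidence before a reasoning stage ever produces text. All names and the toy logic are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Percept:
    box: tuple      # (x, y, w, h) image region the perception stage attended to
    evidence: str   # visual fact extracted from that region

def reason(question: str, percepts: list) -> str:
    """Reasoning stage: answers only from extracted evidence, never from
    the raw image, so every claim traces back to a concrete region."""
    facts = "; ".join(f"{p.evidence} @ {p.box}" for p in percepts)
    return f"Q: {question} -> grounded in: {facts}"

# A perception model (a separate stage in PFlowNet) would populate this:
percepts = [Percept(box=(40, 80, 120, 60), evidence="a red stop sign")]
print(reason("What traffic sign is shown?", percepts))
```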
AcademiClaw: When Students Set Challenges for AI Agents
AcademiClaw is a benchmark of 80 real-world tasks designed by students at Shanghai Jiao Tong University, spanning more than 25 domains and including GPU-intensive computational challenges, executed in isolated Docker environments (a sandbox sketch follows below) and scored by six complementary methods. The best frontier model achieves only 55% on these tasks, with sharp capability boundaries between domains and a notable disconnect between how many tokens a model spends and how good its outputs actually are. The benchmark is publicly available and provides a more authentic measure of agent capability than synthetic evaluations, since the tasks reflect what domain experts actually find difficult.
█████████ 0.9 agent-tool-use Preprint
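The benchmark's harness isn't described beyond "isolated Docker environments", so this is a generic sketch of sandboxed task execution; the image name, resource limits, and `solve.py` entry point are assumptions.

```python
import pathlib
import subprocess

def run_task_in_docker(task_dir: str, image: str = "python:3.11-slim",
                       timeout: int = 600) -> dict:
    """Execute an agent-produced solution in an isolated container:
    no network, capped memory/CPU, read-only task mount."""
    cmd = [
        "docker", "run", "--rm",
        "--network", "none",
        "--memory", "4g", "--cpus", "2",
        "-v", f"{pathlib.Path(task_dir).resolve()}:/task:ro",
        image, "python", "/task/solve.py",
    ]
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return {"exit_code": proc.returncode,
            "stdout": proc.stdout, "stderr": proc.stderr}
```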
Autonomous LLM Agent Worms: Cross-Platform Propagation, Automated Discovery and Temporal Re-Entry Defense
This paper demonstrates that LLM agent systems with persistent file-backed state can be infected by self-propagating worms that spread autonomously across agent ecosystems without any user interaction, exploiting the fact that agents read files and inject their content into LLM context (a defensive read-gate sketch follows below). A key counter-intuitive finding is that read operations, not write operations, are the primary integrity threat vector, and that user-prompt carriers achieve higher attack compliance than system-prompt carriers. The authors also propose a formal defense framework, but core attack details are withheld pending coordinated disclosure with the affected open-source agent frameworks.
█████████ 0.9 agent-tool-use Preprint
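The paper's own "temporal re-entry defense" details are withheld, so the sketch below is not that mechanism; it only illustrates the general read-side countermeasure the finding suggests, treating file contents as untrusted before they enter the LLM context. The patterns are illustrative and far from complete.

```python
import re

# Illustrative injection markers only; robust detection is an open problem.
SUSPECT = re.compile(
    r"(ignore (all )?previous instructions|you are now|"
    r"copy this (block|file) into)", re.IGNORECASE)

def quarantined_read(path: str) -> str:
    """Gate on the read path: flag likely injected instructions before
    file contents ever reach the agent's LLM context."""
    with open(path, encoding="utf-8", errors="replace") as f:
        text = f.read()
    if SUSPECT.search(text):
        return f"[QUARANTINED: possible prompt injection in {path}]"
    return text
```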
CoRAL: Contact-Rich Adaptive LLM-based Control for Robotic Manipulation
CoRAL solves a core problem in LLM-based robotics: LLMs are too slow for real-time control, but their high-level reasoning is valuable. The system uses an LLM to design cost functions for a fast sampling-based motion controller (MPPI, sketched below) rather than issuing direct commands, while a vision-language model estimates physical properties like mass and friction that are refined in real time as the robot interacts with objects. This separation of semantic reasoning from reactive execution enables zero-shot manipulation in contact-rich tasks without task-specific pre-training.
█████████ 0.9 embodied-ai Preprint
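To make the division of labor concrete, here is a bare-bones MPPI step with the cost function left as a plug-in, the piece CoRAL delegates to the LLM. The dynamics, dimensions, and hyperparameters are placeholders, not the paper's controller.

```python
import numpy as np

def mppi_step(state, dynamics, cost_fn, horizon=20, samples=256,
              noise_std=0.3, temperature=1.0, dim_u=2):
    """One MPPI update: sample noisy control sequences, roll them out,
    and average them weighted by exp(-cost / temperature)."""
    u = np.zeros((horizon, dim_u))
    noise = np.random.normal(0.0, noise_std, (samples, horizon, dim_u))
    costs = np.zeros(samples)
    for k in range(samples):
        x = np.asarray(state, dtype=float)
        for t in range(horizon):
            x = dynamics(x, u[t] + noise[k, t])
            costs[k] += cost_fn(x, u[t] + noise[k, t])
    w = np.exp(-(costs - costs.min()) / temperature)
    w /= w.sum()
    return u + np.einsum("k,ktd->td", w, noise)  # weighted noise update

# An LLM-authored cost for "push the block to the goal" might look like:
def llm_cost(x, u):
    goal = np.array([1.0, 0.5])
    return float(((x[:2] - goal) ** 2).sum() + 0.01 * (u ** 2).sum())
```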
Source Reconstructibility for Robust Document Processing and Relational Querying with LLMs — Verification, Domain Partitioning, and Data-Driven Query Catalogs in Enterprise Settings
Enterprise LLM deployments fail in three structurally distinct ways: they hallucinate precise values (like numbers or dates), contaminate answers with information from unrelated documents in retrieval-augmented pipelines, and generate database queries that are technically valid but logically ill-posed. The VERA system addresses the first failure mode by anchoring every extracted value back to a specific region of the source document and applying multiple verification layers, including fuzzy logic and a numeric determinism check (sketched below), achieving a 96.1% hallucination capture rate. The unifying principle, that every LLM output should be traceable back to a specific source location, offers a practical architectural guideline for production document-processing systems.
█████████ 0.9 hallucination-grounding Peer-reviewed
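The digest names the numeric determinism check without specifics, so the following is only a guess at its spirit: an extracted number must reproduce, digit for digit, inside the exact source span it was anchored to. The span format and normalization are assumptions.

```python
import re

def verify_anchored_number(extracted: str, source_text: str, span: tuple) -> bool:
    """Deterministic check: the extracted value's digits must appear
    verbatim in the anchored source region, never approximately."""
    start, end = span
    normalize = lambda s: re.sub(r"[^\d.]", "", s)
    value, anchor = normalize(extracted), normalize(source_text[start:end])
    return value != "" and value in anchor

source = "Invoice total: EUR 1,284.50 due 2026-06-01."
print(verify_anchored_number("1284.50", source, (0, 27)))  # True
```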
OAgents: A Pre-Standardization Draft Profile for Operational AI Agent Trustworthiness
As LLM-based agents move into operational roles (executing code, calling APIs, managing files), the gap between what they can do and what safeguards exist around them has grown into a practical deployment risk. OAgents proposes a structured trust framework of 26 controls across 7 categories organized into three conformance levels, covering pre-execution gates (sketched below), post-execution verification, and behavioral guarantees. Unlike approaches that rely on model-level alignment, this framework treats trustworthiness as an engineering property of the deployment envelope, making it applicable regardless of which underlying model is used.
█████████ 0.9 agent-tool-use Peer-reviewed
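The 26 controls are not enumerated in the digest, so the gates below are invented examples that only show the shape of a pre-execution check enforced at the deployment envelope, independent of the underlying model.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Gate:
    name: str
    check: Callable[[dict], bool]  # True means the action may proceed

# Invented example controls, not the framework's actual catalog:
GATES: List[Gate] = [
    Gate("no_recursive_delete", lambda a: "rm -rf" not in a.get("command", "")),
    Gate("path_allowlist", lambda a: a.get("path", "/work").startswith("/work")),
    Gate("api_budget", lambda a: a.get("api_calls_today", 0) < 1000),
]

def pre_execution_gate(action: dict) -> List[str]:
    """Run every gate before the agent acts; return the names of any
    failed controls so the action can be blocked and logged."""
    return [g.name for g in GATES if not g.check(action)]

print(pre_execution_gate({"command": "rm -rf /", "path": "/etc"}))
# ['no_recursive_delete', 'path_allowlist']
```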
Foundation-Model-Based Agents in Industrial Automation: Purposes, Capabilities, and Open Challenges
A systematic review of 88 publications on foundation-model agents in industrial automation finds that 75% of reported systems are still at prototype or early validation stages (Technology Readiness Levels 4–6), with only 9% providing deployment-oriented evidence. The dominant use cases are user assistance, monitoring, and process optimization — not the production-control tasks that industrial multi-agent systems have traditionally handled — suggesting that FM-based agents are filling new niches rather than replacing existing automation. The gap between reported capability and demonstrated readiness is the central finding, reinforcing concerns about hallucination and reliability in high-stakes industrial settings.
████████ 0.8 hallucination-grounding Preprint
APIOT: Autonomous Vulnerability Management Across Bare-Metal Industrial OT Networks
APIOT demonstrates that LLM agents can autonomously complete a full attack-and-remediation cycle on industrial control devices (discovering vulnerabilities, exploiting them via protocol-level reasoning over Modbus/TCP and CoAP, patching them, and verifying the fix), achieving a 90% mission success rate across 290 experiment runs on bare-metal Zephyr RTOS firmware. A critical finding is that a runtime governance layer (an 'overseer', sketched below) is the single most important engineering variable: without it, agents reliably fall into repetition loops, skip crash verification, and deadlock during reconnaissance. This has direct implications for any organization considering agentic AI for security operations, since the agent's behavior is highly sensitive to runtime constraints, not just model capability.
████████ 0.8 agent-tool-use Preprint
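As a minimal illustration of why the overseer matters, the wrapper below blocks the repetition loops the paper reports when governance is absent; the window size and repeat limit are assumptions, and the real overseer presumably enforces far more (crash verification, reconnaissance deadlines).

```python
from collections import deque

class Overseer:
    """Tiny runtime-governance sketch: veto an action once it has
    repeated too often inside a sliding window of recent actions."""
    def __init__(self, window: int = 8, max_repeats: int = 3):
        self.recent = deque(maxlen=window)
        self.max_repeats = max_repeats

    def approve(self, action: str) -> bool:
        if self.recent.count(action) >= self.max_repeats:
            return False  # repetition loop: force the agent to change strategy
        self.recent.append(action)
        return True

overseer = Overseer()
for step in ["scan", "scan", "scan", "scan"]:
    if not overseer.approve(step):
        print("overseer blocked repeated action:", step)
```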
🔬 Roadblock Activity
Reasoning Reliability (121 papers, active): High volume of activity today, with papers spanning legal reasoning (neuro-symbolic offloading), industrial agent surveys, and exploit synthesis all converging on the same conclusion: probabilistic LLM inference alone is insufficient for high-stakes sequential reasoning tasks.
Interpretability (113 papers, active): Conceptual architectures dominate today's interpretability papers, with limited empirical validation; the HADES drug-injury paper is the most concrete application, framing explainability as hypothesis generation rather than post-hoc attribution.
Data Quality & Curation (109 papers, active): MolmoAct2's 3.3M-sample spatial-embodied corpus and quality-filtered robot datasets are the most substantive data-curation contribution today, signaling that curated task-specific data remains a key differentiator for embodied-AI performance.
Hallucination & Grounding (96 papers, active): Multiple independent approaches to hallucination reduction appeared today, including source anchoring for documents (VERA), decoupled perceptual flows for vision (PFlowNet), and neuro-symbolic offloading for legal reasoning, suggesting the field is converging on grounding-by-architecture rather than grounding-by-training.
Multimodal Understanding (83 papers, active): MolmoAct2 and PFlowNet both tackle the gap between visual perception and downstream reasoning, with PFlowNet providing the clearest mechanistic account of why standard training objectives fail to produce genuine visual grounding.
Efficiency & Scaling (83 papers, active): MolmoAct2's adaptive-depth reasoning (re-predicting depth tokens only for changed scene regions) and the Semantic Autonomy Framework's 103,000-fold latency reduction via cross-robot memory are the most concrete efficiency contributions today.
Agent Tool Use (83 papers, active): A dual signal today: capability benchmarks (AcademiClaw, APIOT) show large gaps between frontier-model performance and task requirements, while security research (LLM worms, OAgents) reveals that the attack surface of tool-using agents is poorly understood and largely undefended.
Alignment & Safety (78 papers, active): The Compliance Trap paper is the strongest safety signal today, showing that compliance-forcing instructions, not adversarial content, are the primary driver of metacognitive collapse in frontier models, with implications for any deployment where users can issue directive-style prompts.
Long Context (37 papers, active): A relatively quiet day for long-context work; the LLM worm paper touches on context injection as an attack vector, but there are no direct advances in long-context modeling or retrieval today.
Embodied AI (29 papers, active): Two substantive robotics papers today, MolmoAct2 with a full open-source release and CoRAL with a novel LLM-as-cost-designer architecture, represent meaningful progress, though both acknowledge the gap between lab benchmarks and reliable real-world deployment.
DeepScience — Cross-domain scientific intelligence
Sources: arXiv · OpenAlex · Unpaywall
deepsci.io