DeepScience

Artificial Intelligence · Daily Digest

April 13, 2026

9/9 Roadblocks Active

⚡ Signal of the Day

• Today's AI pipeline is dominated by low-quality Zenodo preprints: most are conceptual position papers, speculative frameworks, or submissions without empirical validation, making this a weak day for actionable research signals.

• The clearest substantive finding is that frontier LLM agents (GPT-5 Mini, Grok 4.1 Fast, Gemini 3.1 Flash Lite) are systematically vulnerable to social manipulation in negotiation contexts across 20,880 sessions — a concrete alignment-safety concern at scale.

• Watch the agent-tool-use and alignment-safety roadblocks: the social manipulation and prompt injection results suggest that deployment-level adversarial robustness remains unsolved even in the latest model families.

📄 Top 10 Papers

Sensemaking in User-Driven Algorithm Auditing: A Case Study on Gender Bias in an Image Captioning Model

This study ran a between-subjects experiment with 60 participants auditing a Salesforce BLIP image captioning model for occupational gender bias under three interface conditions, finding that regular users can surface systematic bias patterns when given appropriate sensemaking tools. The work matters because it provides an empirical pathway for delegating bias detection to affected communities rather than relying solely on internal model evaluations. For AI interpretability, it shows that audit interface design directly shapes what biases get found — a practical design variable that is often ignored.

0.8 · interpretability · Peer-reviewed

Replication materials: Social Manipulation of AI Agents in Online Market Negotiations

Across 20,880 negotiation sessions on a multi-seller marketplace platform, AI agents built on three current frontier LLMs — GPT-5 Mini, Grok 4.1 Fast, and Gemini 3.1 Flash Lite — were found to be consistently vulnerable to social manipulation tactics. This matters because it shows that the latest generation of commercial models does not automatically inherit robustness against human persuasion strategies, even in constrained economic tasks. The scale of the dataset (nearly 21,000 sessions) makes this one of the more empirically grounded alignment-safety findings in today's pipeline.

0.8 · alignment-safety · Peer-reviewed

Replication materials: Social Manipulation of AI Agents in Online Market Negotiations

This is the accompanying replication package for the social manipulation study. It contains the experimental code, analysis scripts, and session-level data from the same 20,880-session negotiation experiment. The additional finding here is that inoculation methods (pre-exposing agents to manipulation examples) can reduce but not eliminate vulnerability. The package is currently access-restricted on Zenodo, which limits independent verification despite the authors' claim that the scripts reproduce every reported statistic.

0.7 · alignment-safety · Peer-reviewed

Prompt Injection and Data Leakage in Large Language Models: An Empirical Study on TinyLlama

Using TinyLlama-1.1B-Chat, this study tested two attack vectors: instruction-override prompt injection and LoRA fine-tuning on synthetic credentials. The fine-tuned model achieved 100% retrieval of memorized synthetic secrets, and prompt-based defenses reduced but did not block injection success. The result illustrates a fundamental tension: fine-tuning that improves capability on a task also bakes in memorized content that adversarial prompts can later extract, which is directly relevant to deploying smaller open-weight models in sensitive environments.

0.7 · agent-tool-use · Peer-reviewed
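
A retrieval rate like the 100% figure reported above could be computed along the following lines. This is only a sketch, not the study's actual harness: the `extraction_rate` function, the planted prompts and secrets, and the `stub_model` stand-in for a fine-tuned TinyLlama are all hypothetical.

```python
from typing import Callable, Dict


def extraction_rate(generate: Callable[[str], str], secrets: Dict[str, str]) -> float:
    """Return the fraction of planted secrets that appear verbatim in the reply."""
    if not secrets:
        return 0.0
    hits = sum(1 for prompt, secret in secrets.items() if secret in generate(prompt))
    return hits / len(secrets)


# Hypothetical synthetic credentials, keyed by the prompt used to probe for them.
planted = {
    "What is the API key for service alpha?": "KEY-ALPHA-0001",
    "What is the password for user bob?": "PW-BOB-0002",
}


def stub_model(prompt: str) -> str:
    # Stand-in for a model that has memorized every planted secret.
    return "The answer is " + planted.get(prompt, "unknown")


rate = extraction_rate(stub_model, planted)
print(f"extraction rate: {rate:.0%}")
```

In a real evaluation, `stub_model` would be replaced by a call to the fine-tuned model, and the same function could be rerun with a defensive system prompt to compare rates.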

memory-spark: GPU-Accelerated Persistent Memory for Autonomous AI Agents

This paper proposes a 15-stage retrieval pipeline combining Hypothetical Document Embeddings (HyDE) with hybrid dense-sparse fusion via Reciprocal Rank Fusion, plus a Dynamic Reranker Gate claimed to reduce retrieval latency by 78.7% on BEIR benchmarks. It also identifies eight 'silent failure modes' in production RAG systems that do not surface as obvious errors. Confidence in these results is low: only Zenodo metadata was accessible, no baseline comparisons or ablation details are visible, and the 78.7% latency claim is unverifiable without runnable code.

0.6 · agent-tool-use · Peer-reviewed
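
The Reciprocal Rank Fusion step named above can be illustrated with a minimal Python sketch. This is not the memory-spark implementation, which was not accessible. The function name, the sample document IDs, and the smoothing constant `k=60` (a commonly used default) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1; each list contributes 1 / (k + rank) per document.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of a dense retriever and a sparse (keyword) retriever.
dense = ["d3", "d1", "d2"]
sparse = ["d1", "d4", "d3"]

fused = reciprocal_rank_fusion([dense, sparse])
print(fused)  # ['d1', 'd3', 'd4', 'd2']
```

Documents that both retrievers rank highly (here `d1` and `d3`) accumulate the largest fused scores, which is why the technique is a common way to combine dense and sparse results without calibrating their raw scores.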

memory-spark: GPU-Accelerated Persistent Memory for Autonomous AI Agents

A second Zenodo deposit under the same title presents overlapping findings — the 15-stage HyDE/RRF pipeline with a Dynamic Reranker Gate, zero-shot BEIR evaluation, and the 78.7% latency reduction claim — but additionally foregrounds the long-context implications of persistent memory for autonomous agents. The duplication across two DOIs with nearly identical abstracts raises data-quality concerns. Until full paper content and code are accessible, the empirical claims cannot be assessed.

0.6 · hallucination-grounding · Peer-reviewed

SAGE-RAI: Design Patterns for Transparent RAG Systems

This paper reports a systematic evaluation of a transparent retrieval-augmented generation (RAG) system in an educational setting, finding that 92.3% of users rated it 4–5 stars, with a mean rating of 4.62/5. The key tension identified is that AI-provided transparency about information sourcing may undermine students' own reasoning if they over-rely on the system rather than engaging critically with sources. This is a practical design tradeoff for any RAG deployment in learning environments where fostering independent judgment matters.

0.6 · hallucination-grounding · Peer-reviewed

The Hidden Field Stack Inside AI Environment

This position paper argues that AI systems should be understood not primarily through their models or prompts, but through a layered 'field stack' of context regimes, memory policies, tool permissions, evaluation contracts, and governance overlays. The claim is that this hidden stack — not the model weights — determines what outputs are possible and what actions are permitted in practice. The framework is purely conceptual with no empirical validation, but it offers a useful vocabulary for analyzing why the same model behaves differently across deployment environments.

0.5 · agent-tool-use · Peer-reviewed

THE GAP: A Neutral Layer Standard for Irreversible Digital Decisions

This paper identifies a structural trilemma for AI providers: giving advice creates liability, withholding it abandons users, and internalizing pause mechanisms shifts design liability onto the provider. It proposes 'THE GAP' — an external neutral layer that AI systems enumerate rather than recommend — operationalized as a machine-readable taxonomy of 82 decision contexts across 6 irreversibility layers. The legal grounding cites actual cases through 2026 (including Garcia v. Character Technologies and Raine v. OpenAI), making this more concrete than most AI governance proposals, though no empirical validation of the framework's effectiveness is provided.

0.4 · alignment-safety · Peer-reviewed

Moltbook Social Interactions Dataset

This dataset captures longitudinal social interactions — posts, comments, agent profiles, social graphs, and activity timelines — from autonomous AI agents ('Molties') operating on a dedicated social platform, collected automatically every six hours. The value is in providing time-series behavioral data for studying how AI agents evolve social interaction patterns without human co-authorship of each act. Methodological detail on agent architecture and interaction generation is not provided in the available metadata.

0.4 · agent-tool-use · Peer-reviewed

🔬 Roadblock Activity

Roadblock	Papers	Status	Signal
Model Interpretability	21	Active	Paper volume is high, but today's most concrete interpretability contribution comes from user-driven auditing research, which shows that interface design shapes which biases get discovered.
Reasoning Reliability	21	Active	Activity is high, but today's papers are predominantly conceptual frameworks and philosophical analyses rather than empirical advances in reliable reasoning.
Agent Tool Use & Planning	18	Active	Social manipulation vulnerability across 20,880 negotiation sessions and prompt injection results both point to agent-tool-use environments as the primary attack surface in current deployments.
Alignment & Safety	16	Active	The social manipulation study is the day's clearest alignment-safety signal: frontier models from multiple providers share consistent vulnerability to persuasion tactics in economic contexts.
Efficiency & Scaling	16	Active	No strong empirical papers on efficiency or scaling appeared today; the memory-spark latency claims (78.7% reduction) remain unverifiable without accessible code.
Hallucination & Grounding	15	Active	RAG transparency and retrieval pipeline work (memory-spark, SAGE-RAI) dominated this roadblock, but all submissions lack reproducible artifacts to validate grounding claims.
Data Quality & Curation	12	Active	The Moltbook longitudinal dataset is the day's only concrete data contribution; the knowledge graph self-indexing submissions add noise rather than signal to this roadblock.
Multimodal Understanding	10	Active	The gender bias auditing study on an image captioning model is the sole empirical multimodal result today, with modest but rigorous evidence on how visual-language systems encode occupational stereotypes.
Long-Context Processing	4	Open	Minimal activity; the memory-spark persistent memory framing touches on long-context processing implicitly, but no dedicated long-context research appeared today.

DeepScience — Cross-domain scientific intelligence
Sources: arXiv · OpenAlex · Unpaywall
deepsci.io