AI Paper Daily (2026-03-24)

Published: March 24, 2026

English version: /paper-news/2026-03-24/

Run statistics

Candidate papers: 1193
Selected papers: 30
Read in depth: 30
Time window (UTC): 2026-03-20T00:00:00Z → 2026-03-21T00:00:00Z (weekend_backlog_sun, expanded=0)

Papers used for this summary

arXiv ID	Title / Link	Categories	Score	Why selected	Tags
`2603.14923`	Directional Routing in Transformers PDF	cs.LG, cs.AI	94	New transformer routing; strong causal ablations + mech interp show routing is dominant pathway	transformers, routing, mechanistic-interpretability, circuits, architecture
`2603.14723`	Beyond Creed: A Non-Identity Safety Condition A Strong Empirical Alternative to Identity Framing in Low-Data LoRA Fine-Tuning PDF	cs.CL	94	Shows non-identity safety supervision beats identity framing in low-data LoRA on HarmBench.	llm-safety, fine-tuning, LoRA, HarmBench, jailbreak-robustness, supervision-design
`2603.18444`	Discounted Beta--Bernoulli Reward Estimation for Sample-Efficient Reinforcement Learning with Verifiable Rewards PDF	cs.LG, cs.AI	91	Sample-efficient RLVR via reward distribution estimation; directly targets LLM reasoning post-training.	RLVR, post-training, reasoning, sample-efficiency, reward-modeling, LLMs
`2603.18545`	CoDA: Exploring Chain-of-Distribution Attacks and Post-Hoc Token-Space Repair for Medical Vision-Language Models PDF	cs.CV, cs.AI	90	Clinically plausible distribution-shift attack chain + token-space repair for medical VLM robustness.	robustness, distribution-shift, medical, vision-language, attacks, repair
`2603.18495`	Cross-Domain Demo-to-Code via Neurosymbolic Counterfactual Reasoning PDF	cs.AI	90	Neurosymbolic counterfactuals for demo-to-code; aims at verifiable procedure adaptation under domain shift.	agents, robotics, neurosymbolic, counterfactual-reasoning, verification, code-generation, VLM
`2603.15600`	From Passive Observer to Active Critic: Reinforcement Learning Elicits Process Reasoning for Robotic Manipulation PDF	cs.RO, cs.AI, cs.CL, cs.CV	90	RL turns video MLLM into goal-aware process critic for long-horizon robot manipulation monitoring	robotics, process-supervision, reinforcement-learning, multimodal, monitoring
`2603.19223`	F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World PDF	cs.CL, cs.AI	90	Large multilingual embedding family (80M–14B) with strong MTEB results; useful for RAG/search.	embeddings, multilingual, retrieval, MTEB, efficiency, distillation
`2603.18411`	TARo: Token-level Adaptive Routing for LLM Test-time Alignment PDF	cs.CL, cs.AI, cs.LG	89	Token-level test-time alignment routing using step-wise reward signals; sizable reasoning gains claimed.	test-time-alignment, reasoning, reward-model, routing, inference-time, LLMs
`2603.11558`	RoboClaw: An Agentic Framework for Scalable Long-Horizon Robotic Tasks PDF	cs.RO, cs.AI	88	Unified VLM-driven long-horizon robotics with self-resetting data collection via entangled action pairs	robotics, agents, VLA, long-horizon, data-collection, self-improvement
`2603.17425`	Proactive Knowledge Inquiry in Doctor-Patient Dialogue: Stateful Extraction, Belief Updating, and Path-Aware Action Planning PDF	cs.AI	88	POMDP-lite proactive inquiry for doctor-patient dialogue; explicit belief updates and gap-aware planning.	agents, planning, POMDP, uncertainty, dialogue-systems, clinical, tool-use
`2603.17872`	Mitigating LLM Hallucinations through Domain-Grounded Tiered Retrieval PDF	cs.CL, cs.AI	86	Tiered retrieval+verification pipeline (LangGraph) targeting hallucination reduction in high-stakes domains.	hallucinations, RAG, verification, agents, reliability, grounding
`2603.18914`	Security, privacy, and agentic AI in a regulatory view: From definitions and distinctions to provisions and reflections PDF	cs.CR, cs.AI, cs.CY	86	Clear regulatory synthesis for security/privacy of agentic AI; useful for governance & deployment.	agentic-ai, regulation, security, privacy, EU-AI-Act, governance
`2603.14889`	Modeling and Benchmarking Spoken Dialogue Rewards with Modality and Colloquialness PDF	eess.AS, cs.CL, cs.LG	86	Speech dialogue reward model + new preference dataset targeting prosody & colloquialness gaps	reward-modeling, speech, preference-data, evaluation, alignment
`2603.19131`	From Inference Efficiency to Embodied Efficiency: Revisiting Efficiency Metrics for Vision-Language-Action Models PDF	cs.LG, cs.RO	86	Proposes embodied efficiency metrics for VLA robots; challenges FLOPs/params as proxy for real performance	embodied-agents, VLA, evaluation, efficiency-metrics, robotics
`2603.11863`	CreativeBench: Benchmarking and Enhancing Machine Creativity via Self-Evolving Challenges PDF	cs.AI	86	Creativity benchmark with executable metrics to separate novelty from hallucination; self-evolving challenges.	evaluation, benchmarks, code-generation, self-play, open-endedness, reliability
`2603.09868`	CarbonBench: A Global Benchmark for Upscaling of Carbon Fluxes Using Zero-Shot Learning PDF	cs.LG, physics.ao-ph	86	Large zero-shot spatial transfer benchmark for carbon fluxes; strong eval protocols + scale.	benchmark, evaluation, zero-shot, domain-generalization, time-series, climate
`2603.14712`	Towards Next-Generation LLM Training: From the Data-Centric Perspective PDF	cs.CL, cs.LG	86	Data-centric LLM training: agentic data pipelines, selection/mixture optimization; high leverage bottleneck.	llm-training, data-centric-ai, data-mixtures, data-selection, agents, scaling
`2603.09356`	Democratising Clinical AI through Dataset Condensation for Classical Clinical Models PDF	cs.LG, cs.AI, cs.CR	86	DP dataset condensation for non-differentiable clinical models; practical privacy+utility angle	privacy, differential-privacy, dataset-condensation, synthetic-data, healthcare, reliability
`2603.14838`	The Impact of Ideological Discourses in RAG: A Case Study with COVID-19 Treatments PDF	cs.CL	86	Directly studies ideological retrieval effects in RAG on COVID treatments; relevant to grounding risks.	RAG, bias, ideology, misinformation, evaluation, prompting
`2603.18388`	Reflection in the Dark: Exposing and Escaping the Black Box in Reflective Prompt Optimization PDF	cs.AI, cs.MA	86	Makes reflective prompt optimization more interpretable/robust with multi-agent verification and restarts.	prompt-optimization, reflection, agents, robustness, interpretability, evaluation
`2603.15262`	Probe-then-Plan: Environment-Aware Planning for Industrial E-commerce Search PDF	cs.AI	84	Probe-then-plan grounds LLM search plans in live retrieval state to cut latency and invalid tool plans.	agents, tool-use, planning, retrieval, latency, deployment
`2603.19185`	MIDST Challenge at SaTML 2025: Membership Inference over Diffusion-models-based Synthetic Tabular data PDF	cs.LG	84	Challenge-style benchmark on membership inference vs diffusion synthetic tabular data; concrete privacy eval.	privacy, membership-inference, diffusion-models, synthetic-data, tabular, benchmark
`2603.17312`	Recurrent Reasoning with Vision-Language Models for Estimating Long-Horizon Embodied Task Progress PDF	cs.CV, cs.AI	84	Recurrent snippet-based VLM reasoning for long-horizon task progress; cheaper than full-trajectory video	embodied, VLM, reasoning, long-context, planning, monitoring
`2603.11479`	Grammar of the Wave: Towards Explainable Multivariate Time Series Event Detection via Neuro-Symbolic VLM Agents PDF	cs.LG, cs.AI, cs.MA	84	Neuro-symbolic agent framework to ground natural-language event specs into time-series intervals	agents, neuro-symbolic, grounding, time-series, evaluation
`2603.19225`	FinTradeBench: A Financial Reasoning Benchmark for LLMs PDF	cs.CE, cs.AI, cs.CL, cs.IR, q-fin.CP	84	New benchmark for LLM financial reasoning over fundamentals + trading signals; closer to real analyst workflows	LLM-evaluation, benchmark, reasoning, finance, multisignal
`2603.15183`	Token Coherence: Adapting MESI Cache Protocols to Minimize Synchronization Overhead in Multi-Agent LLM Systems PDF	cs.DC, cs.AI, cs.LG, cs.MA	84	Maps multi-agent LLM sync to cache coherence; proposes lazy invalidation to cut coordination cost	multi-agent, systems, coordination, scalability, synchronization
`2603.19002`	RADIUS: Ranking, Distribution, and Significance - A Comprehensive Alignment Suite for Survey Simulation PDF	cs.CL	84	Proposes standardized alignment metrics for LLM survey simulation incl. ranking+distribution.	evaluation, alignment-metrics, survey-simulation, benchmarking, distribution-shift
`2603.18447`	SODIUM: From Open Web Data to Queryable Databases PDF	cs.DB, cs.AI, cs.CL, cs.CV, cs.IR	84	Formalizes web-to-database agentic pipeline + benchmark; relevant to tool-using agents and data quality.	agents, information-extraction, web, databases, benchmark, tool-use
`2603.18481`	T-QPM: Enabling Temporal Out-Of-Distribution Detection and Domain Generalization for Vision-Language Models in Open-World PDF	cs.CV, cs.LG	83	Temporal OOD detection for VLMs under drift + covariate shift; open-world robustness focus.	OOD-detection, robustness, vision-language, distribution-shift, domain-generalization, evaluation
`2603.08321`	CORE-Acu: Structured Reasoning Traces and Knowledge Graph Safety Verification for Acupuncture Clinical Decision Support PDF	cs.AI	82	Neuro-symbolic CDS: structured reasoning traces + KG safety verification to constrain clinical outputs.	safety, clinical, neuro-symbolic, knowledge-graphs, reasoning-traces, verification

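AI Paper Insight Brief

2026-03-24

0) Executive takeaways (read this first)

"Verify-and-revise" is solidifying into a reusable safety pattern: CORE-Acu (clinical knowledge-graph veto plus bounded rewriting), tiered retrieval verification for hallucinations, and neurosymbolic counterfactual verification for robotics all perform the same move: generate → check against explicit constraints or a world model → revise or refuse.
RAG is increasingly treated as an attack surface, not only as a factuality fix: ideological retrieval context measurably steers outputs (and explicit ideological descriptions amplify the effect), while tiered retrieval pipelines still fail by over-asserting false premises. This suggests that "retrieval governance + answerability gating" is becoming mandatory.
Lightweight routing/coordination mechanisms are emerging at three levels: (i) architecture (Directional Routing inside the Transformer), (ii) decoding time (TARo adaptively mixes base and reward logits per token), (iii) system (Token Coherence replaces broadcast synchronization in multi-agent workflows). All three aim to cut interference and cost while keeping behavior controllable.
Temporal and distribution shift are being "engineered" through benchmarks and protocols: CarbonBench standardizes zero-shot spatial transfer for carbon-flux regression; T-QPM targets temporal OOD for VLMs; CoDA targets pipeline-realistic distribution chains in medical imaging.
Low-data alignment may hinge on phrasing, not just "more data": in a 130-example LoRA setup, matched non-identity safety framing beats creed/constitutional framing on HarmBench across three model families, with negligible changes in MMLU/ARC.
Embodied deployment metrics are diverging from inference metrics: compression, pruning, and token/action reduction can preserve success rate while worsening jerk, path length, and elapsed time. "Efficiency" claims for VLA models need embodied-efficiency reporting to back them up.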

2) Key themes (clusters)

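Theme: Neurosymbolic verification loops for high-stakes decisions

Why it matters: When errors are safety-critical (clinical interventions, robot execution), "better prompts" are not enough; systems need auditable intermediate structure and deterministic checks to veto or repair outputs.
Representative papers:
Common approach:
- Enforce structured intermediate representations (S-CoT causal chains; Event Logic Trees (ELT); STRIPS/PDDL operators + scene graphs).
- Add a symbolic verifier/executor (KG constraint checks; forward simulation; operator-validity gates).
- On violation, use a bounded revision loop (generate–verify–revise; counterfactual repair iterations).
Open questions / failure modes:
- Coverage limits: the KG or predicate set may be incomplete; a binary veto may miss subtle trade-offs.
- Schema extraction quality becomes the bottleneck (ELT parsing errors; VLM scene-graph/operator errors).
- Runtime cost and dependence on strong proprietary models (e.g., GPT-5 in SELA/NESYCR).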
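
The bounded generate–verify–revise loop described in this theme can be sketched as below. This is a minimal illustration under stated assumptions, not any paper's implementation: `generate_verify_revise`, `Outcome`, the toy generator, and the `FORBIDDEN` set are all hypothetical names, and the "verifier" is a simple deterministic set-membership check standing in for a KG or simulator.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Outcome:
    answer: Optional[str]    # None means the controller refused / handed off
    attempts: int            # number of drafts that were generated and checked
    violations: List[str]    # violations found on the last checked draft


def generate_verify_revise(
    generate: Callable[[str, List[str]], str],
    verify: Callable[[str], List[str]],
    prompt: str,
    max_attempts: int = 3,
) -> Outcome:
    """Bounded loop: draft, check deterministically, retry with feedback, else refuse."""
    feedback: List[str] = []
    for attempt in range(1, max_attempts + 1):
        draft = generate(prompt, feedback)
        violations = verify(draft)
        if not violations:
            return Outcome(draft, attempt, [])
        feedback = violations  # pass the violations back to the generator
    # Retry budget exhausted: refuse instead of returning an unverified draft.
    return Outcome(None, max_attempts, feedback)


# --- Toy stand-ins (hypothetical, for illustration only) ---
FORBIDDEN = {"point_X"}  # pretend this item is contraindicated


def toy_verify(draft: str) -> List[str]:
    return [f"forbidden item: {tok}" for tok in draft.split() if tok in FORBIDDEN]


def toy_generate(prompt: str, feedback: List[str]) -> str:
    # First draft contains a violation; the revision removes it.
    return "point_A point_B" if feedback else "point_A point_X"


result = generate_verify_revise(toy_generate, toy_verify, "case 1")
print(result)
```

The refusal branch is the design choice that matters here: once the retry budget is exhausted, the controller returns no answer rather than the last unverified draft.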

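Theme: Retrieval as both a mitigation and a manipulation channel

Why it matters: Retrieval can reduce hallucinations, but it can also steer outputs (ideology) or distract the model (time-series/numeric evidence), so safety requires controlling both what is retrieved and how it is used.
Representative papers:
Common approach:
- Route queries to different sources (trusted repository → web fallback; dual-track retrieval over filings vs. prices).
- Score/rerank after retrieval (CRAG-style document grading; object/path-aware reranking).
- Measure retrieval effects explicitly (semantic/lexical alignment with the ideological poles; "retrieval delta" and changes in indicator F1).
Open questions / failure modes:
- False-premise over-assertion: a verification pipeline may still "verify the premise" instead of refusing.
- Ideological amplification: making the discourse dimensions explicit in the prompt may steer outputs further.
- Numeric/time-series fragility: RAG can reduce reasoning depth and indicator fidelity when the evidence is tabular or time-aligned.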

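Theme: Routing/coordination as a general-purpose control knob (model, decoding, system)

Why it matters: As models and agent systems scale, interference and coordination costs dominate. Routing offers a compact way to allocate compute or authority dynamically, potentially improving interpretability, reliability, and cost.
Representative papers:
Common approach:
- Learn input-dependent suppression/mixing (suppressing directional components; per-token α mixing of base and reward logits).
- Replace global/static knobs with adaptive, local decisions (token-level vs. fixed interpolation; coherence-style invalidation vs. broadcast).
- Add formal/causal probes that demonstrate the key behavior (disabling the router collapses induction/recall; TLA+ invariants verify synchronization safety).
Open questions / failure modes:
- Generalization and variance: Directional Routing results come from limited scale and few random seeds; benchmark gains do not always accompany PPL gains.
- Test-time alignment routers depend heavily on the reward model and are sensitive to OOD inputs.
- The coherence protocol rests on several assumptions (a centralized authority; simulated vs. production traces; liveness under failures).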
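
To make "per-token α mixing of base and reward logits" concrete, here is a small sketch. It assumes the mix is a convex combination, which the summary above does not confirm, and it replaces the learned router with a hypothetical entropy heuristic (`toy_alpha`) purely so the example runs; neither should be read as TARo's actual formulation.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_logits(base: Sequence[float], reward: Sequence[float], alpha: float) -> List[float]:
    """Assumed form: (1 - alpha) * base + alpha * reward, applied at one decoding step."""
    if len(base) != len(reward):
        raise ValueError("base and reward logits must cover the same vocabulary")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * b + alpha * r for b, r in zip(base, reward)]


def toy_alpha(base: Sequence[float]) -> float:
    """Hypothetical stand-in for a learned router (needs a vocabulary of size >= 2).

    Leans on the reward signal more when the base model is uncertain,
    using normalized entropy of the base distribution as alpha.
    """
    probs = softmax(base)
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return min(1.0, entropy / math.log(len(base)))


# One decoding step over a 3-token vocabulary (made-up numbers).
base_step = [2.0, 0.0, 0.0]
reward_step = [0.0, 3.0, 0.0]
alpha_step = toy_alpha(base_step)
mixed_step = mix_logits(base_step, reward_step, alpha_step)
print(alpha_step, mixed_step)
```

A fixed-interpolation baseline corresponds to calling `mix_logits` with the same `alpha` at every step; the adaptive variant recomputes it per token.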

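Theme: Robustness under realistic distribution shift (time, space, pipeline)

Why it matters: Deployment failures often stem from structured shift (temporal drift, geographic differences, clinical pipelines) rather than i.i.d. noise; benchmarks and threat models are moving closer to operational reality.
Representative papers:
Common approach:
- Define shift-aware protocols (held-out sites for zero-shot spatial transfer; temporal partitions; chained pipeline transformations).
- Report tail/quantile metrics alongside operational metrics (per-site quantiles; FPR95/AUROC over time; plausibility constraints via SSIM).
- Explore lightweight adaptation/repair (T-QPM learns only two fusion scalars; CoDA trains a token-space adapter on clean images).
Open questions / failure modes:
- Hard targets remain hard: CarbonBench shows NEE is much harder than GPP/RECO; errors are amplified in the residual.
- T-QPM depends on captions (it is sensitive to caption quality and diversity).
- CoDA's repair is partial; broader architecture coverage and larger clinical tasks remain untested.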
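
A held-out-site protocol of the kind mentioned above (zero-shot spatial transfer) amounts to splitting by group rather than by row. The sketch below is a generic illustration, not CarbonBench's actual split; the `site` field name, the `split_by_site` helper, and the toy records are assumptions.

```python
import random
from typing import Dict, List, Tuple

Record = Dict[str, object]


def split_by_site(
    records: List[Record], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Record], List[Record]]:
    """Hold out whole sites so that no test site is ever seen during training."""
    sites = sorted({r["site"] for r in records})
    if len(sites) < 2:
        raise ValueError("need at least two sites to hold one out")
    rng = random.Random(seed)
    rng.shuffle(sites)
    # At least one test site, and at least one site left for training.
    n_test = min(len(sites) - 1, max(1, round(test_fraction * len(sites))))
    test_sites = set(sites[:n_test])
    train = [r for r in records if r["site"] not in test_sites]
    test = [r for r in records if r["site"] in test_sites]
    return train, test


# Toy data: five sites with two observations each (made-up values).
toy_records = [{"site": s, "y": i} for i, s in enumerate("AABBCCDDEE")]
train_set, test_set = split_by_site(toy_records, test_fraction=0.2, seed=0)
print(len(train_set), len(test_set))
```

A row-level random split would leak site-specific signal into training, which is exactly what a zero-shot spatial evaluation is meant to rule out.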

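Theme: Evaluation infrastructure for alignment, privacy, and "creativity"

Why it matters: Several papers focus on measurement rather than new models: safety-framing effects, survey-simulation fidelity, privacy leakage from synthetic data, and creativity metrics for self-evolving code systems.
Representative papers:
Common approach:
- Build task-specific metrics that expose failure modes (TRM/RC vs. TVD/DH; quality × novelty; TPR@10%FPR for MIA).
- Use controlled experimental designs (matched phrasing conditions; bootstrap-based tie handling; competition tracks).
- Provide scalable pipelines (self-evolving challenge generation; calibrate first, then scale; public artifacts/shadow models).
Open questions / failure modes:
- Judge dependence and reproducibility (a closed-source judge in the HarmBench pipeline; LLM-based construction bias in CreativeBench).
- Synthetic data is not "private by default" (even for diffusion-based tabular synthesis, MIA still achieves non-trivial success).
- Metric strictness vs. sample size (e.g., DH as a stricter criterion).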
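
For readers unfamiliar with the "TPR@10%FPR" metric named above, the following sketch shows one common way to compute a true-positive rate at a fixed false-positive budget for a membership-inference attack. The function name and the toy scores are made up, and the challenge's own scoring code may handle ties or thresholds differently.

```python
from typing import Sequence


def tpr_at_fpr(
    member_scores: Sequence[float],
    nonmember_scores: Sequence[float],
    max_fpr: float = 0.10,
) -> float:
    """TPR of the rule `score > threshold`, with the threshold chosen so FPR <= max_fpr."""
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    negatives = sorted(nonmember_scores, reverse=True)
    # Number of non-members we may wrongly flag (epsilon guards float rounding).
    allowed = int(max_fpr * len(negatives) + 1e-9)
    # Flag only scores strictly above the (allowed + 1)-th largest non-member score.
    threshold = negatives[allowed] if allowed < len(negatives) else float("-inf")
    flagged = sum(1 for s in member_scores if s > threshold)
    return flagged / len(member_scores)


# Toy attack scores (higher = "more likely a training member").
toy_nonmembers = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
toy_members = [0.95, 0.85, 0.5, 0.3]
print(tpr_at_fpr(toy_members, toy_nonmembers, max_fpr=0.10))
```

With ten non-members and a 10% budget, one false positive is allowed, so the threshold lands on the second-highest non-member score (0.9) and only the member scored 0.95 is flagged.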

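3) Technical synthesis

Several works converge on structured intermediate artifacts as the unit of verification: symptom → pathogenesis → principle → acupoint chains (CORE-Acu), ELT schemas (SELA), symbolic operators and scene graphs (NESYCR), and state/event tuples + weights (doctor-patient inquiry).
Bounded loops are the dominant safety/control primitive: generate–verify–revise (CORE-Acu; tiered retrieval verification; NESYCR repair), paired with explicit fallbacks (human confirmation; a polite apology).
Routing is everywhere: inside the model (directional suppression), at decoding (token-level α), at retrieval (domain/tier routing; dual-track retrieval), and on the serving side (a complexity-aware router for e-commerce).
Several papers show that metric gains can mislead when the objective is mismatched: Directional Routing yields large PPL reductions but no gain on multiple-choice benchmarks; VLA compression improves inference metrics while worsening embodied jerk, path length, and time.
Temporal anchoring is becoming a common technique for long-horizon understanding: PRIMO's (I_init, V_seq, I_curr) input structure; T-QPM's timestep-conditioned prototypes and drift penalty.
Robustness research is moving from "single perturbations" to composed, realistic shift chains (CoDA's A∘R∘D), and from static OOD to streaming temporal drift (T-QPM).
Evaluation increasingly focuses on the tail: CarbonBench reports per-site quantiles; T-QPM reports FPR95/AUROC at early/late timesteps; Token Coherence analyzes volatility regimes.
A recurring failure mode of retrieval/verification systems is premise validation: the system can become confident within a wrong frame (false-premise over-assertion; ideological amplification).
When the base model is frozen, lightweight adaptation is preferred: LoRA with a reweighted loss (CORE-Acu), two-scalar fusion learning (T-QPM), token-space linear adapter repair (CoDA), and inference-only activation steering (EvoRePE).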

4) Top 5 papers (with "why now")

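1) CORE-Acu: Structured Reasoning Traces and Knowledge Graph Safety Verification for Acupuncture Clinical Decision Support

Introduces a full neurosymbolic safety stack: structured S-CoT + a Traditional Chinese Medicine (TCM) KG + an entity-reweighted loss + a generate–verify–revise loop.
Reports 0/1,000 KG-defined safety violations after verification, versus 8.5% for GPT-4o on the same benchmark.
Offers a practical template for other high-stakes domains, especially where token-level entity fidelity and hard contraindication rules matter.
Caveat: safety depends on KG coverage; a binary veto may miss subtle clinical trade-offs.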

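2) Token Coherence: Adapting MESI Cache Protocols to Minimize Synchronization Overhead in Multi-Agent LLM Systems

Recasts multi-agent context sharing as cache coherence; gives an analytical savings bound and a concrete protocol (CCS).
Verifies invariants with TLA+ model checking (single writer, monotonic versions, bounded staleness).
Simulations show that lazy invalidation saves ~84–95% of tokens across volatility regimes, a direct cost lever for agent deployments.
Caveat: the evaluation is simulation-based; the centralized authority and liveness under failures remain concerns.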
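
The intuition behind "lazy invalidation vs. broadcast" can be shown with a toy token-accounting model. Everything here is an assumption made for illustration: the cost model, the event format, and the function names are hypothetical, and this is not the CCS protocol or the paper's simulator, so the resulting number should not be compared with the reported 84–95% range.

```python
from typing import List, Set, Tuple

Event = Tuple[str, int]  # ("write" | "read", agent_id)


def broadcast_cost(events: List[Event], n_agents: int, artifact_tokens: int) -> int:
    """Every write pushes the full artifact to all other agents."""
    writes = sum(1 for kind, _ in events if kind == "write")
    return writes * artifact_tokens * (n_agents - 1)


def lazy_invalidation_cost(
    events: List[Event], artifact_tokens: int, notice_tokens: int = 1
) -> int:
    """A write sends a short notice to agents holding a valid copy;
    a reader re-fetches the full artifact only if its copy is stale."""
    valid: Set[int] = set()  # agents whose cached copy is current
    cost = 0
    for kind, agent in events:
        if kind == "write":
            cost += notice_tokens * len(valid - {agent})
            valid = {agent}
        elif kind == "read":
            if agent not in valid:
                cost += artifact_tokens
                valid.add(agent)
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return cost


# Toy trace: agent 0 writes often, agents 1 and 2 read occasionally.
trace: List[Event] = [
    ("write", 0), ("write", 0), ("write", 0),
    ("read", 1), ("read", 1),
    ("write", 0), ("read", 2),
]
eager = broadcast_cost(trace, n_agents=4, artifact_tokens=1000)
lazy = lazy_invalidation_cost(trace, artifact_tokens=1000)
print(eager, lazy, 1 - lazy / eager)
```

In this toy trace the savings come from writes that nobody reads before the next write; a read-heavy trace with few writes would narrow the gap.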

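3) Directional Routing in Transformers

Adds a small router that suppresses learned head-space directions; routing becomes load-bearing (disabling the router collapses recall/induction).
Reports substantial domain perplexity reductions (31–56%) at about 3.9% parameter overhead.
Provides built-in, causally manipulable "directions" as an interpretability hook.
Caveat: limited seeds and scale; PPL gains did not translate into multiple-choice benchmark gains.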

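4) TARo: Token-level Adaptive Routing for LLM Test-time Alignment

Learns a per-token mix of base and reward logits, avoiding the brittle fixed interpolation used in test-time alignment.
Reports large MATH500 gains (e.g., Llama-3.1-8B in Table 1: 32.0% → 54.4%) and weak-to-strong transfer to larger backbones.
Suited to deployments where retraining is expensive but decoding-time steering is feasible.
Caveat: depends on reward-model quality and domain bias; full-logit routing may hurt throughput.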

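5) The Impact of Ideological Discourses in RAG: A Case Study with COVID-19 Treatments

Shows that RAG can propagate ideological stance from retrieved text; adding explicit LMDA descriptions usually amplifies the alignment.
Provides a concrete methodology for quantifying steering (LMDA + controlled retrieval + semantic/lexical similarity + ANOVA).
"Why now": RAG is ubiquitous in production; this work highlights a governance gap beyond hallucination.
Caveat: the corpus is domain-specific and the examples are curated; effects may vary with the retrieval/reranking strategy.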

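5) Practical next steps

For any safety-critical assistant, prototype a generate–verify–revise controller with: (i) an explicit intermediate schema, (ii) deterministic constraint checks, (iii) bounded retries, (iv) a refusal/handoff policy when issues remain unresolved.
Add a pre-retrieval answerability/premise-check gate to RAG pipelines to reduce false-premise over-assertion (explicitly flagged as a key failure mode in the tiered retrieval verification work).
Treat the retrieval corpus as untrusted input: implement retrieval governance (source allowlists, ideology/bias detection, chunk-level provenance), and test for stance steering under controlled retrieval poles.
If you run multi-agent workflows, measure token consumption per synchronization boundary and compare coherence-style invalidation with broadcast; verify the invariants (single writer, monotonic versions, bounded staleness) before launch.
When using test-time alignment, replace fixed mixing with adaptive routing (token-level α), and measure throughput cost and OOD behavior as well as accuracy.
For VLM robustness, extend evaluation from single perturbations to composed pipeline shift (CoDA-style) and temporal drift (T-QPM-style); track early/late timestep metrics.
For embodied agents, report embodied-efficiency metrics (jerk, path length, completion time, action rate) alongside inference metrics before claiming an "efficiency gain".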
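
As a starting point for the embodied-efficiency reporting suggested above, path length and a jerk proxy can be computed directly from a sampled end-effector trajectory. This is a generic finite-difference sketch; the function names and toy trajectories are hypothetical, and the exact metric definitions used in the VLA efficiency paper are not given in this brief.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def path_length(points: List[Point]) -> float:
    """Sum of Euclidean distances between consecutive samples."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def mean_squared_jerk(points: List[Point], dt: float) -> float:
    """Mean squared magnitude of the third finite difference of position.

    jerk_i ~ (p[i+3] - 3*p[i+2] + 3*p[i+1] - p[i]) / dt**3
    """
    if len(points) < 4:
        raise ValueError("need at least 4 samples to estimate jerk")
    squared = []
    for p0, p1, p2, p3 in zip(points, points[1:], points[2:], points[3:]):
        jerk = [(d - 3 * c + 3 * b - a) / dt**3 for a, b, c, d in zip(p0, p1, p2, p3)]
        squared.append(sum(x * x for x in jerk))
    return sum(squared) / len(squared)


# Two toy 2-D trajectories that reach the same goal at (3, 0) or nearby.
smooth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
zigzag = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (3.0, 1.0)]
for name, traj in (("smooth", smooth), ("zigzag", zigzag)):
    print(name, path_length(traj), mean_squared_jerk(traj, dt=1.0))
```

Both trajectories advance the same distance along x, yet the zigzag one is longer and far jerkier, which is the kind of difference a success-rate-only report would hide.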

Generated from per-paper analyses; no external browsing was performed.

Di Tang
