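Today's arXiv Papers | 2025-09-30

This post lists the latest papers fetched from arXiv.org on 2025-09-30. It updates automatically and is grouped into five areas: NLP, CV, ML, AI, and IR. To receive the list by email on a schedule, leave your email address in the comments.

Note: the daily paper data is fetched from arXiv.org and updates automatically at around 12:00 each day.

Friendly reminder: if you would like to receive the daily paper data by email, leave your email address in the comments.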

Link: https://arxiv.org/abs/2509.25189
Authors: Gongrui Zhang, Jialiang Zhu, Ruiqi Yang, Kai Qiu, Miaosen Zhang, Zhirong Wu, Qi Dai, Bei Liu, Chong Luo, Zhengyuan Yang, Linjie Li, Lijuan Wang, Weizhu Chen, Yuan Zhang, Xin Li, Zhaoyi Liu, Xin Geng, Baining Guo
Affiliations: Southeast University; Brown University; Microsoft
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-1] Learning to Parallel: Accelerating Diffusion Large Language Models via Adaptive Parallel Decoding

Link: https://arxiv.org/abs/2509.25188
Authors: Wenrui Bao, Zhiben Chen, Dan Xu, Yuzhang Shang
Affiliations: University of Central Florida; Mobi.AI; HKUST
Categories: Computation and Language (cs.CL)
Comments:

[NLP-2] Incentive-Aligned Multi-Source LLM Summaries

Link: https://arxiv.org/abs/2509.25184
Authors: Yanchen Jiang, Zhe Feng, Aranyak Mehta
Affiliations: Harvard University; Google Research
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT)
Comments:

[NLP-3] NAIPv2: Debiased Pairwise Learning for Efficient Paper Quality Estimation

Link: https://arxiv.org/abs/2509.25179
Authors: Penghai Zhao, Jinyu Tian, Qinghua Xing, Xin Zhang, Zheng Li, Jianjun Qian, Ming-Ming Cheng, Xiang Li
Affiliations: VCIP, CS, Nankai University; NKIARI, Shenzhen Futian; PCALab, School of Computer Science and Engineering, Nanjing University of Science and Technology
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: NAIPv2 complements our earlier work NAIPv1 (arXiv:2408.03934). Whereas NAIPv1 addressed citation count-based impact prediction, NAIPv2 estimates research quality using peer review data

[NLP-4] SIRI: Scaling Iterative Reinforcement Learning with Interleaved Compression

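[Quick read]: This paper targets redundant thinking patterns in Large Reasoning Models (LRMs): models often produce repetitive, inefficient reasoning paths, which wastes compute and makes performance hard to balance against efficiency. Earlier attempts to cut this redundancy tend to cost model performance, creating a performance-efficiency trade-off. The key to the solution is a training regime called SIRI (Scaling Iterative Reinforcement Learning with Interleaved Compression), which periodically alternates between compression and expansion phases during training. The compression phase shortens the maximum rollout length, forcing the model to make dense, high-value decisions within a limited context and so trimming redundant tokens; the expansion phase relaxes the length limit, letting the model explore and plan in long-horizon settings. This iterative adjustment raises reasoning density and accuracy and moves the model steadily closer to the Pareto frontier of the performance-efficiency trade-off.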

Link: https://arxiv.org/abs/2509.25176
Authors: Haoming Wen, Yushi Bai, Juanzi Li, Jie Tang
Affiliations: Tsinghua University
Categories: Machine Learning (cs.LG); Computation and Language (cs.CL)
Comments: In submission

Abstract:We introduce SIRI, Scaling Iterative Reinforcement Learning with Interleaved Compression, a simple yet effective RL approach for Large Reasoning Models (LRMs) that enables more efficient and accurate reasoning. Existing studies have observed repetitive thinking patterns in LRMs, and attempts to reduce them often come at the cost of performance. In this paper, we show that this trade-off can be overcome through a training regime that iteratively alternates between compressing and expanding the reasoning budget, by dynamically adjusting the maximum rollout length during training. The compression phase cuts the rollout length, forcing the model to make precise and valuable decisions within a limited context, which effectively reduces redundant tokens and increases reasoning density. The expansion phase then relaxes the length limit, providing space for the model to explore and plan in long-horizon settings. Remarkably, we find that after each compression-expansion cycle, the model’s performance improves even as its output length decreases, steadily pushing it closer to the Pareto frontier in the performance-efficiency trade-off. Training on DeepSeek-R1-Distill-Qwen-1.5B, SIRI-low improves performance on AIME24 by 43.2% while reducing token usage by 46.9% after three iterations, and SIRI-high achieves the highest accuracy compared to all other methods (Figure 1). Our findings shed light on the potential of periodically oscillating the LRM’s output truncation length during training to dynamically balance exploration and efficiency in reasoning, converging towards an optimal “sweet spot” between the two. Our models are publicly available.
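
The alternating schedule described in the abstract can be sketched as a function from training step to rollout length cap. This is a minimal sketch under assumptions the source does not state: a square-wave schedule with equal-length phases. The names `max_rollout_length`, `cycle_steps`, `compressed_len`, and `expanded_len`, and all numeric values, are hypothetical.

```python
def max_rollout_length(step: int, cycle_steps: int = 200,
                       compressed_len: int = 8192,
                       expanded_len: int = 16384) -> int:
    """Return the rollout length cap to use at a given training step.

    Assumption: each cycle spends its first half in the compression phase
    (short cap) and its second half in the expansion phase (long cap).
    """
    if cycle_steps < 2:
        raise ValueError("cycle_steps must be at least 2")
    position = step % cycle_steps
    in_compression = position < cycle_steps // 2
    return compressed_len if in_compression else expanded_len


# Two full cycles of a toy schedule (4 steps per cycle).
schedule = [
    max_rollout_length(s, cycle_steps=4, compressed_len=100, expanded_len=200)
    for s in range(8)
]
print(schedule)
```

An RL training loop would query this cap before sampling rollouts at each step; how the actual method picks phase lengths and caps is not specified in the abstract.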

[NLP-5] EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering

Link: https://arxiv.org/abs/2509.25175
Authors: Haolei Xu, Xinyu Mei, Yuchen Yan, Rui Zhou, Wenqi Zhang, Weiming Lu, Yueting Zhuang, Yongliang Shen
Affiliations: Zhejiang University
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: project: this https URL

[NLP-6] GSM8K-V: Can Vision Language Models Solve Grade School Math Word Problems in Visual Contexts

Link: https://arxiv.org/abs/2509.25160
Authors: Fan Yuan, Yuchen Yan, Yifan Jiang, Haoran Zhao, Tao Feng, Jinyan Chen, Yanwei Lou, Wenqi Zhang, Yongliang Shen, Weiming Lu, Jun Xiao, Yueting Zhuang
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments: 68 pages, 6 figures, Project Page: this https URL Code: this https URL Datasets: this https URL

[NLP-7] Pretraining Large Language Models with NVFP4

Link: https://arxiv.org/abs/2509.25149
Authors: NVIDIA, Felix Abecassis, Anjulie Agrusa, Dong Ahn, Jonah Alben, Stefania Alborghetti, Michael Andersch, Sivakumar Arayandi, Alexis Bjorlin, Aaron Blakeman, Evan Briones, Ian Buck, Bryan Catanzaro, Jinhang Choi, Mike Chrzanowski, Eric Chung, Victor Cui, Steve Dai, Bita Darvish Rouhani, Carlo del Mundo, Deena Donia, Burc Eryilmaz, Henry Estela, Abhinav Goel, Oleg Goncharov, Yugi Guvvala, Robert Hesse, Russell Hewett, Herbert Hum, Ujval Kapasi, Brucek Khailany, Mikail Khona, Nick Knight, Alex Kondratenko, Ronny Krashinsky, Ben Lanir, Simon Layton, Michael Lightstone, Daniel Lo, Paulius Micikevicius, Asit Mishra, Tim Moon, Deepak Narayanan, Chao Ni, Abhijit Paithankar, Satish Pasumarthi, Ankit Patel, Mostofa Patwary, Ashwin Poojary, Gargi Prasad, Sweta Priyadarshi, Yigong Qin, Xiaowei Ren, Oleg Rybakov, Charbel Sakr, Sanjeev Satheesh, Stas Sergienko, Pasha Shamis, Kirthi Shankar, Nishant Sharma, Mohammad Shoeybi, Michael Siu, Misha Smelyanskiy, Darko Stosic, Dusan Stosic, Bor-Yiing Su, Frank Sun, Nima Tajbakhsh, Shelby Thomas, Przemek Tredak, Evgeny Tsykunov, Gandhi Vaithilingam, Aditya Vavre, Rangharajan Venkatesan, Roger Waleffe, Qiyu Wan, Hexin Wang, Mengdi Wang, Lizzie Wei, Hao Wu, Evan Wu, Keith Wyss, Ning Xu, Jinze Xue, Charlene Yang, Yujia Zhai, Ruoxi Zhang, Jingyang Zhu, Zhongbo Zhu
Affiliations: NVIDIA
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments:

[NLP-8] Paired by the Teacher: Turning Unpaired Data into High-Fidelity Pairs for Low-Resource Text Generation EMNLP2025

Link: https://arxiv.org/abs/2509.25144
Authors: Yen-Ju Lu, Thomas Thebaud, Laureano Moro-Velazquez, Najim Dehak, Jesus Villalba
Affiliations: Johns Hopkins University
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments: Accepted at EMNLP 2025 (Main Conference)

[NLP-9] TemMed-Bench: Evaluating Temporal Medical Image Reasoning in Vision-Language Models

Link: https://arxiv.org/abs/2509.25143
Authors: Junyi Zhang, Jia-Chen Gu, Wenbo Hu, Yu Zhou, Robinson Piramuthu, Nanyun Peng
Affiliations: University of California, Los Angeles; Amazon
Categories: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
Comments:

[NLP-10] ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory

Link: https://arxiv.org/abs/2509.25140
Authors: Siru Ouyang, Jun Yan, I-Hung Hsu, Yanfei Chen, Ke Jiang, Zifeng Wang, Rujun Han, Long T. Le, Samira Daruki, Xiangru Tang, Vishy Tirumalashetty, George Lee, Mahsan Rofouei, Hangfei Lin, Jiawei Han, Chen-Yu Lee, Tomas Pfister
Affiliations: Google Cloud AI Research; Yale University; Google Cloud AI; University of Illinois Urbana-Champaign
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments: 11 pages, 7 figures, 4 tables

[NLP-11] Investigating Language and Retrieval Bias in Multilingual Previously Fact-Checked Claim Detection

Link: https://arxiv.org/abs/2509.25138
Authors: Ivan Vykopal, Antonia Karamolegkou, Jaroslav Kopčan, Qiwei Peng, Tomáš Javůrek, Michal Gregor, Marián Šimko
Affiliations: Brno University of Technology; Kempelen Institute of Intelligent Technologies; University of Copenhagen
Categories: Computation and Language (cs.CL)
Comments:

[NLP-12] The Era of Real-World Human Interaction: RL from User Conversations

Link: https://arxiv.org/abs/2509.25137
Authors: Chuanyang Jin, Jing Xu, Bo Liu, Leitian Tao, Olga Golovneva, Tianmin Shu, Wenting Zhao, Xian Li, Jason Weston
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Comments:

[NLP-13] Rethinking Entropy Regularization in Large Reasoning Models

Link: https://arxiv.org/abs/2509.25133
Authors: Yuxian Jiang, Yafu Li, Guanxu Chen, Dongrui Liu, Yu Cheng, Jing Shao
Affiliations: Shanghai Artificial Intelligence Laboratory; Fudan University; Shanghai Jiao Tong University; Chinese University of Hong Kong
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-14] MGM-Omni: Scaling Omni LLMs to Personalized Long-Horizon Speech

Link: https://arxiv.org/abs/2509.25131
Authors: Chengyao Wang, Zhisheng Zhong, Bohao Peng, Senqiao Yang, Yuqi Liu, Haokun Gui, Bin Xia, Jingyao Li, Bei Yu, Jiaya Jia
Affiliations: Unknown
Categories: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
Comments: Code is available at this https URL

[NLP-15] From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones

Link: https://arxiv.org/abs/2509.25123
Authors: Lifan Yuan, Weize Chen, Yuchen Zhang, Ganqu Cui, Hanbin Wang, Ziming You, Ning Ding, Zhiyuan Liu, Maosong Sun, Hao Peng
Affiliations: University of Illinois Urbana-Champaign; Tsinghua University; Shanghai AI Laboratory; Peking University
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-16] Knowledge Extraction on Semi-Structured Content: Does It Remain Relevant for Question Answering in the Era of LLMs?

Link: https://arxiv.org/abs/2509.25107
Authors: Kai Sun, Yin Huang, Srishti Mehra, Mohammad Kachuee, Xilun Chen, Renjie Tao, Zhaojiang Lin, Andrea Jessee, Nirav Shah, Alex Betty, Yue Liu, Anuj Kumar, Wen-tau Yih, Xin Luna Dong
Affiliations: Meta Reality Labs; FAIR, Meta
Categories: Computation and Language (cs.CL)
Comments:

[NLP-17] Towards Personalized Deep Research: Benchmarks and Evaluations

Link: https://arxiv.org/abs/2509.25106
Authors: Yuan Liang, Jiaxian Li, Yuqing Wang, Piaohong Wang, Motong Tian, Pai Liu, Shuofei Qiao, Runnan Fang, He Zhu, Ge Zhang, Minghao Liu, Yuchen Eleanor Jiang, Ningyu Zhang, Wangchunshu Zhou
Affiliations: OPPO; Zhejiang University; M-A-P; 2077.AI
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
Comments:

[NLP-18] ORPO-Distill: Mixed-Policy Preference Optimization for Cross-Architecture LLM Distillation NEURIPS2025

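[Quick read]: This paper addresses the efficiency and effectiveness of knowledge distillation (KD) across large language model (LLM) architectures, in particular the limits of conventional chain-of-thought (CoT) distillation: too little diversity in reasoning paths and a limited learning signal for the student model. The key to the solution is to cast knowledge distillation as a preference optimization task. The proposed method, ORPO-Distill, uses an odds-ratio preference optimization objective that contrasts diverse reasoning traces generated by the teacher and the student to strengthen the learning signal, and adopts a mixed-policy strategy to make effective use of the student's own outputs. Experiments on five datasets and multiple student models show consistent improvements over conventional black-box KD baselines.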

Link: https://arxiv.org/abs/2509.25100
Authors: Aasheesh Singh, Vishal Vaddina, Dagnachew Birru
Affiliations: Phi Labs; Quantiphi
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments: Accepted at NeurIPS 2025, Efficient Reasoning Workshop

Abstract:We introduce ORPO-Distill, a general-purpose method for cross-architecture LLM distillation that formulates the problem as a preference optimization task. Unlike standard CoT distillation, the approach transfers knowledge through diverse reasoning traces. It employs an Odds-Ratio Preference Optimization objective that contrasts teacher and student traces for more effective learning, and adopts a mixed-policy strategy for utilizing student-generated outputs, outperforming both off- and on-policy alternatives. Experiments on five datasets and multiple student models show consistent improvements over conventional black-box KD baselines.
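
An odds-ratio preference term of the kind the abstract names can be sketched as follows. The sketch uses the general ORPO form: a likelihood term on the preferred trace plus a weighted log-sigmoid of the log odds ratio. Treating the teacher trace as preferred and the student trace as dispreferred, the weight `lam`, and all function names are assumptions for illustration; the mixed-policy sampling is not modeled here.

```python
import math


def log_odds(avg_logprob: float) -> float:
    """log(p / (1 - p)) for p = exp(avg_logprob), a length-normalized likelihood."""
    if avg_logprob >= 0.0:
        raise ValueError("avg_logprob must be negative so that p < 1")
    return avg_logprob - math.log1p(-math.exp(avg_logprob))


def preference_term(chosen_avg_logprob: float, rejected_avg_logprob: float) -> float:
    """-log(sigmoid(z)), where z is the log odds ratio of chosen over rejected."""
    z = log_odds(chosen_avg_logprob) - log_odds(rejected_avg_logprob)
    # Numerically stable softplus(-z) = log(1 + exp(-z)).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


def orpo_style_loss(chosen_avg_logprob: float, rejected_avg_logprob: float,
                    lam: float = 0.1) -> float:
    """Negative log-likelihood of the chosen trace plus the weighted preference term."""
    nll = -chosen_avg_logprob
    return nll + lam * preference_term(chosen_avg_logprob, rejected_avg_logprob)


# Hypothetical average token log-probabilities under the student model:
# assumed teacher trace (chosen) versus assumed student trace (rejected).
loss = orpo_style_loss(chosen_avg_logprob=-0.5, rejected_avg_logprob=-2.0)
print(loss)
```

The preference term shrinks as the student assigns higher odds to the chosen trace than to the rejected one, which is the contrast the abstract describes.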

[NLP-19] Scaling with Collapse: Efficient and Predictable Training of LLM Families

Link: https://arxiv.org/abs/2509.25087
Authors: Shane Bergsma, Bin Claire Zhang, Nolan Dey, Shaheer Muhammad, Gurpreet Gosal, Joel Hestness
Affiliations: Cerebras Systems
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-20] Towards Trustworthy Lexical Simplification: Exploring Safety and Efficiency with Small LLMs

Link: https://arxiv.org/abs/2509.25086
Authors: Akio Hayakawa, Stefan Bott, Horacio Saggion
Affiliations: Universitat Pompeu Fabra
Categories: Computation and Language (cs.CL)
Comments:

[NLP-21] jina-reranker-v3: Last but Not Late Interaction for Document Reranking

Link: https://arxiv.org/abs/2509.25085
Authors: Feng Wang, Yuqing Li, Han Xiao
Affiliations: Jina AI GmbH; University of Pittsburgh
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
Comments: early draft, CoIR table needs to be updated

[NLP-22] Scaling Generalist Data-Analytic Agents

Link: https://arxiv.org/abs/2509.25084
Authors: Shuofei Qiao, Yanqiu Zhao, Zhisong Qiu, Xiaobin Wang, Jintian Zhang, Zhao Bin, Ningyu Zhang, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen
Affiliations: Zhejiang University; Alibaba Group
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Comments: Work in progress

[NLP-23] An empirical study on the limitation of Transformers in program trace generation

Link: https://arxiv.org/abs/2509.25073
Authors: Simeng Sun
Affiliations: NVIDIA
Categories: Computation and Language (cs.CL)
Comments: two-page extended abstract

[NLP-24] Learning from Convenience Samples: A Case Study on Fine-Tuning LLMs for Survey Non-response in the German Longitudinal Election Study

Link: https://arxiv.org/abs/2509.25063
Authors: Tobias Holtdirk, Dennis Assenmacher, Arnim Bleier, Claudia Wagner
Affiliations: University of Hamburg; Max Planck Institute for Human Development; German Institute for Economic Research; Hertie School of Governance; Berlin School of Economics and Law
Categories: Computers and Society (cs.CY); Computation and Language (cs.CL)
Comments:

[NLP-25] Confidence-Guided Error Correction for Disordered Speech Recognition ICASSP

Link: https://arxiv.org/abs/2509.25048
Authors: Abner Hernandez, Tomás Arias Vergara, Andreas Maier, Paula Andrea Pérez-Toro
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments: Preprint submitted to ICASSP

[NLP-26] Hyperdimensional Probe: Decoding LLM Representations via Vector Symbolic Architectures

Link: https://arxiv.org/abs/2509.25045
Authors: Marco Bronzini, Carlo Nicolini, Bruno Lepri, Jacopo Staiano, Andrea Passerini
Affiliations: University of Trento; Ipazia S.p.A.; Fondazione Bruno Kessler (FBK)
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments:

[NLP-27] GateMABSA: Aspect-Image Gated Fusion for Multimodal Aspect-based Sentiment Analysis

Link: https://arxiv.org/abs/2509.25037
Authors: Adamu Lawan, Haruna Yunusa
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments: 6 pages, 2 tables

[NLP-28] Ultra-Fast Language Generation via Discrete Diffusion Divergence Instruct

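[Quick read]: This paper addresses slow text generation in generative AI, specifically how to make inference efficient while keeping output quality high. The core challenge is to raise generation speed substantially without a notable loss in model performance. The key to the solution is a training-based distillation method, Discrete Diffusion Divergence Instruct (DiDi-Instruct). Starting from a pre-trained discrete diffusion language model (dLLM), it optimizes within an integral KL-divergence minimization framework and introduces grouped reward normalization, intermediate-state matching, and a reward-guided ancestral sampler (RGAS), which improve training stability, model coverage, and inference performance. On OpenWebText, DiDi-Instruct achieves up to 64x acceleration with an entropy loss of only about 1% and 20x less additional training wall-clock time, showing a strong balance between efficiency and quality.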

Link: https://arxiv.org/abs/2509.25035
Authors: Haoyang Zheng, Xinyang Liu, Cindy Xiangrui Kong, Nan Jiang, Zheyuan Hu, Weijian Luo, Wei Deng, Guang Lin
Affiliations: Purdue University; UT Austin; National University of Singapore; hi-Lab, Xiaohongshu; ML Research, Morgan Stanley
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments: 56 pages, 7 figures, 7 tables

Abstract:Fast generation of language texts is the holy grail that people pursue in the AI era. In this work, we introduced Discrete Diffusion Divergence Instruct (DiDi-Instruct), a training-based method that leads to fast language generation models by initializing from a pre-trained (masked) discrete diffusion language model (dLLM). The resulting DiDi-Instruct model outperforms the dLLM counterparts and the GPT-2 baseline with 64x acceleration. In the theoretical part of the paper, we build the foundation of DiDi-Instruct in a framework of integral KL-divergence minimization, with practical training algorithms. We also introduce techniques like grouped reward normalization, intermediate-state matching, and the reward-guided ancestral sampler (RGAS) that significantly improve the training stability, the model coverage, and the inference performances. On OpenWebText, DiDi-Instruct outperforms all accelerated language generation models as well as the GPT-2 baseline and the standard dLLMs, achieving sample perplexities ranging from 62.2 (8 NFEs) to 18.4 (128 NFEs). These performance gains are accomplished with a negligible entropy loss of about 1% and 20x less additional training wall-clock time. We further validate the robustness and effectiveness of DiDi-Instruct through extensive ablation studies, model scaling, and the generation of discrete protein sequences. In conclusion, DiDi-Instruct is an efficient yet effective distillation method, enabling language generation in the blink of an eye. We will release both code and models at this http URL.
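
The abstract names grouped reward normalization without defining it. One common reading, borrowed from group-relative RL methods and not confirmed by the source, is to standardize rewards within each group of samples. The sketch below illustrates that reading only; the function name `normalize_group_rewards`, the `eps` constant, and the example rewards are hypothetical.

```python
from statistics import mean, pstdev


def normalize_group_rewards(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Standardize rewards within one group: subtract the mean, divide by the std.

    Assumption: a "group" is a set of samples scored together; eps guards
    against division by zero when every reward in the group is identical.
    """
    if not rewards:
        return []
    center = mean(rewards)
    spread = pstdev(rewards)
    return [(r - center) / (spread + eps) for r in rewards]


# Hypothetical rewards for three samples drawn in the same group.
advantages = normalize_group_rewards([1.0, 2.0, 3.0])
print(advantages)
```

Normalizing per group keeps the scale of the training signal comparable across groups with very different raw rewards, which is one plausible route to the training stability the abstract mentions.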

[NLP-29] Circuit Distillation

Link: https://arxiv.org/abs/2509.25002
Authors: Somin Wadhwa, Silvio Amir, Byron C. Wallace
Affiliations: Khoury College of Computer Sciences; Northeastern University
Categories: Computation and Language (cs.CL)
Comments: Preprint; Under Review

[NLP-30] Generalized Correctness Models: Learning Calibrated and Model-Agnostic Correctness Predictors from Historical Patterns

Link: https://arxiv.org/abs/2509.24988
Authors: Hanqi Xiao, Vaidehi Patil, Hyunji Lee, Elias Stengel-Eskin, Mohit Bansal
Affiliations: UNC Chapel Hill; The University of Texas at Austin
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: Code: this https URL

[NLP-31] DiffTester: Accelerating Unit Test Generation for Diffusion LLMs via Repetitive Pattern

Link: https://arxiv.org/abs/2509.24975
Authors: Lekang Yang, Yuetong Liu, Yitong Zhang, Jia Li
Affiliations: Tsinghua University; Beihang University
Categories: Software Engineering (cs.SE); Computation and Language (cs.CL)
Comments:

[NLP-32] SemanticShield: LLM-Powered Audits Expose Shilling Attacks in Recommender Systems

Link: https://arxiv.org/abs/2509.24961
Authors: Kaihong Li, Huichi Zhou, Bin Ma, Fangjun Huang
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

[NLP-33] The Dialogue That Heals: A Comprehensive Evaluation of Doctor Agents' Inquiry Capability

Link: https://arxiv.org/abs/2509.24958
Authors: Linlu Gong, Ante Wang, Yunghwei Lai, Weizhi Ma, Yang Liu
Affiliations: Institute for AI Industry Research (AIR), Tsinghua University; Department of Computer Science and Technology, Tsinghua University
Categories: Computation and Language (cs.CL)
Comments:

[NLP-34] MobileLLM-R1: Exploring the Limits of Sub-Billion Language Model Reasoners with Open Training Recipes

Link: https://arxiv.org/abs/2509.24945
Authors: Changsheng Zhao, Ernie Chang, Zechun Liu, Chia-Jung Chang, Wei Wen, Chen Lai, Rick Cao, Yuandong Tian, Raghuraman Krishnamoorthi, Yangyang Shi, Vikas Chandra
Affiliations: Meta
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: Model: this https URL

[NLP-35] How Well Do LLMs Imitate Human Writing Style?

Link: https://arxiv.org/abs/2509.24930
Authors: Rebira Jemama, Rajesh Kumar
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Computers and Society (cs.CY)
Comments: IEEE UEMCON 2025, 11 pages, 4 figures, and 4 tables

[NLP-36] When Greedy Wins: Emergent Exploitation Bias in Meta-Bandit LLM Training

Link: https://arxiv.org/abs/2509.24923
Authors: Sanxing Chen, Xiaoyin Chen, Yukun Huang, Roy Xie, Bhuwan Dhingra
Affiliations: Duke University; Mila - Québec AI Institute; Université de Montréal
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-37] MASLegalBench: Benchmarking Multi-Agent Systems in Deductive Legal Reasoning

Link: https://arxiv.org/abs/2509.24922
Authors: Huihao Jing, Wenbin Hu, Hongyu Luo, Jianhui Yang, Wei Fan, Haoran Li, Yangqiu Song
Affiliations: Hong Kong University of Science and Technology; Tsinghua University
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-38] BOE-XSUM: Extreme Summarization in Clear Language of Spanish Legal Decrees and Notifications

Link: https://arxiv.org/abs/2509.24908
Authors: Andrés Fernández García, Javier de la Rosa, Julio Gonzalo, Roser Morante, Enrique Amigó, Alejandro Benito-Santos, Jorge Carrillo-de-Albornoz, Víctor Fresno, Adrian Ghajari, Guillermo Marco, Laura Plaza, Eva Sánchez Salido
Affiliations: Universidad Nacional de Educación a Distancia; The National Library of Norway
Categories: Computation and Language (cs.CL)
Comments: Published in SEPLN 2025. 20 pages, 4 figures

[NLP-39] Neural network embeddings recover value dimensions from psychometric survey items on par with human data

Link: https://arxiv.org/abs/2509.24906
Authors: Max Pellert, Clemens M. Lechner, Indira Sen, Markus Strohmaier
Affiliations: Barcelona Supercomputing Center; GESIS – Leibniz Institute for the Social Sciences; University of Mannheim; Complexity Science Hub Vienna
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-40] MMRQA: Signal-Enhanced Multimodal Large Language Models for MRI Quality Assessment

Link: https://arxiv.org/abs/2509.24888
Authors: Fankai Jia, Daisong Gan, Zhe Zhang, Zhaochi Wen, Chenchen Dan, Dong Liang, Haifeng Wang
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
Comments:

[NLP-41] Expanding Computation Spaces of LLMs at Inference Time

Link: https://arxiv.org/abs/2509.24884
Authors: Yoonna Jang, Kisu Yang, Isabelle Augenstein
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

[NLP-42] Retro*: Optimizing LLMs for Reasoning-Intensive Document Retrieval

Link: https://arxiv.org/abs/2509.24869
Authors: Junwei Lan, Jianlyu Chen, Zheng Liu, Chaofan Li, Siqi Bao, Defu Lian
Affiliations: University of Science and Technology of China; Beijing Academy of Artificial Intelligence; Beijing University of Posts and Telecommunications; Hong Kong Polytechnic University; Hong Kong University of Science and Technology
Categories: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-43] Metaphor identification using large language models: A comparison of RAG, prompt engineering, and fine-tuning

Link: https://arxiv.org/abs/2509.24866
Authors: Matteo Fuoli, Weihang Huang, Jeannette Littlemore, Sarah Turner, Ellen Wilding
Affiliations: University of Birmingham; Coventry University
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-44] Between Help and Harm: An Evaluation of Mental Health Crisis Handling by LLMs

Link: https://arxiv.org/abs/2509.24857
Authors: Adrian Arnaiz-Rodriguez, Miguel Baidal, Erik Derner, Jenn Layton Annable, Mark Ball, Mark Ince, Elvira Perez Vallejos, Nuria Oliver
Affiliations: ELLIS Alicante; The University of Nottingham; Derby University; School of Computer Science & School of Medicine, The University of Nottingham
Categories: Computation and Language (cs.CL); Computers and Society (cs.CY)
Comments:

[NLP-45] Hierarchical Error Correction for Large Language Models: A Systematic Framework for Domain-Specific AI Quality Enhancement

Link: https://arxiv.org/abs/2509.24841
Authors: Zhilong Zhao, Yindi Liu
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: 10 pages, 4 figures, 4 tables

[NLP-46] Pushing LLMs to Their Logical Reasoning Bound: The Role of Data Reasoning Intensity

Link: https://arxiv.org/abs/2509.24836
Authors: Zhen Bi, Zhenlin Hu, Jinnan Yang, Mingyang Chen, Cheng Deng, Yida Xue, Zeyu Yang, Qing Shen, Zhenfang Liu, Kang Zhao, Ningyu Zhang, Jungang Lou
Affiliations: Huzhou University; Banbu AI Foundation; Baichuan Inc.; University of Edinburgh; Zhejiang University; Nanjing University of Science and Technology
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Comments:

[NLP-47] SemShareKV: Efficient KVCache Sharing for Semantically Similar Prompts via Token-Level LSH Matching

Link: https://arxiv.org/abs/2509.24832
Authors: Xinye Zhao, Spyridon Mastorakis
Affiliations: University of Notre Dame
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: 11 figures, 14 pages

[NLP-48] DiaCDM: Cognitive Diagnosis in Teacher-Student Dialogues using the Initiation-Response-Evaluation Framework

Link: https://arxiv.org/abs/2509.24821
Authors: Rui Jia, Yuang Wei, Ruijia Li, Yuang-Hao Jiang, Xinyu Xie, Yaomin Shen, Min Zhang, Bo Jiang
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

[NLP-49] KnowGuard: Knowledge-Driven Abstention for Multi-Round Clinical Reasoning

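[Quick read]: This paper addresses the lack of an effective abstention mechanism when large language models (LLMs) are used for clinical decision-making: when patient information is insufficient, existing models often give inaccurate diagnostic advice out of overconfidence, which carries potential risk. The core problem is that conventional abstention methods rely only on the model's self-assessment and do not use external medical evidence to identify knowledge boundaries systematically. The key to the solution is KnowGuard, an investigate-before-abstain paradigm that explores a medical knowledge graph systematically in two stages: the first discovers evidence through graph expansion and direct retrieval; the second ranks the evidence using multiple factors and adapts the exploration strategy dynamically. Both stages operate on a shared contextualized evidence pool, so the model can trace structured reasoning paths and recognize when evidence is insufficient. Experiments show that the method improves diagnostic accuracy by 3.93% while cutting unnecessary interaction by 7.27 turns on average.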

Link: https://arxiv.org/abs/2509.24816
Authors: Xilin Dang, Kexin Chen, Xiaorui Su, Ayush Noori, Iñaki Arango, Lucas Vittor, Xinyi Long, Yuyang Du, Marinka Zitnik, Pheng Ann Heng
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

Abstract:In clinical practice, physicians refrain from making decisions when patient information is insufficient. This behavior, known as abstention, is a critical safety mechanism preventing potentially harmful misdiagnoses. Recent investigations have reported the application of large language models (LLMs) in medical scenarios. However, existing LLMs struggle with the abstentions, frequently providing overconfident responses despite incomplete information. This limitation stems from conventional abstention methods relying solely on model self-assessments, which lack systematic strategies to identify knowledge boundaries with external medical evidences. To address this, we propose KnowGuard, a novel investigate-before-abstain paradigm that integrates systematic knowledge graph exploration for clinical decision-making. Our approach consists of two key stages operating on a shared contextualized evidence pool: 1) an evidence discovery stage that systematically explores the medical knowledge space through graph expansion and direct retrieval, and 2) an evidence evaluation stage that ranks evidence using multiple factors to adapt exploration based on patient context and conversation history. This two-stage approach enables systematic knowledge graph exploration, allowing models to trace structured reasoning paths and recognize insufficient medical evidence. We evaluate our abstention approach using open-ended multi-round clinical benchmarks that mimic realistic diagnostic scenarios, assessing abstention quality through accuracy-efficiency trade-offs beyond existing closed-form evaluations. Experimental evidences clearly demonstrate that KnowGuard outperforms state-of-the-art abstention approaches, improving diagnostic accuracy by 3.93% while reducing unnecessary interaction by 7.27 turns on average.

[NLP-50] Evaluating Spatiotemporal Consistency in Automatically Generated Sewing Instructions EMNLP2025

Link: https://arxiv.org/abs/2509.24792
Authors: Luisa Geiger, Mareike Hartmann, Michael Sullivan, Alexander Koller
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments: 18 pages, 14 figures; to be published in EMNLP 2025 proceedings

[NLP-51] SeaPO: Strategic Error Amplification for Robust Preference Optimization of Large Language Models EMNLP2025

Link: https://arxiv.org/abs/2509.24781
Authors: Jun Rao, Yunjie Liao, Xuebo Liu, Zepeng Lin, Lian Lian, Dong Jin, Shengjun Cheng, Jun Yu, Min Zhang
Affiliations: Harbin Institute of Technology, Shenzhen; Huawei Cloud Computing Technologies Co., Ltd.
Categories: Computation and Language (cs.CL)
Comments: EMNLP 2025 Findings

[NLP-52] LatentEvolve: Self-Evolving Test-Time Scaling in Latent Space

Link: https://arxiv.org/abs/2509.24771
Authors: Guibin Zhang, Fanci Meng, Guancheng Wan, Zherui Li, Kun Wang, Zhenfei Yin, Lei Bai, Shuicheng Yan
Affiliations: NUS; USTC; UCLA; NTU; Shanghai AI Lab
Categories: Computation and Language (cs.CL)
Comments:

[NLP-53] ProxyAttn: Guided Sparse Attention via Representative Heads

Link: https://arxiv.org/abs/2509.24745
Authors: Yixuan Wang, Huang He, Siqi Bao, Hua Wu, Haifeng Wang, Qingfu Zhu, Wanxiang Che
Affiliations: Harbin Institute of Technology; Baidu Inc.
Categories: Computation and Language (cs.CL); Machine Learning (cs.LG)
Comments: 14 pages, 5 figures

[NLP-54] Socratic-Zero: Bootstrapping Reasoning via Data-Free Agent Co-evolution

Link: https://arxiv.org/abs/2509.24726
Authors: Shaobo Wang, Zhengbo Jiao, Zifan Zhang, Yilang Peng, Xu Ze, Boyu Yang, Wei Wang, Hu Wei, Linfeng Zhang
Affiliations: Alibaba Group Holding Limited; EPIC Lab, Shanghai Jiao Tong University; Shanghai University of Finance and Economics; Wuhan University; Zhejiang University
Categories: Computation and Language (cs.CL)
Comments: 23 pages, 3 figures

[NLP-55] On the Self-awareness of Large Reasoning Models' Capability Boundaries

Link: https://arxiv.org/abs/2509.24711
Authors: Qingjie Zhang, Yujia Fu, Yang Wang, Liu Yan, Tao Wei, Ke Xu, Minlie Huang, Han Qiu
Affiliations: Tsinghua University; Ant Group
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-56] MemGen: Weaving Generative Latent Memory for Self-Evolving Agents

Link: https://arxiv.org/abs/2509.24704
Authors: Guibin Zhang, Muxin Fu, Shuicheng Yan
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

[NLP-57] Reference-Free Rating of LLM Responses via Latent Information

Link: https://arxiv.org/abs/2509.24678
Authors: Leander Girrbach, Chi-Ping Su, Tankred Saanum, Richard Socher, Eric Schulz, Zeynep Akata
Affiliations: Technical University of Munich; National Yang Ming Chiao Tung University; Helmholtz Munich; Harvard University; you.com
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments: 21 pages

[NLP-58] Understanding the Dilemma of Unlearning for Large Language Models

Link: https://arxiv.org/abs/2509.24675
Authors: Qingjie Zhang, Haoting Qian, Zhicong Huang, Cheng Hong, Minlie Huang, Ke Xu, Chao Zhang, Han Qiu
Affiliations: Tsinghua University; Ant Group
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-59] InfLLM-V2: Dense-Sparse Switchable Attention for Seamless Short-to-Long Adaptation

Link: https://arxiv.org/abs/2509.24663
Authors: Weilin Zhao, Zihan Zhou, Zhou Su, Chaojun Xiao, Yuxuan Li, Yanghao Li, Yudi Zhang, Weilun Zhao, Zhen Li, Yuxiang Huang, Ao Sun, Xu Han, Zhiyuan Liu
Affiliations: Tsinghua University; OpenBMB; Harbin Institute of Technology
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments:

[NLP-60] Hype or not? Formalizing Automatic Promotional Language Detection in Biomedical Research

Link: https://arxiv.org/abs/2509.24638
Authors: Bojan Batalo, Erica K. Shimomoto, Neil Millar
Affiliations: National Institute of Advanced Industrial Science and Technology (AIST); University of Tsukuba
Categories: Computation and Language (cs.CL); Machine Learning (cs.LG)
Comments:

[NLP-61] HiKE: Hierarchical Evaluation Framework for Korean-English Code-Switching Speech Recognition ICASSP2026

Link: https://arxiv.org/abs/2509.24613
Authors: Gio Paik, Yongbeom Kim, Soungmin Lee, Sangmin Ahn, Chanwoo Kim
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Comments: 5 pages, 2 figures, Submitted to ICASSP2026

[NLP-62] OrthAlign: Orthogonal Subspace Decomposition for Non-Interfering Multi-Objective Alignment

Link: https://arxiv.org/abs/2509.24610
Authors: Liang Lin, Zhihao Xu, Junhao Dong, Jian Zhao, Yuchen Yuan, Guibin Zhang, Miao Yu, Yiming Zhang, Zhengtao Yao, Huahui Yi, Dongrui Liu, Xinfeng Li, Kun Wang
Affiliations: Institute of Artificial Intelligence (TeleAI), China Telecom; RUC; USTC; NTU; NUS; USC; Shanghai Artificial Intelligence Laboratory
Categories: Machine Learning (cs.LG); Computation and Language (cs.CL)
Comments:

[NLP-63] Inducing Dyslexia in Vision Language Models

Link: https://arxiv.org/abs/2509.24597
Authors: Melika Honarmand, Ayati Sharma, Badr AlKhamissi, Johannes Mehrer, Martin Schrimpf
Affiliations: EPFL
Categories: Computation and Language (cs.CL); Machine Learning (cs.LG)
Comments:

[NLP-64] NeMo: Needle in a Montage for Video-Language Understanding

Link: https://arxiv.org/abs/2509.24563
Authors: Zi-Yuan Hu, Shuo Liang, Duo Zheng, Yanyang Li, Yeyao Tao, Shijia Huang, Wei Feng, Jia Qin, Jianguang Yu, Jing Huang, Meng Fang, Yin Li, Liwei Wang
Affiliations: The Chinese University of Hong Kong; Phoenix TV; Stanford University; University of Liverpool; University of Wisconsin-Madison
Categories: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
Comments:

[NLP-65] AdaThink-Med: Medical Adaptive Thinking with Uncertainty-Guided Length Calibration

Link: https://arxiv.org/abs/2509.24560
Authors: Shaohao Rui, Kaitao Chen, Weijie Ma, Xiaosong Wang
Affiliations: SJTU (Shanghai Jiao Tong University); SII; Shanghai AI Lab; FDU (Fudan University)
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-66] LEAF: A Robust Expert-Based Framework for Few-Shot Continual Event Detection

Link: https://arxiv.org/abs/2509.24547
Authors: Bao-Ngoc Dao, Quang Nguyen, Luyen Ngo Dinh, Minh Le, Linh Ngo Van
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Computation and Language (cs.CL)
Comments:

[NLP-67] Experience-guided reflective co-evolution of prompts and heuristics for automatic algorithm design

Link: https://arxiv.org/abs/2509.24509
Authors: Yihong Liu, Junyi Li, Wayne Xin Zhao, Hongyu Lu, Ji-Rong Wen
Affiliations: Gaoling School of Artificial Intelligence, Renmin University of China; Department of Data Science, City University of Hong Kong; School of Information, Renmin University of China; WeChat, Tencent
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-68] Building Benchmarks from the Ground Up: Community-Centered Evaluation of LLMs in Healthcare Chatbot Settings

Link: https://arxiv.org/abs/2509.24506
Authors: Hamna, Gayatri Bhat, Sourabrata Mukherjee, Faisal Lalani, Evan Hadfield, Divya Siddarth, Kalika Bali, Sunayana Sitaram
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

[NLP-69] Knowledge Editing with Subspace-Aware Key-Value Mappings

Link: https://arxiv.org/abs/2509.24502
Authors: Haewon Park, Sangwoo Kim, Yohan Jo
Affiliations: Seoul National University
Categories: Computation and Language (cs.CL)
Comments: 25 pages, 12 figures, 10 tables

[NLP-70] GRPO-MA: Multi-Answer Generation in GRPO for Stable and Efficient Chain-of-Thought Training

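[Quick Read]: This paper addresses three core challenges that the GRPO (Group Relative Policy Optimization) algorithm faces when training large language models (LLMs) and vision-language models (VLMs) for Chain-of-Thought (CoT) reasoning: 1) gradient coupling between thoughts and answers; 2) sparse reward signals caused by limited parallel sampling; and 3) unstable advantage estimation. The key to the solution is GRPO-MA, whose core idea is to generate multiple answers for each thought (multi-answer generation), which lowers the variance of the thought advantage estimate and reduces gradient spikes during training. The theoretical analysis shows that the variance of the thought advantage decreases as the number of answers per thought increases, and the experiments show that this mechanism improves both performance and training efficiency on math, code, and multimodal tasks.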

Link: https://arxiv.org/abs/2509.24494
Authors: Hongcheng Wang, Yinuo Huang, Sukai Wang, Guanghui Ren, Hao Dong
Affiliations: Peking University; University of Electronic Science and Technology of China; PKU-Agibot Joint Lab
Categories: Computation and Language (cs.CL)
Comments: Under review

Abstract: Recent progress, such as DeepSeek-R1, has shown that the GRPO algorithm, a Reinforcement Learning (RL) approach, can effectively train Chain-of-Thought (CoT) reasoning in Large Language Models (LLMs) and Vision-Language Models (VLMs). In this paper, we analyze three challenges of GRPO: gradient coupling between thoughts and answers, sparse reward signals caused by limited parallel sampling, and unstable advantage estimation. To mitigate these challenges, we propose GRPO-MA, a simple yet theoretically grounded method that leverages multi-answer generation from each thought process, enabling more robust and efficient optimization. Theoretically, we show that the variance of thought advantage decreases as the number of answers per thought increases. Empirically, our gradient analysis confirms this effect, showing that GRPO-MA reduces gradient spikes compared to GRPO. Experiments on math, code, and diverse multimodal tasks demonstrate that GRPO-MA substantially improves performance and training efficiency. Our ablation studies further reveal that increasing the number of answers per thought consistently enhances model performance.
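
The variance claim in the abstract can be illustrated with a small simulation. The sketch below is not the paper's estimator: it assumes a toy setting in which each answer sampled from a thought earns a Bernoulli reward with a fixed success probability, and it estimates the thought's value as the mean reward over its answers. The names `thought_value_estimate`, `group_advantages`, and `estimate_variance` are hypothetical helpers introduced only for this illustration.

```python
import random
import statistics


def thought_value_estimate(p_correct, num_answers, rng):
    """Average the 0/1 rewards of `num_answers` answers sampled from one thought.

    Assumption: each answer is correct independently with probability p_correct.
    """
    rewards = [1.0 if rng.random() < p_correct else 0.0 for _ in range(num_answers)]
    return sum(rewards) / num_answers


def group_advantages(values):
    """Group-relative normalization: subtract the group mean, divide by the group std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0.0:
        # All thoughts scored the same, so there is no learning signal in this group.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def estimate_variance(p_correct, num_answers, trials=2000, seed=0):
    """Empirical variance of the thought value estimate across repeated trials."""
    rng = random.Random(seed)
    estimates = [thought_value_estimate(p_correct, num_answers, rng) for _ in range(trials)]
    return statistics.pvariance(estimates)


if __name__ == "__main__":
    # More answers per thought should give a less noisy value estimate.
    for m in (1, 2, 4, 8):
        print(m, round(estimate_variance(0.6, m), 4))
```

In this toy model the variance of the estimate is p(1 - p) / M for M answers per thought, so the printed values should shrink roughly in proportion to 1 / M. How closely this carries over to real rollouts depends on the reward model and sampling setup described in the paper.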

[NLP-71] Sanitize Your Responses: Mitigating Privacy Leakage in Large Language Models

Link: https://arxiv.org/abs/2509.24488
Authors: Wenjie Fu, Huandong Wang, Junyao Gao, Guoan Wan, Tao Jiang
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Comments:

[NLP-72] A Text-To-Text Alignment Algorithm for Better Evaluation of Modern Speech Recognition Systems

Link: https://arxiv.org/abs/2509.24478
Authors: Lasse Borgholt, Jakob Havtorn, Christian Igel, Lars Maaløe, Zheng-Hua Tan
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

[NLP-73] Euclid's Gift: Enhancing Spatial Perception and Reasoning in Vision-Language Models via Geometric Surrogate Tasks

Link: https://arxiv.org/abs/2509.24473
Authors: Shijie Lian, Changti Wu, Laurence Tianruo Yang, Hang Yuan, Bin Yu, Lei Zhang, Kai Chen
Affiliations: Huazhong University of Science and Technology; Zhongguancun Academy; East China Normal University; Zhengzhou University; Zhongguancun Institute of Artificial Intelligence
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Comments:

[NLP-74] Bias Mitigation or Cultural Commonsense? Evaluating LLMs with a Japanese Dataset EMNLP2025

Link: https://arxiv.org/abs/2509.24468
Authors: Taisei Yamamoto, Ryoma Kumon, Danushka Bollegala, Hitomi Yanaka
Affiliations: The University of Tokyo; Riken; University of Liverpool
Categories: Computation and Language (cs.CL)
Comments: Accepted to EMNLP 2025 main

[NLP-75] Beyond Isolated Facts: Synthesizing Narrative and Grounded Supervision for VideoQA

Link: https://arxiv.org/abs/2509.24445
Authors: Jianxin Liang, Tan Yue, Yuxuan Wang, Yueqian Wang, Zhihan Yin, Huishuai Zhang, Dongyan Zhao
Affiliations: Wangxuan Institute of Computer Technology, Peking University; State Key Laboratory of General Artificial Intelligence
Categories: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
Comments:

[NLP-76] Alternatives To Next Token Prediction In Text Generation - A Survey

Link: https://arxiv.org/abs/2509.24435
Authors: Charlie Wyatt, Aditya Joshi, Flora Salim
Affiliations: UNSW (University of New South Wales)
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-77] CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition Domain and Task EMNLP2025

Link: https://arxiv.org/abs/2509.24422
Authors: Haosi Mo, Xinyu Ma, Xuebo Liu, Derek F. Wong, Yu Li, Jie Liu, Min Zhang
Affiliations: Harbin Institute of Technology, Shenzhen; University of Macau; Zhejiang University; Harbin Institute of Technology
Categories: Computation and Language (cs.CL)
Comments: 20 pages, 5 figures, EMNLP2025 Findings

[NLP-78] Multilingual Text-to-SQL: Benchmarking the Limits of Language Models with Collaborative Language Agents

Link: https://arxiv.org/abs/2509.24405
Authors: Khanh Trinh Pham, Thu Huong Nguyen, Jun Jo, Quoc Viet Hung Nguyen, Thanh Tam Nguyen
Affiliations: Griffith University
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Databases (cs.DB); Emerging Technologies (cs.ET); Information Retrieval (cs.IR)
Comments:

[NLP-79] Agentar-Scale-SQL: Advancing Text-to-SQL through Orchestrated Test-Time Scaling

Link: https://arxiv.org/abs/2509.24403
Authors: Pengfei Wang, Baolin Sun, Xuemei Dong, Yaxun Dai, Hongwei Yuan, Mengdie Chu, Yingqi Gao, Xiang Qi, Peng Zhang, Ying Yan
Affiliations: Ant Digital Technologies (Ant Group); Soochow University; Zhejiang University
Categories: Computation and Language (cs.CL); Databases (cs.DB)
Comments:

[NLP-80] Towards Safe Reasoning in Large Reasoning Models via Corrective Intervention

Link: https://arxiv.org/abs/2509.24393
Authors: Yichi Zhang, Yue Ding, Jingwen Yang, Tianwei Luo, Dongbai Li, Ranjie Duan, Qiang Liu, Hang Su, Yinpeng Dong, Jun Zhu
Affiliations: THU (Tsinghua University); RealAI; CASIA (Institute of Automation, Chinese Academy of Sciences)
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-81] LLaDA-MoE: A Sparse MoE Diffusion Language Model

Link: https://arxiv.org/abs/2509.24389
Authors: Fengqi Zhu, Zebin You, Yipeng Xing, Zenan Huang, Lin Liu, Yihong Zhuang, Guoshan Lu, Kangyu Wang, Xudong Wang, Lanning Wei, Hongrui Guo, Jiaqi Hu, Wentao Ye, Tieyuan Chen, Chenchen Li, Chengfu Tang, Haibo Feng, Jun Hu, Jun Zhou, Xiaolu Zhang, Zhenzhong Lan, Junbo Zhao, Da Zheng, Chongxuan Li, Jianguo Li, Ji-Rong Wen
Affiliations: Renmin University of China; Ant Group; Shanghai Jiao Tong University; Zhejiang University
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-82] HarmMetric Eval: Benchmarking Metrics and Judges for LLM Harmfulness Assessment

Link: https://arxiv.org/abs/2509.24384
Authors: Langqi Yang, Tianhang Zheng, Kedong Xiu, Yixuan Chen, Di Wang, Puning Zhao, Zhan Qin, Kui Ren
Affiliations: Zhejiang University; Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-83] Reinforcement Mid-Training

Link: https://arxiv.org/abs/2509.24375
Authors: Yijun Tian, Shaoyu Chen, Zhichao Xu, Yawei Wang, Jinhe Bi, Peng Han, Wei Wang
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

[NLP-84] Beyond Repetition: Text Simplification and Curriculum Learning for Data-Constrained Pretraining EMNLP2025

Link: https://arxiv.org/abs/2509.24356
Authors: Matthew Theodore Roque, Dan John Velasco
Affiliations: Samsung R&D Institute Philippines
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: To be published in BabyLM Workshop at EMNLP 2025

[NLP-85] AlignX: Advancing Multilingual Large Language Models with Multilingual Representation Alignment EMNLP2025

Link: https://arxiv.org/abs/2509.24338
Authors: Mengyu Bu, Shaolei Zhang, Zhongjun He, Hua Wu, Yang Feng
Affiliations: Institute of Computing Technology, Chinese Academy of Sciences (ICT/CAS); Key Laboratory of AI Safety, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing, China; Baidu Inc.
Categories: Computation and Language (cs.CL)
Comments: Accepted to EMNLP 2025 Main Conference. The code will be available at this https URL

[NLP-86] Speculative Verification: Exploiting Information Gain to Refine Speculative Decoding

Link: https://arxiv.org/abs/2509.24328
Authors: Sungkyun Kim, Jaemin Kim, Dogyung Yoon, Jiho Shin, Junyeol Lee, Jiwon Seo
Affiliations: Hanyang University; Seoul National University; University of Seoul
Categories: Computation and Language (cs.CL)
Comments: 14 pages, 6 figures

[NLP-87] MAS2: Self-Generative Self-Configuring Self-Rectifying Multi-Agent Systems

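Link: https://arxiv.org/abs/2509.24323
Authors: Kun Wang, Guibin Zhang, ManKit Ye, Xinyu Deng, Dongxia Wang, Xiaobin Hu, Jinyang Guo, Yang Liu, Yufei Guo
Affiliations: NTU (Nanyang Technological University); NUS (National University of Singapore); USTC (University of Science and Technology of China); ZJU (Zhejiang University); BUAA (Beihang University); PKU (Peking University)
Categories: Multiagent Systems (cs.MA); Computation and Language (cs.CL)
Comments: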

[NLP-88] Multimodal Large Language Models Meet Multimodal Emotion Recognition and Reasoning: A Survey

Link: https://arxiv.org/abs/2509.24322
Authors: Yuntao Shou, Tao Meng, Wei Ai, Keqin Li
Affiliations: Central South University of Forestry and Technology; State University of New York New Paltz
Categories: Computation and Language (cs.CL)
Comments: 35 pages, 10 figures, 8 tables

[NLP-89] Dual Mechanisms of Value Expression: Intrinsic vs. Prompted Values in LLMs

Link: https://arxiv.org/abs/2509.24319
Authors: Jongwook Han, Jongwon Lim, Injin Kong, Yohan Jo
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-90] Bridging the behavior-neural gap: A multimodal AI reveals the brain's geometry of emotion more accurately than human self-reports

Link: https://arxiv.org/abs/2509.24298
Authors: Changde Du, Yizhuo Lu, Zhongyu Huang, Yi Sun, Zisen Zhou, Shaozheng Qin, Huiguang He
Affiliations: Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Beijing Normal University
Categories: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computers and Society (cs.CY); Multimedia (cs.MM)
Comments:

[NLP-91] Q-Mirror: Unlocking the Multi-Modal Potential of Scientific Text-Only QA Pairs

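[Quick Read]: This paper addresses the difficulty of building high-quality Multi-Modal QA (MMQA) benchmarks at scale, since creating them by hand is costly and does not scale. The key to the solution is a TQA-to-MMQA framework that defines a task specification and a multi-dimensional quality rubric for automatically converting Text-Only QA Pairs (TQAs) into high-quality MMQAs, together with Q-Mirror, a closed-loop system for iterative refinement. Q-Mirror integrates MMQA generation and evaluation modules and relies on strong understanding models to assess the quality of generated content, which enables automated iterative improvement. The experiments show that the approach raises both the quality and the pass rate of MMQAs, offering a practical path to large-scale scientific reasoning benchmarks.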

Link: https://arxiv.org/abs/2509.24297
Authors: Junying Wang, Zicheng Zhang, Ye Shen, Yalun Wu, Yingji Liang, Yijin Guo, Farong Wen, Wenzhe Li, Xuezhi Zhao, Qi Jia, Guangtao Zhai
Affiliations: Fudan University; Shanghai Artificial Intelligence Laboratory; Shanghai Jiao Tong University
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: 26 pages

Abstract: High-quality, multi-modal benchmarks are crucial for advancing scientific reasoning in large models yet their manual creation is costly and unscalable. To address this bottleneck, we explore the potential for transforming Text-Only QA Pairs (TQAs) into high-quality Multi-Modal QA Pairs (MMQAs), which include three parts: 1) Task Definition & Evaluation Rubric: We develop a TQA-to-MMQA framework and establish a comprehensive, multi-dimensional MMQA quality rubric that provides principles for the transformation. 2) Benchmark Construction: Then we construct two extensive benchmarks to rigorously evaluate state-of-the-art generation & understanding models on the distinct tasks of MMQA generation & MMQA quality evaluation. 3) Preliminary Solution: We develop an agentic system (Q-Mirror), which operationalizes our framework by integrating MMQA generation and evaluation into a closed loop for iterative refinement. Our experiments show that while state-of-the-art models can generate MMQAs, their outputs still leave substantial gaps, underscoring the need for reliable evaluation. We further demonstrate that top-tier understanding models align closely with human judgment in MMQA quality assessment. Leveraging both insights, the Q-Mirror agent raises average scores from 78.90 to 85.22 and pass rates from 72% to 95%, offering a practical path to large-scale scientific benchmarks.
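
The closed loop described in the abstract (generate, evaluate, refine) can be sketched generically. The code below is only a minimal illustration under stated assumptions, not the Q-Mirror implementation: `refine_until_pass`, the `generate` and `evaluate` callables, the pass threshold, and the round limit are all hypothetical and do not come from the paper.

```python
from typing import Callable, Optional, Tuple


def refine_until_pass(
    tqa: str,
    generate: Callable[[str, str], str],
    evaluate: Callable[[str], Tuple[float, str]],
    pass_score: float = 80.0,  # assumed threshold, not taken from the paper
    max_rounds: int = 3,       # assumed budget, not taken from the paper
) -> Tuple[Optional[str], float]:
    """Run a generate -> evaluate -> refine loop and return the best candidate seen.

    generate(tqa, feedback) returns a candidate multi-modal QA item as text.
    evaluate(candidate) returns (score, feedback) from a judge model.
    """
    feedback = ""
    best_candidate: Optional[str] = None
    best_score = float("-inf")
    for _ in range(max_rounds):
        candidate = generate(tqa, feedback)
        score, feedback = evaluate(candidate)
        if score > best_score:
            best_candidate, best_score = candidate, score
        if score >= pass_score:
            # Stop early once the judge accepts the candidate.
            break
    return best_candidate, best_score
```

Keeping the best candidate rather than the last one guards against a refinement round that makes the item worse. In practice both callables would wrap model calls, and the judge's feedback string would be folded into the next generation prompt.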

[NLP-92] DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models

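Link: https://arxiv.org/abs/2509.24296
Authors: Zherui Li, Zheng Nie, Zhenhong Zhou, Yufei Guo, Yue Liu, Yitong Zhang, Yu Cheng, Qingsong Wen, Kun Wang, Jiaheng Zhang
Affiliations: BUPT (Beijing University of Posts and Telecommunications); NUS (National University of Singapore); NTU (Nanyang Technological University); PKU (Peking University); THU (Tsinghua University); CUHK (The Chinese University of Hong Kong); Squirrel AI
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: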

[NLP-93] LOGOS: LLM-driven End-to-End Grounded Theory Development and Schema Induction for Qualitative Research

Link: https://arxiv.org/abs/2509.24294
Authors: Xinyu Pi, Qisen Yang, Chuong Nguyen
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Human-Computer Interaction (cs.HC)
Comments:

[NLP-94] Let LLMs Speak Embedding Languages: Generative Text Embeddings via Iterative Contrastive Refinement

Link: https://arxiv.org/abs/2509.24291
Authors: Yu-Che Tsai, Kuan-Yu Chen, Yuan-Chi Li, Yuan-Hao Chen, Ching-Yu Tsai, Shou-De Lin
Affiliations: National Taiwan University
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-95] SCI-Verifier: Scientific Verifier with Thinking

Link: https://arxiv.org/abs/2509.24285
Authors: Shenghe Zheng, Chenyu Huang, Fangchen Yu, Junchi Yao, Jingqi Ye, Tao Chen, Yun Luo, Ning Ding, LEI BAI, Ganqu Cui, Peng Ye
Affiliations: Shanghai AI Laboratory; Harbin Institute of Technology; Fudan University; CUHK (The Chinese University of Hong Kong); Tsinghua University; CUHK-Shenzhen (The Chinese University of Hong Kong, Shenzhen); UESTC (University of Electronic Science and Technology of China); USTC (University of Science and Technology of China)
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Comments: This paper focuses on LLM-as-a-Judge, and the project is currently in progress

[NLP-96] Overview of SCIDOCA 2025 Shared Task on Citation Prediction Discovery and Placement

Link: https://arxiv.org/abs/2509.24283
Authors: An Dao, Vu Tran, Le-Minh Nguyen, Yuji Matsumoto
Affiliations: Unknown
Categories: Digital Libraries (cs.DL); Computation and Language (cs.CL)
Comments: 16 pages, SCIDOCA 2025

[NLP-97] SimuHome: A Temporal- and Environment-Aware Benchmark for Smart Home LLM Agents

Link: https://arxiv.org/abs/2509.24282
Authors: Gyuhyeon Seo, Jungwoo Yang, Junseong Pyo, Nalim Kim, Jonggeun Lee, Yohan Jo
Affiliations: Seoul National University; Hanyang University
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-98] AdvChain: Adversarial Chain-of-Thought Tuning for Robust Safety Alignment of Large Reasoning Models

Link: https://arxiv.org/abs/2509.24269
Authors: Zihao Zhu, Xinyu Wu, Gehan Hu, Siwei Lyu, Ke Xu, Baoyuan Wu
Affiliations: The Chinese University of Hong Kong, Shenzhen; State University of New York at Buffalo; Huawei International, Singapore
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-99] PAME-AI: Patient Messaging Creation and Optimization using Agentic AI

Link: https://arxiv.org/abs/2509.24263
Authors: Junjie Luo, Yihong Guo, Anqi Liu, Ritu Agarwal, Gordon (Guodong) Gao
Affiliations: Johns Hopkins School of Medicine; Johns Hopkins University; Johns Hopkins University, CDHAI
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-100] MRAG-Suite: A Diagnostic Evaluation Platform for Visual Retrieval-Augmented Generation

Link: https://arxiv.org/abs/2509.24253
Authors: Yuelyu Ji
Affiliations: University of Pittsburgh
Categories: Computation and Language (cs.CL)
Comments:

[NLP-101] Latent Visual Reasoning

Link: https://arxiv.org/abs/2509.24251
Authors: Bangzheng Li, Ximeng Sun, Jiang Liu, Ze Wang, Jialian Wu, Xiaodong Yu, Hao Chen, Emad Barsoum, Muhao Chen, Zicheng Liu
Affiliations: University of California, Davis; Advanced Micro Devices, Inc.
Categories: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
Comments:

[NLP-102] SpecExit: Accelerating Large Reasoning Model via Speculative Exit

Link: https://arxiv.org/abs/2509.24248
Authors: Rubing Yang, Huajun Bai, Song Liu, Guanghua Yu, Runzhi Fan, Yanbin Dang, Jiejing Zhang, Kai Liu, Jianchen Zhu, Peng Chen
Affiliations: Tencent
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Comments:

[NLP-103] Prompt and Parameter Co-Optimization for Large Language Models

Link: https://arxiv.org/abs/2509.24245
Authors: Xiaohe Bo, Rui Li, Zexu Sun, Quanyu Dai, Zeyu Zhang, Zihang Tian, Xu Chen, Zhenhua Dong
Affiliations: Gaoling School of Artificial Intelligence, Renmin University of China; Huawei Noah’s Ark Lab
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: 19 pages, 10 figures

[NLP-104] Learning to Ponder: Adaptive Reasoning in Latent Space

Link: https://arxiv.org/abs/2509.24238
Authors: Yixin He, Lumingyuan Tang
Affiliations: University of Southern California; Independent Researcher
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-105] Model Fusion with Multi-LoRA Inference for Tool-Enhanced Game Dialogue Agents

Link: https://arxiv.org/abs/2509.24229
Authors: Kangxu Wang, Ze Chen, Chengcheng Wei, Jiewen Zheng, Jiarong He, Max Gao
Affiliations: Interactive Entertainment Group of Netease Inc.
Categories: Computation and Language (cs.CL)
Comments: 8 pages

[NLP-106] MoVa: Towards Generalizable Classification of Human Morals and Values EMNLP2025

Link: https://arxiv.org/abs/2509.24216
Authors: Ziyu Chen, Junfei Sun, Chenxi Li, Tuan Dung Nguyen, Jing Yao, Xiaoyuan Yi, Xing Xie, Chenhao Tan, Lexing Xie
Affiliations: The Australian National University; University of Chicago; University of Pennsylvania; Microsoft Research Asia
Categories: Computation and Language (cs.CL); Computers and Society (cs.CY)
Comments: 9 pages, 10 figures and tables, EMNLP 2025 main conference

[NLP-107] Metamorphic Testing for Audio Content Moderation Software

Link: https://arxiv.org/abs/2509.24215
Authors: Wenxuan Wang, Yongjiang Wu, Junyuan Zhang, Shuqing Li, Yun Peng, Wenting Chen, Shuai Wang, Michael R. Lyu
Affiliations: Unknown
Categories: Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Multimedia (cs.MM)
Comments: Accepted by ASE 2025

[NLP-108] ScenarioBench: Trace-Grounded Compliance Evaluation for Text-to-SQL and RAG

Link: https://arxiv.org/abs/2509.24212
Authors: Zahra Atf, Peter R Lewis
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments: Accepted for presentation at the LLMs Meet Databases (LMD) Workshop, 35th IEEE International Conference on Collaborative Advances in Software and Computing, 2025. Workshop website: this https URL

[NLP-109] BeyondBench: Benchmark-Free Evaluation of Reasoning in Language Models

Link: https://arxiv.org/abs/2509.24210
Authors: Gaurav Srivastava, Aafiya Hussain, Zhenyu Bi, Swastik Roy, Priya Pitre, Meng Lu, Morteza Ziyadi, Xuan Wang
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments: 113 pages, 5 figures, 30 tables

[NLP-110] Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRPO and Its Friends

Link: https://arxiv.org/abs/2509.24203
Authors: Chaorui Yao, Yanxi Chen, Yuchang Sun, Yushuo Chen, Wenhao Zhang, Xuchen Pan, Yaliang Li, Bolin Ding
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-111] Can Large Language Models Express Uncertainty Like Human?

Link: https://arxiv.org/abs/2509.24202
Authors: Linwei Tao, Yi-Fan Yeh, Bo Kai, Minjing Dong, Tao Huang, Tom A. Lamb, Jialin Yu, Philip H.S. Torr, Chang Xu
Affiliations: University of Sydney; City University of Hong Kong; Shanghai Jiao Tong University; University of Oxford
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: 10 pages

[NLP-112] AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play NEURIPS2025

Link: https://arxiv.org/abs/2509.24193
Authors: Ran Xu, Yuchen Zhuang, Zihan Dong, Jonathan Wang, Yue Yu, Joyce C. Ho, Linjun Zhang, Haoyu Wang, Wenqi Shi, Carl Yang
Affiliations: Emory University; Georgia Institute of Technology; Rutgers University; SUNY Albany; UT Southwestern Medical Center
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Comments: Accepted to NeurIPS 2025 (Spotlight)

[NLP-113] PET: Preference Evolution Tracking with LLM-Generated Explainable Distribution

Link: https://arxiv.org/abs/2509.24189
Authors: Luyang Zhang, Siyuan Peng, Jialu Wang, Shichao Zhu, Beibei Li, Zhongcun Wang, Guangmou Pan, Yan Li, Song Yang
Affiliations: Carnegie Mellon University; University of Maryland; TikTok Inc.
Categories: Computation and Language (cs.CL)
Comments:

[NLP-114] Beyond Overall Accuracy: A Psychometric Deep Dive into the Topic-Specific Medical Capabilities of 80 Large Language Models

Link: https://arxiv.org/abs/2509.24186
Authors: Zhimeng Luo, Lixin Wu, Adam Frisch, Daqing He
Affiliations: University of Pittsburgh; University of Illinois Urbana–Champaign
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-115] Retrieval-augmented GUI Agents with Generative Guidelines EMNLP2025

Link: https://arxiv.org/abs/2509.24183
Authors: Ran Xu, Kaixin Ma, Wenhao Yu, Hongming Zhang, Joyce C. Ho, Carl Yang, Dong Yu
Affiliations: Emory University; Tencent AI Lab
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments: Accepted to EMNLP 2025 (Main Conference)

[NLP-116] Task Vectors Learned Not Extracted: Performance Gains and Mechanistic Insight

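[Quick Read]: This paper addresses the limited understanding of how in-context learning (ICL) works in large language models (LLMs), in particular that existing methods for extracting task vectors (TVs) are cumbersome and opaque, and that the way TVs influence the model's internal computation remains unclear. The key to the solution is Learned Task Vectors (LTVs), which are obtained by direct training. LTVs are more accurate and more flexible than extracted TVs, and they work at arbitrary layers and positions and even alongside ICL prompts. A systematic mechanistic analysis then traces how TVs act: at the low level they steer predictions mainly through the OV (output-value) circuits of attention heads, with a small set of "key heads" being decisive; at a higher level TV propagation is approximately linear, with early TVs rotated toward task-relevant subspaces to raise the logits of the target labels and later TVs adjusted mainly by scaling their magnitude.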

Link: https://arxiv.org/abs/2509.24169
Authors: Haolin Yang, Hakaze Cho, Kaize Ding, Naoya Inoue
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments: 48 pages, 95 figures, 17 tables

Abstract: Large Language Models (LLMs) can perform new tasks from in-context demonstrations, a phenomenon known as in-context learning (ICL). Recent work suggests that these demonstrations are compressed into task vectors (TVs), compact task representations that LLMs exploit for predictions. However, prior studies typically extract TVs from model outputs or hidden states using cumbersome and opaque methods, and they rarely elucidate the mechanisms by which TVs influence computation. In this work, we address both limitations. First, we propose directly training Learned Task Vectors (LTVs), which surpass extracted TVs in accuracy and exhibit superior flexibility, acting effectively at arbitrary layers, positions, and even with ICL prompts. Second, through systematic analysis, we investigate the mechanistic role of TVs, showing that at the low level they steer predictions primarily through attention-head OV circuits, with a small subset of “key heads” most decisive. At a higher level, we find that despite Transformer nonlinearities, TV propagation is largely linear: early TVs are rotated toward task-relevant subspaces to improve logits of relevant labels, while later TVs are predominantly scaled in magnitude. Taken together, LTVs not only provide a practical approach for obtaining effective TVs but also offer a principled lens into the mechanistic foundations of ICL.
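
The idea of training a task vector directly, rather than extracting it, can be shown on a toy model. The sketch below is an assumption-laden illustration and not the paper's method: a frozen linear readout `W` stands in for the frozen LLM, the task vector `v` is simply added to a hidden state, and `v` is fitted by plain gradient descent on cross-entropy. All function names here are hypothetical.

```python
import math


def logits(W, h):
    """Frozen linear readout: one logit per row of W."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def softmax(z):
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def loss_and_grad(W, h, v, label):
    """Cross-entropy of the steered state h + v, and its gradient with respect to v."""
    steered = [a + b for a, b in zip(h, v)]
    probs = softmax(logits(W, steered))
    loss = -math.log(probs[label])
    err = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    grad = [sum(W[k][j] * err[k] for k in range(len(W))) for j in range(len(h))]
    return loss, grad


def mean_loss(W, examples, v):
    return sum(loss_and_grad(W, h, v, y)[0] for h, y in examples) / len(examples)


def predict(W, h, v):
    z = logits(W, [a + b for a, b in zip(h, v)])
    return max(range(len(z)), key=lambda k: z[k])


def train_task_vector(W, examples, dim, lr=0.5, steps=200):
    """Fit only the task vector v; the readout W stays frozen throughout."""
    v = [0.0] * dim
    for _ in range(steps):
        total = [0.0] * dim
        for h, label in examples:
            _, g = loss_and_grad(W, h, v, label)
            total = [t + gi for t, gi in zip(total, g)]
        v = [vi - lr * t / len(examples) for vi, t in zip(v, total)]
    return v
```

Only `v` receives updates, which mirrors the general setup of steering a frozen model with a small learned vector. A real experiment would inject the vector into a chosen layer and token position of a Transformer, which this toy readout does not capture.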

[NLP-117] Localizing Task Recognition and Task Learning in In-Context Learning via Attention Head Analysis

Link: https://arxiv.org/abs/2509.24164
Authors: Haolin Yang, Hakaze Cho, Naoya Inoue
Affiliations: University of Chicago; JAIST; RIKEN
Categories: Computation and Language (cs.CL)
Comments: 45 pages, 88 figures, 10 tables

[NLP-118] Reasoning or Retrieval? A Study of Answer Attribution on Large Reasoning Models

Link: https://arxiv.org/abs/2509.24156
Authors: Yuhui Wang, Changjiang Li, Guangke Chen, Jiacheng Liang, Ting Wang
Affiliations: Stony Brook University
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-119] Your thoughts tell who you are: Characterize the reasoning patterns of LRMs

Link: https://arxiv.org/abs/2509.24147
Authors: Yida Chen, Yuning Mao, Xianjun Yang, Suyu Ge, Shengjie Bi, Lijuan Liu, Saghar Hosseini, Liang Tan, Yixin Nie, Shaoliang Nie
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments: 32 pages, 28 figures

[NLP-120] Generalist Scanner Meets Specialist Locator: A Synergistic Coarse-to-Fine Framework for Robust GUI Grounding

Link: https://arxiv.org/abs/2509.24133
Authors: Zhecheng Li, Guoxian Song, Yiwei Wang, Zhen Xiong, Junsong Yuan, Yujun Cai
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
Comments:

[NLP-121] Beyond Magic Words: Sharpness-Aware Prompt Evolving for Robust Large Language Models with TARE

Link: https://arxiv.org/abs/2509.24130
Authors: Guancheng Wan, Lucheng Fu, Haoxin Liu, Yiqiao Jin, Hui Yi Leong, Eric Hanchen Jiang, Hejia Geng, Jinhe Bi, Yunpu Ma, Xiangru Tang, B. Aditya Prakash, Yizhou Sun, Wei Wang
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

[NLP-122] EduVidQA: Generating and Evaluating Long-form Answers to Student Questions based on Lecture Videos EMNLP2025

Link: https://arxiv.org/abs/2509.24120
Authors: Sourjyadip Ray, Shubham Sharma, Somak Aditya, Pawan Goyal
Affiliations: Indian Institute of Technology, Kharagpur; Panjab University, Chandigarh
Categories: Computation and Language (cs.CL)
Comments: EMNLP 2025 (Main)

[NLP-123] Dual-Scale World Models for LLM Agents Towards Hard-Exploration Problems

Link: https://arxiv.org/abs/2509.24116
Authors: Minsoo Kim, Seung-won Hwang
Affiliations: Seoul National University
Categories: Computation and Language (cs.CL)
Comments:

[NLP-124] Pragmatic Inference for Moral Reasoning Acquisition: Generalization via Distributional Semantics

Link: https://arxiv.org/abs/2509.24102
Authors: Guangliang Liu, Xi Chen, Bocheng Chen, Xitong Zhang, Kristen Johnson
Affiliations: Michigan State University; Nanyang Technological University; University of Mississippi
Categories: Computation and Language (cs.CL)
Comments:

[NLP-125] BTC-SAM: Leveraging LLMs for Generation of Bias Test Cases for Sentiment Analysis Models EMNLP2025

Link: https://arxiv.org/abs/2509.24101
Authors: Zsolt T. Kardkovács, Lynda Djennane, Anna Field, Boualem Benatallah, Yacine Gaci, Fabio Casati, Walid Gaaloul
Affiliations: Insight SFI Research Center on Data Analytics, Dublin City University, Dublin, Ireland; Laboratoire LITAN, École supérieure en Sciences et Technologies de l’Informatique et du Numérique, RN 75, Amizour 06300, Bejaia, Algérie; School of Computing, Dublin City University, Dublin, Ireland; Plus Que Pro, Strasbourg, France; ServiceNow, Zurich, Switzerland; Department of Information Engineering and Computer Science, University of Trento, Via Sommarive 9, Povo, 38123 Trento, Italy; Télécom SudParis, SAMOVAR, Institut Polytechnique de Paris, Paris, France
Categories: Computation and Language (cs.CL)
Comments: Accepted at EMNLP 2025 main conference

[NLP-126] GEAR: A General Evaluation Framework for Abductive Reasoning

Link: https://arxiv.org/abs/2509.24096
Authors: Kaiyu He, Peilin Wu, Mian Zhang, Kun Wan, Wentian Zhao, Xinya Du, Zhiyu Chen
Affiliations: The University of Texas at Dallas; Adobe
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments: Code and Data: this https URL

[NLP-127] Large-Scale Constraint Generation - Can LLMs Parse Hundreds of Constraints?

Link: https://arxiv.org/abs/2509.24090
Authors: Matteo Boffa, Jiaxuan You
Affiliations: Politecnico di Torino; University of Illinois Urbana-Champaign (UIUC)
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-128] Do Repetitions Matter? Strengthening Reliability in LLM Evaluations

Link: https://arxiv.org/abs/2509.24086
Authors: Miguel Angel Alvarado Gonzalez, Michelle Bruno Hernandez, Miguel Angel Peñaloza Perez, Bruno Lopez Orozco, Jesus Tadeo Cruz Soto, Sandra Malagon
Affiliations: Carreras con Impacto; Aixo Lab; Centro de Investigación Científica y de Educación Superior de Ensenada, Baja California, México; Facultad de Ciencias, UNAM, México; Facultad de Matemáticas, Universidad Veracruzana, México
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-129] Ensembling Multilingual Transformers for Robust Sentiment Analysis of Tweets

Link: https://arxiv.org/abs/2509.24080
Authors: Meysam Shirdel Bilehsavar, Negin Mahmoudi, Mohammad Jalili Torkamani, Kiana Kiashemshaki
Affiliations: University of South Carolina; Stevens Institute of Technology; University of Nebraska–Lincoln; Bowling Green State University
Categories: Computation and Language (cs.CL)
Comments: 19 pages, 4 figures, 2 tables

[NLP-130] ResFormer: All-Time Reservoir Memory for Long Sequence Classification EMNLP2025

Link: https://arxiv.org/abs/2509.24074
Authors: Hongbo Liu, Jia Xu
Affiliations: Stevens Institute of Technology
Categories: Computation and Language (cs.CL)
Comments: Accepted at EMNLP 2025. To appear in the proceedings

[NLP-131] The Role of Logic and Automata in Understanding Transformers

Link: https://arxiv.org/abs/2509.24024
Authors: Anthony W. Lin, Pablo Barcelo
Affiliations: Unknown
Categories: Formal Languages and Automata Theory (cs.FL); Computation and Language (cs.CL); Machine Learning (cs.LG); Logic in Computer Science (cs.LO)
Comments: Preprint of invited paper for RP’25

[NLP-132] SparseD: Sparse Attention for Diffusion Language Models

Link: https://arxiv.org/abs/2509.24014
Authors: Zeqing Wang, Gongfan Fang, Xinyin Ma, Xingyi Yang, Xinchao Wang
Affiliations: National University of Singapore; The Hong Kong Polytechnic University
Categories: Computation and Language (cs.CL)
Comments: The code is available at this https URL

[NLP-133] Sequential Diffusion Language Models

Link: https://arxiv.org/abs/2509.24007
Authors: Yangzhou Liu, Yue Cao, Hao Li, Gen Luo, Zhe Chen, Weiyun Wang, Xiaobo Liang, Biqing Qi, Lijun Wu, Changyao Tian, Yanting Zhang, Yuqiang Li, Tong Lu, Yu Qiao, Jifeng Dai, Wenhai Wang
Affiliations: Shanghai AI Laboratory; Nanjing University; Tsinghua University; Fudan University; The Chinese University of Hong Kong; Soochow University; Donghua University
Categories: Computation and Language (cs.CL); Machine Learning (cs.LG)
Comments: 14 pages, 5 figures, technical report

[NLP-134] MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Link: https://arxiv.org/abs/2509.24002
Authors: Zijian Wu, Xiangyan Liu, Xinyuan Zhang, Lingjun Chen, Fanqing Meng, Lingxiao Du, Yiran Zhao, Fanshi Zhang, Yaoqi Ye, Jiawei Wang, Zirui Wang, Jinjie Ni, Yufan Yang, Arvin Xu, Michael Qizhe Shieh
Affiliations: 1. Tsinghua University; 2. Alibaba Group; 3. Tongyi Lab; 4. Microsoft; 5. MIT
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: 42 pages, 27 figures, 10 tables

[NLP-135] The AI Agent Code of Conduct: Automated Guardrail Policy-as-Prompt Synthesis NEURIPS2025

Link: https://arxiv.org/abs/2509.23994
Authors: Gauri Kholkar, Ratinder Ahuja
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: Accepted at Regulatable ML Workshop at NeurIPS 2025

[NLP-136] The Hidden Costs of Translation Accuracy: Distillation, Quantization, and Environmental Impact

Link: https://arxiv.org/abs/2509.23990
Authors: Dhaathri Vijay, Anandaswarup Vadapalli
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-137] Toward Preference-aligned Large Language Models via Residual-based Model Steering

Link: https://arxiv.org/abs/2509.23982
Authors: Lucio La Cava, Andrea Tagarelli
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Comments:

[NLP-138] ByteSized32Refactored: Towards an Extensible Interactive Text Games Corpus for LLM World Modeling and Evaluation EMNLP2025

Link: https://arxiv.org/abs/2509.23979
Authors: Haonan Wang, Junfeng Sun, Xingdi Yuan, Ruoyao Wang, Ziang Xiao
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments: 14 pages, 15 figures. Accepted to the 5th Wordplay: When Language Meets Games Workshop, EMNLP 2025

[NLP-139] HiPO: Hybrid Policy Optimization for Dynamic Reasoning in LLMs

Link: https://arxiv.org/abs/2509.23967
Authors: Ken Deng, Zizheng Zhan, Wen Xiang, Wenqiang Zhu, Tianhao Peng, Xinping Lei, Weihao Li, Jingxuan Xu, Kun Wu, Yifan Yao, Haoyang Huang, Huaixi Tang, Kepeng Lei, Zhiyi Lai, Songwei Yu, Zongxian Feng, Zuchen Gao, Weihao Xie, Chenchen Zhang, Yanan Wu, Yuanxing Zhang, Lecheng Huang, Yuqun Zhang, Jie Liu, Zhaoxiang Zhang, Haotian Zhang, Bin Chen, Jiaheng Liu
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

[NLP-140] Detecting and Rectifying Noisy Labels: A Similarity-based Approach

Link: https://arxiv.org/abs/2509.23964
Authors: Dang Huu-Tien, Naoya Inoue
Affiliations: JAIST; RIKEN
Categories: Machine Learning (cs.LG); Computation and Language (cs.CL)
Comments:

[NLP-141] Conditional Advantage Estimation for Reinforcement Learning in Large Reasoning Models

Link: https://arxiv.org/abs/2509.23962
Authors: Guanxu Chen, Yafu Li, Yuxian Jiang, Chen Qian, Qihan Ren, Jingyi Yang, Yu Cheng, Dongrui Liu, Jing Shao
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments: 18 pages, 13 figures, 4 tables

[NLP-142] Vision-Grounded Machine Interpreting: Improving the Translation Process through Visual Cues

Link: https://arxiv.org/abs/2509.23957
Authors: Claudio Fantinuoli
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: Paper presented at AMTA 2025

[NLP-143] Explore-Execute Chain: Towards an Efficient Structured Reasoning Paradigm ICLR2026

Link: https://arxiv.org/abs/2509.23946
Authors: Kaisen Yang, Lixuan He, Rushi Shah, Kaicheng Yang, Qinwei Ma, Dianbo Liu, Alex Lamb
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (stat.ML)
Comments: Under review at ICLR 2026

[NLP-144] Easy Turn: Integrating Acoustic and Linguistic Modalities for Robust Turn-Taking in Full-Duplex Spoken Dialogue Systems

Link: https://arxiv.org/abs/2509.23938
Authors: Guojian Li, Chengyou Wang, Hongfei Xue, Shuiyuan Wang, Dehui Gao, Zihan Zhang, Yuke Lin, Wenjie Li, Longshuai Xiao, Zhonghua Fu, Lei Xie
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-145] Assessing Large Language Models in Updating Their Forecasts with New Information

Link: https://arxiv.org/abs/2509.23936
Authors: Zhangdie Yuan, Zifeng Ding, Andreas Vlachos
Affiliations: University of Cambridge
Categories: Computation and Language (cs.CL)
Comments:

[NLP-146] Beyond Benchmarks: Understanding Mixture-of-Experts Models through Internal Mechanisms

Link: https://arxiv.org/abs/2509.23933
Authors: Jiahao Ying, Mingbao Lin, Qianru Sun, Yixin Cao
Affiliations: Singapore Management University; Rakuten Singapore; Institute of Trustworthy Embodied AI, Fudan University
Categories: Machine Learning (cs.LG); Computation and Language (cs.CL)
Comments:

[NLP-147] Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Step

Link: https://arxiv.org/abs/2509.23924
Authors: Jingyi Yang, Guanxu Chen, Xuhao Hu, Jing Shao
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: 10 pages, 4 figures, 7 tables. Code: this https URL

[NLP-148] Dynamic Orthogonal Continual Fine-tuning for Mitigating Catastrophic Forgettings

Link: https://arxiv.org/abs/2509.23893
Authors: Zhixin Zhang, Zeming Wei, Meng Sun
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Cryptography and Security (cs.CR); Optimization and Control (math.OC)
Comments:

[NLP-149] DocPruner: A Storage-Efficient Framework for Multi-Vector Visual Document Retrieval via Adaptive Patch-Level Embedding Pruning

Link: https://arxiv.org/abs/2509.23883
Authors: Yibo Yan, Guangwei Xu, Xin Zou, Shuliang Liu, James Kwok, Xuming Hu
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Information Retrieval (cs.IR)
Comments: Under review

[NLP-150] PCRI: Measuring Context Robustness in Multimodal Models for Enterprise Applications EMNLP2025

Link: https://arxiv.org/abs/2509.23879
Authors: Hitesh Laxmichand Patel, Amit Agarwal, Srikant Panda, Hansa Meghwani, Karan Dua, Paul Li, Tao Sheng, Sujith Ravi, Dan Roth
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Multimedia (cs.MM)
Comments: Accepted in EMNLP 2025

[NLP-151] Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning

Link: https://arxiv.org/abs/2509.23873
Authors: Shaobo Wang, Jiaming Wang, Jiajun Zhang, Cong Wang, Yue Min, Zichen Wen, Fei Huang, Huiqiang Jiang, Junyang Lin, Dayiheng Liu, Linfeng Zhang
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments: 19 pages, 6 figures

[NLP-152] Bridging the Knowledge-Prediction Gap in LLMs on Multiple-Choice Questions

Link: https://arxiv.org/abs/2509.23782
Authors: Yoonah Park, Haesung Pyun, Yohan Jo
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

[NLP-153] Knowledge Homophily in Large Language Models

Link: https://arxiv.org/abs/2509.23773
Authors: Utkarsh Sahu, Zhisheng Qi, Mahantesh Halappanavar, Nedim Lipka, Ryan A. Rossi, Franck Dernoncourt, Yu Zhang, Yao Ma, Yu Wang
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Social and Information Networks (cs.SI)
Comments:

[NLP-154] From What to Why: A Multi-Agent System for Evidence-based Chemical Reaction Condition Reasoning

Link: https://arxiv.org/abs/2509.23768
Authors: Cheng Yang, Jiaxuan Lu, Haiyuan Wan, Junchi Yu, Feiwei Qin
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-155] From Personal to Collective: On the Role of Local and Global Memory in LLM Personalization

Link: https://arxiv.org/abs/2509.23767
Authors: Zehong Wang, Junlin Wu, Zhaoxuan Tan, Bolian Li, Xianrui Zhong, Zheli Liu, Qingkai Zeng
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-156] Knowledge-Level Consistency Reinforcement Learning: Dual-Fact Alignment for Long-Form Factuality

Link: https://arxiv.org/abs/2509.23765
Authors: Junliang Li, Yucheng Wang, Yan Chen, Yu Ran, Ruiqing Zhang, Jing Liu, Hua Wu, Haifeng Wang
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments:

[NLP-157] Understanding Textual Capability Degradation in Speech LLMs via Parameter Importance Analysis

Link: https://arxiv.org/abs/2509.23755
Authors: Chao Wang, Rui-Chen Zheng, Yang Ai, Zhen-Hua Ling
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-158] Anchored Supervised Fine-Tuning

Link: https://arxiv.org/abs/2509.23753
Authors: He Zhu, Junyou Su, Peng Lai, Ren Ma, Wenjia Zhang, Linyi Yang, Guanhua Chen
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Computation and Language (cs.CL)
Comments:

[NLP-159] Beyond Game Theory Optimal: Profit-Maximizing Poker Agents for No-Limit Holdem

Link: https://arxiv.org/abs/2509.23747
Authors: SeungHyun Yi, Seungjun Yi
Affiliations: Unknown
Categories: Computer Science and Game Theory (cs.GT); Computation and Language (cs.CL)
Comments:

[NLP-160] Compose and Fuse: Revisiting the Foundational Bottlenecks in Multimodal Reasoning

Link: https://arxiv.org/abs/2509.23744
Authors: Yucheng Wang, Yifan Hou, Aydin Javadov, Mubashara Akhtar, Mrinmaya Sachan
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: Our code (this https URL) and data (this https URL) are publicly available

[NLP-161] Do LLMs Understand Romanian Driving Laws? A Study on Multimodal and Fine-Tuned Question Answering

Link: https://arxiv.org/abs/2509.23715
Authors: Eduard Barbu, Adrian Marius Dumitran
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Machine Learning (cs.LG)
Comments: Accepted at CONSILR 2025, Bucharest, Romania, 9-10 October

[NLP-162] Collaboration of Fusion and Independence: Hypercomplex-driven Robust Multi-Modal Knowledge Graph Completion

Link: https://arxiv.org/abs/2509.23714
Authors: Zhiqiang Liu, Yichi Zhang, Mengshu Sun, Lei Liang, Wen Zhang
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

[NLP-163] VIVA: Human-Centered Situational Decision-Making EMNLP2025

Link: https://arxiv.org/abs/2509.23698
Authors: Zhe Hu, Yixiao Ren, Guanzhong Liu, Jing Li, Yu Yin
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments: EMNLP 2025 Findings

[NLP-164] SafeSearch: Automated Red-Teaming for the Safety of LLM-Based Search Agents

Link: https://arxiv.org/abs/2509.23694
Authors: Jianshuo Dong, Sheng Guo, Hao Wang, Zhuotao Liu, Tianwei Zhang, Ke Xu, Minlie Huang, Han Qiu
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Cryptography and Security (cs.CR)
Comments: Preprint

[NLP-165] HomeSafeBench: A Benchmark for Embodied Vision-Language Models in Free-Exploration Home Safety Inspection

Link: https://arxiv.org/abs/2509.23690
Authors: Siyuan Gao, Jiashu Yao, Haoyu Wen, Yuhang Guo, Zeming Liu, Heyan Huang
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
Comments:

[NLP-166] TF-Bench: Evaluating Program Semantics Reasoning with Type Inference in System F NEURIPS’25

Link: https://arxiv.org/abs/2509.23686
Authors: Yifeng He, Luning Yang, Christopher Castro Gaw Gonzalo, Hao Chen
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Programming Languages (cs.PL); Software Engineering (cs.SE)
Comments: NeurIPS '25, package released at: this https URL

[NLP-167] Towards a Comprehensive Scaling Law of Mixture-of-Experts

Link: https://arxiv.org/abs/2509.23678
Authors: Guoliang Zhao, Yuhan Fu, Shuaipeng Li, Xingwu Sun, Ruobing Xie, An Wang, Weidong Han, Zhen Yang, Weixuan Sun, Yudong Zhang, Cheng-zhong Xu, Di Wang, Jie Jiang
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-168] From Reasoning to Answer: Empirical Attention-Based and Mechanistic Insights into Distilled DeepSeek R1 Models EMNLP’25

Link: https://arxiv.org/abs/2509.23676
Authors: Jue Zhang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments: Accepted by EMNLP’25 (Main)

[NLP-169] RCI: A Score for Evaluating Global and Local Reasoning in Multimodal Benchmarks EMNLP2025

Link: https://arxiv.org/abs/2509.23673
Authors: Amit Agarwal, Hitesh Laxmichand Patel, Srikant Panda, Hansa Meghwani, Jyotika Singh, Karan Dua, Paul Li, Tao Sheng, Sujith Ravi, Dan Roth
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Multimedia (cs.MM)
Comments: Accepted in EMNLP 2025

[NLP-170] Aligning LLMs for Multilingual Consistency in Enterprise Applications EMNLP2025

Link: https://arxiv.org/abs/2509.23659
Authors: Amit Agarwal, Hansa Meghwani, Hitesh Laxmichand Patel, Tao Sheng, Sujith Ravi, Dan Roth
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: Accepted in EMNLP 2025

[NLP-171] Beyond English-Centric Training: How Reinforcement Learning Improves Cross-Lingual Reasoning in LLMs

Link: https://arxiv.org/abs/2509.23657
Authors: Shulin Huang, Yiran Ding, Junshu Pan, Yue Zhang
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

[NLP-172] Don't Settle Too Early: Self-Reflective Remasking for Diffusion Language Models

Link: https://arxiv.org/abs/2509.23653
Authors: Zemin Huang, Yuhang Wang, Zhiyang Chen, Guo-Jun Qi
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments: 24 pages, 11 figures

[NLP-173] From Past To Path: Masked History Learning for Next-Item Prediction in Generative Recommendation

Link: https://arxiv.org/abs/2509.23649
Authors: KaiWen Wei, Kejun He, Xiaomian Kang, Jie Zhang, Yuming Yang, Jiang Zhong, He Bai, Junnan Zhu
Affiliations: Unknown
Categories: Information Retrieval (cs.IR); Computation and Language (cs.CL)
Comments:

[NLP-174] Fast Thinking for Large Language Models

Link: https://arxiv.org/abs/2509.23633
Authors: Haoyu Zheng, Zhuonan Wang, Yuqian Yuan, Tianwei Lin, Wenqiao Zhang, Zheqi Lv, Juncheng Li, Siliang Tang, Yueting Zhuang, Hongyang He
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

[NLP-175] RIV: Recursive Introspection Mask Diffusion Vision Language Model

Link: https://arxiv.org/abs/2509.23625
Authors: YuQian Li, Limeng Qiao, Lin Ma
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Comments:

[NLP-176] Timber: Training-free Instruct Model Refining with Base via Effective Rank

Link: https://arxiv.org/abs/2509.23595
Authors: Taiqiang Wu, Runming Yang, Tao Liu, Jiahao Wang, Zenan Xu, Ngai Wong
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: 7 figures, 8 tables. Work in progress

[NLP-177] LLM Hallucination Detection: HSAD

Link: https://arxiv.org/abs/2509.23580
Authors: JinXin Li, Gang Tu, JunJie Hu
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments: In Chinese

[NLP-178] Jackal: A Real-World Execution-Based Benchmark Evaluating Large Language Models on Text-to-JQL Tasks

Link: https://arxiv.org/abs/2509.23579
Authors: Kevin Frank, Anmol Gulati, Elias Lumer, Sindy Campagna, Vamse Kumar Subbiah
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments: 17 pages

[NLP-179] Towards Efficient CoT Distillation: Self-Guided Rationale Selector for Better Performance with Fewer Rationales

Link: https://arxiv.org/abs/2509.23574
Authors: Jianzhi Yan, Le Liu, Youcheng Pan, Shiwei Chen, Yang Xiang, Buzhou Tang
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: 18 pages, 10 figures

[NLP-180] Clean First, Align Later: Benchmarking Preference Data Cleaning for Reliable LLM Alignment NEURIPS2025

Link: https://arxiv.org/abs/2509.23564
Authors: Min-Hsuan Yeh, Yixuan Li
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments: NeurIPS 2025

[NLP-181] Automatic Speech Recognition for Greek Medical Dictation

Link: https://arxiv.org/abs/2509.23550
Authors: Vardis Georgilas, Themos Stafylakis
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-182] On the Shelf Life of Fine-Tuned LLM Judges: Future Proofing, Backward Compatibility, and Question Generalization

Link: https://arxiv.org/abs/2509.23542
Authors: Janvijay Singh, Austin Xu, Yilun Zhou, Yefan Zhou, Dilek Hakkani-Tur, Shafiq Joty
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments: 21 pages

[NLP-183] From Human Annotation to Automation: LLM-in-the-Loop Active Learning for Arabic Sentiment Analysis

Link: https://arxiv.org/abs/2509.23515
Authors: Dania Refai, Alaa Dalaq, Doaa Dalaq, Irfan Ahmad
Affiliations: King Fahd University of Petroleum and Minerals (KFUPM)
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

[NLP-184] AraS2P: Arabic Speech-to-Phonemes System

Link: https://arxiv.org/abs/2509.23504
Authors: Bassam Matar (1), Mohamed Fayed (2 and 3), Ayman Khalafallah (2)
Affiliations: (1) Alexandria University; (2) Applied Innovation Center; (3) Georgia Institute of Technology
Categories: Computation and Language (cs.CL)
Comments:

[NLP-185] The Impact of Role Design in In-Context Learning for Large Language Models

Link: https://arxiv.org/abs/2509.23501
Authors: Hamidreza Rouzegar, Masoud Makrehchi
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: Code is available at this https URL

[NLP-186] Multi-modal Data Spectrum: Multi-modal Datasets are Multi-dimensional

Link: https://arxiv.org/abs/2509.23499
Authors: Divyam Madaan, Varshan Muhunthan, Kyunghyun Cho, Sumit Chopra
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
Comments:

[NLP-187] Mapping Overlaps in Benchmarks through Perplexity in the Wild

Link: https://arxiv.org/abs/2509.23488
Authors: Siyang Wu, Honglin Bao, Sida Li, Ari Holtzman, James A. Evans
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Comments:

[NLP-188] Temporal Generalization: A Reality Check

Link: https://arxiv.org/abs/2509.23487
Authors: Divyam Madaan, Sumit Chopra, Kyunghyun Cho
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
Comments:

[NLP-189] Text-Based Approaches to Item Difficulty Modeling in Large-Scale Assessments: A Systematic Review

Link: https://arxiv.org/abs/2509.23486
Authors: Sydney Peters, Nan Zhang, Hong Jiao, Ming Li, Tianyi Zhou, Robert Lissitz
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments: 45 pages, 9 figures

[NLP-190] MaskSQL: Safeguarding Privacy for LLM-Based Text-to-SQL via Abstraction NEURIPS2025

Link: https://arxiv.org/abs/2509.23459
Authors: Sepideh Abedini (1,2), Shubhankar Mohapatra (1), D. B. Emerson (2), Masoumeh Shafieinejad (2), Jesse C. Cresswell (3), Xi He (1,2)
Affiliations: (1) University of Waterloo; (2) Vector Institute; (3) Layer 6 AI
Categories: Cryptography and Security (cs.CR); Computation and Language (cs.CL)
Comments: Accepted to the NeurIPS 2025 Workshop on Regulatable Machine Learning (Regulatable ML @ NeurIPS 2025). Code available at this https URL

[NLP-191] FoR-SALE: Frame of Reference-guided Spatial Adjustment in LLM-based Diffusion Editing

Link: https://arxiv.org/abs/2509.23452
Authors: Tanawan Premsri, Parisa Kordjamshidi
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
Comments: 9 pages, 3 tables, 4 figures. Under review

[NLP-192] Cognition-of-Thought Elicits Social-Aligned Reasoning in Large Language Models

Link: https://arxiv.org/abs/2509.23441
Authors: Xuanming Zhang, Yuxuan Chen, Min-Hsuan Yeh, Yixuan Li
Affiliations: Unknown
Categories: Computation and Language (cs.CL)
Comments:

[NLP-193] SPIKE-RL: Video-LLMs meet Bayesian Surprise

Link: https://arxiv.org/abs/2509.23433
Authors: Sahithya Ravi, Aditya Chinchure, Raymond T. Ng, Leonid Sigal, Vered Shwartz
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
Comments: 10 pages, 4 figures, Code: this https URL

[NLP-194] Retrieval-Constrained Decoding Reveals Underestimated Parametric Knowledge in Language Models

Link: https://arxiv.org/abs/2509.23417
Authors: Rajaa El Hamdani, Samy Haffoudhi, Nils Holzenberger, Fabian Suchanek, Thomas Bonald, Fragkiskos D. Malliaros
Affiliations: Unknown
Categories: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Comments:

Computer Vision

[CV-0] VGGT-X: When VGGT Meets Dense Novel View Synthesis

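[Quick read]: This paper addresses two core challenges that arise when applying 3D foundation models (3DFMs) to dense Novel View Synthesis (NVS): a sharp increase in VRAM consumption, and imperfect outputs that degrade initialization-sensitive 3D training. To address them, the paper proposes the VGGT-X framework, built on three contributions: a memory-efficient VGGT implementation that scales to more than 1,000 images; an adaptive global alignment mechanism that improves the quality of VGGT outputs; and robust 3D Gaussian Splatting (3DGS) training strategies. Experiments show that these measures substantially narrow the fidelity gap with COLMAP-initialized methods and achieve state-of-the-art results in COLMAP-free dense NVS and pose estimation.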

Link: https://arxiv.org/abs/2509.25191
Authors: Yang Liu, Chuanchen Luo, Zimo Tang, Junran Peng, Zhaoxiang Zhang
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: Project Page: this https URL

Abstract: We study the problem of applying 3D Foundation Models (3DFMs) to dense Novel View Synthesis (NVS). Despite significant progress in Novel View Synthesis powered by NeRF and 3DGS, current approaches remain reliant on accurate 3D attributes (e.g., camera poses and point clouds) acquired from Structure-from-Motion (SfM), which is often slow and fragile in low-texture or low-overlap captures. Recent 3DFMs showcase orders of magnitude speedup over the traditional pipeline and great potential for online NVS. But most of the validation and conclusions are confined to sparse-view settings. Our study reveals that naively scaling 3DFMs to dense views encounters two fundamental barriers: dramatically increasing VRAM burden and imperfect outputs that degrade initialization-sensitive 3D training. To address these barriers, we introduce VGGT-X, incorporating a memory-efficient VGGT implementation that scales to 1,000+ images, an adaptive global alignment for VGGT output enhancement, and robust 3DGS training practices. Extensive experiments show that these measures substantially close the fidelity gap with COLMAP-initialized pipelines, achieving state-of-the-art results in dense COLMAP-free NVS and pose estimation. Additionally, we analyze the causes of remaining gaps with COLMAP-initialized rendering, providing insights for the future development of 3D foundation models and dense NVS. Our project page is available at this https URL

[CV-1] Visual Jigsaw Post-Training Improves MLLMs

Link: https://arxiv.org/abs/2509.25190
Authors: Penghao Wu, Yushan Zhang, Haiwen Diao, Bo Li, Lewei Lu, Ziwei Liu
Affiliations: S-Lab, Nanyang Technological University; Linköping University; SenseTime Research
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-2] FlashI2V: Fourier-Guided Latent Shifting Prevents Conditional Image Leakage in Image-to-Video Generation

Link: https://arxiv.org/abs/2509.25187
Authors: Yunyang Ge, Xinhua Cheng, Chengshu Zhao, Xianyi He, Shenghai Yuan, Bin Lin, Bin Zhu, Li Yuan
Affiliations: Peking University, Shenzhen Graduate School; Peng Cheng Laboratory; Rabbitpre AI
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-3] PixelCraft: A Multi-Agent System for High-Fidelity Visual Reasoning on Structured Images

Link: https://arxiv.org/abs/2509.25185
Authors: Shuoshuo Zhang, Zijian Li, Yizhen Zhang, Jingjing Fu, Lei Song, Jiang Bian, Jun Zhang, Yujiu Yang, Rui Wang
Affiliations: Microsoft Research; Tsinghua University; Hong Kong University of Science and Technology
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-4] PAD3R: Pose-Aware Dynamic 3D Reconstruction from Casual Videos SIGGRAPH

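[Quick read]: This paper tackles the reconstruction of deformable 3D objects from casually captured, unposed monocular videos, targeting in particular challenging long video sequences with substantial object deformation, large-scale camera movement, and limited view coverage. The key to the solution is the proposed PAD3R method, whose core is a personalized, object-centric pose estimator supervised by a pre-trained image-to-3D model, which guides the optimization of a deformable 3D Gaussian representation; the optimization is further regularized by long-term 2D point tracking over the entire input video. By combining generative priors with differentiable rendering, PAD3R achieves category-agnostic, high-fidelity, articulated 3D reconstruction, showing strong potential for dynamic scene understanding and 3D content creation.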

Link: https://arxiv.org/abs/2509.25183
Authors: Ting-Hsuan Liao, Haowen Liu, Yiran Xu, Songwei Ge, Gengshan Yang, Jia-Bin Huang
Affiliations: University of Maryland College Park
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: SIGGRAPH Asia 2025. Project page: this https URL

Abstract: We present PAD3R, a method for reconstructing deformable 3D objects from casually captured, unposed monocular videos. Unlike existing approaches, PAD3R handles long video sequences featuring substantial object deformation, large-scale camera movement, and limited view coverage that typically challenge conventional systems. At its core, our approach trains a personalized, object-centric pose estimator, supervised by a pre-trained image-to-3D model. This guides the optimization of deformable 3D Gaussian representation. The optimization is further regularized by long-term 2D point tracking over the entire input video. By combining generative priors and differentiable rendering, PAD3R reconstructs high-fidelity, articulated 3D representations of objects in a category-agnostic way. Extensive qualitative and quantitative results show that PAD3R is robust and generalizes well across challenging scenarios, highlighting its potential for dynamic scene understanding and 3D content creation.

[CV-5] DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder

Link: https://arxiv.org/abs/2509.25182
Authors: Junyu Chen, Wenkun He, Yuchao Gu, Yuyang Zhao, Jincheng Yu, Junsong Chen, Dongyun Zou, Yujun Lin, Zhekai Zhang, Muyang Li, Haocheng Xi, Ligeng Zhu, Enze Xie, Song Han, Han Cai
Affiliations: NVIDIA
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: Tech Report. The first three authors contributed equally to this work

[CV-6] DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space

Link: https://arxiv.org/abs/2509.25180
Authors: Wenkun He, Yuchao Gu, Junyu Chen, Dongyun Zou, Yujun Lin, Zhekai Zhang, Haocheng Xi, Muyang Li, Ligeng Zhu, Jincheng Yu, Junsong Chen, Enze Xie, Song Han, Han Cai
Affiliations: NVIDIA
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: Tech Report. The first three authors contributed equally to this work

[CV-7] GHOST: Hallucination-Inducing Image Generation for Multimodal LLMs

Link: https://arxiv.org/abs/2509.25178
Authors: Aryan Yazdan Parast, Parsa Hosseini, Hesam Asadollahzadeh, Arshia Soltani Moakhar, Basim Azam, Soheil Feizi, Naveed Akhtar
Affiliations: The University of Melbourne; University of Maryland
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments:

[CV-8] Mitigating Hallucination in Multimodal LLMs with Layer Contrastive Decoding

Link: https://arxiv.org/abs/2509.25177
Authors: Bingkui Tong, Jiaer Xia, Kaiyang Zhou
Affiliations: Mohamed bin Zayed University of Artificial Intelligence; Hong Kong Baptist University
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-9] Personalized Vision via Visual In-Context Learning

Link: https://arxiv.org/abs/2509.25172
Authors: Yuxin Jiang, Yuchao Gu, Yiren Song, Ivor Tsang, Mike Zheng Shou
Affiliations: Show Lab, National University of Singapore; A*STAR, Singapore
Categories: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Comments: Project page: this https URL

[CV-10] YOLO26: Key Architectural Enhancements and Performance Benchmarking for Real-Time Object Detection

Link: https://arxiv.org/abs/2509.25164
Authors: Ranjan Sapkota, Rahul Harsha Cheppally, Ajay Sharda, Manoj Karkee
Affiliations: Cornell University; Kansas State University
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-11] Aligning Visual Foundation Encoders to Tokenizers for Diffusion Models

Link: https://arxiv.org/abs/2509.25162
Authors: Bowei Chen, Sai Bi, Hao Tan, He Zhang, Tianyuan Zhang, Zhengqi Li, Yuanjun Xiong, Jianming Zhang, Kai Zhang
Affiliations: University of Washington; Adobe; Massachusetts Institute of Technology
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: Project Page: this https URL

[CV-12] Rolling Forcing: Autoregressive Long Video Diffusion in Real Time

Link: https://arxiv.org/abs/2509.25161
Authors: Kunhao Liu, Wenbo Hu, Jiale Xu, Ying Shan, Shijian Lu
Affiliations: Nanyang Technological University; ARC Lab, Tencent PCG
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: Project page: this https URL

[CV-13] VideoAnchor: Reinforcing Subspace-Structured Visual Cues for Coherent Visual-Spatial Reasoning

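[Quick read]: This paper addresses the limited visual-spatial reasoning ability of Multimodal Large Language Models (MLLMs). The root cause is that, in the attention mechanism, visual tokens are overshadowed by language tokens, so the model struggles to recognize the same visual cues consistently across video frames. The key to the solution is a plug-and-play module called VideoAnchor, which draws a new connection between the self-expressiveness property in sparse subspace clustering and the Transformer attention mechanism, and uses subspace affinities to reinforce visual cues across frames, thereby anchoring attention to shared visual structures and improving performance without retraining.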

Link: https://arxiv.org/abs/2509.25151
Authors: Zhaozhi Wang, Tong Zhang, Mingyue Guo, Yaowei Wang, Qixiang Ye
Affiliations: University of Chinese Academy of Sciences; Peng Cheng Lab
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 16 pages, 6 figures

Abstract:Multimodal Large Language Models (MLLMs) have achieved impressive progress in vision-language alignment, yet they remain limited in visual-spatial reasoning. We first identify that this limitation arises from the attention mechanism: visual tokens are overshadowed by language tokens, preventing the model from consistently recognizing the same visual cues across frames. To address this challenge, we draw a novel connection between the self-expressiveness property in sparse subspace clustering and the attention mechanism in Transformers. Building on this insight, we propose VideoAnchor, a plug-and-play module that leverages subspace affinities to reinforce visual cues across frames without retraining, effectively anchoring attention to shared visual structures. Extensive experiments across benchmarks and backbone models show consistent performance gains – e.g. , 3.2% and 4.6% improvements on VSI-Bench and Video-MME (spatial-related tasks) with InternVL2-8B and Qwen2.5VL-72B – while qualitative analyses demonstrate more coherent subspace partitions and stronger visual grounding. Our codes will be made public available at this https URL.

[CV-14] Fast Feature Field (F^3): A Predictive Representation of Events

Link: https://arxiv.org/abs/2509.25146
Authors: Richeek Das, Kostas Daniilidis, Pratik Chaudhari
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
Comments: 39 pages, 9 figures

[CV-15] Vision-and-Language Navigation with Analogical Textual Descriptions in LLMs

Link: https://arxiv.org/abs/2509.25139
Authors: Yue Zhang, Tianyi Ma, Zun Wang, Yanyuan Qiao, Parisa Kordjamshidi
Affiliations: Michigan State University; UNC Chapel Hill; University of Adelaide
Categories: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
Comments:

[CV-16] LayerD: Decomposing Raster Graphic Designs into Layers ICCV2025

Link: https://arxiv.org/abs/2509.25134
Authors: Tomoyuki Suzuki, Kang-Jun Liu, Naoto Inoue, Kota Yamaguchi
Affiliations: CyberAgent; Tohoku University
Categories: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
Comments: ICCV 2025, Project page: this https URL, GitHub: this https URL

[CV-17] Score Distillation of Flow Matching Models

Link: https://arxiv.org/abs/2509.25127
Authors: Mingyuan Zhou, Yi Gu, Huangjie Zheng, Liangchen Song, Guande He, Yizhe Zhang, Wenze Hu, Yinfei Yang
Affiliations: Apple; University of California, Berkeley
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments:

[CV-18] Triangle Splatting: Differentiable Rendering with Opaque Triangles

Link: https://arxiv.org/abs/2509.25122
Authors: Jan Held, Renaud Vandeghen, Sanghyun Son, Daniel Rebain, Matheus Gadelha, Yi Zhou, Ming C. Lin, Marc Van Droogenbroeck, Andrea Tagliasacchi
Affiliations: University of Liège; Simon Fraser University; University of Maryland; University of British Columbia; University of Toronto; Adobe Research
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 9 pages, 6 figures, 2 tables

[CV-19] Unsupervised Representation Learning for 3D Mesh Parameterization with Semantic and Visibility Objectives

Link: https://arxiv.org/abs/2509.25094
Authors: AmirHossein Zamani, Bruno Roy, Arianna Rampini
Affiliations: Autodesk Research; Mila – Quebec AI Institute; Concordia University
Categories: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-20] MANI-Pure: Magnitude-Adaptive Noise Injection for Adversarial Purification

Link: https://arxiv.org/abs/2509.25082
Authors: Xiaoyi Huang, Junwei Wu, Kejia Zhang, Carl Yang, Zhiming Luo
Affiliations: Xiamen University; Emory University
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-21] UniLat3D: Geometry-Appearance Unified Latents for Single-Stage 3D Generation

Link: https://arxiv.org/abs/2509.25079
Authors: Guanjun Wu, Jiemin Fang, Chen Yang, Sikuang Li, Taoran Yi, Jia Lu, Zanwei Zhou, Jiazhong Cen, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Xinggang Wang, Qi Tian
Affiliations: Huawei Inc.; Huazhong University of Science and Technology; Shanghai Jiaotong University
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
Comments: Project page: this https URL

[CV-22] BRIDGE - Building Reinforcement-Learning Depth-to-Image Data Generation Engine for Monocular Depth Estimation

Link: https://arxiv.org/abs/2509.25077
Authors: Dingning Liu, Haoyu Guo, Jingyi Zhou, Tong He
Affiliations: Shanghai Artificial Intelligence Laboratory; Fudan University
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: 20 pages, 7 figures

[CV-23] GEM: 3D Gaussian Splatting for Efficient and Accurate Cryo-EM Reconstruction

Link: https://arxiv.org/abs/2509.25075
Authors: Huaizhi Qu, Xiao Wang, Gengwei Zhang, Jie Peng, Tianlong Chen
Affiliations: University of North Carolina at Chapel Hill; University of Washington
Categories: Computer Vision and Pattern Recognition (cs.CV); Computational Engineering, Finance, and Science (cs.CE)
Comments:

[CV-24] CharGen: Fast and Fluent Portrait Modification

Link: https://arxiv.org/abs/2509.25058
Authors: Jan-Niklas Dihlmann, Arnela Killguss, Hendrik P.A. Lensch
Affiliations: University of Tübingen
Categories: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
Comments: Project page: this https URL

[CV-25] A Scalable Distributed Framework for Multimodal GigaVoxel Image Registration

Link: https://arxiv.org/abs/2509.25044
Authors: Rohit Jena, Vedant Zope, Pratik Chaudhari, James C. Gee
Affiliations: University of Pennsylvania
Categories: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
Comments:

[CV-26] Fast Real-Time Pipeline for Robust Arm Gesture Recognition

Link: https://arxiv.org/abs/2509.25042
Authors: Milán Zsolt Bagladi, László Gulyás, Gergő Szalay
Affiliations: ELTE Eötvös Loránd University Faculty of Informatics
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments:

[CV-27] VT-FSL: Bridging Vision and Text with LLMs for Few-Shot Learning NEURIPS2025

Link: https://arxiv.org/abs/2509.25033
Authors: Wenhao Li, Qiangchang Wang, Xianjing Meng, Zhibin Wu, Yilong Yin
Affiliations: Shandong University; Shenzhen Loop Area Institute; Shandong University of Finance and Economics
Categories: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Comments: Accepted by NeurIPS 2025

[CV-28] AIRoA MoMa Dataset: A Large-Scale Hierarchical Dataset for Mobile Manipulation

Link: https://arxiv.org/abs/2509.25032
Authors: Ryosuke Takanami, Petr Khrapchenkov, Shu Morikuni, Jumpei Arima, Yuta Takaba, Shunsuke Maeda, Takuya Okubo, Genki Sano, Satoshi Sekioka, Aoi Kadoya, Motonari Kambara, Naoya Nishiura, Haruto Suzuki, Takanori Yoshimoto, Koya Sakamoto, Shinnosuke Ono, Hu Yang, Daichi Yashima, Aoi Horo, Tomohiro Motoda, Kensuke Chiyoma, Hiroshi Ito, Koki Fukuda, Akihito Goto, Kazumi Morinaga, Yuya Ikeda, Riko Kawada, Masaki Yoshikawa, Norio Kosuge, Yuki Noguchi, Kei Ota, Tatsuya Matsushima, Yusuke Iwasawa, Yutaka Matsuo, Tetsuya Ogata
Affiliations: The University of Tokyo; AI Robot Association (AIRoA); Toyota Motor Corporation; Telexistence, Inc.; National Institute of Advanced Industrial Science and Technology (AIST); Waseda University
Categories: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-29] STAGE: Stable and Generalizable GRPO for Autoregressive Image Generation

Link: https://arxiv.org/abs/2509.25027
Authors: Xiaoxiao Ma, Haibo Qiu, Guohui Zhang, Zhixiong Zeng, Siqi Yang, Lin Ma, Feng Zhao
Affiliations: University of Science and Technology of China; Meituan
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: Code available at this https URL

[CV-30] GeoVLM-R1: Reinforcement Fine-Tuning for Improved Remote Sensing Reasoning

Link: https://arxiv.org/abs/2509.25026
Authors: Mustansar Fiaz, Hiyam Debary, Paolo Fraccaro, Danda Paudel, Luc Van Gool, Fahad Khan, Salman Khan
Affiliations: IBM Research; INSAIT; ETH Zürich; MBZUAI; Linköping University; ANU Australia
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 6 tables and 9 figures. this https URL

[CV-31] Uncertainty-Aware Deep Learning for Wildfire Danger Forecasting

Link: https://arxiv.org/abs/2509.25017
Authors: Spyros Kondylatos, Gustau Camps-Valls, Ioannis Papoutsis
Affiliations: unknown
Categories: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-32] CLASP: Adaptive Spectral Clustering for Unsupervised Per-Image Segmentation

Link: https://arxiv.org/abs/2509.25016
Authors: Max Curie, Paulo da Costa
Affiliations: Integral Ad Science
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments:

[CV-33] Score-based Membership Inference on Diffusion Models

Link: https://arxiv.org/abs/2509.25003
Authors: Mingxing Rao, Bowen Qu, Daniel Moyer
Affiliations: Vanderbilt University
Categories: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-34] LVT: Large-Scale Scene Reconstruction via Local View Transformers SIGGRAPH

Link: https://arxiv.org/abs/2509.25001
Authors: Tooba Imtiaz, Lucy Chai, Kathryn Heal, Xuan Luo, Jungyeon Park, Jennifer Dy, John Flynn
Affiliations: Google; Northeastern University
Categories: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Comments: SIGGRAPH Asia 2025 camera-ready version; project page this https URL

[CV-35] PanoWorld-X: Generating Explorable Panoramic Worlds via Sphere-Aware Video Diffusion

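【Quick read】: This paper addresses two challenges in panoramic video generation. First, prior methods are constrained by a narrow field of view and struggle to synthesize continuous, holistically consistent 360-degree scenes. Second, insufficient camera controllability restricts free exploration by users or autonomous agents. The key to the solution is the PanoWorld-X framework, with two core contributions: (1) a large-scale dataset of panoramic video and exploration-route pairs, built by simulating diverse camera trajectories in virtual 3D environments with Unreal Engine; and (2) a Sphere-Aware Diffusion Transformer architecture that reprojects equirectangular features onto the sphere to model geometric adjacency in latent space, significantly improving visual fidelity and spatiotemporal continuity.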

Link: https://arxiv.org/abs/2509.24997
Authors: Yuyang Yin, HaoXiang Guo, Fangfu Liu, Mengyu Wang, Hanwen Liang, Eric Li, Yikai Wang, Xiaojie Jin, Yao Zhao, Yunchao Wei
Affiliations: Beijing Jiaotong University; Skywork AI; Tsinghua University; University of Toronto; Beijing Normal University
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: Project page: this https URL

Abstract:Generating a complete and explorable 360-degree visual world enables a wide range of downstream applications. While prior works have advanced the field, they remain constrained by either narrow field-of-view limitations, which hinder the synthesis of continuous and holistic scenes, or insufficient camera controllability that restricts free exploration by users or autonomous agents. To address this, we propose PanoWorld-X, a novel framework for high-fidelity and controllable panoramic video generation with diverse camera trajectories. Specifically, we first construct a large-scale dataset of panoramic video-exploration route pairs by simulating camera trajectories in virtual 3D environments via Unreal Engine. As the spherical geometry of panoramic data misaligns with the inductive priors from conventional video diffusion, we then introduce a Sphere-Aware Diffusion Transformer architecture that reprojects equirectangular features onto the spherical surface to model geometric adjacency in latent space, significantly enhancing visual fidelity and spatiotemporal continuity. Extensive experiments demonstrate that our PanoWorld-X achieves superior performance in various aspects, including motion range, control precision, and visual quality, underscoring its potential for real-world applications.
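
To see why equirectangular features need a spherical view of adjacency, the sketch below maps pixel centres of an equirectangular image to points on the unit sphere and compares distances. It illustrates only the standard longitude/latitude mapping, not the Sphere-Aware Diffusion Transformer itself. The function name `equirect_to_sphere`, the axis convention, and the 512x256 image size are assumptions chosen for illustration.

```python
import numpy as np

def equirect_to_sphere(u, v, width, height):
    """Map equirectangular pixel centres (u, v) to unit vectors on the sphere.

    Assumed convention: u runs left to right over longitude [-pi, pi),
    v runs top to bottom over latitude [pi/2, -pi/2].
    """
    lon = (np.asarray(u, dtype=float) + 0.5) / width * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (np.asarray(v, dtype=float) + 0.5) / height * np.pi
    x = np.cos(lat) * np.cos(lon)
    y = np.cos(lat) * np.sin(lon)
    z = np.sin(lat)
    return np.stack([x, y, z], axis=-1)

W, H = 512, 256
left = equirect_to_sphere(0, H // 2, W, H)        # leftmost column, equator row
right = equirect_to_sphere(W - 1, H // 2, W, H)   # rightmost column, same row
mid = equirect_to_sphere(W // 2, H // 2, W, H)    # centre column, same row

seam_dist = np.linalg.norm(left - right)  # neighbours across the image seam
far_dist = np.linalg.norm(left - mid)     # half an image apart in pixel space
```

The leftmost and rightmost columns are `W - 1` pixels apart in the image yet nearly coincide on the sphere, while the centre column sits on the opposite side. Adjacency defined on the 2D grid misses this wrap-around, which is the kind of mismatch a sphere-aware design has to account for.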

[CV-36] Light-SQ: Structure-aware Shape Abstraction with Superquadrics for Generated Meshes SIGGRAPH

Link: https://arxiv.org/abs/2509.24986
Authors: Yuhan Wang, Weikai Chen, Zeyu Hu, Runze Zhang, Yingda Yin, Ruoyu Wu, Keyang Luo, Shengju Qian, Yiyan Ma, Hongyi Li, Yuan Gao, Yuhuan Zhou, Hao Luo, Wan Wang, Xiaobin Shen, Zhaowei Li, Kuixin Zhu, Chuanlang Hong, Yueyue Wang, Lijie Feng, Xin Wang, Chen Change Loy
Affiliations: S-Lab, Nanyang Technological University; LIGHTSPEED
Categories: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Comments: SIGGRAPH Asia 2025. Project Page this https URL

[CV-37] SDPose: Exploiting Diffusion Priors for Out-of-Domain and Robust Pose Estimation

Link: https://arxiv.org/abs/2509.24980
Authors: Shuang Liang, Jing He, Chuanmeizhi Wang, Lejun Liao, Guo Zhang, Yingcong Chen, Yuan Yuan
Affiliations: Rama Alpaca Technology Company; Boston College; HKUST(GZ); The University of Hong Kong; HKUST
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 18 pages, 9 figures, 9 tables

[CV-38] Wan-Alpha: High-Quality Text-to-Video Generation with Alpha Channel

Link: https://arxiv.org/abs/2509.24979
Authors: Haotian Dong, Wenjing Wang, Chen Li, Di Lin
Affiliations: Tianjin University
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-39] On-the-Fly Data Augmentation for Brain Tumor Segmentation

Link: https://arxiv.org/abs/2509.24973
Authors: Ishika Jain, Siri Willems, Steven Latre, Tom De Schepper
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-40] Event-based Facial Keypoint Alignment via Cross-Modal Fusion Attention and Self-Supervised Multi-Event Representation Learning

Link: https://arxiv.org/abs/2509.24968
Authors: Donghwa Kang, Junho Kim, Dongwoo Kang
Affiliations: Hongik University
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 11 pages, 7 figures

[CV-41] Social 3D Scene Graphs: Modeling Human Actions and Relations for Interactive Service Robots

Link: https://arxiv.org/abs/2509.24966
Authors: Ermanno Bartoli, Dennis Rotondi, Buwei He, Patric Jensfelt, Kai O. Arras, Iolanda Leite
Affiliations: Faculty of Robotics Perception and Learning, KTH Royal Institute of Technology, Stockholm, Sweden; Socially Intelligent Robotics Lab, Institute for Artificial Intelligence, University of Stuttgart, Germany
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-42] Evaluating Temperature Scaling Calibration Effectiveness for CNNs under Varying Noise Levels in Brain Tumour Detection

Link: https://arxiv.org/abs/2509.24951
Authors: Ankur Chanda, Kushan Choudhury, Shubhrodeep Roy, Shubhajit Biswas, Somenath Kuiry
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: Accepted and presented at the International Conference on Advancing Science and Technologies in Health Science

[CV-43] Perceive Reflect and Understand Long Video: Progressive Multi-Granular Clue Exploration with Interactive Agents

Link: https://arxiv.org/abs/2509.24943
Authors: Jiahua Li, Kun Wei, Zhe Xu, Zibo Su, Xu Yang, Cheng Deng
Affiliations: Xidian University; Hong Kong University of Science and Technology
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-44] Scalable GANs with Transformers

Link: https://arxiv.org/abs/2509.24935
Authors: Sangeek Hyun, MinKyu Lee, Jae-Pil Heo
Affiliations: Sungkyunkwan University
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments:

[CV-45] Segmentor-Guided Counterfactual Fine-Tuning for Image Synthesis MICCAI2025

Link: https://arxiv.org/abs/2509.24913
Authors: Tian Xia, Matthew Sinclair, Andreas Schuh, Fabio De Sousa Ribeiro, Raghav Mehta, Rajat Rasal, Esther Puyol-Antón, Samuel Gerber, Kersten Petersen, Michiel Schaap, Ben Glocker
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: Accepted at MICCAI 2025

[CV-46] Learning Goal-Oriented Language-Guided Navigation with Self-Improving Demonstrations at Scale

Link: https://arxiv.org/abs/2509.24910
Authors: Songze Li, Zun Wang, Gengze Zhou, Jialu Li, Xiangyu Zeng, Limin Wang, Yu Qiao, Qi Wu, Mohit Bansal, Yi Wang
Affiliations: Shanghai AI Laboratory; UNC Chapel Hill; Fudan University; The University of Adelaide; Nanjing University
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-47] DRCP: Diffusion on Reinforced Cooperative Perception for Perceiving Beyond Limits

Link: https://arxiv.org/abs/2509.24903
Authors: Lantao Li, Kang Yang, Rui Song, Chen Sun
Affiliations: Sony (China) Limited; Renmin University of China; Fraunhofer IVI
Categories: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
Comments:

[CV-48] OpenGPT-4o-Image: A Comprehensive Dataset for Advanced Image Generation and Editing

Link: https://arxiv.org/abs/2509.24900
Authors: Zhihong Chen, Xuehai Bai, Yang Shi, Chaoyou Fu, Huanyu Zhang, Haotian Wang, Xiaoyan Sun, Zhang Zhang, Liang Wang, Yuanxing Zhang, Pengfei Wan, Yi-Fan Zhang
Affiliations: USTC; HDU; PKU; NJU; CASIA; THU
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments:

[CV-49] Attention Surgery: An Efficient Recipe to Linearize Your Video Diffusion Transformer

Link: https://arxiv.org/abs/2509.24899
Authors: Mohsen Ghafoorian, Denis Korzhenkov, Amirhossein Habibian
Affiliations: Qualcomm AI Research
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-50] Accurate Cobb Angle Estimation via SVD-Based Curve Detection and Vertebral Wedging Quantification

Link: https://arxiv.org/abs/2509.24898
Authors: Chang Shi, Nan Meng, Yipeng Zhuang, Moxin Zhao, Jason Pui Yin Cheung, Hua Huang, Xiuyuan Chen, Cong Nie, Wenting Zhong, Guiqiang Jiang, Yuxin Wei, Jacob Hong Man Yu, Si Chen, Xiaowen Ou, Teng Zhang
Affiliations: The University of Hong Kong; Shanghai Jiao Tong University
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-51] DAM: Dual Active Learning with Multimodal Foundation Model for Source-Free Domain Adaptation

Link: https://arxiv.org/abs/2509.24896
Authors: Xi Chen, Hongxun Yao, Zhaopan Xu, Kui Jiang
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 5 pages

[CV-52] DWGS: Enhancing Sparse-View Gaussian Splatting with Hybrid-Loss Depth Estimation and Bidirectional Warping

Link: https://arxiv.org/abs/2509.24893
Authors: Yu Ma, Guoliang Wei, Yue Cheng
Affiliations: University of Shanghai for Science and Technology
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 14 pages, 21 figures

[CV-53] VAGUEGAN: Stealthy Poisoning and Backdoor Attacks on Image Generative Pipelines

Link: https://arxiv.org/abs/2509.24891
Authors: Mostafa Mohaimen Akand Faisal, Rabeya Amin Jhuma
Affiliations: University of Information Technology and Sciences (UITS)
Categories: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Comments:

[CV-54] Vehicle Classification under Extreme Imbalance: A Comparative Study of Ensemble Learning and CNNs

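【Quick read】: This paper addresses the degraded recognition of rare categories in vehicle type recognition for transportation and logistics, caused by severe class imbalance in public datasets. The key to the solution is to merge multiple data sources into a 16-class corpus of about 47k images and to generate six balanced variants by combining SMOTE oversampling with targeted undersampling. Lightweight ensemble models (Random Forest, AdaBoost, and a soft-voting combiner built on MobileNet-V2 features) are compared against a configurable ResNet-style convolutional neural network (CNN) trained with strong augmentation and label smoothing. The deep model reaches 79.19% accuracy on the full test set and 81.25% on an unseen inference batch, ahead of the best ensemble at 74.8% test accuracy. However, the rarest class, Barge, remains the performance bottleneck. This shows the limits of rebalancing alone and points to collecting more minority-class data and using cost-sensitive learning such as focal loss.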

Link: https://arxiv.org/abs/2509.24880
Authors: Abu Hanif Muhammad Syarubany
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments:

Abstract:Accurate vehicle type recognition underpins intelligent transportation and logistics, but severe class imbalance in public datasets suppresses performance on rare categories. We curate a 16-class corpus (~47k images) by merging Kaggle, ImageNet, and web-crawled data, and create six balanced variants via SMOTE oversampling and targeted undersampling. Lightweight ensembles, such as Random Forest, AdaBoost, and a soft-voting combiner built on MobileNet-V2 features are benchmarked against a configurable ResNet-style CNN trained with strong augmentation and label smoothing. The best ensemble (SMOTE-combined) attains 74.8% test accuracy, while the CNN achieves 79.19% on the full test set and 81.25% on an unseen inference batch, confirming the advantage of deep models. Nonetheless, the most under-represented class (Barge) remains a failure mode, highlighting the limits of rebalancing alone. Results suggest prioritizing additional minority-class collection and cost-sensitive objectives (e.g., focal loss) and exploring hybrid ensemble or CNN pipelines to combine interpretability with representational power.
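
The abstract names focal loss as one cost-sensitive objective worth trying for the rare classes. The sketch below shows the general idea in NumPy: the cross-entropy term of each sample is scaled by `(1 - p_t) ** gamma`, so well-classified samples contribute less. It is an illustration only, not the paper's training code. The function name `focal_loss`, the toy probabilities, and the optional per-class `alpha` weights are assumptions.

```python
import numpy as np

def focal_loss(probs, labels, gamma=2.0, alpha=None):
    """Mean multi-class focal loss from predicted class probabilities.

    probs:  (n_samples, n_classes) rows of softmax probabilities
    labels: (n_samples,) integer class indices
    gamma:  focusing parameter; gamma = 0 gives plain cross-entropy
    alpha:  optional per-class weights, e.g. larger for rare classes
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    # Probability assigned to the true class, clipped to avoid log(0).
    p_t = np.clip(probs[np.arange(len(labels)), labels], 1e-12, 1.0)
    weight = (1.0 - p_t) ** gamma
    if alpha is not None:
        weight = weight * np.asarray(alpha, dtype=float)[labels]
    return float(np.mean(-weight * np.log(p_t)))

# Toy predictions for a two-class problem (hypothetical values).
probs = np.array([[0.9, 0.1], [0.3, 0.7], [0.6, 0.4]])
labels = np.array([0, 1, 1])

ce = focal_loss(probs, labels, gamma=0.0)   # reduces to cross-entropy
fl = focal_loss(probs, labels, gamma=2.0)   # easy samples are down-weighted
```

With `gamma = 2`, the confidently correct first sample contributes almost nothing, and the misclassified third sample dominates the loss. Passing larger `alpha` values for minority classes such as Barge would add a class-level weighting on top of this per-sample focusing.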

[CV-55] ThermalGen: Style-Disentangled Flow-Based Generative Models for RGB-to-Thermal Image Translation NEURIPS2025

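【Quick read】: This paper addresses the scarcity of paired RGB-thermal (RGB-T) image data, which limits progress in visual-thermal sensor fusion and cross-modality tasks such as multi-modal image alignment and retrieval. To overcome this, the authors propose ThermalGen, an adaptive flow-based generative model whose key innovations are an RGB image conditioning architecture and a style-disentangled mechanism. These allow it to synthesize, from abundant RGB data, thermal images that reflect significant variations in viewpoint, sensor characteristics, and environmental conditions, yielding high-quality RGB-to-thermal image translation.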

Link: https://arxiv.org/abs/2509.24878
Authors: Jiuhong Xiao, Roshan Nayak, Ning Zhang, Daniel Tortei, Giuseppe Loianno
Affiliations: New York University; Technology Innovation Institute; University of California, Berkeley
Categories: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Comments: 23 pages including the checklist and appendix. Accepted at NeurIPS 2025

Abstract:Paired RGB-thermal data is crucial for visual-thermal sensor fusion and cross-modality tasks, including important applications such as multi-modal image alignment and retrieval. However, the scarcity of synchronized and calibrated RGB-thermal image pairs presents a major obstacle to progress in these areas. To overcome this challenge, RGB-to-Thermal (RGB-T) image translation has emerged as a promising solution, enabling the synthesis of thermal images from abundant RGB datasets for training purposes. In this study, we propose ThermalGen, an adaptive flow-based generative model for RGB-T image translation, incorporating an RGB image conditioning architecture and a style-disentangled mechanism. To support large-scale training, we curated eight public satellite-aerial, aerial, and ground RGB-T paired datasets, and introduced three new large-scale satellite-aerial RGB-T datasets–DJI-day, Bosonplus-day, and Bosonplus-night–captured across diverse times, sensor types, and geographic regions. Extensive evaluations across multiple RGB-T benchmarks demonstrate that ThermalGen achieves comparable or superior translation performance compared to existing GAN-based and diffusion-based methods. To our knowledge, ThermalGen is the first RGB-T image translation model capable of synthesizing thermal images that reflect significant variations in viewpoints, sensor characteristics, and environmental conditions. Project page: this http URL

[CV-56] Environment-Aware Satellite Image Generation with Diffusion Models

Link: https://arxiv.org/abs/2509.24875
Authors: Nikos Kostagiolas, Pantelis Georgiades, Yannis Panagakis, Mihalis A. Nicolaou
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Comments:

[CV-57] StreamForest: Efficient Online Video Understanding with Persistent Event Memory NEURIPS2025

Link: https://arxiv.org/abs/2509.24871
Authors: Xiangyu Zeng, Kefan Qiu, Qingyu Zhang, Xinhao Li, Jing Wang, Jiaxin Li, Ziang Yan, Kun Tian, Meng Tian, Xinhai Zhao, Yi Wang, Limin Wang
Affiliations: Nanjing University; Shanghai AI Laboratory; Zhejiang University; Noah’s Ark Lab, Huawei; Yinwang Intelligent Tech.
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: Accepted as a Spotlight at NeurIPS 2025

[CV-58] Vision At Night: Exploring Biologically Inspired Preprocessing For Improved Robustness Via Color And Contrast Transformations ICCV2025

Link: https://arxiv.org/abs/2509.24863
Authors: Lorena Stracke, Lia Nimmermann, Shashank Agnihotri, Margret Keuper, Volker Blanz
Affiliations: Media Systems, University of Siegen, Germany; Data and Web Science Group, University of Mannheim, Germany; Max-Planck-Institute for Informatics, Saarland Informatics Campus, Germany
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: Accepted at the ICCV 2025 Workshop on Responsible Imaging

[CV-59] ELPG-DTFS: Prior-Guided Adaptive Time-Frequency Graph Neural Network for EEG Depression Diagnosis

Link: https://arxiv.org/abs/2509.24860
Authors: Jingru Qiu, Jiale Liang, Xuanhan Fan, Mingda Zhang, Zhenli He
Affiliations: Yunnan University; Beijing Institute of Technology
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 8 pages, 3 figures

[CV-60] PHASE-Net: Physics-Grounded Harmonic Attention System for Efficient Remote Photoplethysmography Measurement

Link: https://arxiv.org/abs/2509.24850
Authors: Bo Zhao, Dan Guo, Junzhe Cao, Yong Xu, Tao Tan, Yue Sun, Bochao Zou, Jie Zhang, Zitong Yu
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-61] Training-Free Token Pruning via Zeroth-Order Gradient Estimation in Vision-Language Models

Link: https://arxiv.org/abs/2509.24837
Authors: Youngeun Kim, Youjia Zhang, Huiling Liu, Aecheon Jung, Sunwoo Lee, Sungeun Hong
Affiliations: Amazon; Sungkyunkwan University; Inha University
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-62] Of-SemWat: High-payload text embedding for semantic watermarking of AI-generated images with arbitrary size

Link: https://arxiv.org/abs/2509.24823
Authors: Benedetta Tondi, Andrea Costanzo, Mauro Barni
Affiliations: unknown
Categories: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Comments: 5 pages, 2 figures

[CV-63] UP2You: Fast Reconstruction of Yourself from Unconstrained Photo Collections

Link: https://arxiv.org/abs/2509.24817
Authors: Zeyu Cai, Ziyang Li, Xiaoben Li, Boqian Li, Zeyu Wang, Zhenyu Zhang, Yuliang Xiu
Affiliations: Westlake University; Nanjing University; The Hong Kong University of Science and Technology (Guangzhou)
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: Page: this https URL

[CV-64] TACO-Net: Topological Signatures Triumph in 3D Object Classification

Link: https://arxiv.org/abs/2509.24802
Authors: Anirban Ghosh, Ayan Dutta
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Computational Geometry (cs.CG); Machine Learning (cs.LG)
Comments:

[CV-65] Causal-Adapter: Taming Text-to-Image Diffusion for Faithful Counterfactual Generation

Link: https://arxiv.org/abs/2509.24798
Authors: Lei Tong, Zhihua Liu, Chaochao Lu, Dino Oglic, Tom Diethe, Philip Teare, Sotirios A. Tsaftaris, Chen Jin
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: 9 pages, 26 figures

[CV-66] Vision Function Layer in Multimodal LLMs NEURIPS2025

Link: https://arxiv.org/abs/2509.24791
Authors: Cheng Shi, Yizhou Yu, Sibei Yang
Affiliations: Sun Yat-sen University; School of Computing and Data Science, The University of Hong Kong
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: Accepted at NeurIPS 2025 (preview; camera-ready in preparation)

[CV-67] LOVE-R1: Advancing Long Video Understanding with an Adaptive Zoom-in Mechanism via Multi-Step Reasoning

Link: https://arxiv.org/abs/2509.24786
Authors: Shenghao Fu, Qize Yang, Yuan-Ming Li, Xihan Wei, Xiaohua Xie, Wei-Shi Zheng
Affiliations: Sun Yat-sen University; Alibaba Group; Peng Cheng Laboratory; Ministry of Education; Guangdong Province Key Laboratory of Information Security Technology; Pazhou Laboratory
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-68] SkyLink: Unifying Street-Satellite Geo-Localization via UAV-Mediated 3D Scene Alignment

Link: https://arxiv.org/abs/2509.24783
Authors: Hongyang Zhang, Yinhao Liu, Zhenyu Kuang
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
Comments:

[CV-69] VTPerception-R1: Enhancing Multimodal Reasoning via Explicit Visual and Textual Perceptual Grounding

Link: https://arxiv.org/abs/2509.24776
Authors: Yizhuo Ding, Mingkang Chen, Zhibang Feng, Tong Xiao, Wanying Qu, Wenqi Shao, Yanwei Fu
Affiliations: Fudan University; Shanghai AI Laboratory; The University of Hong Kong; Shenzhen University; University of Science and Technology of China
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments:

[CV-70] ExGS: Extreme 3D Gaussian Compression with Diffusion Priors

Link: https://arxiv.org/abs/2509.24758
Authors: Jiaqi Chen, Xinhao Ji, Yuanyuan Gao, Hao Li, Yuning Gong, Yifei Liu, Dan Xu, Zhihang Zhong, Dingwen Zhang, Xiao Sun
Affiliations: Northwestern Polytechnical University; Shanghai Artificial Intelligence Laboratory; Hong Kong University of Science and Technology
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-71] Collaborating Vision Depth and Thermal Signals for Multi-Modal Tracking: Dataset and Algorithm

Link: https://arxiv.org/abs/2509.24741
Authors: Xue-Feng Zhu, Tianyang Xu, Yifan Pan, Jinjie Gu, Xi Li, Jiwen Lu, Xiao-Jun Wu, Josef Kittler
Affiliations: Jiangnan University; Zhejiang University; Tsinghua University; University of Surrey
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-72] Toward a Vision-Language Foundation Model for Medical Data: Multimodal Dataset and Benchmarks for Vietnamese PET/CT Report Generation NEURIPS2025

Link: https://arxiv.org/abs/2509.24739
Authors: Huu Tien Nguyen, Dac Thai Nguyen, TheMinh Duc Nguyen, Trung Thanh Nguyen, Thao Nguyen Truong, Huy Hieu Pham, Johan Barthelemy, Minh Quan Tran, Thanh Tam Nguyen, Quoc Viet Hung Nguyen, Quynh Anh Chau, Hong Son Mai, Thanh Trung Nguyen, Phi Le Nguyen
Affiliations: AI4LIFE; Hanoi University of Science and Technology; Nagoya University; AIST; VinUniversity; NVIDIA; Griffith University; Hanoi Medical University; 108 Military Central Hospital
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 39th Conference on Neural Information Processing Systems (NeurIPS 2025)

[CV-73] A TRIANGLE Enables Multimodal Alignment Beyond Cosine Similarity NEURIPS2025

Link: https://arxiv.org/abs/2509.24734
Authors: Giordano Cicchetti, Eleonora Grassucci, Danilo Comminiello
Affiliations: Sapienza University of Rome
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Comments: NeurIPS 2025

[CV-74] Evaluation of Polarimetric Fusion for Semantic Segmentation in Aquatic Environments

Link: https://arxiv.org/abs/2509.24731
Authors: Luis F. W. Batista, Tom Bourbon, Cedric Pradalier
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Comments: Accepted to VCIP 2025

[CV-75] IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video?

Link: https://arxiv.org/abs/2509.24709
Authors: Yang Chen, Minghao Liu, Yufan Shen, Yunwen Li, Tianyuan Huang, Xinyu Fang, Tianyu Zheng, Wenxuan Huang, Cheng Yang, Daocheng Fu, Jianbiao Mei, Rong Wu, Licheng Wen, Xuemeng Yang, Song Mao, Qunshu Lin, Zhi Yu, Yongliang Shen, Yu Qiao, Botian Shi
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-76] Enhancing Physical Plausibility in Video Generation by Reasoning the Implausibility

Link: https://arxiv.org/abs/2509.24702
Authors: Yutong Hao, Chen Chen, Ajmal Saeed Mian, Chang Xu, Daochang Liu
Affiliations: University of Western Australia; University of Sydney
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-77] SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer

Link: https://arxiv.org/abs/2509.24695
Authors: Junsong Chen, Yuyang Zhao, Jincheng Yu, Ruihang Chu, Junyu Chen, Shuai Yang, Xianbang Wang, Yicheng Pan, Daquan Zhou, Huan Ling, Haozhe Liu, Hongwei Yi, Hao Zhang, Muyang Li, Yukang Chen, Han Cai, Sanja Fidler, Ping Luo, Song Han, Enze Xie
Affiliations: NVIDIA; HKU; MIT; THU; PKU; KAUST
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: 21 pages, 15 figures, 7 tables

[CV-78] Traumatic Brain Injury Segmentation using an Ensemble of Encoder-decoder Models

Link: https://arxiv.org/abs/2509.24684
Authors: Ghanshyam Dhamat, Vaanathi Sundaresan
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 9 pages, 4 figures, and 1 table

[CV-79] Classifier-Centric Adaptive Framework for Open-Vocabulary Camouflaged Object Segmentation

Link: https://arxiv.org/abs/2509.24681
Authors: Hanyu Zhang, Yiming Zhou, Jinxia Zhang
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-80] CEDex: Cross-Embodiment Dexterous Grasp Generation at Scale from Human-like Contact Representations

Link: https://arxiv.org/abs/2509.24661
Authors: Zhiyuan Wu, Rolandos Alexandros Potamias, Xuyang Zhang, Zhongqun Zhang, Jiankang Deng, Shan Luo
Affiliations: King’s College London; Imperial College London; Nankai University
Categories: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-81] VNODE: A Piecewise Continuous Volterra Neural Network

Link: https://arxiv.org/abs/2509.24659
Authors: Siddharth Roheda, Aniruddha Bala, Rohit Chowdhury, Rohan Jaiswal
Affiliations: unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: 5 pages

[CV-82] Learning Object-Centric Representations Based on Slots in Real World Scenarios

链接: https://arxiv.org/abs/2509.24652
作者: Adil Kaan Akan
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: PhD Thesis, overlap with arXiv:2507.20855 and arXiv:2501.15878

点击查看摘要

[CV-83] RIFLE: Removal of Image Flicker-Banding via Latent Diffusion Enhancement

链接: https://arxiv.org/abs/2509.24644
作者: Libo Zhu,Zihan Zhou,Xiaoyang Liu,Weihang Zhang,Keyu Shi,Yifan Fu,Yulun Zhang
机构: Shanghai Jiao Tong University (上海交通大学); HUAWEI (华为)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-84] Can you SPLICE it together? A Human Curated Benchmark for Probing Visual Reasoning in VLMs

链接: https://arxiv.org/abs/2509.24640
作者: Mohamad Ballout,Okajevo Wilfred,Seyedalireza Yaghoubi,Nohayr Muhammad Abdelmoneim,Julius Mayer,Elia Bruni
机构: Osnabrück University (奥斯纳布吕克大学)
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-85] FreeRet: MLLMs as Training-Free Retrievers

链接: https://arxiv.org/abs/2509.24621
作者: Yuhan Zhu,Xiangyu Zeng,Chenting Wang,Xinhao Li,Yicheng Xu,Ziang Yan,Yi Wang,Limin Wang
机构: Nanjing University (南京大学); Shanghai AI Laboratory (上海人工智能实验室); Shanghai Jiaotong University (上海交通大学); Institute of Science Tokyo (东京科学研究所); Zhejiang University (浙江大学)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-86] Biomechanical-phase based Temporal Segmentation in Sports Videos: a Demonstration on Javelin-Throw

链接: https://arxiv.org/abs/2509.24606
作者: Bikash Kumar Badatya,Vipul Baghel,Jyotirmoy Amin,Ravi Hegde
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: This paper has been accepted at the IEEE STAR Workshop 2025

点击查看摘要

[CV-87] Discovering “Words” in Music: Unsupervised Learning of Compositional Sparse Code for Symbolic Music

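【Quick read】: This paper addresses the problem of automatically identifying recurring patterns (called "music-words") in symbolic music data. These patterns are the basic units of musical structure and reflect the cognitive mechanisms at work in composition, but they are hard to extract because of the semantic ambiguity of musical interpretation. The key to the solution is to formulate music-word discovery as a statistical optimization problem and to propose a two-stage learning framework based on Expectation-Maximization (EM): first build a music-word dictionary, then reconstruct the music data. Minimizing code length effectively mitigates the semantic ambiguity, enabling automatic extraction of the basic building blocks of music and supporting structural analysis and sparse coding.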

链接: https://arxiv.org/abs/2509.24603
作者: Tianle Wang,Sirui Zhang,Xinyi Tong,Peiyang Yu,Jishang Chen,Liangke Zhao,Xinpu Gao,Yves Zhu,Tiezheng Ge,Bo Zheng,Duo Xu,Yang Liu,Xin Jin,Feng Yu,Songchun Zhu
机构: Central Conservatory of Music (中央音乐学院); Ajou University (亚洲大学); Peking University (北京大学); Alibaba Group (阿里巴巴集团); BIGAI (北京通用人工智能研究院)
类目: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

Abstract: This paper presents an unsupervised machine learning algorithm that identifies recurring patterns, referred to as "music-words", from symbolic music data. These patterns are fundamental to musical structure and reflect the cognitive processes involved in composition. However, extracting these patterns remains challenging because of the inherent semantic ambiguity in musical interpretation. We formulate the task of music-word discovery as a statistical optimization problem and propose a two-stage Expectation-Maximization (EM)-based learning framework: 1. Developing a music-word dictionary; 2. Reconstructing the music data. When evaluated against human expert annotations, the algorithm achieved an Intersection over Union (IoU) score of 0.61. Our findings indicate that minimizing code length effectively addresses semantic ambiguity, suggesting that human optimization of encoding systems shapes musical semantics. This approach enables computers to extract "basic building blocks" from music data, facilitating structural analysis and sparse encoding. The method has two primary applications. First, in AI music, it supports downstream tasks such as music generation, classification, style transfer, and improvisation. Second, in musicology, it provides a tool for analyzing compositional patterns and offers insights into the principle of minimal encoding across diverse musical styles and composers.
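
The abstract's idea that minimizing code length resolves ambiguity between competing segmentations can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a fixed, hypothetical dictionary `toy_dictionary` with made-up probabilities, assigns each word the Shannon code length of -log2 p bits, and uses dynamic programming to find the cheapest way to cover a toy pitch sequence. In an EM-style loop, a step like this would play the role of the reconstruction stage, with the dictionary re-estimated afterwards.

```python
import math


def min_code_length_segmentation(sequence, word_probs):
    """Cover `sequence` with dictionary words so the total code length is minimal.

    Each word w costs -log2 p(w) bits. Returns (total_bits, list_of_words).
    """
    n = len(sequence)
    best = [math.inf] * (n + 1)  # best[i]: fewest bits needed to encode sequence[:i]
    back = [None] * (n + 1)      # back[i]: start index of the last word ending at i
    best[0] = 0.0
    for end in range(1, n + 1):
        for word, prob in word_probs.items():
            start = end - len(word)
            if start >= 0 and tuple(sequence[start:end]) == word:
                cost = best[start] - math.log2(prob)
                if cost < best[end]:
                    best[end] = cost
                    back[end] = start
    if math.isinf(best[n]):
        raise ValueError("sequence cannot be covered by the dictionary")
    # Walk the back-pointers to recover the chosen words in order.
    words, end = [], n
    while end > 0:
        start = back[end]
        words.append(tuple(sequence[start:end]))
        end = start
    return best[n], words[::-1]


# Toy data (assumption): MIDI pitches and a hand-made dictionary with one 3-note motif.
melody = [60, 62, 64, 60, 62, 64, 67]
toy_dictionary = {
    (60, 62, 64): 0.5,
    (60,): 0.125,
    (62,): 0.125,
    (64,): 0.125,
    (67,): 0.125,
}
bits, segmentation = min_code_length_segmentation(melody, toy_dictionary)
print(bits, segmentation)
```

With these toy probabilities the motif costs 1 bit and each single note costs 3 bits, so reusing the motif twice is far cheaper than spelling out every note, which is the sense in which a shorter code favors recurring patterns.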

[CV-88] Comprehensive Benchmarking of YOLOv11 Architectures for Scalable and Granular Peripheral Blood Cell Detection

链接: https://arxiv.org/abs/2509.24595
作者: Mohamad Abou Ali,Mariam Abdulfattah,Baraah Al Hussein,Fadi Dornaika,Ali Cherry,Mohamad Hajj-Hassan,Lara Hamawy
机构: University of the Basque Country (巴斯克大学); Lebanese International University (黎巴嫩国际大学); The International University of Beirut (贝鲁特国际大学); IKERBASQUE (IKERBASQUE)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-89] SAIP: A Plug-and-Play Scale-adaptive Module in Diffusion-based Inverse Problems

链接: https://arxiv.org/abs/2509.24580
作者: Lingyu Wang,Xiangming Meng
机构: 未知
类目: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-90] BFSM: 3D Bidirectional Face-Skull Morphable Model

链接: https://arxiv.org/abs/2509.24577
作者: Zidu Wang,Meng Xu,Miao Xu,Hengyuan Ma,Jiankuo Zhao,Xutao Li,Xiangyu Zhu,Zhen Lei
机构: State Key Laboratory of Multimodal Artificial Intelligence Systems (多模态人工智能系统国家重点实验室); Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所); School of Artificial Intelligence, University of Chinese Academy of Sciences (中国科学院大学人工智能学院); Plastic Surgery Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College (中国医学科学院整形外科医院); Centre for Artificial Intelligence and Robotics, Hong Kong Institute of Science & Innovation, Chinese Academy of Sciences (中国科学院香港科学与创新研究院人工智能与机器人中心)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: Under review

点击查看摘要

[CV-91] SCOPE: Semantic Conditioning for Sim2Real Category-Level Object Pose Estimation in Robotics

链接: https://arxiv.org/abs/2509.24572
作者: Peter Hönig,Stefan Thalhammer,Jean-Baptiste Weibel,Matthias Hirschmanner,Markus Vincze
机构: TU Wien (维也纳工业大学); UAS Technikum Vienna (维也纳应用科技大学); BOKU University Vienna (维也纳农业大学)
类目: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
备注:

点击查看摘要

[CV-92] TokenSwap: Backdoor Attack on the Compositional Understanding of Large Vision-Language Models

链接: https://arxiv.org/abs/2509.24566
作者: Zhifang Zhang,Qiqi Tao,Jiaqi Lv,Na Zhao,Lei Feng,Joey Tianyi Zhou
机构: Southeast University (东南大学); Singapore University of Technology and Design (新加坡科技设计大学); A*STAR Centre for Frontier AI Research (CFAR) (新加坡科技研究局前沿人工智能研究中心)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-93] Foggy Crowd Counting: Combining Physical Priors and KAN-Graph

链接: https://arxiv.org/abs/2509.24545
作者: Yuhao Wang,Zhuoran Zheng,Han Hu,Dianjie Lu,Guijuan Zhang,Chen Lyu
机构: Shandong Normal University (山东师范大学); Sun Yat-sen University (中山大学)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-94] Diffusion Bridge or Flow Matching? A Unifying Framework and Comparative Analysis

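【Quick read】: This paper addresses the unclear comparison between two generative approaches, Diffusion Bridge and Flow Matching, in both theoretical foundations and practical performance; in particular, their relative advantages under different training-data sizes and distribution discrepancies have remained unclear. The key to the solution is that, for the first time, the two are analyzed from two unified theoretical perspectives, Stochastic Optimal Control and Optimal Transport: the paper proves that the cost function of Diffusion Bridge is lower, guiding the system toward more stable and natural trajectories, and shows that the interpolation coefficients $t$ and $1-t$ of Flow Matching become markedly less effective in small-sample settings. To validate the theory, the authors design a new Diffusion Bridge architecture built on a latent Transformer and implement a Flow Matching model with the same structure, enabling a fair comparison across six types of tasks including image inpainting, super-resolution, denoising, and style transfer. The experimental evidence fully supports the theoretical predictions and clearly delineates where each method applies.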

链接: https://arxiv.org/abs/2509.24531
作者: Kaizhen Zhu,Mokai Pan,Zhechuan Yu,Jingya Wang,Jingyi Yu,Ye Shi
机构: ShanghaiTech University (上海科技大学); MoE Key Laboratory of Intelligent Perception and Human Machine Collaboration (教育部智能感知与人机协同重点实验室)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

Abstract:Diffusion Bridge and Flow Matching have both demonstrated compelling empirical performance in transformation between arbitrary distributions. However, there remains confusion about which approach is generally preferable, and the substantial discrepancies in their modeling assumptions and practical implementations have hindered a unified theoretical account of their relative merits. We have, for the first time, provided a unified theoretical and experimental validation of these two models. We recast their frameworks through the lens of Stochastic Optimal Control and prove that the cost function of the Diffusion Bridge is lower, guiding the system toward more stable and natural trajectories. Simultaneously, from the perspective of Optimal Transport, interpolation coefficients t and 1-t of Flow Matching become increasingly ineffective when the training data size is reduced. To corroborate these theoretical claims, we propose a novel, powerful architecture for Diffusion Bridge built on a latent Transformer, and implement a Flow Matching model with the same structure to enable a fair performance comparison in various experiments. Comprehensive experiments are conducted across Image Inpainting, Super-Resolution, Deblurring, Denoising, Translation, and Style Transfer tasks, systematically varying both the distributional discrepancy (different difficulty) and the training data size. Extensive empirical results align perfectly with our theoretical predictions and allow us to delineate the respective advantages and disadvantages of these two models. Our code is available at this https URL.
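
For readers unfamiliar with the two constructions being compared, the following minimal sketch contrasts the intermediate states they train on. It is written under stated assumptions and is not the paper's formulation: the Flow Matching part uses the common linear interpolant with coefficients 1-t and t, and the bridge part uses a standard Brownian bridge with a hypothetical noise scale `sigma`; the function names are illustrative only.

```python
import math
import random


def flow_matching_pair(x0, x1, t):
    """Linear interpolant x_t = (1 - t) * x0 + t * x1 and its velocity target x1 - x0."""
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    velocity_target = [b - a for a, b in zip(x0, x1)]
    return x_t, velocity_target


def brownian_bridge_sample(x0, x1, t, sigma, rng):
    """Draw x_t from a Brownian bridge pinned at x0 (t = 0) and x1 (t = 1).

    The mean is the same linear interpolant; the noise standard deviation
    sigma * sqrt(t * (1 - t)) vanishes at both endpoints.
    """
    std = sigma * math.sqrt(t * (1.0 - t))
    return [(1.0 - t) * a + t * b + std * rng.gauss(0.0, 1.0) for a, b in zip(x0, x1)]


rng = random.Random(0)
x0 = [0.0, 0.0, 0.0, 0.0]  # toy "source" sample, e.g. a degraded image flattened to a vector
x1 = [1.0, 1.0, 1.0, 1.0]  # toy "target" sample
x_t, v = flow_matching_pair(x0, x1, 0.25)
bridge_start = brownian_bridge_sample(x0, x1, 0.0, sigma=1.0, rng=rng)
bridge_mid = brownian_bridge_sample(x0, x1, 0.5, sigma=1.0, rng=rng)
bridge_end = brownian_bridge_sample(x0, x1, 1.0, sigma=1.0, rng=rng)
```

The deterministic interpolant depends only on the paired samples, while the bridge adds noise along the path that is largest at the midpoint and zero at the endpoints; the abstract's claims about cost and small-data behavior concern how these two choices behave in training.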

[CV-95] CORE-3D: Context-aware Open-vocabulary Retrieval by Embeddings in 3D ICLR2026

链接: https://arxiv.org/abs/2509.24528
作者: Mohamad Amin Mirzaei,Pantea Amoie,Ali Ekhterachian,Matin Mirzababaei
机构: Sharif University of Technology (谢里夫理工大学)
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注: 9 pages without the references, 4 figures, submitted for ICLR 2026 conference

点击查看摘要

[CV-96] CMT: Mid-Training for Efficient Learning of Consistency Mean Flow and Flow Map Models

链接: https://arxiv.org/abs/2509.24526
作者: Zheyuan Hu,Chieh-Hsin Lai,Yuki Mitsufuji,Stefano Ermon
机构: Sony AI(索尼人工智能); Sony Group Corporation(索尼集团); Stanford University(斯坦福大学)
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
备注: Preprint

点击查看摘要

[CV-97] Instruction Guided Multi Object Image Editing with Quantity and Layout Consistency

链接: https://arxiv.org/abs/2509.24514
作者: Jiaqi Tan,Fangyu Li,Yang Liu
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-98] Robust Multimodal Semantic Segmentation with Balanced Modality Contributions

链接: https://arxiv.org/abs/2509.24505
作者: Jiaqi Tan,Xu Zheng,Fangyu Li,Yang Liu
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-99] Mitigating Visual Hallucinations via Semantic Curriculum Preference Optimization in MLLMs

链接: https://arxiv.org/abs/2509.24491
作者: Yuanshuai Li,Yuping Yan,Junfeng Tang,Yunxuan Li,Zeqi Zheng,Yaochu Jin
机构: Westlake University (西湖大学); Nanjing University of Science and Technology (南京理工大学)
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-100] Performance-Efficiency Trade-off for Fashion Image Retrieval

链接: https://arxiv.org/abs/2509.24477
作者: Julio Hurtado,Haoran Ni,Duygu Sap,Connor Mattinson,Martin Lotz
机构: CAMaCS, University of Warwick (CAMaCS，华威大学); TRUSS; Mathematics Institute, University of Warwick (数学研究所，华威大学)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-101] LaMoGen: Laban Movement-Guided Diffusion for Text-to-Motion Generation

链接: https://arxiv.org/abs/2509.24469
作者: Heechang Kim,Gwanghyun Kim,Se Young Chun
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-102] Generalist Multi-Class Anomaly Detection via Distillation to Two Heterogeneous Student Networks

链接: https://arxiv.org/abs/2509.24448
作者: Hangil Park,Yongmin Seo,Tae-Kyun Kim
机构: KAIST (韩国科学技术院)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-103] NeoWorld: Neural Simulation of Explorable Virtual Worlds via Progressive 3D Unfolding

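【Quick read】: This paper addresses the difficulty existing methods have in balancing efficiency and realism when generating interactive 3D virtual worlds. Conventional methods typically rely on global world generation or 2D hallucination, which leads to high computational cost and insufficient scene detail. NeoWorld's core solution is a hybrid scene structure: regions the user actively explores use object-centric 3D representations for high visual fidelity, while non-interacted regions are synthesized in 2D for efficiency. Combined with state-of-the-art representation learning and object-to-3D techniques, this design enables flexible viewpoint control, physically plausible scene animation, and natural-language control of object appearance and dynamics, providing an immersive, coherent user experience while maintaining real-time interactive performance.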

链接: https://arxiv.org/abs/2509.24441
作者: Yanpeng Zhao,Shanyan Guan,Yunbo Wang,Yanhao Ge,Wei Li,Xiaokang Yang
机构: MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University (上海交通大学人工智能重点实验室); vivo Mobile Communication Co., Ltd. (维沃移动通信有限公司)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

Abstract:We introduce NeoWorld, a deep learning framework for generating interactive 3D virtual worlds from a single input image. Inspired by the on-demand worldbuilding concept in the science fiction novel Simulacron-3 (1964), our system constructs expansive environments where only the regions actively explored by the user are rendered with high visual realism through object-centric 3D representations. Unlike previous approaches that rely on global world generation or 2D hallucination, NeoWorld models key foreground objects in full 3D, while synthesizing backgrounds and non-interacted regions in 2D to ensure efficiency. This hybrid scene structure, implemented with cutting-edge representation learning and object-to-3D techniques, enables flexible viewpoint manipulation and physically plausible scene animation, allowing users to control object appearance and dynamics using natural language commands. As users interact with the environment, the virtual world progressively unfolds with increasing 3D detail, delivering a dynamic, immersive, and visually coherent exploration experience. NeoWorld significantly outperforms existing 2D and depth-layered 2.5D methods on the WorldScore benchmark.

[CV-104] UI2V-Bench: An Understanding-based Image-to-video Generation Benchmark

链接: https://arxiv.org/abs/2509.24427
作者: Ailing Zhang,Lina Lei,Dehong Kong,Zhixin Wang,Jiaqi Xu,Fenglong Song,Chun-Le Guo,Chang Liu,Fan Li,Jie Chen
机构: Peking University (北京大学); Huawei Noah’s Ark Lab (华为诺亚方舟实验室); Nankai University (南开大学); Shenzhen Campus of Sun Yat-sen University (中山大学深圳校区); Tsinghua University (清华大学)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-105] Rethinking Unsupervised Cross-modal Flow Estimation: Learning from Decoupled Optimization and Consistency Constraint

链接: https://arxiv.org/abs/2509.24423
作者: Runmin Zhang,Jialiang Wang,Si-Yuan Cao,Zhu Yu,Junchen Yu,Guangyi Zhang,Hui-Liang Shen
机构: Zhejiang University (浙江大学); Ningbo Global Innovation Center, Zhejiang University (宁波全球创新中心，浙江大学); NingboTech University (浙大宁波理工学院)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-106] Proxy-GS: Efficient 3D Gaussian Splatting via Proxy Mesh

链接: https://arxiv.org/abs/2509.24421
作者: Yuanyuan Gao,Yuning Gong,Yifei Liu,Li Jingfeng,Zhihang Zhong,Dingwen Zhang,Yanci Zhang,Dan Xu,Xiao Sun
机构: Northwestern Polytechnical University (西北工业大学); Shanghai Artificial Intelligence Laboratory (上海人工智能实验室); Sichuan University (四川大学); Hong Kong University of Science and Technology (香港科技大学)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-107] A Data-Centric Perspective on the Influence of Image Data Quality in Machine Learning Models

链接: https://arxiv.org/abs/2509.24420
作者: Pei-Han Chen,Szu-Chi Chung
机构: National Science and Technology Council, Taiwan (台湾国家科学及技术委员会)
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
备注: 9 pages, 1 figure, 12 tables

点击查看摘要

[CV-108] CLQ: Cross-Layer Guided Orthogonal-based Quantization for Diffusion Transformers

链接: https://arxiv.org/abs/2509.24416
作者: Kai Liu,Shaoqiu Zhang,Linghe Kong,Yulun Zhang
机构: Shanghai Jiao Tong University (上海交通大学)
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注: 10 pages, 5 figures. Code is released at this https URL

点击查看摘要

[CV-109] Hybrid Layer-Wise ANN-SNN With Surrogate Spike Encoding-Decoding Structure

链接: https://arxiv.org/abs/2509.24411
作者: Nhan T. Luu,Duong T. Luu,Pham Ngoc Nam,Truong Cong Thang
机构: Can Tho University (芹苴大学); VinUniversity (越南海德大学); The University of Aizu (会津大学)
类目: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
备注: Work under peer-review

点击查看摘要

[CV-110] RapidMV: Leveraging Spatio-Angular Representations for Efficient and Consistent Text-to-Multi-View Synthesis WACV2026

链接: https://arxiv.org/abs/2509.24410
作者: Seungwook Kim,Yichun Shi,Kejie Li,Minsu Cho,Peng Wang
机构: POSTECH(韩国浦项科技大学); ByteDance Seed(字节跳动种子项目); Meta(Meta)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: 18 pages, 13 figures, Accepted to WACV 2026 Round 1

点击查看摘要

[CV-111] PCICF: A Pedestrian Crossing Identification and Classification Framework

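【Quick read】: This paper addresses the identification and classification of complex interaction scenarios involving Vulnerable Road Users (VRUs) for autonomous vehicles operating within a specific Operational Design Domain (ODD), in particular the challenges posed by changes in group behavior (such as merging or splitting) in multi-pedestrian crossing scenarios. The key to the solution is the PCICF framework, which builds MoreSMIRK, a structured dictionary of multi-pedestrian crossing situations, and uses Space-Filling Curves (SFCs) to map multi-dimensional scenario features into matchable characteristic patterns, enabling efficient identification and classification of complex VRU interaction events. The method not only improves the accuracy of scene understanding but also has the potential for real-time onboard deployment, providing a scalable, data-driven tool for safety monitoring and incident analysis within the ODD.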

链接: https://arxiv.org/abs/2509.24386
作者: Junyi Gu,Beatriz Cabrero-Daniel,Ali Nouri,Lydia Armini,Christian Berger
机构: Chalmers University of Technology and University of Gothenburg (查尔默斯理工大学和哥德堡大学); Volvo Cars (沃尔沃汽车)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

Abstract: We have recently observed the commercial roll-out of robotaxis in various countries. They are deployed within an operational design domain (ODD) on specific routes and environmental conditions, and are subject to continuous monitoring to regain control in safety-critical situations. Since ODDs typically cover urban areas, robotaxis must reliably detect vulnerable road users (VRUs) such as pedestrians, bicyclists, or e-scooter riders. To better handle such varied traffic situations, end-to-end AI, which directly computes vehicle control actions from multi-modal sensor data instead of only for perception, is on the rise. High-quality data is needed for systematically training and evaluating such systems within their ODD. In this work, we propose PCICF, a framework to systematically identify and classify VRU situations to support ODD’s incident analysis. We base our work on the existing synthetic dataset SMIRK, and enhance it by extending its single-pedestrian-only design into the MoreSMIRK dataset, a structured dictionary of multi-pedestrian crossing situations constructed systematically. We then use space-filling curves (SFCs) to transform multi-dimensional features of scenarios into characteristic patterns, which we match with corresponding entries in MoreSMIRK. We evaluate PCICF with the large real-world dataset PIE, which contains more than 150 manually annotated pedestrian crossing videos. We show that PCICF can successfully identify and classify complex pedestrian crossings, even when groups of pedestrians merge or split. By leveraging computationally efficient components like SFCs, PCICF even has the potential to be used onboard robotaxis, for example for OOD detection. We share an open-source replication package for PCICF containing its algorithms, the complete MoreSMIRK dataset and dictionary, as well as our experiment results presented in: this https URL
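
To make the space-filling-curve step more concrete, here is a minimal sketch of one common SFC, the Z-order (Morton) curve, which maps a 2D grid cell to a single index by interleaving coordinate bits. The abstract does not say which curve or which features PCICF uses, so the choice of Morton codes, the `quantize` helper, and the toy pedestrian track below are all assumptions made for illustration.

```python
def morton_encode_2d(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one Z-order index.

    Bit i of x goes to position 2*i, bit i of y goes to position 2*i + 1.
    """
    limit = 1 << bits
    if not (0 <= x < limit and 0 <= y < limit):
        raise ValueError("coordinates must fit in the given number of bits")
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def quantize(value, lo, hi, bits=16):
    """Map a float in [lo, hi] to an integer grid cell in [0, 2**bits - 1]."""
    scaled = (value - lo) / (hi - lo)
    scaled = min(max(scaled, 0.0), 1.0)  # clamp out-of-range values
    return int(round(scaled * ((1 << bits) - 1)))


# Hypothetical per-frame feature: a pedestrian's normalized (x, y) image position.
track = [(0.10, 0.50), (0.12, 0.50), (0.90, 0.50)]
pattern = [
    morton_encode_2d(quantize(px, 0.0, 1.0, bits=4), quantize(py, 0.0, 1.0, bits=4), bits=4)
    for px, py in track
]
print(pattern)
```

The resulting one-dimensional sequence could then be compared against stored reference patterns with any sequence-matching method; the encoding itself only needs shifts and bitwise operations, which is consistent with the abstract's remark about computational efficiency.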

[CV-112] Vid-LLM: A Compact Video-based 3D Multimodal LLM with Reconstruction-Reasoning Synergy

链接: https://arxiv.org/abs/2509.24385
作者: Haijier Chen,Bo Xu,Shoujian Zhang,Haoze Liu,Jiaxuan Lin,Jingrong Wang
机构: Wuhan University (武汉大学); Shenzhen University (深圳大学)
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-113] REALIGN: Regularized Procedure Alignment with Matching Video Embeddings via Partial Gromov-Wasserstein Optimal Transport

链接: https://arxiv.org/abs/2509.24382
作者: Soumyadeep Chandra,Kaushik Roy
机构: Purdue University (普渡大学)
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注: 10 pages, 4 figures, 6 tables

点击查看摘要

[CV-114] Mask Clustering-based Annotation Engine for Large-Scale Submeter Land Cover Mapping

链接: https://arxiv.org/abs/2509.24374
作者: Hao Chen,Fang Xu,Tamer Saleh,Weifeng Hao,Gui-Song Xia
机构: Wuhan University (武汉大学); State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS) (信息工程测绘遥感国家重点实验室); School of Artificial Intelligence (人工智能学院); Chinese Antarctic Center of Surveying and Mapping (中国南极测绘研究中心)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: Accepted in IEEE TGRS 2025; Project page: this https URL

点击查看摘要

[CV-115] DINOReg: Strong Point Cloud Registration with Vision Foundation Model

链接: https://arxiv.org/abs/2509.24370
作者: Congjia Chen,Yufu Qu
机构: Beihang University (北京航空航天大学)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-116] From Satellite to Street: A Hybrid Framework Integrating Stable Diffusion and PanoGAN for Consistent Cross-View Synthesis

链接: https://arxiv.org/abs/2509.24369
作者: Khawlah Bajbaa,Abbas Anwar,Muhammad Saqib,Hafeez Anwar,Nabin Sharma,Muhammad Usman
机构: King Fahd University of Petroleum and Minerals (法赫德国王石油与矿产大学); CSIRO (澳大利亚联邦科学与工业研究组织); National University of Computer and Emerging Sciences (FAST-NUCES) (巴基斯坦国家计算机与新兴科学大学); University of Technology Sydney (悉尼科技大学); University of Ontario Institute of Technology (安大略理工大学)
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
备注:

点击查看摘要

[CV-117] Real-Aware Residual Model Merging for Deepfake Detection

链接: https://arxiv.org/abs/2509.24367
作者: Jinhee Park,Guisik Kim,Choongsang Cho,Junseok Kwon
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-118] Uni-X: Mitigating Modality Conflict with a Two-End-Separated Architecture for Unified Multimodal Models

链接: https://arxiv.org/abs/2509.24365
作者: Jitai Hao,Hao Liu,Xinyan Xiao,Qiang Huang,Jun Yu
机构: Harbin Institute of Technology, Shenzhen (哈尔滨工业大学深圳校区); Baidu Inc. (百度公司); Pengcheng Laboratory (鹏城实验室)
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-119] UI-UG: A Unified MLLM for UI Understanding and Generation

链接: https://arxiv.org/abs/2509.24361
作者: Hao Yang,Weijie Qiu,Ru Zhang,Zhou Fang,Ruichao Mao,Xiaoyu Lin,Maji Huang,Zhaosong Huang,Teng Guo,Shuoyang Liu,Hai Rao
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
备注:

点击查看摘要

[CV-120] DRIFT: Divergent Response in Filtered Transformations for Robust Adversarial Defense

链接: https://arxiv.org/abs/2509.24359
作者: Amira Guesmi,Muhammad Shafique
机构: New York University Abu Dhabi (纽约大学阿布扎比分校)
类目: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
备注:

点击查看摘要

[CV-121] An Enhanced Pyramid Feature Network Based on Long-Range Dependencies for Multi-Organ Medical Image Segmentation

链接: https://arxiv.org/abs/2509.24358
作者: Dayu Tan,Cheng Kong,Yansen Su,Hai Chen,Dongliang Yang,Junfeng Xia,Chunhou Zheng
机构: Anhui University (安徽大学); Anhui Chest Hospital (安徽省胸科医院)
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-122] NeRV-Diffusion: Diffuse Implicit Neural Representations for Video Synthesis

链接: https://arxiv.org/abs/2509.24353
作者: Yixuan Ren,Hanyu Wang,Hao Chen,Bo He,Abhinav Shrivastava
机构: University of Maryland, College Park (马里兰大学学院市分校)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: Project Page: this https URL

点击查看摘要

[CV-123] Dynamic Orchestration of Multi-Agent System for Real-World Multi-Image Agricultural VQA

链接: https://arxiv.org/abs/2509.24350
作者: Yan Ke,Xin Yu,Heming Du,Scott Chapman,Helen Huang
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注: 13 pages, 2 figures, 2 tables

点击查看摘要

[CV-124] Hyperspherical Latents Improve Continuous-Token Autoregressive Generation

链接: https://arxiv.org/abs/2509.24335
作者: Guolin Ke,Hui Xue
机构: DP Technology (深势科技); Peking University (北京大学)
类目: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
备注:

点击查看摘要

[CV-125] TP-MVCC: Tri-plane Multi-view Fusion Model for Silkie Chicken Counting

链接: https://arxiv.org/abs/2509.24329
作者: Sirui Chen,Yuhong Feng,Yifeng Wang,Jianghai Liao,Qi Zhang
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-126] TraitSpaces: Towards Interpretable Visual Creativity for Human-AI Co-Creation

链接: https://arxiv.org/abs/2509.24326
作者: Prerna Luthra
机构: Evam Labs(埃瓦姆实验室)
类目: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-127] Similarity-Aware Selective State-Space Modeling for Semantic Correspondence ICCV2025

链接: https://arxiv.org/abs/2509.24318
作者: Seungwook Kim,Minsu Cho
机构: Pohang University of Science and Technology (POSTECH)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: 23 pages, 11 figures. Accepted as Oral presentation for ICCV 2025 Findings

点击查看摘要

[CV-128] Rethinking JEPA: Compute-Efficient Video SSL with Frozen Teachers

链接: https://arxiv.org/abs/2509.24317
作者: Xianhang Li,Chen Huang,Chun-Liang Li,Eran Malach,Josh Susskind,Vimal Thilak,Etai Littwin
机构: Apple(苹果)
类目: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
备注: Technical Report

点击查看摘要

[CV-129] Towards Foundation Models for Cryo-ET Subtomogram Analysis

链接: https://arxiv.org/abs/2509.24311
作者: Runmin Jiang,Wanyue Feng,Yuntian Yang,Shriya Pingulkar,Hong Wang,Xi Xiao,Xiaoyu Cao,Genpei Zhang,Xiao Wang,Xiaolong Wu,Tianyang Wang,Yang Liu,Xingjian Li,Min Xu
机构: Carnegie Mellon University (卡内基梅隆大学); Harvard University (哈佛大学); University of Alabama at Birmingham (阿拉巴马大学伯明翰分校); Oak Ridge National Laboratory (橡树岭国家实验室); K. J. Somaiya College of Engineering (K. J. 索迈亚工程学院)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-130] OMeGa: Joint Optimization of Explicit Meshes and Gaussian Splats for Robust Scene-Level Surface Reconstruction

链接: https://arxiv.org/abs/2509.24308
作者: Yuhang Cao,Haojun Yan,Danya Yao
机构: Tsinghua University (清华大学); Beihang University (北京航空航天大学)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: 12 pages, 9 figures

点击查看摘要

[CV-131] FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting ICLR2026

链接: https://arxiv.org/abs/2509.24304
作者: Zefeng He,Xiaoye Qu,Yafu Li,Siyuan Huang,Daizong Liu,Yu Cheng
机构: Shanghai AI Laboratory (上海人工智能实验室); Nanjing University (南京大学); The Chinese University of Hong Kong (香港中文大学); Shanghai Jiao Tong University (上海交通大学); Peking University (北京大学)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: Submitted to ICLR 2026

点击查看摘要

[CV-132] SVGThinker: Instruction-Aligned and Reasoning-Driven Text-to-SVG Generation

链接: https://arxiv.org/abs/2509.24299
作者: Hanqi Chen,Zhongyin Zhao,Ye Chen,Zhujin Liang,Bingbing Ni
机构: Shanghai Jiao Tong University (上海交通大学); SJTU Paris Elite Institute of Technology (上海交通大学巴黎卓越工程师学院); PhiGent Robotics (PhiGent机器人)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-133] ASIA: Adaptive 3D Segmentation using Few Image Annotations SIGGRAPH

链接: https://arxiv.org/abs/2509.24288
作者: Sai Raj Kishore Perla,Aditya Vora,Sauradip Nag,Ali Mahdavi-Amiri,Hao Zhang
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: SIGGRAPH Asia, 2025. Project Page: this https URL

点击查看摘要

[CV-134] Robust Partial 3D Point Cloud Registration via Confidence Estimation under Global Context

链接: https://arxiv.org/abs/2509.24275
作者: Yongqiang Wang,Weigang Li,Wenping Liu,Zhe Xu,Zhiqiang Tian
机构: Wuhan University of Science and Technology (武汉科技大学); Hubei University of Economics (湖北经济学院)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-135] Skeleton-based Robust Registration Framework for Corrupted 3D Point Clouds

链接: https://arxiv.org/abs/2509.24273
作者: Yongqiang Wang,Weigang Li,Wenping Liu,Zhiqiang Tian,Jinling Li
机构: Wuhan University of Science and Technology (武汉科技大学); Hubei University of Economics (湖北经济学院)
类目: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
备注:

点击查看摘要

[CV-136] Cycle Diffusion Model for Counterfactual Image Generation

链接: https://arxiv.org/abs/2509.24267
作者: Fangrui Huang,Alan Wang,Binxu Li,Bailey Trang,Ridvan Yesiloglu,Tianyu Hua,Wei Peng,Ehsan Adeli
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-137] S2NN: Sub-bit Spiking Neural Networks

链接: https://arxiv.org/abs/2509.24266
作者: Wenjie Wei,Malu Zhang,Jieyuan Zhang,Ammar Belatreche,Shuai Wang,Yimeng Shan,Hanwen Liu,Honglin Cao,Guoqing Wang,Yang Yang,Haizhou Li
机构: University of Electronic Science and Technology of China (电子科技大学); Centre of Language, Intelligence, and Machines (LIMA) (语言、智能与机器中心); Shenzhen Loop Area Institute (深圳环区研究院); Northumbria University (诺桑比亚大学); Liaoning Technical University (辽宁工程技术大学); The Chinese University of Hong Kong, Shenzhen (香港中文大学（深圳）)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: 29 pages, 6 figures

点击查看摘要

[CV-138] When MLLMs Meet Compression Distortion: A Coding Paradigm Tailored to MLLMs

链接: https://arxiv.org/abs/2509.24258
作者: Jinming Liu,Zhaoyang Jia,Jiahao Li,Bin Li,Xin Jin,Wenjun Zeng,Yan Lu
机构: Shanghai Jiao Tong University (上海交通大学); Eastern Institute of Technology, Ningbo, China (宁波东方理工大学); Microsoft Research Asia (微软亚洲研究院)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-139] FreeAction: Training-Free Techniques for Enhanced Fidelity of Trajectory-to-Video Generation

链接: https://arxiv.org/abs/2509.24241
作者: Seungwook Kim,Seunghyeon Lee,Minsu Cho
机构: POSTECH (浦项工科大学); Ewha Womans University (梨花女子大学); RLWRLD
类目: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
备注: 8 pages, 4 figures, accepted to CoRL 2025 LSRW workshop

点击查看摘要

[CV-140] PROFusion: Robust and Accurate Dense Reconstruction via Camera Pose Regression and Optimization

链接: https://arxiv.org/abs/2509.24236
作者: Siyan Dong,Zijun Wang,Lulu Cai,Yi Ma,Yanchao Yang
机构: The University of Hong Kong (香港大学)
类目: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-141] EVLF-FM: Explainable Vision Language Foundation Model for Medicine

链接: https://arxiv.org/abs/2509.24231
作者: Yang Bai,Haoran Cheng,Yang Zhou,Jun Zhou,Arun Thirunavukarasu,Yuhe Ke,Jie Yao,Kanae Fukutsu,Chrystie Wan Ning Quek,Ashley Hong,Laura Gutierrez,Zhen Ling Teo,Darren Shu Jeng Ting,Brian T. Soetikno,Christopher S. Nielsen,Tobias Elze,Zengxiang Li,Linh Le Dinh,Hiok Hong Chan,Victor Koh,Marcus Tan,Kelvin Z. Li,Leonard Yip,Ching Yu Cheng,Yih Chung Tham,Gavin Siew Wei Tan,Leopold Schmetterer,Marcus Ang,Rahat Hussain,Jod Mehta,Tin Aung,Lionel Tim-Ee Cheng,Tran Nguyen Tuan Anh,Chee Leong Cheng,Tien Yin Wong,Nan Liu,Iain Beehuat Tan,Soon Thye Lim,Eyal Klang,Tony Kiat Hon Lim,Rick Siow Mong Goh,Yong Liu,Daniel Shu Wei Ting
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-142] Semantic Editing with Coupled Stochastic Differential Equations

链接: https://arxiv.org/abs/2509.24223
作者: Jianxin Zhang,Clayton Scott
机构: University of Michigan (密歇根大学); Meta (Meta)
类目: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
备注:

点击查看摘要

[CV-143] Scalable Audio-Visual Masked Autoencoders for Efficient Affective Video Facial Analysis

链接: https://arxiv.org/abs/2509.24214
作者: Xuecheng Wu,Junxiao Xue,Xinyi Yin,Yunyun Shi,Liangyu Fu,Danlei Huang,Yifan Wang,Jia Zhang,Jiayu Nie,Jun Wang
机构: Xi’an Jiaotong University (西安交通大学); Zhengzhou University (郑州大学); Northwestern Polytechnical University (西北工业大学); University of Science and Technology of China (中国科学技术大学); Zhejiang Lab (浙江实验室); Inspur Group (浪潮集团)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-144] Forge4D: Feed-Forward 4D Human Reconstruction and Interpolation from Uncalibrated Sparse-view Videos

Link: https://arxiv.org/abs/2509.24209
Authors: Yingdong Hu,Yisheng He,Jinnan Chen,Weihao Yuan,Kejie Qiu,Zehong Lin,Siyu Zhu,Zilong Dong,Jun Zhang
Affiliations: HKUST; Tongyi Lab, Alibaba Group; NUS; FDU
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-145] BALR-SAM: Boundary-Aware Low-Rank Adaptation of SAM for Resource-Efficient Medical Image Segmentation

Link: https://arxiv.org/abs/2509.24204
Authors: Zelin Liu,Sicheng Dong,Bocheng Li,Yixuan Yang,Jiacheng Ruan,Chenxu Zhou,Suncheng Xiang
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments:

[CV-146] UniVid: The Open-Source Unified Video Model

Link: https://arxiv.org/abs/2509.24200
Authors: Jiabin Luo,Junhui Lin,Zeyu Zhang,Biao Wu,Meng Fang,Ling Chen,Hao Tang
Affiliations: Peking University; AI Geeks; Australian Artificial Intelligence Institute
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-147] An Efficient 3D Latent Diffusion Model for T1-contrast Enhanced MRI Generation

Link: https://arxiv.org/abs/2509.24194
Authors: Zach Eidex,Mojtaba Safari,Jie Ding,Richard Qiu,Justin Roper,David Yu,Hui-Kuo Shu,Zhen Tian,Hui Mao,Xiaofeng Yang
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-148] Talk in Pieces See in Whole: Disentangling and Hierarchical Aggregating Representations for Language-based Object Detection

Link: https://arxiv.org/abs/2509.24192
Authors: Sojung An,Kwanyong Park,Yong Jae Lee,Donghyun Kim
Affiliations: Korea University; University of Seoul; University of Wisconsin-Madison
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: 23 pages, 17 figures

[CV-149] Simulating Post-Neoadjuvant Chemotherapy Breast Cancer MRI via Diffusion Model with Prompt Tuning

Link: https://arxiv.org/abs/2509.24185
Authors: Jonghun Kim,Hyunjin Park
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: ISBI’25, 5 pages, 4 figures

[CV-150] Tumor Synthesis conditioned on Radiomics WACV’25

Link: https://arxiv.org/abs/2509.24182
Authors: Jonghun Kim,Inye Na,Eun Sook Ko,Hyunjin Park
Affiliations: Sungkyunkwan University; Samsung Medical Center
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: WACV’25

[CV-151] Combining Discrepancy-Confusion Uncertainty and Calibration Diversity for Active Fine-Grained Image Classification

Link: https://arxiv.org/abs/2509.24181
Authors: Yinghao Jin,Xi Yang
Affiliations: Jilin University; Ministry of Education
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-152] High-Order Progressive Trajectory Matching for Medical Image Dataset Distillation MICCAI2025

Link: https://arxiv.org/abs/2509.24177
Authors: Le Dong,Jinghao Bian,Jingyang Hou,Jingliang Hu,Yilei Shi,Weisheng Dong,Xiao Xiang Zhu,Lichao Mou
Affiliations: MedAI Technology (Wuxi) Co. Ltd.
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: MICCAI 2025 (early accept, top 9%)

[CV-153] LatXGen: Towards Radiation-Free and Accurate Quantitative Analysis of Sagittal Spinal Alignment Via Cross-Modal Radiographic View Synthesis

Link: https://arxiv.org/abs/2509.24165
Authors: Moxin Zhao,Nan Meng,Jason Pui Yin Cheung,Chris Yuk Kwan Tang,Chenxi Yu,Wenting Zhong,Pengyu Lu,Chang Shi,Yipeng Zhuang,Teng Zhang
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: 8 pages, 6 figures

[CV-154] Neural Visibility of Point Sets SIGGRAPH

Link: https://arxiv.org/abs/2509.24150
Authors: Jun-Hao Wang,Yi-Yang Tian,Baoquan Chen,Peng-Shuai Wang
Affiliations: Unknown
Categories: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
Comments: Accepted to SIGGRAPH Asia 2025

[CV-155] Accelerating Cerebral Diagnostics with BrainFusion: A Comprehensive MRI Tumor Framework

Link: https://arxiv.org/abs/2509.24149
Authors: Walid Houmaidi,Youssef Sabiri,Salmane El Mansour Billah,Amine Abouaomar
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments: 6 pages, 4 figures, 3 tables. Accepted at the 12th International Conference on Wireless Networks and Mobile Communications 2025 (WINCOM 2025)

[CV-156] Asymmetric VAE for One-Step Video Super-Resolution Acceleration

Link: https://arxiv.org/abs/2509.24142
Authors: Jianze Li,Yong Guo,Yulun Zhang,Xiaokang Yang
Affiliations: Shanghai Jiao Tong University; South China University of Technology
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-157] Analysis of Bias in Deep Learning Facial Beauty Regressors

Link: https://arxiv.org/abs/2509.24138
Authors: Chandon Hamel,Mike Busch
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-158] EYE-DEX: Eye Disease Detection and EXplanation System

Link: https://arxiv.org/abs/2509.24136
Authors: Youssef Sabiri,Walid Houmaidi,Amine Abouaomar
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments: 6 pages, 4 figures, 3 tables. Accepted at the 12th International Conference on Wireless Networks and Mobile Communications 2025 (WINCOM 2025)

[CV-159] Mash Spread Slice! Learning to Manipulate Object States via Visual Spatial Progress

Link: https://arxiv.org/abs/2509.24129
Authors: Priyanka Mandikal,Jiaheng Hu,Shivin Dass,Sagnik Majumder,Roberto Martín-Martín,Kristen Grauman
Affiliations: Unknown
Categories: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-160] GANji: A Framework for Introductory AI Image Generation

Link: https://arxiv.org/abs/2509.24128
Authors: Chandon Hamel,Mike Busch
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-161] SVAC: Scaling Is All You Need For Referring Video Object Segmentation BMVC2025

Link: https://arxiv.org/abs/2509.24109
Authors: Li Zhang,Haoxiang Gao,Zhihao Zhang,Luoxiao Huang,Tao Zhang
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: This paper is accepted to BMVC 2025

[CV-162] Unified Multi-Modal Interactive Reactive 3D Motion Generation via Rectified Flow ICLR2026

Link: https://arxiv.org/abs/2509.24099
Authors: Prerit Gupta,Shourya Verma,Ananth Grama,Aniket Bera
Affiliations: Purdue University
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: Under review at ICLR 2026

[CV-163] Clebsch-Gordan Transformer: Fast and Global Equivariant Attention

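Quick read: This paper addresses the low computational efficiency and limited expressiveness of existing equivariant Transformer models when they handle high-order equivariant features. Conventional global attention has quadratic complexity, which limits its use on long sequences, while existing equivariant models support only low-order features and local context windows, which creates a performance bottleneck. The proposed Clebsch-Gordan Transformer introduces a new Clebsch-Gordan convolution on SO(3) irreducible representations. It achieves global attention with O(N log N) complexity in the number of input tokens and exploits the sparsity of the Clebsch-Gordan matrix to scale to high-order irreducible features, so it keeps equivariance at all orders while improving computational efficiency and expressiveness.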

Link: https://arxiv.org/abs/2509.24093
Authors: Owen Lewis Howell,Linfeng Zhao,Xupeng Zhu,Yaoyao Qian,Haojie Huang,Lingfeng Sun,Wil Thomason,Robert Platt,Robin Walters
Affiliations: Northeastern University; The Boston Dynamics AI Institute
Categories: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Comments:

Abstract: The global attention mechanism is one of the keys to the success of the transformer architecture, but it incurs quadratic computational costs in relation to the number of tokens. On the other hand, equivariant models, which leverage the underlying geometric structures of the problem instance, often achieve superior accuracy in physical, biochemical, computer vision, and robotic tasks, at the cost of additional compute requirements. As a result, existing equivariant transformers only support low-order equivariant features and local context windows, limiting their expressiveness and performance. This work proposes the Clebsch-Gordan Transformer, achieving efficient global attention by a novel Clebsch-Gordan Convolution on SO(3) irreducible representations. Our method enables equivariant modeling of features at all orders while achieving O(N \log N) input token complexity. Additionally, the proposed method scales well with high-order irreducible features by exploiting the sparsity of the Clebsch-Gordan matrix. Lastly, we also incorporate optional token permutation equivariance through either weight sharing or data augmentation. We benchmark our method on a diverse set of benchmarks including n-body simulation, QM9, ModelNet point cloud classification and a robotic grasping dataset, showing clear gains over existing equivariant transformers in GPU memory size, speed, and accuracy.
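
As a rough illustration of the sparsity the abstract refers to (this is not the authors' implementation): in the standard complex spherical-harmonic basis, a Clebsch-Gordan coefficient <l1 m1; l2 m2 | l m> can be nonzero only when |l1 - l2| <= l <= l1 + l2 and m = m1 + m2. The sketch below counts the index triples these selection rules allow and compares that count with the dense tensor size. The function names are hypothetical, and the count is only an upper bound, since some allowed coefficients also vanish.

```python
def cg_allowed_entries(l1: int, l2: int, l: int) -> int:
    """Count (m1, m2, m) triples permitted by the selection rules.

    Assumes the complex spherical-harmonic basis, where a coefficient
    can be nonzero only if |l1 - l2| <= l <= l1 + l2 and m = m1 + m2.
    This is an upper bound on the number of nonzero coefficients.
    """
    # Triangle rule: the output order l must lie in the allowed range.
    if not abs(l1 - l2) <= l <= l1 + l2:
        return 0
    count = 0
    for m1 in range(-l1, l1 + 1):
        for m2 in range(-l2, l2 + 1):
            # m is forced to m1 + m2; it must also be a valid index for order l.
            if abs(m1 + m2) <= l:
                count += 1
    return count


def cg_dense_size(l1: int, l2: int, l: int) -> int:
    """Number of entries in the dense (2*l1+1) x (2*l2+1) x (2*l+1) block."""
    return (2 * l1 + 1) * (2 * l2 + 1) * (2 * l + 1)


if __name__ == "__main__":
    # Print how the allowed fraction shrinks as the order grows (l1 = l2 = l).
    for order in range(1, 7):
        allowed = cg_allowed_entries(order, order, order)
        dense = cg_dense_size(order, order, order)
        print(f"l={order}: allowed={allowed}, dense={dense}, fraction={allowed / dense:.3f}")
```

Because m is fixed by m1 and m2, the allowed entries grow quadratically with the order while the dense block grows cubically, which is one way to see why high-order features can benefit from a sparse representation.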

[CV-164] Autoregressive Video Generation beyond Next Frames Prediction

Link: https://arxiv.org/abs/2509.24081
Authors: Sucheng Ren,Chen Chen,Zhenbang Wang,Liangchen Song,Xiangxin Zhu,Alan Yuille,Yinfei Yang,Jiasen Lu
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-165] Uncovering Grounding IDs: How External Cues Shape Multi-Modal Binding ICLR2026

Link: https://arxiv.org/abs/2509.24072
Authors: Hosein Hasani,Amirmohammad Izadi,Fatemeh Askari,Mobin Bagherian,Sadegh Mohammadian,Mohammad Izadi,Mahdieh Soleymani Baghshah
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: Under review as a conference paper at ICLR 2026

[CV-166] AQUAIR: A High-Resolution Indoor Environmental Quality Dataset for Smart Aquaculture Monitoring

Link: https://arxiv.org/abs/2509.24069
Authors: Youssef Sabiri,Walid Houmaidi,Ouail El Maadi,Yousra Chtouki
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Applications (stat.AP)
Comments: 6 pages, 6 figures, 3 tables. Accepted at the 9th IEEE Global Conference on Artificial Intelligence Internet of Things (IEEE GCAIoT) 2025. Final camera-ready manuscript. Math expressions in this field are rendered via MathJax

[CV-167] A Second-Order Perspective on Pruning at Initialization and Knowledge Transfer

Link: https://arxiv.org/abs/2509.24066
Authors: Leonardo Iurada,Beatrice Occhiena,Tatiana Tommasi
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: Accepted ICIAP 2025 - IAPR Best Paper Award

[CV-168] GPS-MTM: Capturing Pattern of Normalcy in GPS-Trajectories with self-supervised learning

Link: https://arxiv.org/abs/2509.24031
Authors: Umang Garg,Bowen Zhang,Anantanjit Subrahmanya,Chandrakanth Gudavalli,BS Manjunath
Affiliations: University of California Santa Barbara
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
Comments: 4 pages, 2 figures

[CV-169] Joint Superpixel and Self-Representation Learning for Scalable Hyperspectral Image Clustering

Link: https://arxiv.org/abs/2509.24027
Authors: Xianlu Li,Nicolas Nadisic,Shaoguang Huang,Aleksandra Pizurica
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-170] $\mathbf{R}^3$: Reconstruction Raw and Rain: Deraining Directly in the Bayer Domain

Link: https://arxiv.org/abs/2509.24022
Authors: Nate Rothschild,Moshe Kimhi,Avi Mendelson,Chaim Baskin
Affiliations: Technion, Israel; Ben-Gurion University
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 9 pages

[CV-171] Hazy Pedestrian Trajectory Prediction via Physical Priors and Graph-Mamba

Link: https://arxiv.org/abs/2509.24020
Authors: Jian Chen,Zhuoran Zheng,Han Hu,Guijuan Zhang,Dianjie Lu,Liang Li,Chen Lyu
Affiliations: Shandong Normal University; Sun Yat-sen University; Shandong Jiaotong University
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-172] Generalized Category Discovery in Hyperspectral Images via Prototype Subspace Modeling

Link: https://arxiv.org/abs/2509.24017
Authors: Xianlu Li,Nicolas Nadisic,Shaoguang Huang,Aleksandra Pizurica
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-173] FrameMind: Frame-Interleaved Chain-of-Thought for Video Reasoning via Reinforcement Learning

Link: https://arxiv.org/abs/2509.24008
Authors: Haonan Ge,Yiwei Wang,Kai-Wei Chang,Hang Wu,Yujun Cai
Affiliations: University of California, Merced; University of California, Los Angeles; The University of Queensland
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: Under review

[CV-174] SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention

Link: https://arxiv.org/abs/2509.24006
Authors: Jintao Zhang,Haoxu Wang,Kai Jiang,Shuo Yang,Kaiwen Zheng,Haocheng Xi,Ziteng Wang,Hongzhou Zhu,Min Zhao,Ion Stoica,Joseph E. Gonzalez,Jun Zhu,Jianfei Chen
Affiliations: Tsinghua University; UC Berkeley
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-175] SIE3D: Single-image Expressive 3D Avatar generation via Semantic Embedding and Perceptual Expression Loss

Link: https://arxiv.org/abs/2509.24004
Authors: Zhiqi Huang,Dulongkai Cui,Jinglu Hu
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: This work has been submitted to the IEEE for possible publication

[CV-176] Gaze Estimation for Human-Robot Interaction: Analysis Using the NICO Platform

Link: https://arxiv.org/abs/2509.24001
Authors: Matej Palider,Omar Eldardeer,Viktor Kocur
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Comments: Code available at this http URL

[CV-177] TREAT-Net: Tabular-Referenced Echocardiography Analysis for Acute Coronary Syndrome Treatment Prediction MICCAI

Link: https://arxiv.org/abs/2509.23999
Authors: Diane Kim,Minh Nguyen Nhat To,Sherif Abdalla,Teresa S.M. Tsang,Purang Abolmaesumi,Christina Luong
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Comments: 11 pages, 2 figures, MICCAI ASMUS 2025 paper

[CV-178] Advancing Multi-agent Traffic Simulation via R1-Style Reinforcement Fine-Tuning

Link: https://arxiv.org/abs/2509.23993
Authors: Muleilan Pei,Shaoshuai Shi,Shaojie Shen
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Comments:

[CV-179] RPG360: Robust 360 Depth Estimation with Perspective Foundation Models and Graph Optimization

Link: https://arxiv.org/abs/2509.23991
Authors: Dongki Jung,Jaehoon Choi,Yonghan Lee,Dinesh Manocha
Affiliations: University of Maryland, College Park
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-180] Towards Redundancy Reduction in Diffusion Models for Efficient Video Super-Resolution

Link: https://arxiv.org/abs/2509.23980
Authors: Jinpei Guo,Yifei Ji,Zheng Chen,Yufei Wang,Sizhuo Ma,Yong Guo,Yulun Zhang,Jian Wang
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-181] VFSI: Validity First Spatial Intelligence for Constraint-Guided Traffic Diffusion

Link: https://arxiv.org/abs/2509.23971
Authors: Kargi Chauhan,Leilani H. Gilpin
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-182] A Novel Hybrid Deep Learning and Chaotic Dynamics Approach for Thyroid Cancer Classification

Link: https://arxiv.org/abs/2509.23968
Authors: Nada Bouchekout,Abdelkrim Boukabou,Morad Grimes,Yassine Habchi,Yassine Himeur,Hamzah Ali Alkhazaleh,Shadi Atalla,Wathiq Mansoor
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: Scientific Reports

[CV-183] Reinforcement Learning with Inverse Rewards for World Model Post-training

Link: https://arxiv.org/abs/2509.23958
Authors: Yang Ye,Tianyu He,Shuo Yang,Jiang Bian
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-184] ColLab: A Collaborative Spatial Progressive Data Engine for Referring Expression Comprehension and Generation

Link: https://arxiv.org/abs/2509.23955
Authors: Shilan Zhang,Jirui Huang,Ruilin Yao,Cong Wang,Yaxiong Chen,Peng Xu,Shengwu Xiong
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-185] HunyuanImage 3.0 Technical Report

Link: https://arxiv.org/abs/2509.23951
Authors: Siyu Cao,Hangting Chen,Peng Chen,Yiji Cheng,Yutao Cui,Xinchi Deng,Ying Dong,Kipper Gong,Tianpeng Gu,Xiusen Gu,Tiankai Hang,Duojun Huang,Jie Jiang,Zhengkai Jiang,Weijie Kong,Changlin Li,Donghao Li,Junzhe Li,Xin Li,Yang Li,Zhenxi Li,Zhimin Li,Jiaxin Lin,Linus,Lucaz Liu,Shu Liu,Songtao Liu,Yu Liu,Yuhong Liu,Yanxin Long,Fanbin Lu,Qinglin Lu,Yuyang Peng,Yuanbo Peng,Xiangwei Shen,Yixuan Shi,Jiale Tao,Yangyu Tao,Qi Tian,Pengfei Wan,Chunyu Wang,Kai Wang,Lei Wang,Linqing Wang,Lucas Wang,Qixun Wang,Weiyan Wang,Hao Wen,Bing Wu,Jianbing Wu,Yue Wu,Senhao Xie,Fang Yang,Miles Yang,Xiaofeng Yang,Xuan Yang,Zhantao Yang,Jingmiao Yu,Zheng Yuan,Chao Zhang,Jian-Wei Zhang,Peizhen Zhang,Shi-Xue Zhang,Tao Zhang,Weigang Zhang,Yepeng Zhang,Yingfang Zhang,Zihao Zhang,Zijian Zhang,Penghao Zhao,Zhiyuan Zhao,Xuefei Zhe,Jianchen Zhu,Zhao Zhong
Affiliations: Tencent Hunyuan Foundation Model Team
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-186] CrashSplat: 2D to 3D Vehicle Damage Segmentation in Gaussian Splatting

Link: https://arxiv.org/abs/2509.23947
Authors: Dragoş-Andrei Chileban,Andrei-Ştefan Bulzan,Cosmin Cernǎzanu-Glǎvan
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-187] AutoPrune: Each Complexity Deserves a Pruning Policy

Link: https://arxiv.org/abs/2509.23931
Authors: Hanshi Wang,Yuhao Xu,Zekun Xu,Jin Gao,Yufan Liu,Weiming Hu,Ke Wang,Zhipeng Zhang
Affiliations: State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), CASIA; School of Artificial Intelligence, University of Chinese Academy of Sciences; AutoLab, School of Artificial Intelligence, Shanghai Jiao Tong University; Anyverse Intelligence; Beijing Key Laboratory of Super Intelligent Security of Multi-Modal Information; School of Information Science and Technology, ShanghaiTech University; KargoBot
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 13 pages, 2 figures

[CV-188] SAR-KnowLIP: Towards Multimodal Foundation Models for Remote Sensing

Link: https://arxiv.org/abs/2509.23927
Authors: Yi Yang,Xiaokun Zhang,Qingchen Fang,Ziqi Ye,Rui Li,Li Liu,Haipeng Wang
Affiliations: Fudan University; NUDT
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-189] Learning Encoding-Decoding Direction Pairs to Unveil Concepts of Influence in Deep Vision Networks

Link: https://arxiv.org/abs/2509.23926
Authors: Alexandros Doumanoglou,Kurt Driessens,Dimitrios Zarpalas
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 80 Pages. The paper’s abstract was shortened to fit the character limit

[CV-190] DriveE2E: Closed-Loop Benchmark for End-to-End Autonomous Driving through Real-to-Simulation

Link: https://arxiv.org/abs/2509.23922
Authors: Haibao Yu,Wenxian Yang,Ruiyang Hao,Chuanye Wang,Jiaru Zhong,Ping Luo,Zaiqing Nie
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Comments: End-to-End Autonomous Driving Simulation and Benchmark

[CV-191] Token Painter: Training-Free Text-Guided Image Inpainting via Mask Autoregressive Models

Link: https://arxiv.org/abs/2509.23919
Authors: Longtao Jiang,Mingfei Han,Lei Chen,Yongqiang Yu,Feng Zhao,Xiaojun Chang,Zhihui Li
Affiliations: University of Science and Technology of China; Department of Computer Vision, MBZUAI
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-192] Bridging the Task Gap: Multi-Task Adversarial Transferability in CLIP and Its Derivatives

Link: https://arxiv.org/abs/2509.23917
Authors: Kuanrong Liu,Siyuan Liang,Cheng Qian,Ming Zhang,Xiaochun Cao
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-193] Revisit the Imbalance Optimization in Multi-task Learning: An Experimental Analysis

Link: https://arxiv.org/abs/2509.23915
Authors: Yihang Guo,Tianyuan Yu,Liang Bai,Yanming Guo,Yirun Ruan,William Li,Weishi Zheng
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-194] MoReact: Generating Reactive Motion from Textual Descriptions

Link: https://arxiv.org/abs/2509.23911
Authors: Xiyan Xu,Sirui Xu,Yu-Xiong Wang,Liang-Yan Gui
Affiliations: University of Illinois Urbana-Champaign
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: Published in Transactions on Machine Learning Research

[CV-195] EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling ITSC

Link: https://arxiv.org/abs/2509.23909
Authors: Xin Luo,Jiahao Wang,Chenyuan Wu,Shitao Xiao,Xiyan Jiang,Defu Lian,Jiajun Zhang,Dong Liu,Zheng Liu
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: Code, models, and benchmark will be publicly available at this https URL

[CV-196] Adversarial Versus Federated: An Adversarial Learning based Multi-Modality Cross-Domain Federated Medical Segmentation

Link: https://arxiv.org/abs/2509.23907
Authors: You Zhou,Lijiang Chen,Shuchang Lyu,Guangxia Cui,Wenpei Bai,Zheng Zhou,Meng Li,Guangliang Cheng,Huiyu Zhou,Qi Zhao
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-197] EWC-Guided Diffusion Replay for Exemplar-Free Continual Learning in Medical Imaging DATE NEURIPS2025

Link: https://arxiv.org/abs/2509.23906
Authors: Anoushka Harit,William Prew,Zhongtian Sun,Florian Markowetz
Affiliations: University of Cambridge; University of Kent
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: Accepted at AI That Keeps Up: NeurIPS 2025 Workshop on Continual and Compatible Foundation Model Updates

[CV-198] LifeCLEF Plant Identification Task 2014

Link: https://arxiv.org/abs/2509.23900
Authors: Herve Goeau,Alexis Joly,Pierre Bonnet,Souheil Selmi,Jean-Francois Molino,Daniel Barthelemy,Nozha Boujemaa
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 18 pages, 4 figures, CLEF 2014 Conference and Labs of the Evaluation Forum, September 15 to 18, 2014, Sheffield, United Kingdom

[CV-199] Q-FSRU: Quantum-Augmented Frequency-Spectral For Medical Visual Question Answering ICLR2026

Link: https://arxiv.org/abs/2509.23899
Authors: Rakesh Thakur,Yusra Tariq,Rakesh Chandra Joshi
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 12 pages (9 main + 2 references/appendix), 2 figures, conference paper submitted to ICLR 2026

[CV-200] Preserving Cross-Modal Stability for Visual Unlearning in Multimodal Scenarios

Link: https://arxiv.org/abs/2509.23895
Authors: Jinghan Xu,Yuyang Zhang,Qixuan Cai,Jiancheng Chen,Keqiu Li
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: 9 pages, 4 figures

[CV-201] LifeCLEF Plant Identification Task 2015

Link: https://arxiv.org/abs/2509.23891
Authors: Herve Goeau,Pierre Bonnet,Alexis Joly
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: 15 pages, 4 figures, CLEF 2015 Conference and Labs of the Evaluation Forum, September 08 to 11, 2015, Toulouse, France

[CV-202] AssemblyHands-X: Modeling 3D Hand-Body Coordination for Understanding Bimanual Human Activities

Link: https://arxiv.org/abs/2509.23888
Authors: Tatsuro Banno,Takehiko Ohkawa,Ruicong Liu,Ryosuke Furuta,Yoichi Sato
Affiliations: The University of Tokyo
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-203] Tunable-Generalization Diffusion Powered by Self-Supervised Contextual Sub-Data for Low-Dose CT Reconstruction

Link: https://arxiv.org/abs/2509.23885
Authors: Guoquan Wei,Zekun Zhou,Liu Shi,Wenzhe Shan,Qiegen Liu
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments:

[CV-204] Learning Adaptive Pseudo-Label Selection for Semi-Supervised 3D Object Detection

Link: https://arxiv.org/abs/2509.23880
Authors: Taehun Kong,Tae-Kyun Kim
Affiliations: KAIST
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-205] Not All Tokens are Guided Equal: Improving Guidance in Visual Autoregressive Models

Link: https://arxiv.org/abs/2509.23876
Authors: Ky Dan Nguyen,Hoang Lam Tran,Anh-Dung Dinh,Daochang Liu,Weidong Cai,Xiuying Wang,Chang Xu
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: 17 pages, 7 figures

[CV-206] Taught Well Learned Ill: Towards Distillation-conditional Backdoor Attack NEURIPS2025

Link: https://arxiv.org/abs/2509.23871
Authors: Yukun Chen,Boheng Li,Yu Yuan,Leyi Qi,Yiming Li,Tianwei Zhang,Zhan Qin,Kui Ren
Affiliations: Unknown
Categories: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Comments: The first three authors contributed equally to this work. To appear in NeurIPS 2025. 35 pages

[CV-207] Sim-DETR: Unlock DETR for Temporal Sentence Grounding ICCV2025

Link: https://arxiv.org/abs/2509.23867
Authors: Jiajin Tang,Zhengxuan Wei,Yuchen Zhu,Cheng Shi,Guanbin Li,Liang Lin,Sibei Yang
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: This work is accepted by ICCV 2025

[CV-208] Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation

Link: https://arxiv.org/abs/2509.23866
Authors: Pengxiang Li,Zechen Hu,Zirui Shang,Jingrong Wu,Yang Liu,Hui Liu,Zhi Gao,Chenrui Shi,Bofei Zhang,Zihao Zhang,Xiaochuan Shi,Zedong YU,Yuwei Wu,Xinxiao Wu,Yunde Jia,Liuyu Xiang,Zhaofeng He,Qing Li
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-209] GroupCoOp: Group-robust Fine-tuning via Group Prompt Learning NEURIPS2024

Link: https://arxiv.org/abs/2509.23781
Authors: Nayeong Kim,Seong Joon Oh,Suha Kwak
Affiliations: Pohang University of Science and Technology (POSTECH), South Korea; Tübingen AI Center, Universität Tübingen
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: This paper was first submitted to NeurIPS 2024 in May 2024

[CV-210] Texture Vector-Quantization and Reconstruction Aware Prediction for Generative Super-Resolution

Link: https://arxiv.org/abs/2509.23774
Authors: Qifan Li,Jiale Zou,Jinhua Zhang,Wei Long,Xinyu Zhou,Shuhang Gu
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-211] A Modality-Tailored Graph Modeling Framework for Urban Region Representation via Contrastive Learning

Link: https://arxiv.org/abs/2509.23772
Authors: Yaya Zhao,Kaiqi Zhao,Zixuan Tang,Zhiyuan Liu,Xiaoling Lu,Yalei Du
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Applications (stat.AP)
Comments:

[CV-212] GenView: Unifying Adaptive View Generation and Quality-Driven Supervision for Contrastive Representation Learning

Link: https://arxiv.org/abs/2509.23770
Authors: Xiaojie Li,Bei Wang,Jianlong Wu,Yue Yu,Liqiang Nie,Min Zhang
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: The code is available at this https URL

[CV-213] ReLumix: Extending Image Relighting to Video via Video Diffusion Models

Link: https://arxiv.org/abs/2509.23769
Authors: Lezhong Wang,Shutong Jin,Ruiqi Cui,Anders Bjorholm Dahl,Jeppe Revall Frisvad,Siavash Bigdeli
Affiliations: Technical University of Denmark; KTH Royal Institute of Technology
Categories: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
Comments: Project page: this https URL

[CV-214] Accuracy-Robustness Trade Off via Spiking Neural Network Gradient Sparsity Trail

Link: https://arxiv.org/abs/2509.23762
Authors: Nhan T. Luu
Affiliations: Unknown
Categories: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Comments: Work under review

[CV-215] UniAlignment: Semantic Alignment for Unified Image Generation Understanding Manipulation and Perception

Link: https://arxiv.org/abs/2509.23760
Authors: Xinyang Song,Libin Wang,Weining Wang,Shaozhen Liu,Dandan Zheng,Jingdong Chen,Qi Li,Zhenan Sun
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-216] Transparent Visual Reasoning via Object-Centric Agent Collaboration

Link: https://arxiv.org/abs/2509.23757
Authors: Benjamin Teoh,Ben Glocker,Francesca Toni,Avinash Kori
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-217] PVTAdpNet: Polyp Segmentation using Pyramid vision transformer with a novel Adapter block

Link: https://arxiv.org/abs/2509.23751
Authors: Arshia Yousefi Nezhad,Helia Aghaei,Hedieh Sajedi
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments:

[CV-218] Poivre: Self-Refining Visual Pointing with Reinforcement Learning

Link: https://arxiv.org/abs/2509.23746
Authors: Wenjie Yang,Zengfeng Huang
Affiliations: Fudan University; Shanghai Innovation Institute
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments:

[CV-219] GBSK: Skeleton Clustering via Granular-ball Computing and Multi-Sampling for Large-Scale Data

Link: https://arxiv.org/abs/2509.23742
Authors: Yewang Chen,Junfeng Li,Shuyin Xia,Qinghong Lai,Xinbo Gao,Guoyin Wang,Dongdong Cheng,Yi Liu,Yi Wang
Affiliations: Huaqiao University; Chongqing University of Posts and Telecommunications; Chongqing Normal University; Yangtze Normal University; Ant Group
Categories: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)
Comments:

[CV-220] ResAD: Towards Class Agnostic Anomaly Detection via Residual Feature Learning NEURIPS2024

Link: https://arxiv.org/abs/2509.23741
Authors: Xincheng Yao,Chao Shi,Muming Zhao,Guangtao Zhai,Chongyang Zhang
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments: This paper is an extended version of our NeurIPS 2024 paper, ResAD. arXiv admin note: substantial text overlap with arXiv:2410.20047

[CV-221] GRS-SLAM3R: Real-Time Dense SLAM with Gated Recurrent State

Link: https://arxiv.org/abs/2509.23737
Authors: Guole Shen,Tianchen Deng,Yanbo Wang,Yongtao Chen,Yilin Shen,Jiuming Liu,Jingchuan Wang
Affiliations: Shanghai Jiao Tong University; Institute of Medical Robotics; School of Automation and Intelligent Sensing; Key Laboratory of System Control and Information Processing, Ministry of Education of China
Categories: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Comments:

[CV-222] HieraTok: Multi-Scale Visual Tokenizer Improves Image Reconstruction and Generation

Link: https://arxiv.org/abs/2509.23736
Authors: Cong Chen,Ziyuan Huang,Cheng Zou,Muzhi Zhu,Kaixiang Ji,Jiajia Liu,Jingdong Chen,Hao Chen,Chunhua Shen
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments:

[CV-223] FastViDAR: Real-Time Omnidirectional Depth Estimation via Alternative Hierarchical Attention

Link: https://arxiv.org/abs/2509.23733
Authors: Hangtian Zhao,Xiang Chen,Yizhe Li,Qianhao Wang,Haibo Lu,Fei Gao
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Comments:

[CV-224] LUQ: Layerwise Ultra-Low Bit Quantization for Multimodal Large Language Models

Link: https://arxiv.org/abs/2509.23729
Authors: Shubhang Bhatnagar,Andy Xu,Kar-Han Tan,Narendra Ahuja
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
Comments:

[CV-225] M3DLayout: A Multi-Source Dataset of 3D Indoor Layouts and Structured Descriptions for 3D Generation

Link: https://arxiv.org/abs/2509.23728
Authors: Yiheng Zhang,Zhuojiang Cai,Mingdao Wang,Meitong Guo,Tianxiao Li,Li Lin,Yuwang Wang
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments: this https URL

[CV-226] Video Panels for Long Video Understanding

Link: https://arxiv.org/abs/2509.23724
Authors: Lars Doorenbos,Federico Spurio,Juergen Gall
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Comments:

[CV-227] DiffPCN: Latent Diffusion Model Based on Multi-view Depth Images for Point Cloud Completion

Link: https://arxiv.org/abs/2509.23723
Authors: Zijun Li,Hongyu Yan,Shijie Li,Kunming Luo,Li Lu,Xulei Yang,Weisi Lin
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-228] PD-Diag-Net: Clinical-Priors guided Network on Brain MRI for Auxiliary Diagnosis of Parkinson's Disease

Link: https://arxiv.org/abs/2509.23719
Authors: Shuai Shao,Shu Jiang,Shiyuan Zhao,Di Yang,Yan Wang,Yutong Bai,Jianguo Zhang,Jiangtao Wang
Affiliations: Unknown
Categories: Computer Vision and Pattern Recognition (cs.CV)
Comments:

[CV-229] Diff-3DCap: Shape Captioning with Diffusion Models

Link: https://arxiv.org/abs/2509.23718
Authors: Zhenyu Shu,Jiawei Wen,Shiyang Li,Shiqing Xin,Ligang Liu
Affiliations: Unknown
Categories: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Comments:

[CV-230] StrucADT: Generating Structure-controlled 3D Point Clouds with Adjacency Diffusion Transformer

链接: https://arxiv.org/abs/2509.23709
作者: Zhenyu Shu,Jiajun Shen,Zhongui Chen,Xiaoguang Han,Shiqing Xin
机构: NingboTech University (宁波工程学院); Zhejiang University (浙江大学); Xiamen University (厦门大学); Chinese University of Hong Kong (香港中文大学); ShanDong University (山东大学)
类目: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
备注:

点击查看摘要

[CV-231] CrimEdit: Controllable Editing for Counterfactual Object Removal, Insertion, and Movement

链接: https://arxiv.org/abs/2509.23708
作者: Boseong Jeon,Junghyuk Lee,Jimin Park,Kwanyoung Kim,Jingi Jung,Sangwon Lee,Hyunbo Shim
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-232] DFG-PCN: Point Cloud Completion with Degree-Flexible Point Graph

链接: https://arxiv.org/abs/2509.23703
作者: Zhenyu Shu,Jian Yao,Shiqing Xin
机构: 未知
类目: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
备注:

点击查看摘要

[CV-233] INSTINCT: Instance-Level Interaction Architecture for Query-Based Collaborative Perception

链接: https://arxiv.org/abs/2509.23700
作者: Yunjiang Xu,Lingzhi Li,Jin Wang,Yupeng Ouyang,Benyuan Yang
机构: Soochow University (苏州大学)
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: 14 pages, 8 figures

点击查看摘要

[CV-234] Confidence Aware SSD Ensemble with Weighted Boxes Fusion for Weapon Detection

链接: https://arxiv.org/abs/2509.23697
作者: Atharva Jadhav,Arush Karekar,Manas Divekar,Shachi Natu
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
备注:

点击查看摘要

[CV-235] QuantSparse: Comprehensively Compressing Video Diffusion Transformer with Model Quantization and Attention Sparsification

链接: https://arxiv.org/abs/2509.23681
作者: Weilun Feng,Chuanguang Yang,Haotong Qin,Mingqiang Wu,Yuqi Li,Xiangqi Li,Zhulin An,Libo Huang,Yulun Zhang,Michele Magno,Yongjun Xu
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-236] MSD-KMamba: Bidirectional Spatial-Aware Multi-Modal 3D Brain Segmentation via Multi-scale Self-Distilled Fusion Strategy

链接: https://arxiv.org/abs/2509.23677
作者: Dayu Tan,Ziwei Zhang,Yansan Su,Xin Peng,Yike Dai,Chunhou Zheng,Weimin Zhong
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-237] Token Merging via Spatiotemporal Information Mining for Surgical Video Understanding

链接: https://arxiv.org/abs/2509.23672
作者: Xixi Jiang,Chen Yang,Dong Zhang,Pingcheng Dong,Xin Yang,Kwang-Ting Cheng
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-238] HIVTP: A Training-Free Method to Improve VLMs Efficiency via Hierarchical Visual Token Pruning Using Middle-Layer-Based Importance Score

链接: https://arxiv.org/abs/2509.23663
作者: Jingqi Xu,Jingxi Lu,Chenghao Li,Sreetama Sarkar,Peter A. Beerel
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-239] LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training

链接: https://arxiv.org/abs/2509.23661
作者: Xiang An,Yin Xie,Kaicheng Yang,Wenkang Zhang,Xiuwei Zhao,Zheng Cheng,Yirui Wang,Songcen Xu,Changrui Chen,Chunsheng Wu,Huajie Tan,Chunyuan Li,Jing Yang,Jie Yu,Xiyao Wang,Bin Qin,Yumeng Wang,Zizhen Yan,Ziyong Feng,Ziwei Liu,Bo Li,Jiankang Deng
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: LLaVA-OneVision-1.5 Technical Report

点击查看摘要

[CV-240] Focusing on What Matters: Object-Agent-centric Tokenization for Vision Language Action models

链接: https://arxiv.org/abs/2509.23655
作者: Rokas Bendikas,Daniel Dijkman,Markus Peschl,Sanjay Haresh,Pietro Mazzaglia
机构: 未知
类目: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
备注: Presented at 9th Conference on Robot Learning (CoRL 2025), Seoul, Korea

点击查看摘要

[CV-241] ReWatch-R1: Boosting Complex Video Reasoning in Large Vision-Language Models through Agentic Data Synthesis

链接: https://arxiv.org/abs/2509.23652
作者: Congzhi Zhang,Zhibin Wang,Yinchao Ma,Jiawei Peng,Yihan Wang,Qiang Zhou,Jun Song,Bo Zheng
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-242] Color-Pair Guided Robust Zero-Shot 6D Pose Estimation and Tracking of Cluttered Objects on Edge Devices

链接: https://arxiv.org/abs/2509.23647
作者: Xingjian Yang,Ashis G. Banerjee
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
备注:

点击查看摘要

[CV-243] Sparse-Up: Learnable Sparse Upsampling for 3D Generation with High-Fidelity Textures

链接: https://arxiv.org/abs/2509.23646
作者: Lu Xiao,Jiale Zhang,Yang Liu,Taicheng Huang,Xin Tian
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-244] Griffin: Generative Reference and Layout Guided Image Composition

链接: https://arxiv.org/abs/2509.23643
作者: Aryan Mikaeili,Amirhossein Alimohammadi,Negar Hassanpour,Ali Mahdavi-Amiri,Andrea Tagliasacchi
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-245] From Static to Dynamic: a Survey of Topology-Aware Perception in Autonomous Driving

链接: https://arxiv.org/abs/2509.23641
作者: Yixiao Chen,Ruining Yang,Xin Chen,Jia He,Dongliang Xu,Yue Yao
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
备注: 13 pages, 3 figures

点击查看摘要

[CV-246] EfficientMIL: Efficient Linear-Complexity MIL Method for WSI Classification

链接: https://arxiv.org/abs/2509.23640
作者: Chengying She,Ben Wang,Xinran Zhang,Dongjie Fan,Jialu Zhang,Chengwei Chen,Lizhuang Liu
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: Submitted to iScience

点击查看摘要

[CV-247] LightFair: Towards an Efficient Alternative for Fair T2I Diffusion via Debiasing Pre-trained Text Encoders

链接: https://arxiv.org/abs/2509.23639
作者: Boyu Han,Qianqian Xu,Shilong Bao,Zhiyong Yang,Kangli Zi,Qingming Huang
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
备注:

点击查看摘要

[CV-248] MotionVerse: A Unified Multimodal Framework for Motion Comprehension, Generation and Editing

链接: https://arxiv.org/abs/2509.23635
作者: Ruibing Hou,Mingshuang Luo,Hongyu Pan,Hong Chang,Shiguang Shan
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: 17 pages, 6 figures

点击查看摘要

[CV-249] Efficient Domain-Adaptive Multi-Task Dense Prediction with Vision Foundation Models

链接: https://arxiv.org/abs/2509.23626
作者: Beomseok Kang,Niluthpol Chowdhury Mithun,Mikhail Sizintsev,Han-Pang Chiu,Supun Samarasekera
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-250] DiffInk: Glyph- and Style-Aware Latent Diffusion Transformer for Text to Online Handwriting Generation

链接: https://arxiv.org/abs/2509.23624
作者: Wei Pan,Huiguo He,Hiuyi Cheng,Yilin Shi,Lianwen Jin
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: 24 pages, 16 figures

点击查看摘要

[CV-251] BioVessel-Net and RetinaMix: Unsupervised Retinal Vessel Segmentation from OCTA Images

链接: https://arxiv.org/abs/2509.23617
作者: Cheng Huang,Weizheng Xie,Fan Gao,Yutong Liu,Ruoling Wu,Zeyu Han,Jingxi Qiu,Xiangxiang Wang,Zhenglin Yang,Hao Wang,Yongbin Yu
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-252] InteractMove: Text-Controlled Human-Object Interaction Generation in 3D Scenes with Movable Objects

链接: https://arxiv.org/abs/2509.23612
作者: Xinhao Cai,Minghang Zheng,Xin Jin,Yang Liu
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-253] Efficient Audio-Visual Speech Separation with Discrete Lip Semantics and Multi-Scale Global-Local Attention

链接: https://arxiv.org/abs/2509.23610
作者: Kai Li,Kejun Gao,Xiaolin Hu
机构: 未知
类目: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
备注: Technical Report

点击查看摘要

[CV-254] FlowLUT: Efficient Image Enhancement via Differentiable LUTs and Iterative Flow Matching

链接: https://arxiv.org/abs/2509.23608
作者: Liubing Hu,Chen Wu,Anrui Wang,Dianjie Lu,Guijuan Zhang,Zhuoran Zheng
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-255] ZeroScene: A Zero-Shot Framework for 3D Scene Generation from a Single Image and Controllable Texture Editing

链接: https://arxiv.org/abs/2509.23607
作者: Xiang Tang,Ruotong Li,Xiaopeng Fan
机构: 未知
类目: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
备注: 16 pages, 15 figures, Project page: this https URL

点击查看摘要

[CV-256] VMDiff: Visual Mixing Diffusion for Limitless Cross-Object Synthesis

链接: https://arxiv.org/abs/2509.23605
作者: Zeren Xiong,Yue Yu,Zedong Zhang,Shuo Chen,Jian Yang,Jun Li
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-257] MAN: Latent Diffusion Enhanced Multistage Anti-Noise Network for Efficient and High-Quality Low-Dose CT Image Denoising ICASSP2026

链接: https://arxiv.org/abs/2509.23603
作者: Tangtangfang Fang,Jingxi Hu,Xiangjian He,Jiaqi Yang
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: Submitted to ICASSP 2026

点击查看摘要

[CV-258] Deep Taxonomic Networks for Unsupervised Hierarchical Prototype Discovery NEURIPS2025

链接: https://arxiv.org/abs/2509.23602
作者: Zekun Wang,Ethan Haarer,Zhiyi Dai,Tianyi Zhu,Christopher J. MacLellan
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: NeurIPS 2025

点击查看摘要

[CV-259] VAMamba: An Efficient Visual Adaptive Mamba for Image Restoration

链接: https://arxiv.org/abs/2509.23601
作者: Han Hu,Zhuoran Zheng,Liang Li,Chen Lyu
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-260] Multi-Level Heterogeneous Knowledge Transfer Network on Forward Scattering Center Model for Limited Samples SAR ATR

链接: https://arxiv.org/abs/2509.23596
作者: Chenxi Zhao,Daochang Wang,Siqian Zhang,Gangyao Kuang
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-261] StolenLoRA: Exploring LoRA Extraction Attacks via Synthetic Data ICCV2025

链接: https://arxiv.org/abs/2509.23594
作者: Yixu Wang,Yan Teng,Yingchun Wang,Xingjun Ma
机构: 未知
类目: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
备注: ICCV 2025

点击查看摘要

[CV-262] BridgeDrive: Diffusion Bridge Policy for Closed-Loop Trajectory Planning in Autonomous Driving

链接: https://arxiv.org/abs/2509.23589
作者: Shu Liu,Wenlin Chen,Weihao Li,Zheng Wang,Lijin Yang,Jianing Huang,Yipin Zhang,Zhongzhan Huang,Ze Cheng,Hao Yang
机构: Bosch (中国)投资有限公司
类目: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
备注: 16 pages, 7 figures, 6 tables

点击查看摘要

[CV-263] VividFace: High-Quality and Efficient One-Step Diffusion For Video Face Enhancement

链接: https://arxiv.org/abs/2509.23584
作者: Shulian Zhang,Yong Guo,Long Peng,Ziyang Wang,Ye Chen,Wenbo Li,Xiao Zhang,Yulun Zhang,Jian Chen
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-264] RobuQ: Pushing DiTs to W1.58A2 via Robust Activation Quantization

链接: https://arxiv.org/abs/2509.23582
作者: Kaicheng Yang,Xun Zhang,Haotong Qin,Yucheng Lin,Kaisen Yang,Xianglong Yan,Yulun Zhang
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: The code and models will be available at this https URL

点击查看摘要

[CV-265] Automated design of compound lenses with discrete-continuous optimization SIGGRAPH

链接: https://arxiv.org/abs/2509.23572
作者: Arjun Teh,Delio Vicini,Bernd Bickel,Ioannis Gkioulekas,Matthew O’Toole
机构: 未知
类目: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Applied Physics (physics.app-ph)
备注: SIGGRAPH Asia 2025, project website: this https URL

点击查看摘要

[CV-266] Towards Interpretable Visual Decoding with Attention to Brain Representations

链接: https://arxiv.org/abs/2509.23566
作者: Pinyuan Feng,Hossein Adeli,Wenxuan Guo,Fan Cheng,Ethan Hwang,Nikolaus Kriegeskorte
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: 10 pages, 7 figures, under review

点击查看摘要

[CV-267] RAVEN: Resilient Aerial Navigation via Open-Set Semantic Memory and Behavior Adaptation

链接: https://arxiv.org/abs/2509.23563
作者: Seungchan Kim,Omar Alama,Dmytro Kurdydyk,John Keller,Nikhil Keetha,Wenshan Wang,Yonatan Bisk,Sebastian Scherer
机构: Carnegie Mellon University (卡内基梅隆大学); Davidson College (戴维森学院)
类目: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
备注:

点击查看摘要

[CV-268] Pancreas Part Segmentation under Federated Learning Paradigm

链接: https://arxiv.org/abs/2509.23562
作者: Ziliang Hong,Halil Ertugrul Aktas,Andrea Mia Bejar,Katherine Wu,Hongyi Pan,Gorkem Durak,Zheyuan Zhang,Sait Kayali,Temel Tirkes,Federica Proietto Salanitri,Concetto Spampinato,Michael Goggins,Tamas Gonda,Candice Bolan,Raj Keswani,Frank Miller,Michael Wallace,Ulas Bagci
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-269] From Fields to Splats: A Cross-Domain Survey of Real-Time Neural Scene Representations

链接: https://arxiv.org/abs/2509.23555
作者: Javed Ahmad,Penggang Gao,Donatien Delehelle,Mennuti Canio,Nikhil Deshpande,Jesús Ortiz,Darwin G. Caldwell,Yonas Teodros Tefera
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: 18 pages

点击查看摘要

[CV-270] OVSeg3R: Learn Open-vocabulary Instance Segmentation from 2D via 3D Reconstruction

链接: https://arxiv.org/abs/2509.23541
作者: Hongyang Li,Jinyuan Qu,Lei Zhang
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-271] Calibrated and Resource-Aware Super-Resolution for Reliable Driver Behavior Analysis

链接: https://arxiv.org/abs/2509.23535
作者: Ibne Farabi Shihab,Weiheng Chai,Jiyang Wang,Sanjeda Akter,Senem Velipasalar Gursoy,Anuj Sharma
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-272] Imaging-Based Mortality Prediction in Patients with Systemic Sclerosis MICCAI

链接: https://arxiv.org/abs/2509.23530
作者: Alec K. Peltekian,Karolina Senkow,Gorkem Durak,Kevin M. Grudzinski,Bradford C. Bemiss,Jane E. Dematte,Carrie Richardson,Nikolay S. Markov,Mary Carns,Kathleen Aren,Alexandra Soriano,Matthew Dapas,Harris Perlman,Aaron Gundersheimer,Kavitha C. Selvan,John Varga,Monique Hinchcliff,Krishnan Warrior,Catherine A. Gao,Richard G. Wunderink,GR Scott Budinger,Alok N. Choudhary,Anthony J. Esposito,Alexander V. Misharin,Ankit Agrawal,Ulas Bagci
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注: 11 pages, 4 figures, 1 table, accepted in MICCAI PRIME 2025

点击查看摘要

[CV-273] Evaluating point-light biological motion in multimodal large language models

链接: https://arxiv.org/abs/2509.23517
作者: Akila Kadambi,Marco Iacoboni,Lisa Aziz-Zadeh,Srini Narayanan
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-274] Enhancing Polyp Segmentation via Encoder Attention and Dynamic Kernel Update

链接: https://arxiv.org/abs/2509.23502
作者: Fatemeh Salahi Chashmi,Roya Sotoudeh
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[CV-275] Orientation-anchored Hyper-Gaussian for 4D Reconstruction from Casual Videos NEURIPS2025

链接: https://arxiv.org/abs/2509.23492
作者: Junyi Wu,Jiachen Tao,Haoxuan Wang,Gaowen Liu,Ramana Rao Kompella,Yan Yan
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: NeurIPS 2025. Code: OriGS (this https URL)

点击查看摘要

[CV-276] RestoRect: Degraded Image Restoration via Latent Rectified Flow Feature Distillation

链接: https://arxiv.org/abs/2509.23480
作者: Shourya Verma,Mengbo Wang,Nadia Atallah Lanman,Ananth Grama
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-277] Robust Multi-Modal Face Anti-Spoofing with Domain Adaptation: Tackling Missing Modalities, Noisy Pseudo-Labels, and Model Degradation

链接: https://arxiv.org/abs/2509.23475
作者: Ming-Tsung Hsu,Fang-Yu Hsu,Yi-Ting Lin,Kai-Heng Chien,Jun-Ren Chen,Cheng-Hsiang Su,Yi-Chen Ou,Chiou-Ting Hsu,Pei-Kai Huang
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-278] No Concept Left Behind: Test-Time Optimization for Compositional Text-to-Image Generation

链接: https://arxiv.org/abs/2509.23457
作者: Mohammad Hossein Sameti,Amir M. Mansourian,Arash Marioriyad,Soheil Fadaee Oshyani,Mohammad Hossein Rohban,Mahdieh Soleymani Baghshah
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注: 8 pages, 8 figures, 1 table

点击查看摘要

[CV-279] 3DPCNet: Pose Canonicalization for Robust Viewpoint-Invariant 3D Kinematic Analysis from Monocular RGB cameras

链接: https://arxiv.org/abs/2509.23455
作者: Tharindu Ekanayake,Constantino Álvarez Casado,Miguel Bordallo López
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
备注: 8 pages, 6 figures, 1 table, 21 references, conference, Code available at: this https URL

点击查看摘要

[CV-280] FM-SIREN & FM-FINER: Nyquist-Informed Frequency Multiplier for Implicit Neural Representation with Periodic Activation

链接: https://arxiv.org/abs/2509.23438
作者: Mohammed Alsakabi,Wael Mobeirek,John M. Dolan,Ozan K. Tonguz
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

[CV-281] FracDetNet: Advanced Fracture Detection via Dual-Focus Attention and Multi-scale Calibration in Medical X-ray Imaging

链接: https://arxiv.org/abs/2509.23416
作者: Yuyang Sun,Cuiming Zou
机构: 未知
类目: Computer Vision and Pattern Recognition (cs.CV)
备注:

点击查看摘要

人工智能

[AI-0] XQC: Well-conditioned Optimization Accelerates Deep Reinforcement Learning

链接: https://arxiv.org/abs/2509.25174
作者: Daniel Palenicek,Florian Vogt,Joe Watson,Ingmar Posner,Jan Peters
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-1] GLASS Flows: Transition Sampling for Alignment of Flow and Diffusion Models

链接: https://arxiv.org/abs/2509.25170
作者: Peter Holderrieth,Uriel Singer,Tommi Jaakkola,Ricky T. Q. Chen,Yaron Lipman,Brian Karrer
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
备注:

点击查看摘要

[AI-2] Chance-constrained Flow Matching for High-Fidelity Constraint-aware Generation

链接: https://arxiv.org/abs/2509.25157
作者: Jinhao Liang,Yixuan Sun,Anirban Samaddar,Sandeep Madireddy,Ferdinando Fioretto
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-3] Who's Your Judge? On the Detectability of LLM-Generated Judgments

链接: https://arxiv.org/abs/2509.25154
作者: Dawei Li,Zhen Tan,Chengshuai Zhao,Bohan Jiang,Baixiang Huang,Pingchuan Ma,Abdullah Alnaibari,Kai Shu,Huan Liu
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注: Under review

点击查看摘要

[AI-4] UniAPL: A Unified Adversarial Preference Learning Framework for Instruct-Following

链接: https://arxiv.org/abs/2509.25148
作者: FaQiang Qian,WeiKun Zhang,Ziliang Wang,Kang An,Xuhui Zheng,Liangjian Wen,Mengya Gao,Yong Dai,Yichao Wu
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-5] Visual serial processing deficits explain divergences in human and VLM reasoning

链接: https://arxiv.org/abs/2509.25142
作者: Nicholas Budny,Kia Ghods,Declan Campbell,Raja Marjieh,Amogh Joshi,Sreejan Kumar,Jonathan D. Cohen,Taylor W. Webb,Thomas L. Griffiths
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-6] HeDA: An Intelligent Agent System for Heatwave Risk Discovery through Automated Knowledge Graph Construction and Multi-layer Risk Propagation Analysis

链接: https://arxiv.org/abs/2509.25112
作者: Yiquan Wang,Tin-Yeh Huang,Qingyun Gao,Jialin Zhang
机构: 未知
类目: Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
备注:

点击查看摘要

[AI-7] Optimizing Privacy-Preserving Primitives to Support LLM-Scale Applications

链接: https://arxiv.org/abs/2509.25072
作者: Yaman Jandali,Ruisi Zhang,Nojan Sheybani,Farinaz Koushanfar
机构: 未知
类目: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
备注:

点击查看摘要

[AI-8] Cogito Ergo Ludo: An Agent that Learns to Play by Reasoning and Planning

链接: https://arxiv.org/abs/2509.25052
作者: Sai Wang,Yu Wu,Zhongwen Xu
机构: 未知
类目: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
备注:

点击查看摘要

[AI-9] Scaling Synthetic Task Generation for Agents via Exploration

链接: https://arxiv.org/abs/2509.25047
作者: Ram Ramrakhya,Andrew Szot,Omar Attia,Yuhao Yang,Anh Nguyen,Bogdan Mazoure,Zhe Gan,Harsh Agrawal,Alexander Toshev
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-10] Large Language Models for Software Testing: A Research Roadmap

链接: https://arxiv.org/abs/2509.25043
作者: Cristian Augusto,Antonia Bertolino,Guglielmo De Angelis,Francesca Lonetti,Jesús Morán
机构: 未知
类目: Software Engineering (cs.SE); Artificial Intelligence (cs.AI)
备注: 40 pages 10 figures Submitted on 29th September 2025

点击查看摘要

[AI-11] CLPO: Curriculum Learning meets Policy Optimization for LLM Reasoning

链接: https://arxiv.org/abs/2509.25004
作者: Shijie Zhang,Guohao Sun,Kevin Zhang,Xiang Guo,Rujun Guo
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-12] Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards

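Quick read: This paper addresses the training instability and diversity collapse of reinforcement learning with verifiable rewards (RLVR) on math reasoning tasks. Existing methods such as PPO and GRPO rely on generalized policy iteration, whose alternation between evaluating and improving the policy tends to cause training oscillation and requires careful tuning. The key insight is that RLVR for math reasoning can be formalized as a finite-horizon Markov Decision Process (MDP) with deterministic state transitions, tree-structured dynamics, and binary terminal rewards. This structure is much simpler than general-purpose control settings, so the sophisticated tricks of conventional algorithms can be reduced or even omitted. Building on this, the authors prove a key result: the optimal action can be recovered directly from the Q-function of a fixed uniformly random policy, which bypasses the policy iteration loop and its associated heuristics. The resulting algorithm, ROVER (Random Policy Valuation for Diverse Reasoning), selects actions by sampling from a softmax over the uniform-policy Q-values, giving a minimalist but effective training scheme. Across multiple base models and standard math reasoning benchmarks it improves performance (+8.2 on pass@1, +16.8 on pass@256) while preserving higher diversity of reasoning paths (+17.6%).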

链接: https://arxiv.org/abs/2509.24981
作者: Haoran He,Yuxiao Ye,Qingpeng Cai,Chen Hu,Binxing Jiao,Daxin Jiang,Ling Pan
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: 32 pages

点击查看摘要

Abstract: RL with Verifiable Rewards (RLVR) has emerged as a promising paradigm for improving the reasoning abilities of large language models (LLMs). Current methods rely primarily on policy optimization frameworks like PPO and GRPO, which follow generalized policy iteration that alternates between evaluating the current policy’s value and improving the policy based on evaluation. While effective, they often suffer from training instability and diversity collapse, requiring complex heuristic tricks and careful tuning. We observe that standard RLVR in math reasoning can be formalized as a specialized finite-horizon Markov Decision Process with deterministic state transitions, tree-structured dynamics, and binary terminal rewards. Though large in scale, the underlying structure is simpler than general-purpose control settings for which popular RL algorithms (e.g., PPO) were developed, suggesting that several sophisticated techniques in existing methods may be reduced or even omitted. Based on this insight, we prove a surprising result: the optimal action can be recovered from the Q-function of a fixed uniformly random policy, thereby bypassing the generalized policy iteration loop and its associated heuristics. We introduce Random Policy Valuation for Diverse Reasoning (ROVER) to translate this principle into a practical and scalable algorithm for LLM math reasoning, a minimalist yet highly effective RL method that samples actions from a softmax over these uniform-policy Q-values. ROVER preserves diversity throughout training, allowing sustained exploration of multiple valid pathways. Across multiple base models and standard math reasoning benchmarks, ROVER demonstrates superior performance in both quality (+8.2 on pass@1, +16.8 on pass@256) and diversity (+17.6%), despite its radical simplification compared to strong, complicated existing methods.
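The central claim of the abstract, that acting on the Q-values of a fixed uniformly random policy is enough in a tree-structured MDP with deterministic transitions and binary terminal rewards, can be illustrated on a toy problem. The sketch below is not the paper's ROVER implementation: the tree, the node names, the `temperature` value, and all function names are hypothetical and chosen only for illustration. It computes uniform-policy Q-values exactly by recursion, then follows them greedily and via softmax sampling.

```python
import math
import random

# Hypothetical toy tree (an assumption for illustration): internal nodes map
# to their children, and leaves carry a binary terminal reward.
TREE = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
}
REWARD = {"a1": 0.0, "a2": 0.0, "b1": 0.0, "b2": 1.0}


def uniform_value(node):
    """Expected terminal reward when every action below `node` is uniform random."""
    if node in REWARD:
        return REWARD[node]
    children = TREE[node]
    return sum(uniform_value(child) for child in children) / len(children)


def uniform_q(node):
    """Uniform-policy Q-values; with deterministic transitions Q(s, a) = V(child)."""
    return {child: uniform_value(child) for child in TREE[node]}


def greedy_rollout(node="root"):
    """Always take the action with the largest uniform-policy Q-value."""
    while node not in REWARD:
        q = uniform_q(node)
        node = max(q, key=q.get)
    return node


def softmax_rollout(node="root", temperature=0.1, rng=None):
    """Sample each action from a softmax over the uniform-policy Q-values."""
    rng = rng or random.Random(0)
    while node not in REWARD:
        q = uniform_q(node)
        children = list(q)
        weights = [math.exp(q[child] / temperature) for child in children]
        node = rng.choices(children, weights=weights, k=1)[0]
    return node


if __name__ == "__main__":
    print(uniform_q("root"))
    print(greedy_rollout())
    print(softmax_rollout())
```

In this toy setting a child has a positive uniform-policy value exactly when some rewarded leaf is reachable below it, so the greedy rollout never leaves a branch that contains a correct answer. Softmax sampling trades some of that certainty for diversity among branches with nonzero value.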

[AI-13] Agentic Exploration of Physics Models

链接: https://arxiv.org/abs/2509.24978
作者: Maximilian Nägele,Florian Marquardt
机构: 未知
类目: Artificial Intelligence (cs.AI); Quantum Gases (cond-mat.quant-gas); Quantum Physics (quant-ph)
备注:

点击查看摘要

[AI-14] SecInfer: Preventing Prompt Injection via Inference-time Scaling

链接: https://arxiv.org/abs/2509.24967
作者: Yupei Liu,Yanting Wang,Yuqi Jia,Jinyuan Jia,Neil Zhenqiang Gong
机构: 未知
类目: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-15] MSG: Multi-Stream Generative Policies for Sample-Efficient Robotic Manipulation

链接: https://arxiv.org/abs/2509.24956
作者: Jan Ole von Hartz,Lukas Schweizer,Joschka Boedecker,Abhinav Valada
机构: 未知
类目: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
备注:

点击查看摘要

[AI-16] Learning Distinguishable Representations in Deep Q-Networks for Linear Transfer

链接: https://arxiv.org/abs/2509.24947
作者: Sooraj Sathish,Keshav Goyal,Raghuram Bharadwaj Diddigi
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-17] KIRETT - A wearable device to support rescue operations using artificial intelligence to improve first aid

链接: https://arxiv.org/abs/2509.24934
作者: Johannes Zenkert,Christian Weber,Mubaris Nadeem,Lisa Bender,Madjid Fathi,Abu Shad Ahammed,Aniebiet Micheal Ezekiel,Roman Obermaisser,Maximilian Bradford
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注: Conference Paper for 2022 IEEE International Smart Cities Conference (ISC2), KIRETT Project, University of Siegen, Germany

点击查看摘要

[AI-18] When Autonomous Vehicle Meets V2X Cooperative Perception: How Far Are We?

链接: https://arxiv.org/abs/2509.24927
作者: An Guo,Shuoxiao Zhang,Enyi Tang,Xinyu Gao,Haomin Pang,Haoxiang Tian,Yanzhou Mu,Wu Wen,Chunrong Fang,Zhenyu Chen
机构: 未知
类目: Artificial Intelligence (cs.AI); Robotics (cs.RO); Software Engineering (cs.SE)
备注: The paper has been accepted by the 40th IEEE/ACM International Conference on Automated Software Engineering, ASE 2025

点击查看摘要

[AI-19] Meta-Learning Theory-Informed Inductive Biases using Deep Kernel Gaussian Processes

链接: https://arxiv.org/abs/2509.24919
作者: Bahti Zakirov,Gašper Tkačik
机构: 未知
类目: Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC)
备注: 13 pages, 5 figures, 9 SI figures

点击查看摘要

[AI-20] RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark

链接: https://arxiv.org/abs/2509.24897
作者: Yang Shi,Yuhao Dong,Yue Ding,Yuran Wang,Xuanyu Zhu,Sheng Zhou,Wenting Liu,Haochen Tian,Rundong Wang,Huanqian Wang,Zuyan Liu,Bohan Zeng,Ruizhe Chen,Qixun Wang,Zhuoran Zhang,Xinlong Chen,Chengzhuo Tong,Bozhou Li,Chaoyou Fu,Qiang Liu,Haotian Wang,Wenjing Yang,Yuanxing Zhang,Pengfei Wan,Yi-Fan Zhang,Ziwei Liu
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-21] Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime

链接: https://arxiv.org/abs/2509.24882
作者: Leonardo Defilippis,Yizhou Xu,Julius Girardin,Emanuele Troiani,Vittorio Erba,Lenka Zdeborová,Bruno Loureiro,Florent Krzakala
机构: 未知
类目: Machine Learning (cs.LG); Disordered Systems and Neural Networks (cond-mat.dis-nn); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
备注:

点击查看摘要

[AI-22] The Emergence of Social Science of Large Language Models

链接: https://arxiv.org/abs/2509.24877
作者: Xiao Jia,Zhanzhan Zhao
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-23] Uncertainty-Guided Expert-AI Collaboration for Efficient Soil Horizon Annotation ECAI2025

链接: https://arxiv.org/abs/2509.24873
作者: Teodor Chiaburu,Vipin Singh,Frank Haußer,Felix Bießmann
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: 11 pages, 7 figures, presented at ECAI 2025, CLEAR-AI Workshop, Bologna

点击查看摘要

[AI-24] PhysicsMinions: Winning Gold Medals in the Latest Physics Olympiads with a Coevolutionary Multimodal Multi-Agent System

链接: https://arxiv.org/abs/2509.24855
作者: Fangchen Yu,Junchi Yao,Ziyi Wang,Haiyuan Wan,Youling Huang,Bo Zhang,Shuyue Hu,Dongzhan Zhou,Ning Ding,Ganqu Cui,Lei Bai,Wanli Ouyang,Peng Ye
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-25] Evaluating SAP Joule for Code Generation

链接: https://arxiv.org/abs/2509.24828
作者: Joshua Heisler,Johannes Reisinger,Andreas Fischer
机构: 未知
类目: Software Engineering (cs.SE); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-26] Putnam-like dataset summary: LLMs as mathematical competition contestants

链接: https://arxiv.org/abs/2509.24827
作者: Bartosz Bieganowski,Daniel Strzelecki,Robert Skiba,Mateusz Topolewski
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: 11 pages, 11 figures

点击查看摘要

[AI-27] Query Circuits: Explaining How Language Models Answer User Prompts

链接: https://arxiv.org/abs/2509.24808
作者: Tung-Yu Wu,Fazl Barez
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注: Preprint. Under review

点击查看摘要

[AI-28] TimeOmni-1: Incentivizing Complex Reasoning with Time Series in Large Language Models

链接: https://arxiv.org/abs/2509.24803
作者: Tong Guan,Zijie Meng,Dianqi Li,Shiyu Wang,Chao-Han Huck Yang,Qingsong Wen,Zuozhu Liu,Sabato Marco Siniscalchi,Ming Jin,Shirui Pan
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-29] DSAT-HD: Dual-Stream Adaptive Transformer with Hybrid Decomposition for Multivariate Time Series Forecasting

链接: https://arxiv.org/abs/2509.24800
作者: Zixu Wang,Hongbin Dong,Xiaoping Zhang
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: 10 pages, 5 figures

点击查看摘要

[AI-30] Fidelity-Aware Data Composition for Robust Robot Generalization

链接: https://arxiv.org/abs/2509.24797
作者: Zizhao Tong,Di Chen,Sicheng Hu,Hongwei Fan,Liliang Chen,Guanghui Ren,Hao Tang,Hao Dong,Ling Shao
机构: 未知
类目: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
备注: 33 pages

点击查看摘要

[AI-31] Sparse Autoencoders Make Audio Foundation Models more Explainable ICASSP2026

链接: https://arxiv.org/abs/2509.24793
作者: Théo Mariotte,Martin Lebourdais,Antonio Almudévar,Marie Tahon,Alfonso Ortega,Nicolas Dugué
机构: 未知
类目: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
备注: 5 pages, 5 figures, 1 table, submitted to ICASSP 2026

点击查看摘要

[AI-32] Quantifying Generalisation in Imitation Learning NEURIPS2025

链接: https://arxiv.org/abs/2509.24784
作者: Nathan Gavenski,Odinaldo Rodrigues
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: NeurIPS 2025 Datasets and Benchmarks Track poster

点击查看摘要

[AI-33] From Ambiguity to Verdict: A Semiotic-Grounded Multi-Perspective Agent for LLM Logical Reasoning

链接: https://arxiv.org/abs/2509.24765
作者: Yunyao Zhang,Xinglang Zhang,Junxi Sheng,Wenbing Li,Junqing Yu,Wei Yang,Zikai Song
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-34] Spatial-Functional awareness Transformer-based graph archetype contrastive learning for Decoding Visual Neural Representations from EEG

链接: https://arxiv.org/abs/2509.24761
作者: Yueming Sun,Long Yang
机构: 未知
类目: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
备注:

点击查看摘要

[AI-35] Robust Policy Expansion for Offline-to-Online RL under Diverse Data Corruption

链接: https://arxiv.org/abs/2509.24748
作者: Longxiang He,Deheng Ye,Junbo Tan,Xueqian Wang,Li Shen
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: 39th Conference on Neural Information Processing Systems

点击查看摘要

[AI-36] Q-Net: Transferable Queue Length Estimation via Kalman-based Neural Networks

链接: https://arxiv.org/abs/2509.24725
作者: Ting Gao,Elvin Isufi,Winnie Daamen,Erik-Sander Smits,Serge Hoogendoorn
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-37] Discrete Variational Autoencoding via Policy Search

链接: https://arxiv.org/abs/2509.24716
作者: Michael Drolet,Firas Al-Hafez,Aditya Bhatt,Jan Peters,Oleg Arenz
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)
备注:

点击查看摘要

[AI-38] Circuit-Aware Reward Training: A Mechanistic Framework for Longtail Robustness in RLHF

链接: https://arxiv.org/abs/2509.24713
作者: Jing Liu
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-39] FedPOB: Sample-Efficient Federated Prompt Optimization via Bandits

链接: https://arxiv.org/abs/2509.24701
作者: Pingchen Lu,Zhi Hong,Zhiwei Shang,Zhiyong Wang,Yikun Ban,Yao Shu,Min Zhang,Shuang Qiu,Zhongxiang Dai
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: Preprint

点击查看摘要

[AI-40] T-POP: Test-Time Personalization with Online Preference Feedback

链接: https://arxiv.org/abs/2509.24696
作者: Zikun Qu,Min Zhang,Mingze Kong,Xiang Li,Zhiwei Shang,Zhiyong Wang,Yikun Ban,Shuang Qiu,Yao Shu,Zhongxiang Dai
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: Preprint

点击查看摘要

[AI-41] CoTune: Co-evolutionary Configuration Tuning

链接: https://arxiv.org/abs/2509.24694
作者: Gangda Xiong,Tao Chen
机构: 未知
类目: Software Engineering (cs.SE); Artificial Intelligence (cs.AI)
备注: Accepted by ASE 2025

点击查看摘要

[AI-42] Data-Driven Discrete Geofence Design Using Binary Quadratic Programming

链接: https://arxiv.org/abs/2509.24679
作者: Keisuke Otaki,Akihisa Okada,Tadayoshi Matsumori,Hiroaki Yoshida
机构: 未知
类目: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI)
备注: 17 pages, 17 figures, 2 tables

点击查看摘要

[AI-43] Community detection robustness of graph neural networks

链接: https://arxiv.org/abs/2509.24662
作者: Jaidev Goel,Pablo Moriano,Ramakrishnan Kannan,Yulia R. Gel
机构: 未知
类目: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI); Physics and Society (physics.soc-ph); Machine Learning (stat.ML)
备注:

点击查看摘要

[AI-44] Successful Misunderstandings: Learning to Coordinate Without Being Understood

链接: https://arxiv.org/abs/2509.24660
作者: Nikolaos Kondylidis,Anil Yaman,Frank van Harmelen,Erman Acar,Annette ten Teije
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-45] Identity Bridge: Enabling Implicit Reasoning via Shared Latent Memory

链接: https://arxiv.org/abs/2509.24653
作者: Pengxiao Lin,Zheng-An Chen,Zhi-Qin John Xu
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-46] "Stop replacing salt with sugar!": Towards Intuitive Human-Agent Teaching

链接: https://arxiv.org/abs/2509.24651
作者: Nikolaos Kondylidis,Andrea Rafanelli,Ilaria Tiddi,Annette ten Teije,Frank van Harmelen
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-47] LTL_f Learning Meets Boolean Set Cover

链接: https://arxiv.org/abs/2509.24616
作者: Gabriel Bathie,Nathanaël Fijalkow,Théo Matricon,Baptiste Mouillon,Pierre Vandenhove
机构: 未知
类目: Artificial Intelligence (cs.AI); Formal Languages and Automata Theory (cs.FL); Logic in Computer Science (cs.LO)
备注: 23 pages, 4 figures

点击查看摘要

[AI-48] Algorithms and data structures for automatic precision estimation of neural networks

链接: https://arxiv.org/abs/2509.24607
作者: Igor V. Netay
机构: 未知
类目: Data Structures and Algorithms (cs.DS); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Numerical Analysis (math.NA)
备注:

点击查看摘要

[AI-49] BPMN Assistant: An LLM-Based Approach to Business Process Modeling

链接: https://arxiv.org/abs/2509.24592
作者: Josip Tomo Licardo,Nikola Tankovic,Darko Etinger
机构: 未知
类目: Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
备注: 12 pages, 4 figures

点击查看摘要

[AI-50] PoseDiff: A Unified Diffusion Model Bridging Robot Pose Estimation and Video-to-Action Control

链接: https://arxiv.org/abs/2509.24591
作者: Haozhuo Zhang,Michele Caprio,Jing Shao,Qiang Zhang,Jian Tang,Shanghang Zhang,Wei Pan
机构: 未知
类目: Robotics (cs.RO); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-51] Deep Reinforcement Learning in Action: Real-Time Control of Vortex-Induced Vibrations

链接: https://arxiv.org/abs/2509.24556
作者: Hussam Sababha,Bernat Font,Mohammed Daqaq
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Fluid Dynamics (physics.flu-dyn)
备注:

点击查看摘要

[AI-52] Short window attention enables long-term memorization

链接: https://arxiv.org/abs/2509.24552
作者: Loïc Cabannes,Maximilian Beck,Gergely Szilvasy,Matthijs Douze,Maria Lomeli,Jade Copet,Pierre-Emmanuel Mazaré,Gabriel Synnaeve,Hervé Jégou
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-53] Training Agents Inside of Scalable World Models

链接: https://arxiv.org/abs/2509.24527
作者: Danijar Hafner,Wilson Yan,Timothy Lillicrap
机构: 未知
类目: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO); Machine Learning (stat.ML)
备注: Website: this https URL

点击查看摘要

[AI-54] PhysiAgent: An Embodied Agent Framework in Physical World

链接: https://arxiv.org/abs/2509.24524
作者: Zhihao Wang,Jianxiong Li,Jinliang Zheng,Wencong Zhang,Dongxiu Liu,Yinan Zheng,Haoyi Niu,Junzhi Yu,Xianyuan Zhan
机构: 未知
类目: Robotics (cs.RO); Artificial Intelligence (cs.AI); Systems and Control (eess.SY)
备注:

点击查看摘要

[AI-55] Agentic Specification Generator for Move Programs

链接: https://arxiv.org/abs/2509.24515
作者: Yu-Fu Fu,Meng Xu,Taesoo Kim
机构: 未知
类目: Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Programming Languages (cs.PL)
备注: 18 pages; Extended version of ASE’25 paper with extra appendices

点击查看摘要

[AI-56] Specialization after Generalization: Towards Understanding Test-Time Training in Foundation Models

链接: https://arxiv.org/abs/2509.24510
作者: Jonas Hübotter,Patrik Wolf,Alexander Shevchenko,Dennis Jüni,Andreas Krause,Gil Kur
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-57] LLM DNA: Tracing Model Evolution via Functional Representations

链接: https://arxiv.org/abs/2509.24496
作者: Zhaomin Wu,Haodong Zhao,Ziyang Wang,Jizhou Guo,Qian Wang,Bingsheng He
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-58] Neuroplasticity-inspired dynamic ANNs for multi-task demand forecasting

链接: https://arxiv.org/abs/2509.24495
作者: Mateusz Żarski,Sławomir Nowaczyk
机构: 未知
类目: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
备注: 14 pages, 3 figures, 2 tables

点击查看摘要

[AI-59] Overcoming Over-Fitting in Constraint Acquisition via Query-Driven Interactive Refinement

链接: https://arxiv.org/abs/2509.24489
作者: Vasileios Balafas,Dimos Tsouros,Nikolaos Ploskas,Kostas Stergiou
机构: 未知
类目: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Logic in Computer Science (cs.LO)
备注: Preprint. Uses the International Journal on Artificial Intelligence Tools (World Scientific) template. Includes figures, tables, and algorithms. Submitted to IJAIT

点击查看摘要

[AI-60] An Agent-Based Framework for Automated Higher-Voice Harmony Generation

链接: https://arxiv.org/abs/2509.24463
作者: Nia D’Souza Ganapathy,Arul Selvamani Shaja
机构: 未知
类目: Sound (cs.SD); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-61] ContextPRM: Leveraging Contextual Coherence for multi-domain Test-Time Scaling

链接: https://arxiv.org/abs/2509.24460
作者: Haotian Zhang,Liu Liu,Baosheng Yu,Jiayan Qiu,Likang Xiao,Yanwei Ren,Quan Chen,Xianglong Liu
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-62] A Systematic Review of Digital Twin-Driven Predictive Maintenance in Industrial Engineering: Taxonomy, Architectural Elements, and Future Research Directions

链接: https://arxiv.org/abs/2509.24443
作者: Leila Ismail,Abdelmoneim Abdelmoti,Arkaprabha Basu,Aymen Dia Eddine Berini,Mohammad Naouss
机构: 未知
类目: Artificial Intelligence (cs.AI); Emerging Technologies (cs.ET); Software Engineering (cs.SE)
备注:

点击查看摘要

[AI-63] EOE: Evolutionary Optimization of Experts for Training Language Models

链接: https://arxiv.org/abs/2509.24436
作者: Yingshi Chen
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
备注: 6 pages, 2 figures

点击查看摘要

[AI-64] Multi-Item-Query Attention for Stable Sequential Recommendation

链接: https://arxiv.org/abs/2509.24424
作者: Mingshi Xu,Haoren Zhu,Wilfred Siu Hung Ng
机构: 未知
类目: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
备注:

点击查看摘要

[AI-65] ScatterAD: Temporal-Topological Scattering Mechanism for Time Series Anomaly Detection NEURIPS2025

链接: https://arxiv.org/abs/2509.24414
作者: Tao Yin,Xiaohong Zhang,Shaochen Fu,Zhibin Zhang,Li Huang,Yiyuan Yang,Kaixiang Yang,Meng Yan
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: 39th Conference on Neural Information Processing Systems (NeurIPS 2025)

点击查看摘要

[AI-66] The 2025 OpenAI Preparedness Framework does not guarantee any AI risk mitigation practices: a proof-of-concept for affordance analyses of AI safety policies

链接: https://arxiv.org/abs/2509.24394
作者: Sam Coggins,Alex Saeri,Katherine A. Daniell,Lorenn P. Ruster,Jessie Liu,Jenny L. Davis
机构: 未知
类目: Computers and Society (cs.CY); Artificial Intelligence (cs.AI)
备注: 19 pages, 5 tables, 1 figure

点击查看摘要

[AI-67] Plan before Solving: Problem-Aware Strategy Routing for Mathematical Reasoning with LLMs

链接: https://arxiv.org/abs/2509.24377
作者: Shihao Qi,Jie Ma,Ziang Yin,Lingling Zhang,Jian Zhang,Jun Liu,Feng Tian,Tongliang Liu
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-68] Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning

链接: https://arxiv.org/abs/2509.24372
作者: Xin Qiu,Yulu Gan,Conor F. Hayes,Qiyao Liang,Elliot Meyerson,Babak Hodjat,Risto Miikkulainen
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
备注: 24 pages, including the appendix

点击查看摘要

[AI-69] Watermarking Diffusion Language Models

链接: https://arxiv.org/abs/2509.24368
作者: Thibaud Gloaguen,Robin Staab,Nikola Jovanović,Martin Vechev
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
备注:

点击查看摘要

[AI-70] From Static to Dynamic: Adaptive Monte Carlo Search for Mathematical Process Supervision

链接: https://arxiv.org/abs/2509.24351
作者: Jie Ma,Shihao Qi,Rui Xing,Ziang Yin,Bifan Wei,Jun Liu,Tongliang Liu
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-71] Fin-Ally: Pioneering the Development of an Advanced Commonsense-Embedded Conversational AI for Money Matters

链接: https://arxiv.org/abs/2509.24342
作者: Sarmistha Das,Priya Mathur,Ishani Sharma,Sriparna Saha,Kitsuchart Pasupa,Alka Maurya
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-72] humancompatible.detect: a Python Toolkit for Detecting Bias in AI Models

链接: https://arxiv.org/abs/2509.24340
作者: German M. Matilla,Jiri Nemecek,Illia Kryvoviaz,Jakub Marecek
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-73] Towards Generalizable PDE Dynamics Forecasting via Physics-Guided Invariant Learning

链接: https://arxiv.org/abs/2509.24332
作者: Siyang Li,Yize Chen,Yan Guo,Ming Huang,Hui Xiong
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: 27 pages, 13 figures. In Submission

点击查看摘要

[AI-74] MedMMV: A Controllable Multimodal Multi-Agent Framework for Reliable and Verifiable Clinical Reasoning

链接: https://arxiv.org/abs/2509.24314
作者: Hongjun Liu,Yinghao Zhu,Yuhui Wang,Yitao Long,Zeyu Lai,Lequan Yu,Chen Zhao
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注: 25 pages, 5 figures

点击查看摘要

[AI-75] A study of Universal ODE approaches to predicting soil organic carbon

链接: https://arxiv.org/abs/2509.24306
作者: Satyanarayana Raju G.V.V,Prathamesh Dinesh Joshi,Raj Abhijit Dandekar,Rajat Dandekar,Sreedath Panat
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-76] Experience Paper: Adopting Activity Recognition in On-demand Food Delivery Business

链接: https://arxiv.org/abs/2509.24303
作者: Huatao Xu,Yan Zhang,Wei Gao,Guobin Shen,Mo Li
机构: 未知
类目: Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
备注: 13 pages

点击查看摘要

[AI-77] G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge

链接: https://arxiv.org/abs/2509.24276
作者: Linhao Luo,Zicheng Zhao,Junnan Liu,Zhangchi Qiu,Junnan Dong,Serge Panev,Chen Gong,Thuy-Trang Vu,Gholamreza Haffari,Dinh Phung,Alan Wee-Chung Liew,Shirui Pan
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注: 22 pages, 6 figures

点击查看摘要

[AI-78] Adversarial Reinforcement Learning Framework for ESP Cheater Simulation

链接: https://arxiv.org/abs/2509.24274
作者: Inkyu Park,Jeong-Gwan Lee,Taehwan Kwon,Juheon Choi,Seungku Kim,Junsu Kim,Kimin Lee
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-79] Risk-Sensitive RL for Alleviating Exploration Dilemmas in Large Language Models

链接: https://arxiv.org/abs/2509.24261
作者: Yuhua Jiang,Jiawei Huang,Yufeng Yuan,Xin Mao,Yu Yue,Qianchuan Zhao,Lin Yan
机构: 未知
类目: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
备注:

点击查看摘要

[AI-80] Rethinking and Benchmarking Large Language Models for Graph Reasoning

链接: https://arxiv.org/abs/2509.24260
作者: Yuwei Hu,Xinyi Huang,Zhewei Wei,Yongchao Liu,Chuntao Hong
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-81] Graph Foundation Models: Bridging Language Model Paradigms and Graph Optimization

链接: https://arxiv.org/abs/2509.24256
作者: Yunhao Liang,Pujun Zhang,Yuan Qu,Shaochong Lin,Zuo-jun Max Shen
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-82] Interactive Program Synthesis for Modeling Collaborative Physical Activities from Narrated Demonstrations

链接: https://arxiv.org/abs/2509.24250
作者: Edward Kim,Daniel He,Jorge Chao,Wiktor Rajca,Mohammed Amin,Nishant Malpani,Ruta Desai,Antti Oulasvirta,Bjoern Hartmann,Sanjit Seshia
机构: 未知
类目: Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
备注:

点击查看摘要

[AI-83] Model Merging Scaling Laws in Large Language Models

链接: https://arxiv.org/abs/2509.24244
作者: Yuanyi Wang,Yanggan Gu,Yiming Zhang,Qi Zhou,Zhaoyi Yan,Congkai Xie,Xinyao Wang,Jianbo Yuan,Hongxia Yang
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注: 30 pages

点击查看摘要

[AI-84] SafeFlowMatcher: Safe and Fast Planning using Flow Matching with Control Barrier Functions

链接: https://arxiv.org/abs/2509.24243
作者: Jeongyong Yang,Seunghwan Jang,Soojean Han
机构: 未知
类目: Robotics (cs.RO); Artificial Intelligence (cs.AI)
备注: 10 pages, 7 figures, 4 tables

点击查看摘要

[AI-85] ChessArena: A Chess Testbed for Evaluating Strategic Reasoning Capabilities of Large Language Models

链接: https://arxiv.org/abs/2509.24239
作者: Jincheng Liu,Sijun He,Jingjing Wu,Xiangsen Wang,Yang Chen,Zhaoqi Kuang,Siqi Bao,Yuan Yao
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-86] ELHPlan: Efficient Long-Horizon Task Planning for Multi-Agent Collaboration

链接: https://arxiv.org/abs/2509.24230
作者: Shaobin Ling,Yun Wang,Chenyou Fan,Tin Lun Lam,Junjie Hu
机构: 未知
类目: Artificial Intelligence (cs.AI); Robotics (cs.RO)
备注:

点击查看摘要

[AI-87] ViReSkill: Vision-Grounded Replanning with Skill Memory for LLM-Based Planning in Lifelong Robot Learning

链接: https://arxiv.org/abs/2509.24219
作者: Tomoyuki Kagaya,Subramanian Lakshmi,Anbang Ye,Thong Jing Yuan,Jayashree Karlekar,Sugiri Pranata,Natsuki Murakami,Akira Kinose,Yang You
机构: 未知
类目: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
备注:

点击查看摘要

[AI-88] Conda: Column-Normalized Adam for Training Large Language Models Faster

链接: https://arxiv.org/abs/2509.24218
作者: Junjie Wang,Pan Zhou,Yiming Dong,Huan Li,Jia Li,Xun Zhou,Qicheng Lao,Cong Fang,Zhouchen Lin
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-89] Humanline: Online Alignment as Perceptual Loss

链接: https://arxiv.org/abs/2509.24207
作者: Sijia Liu,Niklas Muennighoff,Kawin Ethayarajh
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-90] Stable Forgetting: Bounded Parameter-Efficient Unlearning in LLMs

链接: https://arxiv.org/abs/2509.24166
作者: Arpit Garg,Hemanth Saratchandran,Ravi Garg,Simon Lucey
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: In Submission

点击查看摘要

[AI-91] Memory Transfer Planning: LLM-driven Context-Aware Code Adaptation for Robot Manipulation

链接: https://arxiv.org/abs/2509.24160
作者: Tomoyuki Kagaya,Subramanian Lakshmi,Yuxuan Lou,Thong Jing Yuan,Jayashree Karlekar,Sugiri Pranata,Natsuki Murakami,Akira Kinose,Yang You
机构: 未知
类目: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
备注:

点击查看摘要

[AI-92] Robust Preference Optimization: Aligning Language Models with Noisy Preference Feedback

链接: https://arxiv.org/abs/2509.24159
作者: Xiaoyang Cao,Zelai Xu,Mo Guang,Kaiwen Long,Michiel A. Bakker,Yu Wang,Chao Yu
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-93] TENET: Leveraging Tests Beyond Validation for Code Generation

链接: https://arxiv.org/abs/2509.24148
作者: Yiran Hu,Nan Jiang,Shanchao Liang,Yi Wu,Lin Tan
机构: 未知
类目: Software Engineering (cs.SE); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-94] Transparent, Evaluable, and Accessible Data Agents: A Proof-of-Concept Framework

链接: https://arxiv.org/abs/2509.24127
作者: Nooshin Bahador
机构: 未知
类目: Artificial Intelligence (cs.AI); Databases (cs.DB)
备注: 20 pages, 11 figures

点击查看摘要

[AI-95] BOSfM: A View Planning Framework for Optimal 3D Reconstruction of Agricultural Scenes

链接: https://arxiv.org/abs/2509.24126
作者: Athanasios Bacharis,Konstantinos D. Polyzos,Georgios B. Giannakis,Nikolaos Papanikolopoulos
机构: 未知
类目: Robotics (cs.RO); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-96] The Impossibility of Inverse Permutation Learning in Transformer Models

链接: https://arxiv.org/abs/2509.24125
作者: Rohan Alur,Chris Hays,Manish Raghavan,Devavrat Shah
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-97] Ancestry Tree Clustering for Particle Filter Diversity Maintenance

链接: https://arxiv.org/abs/2509.24124
作者: Ilari Vallivaara,Bingnan Duan,Yinhuan Dong,Tughrul Arslan
机构: 未知
类目: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
备注: 15th International Conference on Indoor Positioning and Indoor Navigation, 15-18 September 2025, Tampere, Finland. Originally 8 pages. The online version with appendices is 14 pages

点击查看摘要

[AI-98] Fathom-DeepResearch: Unlocking Long Horizon Information Retrieval and Synthesis for SLMs

链接: https://arxiv.org/abs/2509.24107
作者: Shreyas Singh,Kunal Singh,Pradeep Moturi
机构: 未知
类目: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
备注:

点击查看摘要

[AI-99] PerfBench: Can Agents Resolve Real-World Performance Bugs?

链接: https://arxiv.org/abs/2509.24091
作者: Spandan Garg,Roshanak Zilouchian Moghaddam
机构: 未知
类目: Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Performance (cs.PF)
备注:

点击查看摘要

[AI-100] PEARL: Peer-Enhanced Adaptive Radio via On-Device LLM

链接: https://arxiv.org/abs/2509.24085
作者: Ju-Hyung Lee,Yanqing Lu,Klaus Doppler
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
备注:

点击查看摘要

[AI-101] A Small Math Model: Recasting Strategy Choice Theory in an LLM-Inspired Architecture

链接: https://arxiv.org/abs/2509.24068
作者: Roussel Rahman,Jeff Shrager
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-102] In-Context Compositional Q-Learning for Offline Reinforcement Learning

链接: https://arxiv.org/abs/2509.24067
作者: Qiushui Xu,Yuhao Huang,Yushu Jiang,Lei Song,Jinyu Wang,Wenliang Zheng,Jiang Bian
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-103] PartnerMAS: An LLM Hierarchical Multi-Agent Framework for Business Partner Selection on High-Dimensional Features

链接: https://arxiv.org/abs/2509.24046
作者: Lingyao Li,Haolun Wu,Zhenkun Li,Jiabei Hu,Yu Wang,Xiaoshan Huang,Wenyue Hua,Wenqian Wang
机构: 未知
类目: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-104] From Edge to HPC: Investigating Cross-Facility Data Streaming Architectures

链接: https://arxiv.org/abs/2509.24030
作者: Anjus George,Michael Brim,Christopher Zimmer,David Rogers,Sarp Oral,Zach Mayes
机构: 未知
类目: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
备注:

点击查看摘要

[AI-105] Future-Proofing Programmers: Optimal Knowledge Tracing for AI-Assisted Personalized Education

链接: https://arxiv.org/abs/2509.23996
作者: Yuchen Wang,Pei-Duo Yu,Chee Wei Tan
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注: The paper is accepted to IEEE Signal Processing Magazine, Special Issue on Artificial Intelligence for Education

点击查看摘要

[AI-106] Guide: Generalized-Prior and Data Encoders for DAG Estimation

链接: https://arxiv.org/abs/2509.23992
作者: Amartya Roy,Devharish N,Shreya Ganguly,Kripabandhu Ghosh
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-107] LLM/Agent-as-Data-Analyst: A Survey

链接: https://arxiv.org/abs/2509.23988
作者: Zirui Tang,Weizheng Wang,Zihang Zhou,Yang Jiao,Bangrui Xu,Boyu Niu,Xuanhe Zhou,Guoliang Li,Yeye He,Wei Zhou,Yitong Song,Cheng Tan,Bin Wang,Conghui He,Xiaoyang Wang,Fan Wu
机构: 未知
类目: Artificial Intelligence (cs.AI); Databases (cs.DB)
备注: 35 pages, 11 figures

点击查看摘要

[AI-108] TusoAI: Agentic Optimization for Scientific Methods

链接: https://arxiv.org/abs/2509.23986
作者: Alistair Turcan,Kexin Huang,Lei Li,Martin Jinye Zhang
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-109] Automatic selection of primary studies in systematic reviews with evolutionary rule-based classification

链接: https://arxiv.org/abs/2509.23981
作者: José de la Torre-López,Aurora Ramírez,José Raúl Romero
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注: 32 pages, 5 figures, 4 tables

点击查看摘要

[AI-110] MAD-PINN: A Decentralized Physics-Informed Machine Learning Framework for Safe and Optimal Multi-Agent Control

链接: https://arxiv.org/abs/2509.23960
作者: Manan Tayal,Aditya Singh,Shishir Kolathaya,Somil Bansal
机构: 未知
类目: Robotics (cs.RO); Artificial Intelligence (cs.AI)
备注: 9 Pages, 4 Figures, 3 Tables. First two authors have contributed equally

点击查看摘要

[AI-111] Diffusion Models are Kelly Gamblers

链接: https://arxiv.org/abs/2509.23937
作者: Akhil Premkumar
机构: 未知
类目: Machine Learning (cs.LG); Statistical Mechanics (cond-mat.stat-mech); Artificial Intelligence (cs.AI); Information Theory (cs.IT)
备注: 26 pages + references, 13 figures

点击查看摘要

[AI-112] HiViS: Hiding Visual Tokens from the Drafter for Speculative Decoding in Vision-Language Models

链接: https://arxiv.org/abs/2509.23928
作者: Zhinan Xie,Peisong Wang,Jian Cheng
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-113] Graph Mixing Additive Networks

链接: https://arxiv.org/abs/2509.23923
作者: Maya Bechler-Speicher,Andrea Zerio,Maor Huri,Marie Vibeke Vestergaard,Ran Gilad-Bachrach,Tine Jess,Samir Bhatt,Aleksejs Sazonovs
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: arXiv admin note: substantial text overlap with arXiv:2505.19193

点击查看摘要

[AI-114] Continual Learning to Generalize Forwarding Strategies for Diverse Mobile Wireless Networks

链接: https://arxiv.org/abs/2509.23913
作者: Cheonjin Park,Victoria Manfredi,Xiaolan Zhang,Chengyi Liu,Alicia P Wolfe,Dongjin Song,Sarah Tasneem,Bing Wang
机构: 未知
类目: Networking and Internet Architecture (cs.NI); Artificial Intelligence (cs.AI)
备注: 11 pages

点击查看摘要

[AI-115] From Neural Networks to Logical Theories: The Correspondence between Fibring Modal Logics and Fibring Neural Networks

链接: https://arxiv.org/abs/2509.23912
作者: Ouns El Harzli,Bernardo Cuenca Grau,Artur d’Avila Garcez,Ian Horrocks,Tarek R. Besold
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-116] Gradient Flow Convergence Guarantee for General Neural Network Architectures

链接: https://arxiv.org/abs/2509.23887
作者: Yash Jakhmola
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: 12 pages, 3 figures, 1 table

点击查看摘要

[AI-117] Towards Understanding Subliminal Learning: When and How Hidden Biases Transfer

链接: https://arxiv.org/abs/2509.23886
作者: Simon Schrodi,Elias Kempf,Fazl Barez,Thomas Brox
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-118] Quant Fever, Reasoning Blackholes, Schrodinger's Compliance, and More: Probing GPT-OSS-20B

链接: https://arxiv.org/abs/2509.23882
作者: Shuyi Lin,Tian Lu,Zikai Wang,Bo Wen,Yibo Zhao,Cheng Tan
机构: 未知
类目: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
备注:

点击查看摘要

[AI-119] Disentangling Score Content and Performance Style for Joint Piano Rendering and Transcription

链接: https://arxiv.org/abs/2509.23878
作者: Wei Zeng,Junchuan Zhao,Ye Wang
机构: 未知
类目: Sound (cs.SD); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
备注: 30 pages, 13 figures

点击查看摘要

[AI-120] Multi-Value-Product Retrieval-Augmented Generation for Industrial Product Attribute Value Identification

链接: https://arxiv.org/abs/2509.23874
作者: Huike Zou,Haiyang Yang,Yindu Su,Liyu Chen,Chengbao Lian,Qingheng Zhang,Shuguang Han,Jufeng Chen
机构: 未知
类目: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-121] Rethinking Reward Miscalibration of GRPO in Agentic RL

链接: https://arxiv.org/abs/2509.23870
作者: Jingyu Liu,Xiaopeng Wu,Jingquan Peng,Kehan Chen,Chuan Yu,Lizhong Ding,Yong Liu
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-122] AgentGuard: Runtime Verification of AI Agents ICSE

链接: https://arxiv.org/abs/2509.23864
作者: Roham Koohestani
机构: 未知
类目: Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
备注: Accepted for publication in the proceedings of the 40th IEEE/ACM International Conference on Automated Software Engineering, ASE 2025, in the 1st international workshop on Agentic Software Engineering (AgenticSE)

点击查看摘要

[AI-123] Sequence Pathfinder for Multi-Agent Pickup and Delivery in the Warehouse

链接: https://arxiv.org/abs/2509.23778
作者: Zeyuan Zhang,Chaoran Li,Shao Zhang,Ying Wen
机构: 未知
类目: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
备注: Preprint Under Review

点击查看摘要

[AI-124] SHAPoint: Task-Agnostic, Efficient, and Interpretable Point-Based Risk Scoring via Shapley Values

链接: https://arxiv.org/abs/2509.23756
作者: Tomer D. Meirman,Bracha Shapira,Noa Dagan,Lior S. Rokach
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: 29 pages inc. references for main article. 6 Figures and 7 Tables. Including Data and Code availability statements

点击查看摘要

[AI-125] LocoFormer: Generalist Locomotion via Long-context Adaptation

链接: https://arxiv.org/abs/2509.23745
作者: Min Liu,Deepak Pathak,Ananye Agarwal
机构: 未知
类目: Robotics (cs.RO); Artificial Intelligence (cs.AI)
备注: Accepted to CoRL 2025

点击查看摘要

[AI-126] GUI-Shepherd: Reliable Process Reward and Verification for Long-Sequence GUI Tasks

链接: https://arxiv.org/abs/2509.23738
作者: Cong Chen,Kaixiang Ji,Hao Zhong,Muzhi Zhu,Anzhou Li,Guo Gan,Ziyuan Huang,Cheng Zou,Jiajia Liu,Jingdong Chen,Hao Chen,Chunhua Shen
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-127] Diagnosing Failure Root Causes in Platform-Orchestrated Agentic Systems: Dataset, Taxonomy, and Benchmark

链接: https://arxiv.org/abs/2509.23735
作者: Xuyan Ma,Xiaofei Xie,Yawen Wang,Junjie Wang,Boyu Wu,Mingyang Li,Qing Wang
机构: 未知
类目: Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
备注:

点击查看摘要

[AI-128] EAPO: Enhancing Policy Optimization with On-Demand Expert Assistance

链接: https://arxiv.org/abs/2509.23730
作者: Siyao Song,Cong Ma,Zhihao Cheng,Shiye Lei,Minghao Li,Ying Zeng,Huaixiao Tou,Kai Jia
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-129] AudioMoG: Guiding Audio Generation with Mixture-of-Guidance

链接: https://arxiv.org/abs/2509.23727
作者: Junyou Wang,Zehua Chen,Binjie Yuan,Kaiwen Zheng,Chang Li,Yuxuan Jiang,Jun Zhu
机构: 未知
类目: Sound (cs.SD); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-130] MedLA: A Logic-Driven Multi-Agent Framework for Complex Medical Reasoning with Large Language Models

链接: https://arxiv.org/abs/2509.23725
作者: Siqi Ma,Jiajie Huang,Bolin Yang,Fan Zhang,Jinlin Wu,Yue Shen,Guohui Fan,Zhu Zhang,Zelin Zang
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-131] AdaPtis: Reducing Pipeline Bubbles with Adaptive Pipeline Parallelism on Heterogeneous Models

链接: https://arxiv.org/abs/2509.23722
作者: Jihu Guo,Tenghui Ma,Wei Gao,Peng Sun,Jiaxing Li,Xun Chen,Yuyang Jin,Dahua Lin
机构: 未知
类目: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)
备注: 13 pages, 15 Figures; Under Review

点击查看摘要

[AI-132] Measuring Sparse Autoencoder Feature Sensitivity NEURIPS2025

链接: https://arxiv.org/abs/2509.23717
作者: Claire Tian,Katherine Tian,Nathan Hu
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注: NeurIPS 2025 Workshop on Mechanistic Interpretability Camera Ready

点击查看摘要

[AI-133] Bridging Discrete and Continuous RL: Stable Deterministic Policy Gradient with Martingale Characterization

链接: https://arxiv.org/abs/2509.23711
作者: Ziheng Cheng,Xin Guo,Yufei Zhang
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Optimization and Control (math.OC); Machine Learning (stat.ML)
备注:

点击查看摘要

[AI-134] Estimating Time Series Foundation Model Transferability via In-Context Learning

链接: https://arxiv.org/abs/2509.23695
作者: Qingren Yao,Ming Jin,Chengqi Zhang,Chao-Han Huck Yang,Jun Qi,Shirui Pan
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-135] Graph Neural Networks with Diversity-aware Neighbor Selection and Dynamic Multi-scale Fusion for Multivariate Time Series Forecasting

链接: https://arxiv.org/abs/2509.23671
作者: Jingqi Xu,Guibin Chen,Jingxi Lu,Yuzhang Lin
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-136] Beyond Greedy Exits: Improved Early Exit Decisions for Risk Control and Reliability NEURIPS2025

链接: https://arxiv.org/abs/2509.23666
作者: Divya Jyoti Bajpai,Manjesh Kumar Hanawal
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: Accepted as poster in NeurIPS 2025

点击查看摘要

[AI-137] Calibration Meets Reality: Making Machine Learning Predictions Trustworthy

链接: https://arxiv.org/abs/2509.23665
作者: Kristina P. Sinaga,Arjun S. Nair
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Information Theory (cs.IT); Probability (math.PR)
备注: 30 pages, 7 figures, 5 tables

点击查看摘要

[AI-138] Pure Node Selection for Imbalanced Graph Node Classification

链接: https://arxiv.org/abs/2509.23662
作者: Fanlong Zeng,Wensheng Gan,Jiayang Wu,Philip S. Yu
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: Preprint, 8 tables, 9 figures

点击查看摘要

[AI-139] Game-Oriented ASR Error Correction via RAG-Enhanced LLM

链接: https://arxiv.org/abs/2509.23630
作者: Yan Jiang,Yongle Luo,Qixian Zhou,Elvis S. Liu
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-140] How LLMs Learn to Reason: A Complex Network Perspective ICLR2026

链接: https://arxiv.org/abs/2509.23629
作者: Sihan Hu,Xiansheng Cai,Yuan Huang,Zhiyuan Yao,Linfeng Zhang,Pan Zhang,Youjin Deng,Kun Chen
机构: 未知
类目: Artificial Intelligence (cs.AI); Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Machine Learning (cs.LG); Physics and Society (physics.soc-ph)
备注: 24 pages, 11 figures, 1 table, under review as a conference paper at ICLR 2026

点击查看摘要

[AI-141] Reasoning Scaffolding: Distilling the Flow of Thought from LLMs

链接: https://arxiv.org/abs/2509.23619
作者: Xiangyu Wen,Junhua Huang,Zeju Li,Min Li,Jianyuan Zhong,Zhijian Xu,Mingxuan Yuan,Yongxiang Huang,Qiang Xu
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-142] Generalizable Speech Deepfake Detection via Information Bottleneck Enhanced Adversarial Alignment

链接: https://arxiv.org/abs/2509.23618
作者: Pu Huang,Shouguang Wang,Siya Yao,Mengchu Zhou
机构: 未知
类目: Sound (cs.SD); Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-143] GraphIFE: Rethinking Graph Imbalance Node Classification via Invariant Learning

链接: https://arxiv.org/abs/2509.23616
作者: Fanlong Zeng,Wensheng Gan,Philip S. Yu
机构: 未知
类目: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
备注: PrePrint, 16 pages, 7 tables, 6 figures

点击查看摘要

[AI-144] PSG-Agent: Personality-Aware Safety Guardrail for LLM-based Agents

链接: https://arxiv.org/abs/2509.23614
作者: Yaozu Wu,Jizhou Guo,Dongyuan Li,Henry Peng Zou,Wei-Chieh Huang,Yankai Chen,Zhen Wang,Weizhi Zhang,Yangning Li,Meng Zhang,Renhe Jiang,Philip S. Yu
机构: 未知
类目: Artificial Intelligence (cs.AI)
备注:

点击查看摘要

[AI-145] Characteristic Root Analysis and Regularization for Linear Time Series Forecasting

Link: https://arxiv.org/abs/2509.23597
Authors: Zheng Wang,Kaixuan Zhang,Wanfang Chen,Xiaonan Lu,Longyuan Li,Tobias Schlagenhauf
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Comments:

[AI-146] Toward a Holistic Approach to Continual Model Merging ICCV2025

Link: https://arxiv.org/abs/2509.23592
Authors: Hoang Phan,Sungmin Cha,Tung Lam Tran,Qi Lei
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Comments: Accepted to Workshop on Continual Learning in Computer Vision, ICCV 2025

[AI-147] Improving the Efficiency of LLM Agent Systems through Trajectory Reduction

Link: https://arxiv.org/abs/2509.23586
Authors: Yuan-An Xiao,Pengfei Gao,Chao Peng,Yingfei Xiong
Affiliations: Unknown
Categories: Software Engineering (cs.SE); Artificial Intelligence (cs.AI)
Comments: 20 pages, 4 figures

[AI-148] ML-Asset Management: Curation Discovery and Utilization VLDB2025

Link: https://arxiv.org/abs/2509.23577
Authors: Mengying Wang,Moming Duan,Yicong Huang,Chen Li,Bingsheng He,Yinghui Wu
Affiliations: Unknown
Categories: Databases (cs.DB); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
Comments: Tutorial, VLDB 2025. Project page: this https URL

[AI-149] Uncovering Vulnerabilities of LLM-Assisted Cyber Threat Intelligence

Link: https://arxiv.org/abs/2509.23573
Authors: Yuqiao Meng,Luoxi Tang,Feiyang Yu,Jinyuan Jia,Guanhua Yan,Ping Yang,Zhaohan Xi
Affiliations: Unknown
Categories: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
Comments:

[AI-150] Benchmarking LLM-Assisted Blue Teaming via Standardized Threat Hunting

Link: https://arxiv.org/abs/2509.23571
Authors: Yuqiao Meng,Luoxi Tang,Feiyang Yu,Xi Li,Guanhua Yan,Ping Yang,Zhaohan Xi
Affiliations: Unknown
Categories: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
Comments:

[AI-151] Node Classification via Simplicial Interaction with Augmented Maximal Clique Selection

Link: https://arxiv.org/abs/2509.23568
Authors: Eunho Koo,Tongseok Lim
Affiliations: Unknown
Categories: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments: To appear in Neurocomputing

[AI-152] A Hierarchical Structure-Enhanced Personalized Recommendation Model for Traditional Chinese Medicine Formulas Based on KG Diffusion Guidance CIKM

Link: https://arxiv.org/abs/2509.23560
Authors: ChaoBo Zhang,Long Tan
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI)
Comments: 10 pages, 10 figures, Proceedings of the 34th ACM International Conference on Information and Knowledge Management (CIKM)

[AI-153] Formalization Driven LLM Prompt Jailbreaking via Reinforcement Learning

Link: https://arxiv.org/abs/2509.23558
Authors: Zhaoqi Wang,Daqing He,Zijian Zhang,Xin Li,Liehuang Zhu,Meng Li,Jiamou Liu
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
Comments:

[AI-154] Fusing Sequence Motifs and Pan-Genomic Features: Antimicrobial Resistance Prediction using an Explainable Lightweight 1D CNN-XGBoost Ensemble HPCA

Link: https://arxiv.org/abs/2509.23552
Authors: Md. Saiful Bari Siddiqui,Nowshin Tarannum
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Genomics (q-bio.GN); Quantitative Methods (q-bio.QM)
Comments: Submitted to SCA/HPCAsia 2026. This preprint version has been prepared for open-access distribution and may differ in formatting from the official proceedings. Also available on bioRxiv for visibility to the life sciences community

[AI-155] Disentanglement of Variations with Multimodal Generative Modeling

Link: https://arxiv.org/abs/2509.23548
Authors: Yijie Zhang,Yiyang Shen,Weiran Wang
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Comments: 22 pages, 14 figures, 7 tables

[AI-156] Beyond the Strongest LLM: Multi-Turn Multi-Agent Orchestration vs. Single LLMs on Benchmarks

Link: https://arxiv.org/abs/2509.23537
Authors: Aaron Xuxiang Tian,Ruofan Zhang,Jiayao Tang,Young Min Cho,Xueqian Li,Qiang Yi,Ji Wang,Zhunping Zhang,Danrui Qi,Sharath Chandra Guntuku,Lyle Ungar,Tianyu Shi,Chi Wang
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI)
Comments: 9 pages, 3 tables, 1 figure

[AI-157] DOoM: Difficult Olympiads of Math

Link: https://arxiv.org/abs/2509.23529
Authors: Ilya Kuleshov,Ilin Pavel,Nikolay Kompanets,Ksenia Sycheva,Aleksandr Nikolich
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI)
Comments:

[AI-158] Privy: Envisioning and Mitigating Privacy Risks for Consumer-facing AI Product Concepts

Link: https://arxiv.org/abs/2509.23525
Authors: Hao-Ping Lee,Yu-Ju Yang,Matthew Bilik,Isadora Krsek,Thomas Serban von Davier,Kyzyl Monteiro,Jason Lin,Shivani Agarwal,Jodi Forlizzi,Sauvik Das
Affiliations: Unknown
Categories: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI)
Comments:

[AI-159] ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search NEURIPS2025

Link: https://arxiv.org/abs/2509.23519
Authors: Zeyu Shen,Basileal Imana,Tong Wu,Chong Xiang,Prateek Mittal,Aleksandra Korolova
Affiliations: Unknown
Categories: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
Comments: Accepted to NeurIPS 2025

[AI-160] Model Consistency as a Cheap yet Predictive Proxy for LLM Elo Scores

Link: https://arxiv.org/abs/2509.23510
Authors: Ashwin Ramaswamy,Nestor Demeure,Ermal Rrapaj
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI)
Comments:

[AI-161] Dynamic Trust Calibration Using Contextual Bandits

Link: https://arxiv.org/abs/2509.23497
Authors: Bruno M. Henrique,Eugene Santos Jr
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
Comments:

[AI-162] Revisiting Multivariate Time Series Forecasting with Missing Values

Link: https://arxiv.org/abs/2509.23494
Authors: Jie Yang,Yifan Hu,Kexin Zhang,Luyang Niu,Yushun Dong,Philip S. Yu,Kaize Ding
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Comments:

[AI-163] Accurate Predictions in Education with Discrete Variational Inference

Link: https://arxiv.org/abs/2509.23484
Authors: Tom Quilter(1),Anastasia Ilick(2),Anastasia Ilick(3),Richard Turner(4) ((1) University of Manchester, (2) Google DeepMind, (3) MathWorks, (4) University of Cambridge)
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI)
Comments:

[AI-164] GeoBS: Information-Theoretic Quantification of Geographic Bias in AI Models

Link: https://arxiv.org/abs/2509.23482
Authors: Zhangyu Wang,Nemin Wu,Qian Cao,Jiangnan Xia,Zeping Liu,Yiqun Xie,Akshay Nambi,Tanuja Ganu,Ni Lao,Ninghao Liu,Gengchen Mai
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI)
Comments:

[AI-165] Memory-Efficient Fine-Tuning via Low-Rank Activation Compression

Link: https://arxiv.org/abs/2509.23472
Authors: Jiang-Xin Shi,Wen-Da Wei,Jin-Fei Qi,Xuanyu Chen,Tong Wei,Yu-Feng Li
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Comments:

[AI-166] Multi-Modal Manipulation via Multi-Modal Policy Consensus

Link: https://arxiv.org/abs/2509.23468
Authors: Haonan Chen,Jiaming Xu,Hongyu Chen,Kaiwen Hong,Binghao Huang,Chaoqi Liu,Jiayuan Mao,Yunzhu Li,Yilun Du,Katherine Driggs-Campbell
Affiliations: Unknown
Categories: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments: 9 pages, 7 figures

[AI-167] ViTSP: A Vision Language Models Guided Framework for Large-Scale Traveling Salesman Problems

Link: https://arxiv.org/abs/2509.23465
Authors: Zhuoli Yin,Yi Ding,Reem Khir,Hua Cai
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI)
Comments:

[AI-168] Generative Evolutionary Meta-Solver (GEMS): Scalable Surrogate-Free Multi-Agent Learning

Link: https://arxiv.org/abs/2509.23462
Authors: Alakh Sharma,Gaurish Trivedi,Kartikey Bhandari,Yash Sinha,Dhruv Kumar,Pratik Narang,Jagat Sesh Challa
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Comments: Under review

[AI-169] Data-Efficient Training by Evolved Sampling

Link: https://arxiv.org/abs/2509.23461
Authors: Ziheng Cheng,Zhong Li,Jiang Bian
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Comments:

[AI-170] Beyond Embeddings: Interpretable Feature Extraction for Binary Code Similarity USENIX-SECURITY’26

Link: https://arxiv.org/abs/2509.23449
Authors: Charles E. Gagnon,Steven H. H. Ding,Philippe Charland,Benjamin C. M. Fung
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Software Engineering (cs.SE)
Comments: 17 pages, 7 figures, submitted to USENIX Security '26

[AI-171] Factor Decorrelation Enhanced Data Removal from Deep Predictive Models NEURIPS2025

Link: https://arxiv.org/abs/2509.23443
Authors: Wenhao Yang,Lin Li,Xiaohui Tao,Kaize Shi
Affiliations: Unknown
Categories: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Comments: Accepted by NeurIPS 2025

[AI-172] AudioRole: An Audio Dataset for Character Role-Playing in Large Language Models

Link: https://arxiv.org/abs/2509.23435
Authors: Wenyu Li,Xiaoqi Jiao,Yi Chang,Guangyan Zhang,Yiwen Guo
Affiliations: Unknown
Categories: Sound (cs.SD); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
Comments:

[AI-173] NeuroBridge: Using Generative AI to Bridge Cross-neurotype Communication Differences through Neurotypical Perspective-taking

Link: https://arxiv.org/abs/2509.23434
Authors: Rukhshan Haroon,Kyle Wigdor,Katie Yang,Nicole Toumanios,Eileen T. Crehan,Fahad Dogar
Affiliations: Unknown
Categories: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI)
Comments:

[AI-174] Democratizing AI scientists using ToolUniverse

Link: https://arxiv.org/abs/2509.23426
Authors: Shanghua Gao,Richard Zhu,Pengwei Sui,Zhenglun Kong,Sufian Aldogom,Yepeng Huang,Ayush Noori,Reza Shamji,Krishna Parvataneni,Theodoros Tsiligkaridis,Marinka Zitnik
Affiliations: Unknown
Categories: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Comments: this https URL

[AI-175] Enhancing Communication Efficiency in FL with Adaptive Gradient Quantization and Communication Frequency Optimization

Link: https://arxiv.org/abs/2509.23419
Authors: Asadullah Tariq,Tariq Qayyum,Mohamed Adel Serhani,Farag Sallabi,Ikbal Taleb,Ezedin S. Barka
Affiliations: Unknown
Categories: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)
Comments:

Machine Learning

[LG-0] R2-D2: Tree Search Guided Trajectory-Aware Fine-Tuning for Discrete Diffusion

Link: https://arxiv.org/abs/2509.25171
Authors: Sophia Tang,Yuchen Zhu,Molei Tao,Pranam Chatterjee
Categories: Machine Learning (cs.LG); Biomolecules (q-bio.BM)
Comments:

[LG-1] Physics-Informed Inductive Biases for Voltage Prediction in Distribution Grids

Link: https://arxiv.org/abs/2509.25158
Authors: Ehimare Okoyomon,Arbel Yaniv,Christoph Goebel
Categories: Machine Learning (cs.LG); Systems and Control (eess.SY)
Comments:

[LG-2] Context-Driven Performance Modeling for Causal Inference Operators on Neural Processing Units

Link: https://arxiv.org/abs/2509.25155
Authors: Neelesh Gupta,Rakshith Jayanth,Dhruv Parikh,Viktor Prasanna
Categories: Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
Comments: IEEE HiPC 2025

[LG-3] High-Dimensional Analysis of Single-Layer Attention for Sparse-Token Classification

Link: https://arxiv.org/abs/2509.25153
Authors: Nicholas Barnfield,Hugo Cui,Yue M. Lu
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Comments:

[LG-4] BALF: Budgeted Activation-Aware Low-Rank Factorization for Fine-Tuning-Free Model Compression

Link: https://arxiv.org/abs/2509.25136
Authors: David González Martínez
Categories: Machine Learning (cs.LG)
Comments:

[LG-5] Learning in an Echo Chamber: Online Learning with Replay Adversary

Link: https://arxiv.org/abs/2509.25135
Authors: Daniil Dmitriev,Harald Eskelund Franck,Carolin Heinzler,Amartya Sanyal
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Comments:

[LG-6] Towards generalizable deep ptychography neural networks

Link: https://arxiv.org/abs/2509.25104
Authors: Albert Vong,Steven Henke,Oliver Hoidn,Hanna Ruth,Junjing Deng,Alexander Hexemer,Apurva Mehta,Arianna Gleason,Levi Hancock,Nicholas Schwarz
Categories: Machine Learning (cs.LG)
Comments: Submitted to scientific journal for peer review

[LG-7] Curriculum Imitation Learning of Distributed Multi-Robot Policies

Link: https://arxiv.org/abs/2509.25097
Authors: Jesús Roche,Eduardo Sebastián,Eduardo Montijano
Categories: Robotics (cs.RO); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
Comments: Accepted and presented at the Eight Iberian Robotics Conference, 2025

[LG-8] Towards a Certificate of Trust: Task-Aware OOD Detection for Scientific AI

Link: https://arxiv.org/abs/2509.25080
Authors: Bogdan Raonić,Siddhartha Mishra,Samuel Lanthaler
Categories: Machine Learning (cs.LG)
Comments:

[LG-9] Advantage Weighted Matching: Aligning RL with Pretraining in Diffusion Models

Link: https://arxiv.org/abs/2509.25050
Authors: Shuchen Xue,Chongjian Ge,Shilong Zhang,Yichen Li,Zhi-Ming Ma
Categories: Machine Learning (cs.LG)
Comments:

[LG-10] Efficient Hyperparameter Tuning via Trajectory Invariance Principle

Link: https://arxiv.org/abs/2509.25049
Authors: Bingrui Li,Jiaxin Wen,Zhanpeng Zhou,Jun Zhu,Jianfei Chen
Categories: Machine Learning (cs.LG)
Comments:

[LG-11] A multiscale analysis of mean-field transformers in the moderate interaction regime

Link: https://arxiv.org/abs/2509.25040
Authors: Giuseppe Bruno,Federico Pasqualotto,Andrea Agazzi
Categories: Machine Learning (cs.LG); Probability (math.PR); Machine Learning (stat.ML)
Comments: 30 pages, 4 figures

[LG-12] Bayesian Surrogates for Risk-Aware Pre-Assessment of Aging Bridge Portfolios NEURIPS2025

Link: https://arxiv.org/abs/2509.25031
Authors: Sophia V. Kuhn,Rafael Bischof,Marius Weber,Antoine Binggeli,Michael A. Kraus,Walter Kaufmann,Fernando Pérez-Cruz
Categories: Machine Learning (cs.LG)
Comments: Accepted at the NeurIPS 2025 Workshop on MLxOR: Mathematical Foundations and Operational Integration of Machine Learning for Uncertainty-Aware Decision-Making

[LG-13] MARCOS: Deep Thinking by Markov Chain of Continuous Thoughts

Link: https://arxiv.org/abs/2509.25020
Authors: Jiayu Liu,Zhenya Huang,Anya Sims,Enhong Chen,Yee Whye Teh,Ning Miao
Categories: Machine Learning (cs.LG)
Comments:

[LG-14] Embedded Deep Learning for Bio-hybrid Plant Sensors to Detect Increased Heat and Ozone Levels

Link: https://arxiv.org/abs/2509.24992
Authors: Till Aust,Christoph Karl Heck,Eduard Buss,Heiko Hamann
Categories: Emerging Technologies (cs.ET); Machine Learning (cs.LG)
Comments: Submitted to IEEE Sensors 2025

[LG-15] Sampling Complexity of TD and PPO in RKHS

Link: https://arxiv.org/abs/2509.24991
Authors: Lu Zou,Wendi Ren,Weizhong Zhang,Liang Ding,Shuang Li
Categories: Machine Learning (cs.LG)
Comments:

[LG-16] Double Descent as a Lens for Sample Efficiency in Autoregressive vs. Discrete Diffusion Models

Link: https://arxiv.org/abs/2509.24974
Authors: Ahmad Fraij,Sam Dauncey
Categories: Machine Learning (cs.LG)
Comments:

[LG-17] Overlap-Adaptive Regularization for Conditional Average Treatment Effect Estimation

Link: https://arxiv.org/abs/2509.24962
Authors: Valentyn Melnychuk,Dennis Frauen,Jonas Schweisthal,Stefan Feuerriegel
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Comments:

[LG-18] Intra-request branch orchestration for efficient LLM reasoning

Link: https://arxiv.org/abs/2509.24957
Authors: Weifan Jiang,Rana Shahout,Yilun Du,Michael Mitzenmacher,Minlan Yu
Categories: Machine Learning (cs.LG)
Comments: 15 pages, 6 figures

[LG-19] OAT-FM: Optimal Acceleration Transport for Improved Flow Matching

Link: https://arxiv.org/abs/2509.24936
Authors: Angxiao Yue,Anqi Dong,Hongteng Xu
Categories: Machine Learning (cs.LG)
Comments:

[LG-20] Is Sequence Information All You Need for Bayesian Optimization of Antibodies? NEURIPS2025

Link: https://arxiv.org/abs/2509.24933
Authors: Sebastian W. Ober,Calvin McCarter,Aniruddh Raghu,Yucen Lily Li,Alan N. Amin,Andrew Gordon Wilson,Hunter Elliott
Categories: Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)
Comments: Accepted into the AI for Science Workshop, NeurIPS 2025

[LG-21] Graph Theory Meets Federated Learning over Satellite Constellations: Spanning Aggregations Network Formation and Performance Optimization

Link: https://arxiv.org/abs/2509.24932
Authors: Fardis Nadimi,Payam Abdisarabshali,Jacob Chakareski,Nicholas Mastronarde,Seyyedali Hosseinalipour
Categories: Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Networking and Internet Architecture (cs.NI)
Comments: 8 Figures, 6 Appendix

[LG-22] From Code to Action: Hierarchical Learning of Diffusion-VLM Policies

Link: https://arxiv.org/abs/2509.24917
Authors: Markus Peschl,Pietro Mazzaglia,Daniel Dijkman
Categories: Robotics (cs.RO); Machine Learning (cs.LG)
Comments: 19 pages including references, 6 figures. Accepted to CoRL LEAP 2025

[LG-23] Unmute the Patch Tokens: Rethinking Probing in Multi-Label Audio Classification ICLR2026

Link: https://arxiv.org/abs/2509.24901
Authors: Lukas Rauch,René Heinrich,Houtan Ghaffari,Lukas Miklautz,Ilyass Moummad,Bernhard Sick,Christoph Scholz
Categories: Sound (cs.SD); Machine Learning (cs.LG)
Comments: Currently under review @ICLR2026

[LG-24] Towards Understanding the Shape of Representations in Protein Language Models

Link: https://arxiv.org/abs/2509.24895
Authors: Kosio Beshkov,Anders Malthe-Sørenssen
Categories: Machine Learning (cs.LG)
Comments:

[LG-25] Adaptive Canonicalization with Application to Invariant Anisotropic Geometric Networks

Link: https://arxiv.org/abs/2509.24886
Authors: Ya-Wei Eileen Lin,Ron Levie
Categories: Machine Learning (cs.LG)
Comments:

[LG-26] DRIFT-Net: A Spectral–Coupled Neural Operator for PDEs Learning

Link: https://arxiv.org/abs/2509.24868
Authors: Jiayi Li,Flora D. Salim
Categories: Machine Learning (cs.LG); Computational Physics (physics.comp-ph)
Comments:

[LG-27] Beyond the Hook: Predicting Billboard Hot 100 Chart Inclusion with Machine Learning from Streaming Audio Signals and Perceptual Features

Link: https://arxiv.org/abs/2509.24856
Authors: Christos Mountzouris
Categories: Machine Learning (cs.LG)
Comments: 17 pages, 6 figures, 3 tables

Abstract: The advent of digital streaming platforms has recently revolutionized the landscape of the music industry, with the ensuing digitalization providing structured data collections that open new research avenues for investigating popularity dynamics and mainstream success. The present work explored which determinants hold the strongest predictive influence for a track’s inclusion in the Billboard Hot 100 charts, including streaming popularity, measurable audio signal attributes, and probabilistic indicators of human listening. The analysis revealed that popularity was by far the most decisive predictor of Billboard Hot 100 inclusion, with considerable contribution from instrumentalness, valence, duration and speechiness. Logistic Regression achieved 90.0% accuracy, with very high recall for charting singles (0.986) but lower recall for non-charting ones (0.813), yielding balanced F1-scores around 0.90. Random Forest slightly improved performance to 90.4% accuracy, maintaining near-perfect precision for non-charting singles (0.990) and high recall for charting ones (0.992), with F1-scores up to 0.91. Gradient Boosting (XGBoost) reached 90.3% accuracy, delivering a more balanced trade-off by improving recall for non-charting singles (0.837) while sustaining high recall for charting ones (0.969), resulting in F1-scores comparable to the other models.
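
The accuracy, recall and F1 figures quoted in this abstract are all derived from a two-class confusion matrix. As a rough illustration of how such per-class numbers relate to each other, here is a minimal sketch in plain Python. The counts are invented for the example and are not the paper's data, and the helper name `per_class_metrics` is hypothetical.

```python
def per_class_metrics(tp, fp, fn):
    """Return (precision, recall, f1) for one class from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative counts only (NOT taken from the paper).
# Rows are the true class, columns the predicted class,
# in the order [charting, non-charting].
confusion = [
    [90, 10],  # true charting: 90 predicted charting, 10 predicted non-charting
    [20, 80],  # true non-charting: 20 predicted charting, 80 predicted non-charting
]

# Charting class: a false positive is a non-charting track predicted as charting.
charting = per_class_metrics(
    tp=confusion[0][0], fp=confusion[1][0], fn=confusion[0][1]
)
# Non-charting class: the roles of the two rows swap.
non_charting = per_class_metrics(
    tp=confusion[1][1], fp=confusion[0][1], fn=confusion[1][0]
)

total = sum(sum(row) for row in confusion)
accuracy = (confusion[0][0] + confusion[1][1]) / total

print("charting     P/R/F1:", [round(x, 3) for x in charting])
print("non-charting P/R/F1:", [round(x, 3) for x in non_charting])
print("accuracy:", accuracy)
```

With these made-up counts, recall is higher for the charting class than for the non-charting class, which is the same kind of asymmetry the abstract reports for Logistic Regression.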

[LG-28] Cell2Text: Multimodal LLM for Generating Single-Cell Descriptions from RNA-Seq Data

Link: https://arxiv.org/abs/2509.24840
Authors: Oussama Kharouiche,Aris Markogiannakis,Xiao Fei,Michail Chatzianastasis,Michalis Vazirgiannis
Categories: Machine Learning (cs.LG); Computational Engineering, Finance, and Science (cs.CE)
Comments:

Abstract: Single-cell RNA sequencing has transformed biology by enabling the measurement of gene expression at cellular resolution, providing information for cell types, states, and disease contexts. Recently, single-cell foundation models have emerged as powerful tools for learning transferable representations directly from expression profiles, improving performance on classification and clustering tasks. However, these models are limited to discrete prediction heads, which collapse cellular complexity into predefined labels that fail to capture the richer, contextual explanations biologists need. We introduce Cell2Text, a multimodal generative framework that translates scRNA-seq profiles into structured natural language descriptions. By integrating gene-level embeddings from single-cell foundation models with pretrained large language models, Cell2Text generates coherent summaries that capture cellular identity, tissue origin, disease associations, and pathway activity, generalizing to unseen cells. Empirically, Cell2Text outperforms baselines on classification accuracy, demonstrates strong ontological consistency using PageRank-based similarity metrics, and achieves high semantic fidelity in text generation. These results demonstrate that coupling expression data with natural language offers both stronger predictive performance and inherently interpretable outputs, pointing to a scalable path for label-efficient characterization of unseen cells.

[LG-29] Efficient Sketching and Nearest Neighbor Search Algorithms for Sparse Vector Sets

Link: https://arxiv.org/abs/2509.24815
Authors: Sebastian Bruch,Franco Maria Nardini,Cosimo Rulli,Rossano Venturini
Categories: Data Structures and Algorithms (cs.DS); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Comments:

[LG-30] DyMoDreamer: World Modeling with Dynamic Modulation

Link: https://arxiv.org/abs/2509.24804
Authors: Boxuan Zhang,Runqing Wang,Wei Xiao,Weipu Zhang,Jian Sun,Gao Huang,Jie Chen,Gang Wang
Categories: Machine Learning (cs.LG)
Comments:

Abstract: A critical bottleneck in deep reinforcement learning (DRL) is sample inefficiency, as training high-performance agents often demands extensive environmental interactions. Model-based reinforcement learning (MBRL) mitigates this by building world models that simulate environmental dynamics and generate synthetic experience, improving sample efficiency. However, conventional world models process observations holistically, failing to decouple dynamic objects and temporal features from static backgrounds. This approach is computationally inefficient, especially for visual tasks where dynamic objects significantly influence rewards and decision-making performance. To address this, we introduce DyMoDreamer, a novel MBRL algorithm that incorporates a dynamic modulation mechanism to improve the extraction of dynamic features and enrich the temporal information. DyMoDreamer employs differential observations derived from a novel inter-frame differencing mask, explicitly encoding object-level motion cues and temporal dynamics. Dynamic modulation is modeled as stochastic categorical distributions and integrated into a recurrent state-space model (RSSM), enhancing the model’s focus on reward-relevant dynamics. Experiments demonstrate that DyMoDreamer sets a new state-of-the-art on the Atari 100k benchmark with a 156.6% mean human-normalized score, establishes a new record of 832 on the DeepMind Visual Control Suite, and gains a 9.5% performance improvement after 1M steps on the Crafter benchmark. Our code is released at this https URL.
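
The abstract does not spell out how the inter-frame differencing mask is defined, so the following is only a generic sketch of the underlying idea: threshold the absolute pixel change between two consecutive frames and keep the signed difference where the mask is on. The function names, the threshold value and the tiny 3x3 frames are all assumptions made for illustration, not the authors' implementation.

```python
def frame_difference_mask(prev_frame, curr_frame, threshold=0.1):
    """Binary mask: 1 where a pixel changed by more than `threshold`, else 0."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def differential_observation(prev_frame, curr_frame, threshold=0.1):
    """Signed frame difference, zeroed wherever the mask is off."""
    mask = frame_difference_mask(prev_frame, curr_frame, threshold)
    return [
        [(c - p) * m for p, c, m in zip(prev_row, curr_row, mask_row)]
        for prev_row, curr_row, mask_row in zip(prev_frame, curr_frame, mask)
    ]


# Toy grayscale frames (hypothetical): one bright pixel moves one step right.
prev = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
curr = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]

mask = frame_difference_mask(prev, curr)
diff = differential_observation(prev, curr)

for row in mask:
    print(row)
```

In this toy case the static background contributes nothing to the differential observation. Only the two pixels touched by the moving object are non-zero, which is the intuition behind separating dynamic content from static backgrounds.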

[LG-31] Physics-informed learning under mixing: How physical knowledge speeds up learning

Link: https://arxiv.org/abs/2509.24801
Authors: Anna Scampicchio,Leonardo F. Toso,Rahel Rickenbach,James Anderson,Melanie N. Zeilinger
Categories: Machine Learning (cs.LG); Systems and Control (eess.SY)
Comments:

[LG-32] Fidel-TS: A High-Fidelity Benchmark for Multimodal Time Series Forecasting

Link: https://arxiv.org/abs/2509.24789
Authors: Zhijian Xu,Wanxu Cai,Xilin Dai,Zhaorong Deng,Qiang Xu
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Comments:

[LG-33] Assessing the risk of future Dunkelflaute events for Germany using generative deep learning

Link: https://arxiv.org/abs/2509.24788
Authors: Felix Strnad,Jonathan Schmidt,Fabian Mockert,Philipp Hennig,Nicole Ludwig
Categories: Machine Learning (cs.LG); Geophysics (physics.geo-ph)
Comments:

[LG-34] MarS-FM: Generative Modeling of Molecular Dynamics via Markov State Models

Link: https://arxiv.org/abs/2509.24779
Authors: Kacper Kapuśniak,Cristian Gabellini,Michael Bronstein,Prudencio Tossou,Francesco Di Giovanni
Categories: Machine Learning (cs.LG); Biomolecules (q-bio.BM)
Comments:

[LG-35] Neural Message-Passing on Attention Graphs for Hallucination Detection

Link: https://arxiv.org/abs/2509.24770
Authors: Fabrizio Frasca,Guy Bar-Shalom,Yftah Ziser,Haggai Maron
Categories: Machine Learning (cs.LG)
Comments: Preprint. 25 pages, 2 figures

[LG-36] In-Context Learning of Temporal Point Processes with Foundation Inference Models

Link: https://arxiv.org/abs/2509.24762
Authors: David Berghaus,Patrick Seifner,Kostadin Cvejoski,César Ojeda,Ramsés J. Sánchez
Categories: Machine Learning (cs.LG)
Comments:

[LG-37] Who invented deep residual learning?

Link: https://arxiv.org/abs/2509.24732
Authors: Juergen Schmidhuber
Categories: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Comments: 12 pages, 2 illustrations, circa 100 partially annotated references

[LG-38] Beyond Softmax: A Natural Parameterization for Categorical Random Variables

Link: https://arxiv.org/abs/2509.24728
Authors: Alessandro Manenti,Cesare Alippi
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Comments:

[LG-39] Stabilizing Humanoid Robot Trajectory Generation via Physics-Informed Learning and Control-Informed Steering IROS

Link: https://arxiv.org/abs/2509.24697
Authors: Evelyn D’Elia,Paolo Maria Viceconte,Lorenzo Rapetti,Diego Ferigo,Giulio Romualdi,Giuseppe L’Erario,Raffaello Camoriano,Daniele Pucci
Categories: Robotics (cs.RO); Machine Learning (cs.LG); Systems and Control (eess.SY)
Comments: This paper has been accepted for publication at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hangzhou, China, 2025

[LG-40] HyperHELM: Hyperbolic Hierarchy Encoding for mRNA Language Modeling

Link: https://arxiv.org/abs/2509.24655
Authors: Max van Spengler,Artem Moskalev,Tommaso Mansi,Mangal Prakash,Rui Liao
Categories: Machine Learning (cs.LG); Genomics (q-bio.GN)
Comments:

[LG-41] Learning Hamiltonian Dynamics at Scale: A Differential-Geometric Approach

Link: https://arxiv.org/abs/2509.24627
Authors: Katharina Friedl,Noémie Jaquier,Mika Liao,Danica Kragic
Categories: Machine Learning (cs.LG)
Comments: 28 pages, 15 figures

[LG-42] Evaluating classification performance across operating contexts: A comparison of decision curve analysis and cost curves

Link: https://arxiv.org/abs/2509.24608
Authors: Louise AC Millard,Peter A Flach
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Comments:

[LG-43] CURA: Size Isn't All You Need - A Compact Universal Architecture for On-Device Intelligence

Link: https://arxiv.org/abs/2509.24601
Authors: Jae-Bum Seo,Muhammad Salman,Lismer Andres Caceres-Najarro
Categories: Machine Learning (cs.LG); Signal Processing (eess.SP)
Comments: 14 pages, 3 figures, 8 tables

[LG-44] Prompting Robot Teams with Natural Language

Link: https://arxiv.org/abs/2509.24575
Authors: Nicolas Pfitzer,Eduardo Sebastián,Ajay Shankar,Amanda Prorok
Categories: Robotics (cs.RO); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
Comments:

[LG-45] Learning to Solve Optimization Problems Constrained with Partial Differential Equations

Link: https://arxiv.org/abs/2509.24573
Authors: Yusuf Guven,Vincenzo Di Vito,Ferdinando Fioretto
Categories: Machine Learning (cs.LG); Optimization and Control (math.OC)
Comments:

[LG-46] Emergent World Representations in OpenVLA

Link: https://arxiv.org/abs/2509.24559
Authors: Marco Molinari,Leonardo Nevali,Saharsha Navani,Omar G. Younis
Categories: Machine Learning (cs.LG)
Comments:

[LG-47] Training-Free Multimodal Guidance for Video to Audio Generation

Link: https://arxiv.org/abs/2509.24550
Authors: Eleonora Grassucci,Giuliano Galadini,Giordano Cicchetti,Aurelio Uncini,Fabio Antonacci,Danilo Comminiello
Categories: Machine Learning (cs.LG); Sound (cs.SD)
Comments:

[LG-48] Trading Carbon for Physics: On the Resource Efficiency of Machine Learning for Spatio-Temporal Forecasting ATC

Link: https://arxiv.org/abs/2509.24517
Authors: Sophia N. Wilson,Jens Hesselbjerg Christensen,Raghavendra Selvan
Categories: Machine Learning (cs.LG)
Comments: Source code available at this https URL

[LG-49] Guided Uncertainty Learning Using a Post-Hoc Evidential Meta-Model

Link: https://arxiv.org/abs/2509.24492
Authors: Charmaine Barker,Daniel Bethell,Simos Gerasimou
Categories: Machine Learning (cs.LG)
Comments:

[LG-50] One-Prompt Strikes Back: Sparse Mixture of Experts for Prompt-based Continual Learning

Link: https://arxiv.org/abs/2509.24483
Authors: Minh Le,Bao-Ngoc Dao,Huy Nguyen,Quyen Tran,Anh Nguyen,Nhat Ho
Categories: Machine Learning (cs.LG)
Comments: 40 pages, 9 figures

[LG-51] FS-KAN: Permutation Equivariant Kolmogorov-Arnold Networks via Function Sharing

Link: https://arxiv.org/abs/2509.24472
Authors: Ran Elbaz,Guy Bar-Shalom,Yam Eitan,Fabrizio Frasca,Haggai Maron
Categories: Machine Learning (cs.LG)
Comments:

[LG-52] Interpretable Kernel Representation Learning at Scale: A Unified Framework Utilizing Nyström Approximation

Link: https://arxiv.org/abs/2509.24467
Authors: Maedeh Zarvandi,Michael Timothy,Theresa Wasserer,Debarghya Ghoshdastidar
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Comments: 19 Pages, 3 figures

[LG-53] Distributionally Robust Federated Learning with Outlier Resilience

Link: https://arxiv.org/abs/2509.24462
Authors: Zifan Wang,Xinlei Yi,Xenia Konti,Michael M. Zavlanos,Karl H. Johansson
Categories: Machine Learning (cs.LG); Optimization and Control (math.OC)
Comments:

[LG-54] Contrastive Learning for Correlating Network Incidents

Link: https://arxiv.org/abs/2509.24446
Authors: Jeremias Dötterl
Categories: Networking and Internet Architecture (cs.NI); Machine Learning (cs.LG)
Comments: Accepted at The 26th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2025). This work was partially funded by the German Federal Ministry of Research, Technology and Space (BMFTR) in the FRONT-RUNNER project (Grant 16KISR005K)

[LG-55] Semantic Compression via Multimodal Representation Learning

Link: https://arxiv.org/abs/2509.24431
Authors: Eleonora Grassucci,Giordano Cicchetti,Aurelio Uncini,Danilo Comminiello
Categories: Machine Learning (cs.LG)
Comments:

[LG-56] BiHDTrans: binary hyperdimensional transformer for efficient multivariate time series classification

Link: https://arxiv.org/abs/2509.24425
Authors: Jingtao Zhang,Yi Liu,Qi Shen,Changhong Wang
Categories: Machine Learning (cs.LG); Hardware Architecture (cs.AR)
Comments:

[LG-57] FuncPoison: Poisoning Function Library to Hijack Multi-agent Autonomous Driving Systems

Link: https://arxiv.org/abs/2509.24408
Authors: Yuzhen Long,Songze Li
Categories: Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Comments:

[LG-58] Muon: Training and Trade-offs with Latent Attention and MoE

Link: https://arxiv.org/abs/2509.24406
Authors: Sushant Mehta,Raj Dandekar,Rajat Dandekar,Sreedath Panat
Categories: Machine Learning (cs.LG)
Comments:

[LG-59] From Sound to Setting: AI-Based Equalizer Parameter Prediction for Piano Tone Replication ACL

Link: https://arxiv.org/abs/2509.24404
Authors: Song-Ze Yu
Categories: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Comments: Undergraduate project technical preprint. 4 pages, 6 figures. Code data: this https URL Primary: cs.SD, Secondary: cs.LG

[LG-60] AXIS: Explainable Time Series Anomaly Detection with Large Language Models

Link: https://arxiv.org/abs/2509.24378
Authors: Tian Lan,Hao Duong Le,Jinbo Li,Wenjun He,Meng Wang,Chenghao Liu,Chen Zhang
Categories: Machine Learning (cs.LG)
Comments:

[LG-61] Prediction-Powered Communication with Distortion Guarantees

Link: https://arxiv.org/abs/2509.24373
Authors: Matteo Zecchin,Unnikrishnan Kunnath Ganesan,Giuseppe Durisi,Petar Popovski,Osvaldo Simeone
Categories: Information Theory (cs.IT); Machine Learning (cs.LG); Signal Processing (eess.SP)
Comments:

[LG-62] Expanding Horizons of Level Diversity via Multi-objective Evolutionary Learning

Link: https://arxiv.org/abs/2509.24341
Authors: Qingquan Zhang,Ziqi Wang,Yuchen Li,Keyuan Zhang,Bo Yuan,Jialin Liu
Categories: Machine Learning (cs.LG)
Comments: 12 pages, 6 figures

[LG-63] H: An Efficient Similarity-Aware Aggregation for Byzantine Resilient Federated Learning

Link: https://arxiv.org/abs/2509.24330
Authors: Shiyuan Zuo,Rongfei Fan,Cheng Zhan,Jie Xu,Puning Zhao,Han Hu
Categories: Machine Learning (cs.LG)
Comments:

[LG-64] AuON: A Linear-time Alternative to Semi-Orthogonal Momentum Updates

Link: https://arxiv.org/abs/2509.24320
Authors: Dipan Maity
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Notes:

[LG-65] Asynchronous Policy Gradient Aggregation for Efficient Distributed Reinforcement Learning

Link: https://arxiv.org/abs/2509.24305
Authors: Alexander Tyurin, Andrei Spiridonov, Varvara Rudenko
Categories: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Optimization and Control (math.OC)
Notes:

[LG-66] ELASTIQ: EEG-Language Alignment with Semantic Task Instruction and Querying

Link: https://arxiv.org/abs/2509.24302
Authors: Muyun Jiang, Shuailei Zhang, Zhenjie Yang, Mengjun Wu, Weibang Jiang, Zhiwei Guo, Wei Zhang, Rui Liu, Shangen Zhang, Yong Li, Yi Ding, Cuntai Guan
Categories: Machine Learning (cs.LG)
Notes:

[LG-67] VeriLLM: A Lightweight Framework for Publicly Verifiable Decentralized Inference

Link: https://arxiv.org/abs/2509.24257
Authors: Ke Wang, Felix Qu, Libin Xia, Zishuo Zhao, Chris Tong, Lynn Ai, Eric Yang
Categories: Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Notes: 13 pages, 4 figures, 2 tables

[LG-68] Understanding Cognitive States from Head Hand Motion Data

Link: https://arxiv.org/abs/2509.24255
Authors: Kaiang Wen, Mark Roman Miller
Categories: Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
Notes:

[LG-69] Accessible Realistic and Fair Evaluation of Positive-Unlabeled Learning Algorithms

Link: https://arxiv.org/abs/2509.24228
Authors: Wei Wang, Dong-Dong Wu, Ming Li, Jingxiong Zhang, Gang Niu, Masashi Sugiyama
Categories: Machine Learning (cs.LG)
Notes:

[LG-70] Proposing a Framework for Machine Learning Adoption on Legacy Systems ICDM’25

Link: https://arxiv.org/abs/2509.24224
Authors: Ashiqur Rahman, Hamed Alhoori
Categories: Machine Learning (cs.LG)
Notes: Accepted at The First International Workshop on Resilient Artificial Intelligence for Manufacturing (ICDM’25)

[LG-71] MDD-Thinker: Towards Large Reasoning Models for Major Depressive Disorder Diagnosis

Link: https://arxiv.org/abs/2509.24217
Authors: Yuyang Sha, Hongxin Pan, Gang Luo, Caijuan Shi, Jing Wang, Kefeng Li
Categories: Machine Learning (cs.LG); Numerical Analysis (math.NA)
Notes:

[LG-72] Negative Pre-activations Differentiate Syntax

Link: https://arxiv.org/abs/2509.24198
Authors: Linghao Kong, Angelina Ning, Micah Adler, Nir Shavit
Categories: Machine Learning (cs.LG)
Notes: 10 pages, 7 figures

Abstract: A recently discovered class of entangled neurons, known as Wasserstein neurons, is disproportionately critical in large language models despite constituting only a very small fraction of the network: their targeted removal collapses the model, consistent with their unique role in differentiating similar inputs. Interestingly, in Wasserstein neurons immediately preceding smooth activation functions, such differentiation manifests in the negative pre-activation space, especially in early layers. Pairs of similar inputs are driven to highly distinct negative values, and these pairs involve syntactic tokens such as determiners and prepositions. We show that this negative region is functional rather than simply favorable for optimization. A minimal, sign-specific intervention that zeroes only the negative pre-activations of a small subset of entangled neurons significantly weakens overall model function and disrupts grammatical behavior, while both random and perplexity-matched controls leave grammatical performance largely unchanged. Part-of-speech analysis localizes the excess surprisal to syntactic scaffolding tokens, and layer-specific interventions reveal that small local degradations accumulate across depth. Over training checkpoints, the same ablation impairs grammatical behavior as Wasserstein neurons emerge and stabilize. Together, these results identify negative differentiation in a sparse subset of entangled neurons as a crucial mechanism that language models rely on for syntax.
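
The sign-specific intervention described in the abstract (zeroing only the negative pre-activations of a chosen subset of neurons) can be illustrated with a small sketch. This is not the authors' code: the function name `zero_negative_preactivations`, the toy matrix, and the selected neuron indices are all hypothetical, and how the paper actually identifies Wasserstein neurons is not reproduced here.

```python
from typing import List, Sequence, Set


def zero_negative_preactivations(
    preacts: Sequence[Sequence[float]],
    neuron_ids: Set[int],
) -> List[List[float]]:
    """Return a copy of `preacts` (tokens x neurons) in which negative values
    are set to 0.0, but only in the columns listed in `neuron_ids`.

    Positive values, and every value in unselected columns, are left untouched.
    """
    ablated = []
    for row in preacts:
        new_row = [
            0.0 if (j in neuron_ids and value < 0.0) else value
            for j, value in enumerate(row)
        ]
        ablated.append(new_row)
    return ablated


# Toy pre-activations: 2 tokens x 4 neurons (made-up numbers, not paper data).
toy = [
    [-1.5, 0.3, -0.2, 2.0],
    [0.7, -0.9, -3.1, -0.4],
]
# Hypothetical subset of neurons to intervene on.
selected = {0, 2}
result = zero_negative_preactivations(toy, selected)
print(result)
```

In a real model the same masking would presumably be applied to the pre-activation tensor of one layer during the forward pass; the sketch only shows the element-wise rule on plain lists.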

[LG-73] FM-FoG: A Real-Time Foundation Model-based Wearable System for Freezing-of-Gait Mitigation

Link: https://arxiv.org/abs/2509.24176
Authors: Chuntian Chi, John Clapham, Leslie Cloud, Ingrid Pretzer-Aboff, GinaMari Blackwell, Huajie Shao, Gang Zhou
Categories: Machine Learning (cs.LG)
Notes: This is a preprint version, 12 pages, 7 figures, 8 tables

[LG-74] Model Correlation Detection via Random Selection Probing

Link: https://arxiv.org/abs/2509.24171
Authors: Ruibo Chen, Sheng Zhang, Yihan Wu, Tong Zheng, Peihua Mai, Heng Huang
Categories: Machine Learning (cs.LG)
Notes:

[LG-75] Multi-Scale Geometric Autoencoder

Link: https://arxiv.org/abs/2509.24168
Authors: Qipeng Zhan, Zhuoping Zhou, Zexuan Wang, Li Shen
Categories: Machine Learning (cs.LG)
Notes:

[LG-76] Evaluation of Machine and Deep Learning Techniques for Cyclone Trajectory Regression and Status Classification by Time Series Data

Link: https://arxiv.org/abs/2509.24146
Authors: Ethan Zachary Lo, Dan Chie-Tien Lo
Categories: Machine Learning (cs.LG)
Notes:

[LG-77] A signal separation view of classification

Link: https://arxiv.org/abs/2509.24140
Authors: H. N. Mhaskar, Ryan O’Dowd
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Notes:

[LG-78] Echo Flow Networks

Link: https://arxiv.org/abs/2509.24122
Authors: Hongbo Liu, Jia Xu
Categories: Machine Learning (cs.LG)
Notes: Under Review

[LG-79] HyMaTE: A Hybrid Mamba and Transformer Model for EHR Representation Learning

Link: https://arxiv.org/abs/2509.24118
Authors: Md Mozaharul Mottalib, Thao-Ly T. Phan, Rahmatollah Beheshti
Categories: Machine Learning (cs.LG)
Notes:

[LG-80] GeoFunFlow: Geometric Function Flow Matching for Inverse Operator Learning over Complex Geometries

Link: https://arxiv.org/abs/2509.24117
Authors: Sifan Wang, Zhikai Wu, David van Dijk, Lu Lu
Categories: Machine Learning (cs.LG); Computational Physics (physics.comp-ph); Machine Learning (stat.ML)
Notes: 26 pages, 13 figures, 9 tables

[LG-81] ADAPT: Lightweight Long-Range Machine Learning Force Fields Without Graphs

Link: https://arxiv.org/abs/2509.24115
Authors: Evan Dramko, Yihuang Xiong, Yizhi Zhu, Geoffroy Hautier, Thomas Reps, Christopher Jermaine, Anastasios Kyrillidis
Categories: Machine Learning (cs.LG); Materials Science (cond-mat.mtrl-sci); Optimization and Control (math.OC)
Notes: 14 total pages of main content, 4 of references, 3 in Appendix

[LG-82] Demographic-Agnostic Fairness without Harm

Link: https://arxiv.org/abs/2509.24077
Authors: Zhongteng Cai, Mohammad Mahdi Khalili, Xueru Zhang
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Notes:

[LG-83] A Family of Kernelized Matrix Costs for Multiple-Output Mixture Neural Networks

Link: https://arxiv.org/abs/2509.24076
Authors: Bo Hu, José C. Príncipe
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Notes:

[LG-84] On The Variability of Concept Activation Vectors AAAI-26

Link: https://arxiv.org/abs/2509.24058
Authors: Julia Wenkmann, Damien Garreau
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Notes: 26 pages (including appendix), 24 figures (44 panels). Submitted to AAAI-26

[LG-85] Collaborative Device-Cloud LLM Inference through Reinforcement Learning

Link: https://arxiv.org/abs/2509.24050
Authors: Wenzhi Fang, Dong-Jun Han, Liangqi Yuan, Christopher Brinton
Categories: Machine Learning (cs.LG)
Notes: We propose a unified post-training framework that integrates routing optimization, enabling the on-device LLM to improve its problem-solving ability while learning routing strategies

[LG-86] Optimism as Risk-Seeking in Multi-Agent Reinforcement Learning

Link: https://arxiv.org/abs/2509.24047
Authors: Runyu Zhang, Na Li, Asuman Ozdaglar, Jeff Shamma, Gioele Zardini
Categories: Machine Learning (cs.LG); Systems and Control (eess.SY); Optimization and Control (math.OC)
Notes:

[LG-87] Pretraining Scaling Laws for Generative Evaluations of Language Models

Link: https://arxiv.org/abs/2509.24012
Authors: Rylan Schaeffer, Noam Levi, Brando Miranda, Sanmi Koyejo
Categories: Machine Learning (cs.LG)
Notes:

[LG-88] Does Weak-to-strong Generalization Happen under Spurious Correlations?

Link: https://arxiv.org/abs/2509.24005
Authors: Chenruo Liu, Yijun Dong, Qi Lei
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Notes:

[LG-89] Curriculum-Guided Reinforcement Learning for Synthesizing Gas-Efficient Financial Derivatives Contracts

Link: https://arxiv.org/abs/2509.23976
Authors: Maruf Ahmed Mridul, Oshani Seneviratne
Categories: Machine Learning (cs.LG)
Notes: 8 pages, 3 figures, 2 tables

[LG-90] Equation-Free Coarse Control of Distributed Parameter Systems via Local Neural Operators

Link: https://arxiv.org/abs/2509.23975
Authors: Gianluca Fabiani, Constantinos Siettos, Ioannis G. Kevrekidis
Categories: Systems and Control (eess.SY); Machine Learning (cs.LG); Numerical Analysis (math.NA); Optimization and Control (math.OC)
Notes: 8 pages, 2 figures

[LG-91] Evaluating the Robustness of Chinchilla Compute-Optimal Scaling

Link: https://arxiv.org/abs/2509.23963
Authors: Rylan Schaeffer, Noam Levi, Andreas Kirsch, Theo Guenais, Brando Miranda, Elyas Obbad, Sanmi Koyejo
Categories: Machine Learning (cs.LG)
Notes:

[LG-92] Learning-Based Testing for Deep Learning: Enhancing Model Robustness with Adversarial Input Prioritization

Link: https://arxiv.org/abs/2509.23961
Authors: Sheikh Md Mushfiqur Rahman, Nasir Eisty
Categories: Software Engineering (cs.SE); Machine Learning (cs.LG)
Notes:

[LG-93] DiBS-MTL: Transformation-Invariant Multitask Learning with Direction Oracles

Link: https://arxiv.org/abs/2509.23948
Authors: Surya Murthy, Kushagra Gupta, Mustafa O. Karabag, David Fridovich-Keil, Ufuk Topcu
Categories: Machine Learning (cs.LG)
Notes:

[LG-94] Efficient Identification of High Similarity Clusters in Polygon Datasets

Link: https://arxiv.org/abs/2509.23942
Authors: John N. Daras
Categories: Machine Learning (cs.LG); Databases (cs.DB); Quantitative Methods (q-bio.QM)
Notes: 11 pages, 3 figures

[LG-95] Brain-language fusion enables interactive neural readout and in-silico experimentation

Link: https://arxiv.org/abs/2509.23941
Authors: Victoria Bosch, Daniel Anthes, Adrien Doerig, Sushrut Thorat, Peter König, Tim Christian Kietzmann
Categories: Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC)
Notes:

[LG-96] Integrated Communication and Control for Energy-Efficient UAV Swarms: A Multi-Agent Reinforcement Learning Approach

Link: https://arxiv.org/abs/2509.23905
Authors: Tianjiao Sun, Ningyan Guo, Haozhe Gu, Yanyan Peng, Zhiyong Feng
Categories: Machine Learning (cs.LG); Signal Processing (eess.SP); Systems and Control (eess.SY)
Notes:

[LG-97] Differentiable Sparsity via D-Gating: Simple and Versatile Structured Penalization

Link: https://arxiv.org/abs/2509.23898
Authors: Chris Kolb, Laetitia Frost, Bernd Bischl, David Rügamer
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Notes:

[LG-98] Trained Mamba Emulates Online Gradient Descent in In-Context Linear Regression

Link: https://arxiv.org/abs/2509.23779
Authors: Jiarui Jiang, Wei Huang, Miao Zhang, Taiji Suzuki, Liqiang Nie
Categories: Machine Learning (cs.LG)
Notes:

[LG-99] VioPTT: Violin Technique-Aware Transcription from Synthetic Data Augmentation

Link: https://arxiv.org/abs/2509.23759
Authors: Ting-Kang Wang, Yueh-Po Peng, Li Su, Vincent K.M. Cheung
Categories: Sound (cs.SD); Machine Learning (cs.LG)
Notes:

[LG-100] An Investigation of Batch Normalization in Off-Policy Actor-Critic Algorithms

Link: https://arxiv.org/abs/2509.23750
Authors: Li Wang, Sudun, Xingjian Zhang, Wenjun Wu, Lei Huang
Categories: Machine Learning (cs.LG)
Notes:

[LG-101] Time-Shifted Token Scheduling for Symbolic Music Generation

Link: https://arxiv.org/abs/2509.23749
Authors: Ting-Kang Wang, Chih-Pin Tan, Yi-Hsuan Yang
Categories: Machine Learning (cs.LG)
Notes:

[LG-102] A Self-Adaptive Frequency Domain Network for Continuous Intraoperative Hypotension Prediction ECAI2025

Link: https://arxiv.org/abs/2509.23720
Authors: Xian Zeng, Tianze Xu, Kai Yang, Jie Sun, Youran Wang, Jun Xu, Mucheng Ren
Categories: Machine Learning (cs.LG)
Notes: Accepted at ECAI 2025 main conference

[LG-103] FraudTransformer: Time-Aware GPT for Transaction Fraud Detection

Link: https://arxiv.org/abs/2509.23712
Authors: Gholamali Aminian, Andrew Elliott, Tiger Li, Timothy Cheuk Hin Wong, Victor Claude Dehon, Lukasz Szpruch, Carsten Maple, Christopher Read, Martin Brown, Gesine Reinert, Mo Mamouei
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Notes: Pre-print

[LG-104] Merge Now Regret Later: The Hidden Cost of Model Merging is Adversarial Transferability

Link: https://arxiv.org/abs/2509.23689
Authors: Ankit Gangwal, Aaryan Ajay Sharma
Categories: Machine Learning (cs.LG)
Notes:

[LG-105] FedDAPL: Toward Client-Private Generalization in Federated Learning

Link: https://arxiv.org/abs/2509.23688
Authors: Soroosh Safari Loaliyan, Jose-Luis Ambite, Paul M. Thompson, Neda Jahanshad, Greg Ver Steeg
Categories: Machine Learning (cs.LG)
Notes: 4 Pages

[LG-106] Hedonic Neurons: A Mechanistic Mapping of Latent Coalitions in Transformer MLPs

Link: https://arxiv.org/abs/2509.23684
Authors: Tanya Chowdhury, Atharva Nijasure, Yair Zick, James Allan
Categories: Machine Learning (cs.LG)
Notes: Preprint

[LG-107] Decentralized Dynamic Cooperation of Personalized Models for Federated Continual Learning

Link: https://arxiv.org/abs/2509.23683
Authors: Danni Yang, Zhikang Chen, Sen Cui, Mengyue Yang, Ding Li, Abudukelimu Wuerkaixi, Haoxuan Li, Jinke Ren, Mingming Gong
Categories: Machine Learning (cs.LG)
Notes:

[LG-108] Multi-Scale Spatial-Temporal Hypergraph Network with Lead-Lag Structures for Stock Time Series Forecasting

Link: https://arxiv.org/abs/2509.23668
Authors: Xiangfei Qiu, Liu Yang, Hanyin Cheng, Xingjian Wu, Rongjia Wu, Zhigang Zhang, Ding Tu, Chenjuan Guo, Bin Yang, Christian S. Jensen, Jilin Hu
Categories: Machine Learning (cs.LG)
Notes:

[LG-109] Why Alignment Must Precede Distillation: A Minimal Working Explanation

Link: https://arxiv.org/abs/2509.23667
Authors: Sungmin Cha, Kyunghyun Cho
Categories: Machine Learning (cs.LG)
Notes: Preprint

[LG-110] Virtual Nodes based Heterogeneous Graph Convolutional Neural Network for Efficient Long-Range Information Aggregation

Link: https://arxiv.org/abs/2509.23660
Authors: Ranhui Yan, Jia cai
Categories: Machine Learning (cs.LG)
Notes:

[LG-111] PreScope: Unleashing the Power of Prefetching for Resource-Constrained MoE Inference

Link: https://arxiv.org/abs/2509.23638
Authors: Enda Yu, Zhaoning Zhang, Dezun Dong, Yongwei Wu, Xiangke Liao
Categories: Machine Learning (cs.LG)
Notes:

[LG-112] DRIK: Distribution-Robust Inductive Kriging without Information Leakage

Link: https://arxiv.org/abs/2509.23631
Authors: Chen Yang, Changhao Zhao, Chen Wang, Jiansheng Fan
Categories: Machine Learning (cs.LG)
Notes:

[LG-113] Communication-aware Wide-Area Damping Control using Risk-Constrained Reinforcement Learning

Link: https://arxiv.org/abs/2509.23620
Authors: Kyung-bin Kwon, Lintao Ye, Vijay Gupta, Hao Zhu
Categories: Systems and Control (eess.SY); Machine Learning (cs.LG)
Notes: 12 pages, 14 figures, Accepted for publication in IEEE Transactions on Smart Grid, 2025

[LG-114] Avoid Catastrophic Forgetting with Rank-1 Fisher from Diffusion Models

Link: https://arxiv.org/abs/2509.23593
Authors: Zekun Wang, Anant Gupta, Zihan Dong, Christopher J. MacLellan
Categories: Machine Learning (cs.LG)
Notes: 18 pages, 14 figures

[LG-115] Sketching Low-Rank Plus Diagonal Matrices

Link: https://arxiv.org/abs/2509.23587
Authors: Andres Fernandez, Felix Dangel, Philipp Hennig, Frank Schneider
Categories: Machine Learning (cs.LG); Numerical Analysis (math.NA)
Notes:

[LG-116] EVO-LRP: Evolutionary Optimization of LRP for Interpretable Model Explanations

Link: https://arxiv.org/abs/2509.23585
Authors: Emerald Zhang, Julian Weaver, Edward Castillo
Categories: Machine Learning (cs.LG)
Notes: 15 pages

[LG-117] Improving constraint-based discovery with robust propagation and reliable LLM priors

Link: https://arxiv.org/abs/2509.23570
Authors: Ruiqi Lyu, Alistair Turcan, Martin Jinye Zhang, Bryan Wilder
Categories: Machine Learning (cs.LG)
Notes:

[LG-118] Network-Optimised Spiking Neural Network for Event-Driven Networking

Link: https://arxiv.org/abs/2509.23516
Authors: Muhammad Bilal
Categories: Neural and Evolutionary Computing (cs.NE); Machine Learning (cs.LG); Networking and Internet Architecture (cs.NI); Optimization and Control (math.OC)
Notes: 52 pages, 16 figures, 9 tables

[LG-119] Beyond Outliers: A Study of Optimizers Under Quantization

Link: https://arxiv.org/abs/2509.23500
Authors: Georgios Vlassis, Saleh Ashkboos, Alexandra Volkova, Torsten Hoefler, Dan Alistarh
Categories: Machine Learning (cs.LG)
Notes: 20 pages

[LG-120] Statistical Learning Guarantees for Group-Invariant Barron Functions

Link: https://arxiv.org/abs/2509.23474
Authors: Yahong Yang, Wei Zhu
Categories: Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)
Notes:

[LG-121] Drift-Adapter: A Practical Approach to Near Zero-Downtime Embedding Model Upgrades in Vector Databases EMNLP2025

Link: https://arxiv.org/abs/2509.23471
Authors: Harshil Vejendla
Categories: Machine Learning (cs.LG); Information Retrieval (cs.IR)
Notes: EMNLP 2025 Main, 12 pages, 6 figures

[LG-122] Solve Smart Not Often: Policy Learning for Costly MILP Re-solving

Link: https://arxiv.org/abs/2509.23470
Authors: Rui Ai, Hugo De Oliveira Barbalho, Sirui Li, Alexei Robsky, David Simchi-Levi, Ishai Menache
Categories: Machine Learning (cs.LG)
Notes:

[LG-123] PHASE: Physics-Integrated Heterogeneity-Aware Surrogates for Scientific Simulations

Link: https://arxiv.org/abs/2509.23453
Authors: Dawei Gao, Dali Wang, Zhuowei Gu, Qinglei Cao, Xiao Wang, Peter Thornton, Dan Ricciuto, Yunhe Feng
Categories: Machine Learning (cs.LG); Computational Physics (physics.comp-ph)
Notes: 19 pages, 13 figures

[LG-124] Better Hessians Matter: Studying the Impact of Curvature Approximations in Influence Functions

Link: https://arxiv.org/abs/2509.23437
Authors: Steve Hong, Runa Eschenhagen, Bruno Mlodozeniec, Richard Turner
Categories: Machine Learning (cs.LG); Machine Learning (stat.ML)
Notes:

[LG-125] LOTFormer: Doubly-Stochastic Linear Attention via Low-Rank Optimal Transport

Link: https://arxiv.org/abs/2509.23436
Authors: Ashkan Shahbazi, Chayne Thrash, Yikun Bai, Keaton Hamm, Navid NaderiAlizadeh, Soheil Kolouri
Categories: Machine Learning (cs.LG)
Notes:

Information Retrieval

[IR-0] UniDex: Rethinking Search Inverted Indexing with Unified Semantic Modeling

Link: https://arxiv.org/abs/2509.24632
Authors: Zan Li, Jiahui Chen, Yuan Chai, Xiaoze Jiang, Xiaohua Qi, Zhiheng Qin, Runbin Zhou, Shun Zuo, Guangchao Hao, Kefeng Wang, Jingshan Lv, Yupeng Huang, Xiao Liang, Han Li
Categories: Information Retrieval (cs.IR)
Notes: 11 pages, 6 figures and 5 tables

[IR-1] Semantic Representation of Processes with Ontology Design Patterns

Link: https://arxiv.org/abs/2509.23776
Authors: Ebrahim Norouzi, Sven Hertling, Jörg Waitelonis, Harald Sack
Categories: Information Retrieval (cs.IR); Information Theory (cs.IT)
Notes:

[IR-2] Constructing Opera Seria in the Iberian Courts: Metastasian Repertoire for Spain and Portugal

Link: https://arxiv.org/abs/2509.23771
Authors: Ana Llorens, Alvaro Torrente
Categories: Information Retrieval (cs.IR)
Notes:

Attachment Download

Click to download today's full list of papers
