| 微调CLIP的最后一个视觉投影器:少量样本的丰富资源 |
Mohammad Fahes |
PDF |
N/A |
Fine-Tuning CLIP's Last Visual Projector: A Few-Shot Cornucopia |
| 数据顾问:大型语言模型安全对齐的动态数据管理 |
Fei Wang |
PDF |
N/A |
Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models |
| 在多模态数据中定位部分定义的事件 |
Kate Sanders |
PDF |
N/A |
Grounding Partially-Defined Events in Multimodal Data |
| 使用密集特征进行脑图绘制:利用视觉变换器将皮质语义选择性锚定在自然图像上 |
Andrew F. Luo |
PDF |
N/A |
Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers |
| PrefixQuant:静态量化通过LLMs中的前缀异常值胜过动态量化 |
Mengzhao Chen |
PDF |
N/A |
PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs |
| 偏差下的回归共形预测 |
Matt Y. Cheung |
PDF |
N/A |
Regression Conformal Prediction under Bias |
| TurtleBench:通过真实世界的“是/否”谜题评估顶尖语言模型 |
Qingchen Yu |
PDF |
N/A |
TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles |
| TextHawk2:一款大型视觉语言模型,在双语OCR和定位方面表现卓越,仅需16分之一的Token数量 |
Ya-Qi Yu |
PDF |
N/A |
TextHawk2: A Large Vision-Language Model Excels in Bilingual OCR and Grounding with 16x Fewer Tokens |
| DART:一种基于扩散的自回归运动模型,用于实时文本驱动的运动控制 |
Kaifeng Zhao |
PDF |
N/A |
DART: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control |
| GS-VTON:基于高斯溅射的可控3D虚拟试衣 |
Yukang Cao |
PDF |
N/A |
GS-VTON: Controllable 3D Virtual Try-on with Gaussian Splatting |
| 差分Transformer |
Tianzhu Ye |
PDF |
N/A |
Differential Transformer |
| SePPO:用于扩散对齐的半策略偏好优化 |
Daoan Zhang |
PDF |
N/A |
SePPO: Semi-Policy Preference Optimization for Diffusion Alignment |
| GLEE:一个基于语言的经济环境统一框架和基准测试 |
Eilam Shapira |
PDF |
N/A |
GLEE: A Unified Framework and Benchmark for Language-based Economic Environments |
| 因果微叙事 |
Mourad Heddaya |
PDF |
N/A |
Causal Micro-Narratives |
| LoTLIP:改进长文本理解的语言-图像预训练 |
Wei Wu |
PDF |
N/A |
LoTLIP: Improving Language-Image Pre-training for Long Text Understanding |
| SFTMix:通过Mixup方法提升语言模型指令调优 |
Yuxin Xiao |
PDF |
N/A |
SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe |
| 像人类一样在数字世界中导航:GUI代理的通用视觉基础 |
Boyu Gou |
PDF |
N/A |
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents |
| TuneVLSeg:视觉-语言分割模型的提示调优基准 |
Rabin Adhikari |
PDF |
N/A |
TuneVLSeg: Prompt Tuning Benchmark for Vision-Language Segmentation Models |
| CasiMedicos-Arg:一个带有解释性论证结构的医学问答数据集 |
Katerina Sviridova |
PDF |
N/A |
CasiMedicos-Arg: A Medical Question Answering Dataset Annotated with Explanatory Argumentative Structures |
| DiffuseReg:用于在无监督可变形图像配准中获取变形场的去噪扩散模型 |
Yongtai Zhuo |
PDF |
N/A |
DiffuseReg: Denoising Diffusion Model for Obtaining Deformation Fields in Unsupervised Deformable Image Registration |
| SimO损失:用于细粒度监督对比学习的无锚对比损失 |
Taha Bouhsine |
PDF |
N/A |
SimO Loss: Anchor-Free Contrastive Loss for Fine-Grained Supervised Contrastive Learning |
| 对称镜头(SymmetryLens):一种通过局部性和等变性实现无监督对称学习的新候选范式 |
Onur Efe |
PDF |
N/A |
SymmetryLens: A new candidate paradigm for unsupervised symmetry learning via locality and equivariance |
| GSM-Symbolic:理解大型语言模型中数学推理的局限性 |
Iman Mirzadeh |
PDF |
N/A |
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models |
| 视频生成的黎明:对类SORA模型的初步探索 |
Ailing Zeng |
PDF |
N/A |
The Dawn of Video Generation: Preliminary Explorations with SORA-like Models |
| ETGL-DDPG:一种用于稀疏奖励连续控制的深度确定性策略梯度算法 |
Ehsan Futuhi |
PDF |
N/A |
ETGL-DDPG: A Deep Deterministic Policy Gradient Algorithm for Sparse Reward Continuous Control |
| Cookbook:通过程序化数据生成模板提升大语言模型生成能力的框架 |
Avanika Narayan |
PDF |
N/A |
Cookbook: A framework for improving LLM generative abilities via programmatic data generating templates |
| 仅用少量观测进行精确模型基准测试 |
Riccardo Fogliato |
PDF |
N/A |
Precise Model Benchmarking with Only a Few Observations |
| 使用大型语言模型进行密度估计:对上下文学习轨迹的几何研究 |
Toni J. B. Liu |
PDF |
N/A |
Density estimation with LLMs: a geometric investigation of in-context learning trajectories |
| 使用自然语言组织无结构图像集合 |
Mingxuan Liu |
PDF |
N/A |
Organizing Unstructured Image Collections using Natural Language |
| 保留预训练视觉语言模型(VLMs)的多模态能力以提升视觉语言组合性 |
Youngtaek Oh |
PDF |
N/A |
Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality |
| 研究并减轻手语理解模型中的偏见 |
Katherine Atwell |
PDF |
N/A |
Studying and Mitigating Biases in Sign Language Understanding Models |
| 超越FVD:视频生成质量的增强评估指标 |
Ge Ya |
PDF |
N/A |
Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality |
| RevisEval:通过响应自适应参考提升LLM作为评判者的能力 |
Qiyuan Zhang |
PDF |
N/A |
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References |
| 理解预热-稳定-衰减学习率:从河流谷地损失景观的角度 |
Kaiyue Wen |
PDF |
N/A |
Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape Perspective |
| LADEV:一种面向机器人操作中视觉-语言-动作模型的语言驱动测试与评估平台 |
Zhijie Wang |
PDF |
N/A |
LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation |
| 用于建模多维动态的矩阵加权网络 |
Yu Tian |
PDF |
N/A |
Matrix-weighted networks for modeling multidimensional dynamics |
| 超越相关性:机器翻译指标的可解释性评估 |
Stefano Perrella |
PDF |
N/A |
Beyond Correlation: Interpretable Evaluation of Machine Translation Metrics |
| MARs:用于空间地形基于块特征识别的多视图注意力正则化 |
Timothy Chase Jr |
PDF |
N/A |
MARs: Multi-view Attention Regularizations for Patch-based Feature Recognition of Space Terrain |
| 增强大型语言模型在医疗应用中的公平性 |
Yuelyu Ji |
PDF |
N/A |
Enhancing Equity in Large Language Models for Medical Applications |
| 在多重治疗场景下,因果效应估计是否足以实现最优推荐? |
Sherly Alfonso-Sánchez |
PDF |
N/A |
Are causal effect estimations enough for optimal recommendations under multitreatment scenarios? |
| ReasoningRank:通过基于推理的知识蒸馏来教授学生模型进行排序 |
Yuelyu Ji |
PDF |
N/A |
ReasoningRank: Teaching Student Models to Rank through Reasoning-Based Knowledge Distillation |
| Presto!蒸馏步骤与层以加速音乐生成 |
Zachary Novack |
PDF |
N/A |
Presto! Distilling Steps and Layers for Accelerating Music Generation |
| 基于大型语言模型的生成式推荐的高效推理 |
Xinyu Lin |
PDF |
N/A |
Efficient Inference for Large Language Model-based Generative Recommendation |
| 一种无需模拟的深度学习方法用于随机最优控制 |
Mengjian Hua |
PDF |
N/A |
A Simulation-Free Deep Learning Approach to Stochastic Optimal Control |
| 解读参数记忆与非参数记忆在增强检索的语言模型中的相互作用 |
Mehrdad Farahani |
PDF |
N/A |
Deciphering the Interplay of Parametric and Non-parametric Memory in Retrieval-augmented Language Models |
| VLM2Vec:训练视觉-语言模型以应对大规模多模态嵌入任务 |
Ziyan Jiang |
PDF |
N/A |
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks |
| MIBench:一个全面的模型反演攻击与防御基准测试 |
Yixiang Qiu |
PDF |
N/A |
MIBench: A Comprehensive Benchmark for Model Inversion Attack and Defense |
| PAMLR:一种基于被动-主动多臂老虎机的LoRa信道分配解决方案 |
Jihoon Yun |
PDF |
N/A |
PAMLR: A Passive-Active Multi-Armed Bandit-Based Solution for LoRa Channel Allocation |
| CTC-GMM:CTC引导的模态匹配,实现快速且准确的流式语音翻译 |
Rui Zhao |
PDF |
N/A |
CTC-GMM: CTC guided modality matching for fast and accurate streaming speech translation |
| 利用多模态扩散模型加速成像并结合辅助信息 |
Timofey Efimov |
PDF |
N/A |
Leveraging Multimodal Diffusion Models to Accelerate Imaging with Side Information |
| 无调优的双层优化:新算法与收敛性分析 |
Yifan Yang |
PDF |
N/A |
Tuning-Free Bilevel Optimization: New Algorithms and Convergence Analysis |
| LOTOS:用于训练鲁棒集成模型的逐层正交化方法 |
Ali Ebrahimpour-Boroojeny |
PDF |
N/A |
LOTOS: Layer-wise Orthogonalization for Training Robust Ensembles |
| 一个用于液冷超级计算机的数字孪生框架,并在Exascale规模上得到验证 |
Wesley Brewer |
PDF |
N/A |
A Digital Twin Framework for Liquid-cooled Supercomputers as Demonstrated at Exascale |
| 可扩展且准确的基于LLM的多智能体图推理 |
Yuwei Hu |
PDF |
N/A |
Scalable and Accurate Graph Reasoning with LLM-based Multi-Agents |
| 单调平均场博弈中的最后一次迭代收敛 |
Noboru Isobe |
PDF |
N/A |
Last Iterate Convergence in Monotone Mean Field Games |
| 不可知平滑在线学习 |
Moïse Blanchard |
PDF |
N/A |
Agnostic Smoothed Online Learning |
| Assouad、Fano 和 Le Cam 与交互:一个统一的下界框架和老虎机可学习性的刻画 |
Fan Chen |
PDF |
N/A |
Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability |
| 人类反馈高效强化学习用于在线扩散模型微调 |
Ayano Hiranaka |
PDF |
N/A |
Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning |
| AlphaRouter:结合强化学习和树搜索的量子电路路由 |
Wei Tang |
PDF |
N/A |
AlphaRouter: Quantum Circuit Routing with Reinforcement Learning and Tree Search |
| 使用生成对抗网络和闭式因子分解合成皮肤镜图像 |
Rohan Reddy Mekala |
PDF |
N/A |
Synthetic Generation of Dermatoscopic Images with GAN and Closed-Form Factorization |
| LiDAR-GS:利用高斯溅射实现实时激光雷达重仿真 |
Qifeng Chen |
PDF |
N/A |
LiDAR-GS: Real-time LiDAR Re-Simulation using Gaussian Splatting |
| 超表示:从神经网络群体中学习 |
Konstantin Schürholt |
PDF |
N/A |
Hyper-Representations: Learning from Populations of Neural Networks |
| 非渐近分析下的随机梯度下降与Richardson-Romberg外推法 |
Marina Sheshukova |
PDF |
N/A |
Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson-Romberg Extrapolation |
| AI增强的道德黑客攻击:以Linux为中心的实验 |
Haitham S. Al-Sinani |
PDF |
N/A |
AI-Enhanced Ethical Hacking: A Linux-Focused Experiment |
| MetaDD:通过神经网络架构不变泛化提升数据集蒸馏 |
Yunlong Zhao |
PDF |
N/A |
MetaDD: Boosting Dataset Distillation with Neural Network Architecture-Invariant Generalization |
| SparsePO:通过稀疏令牌掩码控制LLMs的偏好对齐 |
Fenia Christopoulou |
PDF |
N/A |
SparsePO: Controlling Preference Alignment of LLMs via Sparse Token Masks |
| CR-CTC:在CTC上的一致性正则化以提升语音识别效果 |
Zengwei Yao |
PDF |
N/A |
CR-CTC: Consistency regularization on CTC for improved speech recognition |
| IGroupSS-Mamba:用于高光谱图像分类的区间组空间-光谱Mamba |
Yan He |
PDF |
N/A |
IGroupSS-Mamba: Interval Group Spatial-Spectral Mamba for Hyperspectral Image Classification |
| 研究大型语言模型在从转录的嘈杂语音中提取语法正确句子方面的能力 |
Alina Wróblewska |
PDF |
N/A |
Investigating large language models for their competence in extracting grammatically sound sentences from transcribed noisy utterances |
| DreamSat:迈向空间物体新视角合成的通用3D模型 |
Nidhi Mathihalli |
PDF |
N/A |
DreamSat: Towards a General 3D Model for Novel View Synthesis of Space Objects |
| 人机协同推理用于交通标志检测:协作方法 YOLO 与 Video-LLaVA |
Mehdi Azarafza |
PDF |
N/A |
Human-in-the-loop Reasoning For Traffic Sign Detection: Collaborative Approach YOLO With Video-LLaVA |
| 论博弈溯源的结构及其应用 |
Shawn Bowers |
PDF |
N/A |
On the Structure of Game Provenance and its Applications |
| HyperINF:释放舒尔茨方法在数据影响力估计中的超能力 |
Xinyu Zhou |
PDF |
N/A |
HyperINF: Unleashing the HyperPower of the Schulz's Method for Data Influence Estimation |
| 大型语言模型随机性的解释敏感性:新闻文本分类案例 |
Jeremie Bogaert |
PDF |
N/A |
Explanation sensitivity to the randomness of large language models: the case of journalistic text classification |
| ScienceAgentBench:迈向数据驱动科学发现中语言代理的严格评估 |
Ziru Chen |
PDF |
N/A |
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery |
| 通过预训练Transformer进行压缩:一项关于字节级多模态数据的研究 |
David Heurtel-Depeiges |
PDF |
N/A |
Compression via Pre-trained Transformers: A Study on Byte-Level Multimodal Data |
| ZEBRA:常识问答中的零样本基于示例的检索增强 |
Francesco Maria Molfese |
PDF |
N/A |
ZEBRA: Zero-Shot Example-Based Retrieval Augmentation for Commonsense Question Answering |
| TidalDecode:利用位置持久稀疏注意力实现快速且准确的LLM解码 |
Lijie Yang |
PDF |
N/A |
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention |
| xLSTM-FER:通过扩展视觉长短期记忆网络增强学生表情识别 |
Qionghao Huang |
PDF |
N/A |
xLSTM-FER: Enhancing Student Expression Recognition with Extended Vision Long Short-Term Memory Network |
| 具有控制应用的随机浅层ReLU网络的函数梯度逼近 |
Andrew Lamperski |
PDF |
N/A |
Function Gradient Approximation with Random Shallow ReLU Networks with Control Applications |
| 面向控制的视觉潜在表示聚类 |
Han Qi |
PDF |
N/A |
Control-oriented Clustering of Visual Latent Representation |
| 通过局部-全局对比学习改进目标检测 |
Danai Triantafyllidou |
PDF |
N/A |
Improving Object Detection via Local-global Contrastive Learning |
| SELECT:图像分类数据整理策略的大规模基准 |
Benjamin Feuer |
PDF |
N/A |
SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification |
| 随机迭代中$α$-混合的转变及其在排队论中的应用 |
Attila Lovas |
PDF |
N/A |
Transition of $α$-mixing in Random Iterations with Applications in Queuing Theory |
| 通过重参数化初始化大型语言模型以缓解损失尖峰 |
Kosuke Nishida |
PDF |
N/A |
Initialization of Large Language Models via Reparameterization to Mitigate Loss Spikes |
| HE-Drive:基于视觉语言模型的人类化端到端驾驶 |
Junming Wang |
PDF |
N/A |
HE-Drive: Human-Like End-to-End Driving with Vision Language Models |
| FreSh:用于加速神经表示学习的频率偏移 |
Adam Kania |
PDF |
N/A |
FreSh: Frequency Shifting for Accelerated Neural Representation Learning |
| 基于LLM的机器翻译的提示注入攻击测试套件 |
Antonio Valerio Miceli-Barone |
PDF |
N/A |
A test suite of prompt injection attacks for LLM-based machine translation |
| 命名临床实体识别基准 |
Wadood M Abdul |
PDF |
N/A |
Named Clinical Entity Recognition Benchmark |
| 大语言模型能否在求解器的额外提示下规划路径? |
Erik Wu |
PDF |
N/A |
Can LLMs plan paths with extra hints from solvers? |
| PhotoReg:光度学注册3D高斯溅射模型 |
Ziwen Yuan |
PDF |
N/A |
PhotoReg: Photometrically Registering 3D Gaussian Splatting Models |
| 基于视觉的户外牲畜监测方法的系统文献综述:从野生动物研究中汲取的教训 |
Stacey D. Scott |
PDF |
N/A |
Systematic Literature Review of Vision-Based Approaches to Outdoor Livestock Monitoring with Lessons from Wildlife Studies |
| 通用策略的主动微调 |
Marco Bagatella |
PDF |
N/A |
Active Fine-Tuning of Generalist Policies |
| DEPT:用于预训练语言模型的解耦嵌入 |
Alex Iacob |
PDF |
N/A |
DEPT: Decoupled Embeddings for Pre-training Language Models |
| FRIDA:利用隐私攻击进行搭便车检测 |
Pol G. Recasens |
PDF |
N/A |
FRIDA: Free-Rider Detection using Privacy Attacks |
| RelUNet:用于多通道语音增强的相对通道融合U-Net |
Ibrahim Aldarmaki |
PDF |
N/A |
RelUNet: Relative Channel Fusion U-Net for Multichannel Speech Enhancement |
| 论专家发现系统的有偏评估 |
Jens-Joris Decorte |
PDF |
N/A |
On the Biased Assessment of Expert Finding Systems |
| T-JEPA:表格数据的无需增强的自监督学习 |
Hugo Thimonier |
PDF |
N/A |
T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data |
| SkillMatch:评估技能相关性的自监督学习 |
Jens-Joris Decorte |
PDF |
N/A |
SkillMatch: Evaluating Self-supervised Learning of Skill Relatedness |
| 基于负对照结果的弱假设后整合推断 |
Jin-Hong Du |
PDF |
N/A |
Assumption-Lean Post-Integrated Inference with Negative Control Outcomes |
| MC-QDSNN:使用生理信号进行压力检测的量化深度进化SNN与多树突隔室神经元 |
Ajay B. S. |
PDF |
N/A |
MC-QDSNN: Quantized Deep evolutionary SNN with Multi-Dendritic Compartment Neurons for Stress Detection using Physiological Signals |
| 分阶段和先验感知的神经语音相位预测 |
Fei Liu |
PDF |
N/A |
Stage-Wise and Prior-Aware Neural Speech Phase Prediction |
| 用于概率姿态回归的条件变分自编码器 |
Fereidoon Zangeneh |
PDF |
N/A |
Conditional Variational Autoencoders for Probabilistic Pose Regression |
| 通过乐观汤普森采样实现高效的基于模型的强化学习 |
Jasmine Bayrooti |
PDF |
N/A |
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling |
| RoWeeder:通过作物行检测实现无监督杂草映射 |
Pasquale De Marinis |
PDF |
N/A |
RoWeeder: Unsupervised Weed Mapping through Crop-Row Detection |
| 基于安全学习的模型预测控制优化:应用于电池快速充电 |
Sebastian Hirt |
PDF |
N/A |
Safe Learning-Based Optimization of Model Predictive Control: Application to Battery Fast-Charging |
| 科学写作的严谨性:标准、分析与见解 |
Joseph James |
PDF |
N/A |
On the Rigour of Scientific Writing: Criteria, Analysis, and Insights |
| 无标记二维图像婴儿姿态估计方法的比较 |
Lennart Jahn |
PDF |
N/A |
Comparison of marker-less 2D image-based methods for infant pose estimation |
| 6DGS:增强的方向感知高斯溅射用于体渲染 |
Zhongpai Gao |
PDF |
N/A |
6DGS: Enhanced Direction-Aware Gaussian Splatting for Volumetric Rendering |
| L-C4:基于语言的视频着色,实现创意与一致的色彩效果 |
Zheng Chang |
PDF |
N/A |
L-C4: Language-Based Video Colorization for Creative and Consistent Color |
| 协作!迈向求解路径规划问题的鲁棒神经方法 |
Jianan Zhou |
PDF |
N/A |
Collaboration! Towards Robust Neural Methods for Routing Problems |
| 揭示文本引导的3D人脸编辑方向 |
Zhuo Chen |
PDF |
N/A |
Revealing Directions for Text-guided 3D Face Editing |
| 激活缩放用于引导和解释语言模型 |
Niklas Stoehr |
PDF |
N/A |
Activation Scaling for Steering and Interpreting Language Models |
| Segment Anything Model的高效变体:综述 |
Xiaorui Sun |
PDF |
N/A |
On Efficient Variants of Segment Anything Model: A Survey |
| 无失败风险的无对比自监督学习 |
Emanuele Sansone |
PDF |
N/A |
Failure-Proof Non-Contrastive Self-Supervised Learning |
| 利用知识图谱和大型语言模型进行法律条文推荐:以中国刑法为例的研究 |
Yongming Chen |
PDF |
N/A |
Leverage Knowledge Graph and Large Language Model for Law Article Recommendation: A Case Study of Chinese Criminal Law |
| 实时船舶识别与地理定位以提升海上态势感知能力 |
Borja Carrillo Perez |
PDF |
N/A |
Real-time Ship Recognition and Georeferencing for the Improvement of Maritime Situational Awareness |
| 检测和近似神经网络中的冗余计算模块 |
Irene Cannistraci |
PDF |
N/A |
Detecting and Approximating Redundant Computational Blocks in Neural Networks |
| 下一状态预测产生了纠缠的、但仍具有组合性的对象表示 |
Tankred Saanum |
PDF |
N/A |
Next state prediction gives rise to entangled, yet compositional representations of objects |
| PRFusion:通过图像和点云融合实现有效且鲁棒的多模态地点识别 |
Sijie Wang |
PDF |
N/A |
PRFusion: Toward Effective and Robust Multi-Modal Place Recognition with Image and Point Cloud Fusion |
| 在大规模FPS游戏地图中训练交互式代理:基于规则增强的强化学习 |
Chen Zhang |
PDF |
N/A |
Training Interactive Agent in Large FPS Game Map with Rule-enhanced Reinforcement Learning |
| OmniBooth:通过多模态指令学习图像合成的潜在控制 |
Leheng Li |
PDF |
N/A |
OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction |
| 政府在加强人工智能部署后互联监控中的作用 |
Merlin Stein |
PDF |
N/A |
The Role of Governments in Increasing Interconnected Post-Deployment Monitoring of AI |
| 目标条件终端价值估计在实时与多任务模型预测控制中的应用 |
Mitsuki Morita |
PDF |
N/A |
Goal-Conditioned Terminal Value Estimation for Real-time and Multi-task Model Predictive Control |
| 通过LLM微调实现银行聊天机器人的意图分类 |
Bibiána Lajčinová |
PDF |
N/A |
Intent Classification for Bank Chatbots through LLM Fine-Tuning |
| 基于云的调度机制,用于可扩展且资源高效的集中式控制器 |
Achilleas Santi Seisa |
PDF |
N/A |
Cloud-Based Scheduling Mechanism for Scalable and Resource-Efficient Centralized Controllers |
| 防御即服务:针对后门图模型的黑盒防护 |
Xiao Yang |
PDF |
N/A |
Defense-as-a-Service: Black-box Shielding against Backdoored Graph Models |
| 分段线性函数的分解多面体 |
Marie-Charlotte Brandenburg |
PDF |
N/A |
Decomposition Polyhedra of Piecewise Linear Functions |
| Art2Mus:通过跨模态生成连接视觉艺术与音乐 |
Ivan Rinaldi |
PDF |
N/A |
Art2Mus: Bridging Visual Arts and Music through Cross-Modal Generation |
| 低秩连续个性化扩散模型 |
Łukasz Staniszewski |
PDF |
N/A |
Low-Rank Continual Personalization of Diffusion Models |
| D-PoSE:深度作为中间表示用于3D人体姿态和形状估计 |
Nikolaos Vasilikopoulos |
PDF |
N/A |
D-PoSE: Depth as an Intermediate Representation for 3D Human Pose and Shape Estimation |
| 经过权重衰减训练的宽神经网络可证明地表现出神经崩溃现象 |
Arthur Jacot |
PDF |
N/A |
Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse |
| 补丁已足够:针对视觉-语言预训练模型的自然主义对抗补丁 |
Dehong Kong |
PDF |
N/A |
Patch is Enough: Naturalistic Adversarial Patch against Vision-Language Pre-training Models |
| 改进KernelSHAP中的采样策略 |
Lars Henry Berge Olsen |
PDF |
N/A |
Improving the Sampling Strategy in KernelSHAP |
| 通过BoxAL主动学习提高废弃鱼类的检测 |
Maria Sokolova |
PDF |
N/A |
Improved detection of discarded fish species through BoxAL active learning |
| 利用语法归纳进行语言理解和生成 |
Jushi Kai |
PDF |
N/A |
Leveraging Grammar Induction for Language Understanding and Generation |
| TeX-NeRF:基于伪TeX视觉的神经辐射场 |
Chonghao Zhong |
PDF |
N/A |
TeX-NeRF: Neural Radiance Fields from Pseudo-TeX Vision |
| 关于带有符号梯度下降的双层Transformer的优化与泛化 |
Bingrui Li |
PDF |
N/A |
On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent |
| 使用Kolmogorov Arnold和卷积神经网络的艺术伪造检测 |
Sandro Boccuzzo |
PDF |
N/A |
Art Forgery Detection using Kolmogorov Arnold and Convolutional Neural Networks |
| 无需搜索掌握中国象棋AI(象棋) |
Yu Chen |
PDF |
N/A |
Mastering Chinese Chess AI (Xiangqi) Without Search |
| 通过自动任务生成实现机器人操作的无监督技能发现 |
Paul Jansonnie |
PDF |
N/A |
Unsupervised Skill Discovery for Robotic Manipulation through Automatic Task Generation |
| TimeCNN:在时间序列预测中,优化时间点上的跨变量交互 |
Ao Hu |
PDF |
N/A |
TimeCNN: Refining Cross-Variable Interaction on Time Point for Time Series Forecasting |
| 因果上下文调整损失用于学习型图像压缩 |
Minghao Han |
PDF |
N/A |
Causal Context Adjustment Loss for Learned Image Compression |
| PostEdit:高效零样本图像编辑的后验采样 |
Feng Tian |
PDF |
N/A |
PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing |
| 通过上下文示例实现的一个简单的图像分割框架 |
Yang Liu |
PDF |
N/A |
A Simple Image Segmentation Framework via In-Context Examples |
| 强模型崩溃 |
Elvis Dohmatob |
PDF |
N/A |
Strong Model Collapse |
| 通过成对自我评估实现考虑推理依据的答案验证 |
Akira Kawabata |
PDF |
N/A |
Rationale-Aware Answer Verification by Pairwise Self-Evaluation |
| 简单如微调:通过双向负反馈损失实现LLM对齐 |
Xin Mao |
PDF |
N/A |
As Simple as Fine-tuning: LLM Alignment via Bidirectional Negative Feedback Loss |
| 多模态融合策略用于映射生物物理景观特征 |
Lucia Gordon |
PDF |
N/A |
Multimodal Fusion Strategies for Mapping Biophysical Landscape Features |
| 驯服图神经网络中的梯度过度平滑和扩展问题 |
MoonJeong Park |
PDF |
N/A |
Taming Gradient Oversmoothing and Expansion in Graph Neural Networks |
| CAT:概念瓶颈模型的概念级后门攻击 |
Songning Lai |
PDF |
N/A |
CAT: Concept-level backdoor ATtacks for Concept Bottleneck Models |
| MINER:挖掘多模态大型语言模型中特定模态神经元的潜在模式 |
Kaichen Huang |
PDF |
N/A |
MINER: Mining the Underlying Pattern of Modality-Specific Neurons in Multimodal Large Language Models |
| 基于物理信息的图神经网络用于非线性约束优化:PINCO——一种用于交流最优潮流的求解器 |
Anna Varbella |
PDF |
N/A |
Physics-Informed GNN for non-linear constrained optimization: PINCO a solver for the AC-optimal power flow |
| 资源高效的多视角感知:结合语义掩码与掩码自编码器 |
Kosta Dakic |
PDF |
N/A |
Resource-Efficient Multiview Perception: Integrating Semantic Masking with Masked Autoencoders |
| 基于人工智能的生物树构建综述:优先级、方法、应用与趋势 |
Zelin Zang |
PDF |
N/A |
A Review of Artificial Intelligence based Biological-Tree Construction: Priorities, Methods, Applications and Trends |
| 从时间序列数据中学习可解释的层次动力系统模型 |
Manuel Brenner |
PDF |
N/A |
Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data |
| 学习基于微分方程的高效且有效的图像恢复轨迹 |
Zhiyu Zhu |
PDF |
N/A |
Learning Efficient and Effective Trajectories for Differential Equation-based Image Restoration |
| FedBiP:基于个性化潜在扩散模型的异构一次性联邦学习 |
Haokun Chen |
PDF |
N/A |
FedBiP: Heterogeneous One-Shot Federated Learning with Personalized Latent Diffusion Models |
| LPZero:从零开始的零成本代理搜索语言模型 |
Peijie Dong |
PDF |
N/A |
LPZero: Language Model Zero-cost Proxy Search from Zero |
| Timer-XL:用于统一时间序列预测的长上下文Transformer |
Yong Liu |
PDF |
N/A |
Timer-XL: Long-Context Transformers for Unified Time Series Forecasting |
| 冲突地区建筑物损毁评估:利用地理空间亚米级分辨率数据的深度学习方法 |
Matteo Risso |
PDF |
N/A |
Building Damage Assessment in Conflict Zones: A Deep Learning Approach Using Geospatial Sub-Meter Resolution Data |
| 通过推理时注意力工程改进带有伪影抑制的图像聚类 |
Kazumoto Nakamura |
PDF |
N/A |
Improving Image Clustering with Artifacts Attenuation via Inference-Time Attention Engineering |
| 色彩转换:一种新颖的图像着色方法 |
Hamza Shafiq |
PDF |
N/A |
Transforming Color: A Novel Image Colorization Method |
| DAPE V2:将处理注意力分数作为特征图用于长度外推 |
Chuanyang Zheng |
PDF |
N/A |
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation |
| 代表未被充分代表的群体:发展泰国大型语言模型的文化和核心能力基准 |
Dahyun Kim |
PDF |
N/A |
Representing the Under-Represented: Cultural and Core Capability Benchmarks for Developing Thai Large Language Models |
| GARLIC:基于分层加权图、由LLM引导动态进度控制的长文档问答 |
Xinyu Wang |
PDF |
N/A |
GARLIC: LLM-Guided Dynamic Progress Control with Hierarchical Weighted Graph for Long Document QA |
| 动画电影中混合成分的弱监督学习分析 |
Mónica Apellaniz Portos |
PDF |
N/A |
Analysis of Hybrid Compositions in Animation Film with Weakly Supervised Learning |
| 正式性受青睐:揭示大型语言模型在具有冲突知识的数据上的学习偏好 |
Jiahuan Li |
PDF |
N/A |
Formality is Favored: Unraveling the Learning Preferences of Large Language Models on Data with Conflicting Knowledge |
| 通过解读注意力因果关系减轻多模态大语言模型中的模态先验诱导幻觉 |
Guanyu Zhou |
PDF |
N/A |
Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality |
| 通过缩放初始化实现正弦神经场的快速训练 |
Taesun Yeom |
PDF |
N/A |
Fast Training of Sinusoidal Neural Fields via Scaling Initialization |
| MM-R$^3$:多模态大型语言模型(MLLMs)的一致性(或不一致性)研究 |
Shih-Han Chou |
PDF |
N/A |
MM-R$^3$: On (In-)Consistency of Multi-modal Large Language Models (MLLMs) |
| OmniBuds:一种用于高级生物传感与设备端机器学习的感官耳戴式平台 |
Alessandro Montanari |
PDF |
N/A |
OmniBuds: A Sensory Earable Platform for Advanced Bio-Sensing and On-Device Machine Learning |
| 粒球双支持向量机 |
A. Quadir |
PDF |
N/A |
Granular Ball Twin Support Vector Machine |
| 从透明度到问责制再回归:人工智能审计中访问与证据的探讨 |
Sarah H. Cen |
PDF |
N/A |
From Transparency to Accountability and Back: A Discussion of Access and Evidence in AI Auditing |
| 用于聚合物性质预测的分子拓扑深度学习 |
Cong Shen |
PDF |
N/A |
Molecular topological deep learning for polymer property prediction |
| 用于博弈论深度学习模型的Double Oracle神经架构搜索 |
Aye Phyu Phyu Aung |
PDF |
N/A |
Double Oracle Neural Architecture Search for Game Theoretic Deep Learning Models |
| WTCL-Dehaze:通过小波变换和对比学习重新思考真实世界图像去雾 |
Divine Joseph Appiah |
PDF |
N/A |
WTCL-Dehaze: Rethinking Real-world Image Dehazing via Wavelet Transform and Contrastive Learning |
| 随机龙格-库塔方法:扩散模型的可证明加速 |
Yuchen Wu |
PDF |
N/A |
Stochastic Runge-Kutta Methods: Provable Acceleration of Diffusion Models |
| 合规驾驶:通过LLM增强的检索推理实现自动驾驶车辆的可解释决策 |
Tianhui Cai |
PDF |
N/A |
Driving with Regulation: Interpretable Decision-Making for Autonomous Vehicles with Retrieval-Augmented Reasoning via LLM |
| 项目聚类感知提示学习用于基于会话的推荐 |
Wooseong Yang |
PDF |
N/A |
Item Cluster-aware Prompt Learning for Session-based Recommendation |
| ImProver:基于代理的自动证明优化 |
Riyaz Ahuja |
PDF |
N/A |
ImProver: Agent-Based Automated Proof Optimization |
| 文档级因果关系抽取与知识引导的二元问答 |
Zimu Wang |
PDF |
N/A |
Document-level Causal Relation Extraction with Knowledge-guided Binary Question Answering |
| 大型语言和视觉模型的引人入胜的特性 |
Young-Jun Lee |
PDF |
N/A |
Intriguing Properties of Large Language and Vision Models |
| LLaVA需要更多知识:通过知识图谱增强检索的自然语言生成,用于解释胸部病理 |
Ameer Hamza |
PDF |
N/A |
LLaVA Needs More Knowledge: Retrieval Augmented Natural Language Generation with Knowledge Graph for Explaining Thoracic Pathologies |
| 智能能源管理:基于过程结构的混合神经网络用于综合系统中的最优调度和经济预测控制 |
Long Wu |
PDF |
N/A |
Smart energy management: process structure-based hybrid neural networks for optimal scheduling and economic predictive control in integrated systems |
| 评估时空模型在城市场景中的泛化能力 |
Hongjun Wang |
PDF |
N/A |
Evaluating the Generalization Ability of Spatiotemporal Model in Urban Scenario |
| TableRAG:借助语言模型实现百万级标记表格理解 |
Si-An Chen |
PDF |
N/A |
TableRAG: Million-Token Table Understanding with Language Models |
| 3D视觉中的扩散模型:综述 |
Zhen Wang |
PDF |
N/A |
Diffusion Models in 3D Vision: A Survey |
| TLDR:用于大型视觉语言模型的令牌级侦探奖励模型 |
Deqing Fu |
PDF |
N/A |
TLDR: Token-Level Detective Reward Model for Large Vision Language Models |
| PredFormer:Transformer是有效的时空预测学习器 |
Yujin Tang |
PDF |
N/A |
PredFormer: Transformers Are Effective Spatial-Temporal Predictive Learners |
| 具有强化位置嵌入的高效变换器用于语言模型 |
Yen-Che Hsiao |
PDF |
N/A |
Efficient transformer with reinforced position embedding for language models |
| 遗忘曲线:评估长上下文模型记忆能力的可靠方法 |
Xinyu Liu |
PDF |
N/A |
Forgetting Curve: A Reliable Method for Evaluating Memorization Capability for Long-context Models |
| ProtoNAM:用于可解释深度表格学习的原型神经加性模型 |
Guangzhi Xiong |
PDF |
N/A |
ProtoNAM: Prototypical Neural Additive Models for Interpretable Deep Tabular Learning |
| 深度神经网络中的标签对齐策略 |
Xuanrui Zeng |
PDF |
N/A |
A Strategy for Label Alignment in Deep Neural Networks |
| ACDC:利用扩散校正实现自回归一致的多模态生成 |
Hyungjin Chung |
PDF |
N/A |
ACDC: Autoregressive Coherent Multimodal Generation using Diffusion Correction |
| $\textbf{Only-IF}$:揭示指令多样性对泛化能力的决定性影响 |
Dylan Zhang |
PDF |
N/A |
$\textbf{Only-IF}$: Revealing the Decisive Effect of Instruction Diversity on Generalization |
| H-SIREN:通过双曲周期函数改进隐式神经表示 |
Rui Gao |
PDF |
N/A |
H-SIREN: Improving implicit neural representations with hyperbolic periodic functions |
| 基于规则的数据选择用于大型语言模型 |
Xiaomin Li |
PDF |
N/A |
Rule-based Data Selection for Large Language Models |
| 预测编码网络的紧致稳定性、收敛性和鲁棒性界限 |
Ankur Mali |
PDF |
N/A |
Tight Stability, Convergence, and Robustness Bounds for Predictive Coding Networks |
| 学习该花多大力气思考:输入自适应的LM计算分配 |
Mehul Damani |
PDF |
N/A |
Learning How Hard to Think: Input-Adaptive Allocation of LM Computation |