arXiv 2024-11-20 Papers

Title (Chinese)	Author	PDF Link	Code Repository	Title
AI生成图像检测：被动式还是水印？	Moyang Guo	PDF	N/A	AI-generated Image Detection: Passive or Watermark?
REDUCIO! 使用极度压缩的运动潜在表示在16秒内生成1024$\times$1024视频	Rui Tian	PDF	N/A	REDUCIO! Generating 1024$\times$1024 Video within 16 Seconds using Extremely Compressed Motion Latents
在3D中查找任意零件	Ziqi Ma	PDF	N/A	Find Any Part in 3D
从无姿态的网络照片生成一致的3D视频	Gene Chou	PDF	N/A	Generating 3D-Consistent Videos from Unposed Internet Photos
HF-Diff: 基于一步扩散的高频感知损失与分布匹配图像超分辨率	Shoaib Meraj Sami	PDF	N/A	HF-Diff: High-Frequency Perceptual Loss and Distribution Matching for One-Step Diffusion-Based Image Super-Resolution
SpecTool：一个用于表征工具使用型大语言模型错误的基准	Shirley Kokane	PDF	N/A	SpecTool: A Benchmark for Characterizing Errors in Tool-Use LLMs
在垄断企业解散过程中促进用户数据自主权	Rushabh Solanki	PDF	N/A	Promoting User Data Autonomy During the Dissolution of a Monopolistic Firm
极限稀疏化：实现极端剪枝的技巧包	Andy Li	PDF	N/A	Pushing the Limits of Sparsity: A Bag of Tricks for Extreme Pruning
DIS-Mine：地下矿井中弱光条件下的灾害感知实例分割	Mizanur Rahman Jewel	PDF	N/A	DIS-Mine: Instance Segmentation for Disaster-Awareness in Poor-Light Condition in Underground Mines
BALROG：在游戏中对代理型大型语言模型和视觉语言模型进行基准测试和推理	Davide Paglieri	PDF	N/A	BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
未知情境与环境下的元认知能力（MUSE）	Rodolfo Valiente	PDF	N/A	Metacognition for Unknown Situations and Environments (MUSE)
保持身份的3D头部风格化与多视角评分蒸馏	Bahri Batuhan Bilecen	PDF	N/A	Identity Preserving 3D Head Stylization with Multiview Score Distillation
宫颈鳞状上皮细胞分类的机器学习与深度学习模型比较分析	Subhasish Das	PDF	N/A	Comparative Analysis of Machine Learning and Deep Learning Models for Classifying Squamous Epithelial Cells of the Cervix
预测LGBTQ+少数群体压力的洞察：对社交媒体话语的传导性探索	S. Chapagain	PDF	N/A	Predictive Insights into LGBTQ+ Minority Stress: A Transductive Exploration of Social Media Discourse
弱监督细胞核检测的熵引导	James Willoughby	PDF	N/A	Entropy Bootstrapping for Weakly Supervised Nuclei Detection
几何代数平面：凸隐式神经体积	Irmak Sivgin	PDF	N/A	Geometric Algebra Planes: Convex Implicit Neural Volumes
高能物理中面向视觉Transformer的量子注意力	Alessandro Tesi	PDF	N/A	Quantum Attention for Vision Transformers in High Energy Physics
使用Sporo AraSum推进阿拉伯语复杂医学交流：超越现有大型语言模型	Chanseo Lee	PDF	N/A	Advancing Complex Medical Communication in Arabic with Sporo AraSum: Surpassing Existing Large Language Models
通过近似最优的次模优化进行采购拍卖	Yuan Deng	PDF	N/A	Procurement Auctions via Approximately Optimal Submodular Optimization
在大语言模型中解开记忆与推理能力	Mingyu Jin	PDF	N/A	Disentangling Memory and Reasoning Ability in Large Language Models
VBench++：面向视频生成模型的综合多功能基准测试套件	Ziqi Huang	PDF	N/A	VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models
通过分布信息引导的图神经网络（DI-GNN）推进热浪预报：将极值理论与GNN相结合	Farrukh A. Chishtie	PDF	N/A	Advancing Heatwave Forecasting via Distribution Informed-Graph Neural Networks (DI-GNNs): Integrating Extreme Value Theory with GNNs
利用卷积导数运算进行阿尔茨海默病和痴呆症检测的高效脑成像分析	Yasmine Mustafa	PDF	N/A	Efficient Brain Imaging Analysis for Alzheimer's and Dementia Detection Using Convolution-Derivative Operations
利用大型语言模型合成产品吸引力数据集	John D. Hastings	PDF	N/A	Utilizing Large Language Models to Synthesize Product Desirability Datasets
分层数据的共形预测	Guillaume Principato	PDF	N/A	Conformal Prediction for Hierarchical Data
PatentEdits：将专利新颖性构建为文本蕴含	Ryan Lee	PDF	N/A	PatentEdits: Framing Patent Novelty as Textual Entailment
当精度遇上位置：BFloat16在长上下文训练中打破RoPE	Haonan Wang	PDF	N/A	When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
通过算法扩散对对数凹函数的采样与积分	Yunbum Kook	PDF	N/A	Sampling and Integration of Logconcave Functions by Algorithmic Diffusion
SoK：复合人工智能威胁与对策的系统视角	Sarbartha Banerjee	PDF	N/A	SoK: A Systems Perspective on Compound AI Threats and Countermeasures
LIMBA：一个开源框架，利用生成模型保护和提升低资源语言的价值	Salvatore Mario Carta	PDF	N/A	LIMBA: An Open-Source Framework for the Preservation and Valorization of Low-Resource Languages using Generative Models
AdaptAgent：通过从人类演示中进行少样本学习，适应多模态网络代理	Gaurav Verma	PDF	N/A	AdaptAgent: Adapting Multimodal Web Agents with Few-Shot Learning from Human Demonstrations
使用课程学习的鲁棒单目视觉里程计	Assaf Lahiany	PDF	N/A	Robust Monocular Visual Odometry using Curriculum Learning
SynEHRgy：使用仅解码器Transformer合成混合类型结构化电子健康记录	Hojjat Karami	PDF	N/A	SynEHRgy: Synthesizing Mixed-Type Structured Electronic Health Records using Decoder-Only Transformers
WaterPark：语言模型水印鲁棒性评估	Jiacheng Liang	PDF	N/A	WaterPark: A Robustness Assessment of Language Model Watermarking
CAFE：一个面向阿尔及利亚方言、法语和英语的新型语码转换数据集	Houssam Eddine-Othman Lachemat	PDF	N/A	CAFE A Novel Code switching Dataset for Algerian Dialect French and English
启发式自适应扩散模型进化策略	Benedikt Hartl	PDF	N/A	Heuristically Adaptive Diffusion-Model Evolutionary Strategy
关于在复杂环境中增强强化学习的综述：来自人类和LLM反馈的洞察	Alireza Rashidi Laleh	PDF	N/A	A Survey On Enhancing Reinforcement Learning in Complex Environments: Insights from Human and LLM Feedback
巴尔蒂语与跨境姊妹方言在大型语言模型和人工智能技术本质上的统一	Muhammad Sharif	PDF	N/A	Unification of Balti and trans-border sister dialects in the essence of LLMs and AI Technology
基于Transformer的上下文语言模型与神经网络联合用于越南语自然语言推理	Dat Van-Thanh Nguyen	PDF	N/A	Transformer-Based Contextualized Language Models Joint with Neural Networks for Natural Language Inference in Vietnamese
通往大语言模型个性化之路：学习记忆用户对话	Lucie Charlotte Magister	PDF	N/A	On the Way to LLM Personalization: Learning to Remember User Conversations
带有机器学习的可执行二维码在工业应用中	Stefano Scanzio	PDF	N/A	Executable QR codes with Machine Learning for Industrial Applications
基于能量的单克隆抗体生成模型	Paul Pereira	PDF	N/A	Energy-based generative models for monoclonal antibodies
对抗扩散压缩用于真实世界图像超分辨率	Bin Chen	PDF	N/A	Adversarial Diffusion Compression for Real-World Image Super-Resolution
Quantum-Brain：量子启发的神经网络方法用于视觉-大脑理解	Hoang-Quan Nguyen	PDF	N/A	Quantum-Brain: Quantum-Inspired Neural Network Approach to Vision-Brain Understanding
ODTE——基于多类SVM的斜决策树集成	Ricardo Montañana	PDF	N/A	ODTE -- An ensemble of multi-class SVM-based oblique decision trees
预测冷锻过程中壁厚变化：一种综合有限元法与神经网络的方法	Sasa Ilic	PDF	N/A	Predicting Wall Thickness Changes in Cold Forging Processes: An Integrated FEM and Neural Network approach
可解释有限记忆策略用于部分可观测马尔可夫决策过程	Muqsit Azeem	PDF	N/A	Explainable Finite-Memory Policies for Partially Observable Markov Decision Processes
RTSR：一种针对AV1压缩内容的实时超分辨率模型	Yuxuan Jiang	PDF	N/A	RTSR: A Real-Time Super-Resolution Model for AV1 Compressed Content
垂直验证：在稀疏支持区域上评估隐式生成模型以生成图	Mai Elkady	PDF	N/A	Vertical Validation: Evaluating Implicit Generative Models for Graphs on Thin Support Regions
基于学习的吉兹文字手写识别	Hailemicael Lulseged Yimer	PDF	N/A	Learning based Ge'ez character handwritten recognition
事实级置信度校准与自我修正	Yige Yuan	PDF	N/A	Fact-Level Confidence Calibration and Self-Correction
WHALES：一种用于增强自动驾驶中多智能体协作的多智能体调度数据集	Siwei Chen	PDF	N/A	WHALES: A Multi-agent Scheduling Dataset for Enhanced Cooperation in Autonomous Driving
验证机器遗忘与可解释人工智能	Àlex Pujol Vidal	PDF	N/A	Verifying Machine Unlearning with Explainable AI
一个用于微阵列数据分类的进化神经网络框架	Maryam Eshraghi Evari	PDF	N/A	An Evolutional Neural Network Framework for Classification of Microarray Data
大型语言模型是否在记忆缺陷（bug）基准？	Daniel Ramos	PDF	N/A	Are Large Language Models Memorizing Bug Benchmarks?
在线广告检索的规模法则	Yunli Wang	PDF	N/A	Scaling Laws for Online Advertisement Retrieval
教会视觉语言模型（VLMs）从上下文示例中定位特定对象	Sivan Doveh	PDF	N/A	Teaching VLMs to Localize Specific Objects from In-context Examples
一种利用相机和原始雷达数据进行鸟瞰图目标检测的资源高效融合网络	Kavin Chandrasekaran	PDF	N/A	A Resource Efficient Fusion Network for Object Detection in Bird's-Eye View using Camera and Raw Radar Data
理由能否助力提升行人意图预测？一种跨模态方法	Vaishnavi Khindkar	PDF	N/A	Can Reasons Help Improve Pedestrian Intent Estimation? A Cross-Modal Approach
DATAP-SfM：在野外实现鲁棒的从运动中恢复结构，通过动态感知跟踪任意点	Weicai Ye	PDF	N/A	DATAP-SfM: Dynamic-Aware Tracking Any Point for Robust Structure from Motion in the Wild
基于类型感知的异构图和双重图消息传递的无偏场景图生成	Guanglu Sun	PDF	N/A	Unbiased Scene Graph Generation by Type-Aware Message Passing on Heterogeneous and Dual Graphs
DATTA：基于跨域WiFi的人类活动识别的领域对抗测试时适应	Julian Strohmayer	PDF	N/A	DATTA: Domain-Adversarial Test-Time Adaptation for Cross-Domain WiFi-Based Human Activity Recognition
将自回归和自编码语言模型结合用于文本分类	João Gonçalves	PDF	N/A	Combining Autoregressive and Autoencoder Language Models for Text Classification
VideoAutoArena：一个通过用户模拟评估大型多模态模型在视频分析中的自动化竞技场	Ziyang Luo	PDF	N/A	VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation
解锁基于结构的分子优化中的梯度引导力量	Keyue Qiu	PDF	N/A	Unlocking the Power of Gradient Guidance for Structure-Based Molecule Optimization
前向-后向插拔算法去噪器的分析与综合	Matthieu Kowalski	PDF	N/A	Analysis and Synthesis Denoisers for Forward-Backward Plug-and-Play Algorithms
面向规范驱动的基于大语言模型生成嵌入式汽车软件	Minal Suresh Patil	PDF	N/A	Towards Specification-Driven LLM-Based Generation of Embedded Automotive Software
用于格兰杰因果关系的稀疏注意力Transformer	Riya Mahesh	PDF	N/A	Transformers with Sparse Attention for Granger Causality
FASTNav：针对多点机器人导航训练的微调自适应小语言模型	Yuxuan Chen	PDF	N/A	FASTNav: Fine-tuned Adaptive Small-language-models Trained for Multi-point Robot Navigation
更注重局部对比：通过先验知识提升红外小目标检测性能	Peichao Wang	PDF	N/A	Paying more attention to local contrast: improving infrared small target detection performance via prior knowledge
BelHouse3D: 一个用于评估3D点云语义分割中遮挡鲁棒性的基准数据集	Umamaheswaran Raman Kumar	PDF	N/A	BelHouse3D: A Benchmark Dataset for Assessing Occlusion Robustness in 3D Point Cloud Semantic Segmentation
关于无单位距离的平面周期集密度下界	Alexander Tolmachev	PDF	N/A	On lower bounds of the density of planar periodic sets without unit distances
利用先前经验：一个可扩展的文本到SQL辅助知识库	Zhibo Chu	PDF	N/A	Leveraging Prior Experience: An Expandable Auxiliary Knowledge Base for Text-to-SQL
XMask3D: 开放词汇3D语义分割的跨模态掩码推理	Ziyi Wang	PDF	N/A	XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation
为新兴AI工作负载重塑混合云	Deming Chen	PDF	N/A	Transforming the Hybrid Cloud for Emerging AI Workloads
BIPro：通过块逆提示约束生成框架实现零样本中文诗歌生成	Xu Zou	PDF	N/A	BIPro: Zero-shot Chinese Poem Generation via Block Inverse Prompting Constrained Generation Framework
AIDBench：一个用于评估大型语言模型作者归属能力的基准	Zichen Wen	PDF	N/A	AIDBench: A benchmark for evaluating the authorship identification capability of large language models
基于量子核的长短期记忆	Yu-Chao Hsu	PDF	N/A	Quantum Kernel-Based Long Short-term Memory
与大型语言模型进行存在主义对话：内容、社区与文化	Murray Shanahan	PDF	N/A	Existential Conversations with Large Language Models: Content, Community, and Culture
第六届自主系统形式方法国际研讨会论文集	Matt Luckcuck	PDF	N/A	Proceedings Sixth International Workshop on Formal Methods for Autonomous Systems
ViSTa数据集：视觉语言模型是否理解顺序任务？	Evžen Wybitul	PDF	N/A	ViSTa Dataset: Do vision-language models understand sequential tasks?
实时说话人像合成的音频特征提取比较分析	Pegah Salehi	PDF	N/A	Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis
大型语言模型的信息安全意识	Ofir Cohen	PDF	N/A	The Information Security Awareness of Large Language Models
机器人物体抓取与操控的综合方法	Owais Ahmed	PDF	N/A	An Integrated Approach to Robotic Object Grasping and Manipulation
用于胸部CT分割中多尺度特征学习的强度-空间双重掩码自编码器	Yuexing Ding	PDF	N/A	Intensity-Spatial Dual Masked Autoencoder for Multi-Scale Feature Learning in Chest CT Segmentation
OpenMS WebApps：构建用户友好的质谱分析解决方案	Tom David Müller	PDF	N/A	OpenMS WebApps: Building User-Friendly Solutions for MS Analysis
基于大型语言模型的参与驱动内容生成	Erica Coppolillo	PDF	N/A	Engagement-Driven Content Generation with Large Language Models
VADet：使用可变聚合的多帧激光雷达3D物体检测	Chengjie Huang	PDF	N/A	VADet: Multi-frame LiDAR 3D Object Detection using Variable Aggregation
点击；单目标跟踪；视频目标分割；实时互动	Kuiran Wang	PDF	N/A	Click; Single Object Tracking; Video Object Segmentation; Real-time Interaction
跨摄像头分心驾驶分类通过特征解耦与对比学习	Simone Bianco	PDF	N/A	Cross-Camera Distracted Driver Classification through Feature Disentanglement and Contrastive Learning
SONNET：通过利用模拟音频增强时间延迟估计	Erik Tegler	PDF	N/A	SONNET: Enhancing Time Delay Estimation by Leveraging Simulated Audio
写作风格的重要性：信息检索系统中的偏见与公平性考察	Hongliu Cao	PDF	N/A	Writing Style Matters: An Examination of Bias and Fairness in Information Retrieval Systems
有限权重平均的统一分析	Peng Wang	PDF	N/A	A Unified Analysis for Finite Weight Averaging
使用ALIGN解锁历史临床试验数据：一种用于医学编码的组合式大型语言模型系统	Nabeel Seedat	PDF	N/A	Unlocking Historical Clinical Trial Data with ALIGN: A Compositional Large Language Model System for Medical Coding
Hard-Synth：利用零样本TTS和LLM为ASR合成多样化硬样本	Jiawei Yu	PDF	N/A	Hard-Synth: Synthesizing Diverse Hard Samples for ASR using Zero-Shot TTS and LLM
深入研究高效推理方法：对推测性解码的综述	Hyun Ryu	PDF	N/A	Closer Look at Efficient Inference Methods: A Survey of Speculative Decoding
DMQR-RAG：RAG的多查询重写多样化	Zhicong Li	PDF	N/A	DMQR-RAG: Diverse Multi-Query Rewriting for RAG
独居老人六种异常行为的长期检测系统	Kai Tanaka	PDF	N/A	Long-term Detection System for Six Kinds of Abnormal Behavior of the Elderly Living Alone
AGLP：一种面向半监督领域自适应的图学习视角	Houcheng Su	PDF	N/A	AGLP: A Graph Learning Perspective for Semi-supervised Domain Adaptation
RAW-Diffusion：RGB引导的扩散模型用于高保真RAW图像生成	Christoph Reinders	PDF	N/A	RAW-Diffusion: RGB-Guided Diffusion Models for High-Fidelity RAW Image Generation
YCB-LUMA：用于目标定位的YCB物体数据集，采用亮度键控技术	Thomas Pöllabauer	PDF	N/A	YCB-LUMA: YCB Object Dataset with Luminance Keying for Object Localization
GraphCL：基于图的半监督医学图像分割聚类方法	Mengzhu Wang	PDF	N/A	GraphCL: Graph-based Clustering for Semi-Supervised Medical Image Segmentation
全局相关性感知硬负样本生成	Wenjie Peng	PDF	N/A	Globally Correlation-Aware Hard Negative Generation
CopyrightMeter：重新审视文本到图像模型中的版权保护	Naen Xu	PDF	N/A	CopyrightMeter: Revisiting Copyright Protection in Text-to-image Models
领域自适应展开图神经网络	Zepeng Zhang	PDF	N/A	Domain Adaptive Unfolded Graph Neural Networks
TAPT：视觉-语言模型中鲁棒推理的测试时对抗性提示调优	Xin Wang	PDF	N/A	TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models
将视觉基础模型适配用于遥感图像中稳健的云分割	Xuechao Zou	PDF	N/A	Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images
无标记组织在成像质谱中的虚拟染色	Yijie Zhang	PDF	N/A	Virtual Staining of Label-Free Tissue in Imaging Mass Spectrometry
计算稀疏自编码器中的最优推断和可证明的摊销差距	Charles O'Neill	PDF	N/A	Compute Optimal Inference and Provable Amortisation Gap in Sparse Autoencoders
针对连续强化学习的可证明高效动作操纵攻击	Zhi Luo	PDF	N/A	Provably Efficient Action-Manipulation Attack Against Continuous Reinforcement Learning
DriveMLLM：自动驾驶中多模态大语言模型空间理解基准	Xianda Guo	PDF	N/A	DriveMLLM: A Benchmark for Spatial Understanding with Multimodal Large Language Models in Autonomous Driving
展示神经形态、基于事件的动态视觉传感器在金属增材制造和焊接过程中监测的适用性	David Mascareñas	PDF	N/A	Demonstrating the Suitability of Neuromorphic, Event-Based, Dynamic Vision Sensors for In Process Monitoring of Metallic Additive Manufacturing and Welding
超像素成本体积激发用于立体匹配	Shanglong Liu	PDF	N/A	Superpixel Cost Volume Excitation for Stereo Matching
基于深度强化学习的优化：在支持C-V2X的物联网中实现AoI与能耗的平衡	Zheng Zhang	PDF	N/A	DRL-Based Optimization for AoI and Energy Consumption in C-V2X Enabled IoV
歌曲形式感知的整首歌曲文本到歌词生成与多层次粒度音节计数控制	Yunkee Chae	PDF	N/A	Song Form-aware Full-Song Text-to-Lyrics Generation with Multi-Level Granularity Syllable Count Control
使用可扩展图卷积网络进行增量标签分布学习	Ziqi Jia	PDF	N/A	Incremental Label Distribution Learning with Scalable Graph Convolutional Networks
Video-RAG：视觉对齐的检索增强型长视频理解	Yongdong Luo	PDF	N/A	Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension
ESARM: 通过自动排序演示的奖励模型实现的三维情感语音到动画转换	Xulong Zhang	PDF	N/A	ESARM: 3D Emotional Speech-to-Animation via Reward Model from Automatically-Ranked Demonstrations
全预测单指标模型与多指标模型	Lunjia Hu	PDF	N/A	Omnipredicting Single-Index Models with Multi-Index Models
耐心是大型语言模型推理的关键	Yijiong Yu	PDF	N/A	Patience Is The Key to Large Language Model Reasoning
实用的紧凑型深度压缩感知	Bin Chen	PDF	N/A	Practical Compact Deep Compressed Sensing
神经内模控制：通过预测误差反馈学习鲁棒控制策略	Feng Gao	PDF	N/A	Neural Internal Model Control: Learning a Robust Control Policy via Predictive Error Feedback
提示词的提示：增强多模态大语言模型在自动驾驶中的视觉表示	Hao Zhou	PDF	N/A	Hints of Prompt: Enhancing Visual Representation for Multimodal LLMs in Autonomous Driving
通过对齐嵌入空间集成来提升预训练编码器的OOD泛化能力	Shuman Peng	PDF	N/A	Improving OOD Generalization of Pre-trained Encoders via Aligned Embedding-Space Ensembles
AMaze：一个直观的基准生成器，用于快速原型化可泛化的代理	Kevin Godin-Dubois	PDF	N/A	AMaze: An intuitive benchmark generator for fast prototyping of generalizable agents
基于相似四面体的单树点云自动无标记配准	Jing Ren	PDF	N/A	Automatic marker-free registration based on similar tetrahedras for single-tree point clouds
向着无偏见和鲁棒的时空场景图生成与预测	Rohith Peddi	PDF	N/A	Towards Unbiased and Robust Spatio-Temporal Scene Graph Generation and Anticipation
分支，集合！淘宝大规模点击率预测的多分支合作网络	Xu Chen	PDF	N/A	Branches, Assemble! Multi-Branch Cooperation Network for Large-Scale Click-Through Rate Prediction at Taobao
高效掩码自动编码器用于视频对象计数及大规模基准测试	Bing Cao	PDF	N/A	Efficient Masked AutoEncoder for Video Object Counting and A Large-Scale Benchmark
硬件扩展趋势与大规模分布式训练中的收益递减	Jared Fernandez	PDF	N/A	Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training
MEGL：多模态解释引导学习	Yifei Zhang	PDF	N/A	MEGL: Multimodal Explanation-Guided Learning
基于设备的内容推荐与单次嵌入剪枝：一种合作博弈视角	Hung Vinh Tran	PDF	N/A	On-device Content-based Recommendation with Single-shot Embedding Pruning: A Cooperative Game Perspective
边界框水印：针对目标检测器模型提取攻击的防御	Satoru Koda	PDF	N/A	Bounding-box Watermarking: Defense against Model Extraction Attacks on Object Detectors
可解释的大型语言模型驱动的多维度蒸馏在电子商务相关性学习中的应用	Gang Zhao	PDF	N/A	Explainable LLM-driven Multi-dimensional Distillation for E-Commerce Relevance Learning
细心的上下文注意力用于云去除	Wenli Huang	PDF	N/A	Attentive Contextual Attention for Cloud Removal
RobustFormer：图像和视频的噪声鲁棒预训练	Ashish Bastola	PDF	N/A	RobustFormer: Noise-Robust Pre-training for images and videos
通过交替优化实现多模态图像对的无监督单应性估计	Sanghyeob Song	PDF	N/A	Unsupervised Homography Estimation on Multimodal Image Pair via Alternating Optimization
基于大规模多模态驱动的语义图像-文本编码用于超低比特率学习型图像压缩	Shimon Murai	PDF	N/A	LMM-driven Semantic Image-Text Coding for Ultra Low-bitrate Learned Image Compression
“80%是我，20%是AI”：在大型语言模型协作写作中追求真实性	Angel Hsing-Chi Hwang	PDF	N/A	"It was 80% me, 20% AI": Seeking Authenticity in Co-Writing with Large Language Models
概率近似的精确率与召回率学习	Lee Cohen	PDF	N/A	Probably Approximately Precision and Recall Learning
一种用于图变换器在转导学习中压缩性的理论	Hamed Shirzad	PDF	N/A	A Theory for Compressibility of Graph Transformers for Transductive Learning
X 作为监督：在无监督单目三维姿态估计中应对深度模糊性	Yuchen Yang	PDF	N/A	X as Supervision: Contending with Depth Ambiguity in Unsupervised Monocular 3D Pose Estimation
ORID：器官-区域信息驱动的放射报告生成框架	Tiancheng Gu	PDF	N/A	ORID: Organ-Regional Information Driven Framework for Radiology Report Generation
基于先验的目标推理挖掘面部表情识别的潜在不确定性	Hanwei Liu	PDF	N/A	Prior-based Objective Inference Mining Potential Uncertainty for Facial Expression Recognition
训练无原始数据访问的物理驱动深度学习重建以实现公平快速磁共振成像	Yaşar Utku Alçalar	PDF	N/A	Training Physics-Driven Deep Learning Reconstruction without Raw Data Access for Equitable Fast MRI
Chanel-Orderer：一种用于三通道自然图像的通道排序预测器	Shen Li	PDF	N/A	Chanel-Orderer: A Channel-Ordering Predictor for Tri-Channel Natural Images
开放世界非模态外观补全	Jiayang Ao	PDF	N/A	Open-World Amodal Appearance Completion
打破反复失败的循环：将生成式人工智能应用于传统银行系统的根本原因分析	Siyuan Jin	PDF	N/A	Breaking the Cycle of Recurring Failures: Applying Generative AI to Root Cause Analysis in Legacy Banking Systems
可扩展的属性图上的深度度量学习	Xiang Li	PDF	N/A	Scalable Deep Metric Learning on Attributed Graphs
通过积分推导激活函数	Allen Hao Huang	PDF	N/A	Deriving Activation Functions via Integration
LLMSteer: 通过引导注意力在重复使用的上下文上改进长上下文LLM推理	Zhuohan Gu	PDF	N/A	LLMSteer: Improving Long-Context LLM Inference by Steering Attention on Reused Contexts
评估大型语言模型在理解社会动态方面的能力	Anique Tahir	PDF	N/A	Evaluating LLMs Capabilities Towards Understanding Social Dynamics
利用人工智能和语音界面自动化超声科医生的超声命令	Emad Mohamed	PDF	N/A	Automating Sonologists USG Commands with AI and Voice Interface
DT-LSD：基于可变形Transformer的线段检测	Sebastian Janampa	PDF	N/A	DT-LSD: Deformable Transformer-based Line Segment Detection
MERLOT：一种基于蒸馏LLM的可扩展加密流量分类混合专家框架	Yuxuan Chen	PDF	N/A	MERLOT: A Distilled LLM-based Mixture-of-Experts Framework for Scalable Encrypted Traffic Classification
协作特征-对数对比学习用于开放集半监督目标检测	Xinhao Zhong	PDF	N/A	Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection
NCAirFL：基于非相干检测的无信道状态信息空中联邦学习	Haifeng Wen	PDF	N/A	NCAirFL: CSI-Free Over-the-Air Federated Learning Based on Non-Coherent Detection
消除基于梯度的模拟参数估计中的比率偏差	Zehao Li	PDF	N/A	Eliminating Ratio Bias for Gradient-based Simulated Parameter Estimation
MemoryFormer：通过移除全连接层来最小化Transformer计算	Ning Ding	PDF	N/A	MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers
BetterBench：评估AI基准测试，揭示问题，并建立最佳实践	Anka Reuel	PDF	N/A	BetterBench: Assessing AI Benchmarks, Uncovering Issues, and Establishing Best Practices
在目标语言中使用数据约束训练双语语言模型	Skyler Seto	PDF	N/A	Training Bilingual LMs with Data Constraints in the Targeted Language
GazeGaussian：使用3D高斯溅射实现高保真视线重定向	Xiaobao Wei	PDF	N/A	GazeGaussian: High-Fidelity Gaze Redirection with 3D Gaussian Splatting
LaVida Drive：基于Token选择、恢复和增强的视觉-文本交互VLM，用于自动驾驶	Siwen Jiao	PDF	N/A	LaVida Drive: Vision-Text Interaction VLM for Autonomous Driving with Token Selection, Recovery and Enhancement
MindForge：赋能具身智能体，通过心智理论实现终身协作学习	Mircea Lică	PDF	N/A	MindForge: Empowering Embodied Agents with Theory of Mind for Lifelong Collaborative Learning
自适应过程引导学习：在预测湖泊溶解氧浓度中的应用	Runlong Yu	PDF	N/A	Adaptive Process-Guided Learning: An Application in Predicting Lake DO Concentrations
统一城市时空流预测的基础模型	Yuan Yuan	PDF	N/A	A Foundation Model for Unified Urban Spatio-Temporal Flow Prediction
POMCP缩减：实时无人机搜救框架	Yunuo Zhang	PDF	N/A	Shrinking POMCP: A Framework for Real-Time UAV Search and Rescue
关于双边最近邻的自适应性和极小极大最优性	Tathagata Sadhukhan	PDF	N/A	On adaptivity and minimax optimality of two-sided nearest neighbors
电动汽车实时能耗最优路径规划	Saman Ahmadi	PDF	N/A	Real-Time Energy-Optimal Path Planning for Electric Vehicles
视频大语言模型在时间理解中的一致性	Minjoon Jung	PDF	N/A	On the Consistency of Video Large Language Models in Temporal Comprehension
KAAE：通过知识感知属性学习实现知识图谱的数值推理	Ming Yin	PDF	N/A	KAAE: Numerical Reasoning for Knowledge Graphs via Knowledge-aware Attributes Learning
从稀疏观测中机器学习海啸动力学重建	Edward McDugald	PDF	N/A	Machine learned reconstruction of tsunami dynamics from sparse observations
一种应用于离题提示检测的灵活大型语言模型防护开发方法论	Gabriel Chua	PDF	N/A	A Flexible Large Language Models Guardrail Development Methodology Applied to Off-Topic Prompt Detection
增强热成像多目标跟踪：一种利用热成像身份和运动相似性的新型目标关联方法	Wassim El Ahmar	PDF	N/A	Enhancing Thermal MOT: A Novel Box Association Method Leveraging Thermal Identity and Motion Similarity
关于Koopman算子逼近与神经常微分方程在数据驱动时间演化预测中的关系	Jake Buzhardt	PDF	N/A	On the relationship between Koopman operator approximations and neural ordinary differential equations for data-driven time-evolution predictions
通过混合非线性动力学稀疏识别改进锂离子电池的低保真模型	Samuel Filgueira da Silva	PDF	N/A	Improving Low-Fidelity Models of Li-ion Batteries via Hybrid Sparse Identification of Nonlinear Dynamics