| Molmo 和 PixMo:为最先进的跨模态模型提供开放权重和开放数据 |
Matt Deitke |
PDF |
N/A |
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models |
| DreamWaltz-G:从骨骼引导的2D扩散中生成富有表现力的3D高斯头像 |
Yukun Huang |
PDF |
N/A |
DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion |
| 差分隐私正则化:通过损失函数正则化保护训练数据 |
Francisco Aguilera-Martínez |
PDF |
N/A |
Differential Privacy Regularization: Protecting Training Data Through Loss Function Regularization |
| 图像上注意力提示用于大型视觉-语言模型 |
Runpeng Yu |
PDF |
N/A |
Attention Prompting on Image for Large Vision-Language Models |
| FineZip:推动大型语言模型在实际无损文本压缩中的极限 |
Fazal Mittu |
PDF |
N/A |
FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression |
| 将每个应用程序转变为智能代理:基于API优先的大型语言模型代理实现高效的人机交互 |
Junting Lu |
PDF |
N/A |
Turn Every Application into an Agent: Towards Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents |
| 动态学习:基于动态无人机团队的无人机通信网络自主调节 |
Ran Zhang |
PDF |
N/A |
Learning with Dynamics: Autonomous Regulation of UAV Based Communication Networks with Dynamic UAV Crew |
| 具有一般状态和动作的有限时域马尔可夫决策过程(MDP)的策略优化景观 |
Xin Chen |
PDF |
N/A |
Landscape of Policy Optimization for Finite Horizon MDPs with General State and Action |
| PACE:将参数高效微调中的泛化与一致性正则化相结合 |
Yao Ni |
PDF |
N/A |
PACE: marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization |
| 流式神经图像 |
Marcos V. Conde |
PDF |
N/A |
Streaming Neural Images |
| 评估孟加拉社交媒体评论中对不同群体的毒性水平:一项全面调查 |
Mukaffi Bin Moin |
PDF |
N/A |
Assessing the Level of Toxicity Against Distinct Groups in Bangla Social Media Comments: A Comprehensive Investigation |
| Blox-Net:利用VLM监督、物理模拟和具备重置功能的机器人进行机器人组装的生成式设计 |
Andrew Goldberg |
PDF |
N/A |
Blox-Net: Generative Design-for-Robot-Assembly Using VLM Supervision, Physics Simulation, and a Robot with Reset |
| 航天器碰撞规避的自主决策轨道服务 |
Susmitha Patnala |
PDF |
N/A |
On-orbit Servicing for Spacecraft Collision Avoidance With Autonomous Decision Making |
| 使用深度学习技术对前列腺癌病理图像进行Gleason分级分类:YOLO、视觉变换器和视觉Mamba |
Amin Malekmohammadi |
PDF |
N/A |
Classification of Gleason Grading in Prostate Cancer Histopathology Images Using Deep Learning Techniques: YOLO, Vision Transformers, and Vision Mamba |
| 深度学习与机器学习:推动大数据分析与管理的前沿技术:实用入门指南 |
Benji Peng |
PDF |
N/A |
Deep Learning and Machine Learning, Advancing Big Data Analytics and Management: Handy Appetizer |
| 用于现场疾病检测的小数据深度学习方法 |
David Herrera-Poyato |
PDF |
N/A |
Small data deep learning methodology for in-field disease detection |
| 为每个样本编程:像专家一样大规模提升预训练数据质量 |
Fan Zhou |
PDF |
N/A |
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale |
| 描述大型语言模型残差流中的稳定区域 |
Jett Janiak |
PDF |
N/A |
Characterizing stable regions in the residual stream of LLMs |
| MorphoSeg:一种用于复杂细胞形态学生物医学分割的不确定性感知深度学习方法 |
Tianhao Zhang |
PDF |
N/A |
MorphoSeg: An Uncertainty-Aware Deep Learning Method for Biomedical Segmentation of Complex Cellular Morphologies |
| 揭示多模态基础模型中的本体承诺 |
Mert Keser |
PDF |
N/A |
Unveiling Ontological Commitment in Multi-Modal Foundation Models |
| 具有不连续随机梯度的随机梯度哈密顿蒙特卡罗算法的非渐近收敛性分析,及其在ReLU神经网络训练中的应用 |
Luxu Liang |
PDF |
N/A |
Non-asymptotic convergence analysis of the stochastic gradient Hamiltonian Monte Carlo algorithm with discontinuous stochastic gradient with applications to training of ReLU neural networks |
| Text2CAD:从初学者到专家级别的文本提示生成顺序CAD模型 |
Mohammad Sadil Khan |
PDF |
N/A |
Text2CAD: Generating Sequential CAD Models from Beginner-to-Expert Level Text Prompts |
| 基于通用检测的文本行识别 |
Raphael Baena |
PDF |
N/A |
General Detection-based Text Line Recognition |
| BitQ:为资源受限设备上的DNN效率提升量身定制块浮点精度 |
Yongqi Xu |
PDF |
N/A |
BitQ: Tailoring Block Floating Point Precision for Improved DNN Efficiency on Resource-Constrained Devices |
| 累加器感知的后训练量化 |
Ian Colbert |
PDF |
N/A |
Accumulator-Aware Post-Training Quantization |
| Ctrl-GenAug:面向医学序列分类的可控生成增强 |
Xinrui Zhou |
PDF |
N/A |
Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification |
| 通过快速近端梯度下降实现局部正则化的稀疏图 |
Dongfang Sun |
PDF |
N/A |
Locally Regularized Sparse Graph by Fast Proximal Gradient Descent |
| SEN12-WATER:一个新的水文应用数据集及其基准测试 |
Luigi Russo |
PDF |
N/A |
SEN12-WATER: A New Dataset for Hydrological Applications and its Benchmarking |
| 参数高效的贝叶斯神经网络用于不确定性感知的深度估计 |
Richard D. Paul |
PDF |
N/A |
Parameter-efficient Bayesian Neural Networks for Uncertainty-aware Depth Estimation |
| 视觉语言模型能否从模糊空间推理的视觉演示中学习? |
Bowen Zhao |
PDF |
N/A |
Can Vision Language Models Learn from Visual Demonstrations of Ambiguous Spatial Reasoning? |
| 利用Transformer实现高效特征交互:提升游戏用户消费倾向预测 |
Ved Prakash |
PDF |
N/A |
Efficient Feature Interactions with Transformers: Improving User Spending Propensity Predictions in Gaming |
| 通过粗粒度答案分解增强长文档理解中的事后归因 |
Pritika Ramu |
PDF |
N/A |
Enhancing Post-Hoc Attributions in Long Document Comprehension via Coarse Grained Answer Decomposition |
| 感知度量对音乐流派分类中音乐表示学习的影响 |
Tashi Namgyal |
PDF |
N/A |
The Effect of Perceptual Metrics on Music Representation Learning for Genre Classification |
| VPTQ:面向大型语言模型的极低比特向量后训练量化 |
Yifei Liu |
PDF |
N/A |
VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models |
| 在计算病理学中基准测试领域泛化算法 |
Neda Zamanitajeddin |
PDF |
N/A |
Benchmarking Domain Generalization Algorithms in Computational Pathology |
| 基于退化引导的单步图像超分辨率与扩散先验 |
Aiping Zhang |
PDF |
N/A |
Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors |
| DRIM:从不完整的多模态医疗数据中学习解耦表示 |
Lucas Robinet |
PDF |
N/A |
DRIM: Learning Disentangled Representations from Incomplete Multimodal Healthcare Data |
| 利用大型语言模型(LLM)对印度尼西亚ePuskesmas中医患互动进行实时转录和总结 |
Azmul Asmar Irfan |
PDF |
N/A |
Using LLM for Real-Time Transcription and Summarization of Doctor-Patient Interactions into ePuskesmas in Indonesia |
| ControlCity:一种基于多模态扩散模型的方法,用于精确的地理空间数据生成和城市形态分析 |
Fangshuo Zhou |
PDF |
N/A |
ControlCity: A Multimodal Diffusion Model Based Approach for Accurate Geospatial Data Generation and Urban Morphology Analysis |
| 使用图Koopman自编码器对抗多无人机监控的预测隐蔽通信 |
Sivaram Krishnan |
PDF |
N/A |
Predictive Covert Communication Against Multi-UAV Surveillance Using Graph Koopman Autoencoder |
| 检测问题中的时间模糊性 |
Bhawna Piryani |
PDF |
N/A |
Detecting Temporal Ambiguity in Questions |
| GeoBiked:一个包含几何特征和自动化标注技术的数据集,以支持工程设计中的深度生成模型 |
Phillip Mueller |
PDF |
N/A |
GeoBiked: A Dataset with Geometric Features and Automated Labeling Techniques to Enable Deep Generative Models in Engineering Design |
| 如何将语音基础模型与大型语言模型连接起来?哪些因素重要,哪些不重要? |
Francesco Verdini |
PDF |
N/A |
How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not |
| EventHDR:从事件到高速高动态范围视频及更进一步 |
Yunhao Zou |
PDF |
N/A |
EventHDR: from Event to High-Speed HDR Videos and Beyond |
| 大型语言模型中的反事实令牌生成 |
Ivi Chatzi |
PDF |
N/A |
Counterfactual Token Generation in Large Language Models |
| 基于高保真台式体模上的实时器械追踪,对内镜垂体手术进行自动化手术技能评估 |
Adrito Das |
PDF |
N/A |
Automated Surgical Skill Assessment in Endoscopic Pituitary Surgery using Real-time Instrument Tracking on a High-fidelity Bench-top Phantom |
| 增强型小波散射网络用于图像修复检测 |
Barglazan Adrian-Alin |
PDF |
N/A |
Enhanced Wavelet Scattering Network for image inpainting detection |
| CombU:一种用于神经网络拟合数学表达式的组合单元激活函数 |
Jiayu Li |
PDF |
N/A |
CombU: A Combined Unit Activation for Fitting Mathematical Expressions with Neural Networks |
| PTQ4RIS:用于指代图像分割的训练后量化 |
Xiaoyan Jiang |
PDF |
N/A |
PTQ4RIS: Post-Training Quantization for Referring Image Segmentation |
| CNN深度混合 |
Rinor Cakaj |
PDF |
N/A |
CNN Mixture-of-Depths |
| AI驱动的风险感知调度用于主动碎片移除任务 |
Antoine Poupon |
PDF |
N/A |
AI-Driven Risk-Aware Scheduling for Active Debris Removal Missions |
| LLM-CARD:迈向大型语言模型的描述与全景图 |
Shengwei Tian |
PDF |
N/A |
LLM-CARD: Towards a Description and Landscape of Large Language Models |
| 模型能够并且应该接纳人类生成数学的交流特性 |
Sasha Boguraev |
PDF |
N/A |
Models Can and Should Embrace the Communicative Nature of Human-Generated Math |
| 恶劣天气光流:累积同质-异质适应 |
Hanyu Zhou |
PDF |
N/A |
Adverse Weather Optical Flow: Cumulative Homogeneous-Heterogeneous Adaptation |
| WasteGAN:通过生成对抗网络实现机器人垃圾分类的数据增强 |
Alberto Bacchin |
PDF |
N/A |
WasteGAN: Data Augmentation for Robotic Waste Sorting through Generative Adversarial Networks |
| PitRSDNet:预测内镜下脑垂体手术中术中剩余手术时间 |
Anjana Wijekoon |
PDF |
N/A |
PitRSDNet: Predicting Intra-operative Remaining Surgery Duration in Endoscopic Pituitary Surgery |
| INT-FlashAttention:为INT8量化启用Flash Attention |
Shimao Chen |
PDF |
N/A |
INT-FlashAttention: Enabling Flash Attention for INT8 Quantization |
| 慢特征分析(Slow Feature Analysis)与后继表示(Successor Representation)之间的关系是什么? |
Eddie Seabrook |
PDF |
N/A |
What is the relationship between Slow Feature Analysis and the Successor Representation? |
| 单张图像,任意面孔:可泛化的3D面部生成 |
Wenqing Wang |
PDF |
N/A |
Single Image, Any Face: Generalisable 3D Face Generation |
| 利用多样性进行大型语言模型预训练中的重要数据选择 |
Chi Zhang |
PDF |
N/A |
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models |
| AXCEL:使用大型语言模型实现自动可解释一致性评估 |
P Aditya Sreekar |
PDF |
N/A |
AXCEL: Automated eXplainable Consistency Evaluation using LLMs |
| 面向用户的训练数据归属研究:以人为中心可解释人工智能 |
Elisa Nguyen |
PDF |
N/A |
Towards User-Focused Research in Training Data Attribution for Human-Centered Explainable AI |
| 解码大型语言模型:社会技术影响、限制及新兴问题的系统概述 |
Zeyneb N. Kaya |
PDF |
N/A |
Decoding Large-Language Models: A Systematic Overview of Socio-Technical Impacts, Constraints, and Emerging Questions |
| 自适应自监督学习策略用于动态设备上大型语言模型个性化 |
Rafael Mendoza |
PDF |
N/A |
Adaptive Self-Supervised Learning Strategies for Dynamic On-Device LLM Personalization |
| 将无线人工智能范式与真实环境连接:基于硬件在环的桥梁 |
Jeffrey Redondo |
PDF |
N/A |
Bridge to Real Environment with Hardware-in-the-loop for Wireless Artificial Intelligence Paradigms |
| 使用深度强化学习的多机器人信息路径规划,以实现高效的目标映射 |
Apoorva Vashisth |
PDF |
N/A |
Multi-Robot Informative Path Planning for Efficient Target Mapping using Deep Reinforcement Learning |
| ABCFair:一种用于比较公平性方法的可适应基准方法 |
MaryBeth Defrance |
PDF |
N/A |
ABCFair: an Adaptable Benchmark approach for Comparing Fairness Methods |
| 求解方程组的元启发式方法 |
Samson Odan |
PDF |
N/A |
Metaheuristic Method for Solving Systems of Equations |
| 知情深度层次分类:一种受非标准分析启发的方法 |
Lorenzo Fiaschi |
PDF |
N/A |
Informed deep hierarchical classification: a non-standard analysis inspired approach |
| 多语言语音识别中低资源语言的加权交叉熵 |
Andrés Piñeiro-Martín |
PDF |
N/A |
Weighted Cross-entropy for Low-Resource Languages in Multilingual Speech Recognition |
| 基于事件的任意时长识别的路径自适应时空状态空间模型 |
Jiazhou Zhou |
PDF |
N/A |
Path-adaptive Spatio-Temporal State Space Model for Event-based Recognition with Arbitrary Duration |
| 基于不确定性的自适应规划与扩散的动态障碍物规避 |
Vineet Punyamoorty |
PDF |
N/A |
Dynamic Obstacle Avoidance through Uncertainty-Based Adaptive Planning with Diffusion |
| DALDA:利用扩散模型和LLM进行自适应引导缩放的数据增强 |
Kyuheon Jung |
PDF |
N/A |
DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling |
| NTIRE 2024 立体图像超分辨率挑战赛:方法与结果 |
Longguang Wang |
PDF |
N/A |
NTIRE 2024 Challenge on Stereo Image Super-Resolution: Methods and Results |
| 设定人工智能议程——来自ChatGPT时代瑞典的实证 |
Bastiaan Bruinsma |
PDF |
N/A |
Setting the AI Agenda -- Evidence from Sweden in the ChatGPT Era |
| 具有精细骨干网络的面部伪造检测 |
Zonghui Guo |
PDF |
N/A |
Face Forgery Detection with Elaborate Backbone |
| Go-SLAM:基于高斯泼溅SLAM的接地式物体分割与定位 |
Phu Pham |
PDF |
N/A |
Go-SLAM: Grounded Object Segmentation and Localization with Gaussian Splatting SLAM |
| 一般重复-分歧图模型中的分歧不对称性和连通分量 |
Dario Borrelli |
PDF |
N/A |
Divergence asymmetry and connected components in a general duplication-divergence graph model |
| 超越U-Net:评估视觉Transformer在显微镜图像分析中的语义分割效果 |
Illia Tsiporenko |
PDF |
N/A |
Going Beyond U-Net: Assessing Vision Transformers for Semantic Segmentation in Microscopy Image Analysis |
| 使用多视图扩散模型在高斯泼溅中进行生成式对象插入 |
Hongliang Zhong |
PDF |
N/A |
Generative Object Insertion in Gaussian Splatting with a Multi-View Diffusion Model |
| 基于多视图伪标签的语音半监督认知状态分类 |
Yuanchao Li |
PDF |
N/A |
Semi-Supervised Cognitive State Classification from Speech with Multi-View Pseudo-Labeling |
| 研究OCR敏感神经元以提升历史文档中的实体识别 |
Emanuela Boros |
PDF |
N/A |
Investigating OCR-Sensitive Neurons to Improve Entity Recognition in Historical Documents |
| 量子-经典情感分析 |
Mario Bifulco |
PDF |
N/A |
Quantum-Classical Sentiment Analysis |
| Game4Loc:一个基于游戏数据的无人机地理定位基准 |
Yuxiang Ji |
PDF |
N/A |
Game4Loc: A UAV Geo-Localization Benchmark from Game Data |
| AI辅助的在线考试监考视线检测 |
Yong-Siang Shih |
PDF |
N/A |
AI-assisted Gaze Detection for Proctoring Online Exams |
| 通过不变映射分解等变映射:对称下通用逼近的应用 |
Akiyoshi Sannai |
PDF |
N/A |
Decomposition of Equivariant Maps via Invariant Maps: Application to Universal Approximation under Symmetry |
| Moner:欠采样径向MRI中的运动校正与无监督神经表示 |
Qing Wu |
PDF |
N/A |
Moner: Motion Correction in Undersampled Radial MRI with Unsupervised Neural Representation |
| 跨语言语音情感识别:人类与自监督模型 |
Zhichen Han |
PDF |
N/A |
Cross-lingual Speech Emotion Recognition: Humans vs. Self-Supervised Models |
| 使用词元内聚性对LLM生成文本进行零样本检测 |
Shixuan Ma |
PDF |
N/A |
Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness |
| 告诉我你不知道的:通过表示空间分析和编辑增强角色扮演代理的拒绝能力 |
Wenhao Liu |
PDF |
N/A |
Tell Me What You Don't Know: Enhancing Refusal Capabilities of Role-Playing Agents via Representation Space Analysis and Editing |
| 对多语言大型语言模型进行修剪以用于多语言推理 |
Hwichan Kim |
PDF |
N/A |
Pruning Multilingual Large Language Models for Multilingual Inference |
| 增强时间敏感性及推理能力以应对时间敏感型问答 |
Wanqi Yang |
PDF |
N/A |
Enhancing Temporal Sensitivity and Reasoning for Time-Sensitive Question Answering |
| 一种用于法线积分的自适应屏幕空间网格化方法 |
Moritz Heep |
PDF |
N/A |
An Adaptive Screen-Space Meshing Approach for Normal Integration |
| 判别性锚点学习用于高效的多视角聚类 |
Yalan Qin |
PDF |
N/A |
Discriminative Anchor Learning for Efficient Multi-view Clustering |
| 面向水下伪装目标追踪:SAM与SAM 2的实验评估 |
Chunhui Zhang |
PDF |
N/A |
Towards Underwater Camouflaged Object Tracking: An Experimental Evaluation of SAM and SAM 2 |
| LLMs中的具身与社会基础路线图 |
Sara Incao |
PDF |
N/A |
A Roadmap for Embodied and Social Grounding in LLMs |
| 在线对话中的机器人插话辅助:一项跨代研究 |
Sota Kobuki |
PDF |
N/A |
Robotic Backchanneling in Online Conversation Facilitation: A Cross-Generational Study |
| 心腔内超声心动图成像中的AI驱动视图引导系统 |
Jaeyoung Huh |
PDF |
N/A |
AI-driven View Guidance System in Intra-cardiac Echocardiography Imaging |
| HVT:非欧几里得空间中学习的综合视觉框架 |
Jacob Fein-Ashley |
PDF |
N/A |
HVT: A Comprehensive Vision Framework for Learning in Non-Euclidean Space |
| 从濒危到重生:人工智能时代下的哈拉米文本分类集成机器学习方法 |
Aram Khaksar |
PDF |
N/A |
Shifting from endangerment to rebirth in the Artificial Intelligence Age: An Ensemble Machine Learning Approach for Hawrami Text Classification |
| 重新审视太空任务规划:一种基于强化学习的多碎片会合方法 |
Agni Bandyopadhyay |
PDF |
N/A |
Revisiting Space Mission Planning: A Reinforcement Learning-Guided Approach for Multi-Debris Rendezvous |
| 利用人工智能研究代理自动化交通模型增强 |
Xusen Guo |
PDF |
N/A |
Automating Traffic Model Enhancement with AI Research Agent |
| 基于学习动态局部模型网络的前馈控制器及其在挖掘机辅助功能中的应用 |
Leon Greiser |
PDF |
N/A |
Feedforward Controllers from Learned Dynamic Local Model Networks with Application to Excavator Assistance Functions |
| 道德与可扩展的自动化:企业应用的治理与合规框架 |
Haocheng Lin |
PDF |
N/A |
Ethical and Scalable Automation: A Governance and Compliance Framework for Business Applications |
| 量化GAM形状图的视觉属性:对感知认知负荷和可解释性的影响 |
Sven Kruschel |
PDF |
N/A |
Quantifying Visual Properties of GAM Shape Plots: Impact on Perceived Cognitive Load and Interpretability |
| 使用大型语言模型进行启发式多目标进化 |
Shunyu Yao |
PDF |
N/A |
Multi-objective Evolution of Heuristic Using Large Language Model |
| 具有延迟反馈的风险规避学习 |
Siyi Wang |
PDF |
N/A |
Risk-averse learning with delayed feedback |
| 风格链接:理解深度学习模型中的学习特征 |
Maren H. Wehrheim |
PDF |
N/A |
Linking in Style: Understanding learned features in deep learning models |
| 面向从单视角肖像中统一的三维头发重建 |
Yujian Zheng |
PDF |
N/A |
Towards Unified 3D Hair Reconstruction from Single-View Portraits |
| (普罗克鲁斯特)对齐在评估多人人体姿态和形状估计中的局限性 |
Drazic Martin |
PDF |
N/A |
Limitations of (Procrustes) Alignment in Assessing Multi-Person Human Pose and Shape Estimation |
| 现代医疗中语言模型的作用:全面综述 |
Amna Khalid |
PDF |
N/A |
The Role of Language Models in Modern Healthcare: A Comprehensive Review |
| 一种多功能且可微的手部与物体交互表示 |
Théo Morales |
PDF |
N/A |
A Versatile and Differentiable Hand-Object Interaction Representation |
| 法律调解中基于定量论证的争议解决 |
Xiao Chi |
PDF |
N/A |
Dispute resolution in legal mediation with quantitative argumentation |
| 使用视觉基础模型和交叉注意力机制的鲁棒场景变化检测 |
Chun-Jung Lin |
PDF |
N/A |
Robust Scene Change Detection Using Visual Foundation Models and Cross-Attention Mechanisms |
| 通过认知建模揭示人工智能基准测试中的假设 |
Jonathan H. Rystrøm |
PDF |
N/A |
Exposing Assumptions in AI Benchmarks through Cognitive Modelling |
| IRASNet:改进的特征级杂波抑制用于域泛化SAR-ATR |
Oh-Tae Jang |
PDF |
N/A |
IRASNet: Improved Feature-Level Clutter Reduction for Domain Generalized SAR-ATR |
| 时间序列预测的最佳起点 |
Yiming Zhong |
PDF |
N/A |
Optimal starting point for time series forecasting |
| 利用神经启发的前端显式建模皮层前视觉可提升CNN鲁棒性 |
Lucas Piper |
PDF |
N/A |
Explicitly Modeling Pre-Cortical Vision with a Neuro-Inspired Front-End Improves CNN Robustness |
| Demo2Vec:利用人口统计信息学习区域嵌入 |
Ya Wen |
PDF |
N/A |
Demo2Vec: Learning Region Embedding with Demographic Information |
| 用于年龄最小化移动边缘计算的异步分数多智能体深度强化学习 |
Lyudong Jin |
PDF |
N/A |
Asynchronous Fractional Multi-Agent Deep Reinforcement Learning for Age-Minimal Mobile Edge Computing |
| OffRIPP:基于离线强化学习的信息路径规划 |
Srikar Babu Gadipudi |
PDF |
N/A |
OffRIPP: Offline RL-based Informative Path Planning |
| 人工智能方法在现代力控制造机器人任务中的作用 |
Vincenzo Petrone |
PDF |
N/A |
On the role of Artificial Intelligence methods in modern force-controlled manufacturing robotic tasks |
| 聚焦整体并感知环境以实现任意形状文本检测 |
Xu Han |
PDF |
N/A |
Focus Entirety and Perceive Environment for Arbitrary-Shaped Text Detection |
| 使用时间离散隐式龙格-库塔PINN学习相空间流 |
Álvaro Fernández Corral |
PDF |
N/A |
Learning phase-space flows using time-discrete implicit Runge-Kutta PINNs |
| 状态空间层中用于深度强化学习在部分可观测性下的不确定性表示 |
Carlos E. Luis |
PDF |
N/A |
Uncertainty Representations in State-Space Layers for Deep Reinforcement Learning under Partial Observability |
| XAI引导的不平衡数据集绝缘子异常检测 |
Maximilian Andreas Hoefler |
PDF |
N/A |
XAI-guided Insulator Anomaly Detection for Imbalanced Datasets |
| 聚光灯文本检测器:像相机一样聚焦候选区域 |
Xu Han |
PDF |
N/A |
Spotlight Text Detector: Spotlight on Candidate Regions Like a Camera |
| CodeInsight:一个精选自Stack Overflow的实用编程解决方案数据集 |
Nathanaël Beau |
PDF |
N/A |
CodeInsight: A Curated Dataset of Practical Coding Solutions from Stack Overflow |
| 面向通用文本引导的图像合成,用于定制化多模态脑部MRI生成 |
Yulin Wang |
PDF |
N/A |
Towards General Text-guided Image Synthesis for Customized Multimodal Brain MRI Generation |
| 基于深度学习的核函数动态模式分解参数化框架 |
Konstantinos Kevopoulos |
PDF |
N/A |
A parametric framework for kernel-based dynamic mode decomposition using deep learning |
| 通过近似内核加速微控制器上的TinyML推理 |
Giorgos Armeniakos |
PDF |
N/A |
Accelerating TinyML Inference on Microcontrollers through Approximate Kernels |
| PeerArg:基于大型语言模型的辩论式同行评审 |
Purin Sukpanichnant |
PDF |
N/A |
PeerArg: Argumentative Peer Review with LLMs |
| 内联光度校准混合视觉SLAM |
Nicolas Abboud |
PDF |
N/A |
Inline Photometrically Calibrated Hybrid Visual SLAM |
| 在边缘计算设备上进行目标检测的深度学习模型基准测试 |
Daghash K. Alqahtani |
PDF |
N/A |
Benchmarking Deep Learning Models for Object Detection on Edge Computing Devices |
| 几个伪君子:用于在线气候变化辩论中检测虚伪指控的少样本学习和子类型定义 |
Paulina Garcia Corral |
PDF |
N/A |
A Few Hypocrites: Few-Shot Learning and Subtype Definitions for Detecting Hypocrisy Accusations in Online Climate Change Debates |
| 利用深度特征和拓扑先验的结肠镜检查中的拓扑SLAM |
Javier Morlana |
PDF |
N/A |
Topological SLAM in colonoscopies leveraging deep features and topological priors |
| 大型语言模型预测2024年全印度夏季季风降雨量高于正常水平 |
Ujjawal Sharma |
PDF |
N/A |
Large Language Model Predicts Above Normal All India Summer Monsoon Rainfall in 2024 |
| 可扩展的集成多样化用于OOD泛化和检测 |
Alexander Rubinstein |
PDF |
N/A |
Scalable Ensemble Diversification for OOD Generalization and Detection |
| Spacewalker:遍历表示空间,实现非结构化数据的快速交互式探索与标注 |
Lukas Heine |
PDF |
N/A |
Spacewalker: Traversing Representation Spaces for Fast Interactive Exploration and Annotation of Unstructured Data |
| 强化学习的符号状态划分 |
Mohsen Ghaffari |
PDF |
N/A |
Symbolic State Partition for Reinforcement Learning |
| 缓解大型语言模型评估中的偏见 |
Hongli Zhou |
PDF |
N/A |
Mitigating the Bias of Large Language Model Evaluation |
| 通过特征归因增强AI回归任务中的特征选择和可解释性 |
Alexander Hinterleitner |
PDF |
N/A |
Enhancing Feature Selection and Interpretability in AI Regression Tasks Through Feature Attribution |
| 基于世界模型的视觉腿部运动感知 |
Hang Lai |
PDF |
N/A |
World Model-based Perception for Visual Legged Locomotion |
| 通过自上而下的测试用例生成和多轮交互实现大型语言模型的整体自动化红队测试 |
Jinchuan Zhang |
PDF |
N/A |
Holistic Automated Red Teaming for Large Language Models through Top-Down Test Case Generation and Multi-turn Interaction |
| LLaMa-SciQ:一个用于回答科学选择题的教育聊天机器人 |
Marc-Antoine Allard |
PDF |
N/A |
LLaMa-SciQ: An Educational Chatbot for Answering Science MCQ |
| MixPolyp:融合掩码、边界框和涂鸦监督以增强息肉分割 |
Yiwen Hu |
PDF |
N/A |
MixPolyp: Integrating Mask, Box and Scribble Supervision for Enhanced Polyp Segmentation |
| 城市污水监测中传感器优化布置问题的演化贪婪算法 |
Sunyu Wang |
PDF |
N/A |
Evolutionary Greedy Algorithm for Optimal Sensor Placement Problem in Urban Sewage Surveillance |
| 超水平集与指数衰减:一种协同稳定的神经网络训练方法 |
Jatin Chaudhary |
PDF |
N/A |
Super Level Sets and Exponential Decay: A Synergistic Approach to Stable Neural Network Training |
| 在变化的信噪比下解释基于深度神经网络的接收器 |
Marko Tuononen |
PDF |
N/A |
Interpreting Deep Neural Network-Based Receiver Under Varying Signal-To-Noise Ratios |
| 探索监督训练中神经崩溃相关的信息论度量 |
Kun Song |
PDF |
N/A |
Exploring Information-Theoretic Metrics Associated with Neural Collapse in Supervised Training |
| 让光存在:在外部光照下利用深度学习实现稳健的无镜头成像 |
Eric Bezzam |
PDF |
N/A |
Let There Be Light: Robust Lensless Imaging Under External Illumination With Deep Learning |
| MaViLS:一个视频与幻灯片对齐基准数据集,并利用融合语音、OCR和视觉特征的多模态对齐算法评估基线准确性 |
Katharina Anderer |
PDF |
N/A |
MaViLS, a Benchmark Dataset for Video-to-Slide Alignment, Assessing Baseline Accuracy with a Multimodal Alignment Algorithm Leveraging Speech, OCR, and Visual Features |
| 离线和分布式强化学习在无线电资源管理中的应用 |
Eslam Eldeeb |
PDF |
N/A |
Offline and Distributional Reinforcement Learning for Radio Resource Management |
| 全州范围内的野外视觉地理定位 |
Florian Fervers |
PDF |
N/A |
Statewide Visual Geolocalization in the Wild |
| 一种在加性噪声环境下进化策略的自适应重评估方法 |
Catalin-Viorel Dinu |
PDF |
N/A |
An Adaptive Re-evaluation Method for Evolution Strategy under Additive Noise |
| 探索可解释人工智能的迷宫:评估方法和指标的系统性方法 |
Lukas Klein |
PDF |
N/A |
Navigating the Maze of Explainable AI: A Systematic Approach to Evaluating Methods and Metrics |
| E-SQL:通过问题丰富实现直接模式链接的文本到SQL转换 |
Hasan Alp Caferoğlu |
PDF |
N/A |
E-SQL: Direct Schema Linking via Question Enrichment in Text-to-SQL |
| 三维微结构的快速原型制作:一种简化的灰度光刻编码方法,使用Blender |
Fabricio Frizera Borghi |
PDF |
N/A |
Rapid Prototyping of 3D Microstructures: A Simplified Grayscale Lithography Encoding Method Using Blender |
| 普遍有趣的图像 |
Fitim Abdullahu |
PDF |
N/A |
Commonly Interesting Images |
| GB-RVFL:随机神经网络与粒球计算的融合 |
M. Sajid |
PDF |
N/A |
GB-RVFL: Fusion of Randomized Neural Network and Granular Ball Computing |
| 有损压缩对使用深度学习的3D医学图像分割的影响 |
Anvar Kurmukov |
PDF |
N/A |
The Effect of Lossy Compression on 3D Medical Images Segmentation with Deep Learning |
| 非平稳BERT:探索增强的IMU数据以实现鲁棒的人类活动识别 |
Ning Sun |
PDF |
N/A |
Non-stationary BERT: Exploring Augmented IMU Data For Robust Human Activity Recognition |
| SDCL:面向半监督医学图像分割的学生差异引导校正学习 |
Bentao Song |
PDF |
N/A |
SDCL: Students Discrepancy-Informed Correction Learning for Semi-supervised Medical Image Segmentation |
| RoleBreak:角色幻觉作为角色扮演系统中的越狱攻击 |
Yihong Tang |
PDF |
N/A |
RoleBreak: Character Hallucination as a Jailbreak Attack in Role-Playing Systems |
| 经过验证的神经网络孪生体的相对安全裕度 |
Anahita Baninajjar |
PDF |
N/A |
Verified Relative Safety Margins for Neural Network Twins |
| EAGLE:面向多模态大型语言模型的高效任意视觉提示理解 |
Jiacheng Zhang |
PDF |
N/A |
EAGLE: Towards Efficient Arbitrary Referring Visual Prompts Comprehension for Multimodal Large Language Models |
| PMSS:针对LLM微调的预训练矩阵骨架选择 |
Qibin Wang |
PDF |
N/A |
PMSS: Pretrained Matrices Skeleton Selection for LLM Fine-tuning |
| 基于多数据集分类的深度学习框架,用于电子健康记录和医疗预测分析 |
Syed Mohd Faisal Malik |
PDF |
N/A |
A Multi-Dataset Classification-Based Deep Learning Framework for Electronic Health Records and Predictive Analysis in Healthcare |
| 追逐金色飞贼:多无人机时间最优运动规划与多智能体强化学习 |
Xian Wang |
PDF |
N/A |
Dashing for the Golden Snitch: Multi-Drone Time-Optimal Motion Planning with Multi-Agent Reinforcement Learning |
| 通过简单的参数高效修改进行视觉语言模型的微调 |
Ming Li |
PDF |
N/A |
Vision-Language Model Fine-Tuning via Simple Parameter-Efficient Modification |
| 超越图灵测试:GPT-4能否影响专家决策? |
Takehiro Takayanagi |
PDF |
N/A |
Beyond Turing Test: Can GPT-4 Sway Experts' Decisions? |
| 姿态引导的细粒度手语视频生成 |
Tongkai Shi |
PDF |
N/A |
Pose-Guided Fine-Grained Sign Language Video Generation |
| 探究基于Transformer的RDF-to-文本模型中的遗漏与扭曲 |
Juliette Faille |
PDF |
N/A |
Probing Omissions and Distortions in Transformer-based RDF-to-Text Models |
| Pix2Next:利用视觉基础模型进行RGB到NIR图像翻译 |
Youngwan Jin |
PDF |
N/A |
Pix2Next: Leveraging Vision Foundation Models for RGB to NIR Image Translation |
| 3DDX:通过双面深度估计从单张标准几何射线照片进行骨骼表面重建 |
Yi Gu |
PDF |
N/A |
3DDX: Bone Surface Reconstruction from a Single Standard-Geometry Radiograph via Dual-Face Depth Estimation |
| 有界参数神经网络的数值逼近能力:极限存在吗,如何测量? |
Li Liu |
PDF |
N/A |
Numerical Approximation Capacity of Neural Networks with Bounded Parameters: Do Limits Exist, and How Can They Be Measured? |
| 低比特大型语言模型的调查:基础、系统和算法 |
Ruihao Gong |
PDF |
N/A |
A Survey of Low-bit Large Language Models: Basics, Systems, and Algorithms |
| CaBRNet,一个用于开发和评估基于案例推理模型的开源库 |
Romain Xu-Darme |
PDF |
N/A |
CaBRNet, an open-source library for developing and evaluating Case-Based Reasoning Models |
| 布局校正器:缓解离散扩散模型中的布局粘连现象 |
Shoma Iwai |
PDF |
N/A |
Layout-Corrector: Alleviating Layout Sticking Phenomenon in Discrete Diffusion Model |
| MSI-Agent:将多尺度洞察融入具身智能体,以实现卓越的规划和决策能力 |
Dayuan Fu |
PDF |
N/A |
MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for Superior Planning and Decision-Making |
| Skyeyes:利用航拍视角图像进行地面漫游 |
Zhiyuan Gao |
PDF |
N/A |
Skyeyes: Ground Roaming using Aerial View Images |
| 擦除与修正:一种无需训练的参数编辑方法,实现高效的图数据遗忘 |
Zhe-Rui Yang |
PDF |
N/A |
Erase then Rectify: A Training-Free Parameter Editing Approach for Cost-Effective Graph Unlearning |
| SynTQA:通过文本到SQL与端到端TQA混合模型实现协同表格问答 |
Siyue Zhang |
PDF |
N/A |
SynTQA: Synergistic Table-based Question Answering via Mixture of Text-to-SQL and E2E TQA |
| 基于语言模型的文本转语音中的情感维度控制:涵盖人类情感的广泛光谱 |
Kun Zhou |
PDF |
N/A |
Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions |
| TSBP:通过测试时自引导边界框传播提高组织学图像中的目标检测 |
Tingting Yang |
PDF |
N/A |
TSBP: Improving Object Detection in Histology Images via Test-time Self-guided Bounding-box Propagation |
| CryptoTrain:在加密数据集上进行快速安全训练 |
Jiaqi Xue |
PDF |
N/A |
CryptoTrain: Fast Secure Training on Encrypted Dataset |
| SWE2:用于仇恨言论检测的子词增强与重要词汇强调框架 |
Guanyi Mou |
PDF |
N/A |
SWE2: SubWord Enriched and Significant Word Emphasized Framework for Hate Speech Detection |
| 在线社交网络中的野生动物产品交易:以象牙相关产品销售推广帖为例的研究 |
Guanyi Mou |
PDF |
N/A |
Wildlife Product Trading in Online Social Networks: A Case Study on Ivory-Related Product Sales Promotion Posts |
| GraphLoRA:结构感知对比低秩适应用于跨图迁移学习 |
Zhe-Rui Yang |
PDF |
N/A |
GraphLoRA: Structure-Aware Contrastive Low-Rank Adaptation for Cross-Graph Transfer Learning |
| 主题感知的因果干预用于反事实检测 |
Thong Nguyen |
PDF |
N/A |
Topic-aware Causal Intervention for Counterfactual Detection |
| 通过想象力进行以角色为中心的创意故事生成 |
Kyeongman Park |
PDF |
N/A |
A Character-Centric Creative Story Generation via Imagination |
| TalkinNeRF:用于全身说话人类的可动画神经场 |
Aggelina Chatziagapi |
PDF |
N/A |
TalkinNeRF: Animatable Neural Fields for Full-Body Talking Humans |
| 使用潜在空间生成世界模型减轻自动驾驶车辆模仿学习中的协变量偏移 |
Alexander Popov |
PDF |
N/A |
Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models |
| 预训练语言模型对不忠实幻觉文本返回可区分的概率分布 |
Taehun Cha |
PDF |
N/A |
Pre-trained Language Models Return Distinguishable Probability Distributions to Unfaithfully Hallucinated Texts |
| 使用大型语音-文本基础模型进行语音识别重评分 |
Prashanth Gurunath Shivakumar |
PDF |
N/A |
Speech Recognition Rescoring with Large Speech-Text Foundation Models |
| 可信度转换器 |
Ronald Richman |
PDF |
N/A |
The Credibility Transformer |
| 渐进式表示学习用于实时无人机跟踪 |
Changhong Fu |
PDF |
N/A |
Progressive Representation Learning for Real-Time UAV Tracking |
| 通过自监督辅助学习进行多任务学习中的表示学习 |
Seokwon Shin |
PDF |
N/A |
Learning Representation for Multitask learning through Self Supervised Auxiliary learning |
| 领域无关的时间序列数据描述性文本自动生成 |
Kota Dohi |
PDF |
N/A |
Domain-Independent Automatic Generation of Descriptive Texts for Time-Series Data |
| 跨语言和跨文化在图像描述中的差异 |
Uri Berger |
PDF |
N/A |
Cross-Lingual and Cross-Cultural Variation in Image Descriptions |