
Arxiv 2024-12-05 Papers

Chinese title Author PDF link Code repository Title
立体无处不在:即使在立体或单目失败的情况下,也能实现鲁棒的零样本深度立体匹配 Luca Bartolomei PDF N/A Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail
PaintScene4D:从文本提示生成一致的4D场景 Vinayak Gupta PDF N/A PaintScene4D: Consistent 4D Scene Generation from Text Prompts
Turbo3D:超快文本转3D生成 Hanzhe Hu PDF N/A Turbo3D: Ultra-fast Text-to-3D Generation
NVILA:高效前沿视觉语言模型 Zhijian Liu PDF N/A NVILA: Efficient Frontier Visual Language Models
QUEEN:流式自由视角视频中动态高斯分布的量化高效编码 Sharath Girish PDF N/A QUEEN: QUantized Efficient ENcoding of Dynamic Gaussians for Streaming Free-viewpoint Videos
VisionZip:更长更好,但在视觉语言模型中并非必要 Senqiao Yang PDF N/A VisionZip: Longer is Better but Not Necessary in Vision Language Models
UnZipLoRA:从单张图像中分离内容和风格 Chang Liu PDF N/A UnZipLoRA: Separating Content and Style from a Single Image
DualPM:用于三维形状和姿态重建的双姿态-规范点图 Ben Kaye PDF N/A DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction
MegaSaM:从随意动态视频中准确、快速且稳健地提取结构和运动 Zhengqi Li PDF N/A MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos
4Real-Video:学习可泛化的照片级真实感4D视频扩散 Chaoyang Wang PDF N/A 4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion
LayerFusion:利用生成先验实现多层次文本到图像生成的和谐统一 Yusuf Dalva PDF N/A LayerFusion: Harmonized Multi-Layer Text-to-Image Generation with Generative Priors
稀疏体素光栅化:实时高保真辐射场渲染 Cheng Sun PDF N/A Sparse Voxels Rasterization: Real-time High-fidelity Radiance Field Rendering
Cubify Anything:室内3D物体检测的扩展 Justin Lazarow PDF N/A Cubify Anything: Scaling Indoor 3D Object Detection
单目动态高斯泼溅快速但脆弱,而平滑运动有助于改善效果 Yiqing Liang PDF N/A Monocular Dynamic Gaussian Splatting is Fast and Brittle but Smooth Motion Helps
HeatFormer:一种用于多视角人体网格恢复的神经优化器 Yuto Matsubara PDF N/A HeatFormer: A Neural Optimizer for Multiview Human Mesh Recovery
代码即监控:面向约束的可视化编程,用于反应性和前瞻性机器人故障检测 Enshen Zhou PDF N/A Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection
Aguvis: 统一纯视觉代理,用于自主GUI交互 Yiheng Xu PDF N/A Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
四平面分解视频自编码器 Mohammed Suhail PDF N/A Four-Plane Factorized Video Autoencoders
NaVILA:用于导航的足式机器人视觉-语言-动作模型 An-Chieh Cheng PDF N/A NaVILA: Legged Robot Vision-Language-Action Model for Navigation
p-MoD:通过渐进式比率衰减构建深度混合的多模态大语言模型 Jun Zhang PDF N/A p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay
MEMO:用于表达性对话视频生成的记忆引导扩散 Longtao Zheng PDF N/A MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation
EgoPlan-Bench2:一个用于多模态大语言模型在现实世界场景中规划的基准 Lu Qiu PDF N/A EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios
DiCoDe:用于自回归视频生成与语言模型的扩散压缩深度令牌 Yizhuo Li PDF N/A DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models
Moto:潜在运动令牌作为机器人操作的桥梁语言 Yi Chen PDF N/A Moto: Latent Motion Token as the Bridging Language for Robot Manipulation
学习艺术签名:对称性发现与风格迁移 Emma Finn PDF N/A Learning Artistic Signatures: Symmetry Discovery and Style Transfer
GenMAC:通过多智能体协作实现组合式文本到视频生成 Kaiyi Huang PDF N/A GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration
面向实时开放词汇视频实例分割 Bin Yan PDF N/A Towards Real-Time Open-Vocabulary Video Instance Segmentation
PBDyG:基于位置的动态高斯模型用于感知运动的着装人体化身 Shota Sasaki PDF N/A PBDyG: Position Based Dynamic Gaussians for Motion-Aware Clothed Human Avatars
Divot:用于理解和生成的扩散力视频令牌器 Yuying Ge PDF N/A Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation
Infinity:通过按位自回归建模扩展高分辨率图像合成 Jian Han PDF N/A Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
将图像中的描述接地信息用于零样本视觉识别 Shaunak Halbe PDF N/A Grounding Descriptions in Images informs Zero-Shot Visual Recognition
Marvel:通过微调的离线策略加速安全的在线强化学习 Keru Chen PDF N/A Marvel: Accelerating Safe Online Reinforcement Learning with Finetuned Offline Policy
CA-SSLR:面向广义语音处理的感知条件自监督学习表示 Yen-Ju Lu PDF N/A CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing
Florence-VL:通过生成式视觉编码器和深度-广度融合增强视觉-语言模型 Jiuhai Chen PDF N/A Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion
FedDUAL: 一种结合自适应损失和动态聚合的双策略方法,用于缓解联邦学习中的数据异质性问题 Pranab Sahoo PDF N/A FedDUAL: A Dual-Strategy with Adaptive Loss and Dynamic Aggregation for Mitigating Data Heterogeneity in Federated Learning
针对核心:通过直接LLM操纵攻击基于RAG的代理的简单有效方法 Xuying Li PDF N/A Targeting the Core: A Simple and Effective Method to Attack RAG-based Agents via Direct LLM Manipulation
通过样本优化景观分析实现高效任务分组 Anshul Thakur PDF N/A Efficient Task Grouping Through Samplewise Optimisation Landscape Analysis
使用数据和机器学习稳定并解决逆问题 Erik Burman PDF N/A Stabilizing and Solving Inverse Problems using Data and Machine Learning
为无线联邦学习提供差分隐私:一种跨层框架 Jiayu Mao PDF N/A Providing Differential Privacy for Federated Learning Over Wireless: A Cross-layer Framework
联邦自动化特征工程 Tom Overman PDF N/A Federated Automated Feature Engineering
通过计算高效模型阶梯建立任务缩放法则 Akshita Bhagia PDF N/A Establishing Task Scaling Laws via Compute-Efficient Model Ladders
在实验资源受限条件下,通过流水线评估实现异步批量贝叶斯优化的方法 Yujin Taguchi PDF N/A Asynchronous Batch Bayesian Optimization with Pipelining Evaluations for Experimental Resource–constrained Conditions
用于高效三维占据预测的概率高斯叠加 Yuanhui Huang PDF N/A Probabilistic Gaussian Superposition for Efficient 3D Occupancy Prediction
SeeGround:面向零样本开放词汇3D视觉定位的观察与定位 Rong Li PDF N/A SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
EmbodiedOcc:基于视觉的在线场景理解的三维占据预测 Yuqi Wu PDF N/A EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding
对大型视觉语言模型进行判别式微调 Yassine Ouali PDF N/A Discriminative Fine-tuning of LVLMs
《理解二分类器性能的搭便车指南》 Anaïs Halin PDF N/A A Hitchhiker's Guide to Understanding Performances of Two-Class Classifiers
可逆分子模拟用于训练经典和机器学习力场 Joe G Greener PDF N/A Reversible molecular simulation for training classical and machine learning force fields
通过自回归特征和优势加权实现更精细的行为基础模型 Edoardo Cetin PDF N/A Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting
自主网络防御的机器心智理论 Luke Swaby PDF N/A Machine Theory of Mind for Autonomous Cyber-Defence
人工智能与创造力的内在过程 Jaan Aru PDF N/A Artificial intelligence and the internal processes of creativity
提高并行性的近似Top-k算法 Oscar Key PDF N/A Approximate Top-$k$ for Increased Parallelism
用于图建模和生成的多尺度节点嵌入 Riccardo Milocco PDF N/A Multi-Scale Node Embeddings for Graph Modeling and Generation
ActFusion:一种用于动作分割和预测的统一扩散模型 Dayoung Gong PDF N/A ActFusion: a Unified Diffusion Model for Action Segmentation and Anticipation
BhashaVerse:印度次大陆语言翻译生态系统 Vandan Mujadia PDF N/A BhashaVerse : Translation Ecosystem for Indian Subcontinent Languages
分布稳健的表现预测 Songkai Xue PDF N/A Distributionally Robust Performative Prediction
RMD:通过无训练检索增强运动扩散实现更通用的人类运动生成的一个简单基线 Zhouyingcheng Liao PDF N/A RMD: A Simple Baseline for More General Human Motion Generation via Training-free Retrieval-Augmented Motion Diffuse
使用非结构化知识进行检索增强的机器翻译 Jiaan Wang PDF N/A Retrieval-Augmented Machine Translation with Unstructured Knowledge
基于可能性调度的分数生成模型用于全三维PET图像重建 George Webber PDF N/A Likelihood-Scheduled Score-Based Generative Modeling for Fully 3D PET Image Reconstruction
反思型教师:通过不确定性度量实现鸟瞰图下半监督多模态三维物体检测 Saheli Hazra PDF N/A Reflective Teacher: Semi-Supervised Multimodal 3D Object Detection in Bird's-Eye-View via Uncertainty Measure
Liquid: 语言模型是可扩展的多模态生成器 Junfeng Wu PDF N/A Liquid: Language Models are Scalable Multi-modal Generators
约束条件下连续环境中的强化学习动作映射 Mirco Theile PDF N/A Action Mapping for Reinforcement Learning in Continuous Environments with Constraints
多主题图像合成作为单主题PET图像重建的生成先验 George Webber PDF N/A Multi-Subject Image Synthesis as a Generative Prior for Single-Subject PET Image Reconstruction
GRAM:在深度强化学习中通过稳健适应模块实现泛化 James Queeney PDF N/A GRAM: Generalization in Deep RL with a Robust Adaptation Module
基于生成模型的全三维PET图像条件扩散采样重建 George Webber PDF N/A Generative-Model-Based Fully 3D PET Image Reconstruction by Conditional Diffusion Sampling
超拟合现象:为开放式文本生成优化和稳定大型语言模型 Fredrik Carlsson PDF N/A The Hyperfitting Phenomenon: Sharpening and Stabilizing LLMs for Open-Ended Text Generation
FlashSloth:通过嵌入式视觉压缩实现的高效多模态大语言模型 Bo Tong PDF N/A FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression
大语言模型(LLMs)的Densing定律 Chaojun Xiao PDF N/A Densing Law of LLMs
LocalSR:局部区域图像超分辨率 Bo Ji PDF N/A LocalSR: Image Super-Resolution in Local Region
The Tile:用于二分类的排名分数二维图 Sébastien Piérard PDF N/A The Tile: A 2D Map of Ranking Scores for Two-Class Classification
ALMA:最小注释对齐 Michihiro Yasunaga PDF N/A ALMA: Alignment with Minimal Annotation
面向零样本的三维异常定位 Yizhou Wang PDF N/A Towards Zero-shot 3D Anomaly Localization
SwiftEdit:通过一步扩散实现闪电般快速的文本引导图像编辑 Trong-Tung Nguyen PDF N/A SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion
T2I-FactualBench:利用知识密集型概念评估文本到图像模型的真实性基准测试 Ziwei Huang PDF N/A T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts
结构感知风格化图像合成在鲁棒医学图像分割中的应用 Jie Bao PDF N/A Structure-Aware Stylized Image Synthesis for Robust Medical Image Segmentation
SIDA:利用大型多模态模型进行社交媒体图像深度伪造检测、定位与解释 Zhenglin Huang PDF N/A SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model
数学推理的进化预提示优化 Mathurin Videau PDF N/A Evolutionary Pre-Prompt Optimization for Mathematical Reasoning
针对点参考空间数据的深度因果推断与连续处理 Ziyang Jiang PDF N/A Deep Causal Inference for Point-referenced Spatial Data with Continuous Treatments
可学习无穷泰勒高斯函数用于动态视图渲染 Bingbing Hu PDF N/A Learnable Infinite Taylor Gaussian for Dynamic View Rendering
HumanEdit:一个基于指令的图像编辑高质量人类奖励数据集 Jinbin Bai PDF N/A HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing
基于估计姿态和遮挡误差的定向硬样本合成以提升物体姿态估计 Alan Li PDF N/A Targeted Hard Sample Synthesis Based on Estimated Pose and Occlusion Error for Improved Object Pose Estimation
阿拉伯稳定语言模型:将稳定语言模型2 1.6B适配到阿拉伯语 Zaid Alyafeai PDF N/A Arabic Stable LM: Adapting Stable LM 2 1.6B to Arabic
向量值预测的复杂性:从线性模型到随机凸优化 Matan Schliserman PDF N/A Complexity of Vector-valued Prediction: From Linear Models to Stochastic Convex Optimization
从野生动物视频中进行强化学习 Elliot Chane-Sane PDF N/A Reinforcement Learning from Wild Animal Videos
PoTable:像人类分析师一样在基于表格的推理中编程标准化 Qingyang Mao PDF N/A PoTable: Programming Standardly on Table-based Reasoning Like a Human Analyst
端到端语音翻译的表示净化 Chengwei Zhang PDF N/A Representation Purification for End-to-End Speech Translation
SynFinTabs:一个用于信息和表格提取的合成金融表格数据集 Ethan Bradley PDF N/A SynFinTabs: A Dataset of Synthetic Financial Tables for Information and Table Extraction
Aya Expanse:结合研究突破,开创多语言新前沿 John Dang PDF N/A Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier
通过监督对比领域自适应提升全切片图像分类 Ilán Carretero PDF N/A Enhancing Whole Slide Image Classification through Supervised Contrastive Domain Adaptation
SCADE:可扩展的命令行异常检测引擎 Vaishali Vinay PDF N/A SCADE: Scalable Command-line Anomaly Detection Engine
在密集环境中终身导航的瞬态多智能体路径寻找 Jonathan Morag PDF N/A Transient Multi-Agent Path Finding for Lifelong Navigation in Dense Environments
CLINICSUM:利用语言模型从医患对话中生成临床摘要 Subash Neupane PDF N/A CLINICSUM: Utilizing Language Models for Generating Clinical Summaries from Patient-Doctor Conversations
通过几何聚合的2D视觉特征进行3D部件分割 Marco Garosi PDF N/A 3D Part Segmentation via Geometric Aggregation of 2D Visual Features
鲁棒分类的有趣特性 Bernd Prach PDF N/A Intriguing Properties of Robust Classification
GigaHands:一个大规模标注的双手活动数据集 Rao Fu PDF N/A GigaHands: A Massive Annotated Dataset of Bimanual Hand Activities
量化分割一切模型的极限:分析分割树状和低对比度结构的挑战 Yixin Zhang PDF N/A Quantifying the Limits of Segment Anything Model: Analyzing Challenges in Segmenting Tree-Like and Low-Contrast Structures
LMDM:用于三维分子生成的潜在分子扩散模型 Xiang Chen PDF N/A LMDM: Latent Molecular Diffusion Model For 3D Molecule Generation
VASCAR:通过视觉感知自校正实现内容感知布局生成 Jiahao Zhang PDF N/A VASCAR: Content-Aware Layout Generation via Visual-Aware Self-Correction
通过主题建模探索哥伦比亚哲学史 Juan R. Loaiza PDF N/A A History of Philosophy in Colombia through Topic Modelling
在意大利医疗大型语言模型聊天机器人中使用RAG和NMISS处理幻觉 Maria Paola Priola PDF N/A Addressing Hallucinations with RAG and NMISS in Italian Healthcare LLM Chatbots
DEIM:具有改进匹配的DETR,用于快速收敛 Shihua Huang PDF N/A DEIM: DETR with Improved Matching for Fast Convergence
HyperMARL:用于多智能体强化学习的自适应超网络 Kale-ab Abebe Tessera PDF N/A HyperMARL: Adaptive Hypernetworks for Multi-Agent RL
基于绩效排名的理论基础 Sébastien Piérard PDF N/A Foundations of the Theory of Performance-Based Ranking
自定义混合LoRA专家的多模态语义分割的Segment Anything模型 Chenyang Zhu PDF N/A Customize Segment Anything Model for Multi-Modal Semantic Segmentation with Mixture of LoRA Experts
对齐音乐符号与歌词转录 Eliseo Fuentes-Martínez PDF N/A Aligned Music Notation and Lyrics Transcription
利用未标记的sEMG信号进行肌肉力预测的物理信息深度学习 Shuhao Ma PDF N/A Physics-informed Deep Learning for Muscle Force Prediction with Unlabeled sEMG Signals
一个用于翻译中介对话的上下文感知框架 José Pombal PDF N/A A Context-aware Framework for Translation-mediated Conversations
PANGAEA:一个全球性和包容性的地理空间基础模型基准 Valerio Marsocci PDF N/A PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models
歌词音乐中关键词与强拍之间的关系 Callie C. Liao PDF N/A Relationships between Keywords and Strong Beats in Lyrical Music
Hipandas:通过与全色图像融合实现高光谱图像联合去噪与超分辨率 Shuang Xu PDF N/A Hipandas: Hyperspectral Image Joint Denoising and Super-Resolution by Image Fusion with the Panchromatic Image
AL-QASIDA:系统分析阿拉伯方言中大型语言模型质量与准确性的系统 Nathaniel R. Robinson PDF N/A AL-QASIDA: Analyzing LLM Quality and Accuracy Systematically in Dialectal Arabic
直接结构适应以克服统计冲突并实现持续学习 Zeki Doruk Erden PDF N/A Directed Structural Adaptation to Overcome Statistical Conflicts and Enable Continual Learning
教学视频生成 Yayuan Li PDF N/A Instructional Video Generation
利用大型语言模型生成特定课程的语义注释学习对象 Dominic Lohr PDF N/A Leveraging Large Language Models to Generate Course-specific Semantically Annotated Learning Objects
使用GAN和频谱损失建模眼球注视速度轨迹以提高逼真度 Shailendra Bhandari PDF N/A Modeling Eye Gaze Velocity Trajectories using GANs with Spectral Loss for Enhanced Fidelity
线性判别分析在信用评分中的应用:一种透明的混合模型方法 Md Shihab Reza PDF N/A Linear Discriminant Analysis in Credit Scoring: A Transparent Hybrid Model Approach
SKIM:任意位量化 推动后训练量化的极限 Runsheng Bai PDF N/A SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization
基于渐进信息披露的多层隐私保护记录链接与文员审查 Florens Rohde PDF N/A Multi-Layer Privacy-Preserving Record Linkage with Clerical Review based on gradual information disclosure
固定均值高斯过程用于后验贝叶斯深度学习 Luis A. Ortega PDF N/A Fixed-Mean Gaussian Processes for Post-hoc Bayesian Deep Learning
Bench-CoE:一个用于基准专家协作的框架 Yuanshuai Wang PDF N/A Bench-CoE: a Framework for Collaboration of Experts from Benchmark
多类分类算法中风险评估的深入研究 Disha Ghandwani PDF N/A An In-Depth Examination of Risk Assessment in Multi-Class Classification Algorithms
论二进制函数相似性系统鲁棒性的缺失 Gianluca Capozzi PDF N/A On the Lack of Robustness of Binary Function Similarity Systems
LossVal:神经网络的高效数据估值 Tim Wibiral PDF N/A LossVal: Efficient Data Valuation for Neural Networks
不稳定非线性随机系统闭环辨识的非渐近界 Seth Siriya PDF N/A Non-Asymptotic Bounds for Closed-Loop Identification of Unstable Nonlinear Stochastic Systems
使用事件和帧的频率自适应低延迟目标检测 Haitian Zhang PDF N/A Frequency-Adaptive Low-Latency Object Detection Using Events and Frames
MultiTASC++:一种面向基于边缘的多设备级联推理的持续自适应调度器 Sokratis Nikolaidis PDF N/A MultiTASC++: A Continuously Adaptive Scheduler for Edge-Based Multi-Device Cascade Inference
AnyDressing:通过潜在扩散模型实现可定制的多服装虚拟试穿 Xinghui Li PDF N/A AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models
如果你无法使用它们,那就回收它们:大规模优化合并以缓解性能权衡 Muhammad Khalifa PDF N/A If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs
利用深度学习和微流控技术在线估计聚合物熔体流变参数的方法论 Juan Sandubete-López PDF N/A Methodology for Online Estimation of Rheological Parameters in Polymer Melts Using Deep Learning and Microfluidics
通过可靠性对齐减少工具幻觉 Hongshen Xu PDF N/A Reducing Tool Hallucination via Reliability Alignment
通过概率景观中的锐度理解生成模型中的记忆化 Dongjae Jeon PDF N/A Understanding Memorization in Generative Models via Sharpness in Probability Landscapes
Monet:用于Transformer的单语义专家混合模型 Jungwoo Park PDF N/A Monet: Mixture of Monosemantic Experts for Transformers
使用图像比较进行多语言文档中的文本变化检测 Doyoung Park PDF N/A Text Change Detection in Multilingual Documents Using Image Comparison
组合生成多物理场与多组分模拟 Tao Zhang PDF N/A Compositional Generative Multiphysics and Multi-component Simulation
用于卫星图像恢复的深度先验方法,具有精确的不确定性估计 Biquard Maud PDF N/A Deep priors for satellite image restoration with accurate uncertainties
DeepFEA:用于预测瞬态有限元分析解决方案的深度学习 Georgios Triantafyllou PDF N/A DeepFEA: Deep Learning for Prediction of Transient Finite Element Analysis Solutions
CrossSDF:通过横截面进行薄结构的3D重建 Thomas Walker PDF N/A CrossSDF: 3D Reconstruction of Thin Structures From Cross-Sections
GRAF:基于事实增强的法律问答图检索 Cristian-George Crăciun PDF N/A GRAF: Graph Retrieval Augmented by Facts for Legal Question Answering
MVUDA:多视角行人检测的无监督域自适应 Erik Brorsson PDF N/A MVUDA: Unsupervised Domain Adaptation for Multi-view Pedestrian Detection
热成像与RGB图像在风力涡轮机损伤检测中相辅相成 Serhii Svystun PDF N/A Thermal and RGB Images Work Better Together in Wind Turbine Damage Detection
使用分层微调数据的迁移学习对撒哈拉以南非洲成人胶质瘤进行分割 Abhijeet Parida PDF N/A Adult Glioma Segmentation in Sub-Saharan Africa using Transfer Learning on Stratified Finetuning Data
通过背景操作符增强大型语言模型中的数学推理能力 Jiajun Chen PDF N/A Enhancing Mathematical Reasoning in LLMs with Background Operators
预训练、对齐与解耦:利用大型语言模型赋能序列推荐 Yuhao Wang PDF N/A Pre-train, Align, and Disentangle: Empowering Sequential Recommendation with Large Language Models
缺失的旋律:人工智能音乐生成及其对全球南方的“几乎”完全忽视 Atharva Mehta PDF N/A Missing Melodies: AI Music Generation and its "Nearly" Complete Omission of the Global South
D-LORD 用于运动风格化 Meenakshi Gupta PDF N/A D-LORD for Motion Stylization
HyperFLINT:基于超网络的流场估计与时间插值用于科学集合可视化 Hamid Gadirov PDF N/A HyperFLINT: Hypernetwork-based Flow Estimation and Temporal Interpolation for Scientific Ensemble Visualization
基于磁共振成像特征的亚型分类与模型集成以提升脑肿瘤分割效果 Zhifan Jiang PDF N/A Magnetic Resonance Imaging Feature-Based Subtyping and Model Ensemble for Enhanced Brain Tumor Segmentation
代理型大型语言模型系统的实际考虑 Chris Sypherd PDF N/A Practical Considerations for Agentic LLM Systems
GEITje 7B Ultra:荷兰语对话模型 Bram Vanroy PDF N/A GEITje 7B Ultra: A Conversational Model for Dutch
LossAgent:利用LLM代理实现图像处理中任意优化目标 Bingchen Li PDF N/A LossAgent: Towards Any Optimization Objectives for Image Processing with LLM Agents
BodyMetric:评估文本到图像生成中人体逼真度 Nefeli Andreou PDF N/A BodyMetric: Evaluating the Realism of Human Bodies in Text-to-Image Generation
开放世界组合零样本学习的统一框架 Hirunima Jayasekara PDF N/A Unified Framework for Open-World Compositional Zero-shot Learning
可学习的相似性与差异性引导的对称非负矩阵分解 Wenlong Lyu PDF N/A Learnable Similarity and Dissimilarity Guided Symmetric Non-Negative Matrix Factorization
移动网络中的联邦学习:一项关于流量预测的综合案例研究 Nikolaos Pavlidis PDF N/A Federated Learning in Mobile Networks: A Comprehensive Case Study on Traffic Forecasting
通过领域随机化和元强化学习实现可泛化的自主渗透测试 Shicheng Zhou PDF N/A Towards Generalizable Autonomous Penetration Testing via Domain Randomization and Meta-Reinforcement Learning
SoRA:用于领域泛化表示学习的奇异值分解低秩适应 Seokju Yun PDF N/A SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning
距离自适应的四元数知识图谱嵌入与双向旋转 Weihua Wang PDF N/A Distance-Adaptive Quaternion Knowledge Graph Embedding with Bidirectional Rotation
你的模型能理解基因吗?针对生物和文本模型的一个基因特性基准测试 Yoav Kan-Tor PDF N/A Does your model understand genes? A benchmark of gene properties for biological and text models
低空经济中的综合感知与通信:一种深度强化学习方法 Xiaowen Ye PDF N/A Integrated Sensing and Communications for Low-Altitude Economy: A Deep Reinforcement Learning Approach
TransAdapter:以特征为中心的无监督域适应的视觉变换器 A. Enes Doruk PDF N/A TransAdapter: Vision Transformer for Feature-Centric Unsupervised Domain Adaptation
边界引导学习在空间转录组学中基因表达预测的应用 Mingcheng Qu PDF N/A Boundary-Guided Learning for Gene Expression Prediction in Spatial Transcriptomics
ProtDAT:一个从任何蛋白质文本描述进行蛋白质序列设计的统一框架 Xiao-Yu Guo PDF N/A ProtDAT: A Unified Framework for Protein Sequence Design from Any Protein Text Description
自动生成心电图数据医疗报告:利用深度学习连接医学文本与信号处理 Amnon Bleich PDF N/A Automated Medical Report Generation for ECG Data: Bridging Medical Text and Signal Processing with Deep Learning
空间到政策:利用地理空间数据进行可扩展的砖窑检测与自动合规监测 Zeel B Patel PDF N/A Space to Policy: Scalable Brick Kiln Detection and Automatic Compliance Monitoring with Geospatial Data
图神经网络需要聚类-归一化-激活模块 Arseny Skryagin PDF N/A Graph Neural Networks Need Cluster-Normalize-Activate Modules
ZipAR:通过空间局部性加速自回归图像生成 Yefei He PDF N/A ZipAR: Accelerating Autoregressive Image Generation through Spatial Locality
扩展基于深度学习的感知系统与多源知识迁移 Gaole Dai PDF N/A Expanding Deep Learning-based Sensing Systems with Multi-Source Knowledge Transfer
从代码到游戏:使用大型语言模型进行游戏程序搜索的基准测试 Manuel Eberhardinger PDF N/A From Code to Play: Benchmarking Program Search for Games Using Large Language Models
使用大型语言模型进行基于概念代理的模型提取的提示工程指南 Siamak Khatami PDF N/A Prompt Engineering Guidance for Conceptual Agent-based Model Extraction using Large Language Models
桥型估计量的路径优化及其应用 Alessandro De Gregorio PDF N/A Pathwise optimization for bridge-type estimators and its applications
英国政治中的敌意检测:针对议员的网络攻击数据集 Mugdha Pandya PDF N/A Hostility Detection in UK Politics: A Dataset on Online Abuse Targeting MPs
AI4EF:建筑领域节能的人工智能 Alexandros Menelaos Tzortzis PDF N/A AI4EF: Artificial Intelligence for Energy Efficiency in the Building Sector
基准测试和增强机器人辅助食管切除术手术阶段识别模型 Yiping Li PDF N/A Benchmarking and Enhancing Surgical Phase Recognition Models for Robotic-Assisted Esophagectomy
INFP:双人对话中的音频驱动互动头部生成 Yongming Zhu PDF N/A INFP: Audio-Driven Interactive Head Generation in Dyadic Conversations
SocialMind:基于大型语言模型的主动式增强现实社交辅助系统,具备类人感知能力,支持现场实时互动 Bufang Yang PDF N/A SocialMind: LLM-based Proactive AR Social Assistive System with Human-like Perception for In-situ Live Interactions
动态图表示与对比学习在金融市场预测中的应用:整合时间演化和静态关系 Yunhua Pei PDF N/A Dynamic Graph Representation with Contrastive Learning for Financial Market Prediction: Integrating Temporal Evolution and Static Relations
真相面具:模型对医学图像中意外区域的敏感性 Théo Sourget PDF N/A Mask of truth: model sensitivity to unexpected regions of medical images
影响人工智能攻防动态的考量因素 Giulio Corsi PDF N/A Considerations Influencing Offense-Defense Dynamics From Artificial Intelligence
M$^{3}$D:一个用于基于文档的信息抽取的多模态、多语言和多任务数据集 Jiang Liu PDF N/A M$^{3}$D: A Multimodal, Multilingual and Multitask Dataset for Grounded Document-level Information Extraction
探索标签聚合对少数群体声音的影响:对数据集偏差和模型训练的启示 Mugdha Pandya PDF N/A Exploring the Influence of Label Aggregation on Minority Voices: Implications for Dataset Bias and Model Training
PriorMotion:基于栅格-矢量运动场先验的生成式类不可知运动预测 Kangan Qian PDF N/A PriorMotion: Generative Class-Agnostic Motion Prediction with Raster-Vector Motion Field Priors
光谱映射的注记 Tuğçe Gökdemir PDF N/A A Note on Spectral Map
在神经形态硬件上的多维谐波检索算法的深度展开 Vlad C. Andrei PDF N/A Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware
Marco-LLM:通过大规模多语言训练实现跨语言增强,连接不同语言 Lingfeng Ming PDF N/A Marco-LLM: Bridging Languages via Massive Multilingual Training for Cross-Lingual Enhancement
IF-MDM:用于高保真实时说话头生成的隐式面部运动扩散模型 Sejong Yang PDF N/A IF-MDM: Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation
基于合作回归网络的盲水下图像复原 Ozer Can Devecioglu PDF N/A Blind Underwater Image Restoration using Co-Operational Regressor Networks
LaserGuider:一种基于激光的深度神经网络物理后门攻击 Yongjie Xu PDF N/A LaserGuider: A Laser Based Physical Backdoor Attack against Deep Neural Networks
有限维扩散映射的行为有多好? Wenyu Bo PDF N/A How well behaved is finite dimensional Diffusion Maps?
MTMT:整合多种思维模式以形成思维树,从而强化大型语言模型 Changcheng Li PDF N/A MTMT: Consolidating Multiple Thinking Modes to Form a Thought Tree for Strengthening LLM
UNCOVER:自动驾驶车辆实时未知类别物体检测 Lars Schmarje PDF N/A UNCOVER: Unknown Class Object Detection for Autonomous Vehicles in Real-time
具有线性预算约束和部分反馈的安全高效在线凸优化 Shanqi Liu PDF N/A Safe and Efficient Online Convex Optimization with Linear Budget Constraints and Partial Feedback
探索应用于高级驾驶辅助系统的全卷积网络在高光谱成像分割中的应用 Jon Gutiérrez-Zaballa PDF N/A Exploring Fully Convolutional Networks for the Segmentation of Hyperspectral Imaging Applied to Advanced Driver Assistance Systems
面向投资组合优化的多目标模因算法中问题感知算子的分阶段(epoch)应用 Feijoo Colomine Durán PDF N/A Epoch-based Application of Problem-Aware Operators in a Multiobjective Memetic Algorithm for Portfolio Optimization
一个用于在复杂系统中发现分数阶微分方程的数据驱动框架 Xiangnan Yu PDF N/A A Data-Driven Framework for Discovering Fractional Differential Equations in Complex Systems
HyperDefect-YOLO:通过超图计算增强YOLO以实现工业缺陷检测 Zuo Zuo PDF N/A HyperDefect-YOLO: Enhance YOLO with HyperGraph Computation for Industrial Defect Detection
Exact:探索用于弱监督卫星图像时间序列语义分割的空间-时间感知线索 Hao Zhu PDF N/A Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation
通过强化学习进行上下文学习的演示选择 Xubin Wang PDF N/A Demonstration Selection for In-Context Learning via Reinforcement Learning
增强思维还是自动化技能:人力资本在生成式人工智能对创意任务影响中的不同作用 Meiling Huang PDF N/A Augmenting Minds or Automating Skills: The Differential Role of Human Capital in Generative AI's Impact on Creative Tasks
利用Stein恒等式进行局部曲率平滑以实现高效分数匹配 Genki Osada PDF N/A Local Curvature Smoothing with Stein's Identity for Efficient Score Matching
基于电子健康记录的数据驱动型糖尿病知识揭示与风险预测 Huadong Pang PDF N/A Electronic Health Records-Based Data-Driven Diabetes Knowledge Unveiling and Risk Prognosis