arXiv 2024-12-05 Papers

Title (Chinese)	Author	PDF Link	Code Repository	Title (English)
立体无处不在：即使在立体或单目失败的情况下，也能实现鲁棒的零样本深度立体匹配	Luca Bartolomei	PDF	N/A	Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail
PaintScene4D：从文本提示生成一致的4D场景	Vinayak Gupta	PDF	N/A	PaintScene4D: Consistent 4D Scene Generation from Text Prompts
Turbo3D：超快文本转3D生成	Hanzhe Hu	PDF	N/A	Turbo3D: Ultra-fast Text-to-3D Generation
NVILA：高效前沿视觉语言模型	Zhijian Liu	PDF	N/A	NVILA: Efficient Frontier Visual Language Models
QUEEN：流式自由视角视频中动态高斯分布的量化高效编码	Sharath Girish	PDF	N/A	QUEEN: QUantized Efficient ENcoding of Dynamic Gaussians for Streaming Free-viewpoint Videos
VisionZip：在视觉语言模型中，更长更好，但并非必要	Senqiao Yang	PDF	N/A	VisionZip: Longer is Better but Not Necessary in Vision Language Models
UnZipLoRA：从单张图像中分离内容和风格	Chang Liu	PDF	N/A	UnZipLoRA: Separating Content and Style from a Single Image
DualPM：用于三维形状和姿态重建的双姿态-规范点图	Ben Kaye	PDF	N/A	DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction
MegaSaM：从随意动态视频中准确、快速且稳健地提取结构和运动	Zhengqi Li	PDF	N/A	MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos
4Real-Video：学习可泛化的照片级真实感4D视频扩散	Chaoyang Wang	PDF	N/A	4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion
LayerFusion：利用生成先验实现多层次文本到图像生成的和谐统一	Yusuf Dalva	PDF	N/A	LayerFusion: Harmonized Multi-Layer Text-to-Image Generation with Generative Priors
稀疏体素光栅化：实时高保真辐射场渲染	Cheng Sun	PDF	N/A	Sparse Voxels Rasterization: Real-time High-fidelity Radiance Field Rendering
Cubify Anything：室内3D物体检测的扩展	Justin Lazarow	PDF	N/A	Cubify Anything: Scaling Indoor 3D Object Detection
单目动态高斯泼溅快速但脆弱，而平滑运动有所帮助	Yiqing Liang	PDF	N/A	Monocular Dynamic Gaussian Splatting is Fast and Brittle but Smooth Motion Helps
HeatFormer：一种用于多视角人体网格恢复的神经优化器	Yuto Matsubara	PDF	N/A	HeatFormer: A Neural Optimizer for Multiview Human Mesh Recovery
Code-as-Monitor：约束感知的视觉编程，用于反应式和主动式机器人故障检测	Enshen Zhou	PDF	N/A	Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection
Aguvis：统一纯视觉代理，用于自主GUI交互	Yiheng Xu	PDF	N/A	Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
四平面分解视频自编码器	Mohammed Suhail	PDF	N/A	Four-Plane Factorized Video Autoencoders
NaVILA：用于导航的足式机器人视觉-语言-动作模型	An-Chieh Cheng	PDF	N/A	NaVILA: Legged Robot Vision-Language-Action Model for Navigation
p-MoD：通过渐进式比率衰减构建深度混合（Mixture-of-Depths）多模态大语言模型	Jun Zhang	PDF	N/A	p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay
MEMO：用于富有表现力的说话视频生成的记忆引导扩散	Longtao Zheng	PDF	N/A	MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation
EgoPlan-Bench2：一个用于多模态大语言模型在现实世界场景中规划的基准	Lu Qiu	PDF	N/A	EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios
DiCoDe：用于自回归视频生成与语言模型的扩散压缩深度令牌	Yizhuo Li	PDF	N/A	DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models
Moto：潜在运动令牌作为机器人操作的桥梁语言	Yi Chen	PDF	N/A	Moto: Latent Motion Token as the Bridging Language for Robot Manipulation
学习艺术签名：对称性发现与风格迁移	Emma Finn	PDF	N/A	Learning Artistic Signatures: Symmetry Discovery and Style Transfer
GenMAC：通过多智能体协作实现组合式文本到视频生成	Kaiyi Huang	PDF	N/A	GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration
面向实时开放词汇视频实例分割	Bin Yan	PDF	N/A	Towards Real-Time Open-Vocabulary Video Instance Segmentation
PBDyG：基于位置的动态高斯模型用于感知运动的着装人体化身	Shota Sasaki	PDF	N/A	PBDyG: Position Based Dynamic Gaussians for Motion-Aware Clothed Human Avatars
Divot：扩散模型赋能用于理解和生成的视频分词器	Yuying Ge	PDF	N/A	Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation
Infinity：通过按位自回归建模扩展高分辨率图像合成	Jian Han	PDF	N/A	Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
在图像中定位描述有助于零样本视觉识别	Shaunak Halbe	PDF	N/A	Grounding Descriptions in Images informs Zero-Shot Visual Recognition
Marvel：通过微调的离线策略加速安全的在线强化学习	Keru Chen	PDF	N/A	Marvel: Accelerating Safe Online Reinforcement Learning with Finetuned Offline Policy
CA-SSLR：面向通用语音处理的条件感知自监督学习表示	Yen-Ju Lu	PDF	N/A	CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing
Florence-VL：通过生成式视觉编码器和深度-广度融合增强视觉-语言模型	Jiuhai Chen	PDF	N/A	Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion
FedDUAL：一种结合自适应损失和动态聚合的双策略方法，用于缓解联邦学习中的数据异质性问题	Pranab Sahoo	PDF	N/A	FedDUAL: A Dual-Strategy with Adaptive Loss and Dynamic Aggregation for Mitigating Data Heterogeneity in Federated Learning
针对核心：通过直接LLM操纵攻击基于RAG的代理的简单有效方法	Xuying Li	PDF	N/A	Targeting the Core: A Simple and Effective Method to Attack RAG-based Agents via Direct LLM Manipulation
通过样本优化景观分析实现高效任务分组	Anshul Thakur	PDF	N/A	Efficient Task Grouping Through Samplewise Optimisation Landscape Analysis
使用数据和机器学习稳定并解决逆问题	Erik Burman	PDF	N/A	Stabilizing and Solving Inverse Problems using Data and Machine Learning
为无线联邦学习提供差分隐私：一种跨层框架	Jiayu Mao	PDF	N/A	Providing Differential Privacy for Federated Learning Over Wireless: A Cross-layer Framework
联邦自动化特征工程	Tom Overman	PDF	N/A	Federated Automated Feature Engineering
通过计算高效模型阶梯建立任务缩放法则	Akshita Bhagia	PDF	N/A	Establishing Task Scaling Laws via Compute-Efficient Model Ladders
在实验资源受限条件下，通过流水线评估实现异步批量贝叶斯优化的方法	Yujin Taguchi	PDF	N/A	Asynchronous Batch Bayesian Optimization with Pipelining Evaluations for Experimental Resource–constrained Conditions
用于高效三维占据预测的概率高斯叠加	Yuanhui Huang	PDF	N/A	Probabilistic Gaussian Superposition for Efficient 3D Occupancy Prediction
SeeGround：面向零样本开放词汇3D视觉定位的观察与定位	Rong Li	PDF	N/A	SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
EmbodiedOcc：面向基于视觉的在线场景理解的具身三维占据预测	Yuqi Wu	PDF	N/A	EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding
大型视觉语言模型的判别式微调	Yassine Ouali	PDF	N/A	Discriminative Fine-tuning of LVLMs
理解二分类器性能的搭便车指南	Anaïs Halin	PDF	N/A	A Hitchhiker's Guide to Understanding Performances of Two-Class Classifiers
可逆分子模拟用于训练经典和机器学习力场	Joe G Greener	PDF	N/A	Reversible molecular simulation for training classical and machine learning force fields
通过自回归特征和优势加权实现更精细的行为基础模型	Edoardo Cetin	PDF	N/A	Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting
自主网络防御的机器心智理论	Luke Swaby	PDF	N/A	Machine Theory of Mind for Autonomous Cyber-Defence
人工智能与创造力的内在过程	Jaan Aru	PDF	N/A	Artificial intelligence and the internal processes of creativity
提高并行性的近似Top-k算法	Oscar Key	PDF	N/A	Approximate Top-$k$ for Increased Parallelism
用于图建模和生成的多尺度节点嵌入	Riccardo Milocco	PDF	N/A	Multi-Scale Node Embeddings for Graph Modeling and Generation
ActFusion：一种用于动作分割和预测的统一扩散模型	Dayoung Gong	PDF	N/A	ActFusion: a Unified Diffusion Model for Action Segmentation and Anticipation
BhashaVerse：印度次大陆语言翻译生态系统	Vandan Mujadia	PDF	N/A	BhashaVerse: Translation Ecosystem for Indian Subcontinent Languages
分布鲁棒的执行性预测	Songkai Xue	PDF	N/A	Distributionally Robust Performative Prediction
RMD：通过无训练检索增强运动扩散实现更通用的人类运动生成的一个简单基线	Zhouyingcheng Liao	PDF	N/A	RMD: A Simple Baseline for More General Human Motion Generation via Training-free Retrieval-Augmented Motion Diffuse
使用非结构化知识进行检索增强的机器翻译	Jiaan Wang	PDF	N/A	Retrieval-Augmented Machine Translation with Unstructured Knowledge
基于似然调度的分数生成模型用于全三维PET图像重建	George Webber	PDF	N/A	Likelihood-Scheduled Score-Based Generative Modeling for Fully 3D PET Image Reconstruction
反思型教师：通过不确定性度量实现鸟瞰图下半监督多模态三维物体检测	Saheli Hazra	PDF	N/A	Reflective Teacher: Semi-Supervised Multimodal 3D Object Detection in Bird's-Eye-View via Uncertainty Measure
Liquid：语言模型是可扩展的多模态生成器	Junfeng Wu	PDF	N/A	Liquid: Language Models are Scalable Multi-modal Generators
约束条件下连续环境中的强化学习动作映射	Mirco Theile	PDF	N/A	Action Mapping for Reinforcement Learning in Continuous Environments with Constraints
多受试者图像合成作为单受试者PET图像重建的生成先验	George Webber	PDF	N/A	Multi-Subject Image Synthesis as a Generative Prior for Single-Subject PET Image Reconstruction
GRAM：在深度强化学习中通过稳健适应模块实现泛化	James Queeney	PDF	N/A	GRAM: Generalization in Deep RL with a Robust Adaptation Module
基于生成模型的全三维PET图像条件扩散采样重建	George Webber	PDF	N/A	Generative-Model-Based Fully 3D PET Image Reconstruction by Conditional Diffusion Sampling
超拟合现象：为开放式文本生成锐化和稳定大型语言模型	Fredrik Carlsson	PDF	N/A	The Hyperfitting Phenomenon: Sharpening and Stabilizing LLMs for Open-Ended Text Generation
FlashSloth：通过嵌入式视觉压缩实现的高效多模态大语言模型	Bo Tong	PDF	N/A	FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression
大语言模型（LLMs）的Densing定律	Chaojun Xiao	PDF	N/A	Densing Law of LLMs
LocalSR：局部区域图像超分辨率	Bo Ji	PDF	N/A	LocalSR: Image Super-Resolution in Local Region
The Tile：用于二分类的排名分数二维图	Sébastien Piérard	PDF	N/A	The Tile: A 2D Map of Ranking Scores for Two-Class Classification
ALMA：最小注释对齐	Michihiro Yasunaga	PDF	N/A	ALMA: Alignment with Minimal Annotation
面向零样本的三维异常定位	Yizhou Wang	PDF	N/A	Towards Zero-shot 3D Anomaly Localization
SwiftEdit：通过一步扩散实现闪电般快速的文本引导图像编辑	Trong-Tung Nguyen	PDF	N/A	SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion
T2I-FactualBench：利用知识密集型概念评估文本到图像模型的真实性基准测试	Ziwei Huang	PDF	N/A	T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts
结构感知风格化图像合成在鲁棒医学图像分割中的应用	Jie Bao	PDF	N/A	Structure-Aware Stylized Image Synthesis for Robust Medical Image Segmentation
SIDA：利用大型多模态模型进行社交媒体图像深度伪造检测、定位与解释	Zhenglin Huang	PDF	N/A	SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model
数学推理的进化预提示优化	Mathurin Videau	PDF	N/A	Evolutionary Pre-Prompt Optimization for Mathematical Reasoning
针对点参考空间数据的深度因果推断与连续处理	Ziyang Jiang	PDF	N/A	Deep Causal Inference for Point-referenced Spatial Data with Continuous Treatments
可学习无穷泰勒高斯函数用于动态视图渲染	Bingbing Hu	PDF	N/A	Learnable Infinite Taylor Gaussian for Dynamic View Rendering
HumanEdit：一个基于指令的图像编辑高质量人类奖励数据集	Jinbin Bai	PDF	N/A	HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing
基于估计姿态和遮挡误差的定向硬样本合成以提升物体姿态估计	Alan Li	PDF	N/A	Targeted Hard Sample Synthesis Based on Estimated Pose and Occlusion Error for Improved Object Pose Estimation
阿拉伯稳定语言模型：将稳定语言模型2 1.6B适配到阿拉伯语	Zaid Alyafeai	PDF	N/A	Arabic Stable LM: Adapting Stable LM 2 1.6B to Arabic
向量值预测的复杂性：从线性模型到随机凸优化	Matan Schliserman	PDF	N/A	Complexity of Vector-valued Prediction: From Linear Models to Stochastic Convex Optimization
从野生动物视频中进行强化学习	Elliot Chane-Sane	PDF	N/A	Reinforcement Learning from Wild Animal Videos
PoTable：像人类分析师一样在基于表格的推理中进行规范化编程	Qingyang Mao	PDF	N/A	PoTable: Programming Standardly on Table-based Reasoning Like a Human Analyst
端到端语音翻译的表示净化	Chengwei Zhang	PDF	N/A	Representation Purification for End-to-End Speech Translation
SynFinTabs：一个用于信息和表格提取的合成金融表格数据集	Ethan Bradley	PDF	N/A	SynFinTabs: A Dataset of Synthetic Financial Tables for Information and Table Extraction
Aya Expanse：结合研究突破，开创多语言新前沿	John Dang	PDF	N/A	Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier
通过监督对比领域自适应提升全切片图像分类	Ilán Carretero	PDF	N/A	Enhancing Whole Slide Image Classification through Supervised Contrastive Domain Adaptation
SCADE：可扩展的命令行异常检测引擎	Vaishali Vinay	PDF	N/A	SCADE: Scalable Command-line Anomaly Detection Engine
在密集环境中终身导航的瞬态多智能体路径寻找	Jonathan Morag	PDF	N/A	Transient Multi-Agent Path Finding for Lifelong Navigation in Dense Environments
CLINICSUM：利用语言模型从医患对话中生成临床摘要	Subash Neupane	PDF	N/A	CLINICSUM: Utilizing Language Models for Generating Clinical Summaries from Patient-Doctor Conversations
通过几何聚合的2D视觉特征进行3D部件分割	Marco Garosi	PDF	N/A	3D Part Segmentation via Geometric Aggregation of 2D Visual Features
鲁棒分类的有趣特性	Bernd Prach	PDF	N/A	Intriguing Properties of Robust Classification
GigaHands：一个大规模标注的双手活动数据集	Rao Fu	PDF	N/A	GigaHands: A Massive Annotated Dataset of Bimanual Hand Activities
量化分割一切模型的极限：分析分割树状和低对比度结构的挑战	Yixin Zhang	PDF	N/A	Quantifying the Limits of Segment Anything Model: Analyzing Challenges in Segmenting Tree-Like and Low-Contrast Structures
LMDM：用于三维分子生成的潜在分子扩散模型	Xiang Chen	PDF	N/A	LMDM: Latent Molecular Diffusion Model For 3D Molecule Generation
VASCAR：通过视觉感知自校正实现内容感知布局生成	Jiahao Zhang	PDF	N/A	VASCAR: Content-Aware Layout Generation via Visual-Aware Self-Correction
通过主题建模探索哥伦比亚哲学史	Juan R. Loaiza	PDF	N/A	A History of Philosophy in Colombia through Topic Modelling
在意大利医疗大型语言模型聊天机器人中使用RAG和NMISS处理幻觉	Maria Paola Priola	PDF	N/A	Addressing Hallucinations with RAG and NMISS in Italian Healthcare LLM Chatbots
DEIM：具有改进匹配的DETR，用于快速收敛	Shihua Huang	PDF	N/A	DEIM: DETR with Improved Matching for Fast Convergence
HyperMARL：用于多智能体强化学习的自适应超网络	Kale-ab Abebe Tessera	PDF	N/A	HyperMARL: Adaptive Hypernetworks for Multi-Agent RL
基于性能的排名理论基础	Sébastien Piérard	PDF	N/A	Foundations of the Theory of Performance-Based Ranking
使用LoRA专家混合为多模态语义分割定制Segment Anything模型	Chenyang Zhu	PDF	N/A	Customize Segment Anything Model for Multi-Modal Semantic Segmentation with Mixture of LoRA Experts
对齐音乐符号与歌词转录	Eliseo Fuentes-Martínez	PDF	N/A	Aligned Music Notation and Lyrics Transcription
利用未标记的sEMG信号进行肌肉力预测的物理信息深度学习	Shuhao Ma	PDF	N/A	Physics-informed Deep Learning for Muscle Force Prediction with Unlabeled sEMG Signals
一个用于翻译中介对话的上下文感知框架	José Pombal	PDF	N/A	A Context-aware Framework for Translation-mediated Conversations
PANGAEA：一个全球性和包容性的地理空间基础模型基准	Valerio Marsocci	PDF	N/A	PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models
歌词音乐中关键词与强拍之间的关系	Callie C. Liao	PDF	N/A	Relationships between Keywords and Strong Beats in Lyrical Music
Hipandas：通过与全色图像融合实现高光谱图像联合去噪与超分辨率	Shuang Xu	PDF	N/A	Hipandas: Hyperspectral Image Joint Denoising and Super-Resolution by Image Fusion with the Panchromatic Image
AL-QASIDA：系统性分析大型语言模型在阿拉伯语方言中的质量与准确性	Nathaniel R. Robinson	PDF	N/A	AL-QASIDA: Analyzing LLM Quality and Accuracy Systematically in Dialectal Arabic
定向结构适应以克服统计冲突并实现持续学习	Zeki Doruk Erden	PDF	N/A	Directed Structural Adaptation to Overcome Statistical Conflicts and Enable Continual Learning
教学视频生成	Yayuan Li	PDF	N/A	Instructional Video Generation
利用大型语言模型生成特定课程的语义注释学习对象	Dominic Lohr	PDF	N/A	Leveraging Large Language Models to Generate Course-specific Semantically Annotated Learning Objects
使用GAN和频谱损失建模眼球注视速度轨迹以提高逼真度	Shailendra Bhandari	PDF	N/A	Modeling Eye Gaze Velocity Trajectories using GANs with Spectral Loss for Enhanced Fidelity
线性判别分析在信用评分中的应用：一种透明的混合模型方法	Md Shihab Reza	PDF	N/A	Linear Discriminant Analysis in Credit Scoring: A Transparent Hybrid Model Approach
SKIM：任意位量化推动后训练量化的极限	Runsheng Bai	PDF	N/A	SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization
基于渐进信息披露的多层隐私保护记录链接与人工审核	Florens Rohde	PDF	N/A	Multi-Layer Privacy-Preserving Record Linkage with Clerical Review based on gradual information disclosure
固定均值高斯过程用于后验贝叶斯深度学习	Luis A. Ortega	PDF	N/A	Fixed-Mean Gaussian Processes for Post-hoc Bayesian Deep Learning
Bench-CoE：一个基于基准的专家协作框架	Yuanshuai Wang	PDF	N/A	Bench-CoE: a Framework for Collaboration of Experts from Benchmark
多类分类算法中风险评估的深入研究	Disha Ghandwani	PDF	N/A	An In-Depth Examination of Risk Assessment in Multi-Class Classification Algorithms
论二进制函数相似性系统鲁棒性的缺失	Gianluca Capozzi	PDF	N/A	On the Lack of Robustness of Binary Function Similarity Systems
LossVal：神经网络的高效数据估值	Tim Wibiral	PDF	N/A	LossVal: Efficient Data Valuation for Neural Networks
不稳定非线性随机系统闭环辨识的非渐近界	Seth Siriya	PDF	N/A	Non-Asymptotic Bounds for Closed-Loop Identification of Unstable Nonlinear Stochastic Systems
使用事件和帧的频率自适应低延迟目标检测	Haitian Zhang	PDF	N/A	Frequency-Adaptive Low-Latency Object Detection Using Events and Frames
MultiTASC++：一种面向基于边缘的多设备级联推理的持续自适应调度器	Sokratis Nikolaidis	PDF	N/A	MultiTASC++: A Continuously Adaptive Scheduler for Edge-Based Multi-Device Cascade Inference
AnyDressing：通过潜在扩散模型实现可定制的多服装虚拟试穿	Xinghui Li	PDF	N/A	AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models
如果你无法使用它们，那就回收它们：大规模优化合并以缓解性能权衡	Muhammad Khalifa	PDF	N/A	If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs
利用深度学习和微流控技术在线估计聚合物熔体流变参数的方法论	Juan Sandubete-López	PDF	N/A	Methodology for Online Estimation of Rheological Parameters in Polymer Melts Using Deep Learning and Microfluidics
通过可靠性对齐减少工具幻觉	Hongshen Xu	PDF	N/A	Reducing Tool Hallucination via Reliability Alignment
通过概率景观中的锐度理解生成模型中的记忆化	Dongjae Jeon	PDF	N/A	Understanding Memorization in Generative Models via Sharpness in Probability Landscapes
Monet：用于Transformer的单语义专家混合模型	Jungwoo Park	PDF	N/A	Monet: Mixture of Monosemantic Experts for Transformers
使用图像比较进行多语言文档中的文本变化检测	Doyoung Park	PDF	N/A	Text Change Detection in Multilingual Documents Using Image Comparison
组合生成多物理场与多组分模拟	Tao Zhang	PDF	N/A	Compositional Generative Multiphysics and Multi-component Simulation
用于卫星图像恢复的深度先验方法，具有精确的不确定性估计	Biquard Maud	PDF	N/A	Deep priors for satellite image restoration with accurate uncertainties
DeepFEA：用于预测瞬态有限元分析解决方案的深度学习	Georgios Triantafyllou	PDF	N/A	DeepFEA: Deep Learning for Prediction of Transient Finite Element Analysis Solutions
CrossSDF：通过横截面进行薄结构的3D重建	Thomas Walker	PDF	N/A	CrossSDF: 3D Reconstruction of Thin Structures From Cross-Sections
GRAF：基于事实增强的法律问答图检索	Cristian-George Crăciun	PDF	N/A	GRAF: Graph Retrieval Augmented by Facts for Legal Question Answering
MVUDA：多视角行人检测的无监督域自适应	Erik Brorsson	PDF	N/A	MVUDA: Unsupervised Domain Adaptation for Multi-view Pedestrian Detection
热成像与RGB图像在风力涡轮机损伤检测中相辅相成	Serhii Svystun	PDF	N/A	Thermal and RGB Images Work Better Together in Wind Turbine Damage Detection
使用分层微调数据的迁移学习对撒哈拉以南非洲成人胶质瘤进行分割	Abhijeet Parida	PDF	N/A	Adult Glioma Segmentation in Sub-Saharan Africa using Transfer Learning on Stratified Finetuning Data
通过背景操作符增强大型语言模型中的数学推理能力	Jiajun Chen	PDF	N/A	Enhancing Mathematical Reasoning in LLMs with Background Operators
预训练、对齐与解耦：利用大型语言模型赋能序列推荐	Yuhao Wang	PDF	N/A	Pre-train, Align, and Disentangle: Empowering Sequential Recommendation with Large Language Models
缺失的旋律：人工智能音乐生成及其对全球南方的“几乎”完全忽视	Atharva Mehta	PDF	N/A	Missing Melodies: AI Music Generation and its "Nearly" Complete Omission of the Global South
D-LORD 用于运动风格化	Meenakshi Gupta	PDF	N/A	D-LORD for Motion Stylization
HyperFLINT：基于超网络的流场估计与时间插值用于科学集合可视化	Hamid Gadirov	PDF	N/A	HyperFLINT: Hypernetwork-based Flow Estimation and Temporal Interpolation for Scientific Ensemble Visualization
基于磁共振成像特征的亚型分类与模型集成以提升脑肿瘤分割效果	Zhifan Jiang	PDF	N/A	Magnetic Resonance Imaging Feature-Based Subtyping and Model Ensemble for Enhanced Brain Tumor Segmentation
代理型大型语言模型系统的实际考虑	Chris Sypherd	PDF	N/A	Practical Considerations for Agentic LLM Systems
GEITje 7B Ultra：荷兰语对话模型	Bram Vanroy	PDF	N/A	GEITje 7B Ultra: A Conversational Model for Dutch
LossAgent：利用LLM代理实现图像处理中任意优化目标	Bingchen Li	PDF	N/A	LossAgent: Towards Any Optimization Objectives for Image Processing with LLM Agents
BodyMetric：评估文本到图像生成中人体逼真度	Nefeli Andreou	PDF	N/A	BodyMetric: Evaluating the Realism of Human Bodies in Text-to-Image Generation
开放世界组合零样本学习的统一框架	Hirunima Jayasekara	PDF	N/A	Unified Framework for Open-World Compositional Zero-shot Learning
可学习的相似性与差异性引导的对称非负矩阵分解	Wenlong Lyu	PDF	N/A	Learnable Similarity and Dissimilarity Guided Symmetric Non-Negative Matrix Factorization
移动网络中的联邦学习：一项关于流量预测的综合案例研究	Nikolaos Pavlidis	PDF	N/A	Federated Learning in Mobile Networks: A Comprehensive Case Study on Traffic Forecasting
通过领域随机化和元强化学习实现可泛化的自主渗透测试	Shicheng Zhou	PDF	N/A	Towards Generalizable Autonomous Penetration Testing via Domain Randomization and Meta-Reinforcement Learning
SoRA：用于领域泛化表示学习的奇异值分解低秩适应	Seokju Yun	PDF	N/A	SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning
距离自适应的四元数知识图谱嵌入与双向旋转	Weihua Wang	PDF	N/A	Distance-Adaptive Quaternion Knowledge Graph Embedding with Bidirectional Rotation
你的模型能理解基因吗？针对生物和文本模型的一个基因特性基准测试	Yoav Kan-Tor	PDF	N/A	Does your model understand genes? A benchmark of gene properties for biological and text models
低空经济中的综合感知与通信：一种深度强化学习方法	Xiaowen Ye	PDF	N/A	Integrated Sensing and Communications for Low-Altitude Economy: A Deep Reinforcement Learning Approach
TransAdapter：以特征为中心的无监督域适应的视觉变换器	A. Enes Doruk	PDF	N/A	TransAdapter: Vision Transformer for Feature-Centric Unsupervised Domain Adaptation
边界引导学习在空间转录组学中基因表达预测的应用	Mingcheng Qu	PDF	N/A	Boundary-Guided Learning for Gene Expression Prediction in Spatial Transcriptomics
ProtDAT：一个从任何蛋白质文本描述进行蛋白质序列设计的统一框架	Xiao-Yu Guo	PDF	N/A	ProtDAT: A Unified Framework for Protein Sequence Design from Any Protein Text Description
自动生成心电图数据医疗报告：利用深度学习连接医学文本与信号处理	Amnon Bleich	PDF	N/A	Automated Medical Report Generation for ECG Data: Bridging Medical Text and Signal Processing with Deep Learning
空间到政策：利用地理空间数据进行可扩展的砖窑检测与自动合规监测	Zeel B Patel	PDF	N/A	Space to Policy: Scalable Brick Kiln Detection and Automatic Compliance Monitoring with Geospatial Data
图神经网络需要聚类-归一化-激活模块	Arseny Skryagin	PDF	N/A	Graph Neural Networks Need Cluster-Normalize-Activate Modules
ZipAR：通过空间局部性加速自回归图像生成	Yefei He	PDF	N/A	ZipAR: Accelerating Autoregressive Image Generation through Spatial Locality
扩展基于深度学习的感知系统与多源知识迁移	Gaole Dai	PDF	N/A	Expanding Deep Learning-based Sensing Systems with Multi-Source Knowledge Transfer
从代码到游戏：使用大型语言模型进行游戏程序搜索的基准测试	Manuel Eberhardinger	PDF	N/A	From Code to Play: Benchmarking Program Search for Games Using Large Language Models
使用大型语言模型进行基于概念代理的模型提取的提示工程指南	Siamak Khatami	PDF	N/A	Prompt Engineering Guidance for Conceptual Agent-based Model Extraction using Large Language Models
桥型估计量的路径优化及其应用	Alessandro De Gregorio	PDF	N/A	Pathwise optimization for bridge-type estimators and its applications
英国政治中的敌意检测：针对议员的网络辱骂数据集	Mugdha Pandya	PDF	N/A	Hostility Detection in UK Politics: A Dataset on Online Abuse Targeting MPs
AI4EF：建筑领域节能的人工智能	Alexandros Menelaos Tzortzis	PDF	N/A	AI4EF: Artificial Intelligence for Energy Efficiency in the Building Sector
基准测试和增强机器人辅助食管切除术手术阶段识别模型	Yiping Li	PDF	N/A	Benchmarking and Enhancing Surgical Phase Recognition Models for Robotic-Assisted Esophagectomy
INFP：双人对话中的音频驱动互动头部生成	Yongming Zhu	PDF	N/A	INFP: Audio-Driven Interactive Head Generation in Dyadic Conversations
SocialMind：基于大型语言模型的主动式增强现实社交辅助系统，具备类人感知能力，支持现场实时互动	Bufang Yang	PDF	N/A	SocialMind: LLM-based Proactive AR Social Assistive System with Human-like Perception for In-situ Live Interactions
动态图表示与对比学习在金融市场预测中的应用：整合时间演化和静态关系	Yunhua Pei	PDF	N/A	Dynamic Graph Representation with Contrastive Learning for Financial Market Prediction: Integrating Temporal Evolution and Static Relations
真相面具：模型对医学图像中意外区域的敏感性	Théo Sourget	PDF	N/A	Mask of truth: model sensitivity to unexpected regions of medical images
影响人工智能攻防动态的考量因素	Giulio Corsi	PDF	N/A	Considerations Influencing Offense-Defense Dynamics From Artificial Intelligence
M$^{3}$D：一个用于基于文档的信息抽取的多模态、多语言和多任务数据集	Jiang Liu	PDF	N/A	M$^{3}$D: A Multimodal, Multilingual and Multitask Dataset for Grounded Document-level Information Extraction
探索标签聚合对少数群体声音的影响：对数据集偏差和模型训练的启示	Mugdha Pandya	PDF	N/A	Exploring the Influence of Label Aggregation on Minority Voices: Implications for Dataset Bias and Model Training
PriorMotion：基于栅格-矢量运动场先验的生成式类不可知运动预测	Kangan Qian	PDF	N/A	PriorMotion: Generative Class-Agnostic Motion Prediction with Raster-Vector Motion Field Priors
光谱映射的注记	Tuğçe Gökdemir	PDF	N/A	A Note on Spectral Map
在神经形态硬件上的多维谐波检索算法的深度展开	Vlad C. Andrei	PDF	N/A	Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware
Marco-LLM：通过大规模多语言训练实现跨语言增强，连接不同语言	Lingfeng Ming	PDF	N/A	Marco-LLM: Bridging Languages via Massive Multilingual Training for Cross-Lingual Enhancement
IF-MDM：用于高保真实时说话头生成的隐式面部运动扩散模型	Sejong Yang	PDF	N/A	IF-MDM: Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation
基于合作回归网络的盲水下图像复原	Ozer Can Devecioglu	PDF	N/A	Blind Underwater Image Restoration using Co-Operational Regressor Networks
LaserGuider：一种基于激光的深度神经网络物理后门攻击	Yongjie Xu	PDF	N/A	LaserGuider: A Laser Based Physical Backdoor Attack against Deep Neural Networks
有限维扩散映射的行为有多好？	Wenyu Bo	PDF	N/A	How well behaved is finite dimensional Diffusion Maps?
MTMT：整合多种思维模式以形成思维树，从而强化大型语言模型	Changcheng Li	PDF	N/A	MTMT: Consolidating Multiple Thinking Modes to Form a Thought Tree for Strengthening LLM
UNCOVER：自动驾驶车辆实时未知类别物体检测	Lars Schmarje	PDF	N/A	UNCOVER: Unknown Class Object Detection for Autonomous Vehicles in Real-time
具有线性预算约束和部分反馈的安全高效在线凸优化	Shanqi Liu	PDF	N/A	Safe and Efficient Online Convex Optimization with Linear Budget Constraints and Partial Feedback
探索应用于高级驾驶辅助系统的全卷积网络在高光谱成像分割中的应用	Jon Gutiérrez-Zaballa	PDF	N/A	Exploring Fully Convolutional Networks for the Segmentation of Hyperspectral Imaging Applied to Advanced Driver Assistance Systems
投资组合优化的多目标模因算法中问题感知算子的基于周期的应用	Feijoo Colomine Durán	PDF	N/A	Epoch-based Application of Problem-Aware Operators in a Multiobjective Memetic Algorithm for Portfolio Optimization
一个用于在复杂系统中发现分数阶微分方程的数据驱动框架	Xiangnan Yu	PDF	N/A	A Data-Driven Framework for Discovering Fractional Differential Equations in Complex Systems
HyperDefect-YOLO：通过超图计算增强YOLO以实现工业缺陷检测	Zuo Zuo	PDF	N/A	HyperDefect-YOLO: Enhance YOLO with HyperGraph Computation for Industrial Defect Detection
Exact：探索用于弱监督卫星图像时间序列语义分割的空间-时间感知线索	Hao Zhu	PDF	N/A	Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation
通过强化学习进行上下文学习的演示选择	Xubin Wang	PDF	N/A	Demonstration Selection for In-Context Learning via Reinforcement Learning
增强思维还是自动化技能：人力资本在生成式人工智能对创意任务影响中的不同作用	Meiling Huang	PDF	N/A	Augmenting Minds or Automating Skills: The Differential Role of Human Capital in Generative AI's Impact on Creative Tasks
利用Stein恒等式进行局部曲率平滑以实现高效分数匹配	Genki Osada	PDF	N/A	Local Curvature Smoothing with Stein's Identity for Efficient Score Matching
基于电子健康记录的数据驱动型糖尿病知识揭示与风险预测	Huadong Pang	PDF	N/A	Electronic Health Records-Based Data-Driven Diabetes Knowledge Unveiling and Risk Prognosis