arXiv 2024-10-17 Papers

Title	Author	PDF Link	Code Repository
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens	Lijie Fan	PDF	N/A
UniDrive: Towards Universal Driving Perception Across Camera Configurations	Ye Li	PDF	N/A
DepthSplat: Connecting Gaussian Splatting and Depth	Haofei Xu	PDF	N/A
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation	Rongyao Fang	PDF	N/A
VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding	Runsen Xu	PDF	N/A
$γ-$MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models	Yaxin Luo	PDF	N/A
How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs	Guhao Feng	PDF	N/A
Diffusing States and Matching Scores: A New Framework for Imitation Learning	Runzhe Wu	PDF	N/A
Can MLLMs Understand the Deep Implication Behind Chinese Images?	Chenhao Zhang	PDF	N/A
AutoAL: Automated Active Learning with Differentiable Query Strategy Search	Yifeng Wang	PDF	N/A
Retrospective Learning from Interactions	Zizhao Chen	PDF	N/A
Influence Functions for Scalable Data Attribution in Diffusion Models	Bruno Mlodozeniec	PDF	N/A
Differentiable Robot Rendering	Ruoshi Liu	PDF	N/A
From Gradient Clipping to Normalization for Heavy Tailed SGD	Florian Hübler	PDF	N/A
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation	Chengyue Wu	PDF	N/A
SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction	Xuan Zhang	PDF	N/A
D-FINE: Redefine Regression Task in DETRs as Fine-grained Distribution Refinement	Yansong Peng	PDF	N/A
A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models	Qiaoyu Tang	PDF	N/A
Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative Decoding	Tan Dat Nguyen	PDF	N/A
ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization	Chen Bo Calvin Zhang	PDF	N/A
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs	Tianyu Guo	PDF	N/A
VidPanos: Generative Panoramic Videos from Casual Panning Videos	Jingwei Ma	PDF	N/A
The Disparate Benefits of Deep Ensembles	Kajetan Schweighofer	PDF	N/A
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control	Yujie Wei	PDF	N/A
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement	Hui Yuan	PDF	N/A
Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models	Mazda Moayeri	PDF	N/A
AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents	Ke Yang	PDF	N/A
Harnessing Webpage UIs for Text-Rich Visual Understanding	Junpeng Liu	PDF	N/A
Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning	Xiaodan Xing	PDF	N/A
Multi-style conversion for semantic segmentation of lesions in fundus images by adversarial attacks	Clément Playout	PDF	N/A
Artificial Kuramoto Oscillatory Neurons	Takeru Miyato	PDF	N/A
Guided Reinforcement Learning for Robust Multi-Contact Loco-Manipulation	Jean-Pierre Sleiman	PDF	N/A
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance	Mitsuhiko Nakamoto	PDF	N/A
Private Counterfactual Retrieval	Mohamed Nomeir	PDF	N/A
De-mark: Watermark Removal in Large Language Models	Ruibo Chen	PDF	N/A
ConsisSR: Delving Deep into Consistency in Diffusion-based Image Super-Resolution	Junhao Gu	PDF	N/A
A Watermark for Order-Agnostic Language Models	Ruibo Chen	PDF	N/A
BenTo: Benchmark Task Reduction with In-Context Transferability	Hongyu Zhao	PDF	N/A
A Pattern to Align Them All: Integrating Different Modalities to Define Multi-Modal Entities	Gianluca Apriceno	PDF	N/A
Adversarial Testing as a Tool for Interpretability: Length-based Overfitting of Elementary Functions in Transformers	Patrik Zavoral	PDF	N/A
Machine-Learning Analysis of Radiative Decays to Dark Matter at the LHC	Ernesto Arganda	PDF	N/A
Discrete distributions are learnable from metastable samples	Abhijith Jayakumar	PDF	N/A
Learning Graph Quantized Tokenizers for Transformers	Limei Wang	PDF	N/A
Arbitrarily-Conditioned Multi-Functional Diffusion for Multi-Physics Emulation	Da Long	PDF	N/A
Analyzing Deep Transformer Models for Time Series Forecasting via Manifold Learning	Ilya Kaufman	PDF	N/A
MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations	Liang Xu	PDF	N/A
Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions	Michael J. Q. Zhang	PDF	N/A
Looking Inward: Language Models Can Learn About Themselves by Introspection	Felix J Binder	PDF	N/A
Emphasizing Semantic Consistency of Salient Posture for Speech-Driven Gesture Generation	Fengqi Liu	PDF	N/A
PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment	Zekun Moore Wang	PDF	N/A
Quantity vs. Quality of Monolingual Source Data in Automatic Text Translation: Can It Be Too Little If It Is Too Good?	Idris Abdulmumin	PDF	N/A
DPLM-2: A Multimodal Diffusion Protein Language Model	Xinyou Wang	PDF	N/A
Optimal Quantization for Matrix Multiplication	Or Ordentlich	PDF	N/A
The Mystery of the Pathological Path-star Task for Language Models	Arvid Frydenlund	PDF	N/A
Change Detection in Multivariate data streams: Online Analysis with Kernel-QuantTree	Michelangelo Olmo Nogara Notarianni	PDF	N/A
Representing Model Weights with Language using Tree Experts	Eliahu Horwitz	PDF	N/A
Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors	Georgios Chochlakis	PDF	N/A
Enhancing Retail Sales Forecasting with Optimized Machine Learning Models	Priyam Ganguly	PDF	N/A
Is Prior-Free Black-Box Non-Stationary Reinforcement Learning Feasible?	Argyrios Gerogiannis	PDF	N/A
Probing the Latent Hierarchical Structure of Data via Diffusion Models	Antonio Sclocchi	PDF	N/A
Transformer Guided Coevolution: Improved Team Formation in Multiagent Adversarial Games	Pranav Rajbhandari	PDF	N/A
Rapid and Automated Alloy Design with Graph Neural Network-Powered LLM-Driven Multi-Agent Systems	Alireza Ghafarollahi	PDF	N/A
Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval	Yu Xia	PDF	N/A
Virtual Sensing for Real-Time Degradation Monitoring of Nuclear Systems: Leveraging DeepONet for Enhanced Sensing Coverage for Digital Twin-Enabling Technology	Raisa Bentay Hossain	PDF	N/A
GDeR: Safeguarding Efficiency, Balancing, and Robustness via Prototypical Graph Pruning	Guibin Zhang	PDF	N/A
Eyelid Fold Consistency in Facial Modeling	Lohit Petikam	PDF	N/A
MobA: A Two-Level Agent System for Efficient Mobile Task Automation	Zichen Zhu	PDF	N/A
CLIMB: Language-Guided Continual Learning for Task Planning with Iterative Model Building	Walker Byrnes	PDF	N/A
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures	Jinjie Ni	PDF	N/A
Privacy-Preserving Decentralized AI with Confidential Computing	Dayeol Lee	PDF	N/A
Supervised Kernel Thinning	Albert Gong	PDF	N/A
Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers	Yuchen Liang	PDF	N/A
Inferring the dynamics of quasi-reaction systems via nonlinear local mean-field approximations	Matteo Framba	PDF	N/A
Single-Timescale Multi-Sequence Stochastic Approximation Without Fixed Point Smoothness: Theories and Applications	Yue Huang	PDF	N/A
Improved Convergence Rate for Diffusion Probabilistic Models	Gen Li	PDF	N/A
Optimizing Probabilistic Conformal Prediction with Vectorized Non-Conformity Scores	Minxing Zheng	PDF	N/A
Improving Multi-modal Large Language Model through Boosting Vision Capabilities	Yanpeng Sun	PDF	N/A
Reducing the Transformer Architecture to a Minimum	Bernhard Bermeitinger	PDF	N/A
LLM-Human Pipeline for Cultural Context Grounding of Conversations	Rajkumar Pujari	PDF	N/A
DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation	Hanbo Cheng	PDF	N/A
Persistent Pre-Training Poisoning of LLMs	Yiming Zhang	PDF	N/A
Movie Gen: A Cast of Media Foundation Models	Adam Polyak	PDF	N/A
MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems	Nandan Thakur	PDF	N/A
Generation through the lens of learning theory	Vinod Raman	PDF	N/A
CrystalX: Ultra-Precision Crystal Structure Resolution and Error Correction Using Deep Learning	Kaipeng Zheng	PDF	N/A
On-device Federated Learning in Smartphones for Detecting Depression from Reddit Posts	Mustofa Ahmed	PDF	N/A
On the Role of Attention Heads in Large Language Model Safety	Zhenhong Zhou	PDF	N/A
Disjointness Violations in Wikidata	Ege Atacan Doğan	PDF	N/A
Unconstrained Model Merging for Enhanced LLM Reasoning	Yiming Zhang	PDF	N/A
Efficient Function Placement in Virtual Networks: An Online Learning Approach	Wei Huang	PDF	N/A
Exploring the Design Space of Visual Context Representation in Video MLLMs	Yifan Du	PDF	N/A
Jailbreaking LLM-Controlled Robots	Alexander Robey	PDF	N/A
Label-free prediction of fluorescence markers in bovine satellite cells using deep learning	Sania Sinha	PDF	N/A
Ab initio nonparametric variable selection for scalable Symbolic Regression with large $p$	Shengbin Ye	PDF	N/A
Pose-Based Sign Language Appearance Transfer	Amit Moryossef	PDF	N/A
Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion	Yijun Liang	PDF	N/A
HEALTH-PARIKSHA: Assessing RAG Models for Health Chatbots in Real-World Multilingual Settings	Varun Gumma	PDF	N/A
signwriting-evaluation: Effective Sign Language Evaluation via SignWriting	Amit Moryossef	PDF	N/A
ORCHID: A Chinese Debate Corpus for Target-Independent Stance Detection and Argumentative Dialogue Summarization	Xiutian Zhao	PDF	N/A
VL-GLUE: A Suite of Fundamental yet Challenging Visuo-Linguistic Reasoning Tasks	Shailaja Keyur Sampat	PDF	N/A
DiRecNetV2: A Transformer-Enhanced Network for Aerial Disaster Recognition	Demetris Shianios	PDF	N/A
ActionCOMET: A Zero-shot Approach to Learn Image-specific Commonsense Concepts about Actions	Shailaja Keyur Sampat	PDF	N/A
Selection of Filters for Photonic Crystal Spectrometer Using Domain-Aware Evolutionary Algorithms	Kirill Antonov	PDF	N/A
Red and blue language: Word choices in the Trump & Harris 2024 presidential debate	Philipp Wicke	PDF	N/A
Help Me Identify: Is an LLM+VQA System All We Need to Identify Visual Concepts?	Shailaja Keyur Sampat	PDF	N/A
A new approach for fine-tuning sentence transformers for intent classification and out-of-scope detection tasks	Tianyi Zhang	PDF	N/A
SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs	Yuling Gu	PDF	N/A
Automated Model Discovery for Tensional Homeostasis: Constitutive Machine Learning in Growth and Remodeling	Hagen Holthusen	PDF	N/A
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design	Chenyu Wang	PDF	N/A
An Active Learning Framework for Inclusive Generation by Large Language Models	Sabit Hassan	PDF	N/A
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation	Yiming Wang	PDF	N/A
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model	Siwei Wu	PDF	N/A
Scaling Wearable Foundation Models	Girish Narayanswamy	PDF	N/A
Normalizing self-supervised learning for provably reliable Change Point Detection	Alexandra Bazarova	PDF	N/A
Phenotype structuring in collective cell migration:a tutorial of mathematical models and methods	Tommaso Lorenzi	PDF	N/A
Enhanced Prompt-leveraged Weakly Supervised Cancer Segmentation based on Segment Anything	Joonhyeon Song	PDF	N/A
LoLDU: Low-Rank Adaptation via Lower-Diag-Upper Decomposition for Parameter-Efficient Fine-Tuning	Yiming Shi	PDF	N/A
Spatiotemporal Object Detection for Improved Aerial Vehicle Detection in Traffic Monitoring	Kristina Telegraph	PDF	N/A
Material Fingerprinting: Identifying and Predicting Perceptual Attributes of Material Appearance	Jiri Filip	PDF	N/A
MEGA: Memory-Efficient 4D Gaussian Splatting for Dynamic Scenes	Xinjie Zhang	PDF	N/A
H2OVL-Mississippi Vision Language Models Technical Report	Shaikat Galib	PDF	N/A
MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling	Yakun Zhu	PDF	N/A
All models are wrong, some are useful: Model Selection with Limited Labels	Patrik Okanovic	PDF	N/A
DN-4DGS: Denoised Deformable Network with Temporal-Spatial Aggregation for Dynamic Scene Rendering	Jiahao Lu	PDF	N/A
Transformer-Based Approaches for Sensor-Based Human Activity Recognition: Opportunities and Challenges	Clayton Souza Leite	PDF	N/A
Large Language Models as Narrative-Driven Recommenders	Lukas Eberhard	PDF	N/A
Towards Satellite Non-IID Imagery: A Spectral Clustering-Assisted Federated Learning Approach	Luyao Zou	PDF	N/A
Let Me Finish My Sentence: Video Temporal Grounding with Holistic Text Understanding	Jongbhin Woo	PDF	N/A
Text-Guided Multi-Property Molecular Optimization with a Diffusion Language Model	Yida Xiong	PDF	N/A
Deep-learning recognition and tracking of individual nanotubes in low-contrast microscopy videos	Vladimir Pimonov	PDF	N/A
OAH-Net: A Deep Neural Network for Hologram Reconstruction of Off-axis Digital Holographic Microscope	Wei Liu	PDF	N/A
Pseudo Dataset Generation for Out-of-Domain Multi-Camera View Recommendation	Kuan-Ying Lee	PDF	N/A
Co-Segmentation without any Pixel-level Supervision with Application to Large-Scale Sketch Classification	Nikolaos-Antonios Ypsilantis	PDF	N/A
EFX Exists for Three Types of Agents	Vishwa Prakash H. V.	PDF	N/A
Towards Better Performance in Incomplete LDL: Addressing Data Imbalance	Zhiqiang Kou	PDF	N/A
Sample Compression Hypernetworks: From Generalization Bounds to Meta-Learning	Benjamin Leblanc	PDF	N/A
DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation	Guosheng Zhao	PDF	N/A
RGB to Hyperspectral: Spectral Reconstruction for Enhanced Surgical Imaging	Tobias Czempiel	PDF	N/A
CCUP: A Controllable Synthetic Data Generation Pipeline for Pretraining Cloth-Changing Person Re-Identification Models	Yujian Zhao	PDF	N/A
360U-Former: HDR Illumination Estimation with Panoramic Adapted Vision Transformers	Jack Hilliard	PDF	N/A
Generative Location Modeling for Spatially Aware Object Insertion	Jooyeol Yun	PDF	N/A
Ornstein-Uhlenbeck Adaptation as a Mechanism for Learning in Brains and Machines	Jesus Garcia Fernandez	PDF	N/A
Enhancing Fact Retrieval in PLMs through Truthfulness	Paul Youssef	PDF	N/A
Integrating Temporal Representations for Dynamic Memory Retrieval and Management in Large Language Models	Yuki Hou	PDF	N/A
Adaptive and oblivious statistical adversaries are equivalent	Guy Blanc	PDF	N/A
RemoteDet-Mamba: A Hybrid Mamba-CNN Network for Multi-modal Object Detection in Remote Sensing Images	Kejun Ren	PDF	N/A
L3DG: Latent 3D Gaussian Diffusion	Barbara Roessle	PDF	N/A
Generative Adversarial Synthesis of Radar Point Cloud Scenes	Muhammad Saad Nawaz	PDF	N/A
Can Medical Vision-Language Pre-training Succeed with Purely Synthetic Data?	Che Liu	PDF	N/A
Bias in the Mirror : Are LLMs opinions robust to their own adversarial attacks ?	Virgile Rennard	PDF	N/A
PORTAL: Scalable Tabular Foundation Models via Content-Specific Tokenization	Marco Spinaci	PDF	N/A
CERES: Critical-Event Reconstruction via Temporal Scene Graph Completion	Efimia Panagiotaki	PDF	N/A
GeoCoder: Solving Geometry Problems by Generating Modular Code through Vision-Language Models	Aditya Sharma	PDF	N/A
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards	Xinze Li	PDF	N/A
MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs	Andreas Opedal	PDF	N/A
Integrating Large Language Models and Reinforcement Learning for Non-Linear Reasoning	Yoav Alon	PDF	N/A
SAda-Net: A Self-Supervised Adaptive Stereo Estimation CNN For Remote Sensing Image Data	Dominik Hirner	PDF	N/A
Enhancing Text Generation in Joint NLG/NLU Learning Through Curriculum Learning, Semi-Supervised Training, and Advanced Optimization Techniques	Rahimanuddin Shaik	PDF	N/A
Repetition Neurons: How Do Language Models Produce Repetitions?	Tatsuya Hiraoka	PDF	N/A
Deep Reinforcement Learning for Online Optimal Execution Strategies	Alessandro Micheli	PDF	N/A
Novelty-based Sample Reuse for Continuous Robotics Control	Ke Duan	PDF	N/A
Seeing Through VisualBERT: A Causal Adventure on Memetic Landscapes	Dibyanayan Bandyopadhyay	PDF	N/A
SemSim: Revisiting Weak-to-Strong Consistency from a Semantic Similarity Perspective for Semi-supervised Medical Image Segmentation	Shiao Xie	PDF	N/A
Day-Night Adaptation: An Innovative Source-free Adaptation Framework for Medical Image Segmentation	Ziyang Chen	PDF	N/A
SiamSeg: Self-Training with Contrastive Learning for Unsupervised Domain Adaptation in Remote Sensing	Bin Wang	PDF	N/A
Interpreting Temporal Graph Neural Networks with Koopman Theory	Michele Guerra	PDF	N/A
Object Pose Estimation Using Implicit Representation For Transparent Objects	Varun Burde	PDF	N/A
IterSelectTune: An Iterative Training Framework for Efficient Instruction-Tuning Data Selection	Jielin Song	PDF	N/A
Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach	Riccardo Poiani	PDF	N/A
Progressive Mixed-Precision Decoding for Efficient LLM Inference	Hao Mark Chen	PDF	N/A
Breaking the Manual Annotation Bottleneck: Creating a Comprehensive Legal Case Criticality Dataset through Semi-Automated Labeling	Ronja Stern	PDF	N/A
MedINST: Meta Dataset of Biomedical Instructions	Wenhan Han	PDF	N/A
Unlocking Legal Knowledge: A Multilingual Dataset for Judicial Summarization in Switzerland	Luca Rolshoven	PDF	N/A
Byzantine-Resilient Output Optimization of Multiagent via Self-Triggered Hybrid Detection Approach	Chenhang Yan	PDF	N/A
Augmentation Policy Generation for Image Classification Using Large Language Models	Ant Duru	PDF	N/A
Fast Estimation of Partial Dependence Functions using Trees	Jinyang Liu	PDF	N/A
Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR	Abhishek Gupta	PDF	N/A
NLIP_Lab-IITH Multilingual MT System for WAT24 MT Shared Task	Maharaj Brahma	PDF	N/A
Instruction-Driven Game Engine: A Poker Case Study	Hongqiu Wu	PDF	N/A
Similarity-Dissimilarity Loss with Supervised Contrastive Learning for Multi-label Classification	Guangming Huang	PDF	N/A
Temporal-Enhanced Multimodal Transformer for Referring Multi-Object Tracking and Segmentation	Changcheng Xiao	PDF	N/A
Solving Prior Distribution Mismatch in Diffusion Models via Optimal Transport	Zhanpeng Wang	PDF	N/A
Unsupervised Skull Segmentation via Contrastive MR-to-CT Modality Translation	Kamil Kwarciak	PDF	N/A
Performance of Gaussian Mixture Model Classifiers on Embedded Feature Spaces	Jeremy Chopin	PDF	N/A
Partially Trained Graph Convolutional Networks Resist Oversmoothing	Dimitrios Kelesis	PDF	N/A
Shavette: Low Power Neural Network Acceleration via Algorithm-level Error Detection and Undervolting	Mikael Rinkinen	PDF	N/A
Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models	Chengyu Du	PDF	N/A
RAMPA: Robotic Augmented Reality for Machine Programming and Automation	Fatih Dogangun	PDF	N/A
Attr-Int: A Simple and Effective Entity Alignment Framework for Heterogeneous Knowledge Graphs	Linyan Yang	PDF	N/A
MoR: Mixture of Ranks for Low-Rank Adaptation Tuning	Chuanyu Tang	PDF	N/A
Predicting Breast Cancer Survival: A Survival Analysis Approach Using Log Odds and Clinical Variables	Opeyemi Sheu Alamu	PDF	N/A
Towards Hybrid Intelligence in Journalism: Findings and Lessons Learnt from a Collaborative Analysis of Greek Political Rhetoric by ChatGPT and Humans	Thanasis Troboukis	PDF	N/A
Linguistically Grounded Analysis of Language Models using Shapley Head Values	Marcell Fekete	PDF	N/A
Cross-Lingual Auto Evaluation for Assessing Multilingual LLMs	Sumanth Doddapaneni	PDF	N/A
Metacognitive Monitoring: A Human Ability Beyond Generative Artificial Intelligence	Markus Huff	PDF	N/A
A Self-Constructing Multi-Expert Fuzzy System for High-dimensional Data Classification	Yingtao Ren	PDF	N/A
On the Use of Audio to Improve Dialogue Policies	Daniel Roncel	PDF	N/A
RescueADI: Adaptive Disaster Interpretation in Remote Sensing Images with Autonomous Agents	Zhuoran Liu	PDF	N/A
Railway LiDAR semantic segmentation based on intelligent semi-automated data annotation	Florian Wulff	PDF	N/A
Learning Counterfactual Distributions via Kernel Nearest Neighbors	Kyuseong Choi	PDF	N/A