Arxiv Daily

Arxiv 2025-01-09 Papers

| Title | Author | PDF link | Code repository |
| --- | --- | --- | --- |
| ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding | Xingyu Fu | PDF | N/A |
| An Empirical Study of Autoregressive Pre-training from Videos | Jathushan Rajasegaran | PDF | N/A |
| Decentralized Diffusion Models | David McAllister | PDF | N/A |
| Explainable AI-Enhanced Deep Learning for Pumpkin Leaf Disease Detection: A Comparative Analysis of CNN Architectures | Md. Arafat Alam Khandaker | PDF | N/A |
| Relative Pose Estimation through Affine Corrections of Monocular Depth Priors | Yifan Yu | PDF | N/A |
| Consistent Flow Distillation for Text-to-3D Generation | Runjie Yan | PDF | N/A |
| Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark | Yunzhuo Hao | PDF | N/A |
| A survey of textual cyber abuse detection using cutting-edge language models and large language models | Jose A. Diaz-Garcia | PDF | N/A |
| Progressive Growing of Video Tokenizers for Highly Compressed Latent Spaces | Aniruddha Mahapatra | PDF | N/A |
| The GAN is dead; long live the GAN! A Modern GAN Baseline | Yiwen Huang | PDF | N/A |
| From Simple to Complex Skills: The Case of In-Hand Object Reorientation | Haozhi Qi | PDF | N/A |
| $DPF^*$: improved Depth Potential Function for scale-invariant sulcal depth estimation | Maxime Dieudonné | PDF | N/A |
| Neuro-Symbolic AI in 2024: A Systematic Review | Brandon C. Colelough | PDF | N/A |
| Flatland Vision | Sameer Agarwal | PDF | N/A |
| Zero-1-to-G: Taming Pretrained 2D Diffusion Model for Direct 3D Generation | Xuyi Meng | PDF | N/A |
| From Images to Insights: Transforming Brain Cancer Diagnosis with Explainable AI | Md. Arafat Alam Khandaker | PDF | N/A |
| Entangled Mean Estimation in High-Dimensions | Ilias Diakonikolas | PDF | N/A |
| Using LLMs to Infer Non-Binary COVID-19 Sentiments of Chinese Micro-bloggers | Jerry Chongyi Hu | PDF | N/A |
| Uncertainty-aware Knowledge Tracing | Weihua Cheng | PDF | N/A |
| LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation | Xi Ye | PDF | N/A |
| Seeing Sound: Assembling Sounds from Visuals for Audio-to-Image Generation | Darius Petermann | PDF | N/A |
| A Novel Pathology Foundation Model by Mayo Clinic, Charité, and Aignostics | Maximilian Alber | PDF | N/A |
| TimeRL: Efficient Deep Reinforcement Learning with Polyhedral Dependence Graphs | Pedro F. Silvestre | PDF | N/A |
| On-line Policy Improvement using Monte-Carlo Search | Gerald Tesauro | PDF | N/A |
| TimeDP: Learning to Generate Multi-Domain Time Series with Domain Prompts | Yu-Hao Huang | PDF | N/A |
| BRATI: Bidirectional Recurrent Attention for Time-Series Imputation | Armando Collado-Villaverde | PDF | N/A |
| Performance of YOLOv7 in Kitchen Safety While Handling Knife | Athulya Sundaresan Geetha | PDF | N/A |
| Mechanistic understanding and validation of large AI models with SemanticLens | Maximilian Dreyer | PDF | N/A |
| FairCode: Evaluating Social Bias of LLMs in Code Generation | Yongkang Du | PDF | N/A |
| The global consensus on the risk management of autonomous driving | Sebastian Krügel | PDF | N/A |
| Integrating Explainable AI for Effective Malware Detection in Encrypted Network Traffic | Sileshi Nibret Zeleke | PDF | N/A |
| Large Physics Models: Towards a collaborative approach with Large Language Models and Foundation Models | Kristian G. Barman | PDF | N/A |
| Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance | Dimitrios Gerogiannis | PDF | N/A |
| A Portable Solution for Simultaneous Human Movement and Mobile EEG Acquisition: Readiness Potentials for Basketball Free-throw Shooting | Contreras-Altamirano | PDF | N/A |
| Accelerated Diffusion Models via Speculative Sampling | Valentin De Bortoli | PDF | N/A |
| Developing a Foundation of Vector Symbolic Architectures Using Category Theory | Nolan P Shaw | PDF | N/A |
| 1-2-1: Renaissance of Single-Network Paradigm for Virtual Try-On | Shuliang Ning | PDF | N/A |
| Search-o1: Agentic Search-Enhanced Large Reasoning Models | Xiaoxi Li | PDF | N/A |
| No-Regret Linear Bandits under Gap-Adjusted Misspecification | Chong Liu | PDF | N/A |
| On Corrigibility and Alignment in Multi Agent Games | Edmund Dable-Heath | PDF | N/A |
| CROPS: Model-Agnostic Training-Free Framework for Safe Image Synthesis with Latent Diffusion Models | Junha Park | PDF | N/A |
| JAQ: Joint Efficient Architecture Design and Low-Bit Quantization with Hardware-Software Co-Exploration | Mingzi Wang | PDF | N/A |
| Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction | Hantao Lou | PDF | N/A |
| The Bakers and Millers Game with Restricted Locations | Simon Krogmann | PDF | N/A |
| Stability and List-Replicability for Agnostic Learners | Ari Blonda | PDF | N/A |
| AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder | Samir Sadok | PDF | N/A |
| Knowledge Transfer in Model-Based Reinforcement Learning Agents for Efficient Multi-Task Learning | Dmytro Kuzmenko | PDF | N/A |
| The explanation dialogues: an expert focus study to understand requirements towards explanations within the GDPR | Laura State | PDF | N/A |
| Distributed Learning and Inference Systems: A Networking Perspective | Hesham G. Moussa | PDF | N/A |
| Optimizing Distributed Deployment of Mixture-of-Experts Model Inference in Serverless Computing | Mengfan Liu | PDF | N/A |
| Private Selection with Heterogeneous Sensitivities | Daniela Antonova | PDF | N/A |
| Comparison Study: Glacier Calving Front Delineation in Synthetic Aperture Radar Images With Deep Learning | Nora Gourmelon | PDF | N/A |
| Learning convolution operators on compact Abelian groups | Emilia Magnani | PDF | N/A |
| Off-Policy Evaluation and Counterfactual Methods in Dynamic Auction Environments | Ritam Guha | PDF | N/A |
| Solving the Catastrophic Forgetting Problem in Generalized Category Discovery | Xinzi Cao | PDF | N/A |
| CellViT++: Energy-Efficient and Adaptive Cell Segmentation and Classification Using Foundation Models | Fabian Hörst | PDF | N/A |
| Patch-GAN Transfer Learning with Reconstructive Models for Cloud Removal | Wanli Ma | PDF | N/A |
| Towards Balanced Continual Multi-Modal Learning in Human Pose Estimation | Jiaxuan Peng | PDF | N/A |
| Enhancing Plagiarism Detection in Marathi with a Weighted Ensemble of TF-IDF and BERT Embeddings for Low-Resource Language Processing | Atharva Mutsaddi | PDF | N/A |
| Automating the Detection of Code Vulnerabilities by Analyzing GitHub Issues | Daniele Cipollone | PDF | N/A |
| CallNavi: A Study and Challenge on Function Calling Routing and Invocation in Large Language Models | Yewei Song | PDF | N/A |
| From Scientific Texts to Verifiable Code: Automating the Process with Transformers | Changjie Wang | PDF | N/A |
| RAG-WM: An Efficient Black-Box Watermarking Approach for Retrieval-Augmented Generation of Large Language Models | Peizhuo Lv | PDF | N/A |
| Deriving Coding-Specific Sub-Models from LLMs using Resource-Efficient Pruning | Laura Puccioni | PDF | N/A |
| Online Prompt and Solver Selection for Program Synthesis | Yixuan Li | PDF | N/A |
| Domain-Incremental Semantic Segmentation for Autonomous Driving under Adverse Driving Conditions | Shishir Muralidhara | PDF | N/A |
| Optimized Sampling for Non-Line-of-Sight Imaging Using Modified Fast Fourier Transforms | Talha Sultan | PDF | N/A |
| Scaffold-SLAM: Structured 3D Gaussians for Simultaneous Localization and Photorealistic Mapping | Wen Tianci | PDF | N/A |
| Contrast-Free Myocardial Scar Segmentation in Cine MRI using Motion and Texture Fusion | Guang Yang | PDF | N/A |
| Is Your Autonomous Vehicle Safe? Understanding the Threat of Electromagnetic Signal Injection Attacks on Traffic Scene Perception | Wenhao Liao | PDF | N/A |
| FOCUS: Towards Universal Foreground Segmentation | Zuyao You | PDF | N/A |
| Automated external cervical resorption segmentation in cone-beam CT using local texture features | Sadhana Ravikumar | PDF | N/A |
| Optimizing Estonian TV Subtitles with Semi-supervised Learning and LLMs | Artem Fedorchenko | PDF | N/A |
| Harnessing Large Language and Vision-Language Models for Robust Out-of-Distribution Detection | Pei-Kang Lee | PDF | N/A |
| Light Transport-aware Diffusion Posterior Sampling for Single-View Reconstruction of 3D Volumes | Ludwic Leonard | PDF | N/A |
| Leveraging Large Language Models for Zero-shot Lay Summarisation in Biomedicine and Beyond | Tomas Goldsack | PDF | N/A |
| EVA-S2PLoR: A Secure Element-wise Multiplication Meets Logistic Regression on Heterogeneous Database | Tianle Tao | PDF | N/A |
| ParaRev: Building a dataset for Scientific Paragraph Revision annotated with revision instruction | Léane Jourdan | PDF | N/A |
| A Novel Approach to Scalable and Automatic Topic-Controlled Question Generation in Education | Ziqing Li | PDF | N/A |
| GLaM-Sign: Greek Language Multimodal Lip Reading with Integrated Sign Language Accessibility | Dimitris Kouremenos | PDF | N/A |
| MHAFF: Multi-Head Attention Feature Fusion of CNN and Transformer for Cattle Identification | Rabin Dulal | PDF | N/A |
| CoDe: Communication Delay-Tolerant Multi-Agent Collaboration via Dual Alignment of Intent and Timeliness | Shoucheng Song | PDF | N/A |
| Discovering Hidden Visual Concepts Beyond Linguistic Input in Infant Learning | Xueyi Ke | PDF | N/A |
| Design and Control of a Bipedal Robotic Character | Ruben Grandia | PDF | N/A |
| An Algorithmic Approach for Causal Health Equity: A Look at Race Differentials in Intensive Care Unit (ICU) Outcomes | Drago Plecko | PDF | N/A |
| HipyrNet: Hypernet-Guided Feature Pyramid network for mixed-exposure correction | Shaurya Singh Rathore | PDF | N/A |
| RadioTransformer: Accurate Radio Map Construction and Coverage Prediction | Yuxuan Li | PDF | N/A |
| Compression with Global Guidance: Towards Training-free High-Resolution MLLMs Acceleration | Xuyang Liu | PDF | N/A |
| FaceMe: Robust Blind Face Restoration with Personal Identification | Siyu Liu | PDF | N/A |
| De-centering the (Traditional) User: Multistakeholder Evaluation of Recommender Systems | Robin Burke | PDF | N/A |
| Bringing Order Amidst Chaos: On the Role of Artificial Intelligence in Secure Software Engineering | Matteo Esposito | PDF | N/A |
| Explainable AI based System for Supply Air Temperature Forecast | Marika Eik | PDF | N/A |
| Biomedical Relation Extraction via Adaptive Document-Relation Cross-Mapping and Concept Unique Identifier | Yufei Shang | PDF | N/A |
| A Systematic Literature Review on Deep Learning-based Depth Estimation in Computer Vision | Ali Rohan | PDF | N/A |
| CorrDiff: Adaptive Delay-aware Detector with Temporal Cue Inputs for Real-time Object Detection | Xiang Zhang | PDF | N/A |
| 3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering | Dewei Zhou | PDF | N/A |
| Learning In-Distribution Representations for Anomaly Detection | William T. Lunardi | PDF | N/A |
| Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model | Gregor Geigle | PDF | N/A |
| Improving the U-Net Configuration for Automated Delineation of Head and Neck Cancer on MRI | Andrei Iantsen | PDF | N/A |
| Constrained Optimization of Charged Particle Tracking with Multi-Agent Reinforcement Learning | Tobias Kortus | PDF | N/A |
| EquiBoost: An Equivariant Boosting Approach to Molecular Conformation Generation | Yixuan Yang | PDF | N/A |
| Optimizing Multitask Industrial Processes with Predictive Action Guidance | Naval Kishore Mehta | PDF | N/A |
| Robust Score Matching | Richard Schwank | PDF | N/A |
| Motion-X++: A Large-Scale Multimodal 3D Whole-body Human Motion Dataset | Yuhong Zhang | PDF | N/A |
| A 1Mb mixed-precision quantized encoder for image classification and patch-based compression | Van Thien Nguyen | PDF | N/A |
| Advancing ALS Applications with Large-Scale Pre-training: Dataset Development and Downstream Assessment | Haoyi Xiu | PDF | N/A |
| Hierarchical Decomposed Dual-domain Deep Learning for Sparse-View CT Reconstruction | Yoseob Han | PDF | N/A |
| ResPanDiff: Diffusion Model with Disentangled Modulations for Image Fusion | Shiqi Cao | PDF | N/A |
| Supervised Learning with Evolving Tasks and Performance Guarantees | Verónica Álvarez | PDF | N/A |
| Enhanced Quantile Regression with Spiking Neural Networks for Long-Term System Health Prognostics | David J Poland | PDF | N/A |
| End-to-End Deep Learning for Interior Tomography with Low-Dose X-ray CT | Yoseob Han | PDF | N/A |
| Comparison of Feature Learning Methods for Metadata Extraction from PDF Scholarly Documents | Zeyd Boukhers | PDF | N/A |
| DriVLM: Domain Adaptation of Vision-Language Models in Autonomous Driving | Xuran Zheng | PDF | N/A |
| Multimodal-to-Text Prompt Engineering in Large Language Models Using Feature Embeddings for GNSS Interference Characterization | Harshith Manjunath | PDF | N/A |
| Analyzing Memorization in Large Language Models through the Lens of Model Attribution | Tarun Ram Menta | PDF | N/A |
| TipSegNet: Fingertip Segmentation in Contactless Fingerprint Imaging | Laurenz Ruzicka | PDF | N/A |
| A Text-Based Knowledge-Embedded Soft Sensing Modeling Approach for General Industrial Process Tasks Based on Large Language Model | Shuo Tong | PDF | N/A |
| A Flexible and Scalable Framework for Video Moment Search | Chongzhi Zhang | PDF | N/A |
| Commonsense Video Question Answering through Video-Grounded Entailment Tree Reasoning | Huabin Liu | PDF | N/A |
| D3RM: A Discrete Denoising Diffusion Refinement Model for Piano Transcription | Hounsu Kim | PDF | N/A |
| LLaVA-Octopus: Unlocking Instruction-Driven Adaptive Projector Fusion for Video Understanding | Jiaxing Zhao | PDF | N/A |
| Improving Skeleton-based Action Recognition with Interactive Object Information | Hao Wen | PDF | N/A |
| Simultaneous emulation and downscaling with physically-consistent deep learning-based regional ocean emulators | Leonard Lupin-Jimenez | PDF | N/A |
| LearningFlow: Automated Policy Learning Workflow for Urban Driving with Large Language Models | Zengqi Peng | PDF | N/A |
| TAPFed: Threshold Secure Aggregation for Privacy-Preserving Federated Learning | Runhua Xu | PDF | N/A |
| SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution | Chengxing Xie | PDF | N/A |
| LongViTU: Instruction Tuning for Long-Form Video Understanding | Rujie Wu | PDF | N/A |
| Towards Fingerprint Mosaicking Artifact Detection: A Self-Supervised Deep Learning Approach | Laurenz Ruzicka | PDF | N/A |
| Enhancing Human-Like Responses in Large Language Models | Ethem Yağız Çalık | PDF | N/A |
| ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark | Ronghao Dang | PDF | N/A |
| A General Retrieval-Augmented Generation Framework for Multimodal Case-Based Reasoning Applications | Ofir Marom | PDF | N/A |
| Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation | Yingjie Chen | PDF | N/A |
| Finding Needles in Emb(a)dding Haystacks: Legal Document Retrieval via Bagging and SVR Ensembles | Kevin Bönisch | PDF | N/A |
| Continuous Knowledge-Preserving Decomposition for Few-Shot Continual Learning | Xiaojie Li | PDF | N/A |
| On Measuring Unnoticeability of Graph Adversarial Attacks: Observations, New Measure, and Applications | Hyeonsoo Jo | PDF | N/A |
| UAV-VLA: Vision-Language-Action System for Large Scale Aerial Mission Generation | Oleg Sautenkov | PDF | N/A |
| A Scalable System for Visual Analysis of Ocean Data | Toshit Jain | PDF | N/A |
| Quantum-enhanced causal discovery for a small number of samples | Yota Maeda | PDF | N/A |
| A High-accuracy Calibration Method of Transient TSEPs for Power Semiconductor Devices | Qinghao Zhang | PDF | N/A |
| Load Forecasting for Households and Energy Communities: Are Deep Learning Models Worth the Effort? | Lukas Moosbrugger | PDF | N/A |
| GiNet: Integrating Sequential and Context-Aware Learning for Battery Capacity Prediction | Sara Sameer | PDF | N/A |
| A CT Image Classification Network Framework for Lung Tumors Based on Pre-trained MobileNetV2 Model and Transfer learning, And Its Application and Market Analysis in the Medical field | Ziyang Gao | PDF | N/A |
| IPDN: Image-enhanced Prompt Decoding Network for 3D Referring Expression Segmentation | Qi Chen | PDF | N/A |
| TreeKV: Smooth Key-Value Cache Compression with Tree Structures | Ziwei He | PDF | N/A |
| CuRLA: Curriculum Learning Based Deep Reinforcement Learning for Autonomous Driving | Bhargava Uppuluri | PDF | N/A |
| V2C-CBM: Building Concept Bottlenecks with Vision-to-Concept Tokenizer | Hangzhou He | PDF | N/A |
| SensorQA: A Question Answering Benchmark for Daily-Life Monitoring | Benjamin Reichman | PDF | N/A |
| Self-Adaptive Ising Machines for Constrained Optimization | Corentin Delacour | PDF | N/A |
| Battling the Non-stationarity in Time Series Forecasting via Test-time Adaptation | HyunGi Kim | PDF | N/A |
| AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data | Haoran Zhu | PDF | N/A |
| Targeted Adversarial Denoising Autoencoders (TADA) for Neural Time Series Filtration | Benjamin J. Choi | PDF | N/A |
| Emergence of Painting Ability via Recognition-Driven Evolution | Yi Lin | PDF | N/A |
| VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models | Wenqian Cui | PDF | N/A |
| Demystifying Domain-adaptive Post-training for Financial LLMs | Zixuan Ke | PDF | N/A |
| Addressing Domain Shift via Imbalance-Aware Domain Adaptation in Embryo Development Assessment | Lei Li | PDF | N/A |
| Open Problems in Machine Unlearning for AI Safety | Fazl Barez | PDF | N/A |
| MORDA: A Synthetic Dataset to Facilitate Adaptation of Object Detectors to Unseen Real-target Domain While Preserving Performance on Real-source Domain | Hojun Lim | PDF | N/A |
| Seeing with Partial Certainty: Conformal Prediction for Robotic Scene Recognition in Built Environments | Yifan Xu | PDF | N/A |
| Non-asymptotic analysis of the performance of the penalized least trimmed squares in sparse models | Yijun Zuo | PDF | N/A |
| Step-by-Step Mastery: Enhancing Soft Constraint Following Ability of Large Language Models | Qingyu Ren | PDF | N/A |
| MambaHSI: Spatial-Spectral Mamba for Hyperspectral Image Classification | Yapeng Li | PDF | N/A |
| A New Perspective on Privacy Protection in Federated Learning with Granular-Ball Computing | Guannan Lai | PDF | N/A |
| Multi-Context Temporal Consistent Modeling for Referring Video Object Segmentation | Sun-Hyuk Choi | PDF | N/A |
| Plug-and-Play DISep: Separating Dense Instances for Scene-to-Pixel Weakly-Supervised Change Detection in High-Resolution Remote Sensing Images | Zhenghui Zhao | PDF | N/A |
| Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency | Shiji Zhao | PDF | N/A |
| Image2CADSeq: Computer-Aided Design Sequence and Knowledge Inference from Product Images | Xingang Li | PDF | N/A |
| Investigating Numerical Translation with Large Language Models | Wei Tang | PDF | N/A |
| FLowHigh: Towards Efficient and High-Quality Audio Super-Resolution with Single-Step Flow Matching | Jun-Hak Yun | PDF | N/A |
| SpecTf: Transformers Enable Data-Driven Imaging Spectroscopy Cloud Detection | Jake H. Lee | PDF | N/A |
| From Mesh Completion to AI Designed Crown | Golriz Hosseinimanesh | PDF | N/A |
| A Machine Learning Model for Crowd Density Classification in Hajj Video Frames | Afnan A. Shah | PDF | N/A |
| JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis | Jun-Hyeok Cha | PDF | N/A |
| Towards understanding the bias in decision trees | Nathan Phelps | PDF | N/A |
| SUGAR: Leveraging Contextual Confidence for Smarter Retrieval | Hanna Zubkova | PDF | N/A |
| Optimality and Adaptivity of Deep Neural Features for Instrumental Variable Regression | Juno Kim | PDF | N/A |
| Online Continual Learning: A Systematic Literature Review of Approaches, Challenges, and Benchmarks | Seyed Amir Bidaki | PDF | N/A |
| Quantifying Itch and its Impact on Sleep Using Machine Learning and Radio Signals | Michail Ouroutzoglou | PDF | N/A |
| A Look into How Machine Learning is Reshaping Engineering Models: the Rise of Analysis Paralysis, Optimal yet Infeasible Solutions, and the Inevitable Rashomon Paradox | MZ Naser | PDF | N/A |