WSDM2023 Paper List

Paper Authors Organization Abstract Translation (Chinese) Code Citations
Heterogeneous Graph Contrastive Learning for Recommendation Mengru Chen, Chao Huang, Lianghao Xia, Wei Wei, Yong Xu, Ronghua Luo University of Hong Kong, Hong Kong, China; South China University of Technology, Guangzhou, China Graph Neural Networks (GNNs) have become powerful tools in modeling graph-structured data in recommender systems. However, real-life recommendation scenarios usually involve heterogeneous relationships (e.g., social-aware user influence, knowledge-aware item dependency) which contain fruitful information to enhance the user preference learning. In this paper, we study the problem of heterogeneous graph-enhanced relational learning for recommendation. Recently, contrastive self-supervised learning has become successful in recommendation. In light of this, we propose Heterogeneous Graph Contrastive Learning (HGCL), which is able to incorporate heterogeneous relational semantics into the user-item interaction modeling with contrastive learning-enhanced knowledge transfer across different views. However, the influence of heterogeneous side information on interactions may vary by users and items. To move this idea forward, we enhance our heterogeneous graph contrastive learning with meta networks to allow the personalized knowledge transformer with adaptive contrastive augmentation. The experimental results on three real-world datasets demonstrate the superiority of HGCL over state-of-the-art recommendation methods. Through an ablation study, key components of HGCL are validated to benefit recommendation performance. The source code of the model implementation is available at https://github.com/HKUDS/HGCL.
图形神经网络(GNN)已经成为推荐系统中建立图形结构数据模型的有力工具。然而,现实生活中的推荐场景通常涉及异构关系(例如,社会意识的用户影响,知识意识的项目依赖) ,其中包含丰富的信息,以增强用户偏好学习。本文研究了异构图增强关系学习的推荐问题。近年来,对比自监督学习在推荐领域取得了成功。鉴于此,我们提出了一种异构图形对比学习(HGCL) ,它能够将异构关系语义整合到用户项目交互模型中,并通过对比学习增强跨不同视图的知识转移。然而,异构侧信息对交互的影响可能因用户和项目的不同而不同。为了进一步提高这一思想,我们使用元网络来增强异构图的对比学习,以允许个性化的知识转换器通过自适应的对比增强来实现。在三个实际数据集上的实验结果表明,HGCL 比最先进的推荐方法具有更大的优越性。通过消融研究,验证了 HGCL 方法的关键组成部分,有利于推荐性能的提高。模型实现的源代码可在连结 https://github.com/hkuds/hgcl 找到。 code 2
Modeling Fine-grained Information via Knowledge-aware Hierarchical Graph for Zero-shot Entity Retrieval Taiqiang Wu, Xingyu Bai, Weigang Guo, Weijie Liu, Siheng Li, Yujiu Yang Tsinghua University, Shenzhen, China; Tencent, Shenzhen, China Zero-shot entity retrieval, aiming to link mentions to candidate entities under the zero-shot setting, is vital for many tasks in Natural Language Processing. Most existing methods represent mentions/entities via the sentence embeddings of corresponding context from the Pre-trained Language Model. However, we argue that such coarse-grained sentence embeddings cannot fully model the mentions/entities, especially when the attention scores towards mentions/entities are relatively low. In this work, we propose GER, a Graph enhanced Entity Retrieval framework, to capture more fine-grained information as complementary to sentence embeddings. We extract the knowledge units from the corresponding context and then construct a mention/entity centralized graph. Hence, we can learn the fine-grained information about mention/entity by aggregating information from these knowledge units. To avoid the graph information bottleneck for the central mention/entity node, we construct a hierarchical graph and design a novel Hierarchical Graph Attention Network (HGAN). Experimental results on popular benchmarks demonstrate that our proposed GER framework performs better than previous state-of-the-art models. The code is available at https://github.com/wutaiqiang/GER-WSDM2023.
零样本实体检索是自然语言处理中的一项重要任务,其目的是在零样本设置下将提及与候选实体联系起来。大多数现有的方法通过预训练语言模型中相应上下文的句子嵌入来表示提及/实体。然而,我们认为这种粗粒度句子嵌入不能完全建模提及/实体,特别是当指向提及/实体的注意力分数相对较低时。在这项工作中,我们提出了 GER,一个图增强实体检索(Graph enhanced Entity Retrieval)框架,以捕获更多的细粒度信息作为句子嵌入的补充。我们从相应的上下文中提取知识单元,然后构造一个以提及/实体为中心的图。因此,我们可以通过聚合来自这些知识单元的信息来学习关于提及/实体的细粒度信息。为了避免中心提及/实体节点的图形信息瓶颈,构造了一个层次图,并设计了一个新的层次图注意网络(HGAN)。对流行基准测试的实验结果表明,我们提出的 GER 框架比以前最先进的模型表现得更好。代码已在 https://github.com/wutaiqiang/GER-WSDM2023 上公布。 code 2
Inductive Graph Transformer for Delivery Time Estimation Xin Zhou, Jinglong Wang, Yong Liu, Xingyu Wu, Zhiqi Shen, Cyril Leung Nanyang Technological University, Singapore, Singapore; Alibaba Group, Hangzhou, China Providing accurate estimated time of package delivery on users' purchasing pages for e-commerce platforms is of great importance to their purchasing decisions and post-purchase experiences. Although this problem shares some common issues with the conventional estimated time of arrival (ETA), it is more challenging with the following aspects: 1) Inductive inference. Models are required to predict ETA for orders with unseen retailers and addresses; 2) High-order interaction of order semantic information. Apart from the spatio-temporal features, the estimated time also varies greatly with other factors, such as the packaging efficiency of retailers, as well as the high-order interaction of these factors. In this paper, we propose an inductive graph transformer (IGT) that leverages raw feature information and structural graph data to estimate package delivery time. Different from previous graph transformer architectures, IGT adopts a decoupled pipeline and trains transformer as a regression function that can capture the multiplex information from both raw feature and dense embeddings encoded by a graph neural network (GNN). In addition, we further simplify the GNN structure by removing its non-linear activation and the learnable linear transformation matrix. The reduced parameter search space and linear information propagation in the simplified GNN enable the IGT to be applied in large-scale industrial scenarios. Experiments on real-world logistics datasets show that our proposed model can significantly outperform the state-of-the-art methods on estimation of delivery time. The source code is available at: https://github.com/enoche/IGT-WSDM23. 
为电子商务平台的用户提供准确的预计包裹递送时间,对他们的购买决策和购买后体验至关重要。虽然这个问题与传统的估计到达时间(ETA)方法有一些共同之处,但在以下几个方面更具挑战性: 1)归纳推理。模型需要预测具有未见过的零售商和地址的订单的预计到达时间; 2)订单语义信息的高阶交互作用。除了时空特征外,预计时间也随着其他因素的变化而变化,如零售商的包装效率,以及这些因素之间的高阶相互作用。在本文中,我们提出了一个归纳式图 Transformer(IGT) ,它利用原始特征信息和结构图数据来估计包裹递送时间。与以往的图 Transformer 体系结构不同,IGT 采用解耦流水线,并将 Transformer 训练为回归函数,可以从图形神经网络(GNN)编码的原始特征和密集嵌入中获取多路信息。此外,我们进一步简化了 GNN 的结构,去除了它的非线性激活和可学习的线性变换矩阵。简化 GNN 中缩小的参数搜索空间和线性信息传播使 IGT 能够应用于大规模工业场景。在实际物流数据集上的实验结果表明,该模型在估计交货期方面的性能明显优于目前最先进的方法。源代码可在 https://github.com/enoche/IGT-WSDM23 找到。 code 2
A Multimodal Framework for the Identification of Vaccine Critical Memes on Twitter Usman Naseem, Jinman Kim, Matloob Khushi, Adam G. Dunn University of Sydney, Sydney, NSW, Australia; Brunel University, London, United Kingdom Memes can be a useful way to spread information because they are funny, easy to share, and can spread quickly and reach further than other forms. With increased interest in COVID-19 vaccines, vaccination-related memes have grown in number and reach. Meme analysis can be difficult because memes use sarcasm and often require contextual understanding. Previous research has shown promising results but could be improved by capturing global and local representations within memes to model contextual information. Further, the limited public availability of annotated vaccine-critical meme datasets limits our ability to design computational methods to help design targeted interventions and boost vaccine uptake. To address these gaps, we present VaxMeme, which consists of 10,244 manually labelled memes. With VaxMeme, we propose a new multimodal framework designed to improve the memes' representation by learning the global and local representations of memes. The improved memes' representations are then fed to an attentive representation learning module to capture contextual information for classification using an optimised loss function. Experimental results show that our framework outperformed state-of-the-art methods with an F1-Score of 84.2%. We further analyse the transferability and generalisability of our framework and show that understanding both modalities is important to identify vaccine critical memes on Twitter. Finally, we discuss how understanding memes can be useful in designing shareable vaccination promotion, myth debunking memes and monitoring their uptake on social media platforms.
模因可以成为传播信息的一种有用的方式,因为它们很有趣,容易分享,而且可以迅速传播,比其他形式传播得更远。随着人们对2019冠状病毒疾病疫苗兴趣的增加,与疫苗接种相关的文化基因数量和影响范围都在增加。模因分析可能很困难,因为它们使用讽刺,并且经常需要上下文理解。先前的研究已经显示了有希望的结果,但是可以通过捕获模因中的全局和局部表征来模拟上下文信息来改进。此外,注释疫苗关键模因数据集的有限公共可用性限制了我们设计计算方法的能力,以帮助设计有针对性的干预措施和提高疫苗的摄取。为了解决这些差距,我们提出了 VaxMeme,它由10,244个手动标记的模因组成。利用 VaxMeme,我们提出了一种新的多模态框架,该框架通过学习模因的全局和局部表示来改善模因的表示。然后将改进的模因表示提供给注意表示学习模块,使用优化损失函数捕获上下文信息进行分类。实验结果表明,该框架的性能优于最先进的方法,F1-得分为84.2% 。我们进一步分析了我们框架的可转移性和普遍性,并表明了解这两种模式对于在 Twitter 上识别疫苗关键模因是重要的。最后,我们讨论了如何理解模因可以有助于设计共享疫苗促进,神话揭穿模因和监测他们在社交媒体平台上的吸收。 code 2
DisenPOI: Disentangling Sequential and Geographical Influence for Point-of-Interest Recommendation Yifang Qin, Yifan Wang, Fang Sun, Wei Ju, Xuyang Hou, Zhe Wang, Jia Cheng, Jun Lei, Ming Zhang Meituan, Beijing, China; Peking University, Beijing, China Point-of-Interest (POI) recommendation plays a vital role in various location-aware services. It has been observed that POI recommendation is driven by both sequential and geographical influences. However, since there is no annotated label of the dominant influence during recommendation, existing methods tend to entangle these two influences, which may lead to sub-optimal recommendation performance and poor interpretability. In this paper, we address the above challenge by proposing DisenPOI, a novel Disentangled dual-graph framework for POI recommendation, which jointly utilizes sequential and geographical relationships on two separate graphs and disentangles the two influences with self-supervision. The key novelty of our model compared with existing approaches is to extract disentangled representations of both sequential and geographical influences with contrastive learning. To be specific, we construct a geographical graph and a sequential graph based on the check-in sequence of a user. We tailor their propagation schemes to become sequence-/geo-aware to better capture the corresponding influences. Preference proxies are extracted from check-in sequence as pseudo labels for the two influences, which supervise the disentanglement via a contrastive loss. Extensive experiments on three datasets demonstrate the superiority of the proposed model. 感兴趣点(POI)推荐在各种位置感知服务中起着至关重要的作用。人们注意到,POI 建议受到顺序和地域影响的驱动。然而,由于在推荐过程中没有标注主导影响的标签,现有的方法倾向于将这两种影响纠缠在一起,这可能导致推荐性能次优和可解释性差。本文针对上述挑战,提出了一种新的用于 POI 推荐的分离双图框架 DisenPOI,该框架在两个分离的图上联合利用序列和地理关系,并通过自我监督来分离这两种影响。与现有方法相比,我们的模型的关键新颖之处在于通过对比学习提取序列和地理影响的分离表示。具体来说,我们根据用户的签入顺序构造了一个地理图和一个序列图。我们调整他们的传播方案成为序列/地理感知,以更好地捕捉相应的影响。针对这两种影响,从签入序列中提取偏好代理作为伪标签,通过对比损失来监控解纠缠过程。在三个数据集上的大量实验表明了该模型的优越性。 code 1
SGCCL: Siamese Graph Contrastive Consensus Learning for Personalized Recommendation Boyu Li, Ting Guo, Xingquan Zhu, Qian Li, Yang Wang, Fang Chen Curtin University, Perth, WA, Australia; University of Technology Sydney, Sydney, NSW, Australia; Florida Atlantic University, Boca Raton, FL, USA Contrastive-learning-based neural networks have recently been introduced to recommender systems, due to their unique advantage of injecting collaborative signals to model deep representations, and the self-supervision nature in the learning process. Existing contrastive learning methods for recommendations are mainly proposed through introducing augmentations to the user-item (U-I) bipartite graphs. Such a contrastive learning process, however, is susceptible to bias towards popular items and users, because higher-degree users/items are subject to more augmentations and their correlations are more captured. In this paper, we advocate a Siamese Graph Contrastive Consensus Learning (SGCCL) framework, to explore intrinsic correlations and alleviate the bias effects for personalized recommendation. Instead of augmenting original U-I networks, we introduce siamese graphs, which are homogeneous relations of user-user (U-U) similarity and item-item (I-I) correlations. A contrastive consensus optimization process is also adopted to learn effective features for user-item ratings, user-user similarity, and item-item correlation. Finally, we employ the self-supervised learning coupled with the siamese item-item/user-user graph relationships, which ensures unpopular users/items are well preserved in the embedding space. Different from existing studies, SGCCL performs well on both overall and debiasing recommendation tasks resulting in a balanced recommender. Experiments on four benchmark datasets demonstrate that SGCCL outperforms state-of-the-art methods with higher accuracy and greater long-tail item/user exposure. 
基于对比学习的神经网络最近被引入到推荐系统中,因为它具有注入协作信号来建立深度表征模型的独特优势,以及学习过程中的自我监督性质。现有的推荐对比学习方法主要是通过对用户项(U-I)二部图的增广来实现的。然而,这种对比学习过程容易对流行项目和用户产生偏见,因为高度数的用户/项目受到更多的增强,他们之间的相关性更容易被捕捉。本文提出了连体图对比共识学习(Siamese Graph Contrastive Consensus Learning,SGCCL)框架,以探索个性化推荐的内在相关性,减轻个性化推荐的偏差效应。本文引入了用户-用户(U-U)相似度齐次关系和项目-项目(I-I)相关度齐次关系的连体图,取代了对原始 U-I 网络的增广。采用对比一致性优化方法学习用户项目评分、用户相似度和项目相关性的有效特征。最后,采用自监督学习结合连体项目-项目/用户-用户图关系,保证了不受欢迎的用户/项目在嵌入空间中得到很好的保留。与现有的研究不同,SGCCL 在总体推荐任务和降低推荐偏差任务上都表现良好,从而产生了一个平衡的推荐。在四个基准数据集上的实验表明,SGCCL 在更高的精度和更大的长尾项目/用户暴露方面优于最先进的方法。 code 1
DGRec: Graph Neural Network for Recommendation with Diversified Embedding Generation Liangwei Yang, Shengjie Wang, Yunzhe Tao, Jiankai Sun, Xiaolong Liu, Philip S. Yu, Taiqing Wang University of Illinois at Chicago, Chicago, IL, USA; ByteDance Inc., Seattle, WA, USA Graph Neural Network (GNN) based recommender systems have been attracting more and more attention in recent years due to their excellent performance in accuracy. Representing user-item interactions as a bipartite graph, a GNN model generates user and item representations by aggregating embeddings of their neighbors. However, such an aggregation procedure often accumulates information purely based on the graph structure, overlooking the redundancy of the aggregated neighbors and resulting in poor diversity of the recommended list. In this paper, we propose diversifying GNN-based recommender systems by directly improving the embedding generation procedure. Particularly, we utilize the following three modules: submodular neighbor selection to find a subset of diverse neighbors to aggregate for each GNN node, layer attention to assign attention weights for each layer, and loss reweighting to focus on the learning of items belonging to long-tail categories. Blending the three modules into GNN, we present DGRec(Diversified GNN-based Recommender System) for diversified recommendation. Experiments on real-world datasets demonstrate that the proposed method can achieve the best diversity while keeping the accuracy comparable to state-of-the-art GNN-based recommender systems. 基于图形神经网络(GNN)的推荐系统由于其良好的精度性能,近年来越来越受到人们的关注。GNN 模型将用户-项目交互表示为二分图,通过聚合相邻用户的嵌入来生成用户和项目表示。然而,这样的聚合过程往往只是基于图结构积累信息,忽略了聚合邻居的冗余性,导致推荐列表的多样性较差。本文提出通过直接改进嵌入式生成过程来实现基于 GNN 的推荐系统的多样化。特别地,我们利用以下三个模块: 子模块邻居选择来寻找不同邻居的子集来为每个 GNN 节点聚合,分层注意来为每个层分配注意权重,以及损失重新加权来关注属于长尾类别的项目的学习。将这三个模块融合到 GNN 中,我们提出了 DGrec (基于多样化 GNN 的推荐系统)以供多样化推荐。在实际数据集上的实验表明,该方法在保证精度的同时,能够获得最佳的分集效果,与现有的基于 GNN 的推荐系统相当。 code 1
Efficiently Leveraging Multi-level User Intent for Session-based Recommendation via Atten-Mixer Network Peiyan Zhang, Jiayan Guo, Chaozhuo Li, Yueqi Xie, Jaeboum Kim, Yan Zhang, Xing Xie, Haohan Wang, Sunghun Kim Microsoft Research Asia, Beijing, China; School of Intelligence Science and Technology, Peking University, Beijing, China; The Hong Kong University of Science and Technology, Hong Kong, Hong Kong; The Hong Kong University of Science and Technology & Upstage, Yongin-si, Republic of Korea; University of Illinois at Urbana-Champaign, Champaign, IL, USA Session-based recommendation (SBR) aims to predict the user's next action based on short and dynamic sessions. Recently, there has been an increasing interest in utilizing various elaborately designed graph neural networks (GNNs) to capture the pair-wise relationships among items, seemingly suggesting the design of more complicated models is the panacea for improving the empirical performance. However, these models achieve relatively marginal improvements with exponential growth in model complexity. In this paper, we dissect the classical GNN-based SBR models and empirically find that some sophisticated GNN propagations are redundant, given the readout module plays a significant role in GNN-based models. Based on this observation, we intuitively propose to remove the GNN propagation part, while the readout module will take on more responsibility in the model reasoning process. To this end, we propose the Multi-Level Attention Mixture Network (Atten-Mixer), which leverages both concept-view and instance-view readouts to achieve multi-level reasoning over item transitions. As simply enumerating all possible high-level concepts is infeasible for large real-world recommender systems, we further incorporate SBR-related inductive biases, i.e., local invariance and inherent priority to prune the search space. Experiments on three benchmarks demonstrate the effectiveness and efficiency of our proposal. 
We also have already launched the proposed techniques to a large-scale e-commercial online service since April 2021, with significant improvements of top-tier business metrics demonstrated in the online experiments on live traffic. 基于会话的推荐(SBR)旨在基于短期和动态会话预测用户的下一步操作。近年来,人们越来越热衷于利用各种精心设计的图形神经网络(GNN)来捕捉项目之间的配对关系,这似乎表明设计更复杂的模型是提高经验性能的灵丹妙药。然而,这些模型在模型复杂性方面的改进指数增长相对较小。本文剖析了经典的基于 GNN 的 SBR 模型,发现一些复杂的 GNN 传播是冗余的,因为读出模块在基于 GNN 的模型中起着重要的作用。在此基础上,我们直观地提出了去除 GNN 传播部分,而读出模块将在模型推理过程中承担更多的责任。为此,我们提出了多层次注意力混合网络(Atten-Mixer) ,它利用概念视图和实例视图读数来实现项目过渡的多层次推理。由于简单地列举所有可能的高级概念对于大型实际推荐系统是不可行的,因此我们进一步结合 SBR 相关的归纳偏差,即局部不变性和固有优先级来修剪搜索空间。在三个基准测试上的实验证明了该方案的有效性和高效性。自2021年4月以来,我们已经推出了大规模电子商务在线服务的建议技术,在实时流量的在线实验中,顶级业务指标得到了显著改进。 code 1
Counterfactual Collaborative Reasoning Jianchao Ji, Zelong Li, Shuyuan Xu, Max Xiong, Juntao Tan, Yingqiang Ge, Hao Wang, Yongfeng Zhang Rutgers Preparatory School, Somerset, NJ, USA; Rutgers University, New Brunswick, NJ, USA Causal reasoning and logical reasoning are two important types of reasoning abilities for human intelligence. However, their relationship has not been extensively explored in the context of machine intelligence. In this paper, we explore how the two reasoning abilities can be jointly modeled to enhance both accuracy and explainability of machine learning models. More specifically, by integrating two important types of reasoning ability -- counterfactual reasoning and (neural) logical reasoning -- we propose Counterfactual Collaborative Reasoning (CCR), which conducts counterfactual logic reasoning to improve the performance. In particular, we use recommender systems as an example to show how CCR alleviates data scarcity, improves accuracy and enhances transparency. Technically, we leverage counterfactual reasoning to generate "difficult" counterfactual training examples for data augmentation, which -- together with the original training examples -- can enhance the model performance. Since the augmented data is model irrelevant, it can be used to enhance any model, enabling the wide applicability of the technique. Besides, most of the existing data augmentation methods focus on "implicit data augmentation" over users' implicit feedback, while our framework conducts "explicit data augmentation" over users' explicit feedback based on counterfactual logic reasoning. Experiments on three real-world datasets show that CCR achieves better performance than non-augmented models and implicitly augmented models, and also improves model transparency by generating counterfactual explanations.
因果推理和逻辑推理是人类智力中两种重要的推理能力。然而,在机器智能语境下,它们之间的关系还没有得到广泛的研究。本文探讨如何将两种推理能力联合建模,以提高机器学习模型的准确性和可解释性。更具体地说,通过整合两种重要类型的推理能力——反事实推理和(神经)逻辑推理——我们提出了反事实协作推理,它通过进行反事实逻辑推理来提高性能。特别是,我们以推荐系统为例,展示 CCR 如何缓解数据稀缺、提高准确性和增强透明度。从技术上讲,我们利用反事实推理来生成“困难的”反事实训练示例用于数据增强,这些示例与原始的训练示例一起可以提高模型的性能。由于增强数据与模型无关,因此可以用它们来增强任何模型,从而使该技术具有广泛的适用性。此外,现有的数据增强方法大多侧重于对用户的隐性反馈进行“隐性数据增强”,而我们的框架基于反事实逻辑推理对用户的显性反馈进行“显性数据增强”。在三个实际数据集上的实验表明,CCR 模型比非增广模型和隐式增广模型具有更好的性能,并且通过生成反事实解释来提高模型的透明度。 code 1
VRKG4Rec: Virtual Relational Knowledge Graph for Recommendation Lingyun Lu, Bang Wang, Zizhuo Zhang, Shenghao Liu, Han Xu Huazhong University of Science and Technology, Wuhan, China Incorporating knowledge graphs as side information has become a new trend in recommendation systems. Recent studies regard items as entities of a knowledge graph and leverage graph neural networks to assist item encoding, yet they consider each relation type individually. However, there are often too many relation types, and sometimes one relation type involves too few entities. We argue that it is neither efficient nor effective to use every relation type for item encoding. In this paper, we propose a VRKG4Rec model (Virtual Relational Knowledge Graphs for Recommendation), which explicitly distinguishes the influence of different relations for item representation learning. We first construct virtual relational graphs (VRKGs) by an unsupervised learning scheme. We also design a local weighted smoothing (LWS) mechanism for encoding nodes, which iteratively updates a node embedding depending only on its own embedding and those of its neighbors, and involves no additional training parameters. We also employ the LWS mechanism on a user-item bipartite graph for user representation learning, which utilizes encodings of items with relational knowledge to help train representations of users. Experiment results on two public datasets validate that our VRKG4Rec model outperforms the state-of-the-art methods. The implementations are available at https://github.com/lulu0913/VRKG4Rec.
在推荐系统中引入知识图作为侧面信息已经成为一种新的趋势。最近的研究把项目看作是一个知识图的实体,并利用图神经网络来辅助项目编码,但是要分别考虑每个关系类型。然而,关系类型往往太多,有时一个关系类型涉及的实体太少。我们认为,使用每种关系类型进行项目编码既不高效也不有效。本文提出了一种 VRKG4Rec 模型(虚拟关系知识推荐图) ,该模型能够明确区分不同关系对项目表示学习的影响。我们首先通过一个非监督式学习方案构造虚拟关系图(vrkGs)。我们还设计了一种局部加权平滑(LWS)编码机制,该机制仅根据节点自身及其邻居的嵌入来迭代更新嵌入的节点,而不涉及额外的训练参数。在用户表征学习的用户项二分图上采用了 LWS 机制,该机制利用关系知识对用户表征进行编码,有助于用户表征的训练。在两个公共数据集上的实验结果验证了我们的 VRKG4Rec 模型优于最先进的方法。有关实施方案可于 https://github.com/lulu0913/vrkg4rec 下载。 code 1
A Bird's-eye View of Reranking: From List Level to Page Level Yunjia Xi, Jianghao Lin, Weiwen Liu, Xinyi Dai, Weinan Zhang, Rui Zhang, Ruiming Tang, Yong Yu ruizhang.info, Shenzhen, China; Huawei Noah's Ark Lab, Shenzhen, China; Shanghai Jiao Tong University, Shanghai, China Reranking, as the final stage of multi-stage recommender systems, refines the initial lists to maximize the total utility. With the development of multimedia and user interface design, the recommendation page has evolved to a multi-list style. Separately employing traditional list-level reranking methods for different lists overlooks the inter-list interactions and the effect of different page formats, thus yielding suboptimal reranking performance. Moreover, simply applying a shared network for all the lists fails to capture the commonalities and distinctions in user behaviors on different lists. To this end, we propose to draw a bird's-eye view of page-level reranking and design a novel Page-level Attentional Reranking (PAR) model. We introduce a hierarchical dual-side attention module to extract personalized intra- and inter-list interactions. A spatial-scaled attention network is devised to integrate the spatial relationship into pairwise item influences, which explicitly models the page format. The multi-gated mixture-of-experts module is further applied to capture the commonalities and differences of user behaviors between different lists. Extensive experiments on a public dataset and a proprietary dataset show that PAR significantly outperforms existing baseline models. 重新排序作为多阶段推荐系统的最后一个阶段,对初始列表进行细化,使总体效用最大化。随着多媒体技术和用户界面设计的发展,推荐页面已经发展成为多列表的风格。对不同的列表单独使用传统的列表级别重新排序方法忽略了列表间的交互作用和不同页面格式的影响,因此产生了次优的重新排序性能。此外,对所有列表简单地应用共享网络无法捕获不同列表上用户行为的共性和区别。为此,我们提出以鸟瞰视角看待页面级重排序(page-level reranking),并设计了一种新的页面级注意力重排(PAR)模型。我们引入了一个分层的双侧注意模块来提取个性化的列表内和列表间的交互。设计了一个空间尺度的注意网络,将空间关系整合到成对的项目影响中,并对页面格式进行了明确的建模。进一步应用多门限混合专家模块来捕获不同列表之间用户行为的共性和差异。对公共数据集和专有数据集的大量实验表明,PAR 的性能明显优于现有的基线模型。 code 1
Minimum Entropy Principle Guided Graph Neural Networks Zhenyu Yang, Ge Zhang, Jia Wu, Jian Yang, Quan Z. Sheng, Hao Peng, Angsheng Li, Shan Xue, Jianlin Su University of Wollongong, Wollongong, Australia; Macquarie University, Sydney, NSW, Australia; Beihang University, Beijing, China; Shenzhen Zhuiyi Technology Co., Ltd., Shenzhen, China Graph neural networks (GNNs) are now the mainstream method for mining graph-structured data and learning low-dimensional node- and graph-level embeddings to serve downstream tasks. However, limited by the bottleneck of interpretability that deep neural networks present, existing GNNs have ignored the issue of estimating the appropriate number of dimensions for the embeddings. Hence, we propose a novel framework called Minimum Graph Entropy principle-guided Dimension Estimation, i.e. MGEDE, that learns the appropriate embedding dimensions for both node and graph representations. In terms of node-level estimation, a minimum entropy function that counts both structure and attribute entropy, appraises the appropriate number of dimensions. In terms of graph-level estimation, each graph is assigned a customized embedding dimension from a candidate set based on the number of dimensions estimated for the node-level embeddings. Comprehensive experiments with node and graph classification tasks and nine benchmark datasets verify the effectiveness and generalizability of MGEDE. 图神经网络(GNN)是目前挖掘图结构数据和学习低维节点和图级嵌入以服务下游任务的主流方法。然而,受深度神经网络可解释性瓶颈的限制,现有的神经网络忽略了估计适当的嵌入维数的问题。因此,我们提出了一个新的框架,称为最小图熵原理指导的维度估计,即 MGEDE,学习适当的嵌入维度的节点和图表示。在节点级估计方面,通过计算结构熵和属性熵的最小熵函数来估计合适的维数。在图级估计方面,根据节点级嵌入的维数估计,从候选集中为每个图分配一个定制的嵌入维数。通过对节点和图形分类任务以及9个基准数据集的综合实验,验证了 MGEDE 的有效性和通用性。 code 1
Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning Ziyun Xu, Chengyu Wang, Minghui Qiu, Fuli Luo, Runxin Xu, Songfang Huang, Jun Huang Carnegie Mellon University, Pittsburgh, PA, USA; Alibaba Group, Hangzhou, China; Peking University, Beijing, China Pre-trained Language Models (PLMs) have achieved remarkable performance for various language understanding tasks in IR systems, which require the fine-tuning process based on labeled training data. For low-resource scenarios, prompt-based learning for PLMs exploits prompts as task guidance and turns downstream tasks into masked language problems for effective few-shot fine-tuning. In most existing approaches, the high performance of prompt-based learning heavily relies on handcrafted prompts and verbalizers, which may limit the application of such approaches in real-world scenarios. To solve this issue, we present CP-Tuning, the first end-to-end Contrastive Prompt Tuning framework for fine-tuning PLMs without any manual engineering of task-specific prompts and verbalizers. It is integrated with the task-invariant continuous prompt encoding technique with fully trainable prompt parameters. We further propose the pair-wise cost-sensitive contrastive learning procedure to optimize the model in order to achieve verbalizer-free class mapping and enhance the task-invariance of prompts. It explicitly learns to distinguish different classes and makes the decision boundary smoother by assigning different costs to easy and hard cases. Experiments over a variety of language understanding tasks used in IR systems and different PLMs show that CP-Tuning outperforms state-of-the-art methods. 
预训练语言模型(PLM)在信息检索系统的各种语言理解任务中取得了显著的性能,需要基于标记训练数据进行微调。对于低资源的场景,PLM 的基于提示的学习利用提示作为任务指导,并将下游任务转化为隐藏的语言问题,以便进行有效的少量微调。在大多数现有方法中,基于提示的学习的高性能在很大程度上依赖于手工制作的提示和语言表达器,这可能会限制这种方法在现实世界情景中的应用。为了解决这个问题,我们提出了 CP-Tuning,这是第一个端到端的对比提示调优框架,用于对 PLM 进行微调,而不需要任何特定于任务的提示和语言化工程。它集成了任务不变的连续提示编码技术与完全可训练的提示参数。我们进一步提出了成对代价敏感的对比学习过程来优化模型,以实现无语言代码的类映射和增强提示的任务不变性。它明确地学会区分不同的类别,并通过将不同的成本分配给简单和困难的案例,使决策边界更加平滑。在 IR 系统和不同 PLM 中使用的各种语言理解任务的实验表明,CP 调优优于最先进的方法。 code 1
Effective Seed-Guided Topic Discovery by Integrating Multiple Types of Contexts Yu Zhang, Yunyi Zhang, Martin Michalski, Yucheng Jiang, Yu Meng, Jiawei Han University of Illinois Urbana-Champaign, Urbana, IL, USA Instead of mining coherent topics from a given text corpus in a completely unsupervised manner, seed-guided topic discovery methods leverage user-provided seed words to extract distinctive and coherent topics so that the mined topics can better cater to the user's interest. To model the semantic correlation between words and seeds for discovering topic-indicative terms, existing seed-guided approaches utilize different types of context signals, such as document-level word co-occurrences, sliding window-based local contexts, and generic linguistic knowledge brought by pre-trained language models. In this work, we analyze and show empirically that each type of context information has its value and limitation in modeling word semantics under seed guidance, but combining three types of contexts (i.e., word embeddings learned from local contexts, pre-trained language model representations obtained from general-domain training, and topic-indicative sentences retrieved based on seed information) allows them to complement each other for discovering quality topics. We propose an iterative framework, SeedTopicMine, which jointly learns from the three types of contexts and gradually fuses their context signals via an ensemble ranking process. Under various sets of seeds and on multiple datasets, SeedTopicMine consistently yields more coherent and accurate topics than existing seed-guided topic discovery approaches. 
种子引导的主题发现方法不是以完全无监督的方式从给定的文本语料库中挖掘连贯的主题,而是利用用户提供的种子词来提取独特和连贯的主题,以便挖掘出的主题能够更好地迎合用户的兴趣。为了建立词与种子之间的语义关系模型,以发现主题指示性词语,现有的种子引导方法利用了不同类型的上下文信号,如文档级词语共现、基于滑动窗口的局部上下文以及预训练语言模型带来的通用语言知识。在本文中,我们通过实证分析和表明,在种子引导下,每种类型的上下文信息在词语语义建模中都有其价值和局限性,但是结合三种类型的上下文(即从局部上下文中学习的词语嵌入、从一般领域训练中获得的预训练语言模型表示和基于种子信息检索的主题指示句) ,它们可以相互补充,发现高质量的主题。我们提出了一个迭代框架 SeedTopicMine,该框架共同学习三种类型的上下文,并通过一个集成排序过程逐步融合它们的上下文信号。在不同的种子集和多个数据集上,SeedTopicMine 始终比现有的种子引导的主题发现方法产生更加连贯和准确的主题。 code 1
CoQEx: Entity Counts Explained Shrestha Ghosh, Simon Razniewski, Gerhard Weikum Max Planck Institute for Informatics, Saarbrücken, Germany; Max Planck Institute for Informatics & Saarland University, Saarbrücken, Germany For open-domain question answering, queries on entity counts, such as "how many languages are spoken in Indonesia", are challenging. Such queries can be answered through succinct contexts with counts ("estimated 700 languages") and instances ("Javanese and Sundanese"). Answer candidates naturally give rise to a distribution, where count contexts denoting the queried entity counts and their semantic subgroups often coexist, while the instances ground the counts in their constituting entities. In this demo we showcase the CoQEx methodology (Count Queries Explained) [5,6], which aggregates and structures explanatory evidence across search snippets, for answering user queries related to entity counts [4]. Given an entity count query, our system CoQEx retrieves search snippets and provides the user with a distribution-aware prediction, categorizes the count contexts into semantic groups and ranks instances grounding the counts, all in real-time. Our demo can be accessed at https://nlcounqer.mpi-inf.mpg.de/. 对于开放领域的问题回答,查询实体计数,如“印度尼西亚有多少种语言”,是具有挑战性的。这样的查询可以通过简洁的上下文来回答,包括计数: 估计有700种语言,以及实例: 爪哇语和巽他语。应答候选者自然产生一种分布,其中表示被查询实体计数的计数上下文和它们的语义子组常常共存,而实例将计数置于它们的组成实体中。在这个演示中,我们展示了 CoQEx 方法(Count Queries Explained)[5,6] ,该方法通过搜索片段聚合和构建解释性证据,用于回答与实体计数相关的用户查询[4]。给定一个实体计数查询,我们的系统 CoQEx 检索搜索片段并为用户提供一个分布感知的预测,将计数上下文分类为语义组并对计数实例进行排序,所有这些都是实时的。可以通过 https://nlcounqer.mpi-inf.mpg.de/ 访问我们的演示程序。 code 1
Beyond Hard Negatives in Product Search: Semantic Matching Using One-Class Classification (SMOCC) Arindam Bhattacharya, Ankit Gandhi, Vijay Huddar, Ankith M. S, Aayush Moroney, Atul Saroop, Rahul Bhagat Amazon, Seattle, WA, USA; Amazon, Bengaluru, India Semantic matching is an important component of a product search pipeline. Its goal is to capture the semantic intent of the search query as opposed to the syntactic matching performed by a lexical matching system. A semantic matching model captures relationships like synonyms, and also captures common behavioral patterns to retrieve relevant results by generalizing from purchase data. They however suffer from lack of availability of informative negative examples for model training. Various methods have been proposed in the past to address this issue based upon hard-negative mining and contrastive learning. In this work, we propose a novel method for semantic matching based on one-class classification called SMOCC. Given a query and a relevant product, SMOCC generates the representation of an informative negative which is then used to train the model. Our method is based on the idea of generating negatives by using adversarial search in the neighborhood of the positive examples. We also propose a novel approach for selecting the radius to generate adversarial negative products around queries based on the model's understanding of the query. Depending on how we select the radius, we propose two variants of our method: SMOCC-QS, that quantizes the queries using their specificity, and SMOCC-EM, that uses expectation-maximization paradigm to iteratively learn the best radius. We show that our method outperforms the state-of-the-art hard negative mining approaches by increasing the purchase recall by 3 percentage points, and improving the percentage of exacts retrieved by up to 5 percentage points while reducing irrelevant results by 1.8 percentage points. 
语义匹配是产品搜索管道的重要组成部分。它的目标是捕获搜索查询的语义意图,而不是词法匹配系统执行的语法匹配。语义匹配模型捕获同义词之类的关系,还捕获常见的行为模式,通过从购买数据归纳来检索相关结果。然而,他们缺乏可用于模型培训的信息丰富的负面例子。基于硬负面挖掘和对比学习,过去人们提出了各种方法来解决这一问题。本文提出了一种新的基于单类分类的语义匹配方法 SMOCC。给定一个查询和一个相关的产品,SMOCC 生成一个信息否定的表示,然后用于训练模型。我们的方法是基于在正例的邻域内使用对抗搜索生成否定的思想。基于模型对查询的理解,我们提出了一种新的半径选择方法来生成查询周围的对抗性否定产品。根据我们如何选择半径,我们提出了我们的方法的两个变体: SMOCC-QS,使用它们的特异性量化查询,和 SMOCC-EM,使用期望最大化范式迭代学习最佳半径。我们表明,我们的方法优于最先进的硬负面挖掘方法,通过增加3个百分点的购买召回,提高了5个百分点的精确检索百分比,同时减少了1.8个百分点的不相关结果。 code 0
Separating Examination and Trust Bias from Click Predictions for Unbiased Relevance Ranking Haiyuan Zhao, Jun Xu, Xiao Zhang, Guohao Cai, Zhenhua Dong, JiRong Wen Renmin University of China, Beijing, China; Noah's Ark Lab, Huawei, Shenzhen, China Alleviating the examination and trust bias in ranking systems is an important research line in unbiased learning-to-rank (ULTR). Current methods typically use the propensity to correct the biased user clicks and then learn ranking models based on the corrected clicks. Though successes have been achieved, directly modifying the clicks suffers from inherently high variance because the propensities usually appear in the denominators of the corrected clicks. The problem gets even worse when examination and trust bias are mixed. To address the issue, this paper proposes a novel ULTR method called Decomposed Ranking Debiasing (DRD). DRD is tailored for learning unbiased relevance models with low variance in the presence of examination and trust bias. Unlike existing methods that directly modify the original user clicks, DRD decomposes each click prediction into the combination of a relevance term output by the ranking model and other bias terms. The unbiased relevance model, therefore, can be learned by fitting the overall click predictions to the biased user clicks. A joint learning algorithm is developed to learn the parameters of the relevance and bias models alternately. Theoretical analysis shows that, compared with existing methods, DRD has lower variance while retaining unbiasedness. Empirical studies indicate that DRD can effectively reduce the variance and outperform the state-of-the-art ULTR baselines. 
消除排名系统中的考试和信任偏差是无偏学习排名(ULTR)的一个重要研究方向。当前的方法通常使用倾向于纠正有偏见的用户点击,然后学习排名模型的基础上纠正点击。虽然已经取得了成功,但是直接修改点击受到固有的高变异性的影响,因为倾向性通常涉及修正点击的分母。在混合考试和信任偏差的情况下,问题更加严重。为了解决这个问题,本文提出了一种新的 ULTR 方法,称为分解排序去偏(DRD)。DRD 是为学习无偏相关模型而量身定制的,在考试和信任偏差存在的情况下具有较低的方差。与直接修改原始用户点击的现有方法不同,DRD 建议将每个点击预测分解为由排名模型输出的相关项和其他偏倚项的组合。无偏相关模型,因此,可以通过拟合整体点击预测偏向用户点击。提出了一种联合学习算法,交替学习相关模型和偏差模型的参数。理论分析表明,与现有方法相比,DRD 方差较小,保持了无偏性。实证研究表明,DRD 可以有效地降低方差,优于最先进的 ULTR 基线。 code 0
Travel Bird: A Personalized Destination Recommender with TourBERT and Airbnb Experiences Veronika Arefieva, Roman Egger, Michael Schrefl, Markus Schedl Johannes Kepler University Linz, Linz, Austria; Johannes Kepler University Linz & Linz Institute of Technology, Linz, Austria; Salzburg University of Applied Sciences, Salzburg, Austria We present Travel Bird, a novel personalized destination recommendation and exploration interface which allows its users to find their next tourist destination by describing their specific preferences in a narrative form. Unlike other solutions, Travel Bird is based on TourBERT, a novel NLP model we developed, specifically tailored to the tourism domain. Travel Bird creates a two-dimensional personalized destination exploration space from TourBERT embeddings of social media content and the users' textual description of the experience they are looking for. In this demo, we will showcase several use cases for Travel Bird, which are beneficial for consumers and destination management organizations. 我们介绍了旅游鸟,一个新颖的个性化目的地推荐和探索界面,允许其用户找到他们的下一个旅游目的地通过描述他们的具体喜好在叙述形式。与其他解决方案不同的是,Travel Bird 基于 TourBERT,这是我们开发的一种新型 NLP 模型,专门针对旅游领域。Travel Bird 通过 TourBERT 嵌入的社交媒体内容和用户对所寻找体验的文字描述,创建了一个二维的个性化目的地探索空间。在这个演示中,我们将展示 Travel Bird 的几个用例,这些用例对消费者和目的地管理组织都有好处。 code 0
One for All, All for One: Learning and Transferring User Embeddings for Cross-Domain Recommendation Chenglin Li, Yuanzhen Xie, Chenyun Yu, Bo Hu, Zang Li, Guoqiang Shu, Xiaohu Qie, Di Niu Sun Yat-sen University, Shenzhen, China; Tencent, Shenzhen, China; University of Alberta, Edmonton, AB, Canada Cross-domain recommendation is an important method to improve recommender system performance, especially when observations in target domains are sparse. However, most existing techniques focus on single-target or dual-target cross-domain recommendation (CDR) and are hard to generalize to CDR with multiple target domains. In addition, the negative transfer problem is prevalent in CDR, where the recommendation performance in a target domain may not always be enhanced by knowledge learned from a source domain, especially when the source domain has sparse data. In this study, we propose CAT-ART, a multi-target CDR method that learns to improve recommendations in all participating domains through representation learning and embedding transfer. Our method consists of two parts: a self-supervised Contrastive AuToencoder (CAT) framework to generate global user embeddings based on information from all participating domains, and an Attention-based Representation Transfer (ART) framework which transfers domain-specific user embeddings from other domains to assist with target domain recommendation. CAT-ART boosts the recommendation performance in any target domain through the combined use of the learned global user representation and knowledge transferred from other domains, in addition to the original user embedding in the target domain. We conducted extensive experiments on a collected real-world CDR dataset spanning 5 domains and involving a million users. Experimental results demonstrate the superiority of the proposed method over a range of prior art. We further conducted ablation studies to verify the effectiveness of the proposed components. 
Our collected dataset will be open-sourced to facilitate future research in the field of multi-domain recommender systems and user modeling. 跨域推荐是提高推荐系统性能的一种重要方法,特别是在目标域观测稀疏的情况下。然而,现有的技术大多集中于单目标或双目标跨域推荐(CDR) ,很难推广到多目标域的 CDR。此外,负迁移问题在 CDR 中普遍存在,目标领域的推荐性能并不总是通过从源领域学到的知识得到提高,特别是当源领域数据稀少时。在这项研究中,我们提出了 CAT-ART,一种多目标 CDR 方法,它通过表示学习和嵌入转移学习来改进所有参与领域的推荐。该方法由两部分组成: 一部分是基于所有参与域的信息生成全局用户嵌入的自监督对比编码(CAT)框架,另一部分是基于注意力的表示转移(ART)框架,它将来自其他域的特定域用户嵌入转移到目标域推荐中。CAT-ART 通过综合利用学习到的全局用户表示和从其他领域转移的知识,以及原始用户嵌入到目标领域,提高了目标领域的推荐性能。我们进行了广泛的实验收集现实世界的 CDR 数据集跨越5个领域,涉及一百万用户。实验结果表明,该方法优于现有的一系列技术。我们进一步进行了消融研究,以验证所提议的组件的有效性。我们收集的数据集将是开源的,以促进未来在多领域推荐系统和用户建模领域的研究。 code 0
A Semantic Search Framework for Similar Audit Issue Recommendation in Financial Industry Chuchu Zhang, Can Song, Samarth Agarwal, Huayu Wu, Xuejie Zhang, John Jianan Lu DBS Bank, Singapore, Singapore Audit issues summarize the findings of audit reviews and provide valuable insights into risks and control gaps in a financial institution. Despite the wide use of data analytics and NLP in financial services, very few use cases analyze audit issue writing and derive insights from it, owing to its diverse coverage and lack of annotations. In this paper, we propose a deep learning based semantic search framework to search, rank and recommend similar past issues based on new findings. We adopt a two-step approach. First, a TF-IDF based search algorithm and a Bi-Encoder are used to shortlist a set of issue candidates based on the input query. Then a Cross-Encoder re-ranks the candidates and provides the final recommendation. We will also demonstrate how the models are deployed and integrated with the existing workbench to benefit auditors in their daily work. 审计问题总结审计审查期间的结果,并提供有价值的见解的风险和控制差距的金融机构。尽管数据分析和自然语言处理(NLP)在金融服务中得到了广泛的应用,但是由于覆盖面的多样性和注释的缺乏,很少有用例能够分析审计问题的写作并从中获得见解。在本文中,我们提出了一个基于深度学习的语义搜索框架来搜索,排序和推荐过去的类似问题的新发现。我们采取两步走的方法。首先,使用基于 TF-IDF 的搜索算法和双编码器,根据输入查询列出一组候选问题。然后交叉编码器将重新排名的候选人,并提供最终的建议。我们还将演示如何部署这些模型并将其与现有的工作台集成,以使审计人员在日常工作中受益。 code 0
Disentangled Negative Sampling for Collaborative Filtering Riwei Lai, Li Chen, Yuhan Zhao, Rui Chen, Qilong Han Hong Kong Baptist University, Hong Kong, China; Harbin Engineering University, Harbin, China; Harbin Engineering University & Hong Kong Baptist University, Harbin; Hong Kong, China Negative sampling is essential for implicit collaborative filtering to generate negative samples from massive unlabeled data. Unlike existing strategies that consider items as a whole when selecting negative items, we argue that user interactions are normally driven by some relevant, but not all, factors of items, which leads to a new direction for negative sampling. In this paper, we introduce a novel disentangled negative sampling (DENS) method. We first disentangle the relevant and irrelevant factors of positive and negative items using a hierarchical gating module. Next, we design a factor-aware sampling strategy to identify the best negative samples by contrasting the relevant factors while keeping irrelevant factors similar. To ensure the credibility of the disentanglement, we adopt contrastive learning and introduce four pairwise contrastive tasks, which enable learning better disentangled representations of the relevant and irrelevant factors and remove the dependency on ground truth. Extensive experiments on five real-world datasets demonstrate the superiority of DENS against several state-of-the-art competitors, achieving over 7% improvement over the strongest baseline in terms of Recall@20 and NDCG@20. Our code is publicly available at https://github.com/Riwei-HEU/DENS. 
对于隐性协同过滤来说,从大量未标记的数据中产生负样本是至关重要的。不像现有的策略在选择消极项目时将项目作为一个整体来考虑,我们认为通常用户的交互主要是由一些相关的,但不是全部的项目因素驱动的,导致了一个新的消极抽样方向。本文介绍了一种新的解纠缠负采样(DENS)方法。我们首先利用一个层次化的门控模块来分离正项和负项的相关因素和不相关因素。接下来,我们设计了一个因子感知抽样策略,通过对比相关因子,同时保持不相关因子的相似性来识别最佳负样本。为了保证解纠缠的可信度,我们建议采用对比学习并引入四个成对的对比任务,以便更好地学习相关和不相关因素的解纠缠表示,并消除对基本事实的依赖。在五个真实世界数据集上的广泛实验证明了 DENS 相对于几个最先进的竞争对手的优越性,在 Recall@20和 NDCG@20方面比最强的基线提高了7% 以上。我们的代码可以在 https://github.com/riwei-heu/dens 上公开获得。 code 0
Meta Policy Learning for Cold-Start Conversational Recommendation Zhendong Chu, Hongning Wang, Yun Xiao, Bo Long, Lingfei Wu JD.COM Silicon Valley Research Center, Mountain View, CA, USA; University of Virginia, Charlottesville, VA, USA; JD.COM, Beijing, UNK, China Conversational recommender systems (CRS) explicitly solicit users' preferences for improved recommendations on the fly. Most existing CRS solutions count on a single policy trained by reinforcement learning for a population of users. However, for users new to the system, such a global policy becomes ineffective to satisfy them, i.e., the cold-start challenge. In this paper, we study CRS policy learning for cold-start users via meta-reinforcement learning. We propose to learn a meta policy and adapt it to new users with only a few trials of conversational recommendations. To facilitate fast policy adaptation, we design three synergetic components. Firstly, we design a meta-exploration policy dedicated to identifying user preferences via a few exploratory conversations, which accelerates personalized policy adaptation from the meta policy. Secondly, we adapt the item recommendation module for each user to maximize the recommendation quality based on the collected conversation states during conversations. Thirdly, we propose a Transformer-based state encoder as the backbone to connect the previous two components. It provides comprehensive state representations by modeling complicated relations between positive and negative feedback during the conversation. Extensive experiments on three datasets demonstrate the advantage of our solution in serving new users, compared with a rich set of state-of-the-art CRS solutions. 
会话推荐系统(CRS)明确地在运行中征求用户对改进推荐的偏好。大多数现有的 CRS 解决方案都依赖于强化学习为大量用户培训的单一政策。然而,对于系统的新用户来说,这样的全局策略对于满足他们来说是无效的,也就是说,冷启动的挑战。本文通过元强化学习研究了冷启动用户的 CRS 策略学习。我们建议学习一个元策略,并使其适应新用户,只有几个试验的会话建议。为了促进政策的快速适应,我们设计了三个协同组件。首先,我们设计了一个元探索策略,通过一些探索性的对话来识别用户的偏好,从而加速了元策略的个性化策略调整。其次,根据会话过程中收集到的会话状态,针对每个用户调整项目推荐模块,使推荐质量最大化。第三,我们提出了一个基于变压器的状态编码器作为骨干,连接前两个组件。它通过在会话过程中建立正反馈和负反馈之间的复杂关系来提供全面的状态表示。对三个数据集进行的大量实验表明,与一组丰富的最先进的 CRS 解决方案相比,我们的解决方案在服务新用户方面具有优势。 code 0
Variational Reasoning over Incomplete Knowledge Graphs for Conversational Recommendation Xiaoyu Zhang, Xin Xin, Dongdong Li, Wenxuan Liu, Pengjie Ren, Zhumin Chen, Jun Ma, Zhaochun Ren Shandong University, Qingdao, China Conversational recommender systems (CRSs) often utilize external knowledge graphs (KGs) to introduce rich semantic information and recommend relevant items through natural language dialogues. However, the original KGs employed in existing CRSs are often incomplete and sparse, which limits the reasoning capability in recommendation. Moreover, only a few existing studies exploit the dialogue context to dynamically refine knowledge from KGs for better recommendation. To address the above issues, we propose the Variational Reasoning over Incomplete KGs Conversational Recommender (VRICR). Our key idea is to use the large dialogue corpus that naturally accompanies CRSs to enhance the incomplete KGs, and to perform dynamic knowledge reasoning conditioned on the dialogue context. Specifically, we denote the dialogue-specific subgraphs of KGs as latent variables with categorical priors for adaptive knowledge graph refactoring. We propose a variational Bayesian method to approximate posterior distributions over dialogue-specific subgraphs, which not only leverages the dialogue corpus for restructuring missing entity relations but also dynamically selects knowledge based on the dialogue context. Finally, we infuse the dialogue-specific subgraphs to decode the recommendation and responses. We conduct experiments on two benchmark CRS datasets. Experimental results confirm the effectiveness of our proposed method. 
会话推荐系统(CRS)通常利用外部知识图谱(KG)来引入丰富的语义信息,并通过自然语言对话来推荐相关项目。然而,现有 CRS 使用的原始知识图谱往往不完整、稀疏,限制了推荐中的推理能力。此外,现有的研究很少利用对话语境来动态提炼知识图谱中的知识,从而获得更好的推荐。针对上述问题,我们提出了基于不完整知识图谱变分推理的会话推荐系统(VRICR)。我们的主要想法是利用 CRS 自然伴随的大型对话语料库来增强不完整的知识图谱,并根据对话上下文进行动态的知识推理。具体来说,我们将知识图谱中特定于对话的子图表示为具有分类先验的潜在变量,用于自适应知识图谱重构。我们提出了一种变分贝叶斯方法来逼近对话特定子图的后验分布,该方法不仅利用对话语料重构缺失的实体关系,而且基于对话上下文动态选择知识。最后,我们注入特定于对话的子图来解码推荐和响应。我们在两个基准 CRS 数据集上进行了实验。实验结果证实了该方法的有效性。 code 0
Multi-Intention Oriented Contrastive Learning for Sequential Recommendation Xuewei Li, Aitong Sun, Mankun Zhao, Jian Yu, Kun Zhu, Di Jin, Mei Yu, Ruiguo Yu Tianjin University, Tianjin, China Sequential recommendation aims to capture users' dynamic preferences, in which data sparsity is a key problem. Most contrastive learning models leverage data augmentation to address this problem, but they amplify noise in the original sequences. Contrastive learning assumes that two views (positive pairs) obtained from the same user behavior sequence must be similar. However, noise typically disturbs the user's main intention, which makes the two views dissimilar. To address this problem, in this work, we formalize the denoising problem by selecting the user's main intention, and apply contrastive learning for the first time under this topic, i.e., we propose a novel framework, namely Multi-Intention Oriented Contrastive Learning Recommender (IOCRec). To create high-quality intent-level views, we fuse local and global intentions to unify sequential patterns and intent-level self-supervision signals. Specifically, we design the sequence encoder in IOCRec with three modules: a local module, a global module, and a disentangled module. The global module captures users' global preferences and is independent of the local module. The disentangled module obtains the multiple intentions behind the global and local representations. From a fine-grained perspective, IOCRec separates different intentions to guide the denoising process. Extensive experiments on four widely used real datasets demonstrate the effectiveness of our new method for sequential recommendation. 
序贯推荐的目的是获取用户的动态偏好,其中数据稀疏性是一个关键问题。大多数对比学习模型利用数据增强来解决这个问题,但是它们放大了原始序列中的噪声。对比学习假设从相同的用户行为序列中获得的两个视图(正对)必须相似。然而,噪声通常会干扰用户的主要意图,从而导致两种视图的不同。为了解决这一问题,本文通过选择用户的主要意图将去噪问题形式化,并首次在此课题下应用对比学习,即提出了一种新的框架,即多意图导向对比学习推荐系统(IOCRec)。为了创建具有意图级别的高质量视图,我们融合了局部意图和全局意图,统一了序列模式和意图级别的自我监督信号。具体来说,我们在 IOCRec 中设计了序列编码器,它包括三个模块: 本地模块、全局模块和解纠模块。全局模块可以捕获用户的全局首选项,这与本地模块无关。分离模块可以获得全局和局部表示背后的多意图。从细粒度的角度来看,IOCRec 分离了不同的意图来指导去噪过程。在四个广泛使用的实际数据集上的大量实验证明了我们的新方法对于顺序推荐的有效性。 code 0
IDNP: Interest Dynamics Modeling Using Generative Neural Processes for Sequential Recommendation Jing Du, Zesheng Ye, Bin Guo, Zhiwen Yu, Lina Yao ; The University of New South Wales, Sydney, Australia Recent sequential recommendation models rely increasingly on consecutive short-term user-item interaction sequences to model user interests. These approaches have, however, raised concerns about both short- and long-term interests. (1) Short-term: interaction sequences may not result from a monolithic interest, but rather from several intertwined interests, even within a short period of time, so these approaches fail to model skip behaviors; (2) Long-term: interaction sequences are primarily observed sparsely at discrete intervals, rather than consecutively over the long run. This makes it difficult to infer long-term interests, since only discrete interest representations can be derived, without taking into account interest dynamics across sequences. In this study, we address these concerns by learning (1) multi-scale representations of short-term interests; and (2) dynamics-aware representations of long-term interests. To this end, we present an Interest Dynamics modeling framework using generative Neural Processes, coined IDNP, to model user interests from a functional perspective. IDNP learns a global interest function family to define each user's long-term interest as a function instantiation, manifesting interest dynamics through function continuity. Specifically, IDNP first encodes each user's short-term interactions into multi-scale representations, which are then summarized as user context. By combining latent global interest with user context, IDNP then reconstructs long-term user interest functions and predicts interactions at the upcoming query timestep. 
Moreover, IDNP can model such interest functions even when interaction sequences are limited and non-consecutive. Extensive experiments on four real-world datasets demonstrate that our model outperforms state-of-the-art methods on various evaluation metrics. 最近的顺序推荐模型越来越依赖于连续的短期用户交互序列来模拟用户兴趣。然而,这些做法引发了人们对短期和长期兴趣的担忧。(1)短期: 交互序列可能不是由单个兴趣产生的,而是由几个相互交织的兴趣产生的,即使在很短的时间内,也会导致它们无法对跳过行为进行建模; (2)长期: 交互序列主要是在离散的间隔稀疏地观察到的,而不是在长期连续观察到的。这使得推断长期兴趣变得困难,因为只有离散的兴趣表示可以推导出来,而不考虑跨序列的兴趣动态。在这项研究中,我们通过学习(1)短期兴趣的多尺度表征; (2)长期兴趣的动态感知表征来解决这些问题。为此,我们提出了一个使用生成神经过程(Neural Processes)的兴趣动态(Interest Dynamics)建模框架,称为 IDNP,从功能的角度对用户兴趣进行建模。IDNP 学习一个全局兴趣函数族,将每个用户的长期兴趣定义为一个函数实例,通过函数连续性体现兴趣动态。具体来说,IDNP 首先将每个用户的短期交互编码为多尺度表示,然后将其总结为用户上下文。通过将潜在的全局兴趣与用户上下文相结合,IDNP 重构长期用户兴趣函数,并在即将到来的查询时间步进行交互预测。而且,IDNP 可以在交互序列有限且非连续的情况下建立这种兴趣函数模型。在四个真实世界数据集上的大量实验表明,我们的模型在各种评估指标上优于最先进的模型。 code 0
UserSimCRS: A User Simulation Toolkit for Evaluating Conversational Recommender Systems Jafar Afzali, Aleksander Mark Drzewiecki, Krisztian Balog, Shuo Zhang Bloomberg, London, United Kingdom; University of Stavanger, Stavanger, Norway We present an extensible user simulation toolkit to facilitate automatic evaluation of conversational recommender systems. It builds on an established agenda-based approach and extends it with several novel elements, including user satisfaction prediction, persona and context modeling, and conditional natural language generation. We showcase the toolkit with a pre-existing movie recommender system and demonstrate its ability to simulate dialogues that mimic real conversations, while requiring only a handful of manually annotated dialogues as training data. 我们提出了一个可扩展的用户模拟工具包,以促进会话推荐系统的自动评估。它建立在已建立的基于议程的方法之上,并扩展了几个新的元素,包括用户满意度预测、人物和上下文建模以及条件自然语言生成。我们用一个预先存在的电影推荐系统展示了这个工具包,并演示了它模拟模拟真实对话的对话的能力,同时只需要少量手动注释的对话作为训练数据。 code 0
A Synthetic Search Session Generator for Task-Aware Information Seeking and Retrieval Shawon Sarkar, Chirag Shah University of Washington, Seattle, WA, USA For users working on a complex search task, it is common to address different goals at various stages of the task through query iterations. While addressing these goals, users go through different task states as well. Understanding the task states latent in users' interactions is crucial for identifying users' changing intents and search behaviors, and for simulating and achieving real-time adaptive search recommendation and retrieval. However, sizeable real-world web search logs are scarce due to various ethical and privacy concerns, which often makes it challenging to develop generalizable task-aware computation models. Furthermore, session logs with task state labels are even rarer. For many researchers who lack the resources to collect data from users directly and at scale and to conduct a time-consuming data annotation process, this becomes a considerable bottleneck to furthering their research. Synthetic search sessions have the potential to address this gap. This paper shares a parsimonious model to simulate synthetic web search sessions with task state information, which interactive information retrieval (IIR) and search personalization studies could utilize to develop and evaluate task-based search and retrieval systems. 对于处理复杂搜索任务的用户来说,通常通过查询迭代在任务的不同阶段处理不同的目标。在实现这些目标的同时,用户还要经历不同的任务状态。理解用户交互中潜在的这些任务状态对于识别用户不断变化的意图和搜索行为来模拟和实现实时自适应搜索推荐和检索是至关重要的。然而,由于各种道德和隐私问题,大量真实世界的网络搜索日志的可用性是稀缺的,因此往往具有挑战性的开发普遍的任务感知计算模型。此外,带有任务状态标签的会话日志很少。对于许多缺乏资源直接和大规模地从用户那里收集数据并进行耗时的数据注释过程的研究人员来说,这成为他们进一步研究的一个相当大的瓶颈。合成搜索会话有可能解决这一差距。本文共享一个简约的模型来模拟任务状态信息的合成网络搜索会话,交互式信息检索(IIR)和搜索个性化研究可以利用这些信息来开发和评估基于任务的搜索和检索系统。 code 0
Understanding the Effect of Outlier Items in E-commerce Ranking Fatemeh Sarvi University of Amsterdam, Amsterdam, Netherlands Implicit feedback is an attractive source of training data in Learning to Rank (LTR). However, naive use of this data can produce unfair ranking policies originating from both exogenous and endogenous factors. Exogenous factors come from biases in the training data, which can lead to rich-get-richer dynamics. Endogenous factors can result in ranking policies that do not allocate exposure among items in a fair way. Item exposure is a common component influencing both endogenous and exogenous factors, and it depends not only on position but also on inter-item dependencies. In this project, we focus on a specific case of these inter-item dependencies, namely the existence of an outlier in the list. We first define and formalize outlierness in ranking, then study the effects of this phenomenon on endogenous and exogenous factors. We further investigate the visual aspects of presentational features and their impact on item outlierness. 内隐反馈是学习排序(LTR)中一个很有吸引力的训练数据来源。然而,天真地使用这些数据可能会产生源于外生因素和内生因素的不公平排序策略。外生因素来自训练数据的偏差,这可能导致富者越富的动态。内在因素可能导致排名政策不公平地分配项目的曝光。项目曝光是影响内生因素和外生因素的共同因素,它不仅取决于位置,而且取决于项目间的依赖性。在这个项目中,我们关注的是这些项目间依赖关系的一个特定情况,即列表中存在一个异常值。我们首先对排名中的异常进行定义和形式化,然后研究这种现象对内生因素和外生因素的影响。我们进一步研究了表象特征的视觉方面及其对项目异常的影响。 code 0
Simplifying Graph-based Collaborative Filtering for Recommendation Li He, Xianzhi Wang, Dingxian Wang, Haoyuan Zou, Hongzhi Yin, Guandong Xu The University of Queensland, Brisbane, QLD, Australia; eBay Research America, Seattle, WA, USA; University of Technology Sydney, Sydney, NSW, Australia; Meta Inc., San Diego, CA, USA Graph Convolutional Networks (GCNs) are a popular type of machine learning model that uses multiple layers of convolutional aggregation operations and non-linear activations to represent data. Recent studies apply GCNs to Collaborative Filtering (CF)-based recommender systems (RSs) by modeling user-item interactions as a bipartite graph and achieve superior performance. However, these models are difficult to train with non-linear activations on large graphs. Besides, most GCN-based models cannot use deeper layers because of the over-smoothing effect of the graph convolution operation. In this paper, we improve the GCN-based CF models in two respects. First, we remove non-linearities to enhance recommendation performance, which is consistent with the theory of simple graph convolutional networks. Second, we initialize the embedding of each node in the graph by computing the network embedding on the condensed graph, which alleviates the over-smoothing problem of graph convolution aggregation with sparse interaction data. The proposed model is a linear model that is easy to train, scalable to large datasets, and shown to yield better efficiency and effectiveness on four real datasets. 图卷积网络(GCNs)是一种流行的机器学习模型,它使用多层卷积聚合操作和非线性激活来表示数据。最近的研究将 GCN 应用于基于协同过滤(CF)的推荐系统(RSs) ,通过将用户与项目之间的交互建模为一个二分图,从而获得更好的性能。然而,这些模型面临的困难训练与非线性激活的大型图。此外,大多数基于 GCN 的模型由于图卷积运算的过度平滑效应而无法建立更深层次的模型。本文从两个方面对基于 GCN 的 CF 模型进行了改进。首先,消除非线性,提高推荐性能,这与简单图卷积网络的理论是一致的。其次,通过计算压缩图上的网络嵌入,初始化图中每个节点的嵌入,解决了稀疏交互数据的图卷积聚合操作中的过平滑问题。该模型是一个线性模型,易于训练,可扩展到大型数据集,并表明产生更好的效率和效果的四个实际数据集。 code 0
A Personalized Neighborhood-based Model for Within-basket Recommendation in Grocery Shopping Mozhdeh Ariannezhad, Ming Li, Sebastian Schelter, Maarten de Rijke University of Amsterdam, Amsterdam, Netherlands Users of online shopping platforms typically purchase multiple items at a time in the form of a shopping basket. Personalized within-basket recommendation is the task of recommending items to complete an incomplete basket during a shopping session. In contrast to the related task of session-based recommendation, where the goal is to complete an ongoing anonymous session, we have access to the shopping history of the user in within-basket recommendation. Previous studies have shown the superiority of neighborhood-based models for session-based recommendation and the importance of personal history in the grocery shopping domain. However, their applicability in within-basket recommendation remains unexplored. We propose PerNIR, a neighborhood-based model that explicitly models the personal history of users for within-basket recommendation in grocery shopping. The main novelty of PerNIR is in modeling the short-term interests of users, which are represented by the current basket, as well as their long-term interests, which are reflected in their purchasing history. In addition to the personal history, user neighbors are used to capture the collaborative purchase behavior. We evaluate PerNIR on two public and proprietary datasets. The experimental results show that it outperforms 10 state-of-the-art competitors by a significant margin, i.e., with gains of more than 12% in terms of hit rate over the second-best performing approach. Additionally, we showcase an optimized implementation of our method, which computes recommendations fast enough for real-world production scenarios. 
网上购物平台的用户通常一次购买多种商品,形式是一个购物篮。个性化购物篮内推荐的任务是在购物过程中推荐完成一个不完整的购物篮的项目。与基于会话的推荐的相关任务(其目标是完成正在进行的匿名会话)不同,我们可以访问篮子内推荐中用户的购物历史记录。以往的研究已经表明基于邻域的模型在基于会话的推荐中的优越性,以及个人历史在杂货店购物领域中的重要性。但它们在篮子内推荐中的适用性仍未得到探索。我们提出了 PerNIR 模型,这是一个基于邻域的模型,明确地模拟了购物篮内推荐用户的个人历史。PerNIR 的主要新颖之处在于为用户的短期利益(以当前篮子为代表)以及他们的长期利益(反映在他们的购买历史中)建模。除了个人历史记录,用户邻居还被用来捕获协同购买行为。我们在两个公共数据集和专有数据集上评估 PerNIR。实验结果表明,它比10个最先进的竞争对手有明显的优势,也就是说,在命中率方面比第二个最好的方法提高了12% 以上。此外,我们还展示了我们的方法的优化实现,该方法计算建议的速度足以满足实际生产场景的需要。 code 0
Self-Supervised Group Graph Collaborative Filtering for Group Recommendation Kang Li, ChangDong Wang, JianHuang Lai, Huaqiang Yuan Sun Yat-Sen University, Guangzhou, China; Dongguan University of Technology, Dongguan, China Nowadays, it is increasingly convenient for people to participate in group activities. Therefore, providing recommendations to groups of individuals is indispensable. Group recommendation is the task of suggesting items or events for a group of users in social networks or online communities. In this work, we study group recommendation in a particular scenario, namely occasional group recommendation, in which a group has few or no historical directly interacted items. Existing group recommendation methods mostly adopt attention-based preference aggregation strategies to capture group preferences. However, these models either ignore the complex high-order interactions between groups, users and items or greatly reduce efficiency by introducing complex data structures. Moreover, occasional group recommendation suffers from the problem of data sparsity due to the lack of historical group-item interactions. In this work, we focus on addressing the aforementioned challenges and propose a novel group recommendation model called Self-Supervised Group Graph Collaborative Filtering (SGGCF). The goal of the model is to capture the high-order interactions between users, items and groups and to alleviate the data sparsity issue in an efficient way. First, we explicitly model the complex relationships as a unified user-centered heterogeneous graph and devise a base group recommendation model. Second, we explore self-supervised learning on the graph with two kinds of contrastive learning modules to capture the implicit relations between groups and items. Finally, we treat the proposed contrastive learning loss as supplementary and apply a multi-task strategy to jointly train the BPR loss and the proposed contrastive learning loss. 
We conduct extensive experiments on three real-world datasets, and the experimental results demonstrate the superiority of our proposed model in comparison to the state-of-the-art baselines. 现在,人们参加集体活动越来越方便了。因此,向个人群体提供一些建议是必不可少的。群体推荐是为社交网络或在线社区中的一组用户推荐项目或事件的任务。在这项工作中,我们研究了一个特定场景中的群组推荐,即偶尔群组推荐,它很少或没有历史直接相互作用的项目。现有的群体推荐方法大多采用基于注意的偏好聚合策略来捕获群体偏好。然而,这些模型要么忽略了组、用户和项目之间复杂的高阶交互,要么通过引入复杂的数据结构大大降低了效率。此外,由于缺乏历史上的组项目交互,偶尔的组推荐会遇到数据稀疏的问题。在这项工作中,我们致力于解决上述挑战,并提出了一个新颖的群体推荐模型,称为自我监督群体图协同过滤(sggCF)。该模型的目标是捕获用户、项目和组之间的高阶交互,有效地缓解数据稀疏问题。首先,我们明确地将复杂关系建模为一个统一的以用户为中心的异构图,并设计了一个基本的组推荐模型。其次,利用两种对比学习模块探索图上的自我监督学习,以捕捉群体与项目之间的内隐关系。最后,将提出的对比学习损失作为补充,采用多任务策略对 BPR 损失和提出的对比学习损失进行联合训练。我们在三个真实世界的数据集上进行了广泛的实验,实验结果证明了我们提出的模型相对于最先进的基线的优越性。 code 0
Visual Matching is Enough for Scene Text Retrieval Lilong Wen, Yingrong Wang, Dongxiang Zhang, Gang Chen Zhejiang University, Hangzhou, China Given a text query, the task of scene text retrieval aims at searching and localizing all the text instances that are contained in an image gallery. The state-of-the-art method learns a cross-modal similarity between the query text and the detected text regions in natural images to facilitate retrieval. However, this cross-modal approach still cannot adequately bridge the heterogeneity gap between the text and image modalities. In this paper, we propose a new paradigm that converts the task into a single-modality retrieval problem. Unlike previous works that rely on character recognition or embedding, we directly leverage pictorial information by rendering query text into images to learn the glyph feature of each character, which can be utilized to capture the similarity between query and scene text images. With the extracted visual features, we devise a synthetic label image guided feature alignment mechanism that is robust to different scene text styles and layouts. The modules of glyph feature learning, text instance detection, and visual matching are jointly trained in an end-to-end framework. Experimental results show that our proposed paradigm achieves the best performance on multiple benchmark datasets. As a by-product, our method can also be easily generalized to support text queries with unseen characters or languages in a zero-shot manner. 给定一个文本查询,场景文本检索任务的目的是搜索和定位包含在图像库中的所有文本实例。最先进的方法学习查询文本和自然图像中检测到的文本区域之间的跨模式相似性,以便于检索。然而,这种跨模式的方法仍然不能很好地弥合文本和图像模式之间的异质性差距。在本文中,我们提出了一个新的范式,将任务转化为一个单模态检索问题。与以往依赖于字符识别或嵌入的作品不同,我们直接利用图像信息,通过将查询文本渲染成图像来学习每个字符的字形特征,这可以用来捕捉查询文本图像和场景文本图像之间的相似性。利用提取的视觉特征,设计了一种综合标签图像引导的特征对齐机制,该机制对不同的场景文本样式和布局具有鲁棒性。字形特征学习、文本实例检测和视觉匹配等模块在端到端的框架下进行联合训练。实验结果表明,我们提出的范式在多个基准数据集中取得了最佳的性能。作为一个副产品,我们的方法也可以很容易地推广到以零样本方式支持包含未见过的字符或语言的文本查询。 code 0
Slate-Aware Ranking for Recommendation Yi Ren, Xiao Han, Xu Zhao, Shenzheng Zhang, Yan Zhang Tencent, Beijing, China We see widespread adoption of slate recommender systems, where an ordered item list is fed to the user based on the user's interests and the items' content. For each recommendation, the user can select one or several items from the list for further interaction. In this setting, the significant impact on user behaviors from the mutual influence among the items is well understood. The existing methods add another step of slate re-ranking after the ranking stage of recommender systems, which considers the mutual influence among recommended items to re-rank and generate the recommendation results so as to maximize the expected overall utility. However, to model the complex interaction of multiple recommended items, the re-ranking stage can usually handle only dozens of candidates because of limited hardware resources and system latency constraints. Therefore, the ranking stage is still essential for most applications to provide a high-quality candidate set for the re-ranking stage. In this paper, we propose a solution named Slate-Aware ranking (SAR) for the ranking stage. By implicitly considering the relations among the slate items, it significantly enhances the quality of the re-ranking stage's candidate set and boosts the relevance and diversity of the overall recommender system. Experiments on public datasets and internal online A/B testing are both conducted to verify its effectiveness. 我们看到了列表式(slate)推荐系统的广泛应用,它根据用户的兴趣和项目的内容向用户提供一个有序的项目列表。对于每个建议,用户可以从列表中选择一个或多个项以进行进一步的交互。在这种设置中,项目之间的相互影响对用户行为的重大影响是可以理解的。现有的方法在推荐系统的排序阶段之后又增加了一个重新排序的步骤,即考虑推荐项目之间的相互影响,重新排序并生成推荐结果,从而使预期的总体效用最大化。然而,为了对多个推荐项目的复杂交互进行建模,由于硬件资源和系统延迟的限制,重新排序阶段通常只能处理几十个候选项。因此,对于大多数应用来说,排序阶段仍然是必不可少的,它为重新排序阶段提供高质量的候选集。在本文中,我们为排序阶段提出了一个名为列表感知排序(Slate-Aware ranking,SAR)的解决方案。通过隐含地考虑列表条目之间的关系,显著地提高了重新排序阶段候选集的质量,增强了整个推荐系统的相关性和多样性。通过公共数据集和内部在线 A/B 测试的实验,验证了该方法的有效性。 code 0
MUSENET: Multi-Scenario Learning for Repeat-Aware Personalized Recommendation Senrong Xu, Liangyue Li, Yuan Yao, Zulong Chen, Han Wu, Quan Lu, Hanghang Tong Nanjing University, Nanjing, China; University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA; Alibaba Group, Hangzhou, China Personalized recommendation has been instrumental in many real applications. Despite the great progress, the underlying multi-scenario characteristics (e.g., users may behave differently under different scenarios) are largely ignored by existing recommender systems. Intuitively, modeling different scenarios properly could significantly improve the recommendation accuracy, and some existing work has explored this direction. However, this work assumes that the scenarios are explicitly given, and thus becomes less effective when such information is unavailable. To complicate things further, proper scenario modeling from data is challenging and the recommendation models may easily overfit to some scenarios. In this paper, we propose a multi-scenario learning framework, MUSENET, for personalized recommendation. The key idea of MUSENET is to learn multiple implicit scenarios from the user behaviors, with a careful design inspired by the causal interpretation of recommender systems to avoid the overfitting issue. Additionally, since users' repeat consumptions account for a large part of the user behavior data on many e-commerce platforms, a repeat-aware mechanism is integrated to handle users' repurchase intentions within each scenario. Comprehensive experimental results on both industrial and public datasets demonstrate the effectiveness of the proposed approach compared with the state-of-the-art methods. 
个性化推荐在许多实际应用中发挥了重要作用。尽管取得了巨大的进步,但是现有的推荐系统基本上忽略了潜在的多场景特征(例如,用户在不同场景下的行为可能不同)。直观地说,适当地建立不同场景的模型可以显著地提高推荐的准确性,现有的一些工作已经探索了这一方向。然而,这些工作假设场景是显式给出的,因此当这些信息不可用时,工作效率就会降低。更复杂的是,根据数据建立适当的场景模型是具有挑战性的,推荐模型可能很容易过度适应某些场景。在本文中,我们提出了一个多场景学习框架 MUSENET,用于个性化推荐。MUSENET 的关键思想是从用户行为中学习多个隐式场景,并通过对推荐系统的因果解释进行精心设计,以避免过度拟合问题。此外,由于在许多电子商务平台上,用户的重复消费占用了用户行为数据的很大一部分,因此集成了一个重复感知机制来处理每个场景中用户的回购意图。在工业数据集和公共数据集上的综合实验结果表明,与最新的方法相比,该方法是有效的。 code 0
An F-shape Click Model for Information Retrieval on Multi-block Mobile Pages Lingyue Fu, Jianghao Lin, Weiwen Liu, Ruiming Tang, Weinan Zhang, Rui Zhang, Yong Yu ruizhang.info, Shenzhen, China; Huawei Noah's Ark Lab, Shenzhen, China; Shanghai Jiao Tong University, Shanghai, China To provide click simulation or relevance estimation based on users' implicit interaction feedback, click models have been much studied during recent years. Most click models focus on user behaviors towards a single list. However, with the development of user interface (UI) design, the layout of displayed items on a result page tends to be multi-block (i.e., multi-list) style instead of a single list, which requires different assumptions to model user behaviors more accurately. There exist click models for multi-block pages in desktop contexts, but they cannot be directly applied to mobile scenarios due to different interaction manners, result types and especially multi-block presentation styles. In particular, multi-block mobile pages can normally be decomposed into interleavings of basic vertical blocks and horizontal blocks, thus resulting in typically F-shape forms. To mitigate gaps between desktop and mobile contexts for multi-block pages, we conduct a user eye-tracking study, and identify users' sequential browsing, block skip and comparison patterns on F-shape pages. These findings lead to the design of a novel F-shape Click Model (FSCM), which serves as a general solution to multi-block mobile pages. Firstly, we construct a directed acyclic graph (DAG) for each page, where each item is regarded as a vertex and each edge indicates the user's possible examination flow. Secondly, we propose DAG-structured GRUs and a comparison module to model users' sequential (sequential browsing, block skip) and non-sequential (comparison) behaviors respectively. Finally, we combine GRU states and comparison patterns to perform user click predictions. 
Experiments on a large-scale real-world dataset validate the effectiveness of FSCM on user behavior predictions compared with baseline models. 为了提供基于用户隐性交互反馈的点击模拟或相关性估计,点击模型近年来得到了广泛的研究。大多数单击模型关注的是用户对单个列表的行为。然而,随着用户界面(UI)设计的发展,结果页面上显示项的布局趋向于多块(即多列表)样式,而不是单列表,这就需要不同的假设来更准确地模拟用户行为。桌面环境中存在多块页面的点击模型,但由于不同的交互方式、结果类型,尤其是多块表示方式,这些模型不能直接应用于移动场景。特别是,多块移动页面通常可以分解为基本垂直块和水平块的交错,从而产生典型的 F 形形式。为了缩小多块网页的桌面上下文和移动上下文之间的差距,我们进行了用户眼动跟踪研究,识别了用户在 F 形网页上的顺序浏览、块跳过和比较模式。这些发现导致了一种新颖的 F 形点击模型(FSCM)的设计,它作为一个多块移动页面的通用解决方案。首先,我们为每个页面建立一个有向无环图(DAG) ,其中每个项目被视为一个顶点,每个边表示用户可能的考试流程。其次,提出了 DAG 结构的 GRU 和比较模块,分别对用户的顺序(顺序浏览,块跳过)和非顺序(比较)行为进行建模。最后,我们结合 GRU 状态和比较模式来执行用户单击预测。在一个大规模真实世界数据集上的实验验证了与基线模型相比,FSCM 在用户行为预测方面的有效性。 code 0
AgAsk: A Conversational Search Agent for Answering Agricultural Questions Hang Li, Bevan Koopman, Ahmed Mourad, Guido Zuccon CSIRO, Brisbane, Australia; The University of Queensland, Brisbane, Australia While large amounts of potentially useful agricultural resources (journal articles, manuals, reports) are available, their value cannot be realised if they cannot be easily searched and presented to the agriculture users in a digestible form. AgAsk is a conversational search system for the agricultural domain, providing tailored answers to growers' questions. AgAsk is underpinned by an efficient and effective neural passage ranking model fine-tuned on real-world growers' questions. An adaptable, messaging-style user interface is deployed via the Telegram messaging platform, allowing users to ask natural language questions via text or voice, and receive short natural language answers as replies. AgAsk is empirically evaluated on an agricultural passage retrieval test collection. The system provides a single entry point to access the information needed for better growing decisions. Much of the system is domain agnostic and would benefit other domains. Users can interact with AgAsk via Telegram; further information about AgAsk, including codebases, instructions and demonstration videos, can be accessed at https://ielab.io/publications/agask-agent. 虽然有大量潜在有用的农业资源(期刊文章、手册、报告)可用,但如果它们不能以易于理解的形式被搜索并呈现给农业用户,它们的价值就无法实现。 AgAsk 是一个农业领域的对话搜索系统,为种植者提供量身定制的问题答案。AgAsk 的基础是一个有效的神经通道排序模型,该模型根据真实世界种植者的问题进行了微调。通过 Telegram 消息平台部署了一个可适应的消息类型的用户界面,允许用户通过文本或语音提出自然语言问题,并收到简短的自然语言答复作为答复。AgAsk 在一个农业通道检索测试集上进行了实证评估。该系统提供了一个单一的切入点,以访问更好的增长决策所需的信息。该系统的大部分是领域不可知的,并将有益于其他领域。AgAsk 可以通过 Telegram 进行交互,关于 AgAsk 的更多信息,包括代码库、说明和演示视频可以在 https://ielab.io/publications/agask-agent 上获得。 code 0
Learning to Distinguish Multi-User Coupling Behaviors for TV Recommendation Jiarui Qin, Jiachen Zhu, Yankai Liu, Junchao Gao, Jianjie Ying, Chaoxiong Liu, Ding Wang, Junlan Feng, Chao Deng, Xiaozheng Wang, Jian Jiang, Cong Liu, Yong Yu, Haitao Zeng, Weinan Zhang China Mobile Research Institute, Beijing, China; Shanghai Jiao Tong University, Shanghai, China; Digital Brain Lab, Shanghai, China; China Mobile Zhejiang, Hangzhou, China; China Mobile (Zhejiang) Research & Innovation Institute, Hangzhou, China This paper is concerned with TV recommendation, where one major challenge is the coupling behavior issue: the behaviors of multiple users are coupled together and not directly distinguishable because the users share the same account. Failing to identify the current watching user and using the coupled behaviors directly could lead to sub-optimal recommendation results due to the noise introduced by the behaviors of other users. Most existing methods deal with this issue either by unsupervised clustering algorithms or by latent user representation learning with strong assumptions. However, they neglect to carefully model the current session behaviors, which carry user identification information. Another critical limitation of the existing models is the lack of a supervision signal for distinguishing behaviors, because they depend solely on the final click label, which is insufficient to provide effective supervision. To address the above problems, we propose the Coupling Sequence Model (COSMO) for TV recommendation. In COSMO, we design a session-aware co-attention mechanism that uses both the candidate item and session behaviors as the query to attend to the historical behaviors in a fine-grained manner. Furthermore, we propose to use the data of accounts with multiple devices (e.g., families with various TV sets), which means the behaviors of one account are generated on different devices. 
We regard the device information as weak supervision and propose a novel pair-wise attention loss for learning to distinguish the coupling behaviors. Extensive offline experiments and online A/B tests over a commercial TV service provider demonstrate the efficacy of COSMO compared to the existing models. 本文研究的是电视推荐,其中一个主要的挑战是耦合行为问题,即多个用户的行为耦合在一起,而不能直接区分,因为用户共享相同的帐户。由于其他用户的行为所带来的噪声,如果不能直接识别当前监视用户并使用耦合行为,可能会导致推荐结果不理想。大多数现有的方法都是通过无监督聚类算法或依赖于强假设条件下的潜在用户表征学习来解决这个问题。然而,他们忽视了对当前的会话行为进行复杂的建模,因为当前的会话行为携带用户识别信息。现有模型的另一个严重缺陷是缺乏区分行为的监督信号,因为它们仅仅依赖于最终的点击标签,这不足以提供有效的监督。针对上述问题,本文提出了耦合序列模型(COSMO)用于电视推荐。在 COSMO 中,我们设计了一个会话感知的共注意机制,该机制同时使用候选项和会话行为作为查询,以细粒度的方式关注历史行为。此外,我们建议使用多个设备的帐户数据(例如,拥有不同电视机的家庭) ,这意味着一个帐户的行为是在不同的设备上产生的。我们将设备信息视为弱监督,提出了一种新的对注意损失学习方法来区分耦合行为。通过对一家商业电视服务提供商的大量离线实验和在线 A/B 测试,证明了 COSMO 与现有模型相比的有效性。 code 0
A Causal View for Item-level Effect of Recommendation on User Preference Wei Cai, Fuli Feng, Qifan Wang, Tian Yang, Zhenguang Liu, Congfu Xu Chinese University of Hong Kong, Hong Kong, China; University of Science and Technology of China, Hefei, China; Zhejiang University, Hangzhou, China; Meta AI, Menlo Park, USA Recommender systems not only serve users but also affect user preferences through personalized recommendations. Recent research investigates the effects of the entire recommender system on user preferences, i.e., system-level effects, and finds that recommendations may lead to problems such as echo chambers and filter bubbles. To properly alleviate these problems, it is necessary to estimate the effects of recommending a specific item on user preferences, i.e., item-level effects. For example, by understanding whether recommending an item aggravates echo chambers, we can better decide whether to recommend it or not. This work designs a method to estimate the item-level effects from the causal perspective. We resort to causal graphs to characterize the average treatment effect of recommending an item on the preference of another item. The key to estimating the effects lies in mitigating the confounding bias of time and user features without costly randomized controlled trials. Toward this goal, we estimate the causal effects from historical observations through a method with stratification and matching to address the two confounders, respectively. Nevertheless, directly implementing stratification and matching is intractable because the large sample size incurs a high computational cost. We thus propose efficient approximations of stratification and matching to reduce the computational complexity. Extensive experimental results on two real-world datasets validate the effectiveness and efficiency of our method. We also show a simple example of using the item-level effects to provide insights for mitigating echo chambers. 
推荐系统不仅服务于用户,而且通过个性化推荐影响用户的偏好。最近的研究调查了整个推荐系统对用户偏好的影响,即系统层面的影响,发现建议可能导致回声室和过滤气泡等问题。为了恰当地缓解这些问题,有必要估计推荐一个特定项目对用户偏好的影响,即项目级别的影响。例如,通过了解推荐一个项目是否会加剧回声室,我们可以更好地决定是否推荐它。本文设计了一种从因果关系角度评价项目水平效应的方法。我们使用因果图来描述推荐一个项目对另一个项目偏好的平均治疗效果。估计效果的关键在于减轻时间和用户特征的混杂偏差,而不需要昂贵的随机对照试验。为了实现这一目标,我们通过分层和匹配的方法来分别解决这两个混杂因素,从历史观察中估计因果效应。然而,直接实现分层和匹配是比较困难的,由于样本量较大,需要较高的计算成本。因此,我们提出了分层和匹配的有效近似,以降低计算复杂度。在两个实际数据集上的大量实验结果验证了该方法的有效性和有效性。我们还展示了一个使用条目级效应来提供减轻回声室的见解的简单示例。 code 0
Exploiting Explicit and Implicit Item relationships for Session-based Recommendation Zihao Li, Xianzhi Wang, Chao Yang, Lina Yao, Julian J. McAuley, Guandong Xu University of California, San Diego, La Jolla, CA, USA; University of New South Wales, Sydney, Australia; University of Technology Sydney, Sydney, Australia The session-based recommendation aims to predict users' immediate next actions based on their short-term behaviors reflected by past and ongoing sessions. Graph neural networks (GNNs) recently dominated the related studies, yet their performance heavily relies on graph structures, which are often predefined, task-specific, and designed heuristically. Furthermore, existing graph-based methods either neglect implicit correlations among items or consider explicit and implicit relationships altogether in the same graphs. We propose to decouple explicit and implicit relationships among items. As such, we can capture the prior knowledge encapsulated in explicit dependencies and learned implicit correlations among items simultaneously in a flexible and more interpretable manner for effective recommendations. We design a dual graph neural network that leverages the feature representations extracted by two GNNs: a graph neural network with a single gate (SG-GNN) and an adaptive graph neural network (A-GNN). The former models explicit dependencies among items. The latter employs a self-learning strategy to capture implicit correlations among items. Our experiments on four real-world datasets show our model outperforms state-of-the-art methods by a large margin, achieving 18.46% and 70.72% improvement in HR@20, and 49.10% and 115.29% improvement in MRR@20 on Diginetica and LastFM datasets. 
基于会话的建议旨在根据用户过去和正在进行的会话所反映的短期行为来预测用户的即时下一步行动。图形神经网络(GNN)是近年来研究的热点,但其性能主要依赖于图形结构,这种结构往往是预定义的、任务特定的、启发式设计的。此外,现有的基于图的方法要么忽略项目之间的隐式关系,要么在同一个图中同时考虑显式和隐式关系。我们建议解耦项目之间的显式和隐式关系。因此,我们可以以一种灵活和更易于解释的方式同时捕获包含在显式依赖和学习的项目之间的隐式相关性中的先验知识,以获得有效的建议。我们设计了一个双图神经网络,利用两个 GNN 提取的特征表示: 一个单门图神经网络(SG-GNN)和一个自适应图神经网络(A-GNN)。前者对项之间的显式依赖关系进行建模。后者采用自我学习策略来捕捉项目之间的内隐相关性。我们在四个真实世界数据集上的实验表明,我们的模型大大优于最先进的方法,HR@20分别提高了18.46% 和70.72% ,在 Diginetica 和 LastFM 数据集上 MRR@20分别提高了49.10% 和115.29% 。 code 0
Unbiased Knowledge Distillation for Recommendation Gang Chen, Jiawei Chen, Fuli Feng, Sheng Zhou, Xiangnan He University of Science and Technology of China, Hefei, China; Zhejiang University, Hangzhou, China As a promising solution for model compression, knowledge distillation (KD) has been applied in recommender systems (RS) to reduce inference latency. Traditional solutions first train a full teacher model from the training data, and then transfer its knowledge (i.e., soft labels) to supervise the learning of a compact student model. However, we find such a standard distillation paradigm would incur a serious bias issue -- popular items are more heavily recommended after the distillation. This effect prevents the student model from making accurate and fair recommendations, decreasing the effectiveness of RS. In this work, we identify the origin of the bias in KD -- it is rooted in the biased soft labels from the teacher, and is further propagated and intensified during the distillation. To rectify this, we propose a new KD method with a stratified distillation strategy. It first partitions items into multiple groups according to their popularity, and then extracts the ranking knowledge within each group to supervise the learning of the student. Our method is simple and teacher-agnostic -- it works at the distillation stage without affecting the training of the teacher model. We conduct extensive theoretical and empirical studies to validate the effectiveness of our proposal. We release our code at: https://github.com/chengang95/UnKD. 作为一种有前途的模型压缩方法,知识精馏(KD)已经应用于推荐系统(RS)中,以减少推理延迟。传统的解决方案首先从培训数据中训练一个完整的教师模型,然后转移其知识(即文本(软标签))来监督一个紧凑的学生模型的学习。然而,我们发现这样一个标准的蒸馏范式会引起严重的偏见问题-流行的项目是更多地推荐后蒸馏。这种效应阻碍了学生模型做出准确、公正的推荐,降低了 RS 的有效性。在这项工作中,我们找出偏见的根源 KD-它根源于偏见软标签的教师,并进一步传播和加强在蒸馏过程中。为了解决这一问题,我们提出了一种新的分层蒸馏 KD 方法。它首先根据项目的知名度将项目划分为多个组,然后提取每个组内的排名知识,以监督学生的学习。我们的方法是简单的和教师不可知的——它工作在蒸馏阶段,不影响教师模型的训练。我们进行了广泛的理论和实证研究,以验证我们的建议的有效性。我们在 https://github.com/chengang95/UnKD 发布代码。 code 0
Multimodal Pre-Training with Self-Distillation for Product Understanding in E-Commerce Shilei Liu, Lin Li, Jun Song, Yonghua Yang, Xiaoyi Zeng Alibaba Group, Hangzhou, China Product understanding refers to a series of product-centric tasks, such as classification, alignment and attribute value prediction, which require fine-grained fusion of various modalities of products. Excellent product modeling ability will enhance the user experience and benefit search and recommendation systems. In this paper, we propose MBSD, a pre-trained vision-and-language model which can integrate the heterogeneous information of a product in a single-stream BERT-style architecture. Compared with current approaches, MBSD uses a lightweight convolutional neural network instead of a heavy feature extractor for image encoding, which gives it lower latency. Besides, we utilize user behavior data to design a two-stage pre-training task to understand products from different perspectives. In addition, there is an underlying imbalance problem in multimodal pre-training, which impairs downstream tasks. To this end, we propose a novel self-distillation strategy to transfer the knowledge in the dominant modality to the weaker modality, so that each modality can be fully tapped during pre-training. Experimental results on several product understanding tasks demonstrate that MBSD outperforms the competitive baselines. 产品理解是指一系列以产品为中心的任务,如分类、对齐和属性值预测,它需要对产品的各种模式进行细粒度的融合。优秀的产品建模能力将增强用户体验,有利于搜索和推荐系统。在本文中,我们提出了 MBSD,一个预先训练的视觉和语言模型,它可以集成产品的异构信息在一个单流 BERT 风格的体系结构。与目前的方法相比,MBSD 使用了一个轻量级的卷积神经网络,而不是一个沉重的特征提取器来进行图像编码,后者具有更低的延迟。此外,我们巧妙地利用用户行为数据设计了一个两阶段的培训前任务,从不同的角度来理解产品。此外,在多模式预训练中存在一个潜在的不平衡问题,这将损害下游任务。为此,我们提出了一种新的自我提取策略,将主导模式中的知识转移到较弱模式中,以便在预训练过程中充分利用每种模式。在几个产品理解任务上的实验结果表明,MBSD 的性能优于竞争基线。 code 0
Towards Universal Cross-Domain Recommendation Jiangxia Cao, Shaoshuai Li, Bowen Yu, Xiaobo Guo, Tingwen Liu, Bin Wang Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; Xiaomi AI Lab, Xiaomi Inc., NEED, China; MYbank, Ant Group, Beijing, China; DAMO Academy, Alibaba Group, Beijing, China In industry, web platforms such as Alibaba and Amazon often provide diverse services for users. Unsurprisingly, some developed services are data-rich, while some newly started services are data-scarce and suffer from severe data sparsity and cold-start problems. To alleviate the above problems and incubate new services easily, cross-domain recommendation (CDR) has attracted much attention from industrial and academic researchers. Generally, CDR aims to transfer rich user-item interaction information from related source domains (e.g., developed services) to boost recommendation quality of target domains (e.g., newly started services). For different scenarios, previous CDR methods can be roughly divided into two branches: (1) Data sparsity CDR fulfills user preference aided by other domain data to make intra-domain recommendations for users with few interactions, (2) Cold-start CDR projects user preference from other domains to make inter-domain recommendations for users with no interactions. In the past years, many outstanding CDR methods have emerged; however, to the best of our knowledge, none of them attempts to solve the two branches simultaneously. In this paper, we provide a unified framework, namely UniCDR, which can universally model different CDR scenarios by transferring the domain-shared information. Extensive experiments under the above 2 branches on 4 CDR scenarios and 6 public and large-scale industrial datasets demonstrate the effectiveness and universal ability of our UniCDR. 
在工业界,阿里巴巴和亚马逊等网络平台往往为用户提供多样化的服务。毫不奇怪,一些已开发的服务数据丰富,而一些新开始的服务数据稀缺,伴随着严重的数据稀缺和冷启动问题。为了解决上述问题,更好地孵化新的服务,跨域推荐已经引起了业界和学术界的广泛关注。一般来说,CDR 旨在从相关的源域(例如已开发的服务)传输丰富的用户项交互信息,以提高目标域(例如新启动的服务)的推荐质量。对于不同的场景,以往的 CDR 方法大致可以分为两个分支: (1)数据稀疏性 CDR 在其他领域数据的辅助下实现用户偏好,为交互较少的用户提供域内推荐; (2)冷启动 CDR 从其他领域投射用户偏好,为没有交互的用户提供域间推荐。在过去的几年中,出现了许多优秀的 CDR 方法,然而,就我们所知,它们都没有尝试同时解决这两个分支。本文提供了一个统一的框架,即 UniCDR,它通过传递领域共享信息,可以对不同的 CDR 场景进行统一建模。在上述两个分支下,对4个 CDR 场景和6个公共和大规模工业数据集进行了广泛的实验,证明了 UniCDR 的有效性和通用性。 code 0
Relation Preference Oriented High-order Sampling for Recommendation Mukun Chen, Xiuwen Gong, YH Jin, Wenbin Hu School of Computer Science, Wuhan University, Wuhan, China; Center for Evidence-Based and Translational Medicine, Wuhan University, Wuhan, China; The University of Sydney, Sydney, NSW, Australia The introduction of knowledge graphs (KG) into recommendation systems (RS) has been proven to be effective because KG introduces a variety of relations between items. In fact, users have different relation preferences depending on the relationship in KG. Existing GNN-based models largely adopt random neighbor sampling strategies to process propagation; however, these models cannot aggregate biased relation preference local information for a specific user, and thus cannot effectively reveal the internal relationship between users' preferences. This will reduce the accuracy of recommendations, while also limiting the interpretability of the results. Therefore, we propose a Relation Preference oriented High-order Sampling (RPHS) model to dynamically sample subgraphs based on relation preference and hard negative samples for user-item pairs. We design a path sampling strategy based on relation preference, which can encode the critical paths between specific user-item pairs to sample the paths in the high-order message passing subgraphs. Next, we design a mixed sampling strategy and define a new propagation operation to further enhance RPHS's ability to distinguish negative signals. Through the above sampling strategies, our model can better aggregate local relation preference information and reveal the internal relationship between users' preferences. Experiments show that our model outperforms the state-of-the-art models on three datasets by 14.98%, 5.31%, and 8.65%, and also performs well in terms of interpretability. 
The codes are available at https://github.com/RPHS/RPHS.git 在推荐系统中引入知识图(KG)已被证明是有效的,因为 KG 引入了项目之间的各种关系。实际上,根据 KG 中的关系,用户有不同的关系偏好。现有的基于 GNN 的模型大多采用随机邻居抽样策略来处理传播过程,但是这些模型不能为特定用户聚合有偏差的关系偏好局部信息,因此不能有效地揭示用户偏好之间的内在关系。这将降低建议的准确性,同时也限制了结果的可解释性。因此,我们提出了一个面向关系偏好的高阶抽样(RPHS)模型来动态抽样基于关系偏好和硬负样本的用户项目对子图。设计了一种基于关系偏好的路径抽样策略,对特定用户-项目对之间的关键路径进行编码,从而对高阶消息传递子图中的路径进行抽样。接下来,我们设计了一个混合采样策略并定义了一个新的传播操作来进一步提高 RPHS 分辨负信号的能力。通过以上的抽样策略,我们的模型可以更好地聚合局部关系偏好信息,揭示用户偏好之间的内在关系。实验表明,我们的模型在三个数据集上的性能分别比最先进的模型高出14.98% 、5.31% 和8.65% ,而且在可解释性方面也表现出良好的性能。密码可以在 https://github.com/rphs/rphs.git 找到 code 0
Knowledge Enhancement for Contrastive Multi-Behavior Recommendation Hongrui Xuan, Yi Liu, Bohan Li, Hongzhi Yin Nanjing University of Aeronautics and Astronautics, Nanjing, China; The University of Queensland, Brisbane, Australia A well-designed recommender system can accurately capture the attributes of users and items, reflecting the unique preferences of individuals. Traditional recommendation techniques usually focus on modeling a single type of behavior between users and items. However, in many practical recommendation scenarios (e.g., social media, e-commerce), there exist multi-typed interactive behaviors in user-item relationships, such as click, tag-as-favorite, and purchase in online shopping platforms. Thus, how to make full use of multi-behavior information for recommendation is of great importance to existing systems, and it presents challenges in two aspects that need to be explored: (1) Utilizing users' personalized preferences to capture multi-behavioral dependencies; (2) Dealing with the insufficient recommendation caused by the sparse supervision signal for the target behavior. In this work, we propose a Knowledge Enhancement Multi-Behavior Contrastive Learning Recommendation (KMCLR) framework, including two Contrastive Learning tasks and three functional modules to tackle the above challenges, respectively. In particular, we design the multi-behavior learning module to extract users' personalized behavior information for user-embedding enhancement, and utilize the knowledge graph in the knowledge enhancement module to derive more robust knowledge-aware representations for items. In addition, in the optimization stage, we model the coarse-grained commonalities and the fine-grained differences between the multiple behaviors of users to further improve the recommendation effect. 
Extensive experiments and ablation tests on the three real-world datasets indicate our KMCLR outperforms various state-of-the-art recommendation methods and verify the effectiveness of our method. code 0
Improving News Recommendation with Channel-Wise Dynamic Representations and Contrastive User Modeling Jingkun Wang, Yongtao Jiang, Haochen Li, Wen Zhao Peking University, Beijing, China News modeling and user modeling are the two core tasks of news recommendation. Accurate user representation and news representation can enable the recommendation system to provide users with precise recommendation services. Most existing methods use deep learning models such as CNN and Self-Attention to extract text features from news titles and abstracts to generate specific news vectors. However, the CNN-based methods have fixed parameters and cannot extract specific features for different input words, while the Self-Attention-based methods have high computational costs and struggle to capture local features effectively. In our proposed method, we build a category-based dynamic component to generate suitable parameters for different inputs and extract local features from multiple perspectives. Meanwhile, users sometimes mistakenly click on news terms they are not interested in, which introduces interaction noise into the datasets. In order to explore the critical user behaviors in user data and reduce the impact of noisy data on user modeling, we adopt a frequency-aware contrastive learning method in user modeling. Experiments on real-world datasets verify the effectiveness of our proposed method. code 0
Search Behavior Prediction: A Hypergraph Perspective Yan Han, Edward W. Huang, Wenqing Zheng, Nikhil Rao, Zhangyang Wang, Karthik Subbian University of Texas at Austin, Austin, TX, USA; Amazon, Palo Alto, CA, USA Although the bipartite shopping graphs are straightforward to model search behavior, they suffer from two challenges: 1) The majority of items are sporadically searched and hence have noisy/sparse query associations, leading to a long-tail distribution. 2) Infrequent queries are more likely to link to popular items, leading to another hurdle known as disassortative mixing. To address these two challenges, we go beyond the bipartite graph to take a hypergraph perspective, introducing a new paradigm that leverages auxiliary information from anonymized customer engagement sessions to assist the main task of query-item link prediction. This auxiliary information is available at web scale in the form of search logs. We treat all items appearing in the same customer session as a single hyperedge. The hypothesis is that items in a customer session are unified by a common shopping interest. With these hyperedges, we augment the original bipartite graph into a new hypergraph. We develop a Dual-Channel Attention-Based Hypergraph Neural Network (DCAH), which synergizes information from two potentially noisy sources (original query-item edges and item-item hyperedges). In this way, items on the tail are better connected due to the extra hyperedges, thereby enhancing their link prediction performance. We further integrate DCAH with self-supervised graph pre-training and/or DropEdge training, both of which effectively alleviate disassortative mixing. 
Extensive experiments on three proprietary E-Commerce datasets show that DCAH yields significant improvements of up to 24.6% in mean reciprocal rank (MRR) and 48.3% in recall compared to GNN-based baselines. Our source code is available at https://github.com/amazon-science/dual-channel-hypergraph-neural-network. code 0
Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation Zhen Tian, Ting Bai, Zibin Zhang, Zhiyuan Xu, Kangyi Lin, JiRong Wen, Wayne Xin Zhao Renmin University of China, Beijing, China; Beijing University of Posts and Telecommunications & Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing, China; Renmin University of China & Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China; Tencent, Guangzhou, China With the growth of high-dimensional sparse data in web-scale recommender systems, the computational cost to learn high-order feature interaction in CTR prediction task largely increases, which limits the use of high-order interaction models in real industrial applications. Some recent knowledge distillation based methods transfer knowledge from complex teacher models to shallow student models for accelerating the online model inference. However, they suffer from the degradation of model accuracy in knowledge distillation process. It is challenging to balance the efficiency and effectiveness of the shallow student models. To address this problem, we propose a Directed Acyclic Graph Factorization Machine (KD-DAGFM) to learn the high-order feature interactions from existing complex interaction models for CTR prediction via Knowledge Distillation. The proposed lightweight student model DAGFM can learn arbitrary explicit feature interactions from teacher networks, which achieves approximately lossless performance and is proved by a dynamic programming algorithm. Besides, an improved general model KD-DAGFM+ is shown to be effective in distilling both explicit and implicit feature interactions from any complex teacher model. Extensive experiments are conducted on four real-world datasets, including a large-scale industrial dataset from WeChat platform with billions of feature dimensions. 
KD-DAGFM achieves the best performance with less than 21.5% FLOPs of the state-of-the-art method on both online and offline experiments, showing the superiority of DAGFM to deal with the industrial scale data in CTR prediction task. Our implementation code is available at: https://github.com/RUCAIBox/DAGFM. code 0
Model-based Unbiased Learning to Rank Dan Luo, Lixin Zou, Qingyao Ai, Zhiyu Chen, Dawei Yin, Brian D. Davison Lehigh University, Bethlehem, PA, USA; Amazon.com, Inc., Seattle, WA, USA; Baidu Inc., Beijing, China; Tsinghua University, Beijing, China Unbiased Learning to Rank (ULTR), which learns to rank documents with biased user feedback data, is a well-known challenge in information retrieval. Existing methods in unbiased learning to rank typically rely on click modeling or inverse propensity weighting (IPW). Unfortunately, search engines face a severe long-tail query distribution, which neither click modeling nor IPW handles well. Click modeling suffers from a data sparsity problem since the same query-document pair appears only a limited number of times on tail queries; IPW suffers from a high variance problem since it is highly sensitive to small propensity score values. Therefore, a general debiasing framework that works well under tail queries is badly needed. To address this problem, we propose a model-based unbiased learning-to-rank framework. Specifically, we develop a general context-aware user simulator to generate pseudo clicks for unobserved ranked lists to train rankers, which addresses the data sparsity problem. In addition, considering the discrepancy between pseudo clicks and actual clicks, we take the observation of a ranked list as the treatment variable and further incorporate inverse propensity weighting with pseudo labels in a doubly robust way. The derived bias and variance indicate that the proposed model-based method is more robust than existing methods. Finally, extensive experiments on benchmark datasets, including simulated datasets and real click logs, demonstrate that the proposed model-based method consistently outperforms state-of-the-art methods in various scenarios. The code is available at https://github.com/rowedenny/MULTR. code 0
Pairwise Fairness in Ranking as a Dissatisfaction Measure Alessandro Fabris, Gianmaria Silvello, Gian Antonio Susto, Asia J. Biega University of Padova, Padova, Italy; Max Planck Institute for Security and Privacy, Bochum, Germany Fairness and equity have become central to ranking problems in information access systems, such as search engines, recommender systems, or marketplaces. To date, several types of fair ranking measures have been proposed, including diversity, exposure, and pairwise fairness measures. Out of those, pairwise fairness is a family of metrics whose normative grounding has not been clearly explicated, leading to uncertainty with respect to the construct that is being measured and how it relates to stakeholders' desiderata. In this paper, we develop a normative and behavioral grounding for pairwise fairness in ranking. Leveraging measurement theory and user browsing models, we derive an interpretation of pairwise fairness centered on the construct of producer dissatisfaction, tying pairwise fairness to perceptions of ranking quality. Highlighting the key limitations of prior pairwise measures, we introduce a set of reformulations that allow us to capture behavioral and practical aspects of ranking systems. These reformulations form the basis for a novel pairwise metric of producer dissatisfaction. Our analytical and empirical study demonstrates the relationship between dissatisfaction, pairwise, and exposure-based fairness metrics, enabling informed adoption of the measures. code 0
Reducing Negative Effects of the Biases of Language Models in Zero-Shot Setting Xiaosu Wang, Yun Xiong, Beichen Kang, Yao Zhang, Philip S. Yu, Yangyong Zhu University of Illinois at Chicago, Chicago, USA; Fudan University, Shanghai, China Pre-trained language models (PLMs) such as GPTs have been revealed to be biased towards certain target classes because of the prompt and the model's intrinsic biases. In contrast to the fully supervised scenario, where a large number of costly labeled samples can be used to fine-tune model parameters to correct for biases, no labeled samples are available in the zero-shot setting. We argue that a key to calibrating the biases of a PLM on a target task in the zero-shot setting lies in detecting and estimating the biases, which remains a challenge. In this paper, we first construct probing samples from randomly generated token sequences, which are simple but effective inputs for stimulating GPTs to show their biases; we then conduct an in-depth study of the plausibility of using class scores on the probing samples to reflect and estimate the biases of GPTs on a downstream target task. Furthermore, in order to effectively utilize the probing samples and thus reduce negative effects of the biases of GPTs, we propose a lightweight model, Calibration Adapter (CA), along with a self-guided training strategy that carries out distribution-level optimization, which enables us to use the probing samples both to fine-tune and to select only the proposed CA while keeping the PLM encoder frozen. To demonstrate the effectiveness of our study, we have conducted extensive experiments, whose results indicate that the calibration ability acquired by CA on the probing samples can be successfully transferred to reduce negative effects of the biases of GPTs on a downstream target task, and that our approach can yield better performance than state-of-the-art (SOTA) models in zero-shot settings. code 0
Multi-queue Momentum Contrast for Microvideo-Product Retrieval Yali Du, Yinwei Wei, Wei Ji, Fan Liu, Xin Luo, Liqiang Nie Harbin Institute of Technology (Shenzhen), Shenzhen, China; Nanjing University, Nanjing, China; National University of Singapore, Kent Ridge, Singapore; Shandong University, Jinan, China The booming development and huge market of micro-videos bring new e-commerce channels for merchants. Currently, more micro-video publishers prefer to embed relevant ads into their micro-videos, which not only provides them with business income but helps the audiences to discover their interesting products. However, due to the micro-video recording by unprofessional equipment, involving various topics and including multiple modalities, it is challenging to locate the products related to micro-videos efficiently, appropriately, and accurately. We formulate the microvideo-product retrieval task, which is the first attempt to explore the retrieval between the multi-modal and multi-modal instances. A novel approach named Multi-Queue Momentum Contrast (MQMC) network is proposed for bidirectional retrieval, consisting of the uni-modal feature and multi-modal instance representation learning. Moreover, a discriminative selection strategy with a multi-queue is used to distinguish the importance of different negatives based on their categories. We collect two large-scale microvideo-product datasets (MVS and MVS-large) for evaluation and manually construct the hierarchical category ontology, which covers sundry products in daily life. Extensive experiments show that MQMC outperforms the state-of-the-art baselines. Our replication package (including code, dataset, etc.) is publicly available at https://github.com/duyali2000/MQMC. code 0
Improving Cross-lingual Information Retrieval on Low-Resource Languages via Optimal Transport Distillation Zhiqi Huang, Puxuan Yu, James Allan University of Massachusetts Amherst, Amherst, MA, USA Benefiting from transformer-based pre-trained language models, neural ranking models have made significant progress. More recently, the advent of multilingual pre-trained language models provides great support for designing neural cross-lingual retrieval models. However, due to unbalanced pre-training data in different languages, multilingual language models have already shown a performance gap between high and low-resource languages in many downstream tasks. And cross-lingual retrieval models built on such pre-trained models can inherit language bias, leading to suboptimal result for low-resource languages. Moreover, unlike the English-to-English retrieval task, where large-scale training collections for document ranking such as MS MARCO are available, the lack of cross-lingual retrieval data for low-resource language makes it more challenging for training cross-lingual retrieval models. In this work, we propose OPTICAL: Optimal Transport distillation for low-resource Cross-lingual information retrieval. To transfer a model from high to low resource languages, OPTICAL forms the cross-lingual token alignment task as an optimal transport problem to learn from a well-trained monolingual retrieval model. By separating the cross-lingual knowledge from knowledge of query document matching, OPTICAL only needs bitext data for distillation training, which is more feasible for low-resource languages. Experimental results show that, with minimal training data, OPTICAL significantly outperforms strong baselines on low-resource languages, including neural machine translation. code 0
MeSH Suggester: A Library and System for MeSH Term Suggestion for Systematic Review Boolean Query Construction Shuai Wang, Hang Li, Guido Zuccon The University of Queensland, Brisbane, QLD, Australia Boolean query construction is often critical for medical systematic review literature search. To create an effective Boolean query, systematic review researchers typically spend weeks coming up with effective query terms and combinations. One challenge to creating an effective systematic review Boolean query is the selection of effective MeSH Terms to include in the query. In our previous work, we created neural MeSH term suggestion methods and compared them to state-of-the-art MeSH term suggestion methods. We found neural MeSH term suggestion methods to be highly effective. In this demonstration, we build upon our previous work by creating (1) a Web-based MeSH term suggestion prototype system that allows users to obtain suggestions from a number of underlying methods and (2) a Python library that implements ours and others' MeSH term suggestion methods and that is aimed at researchers who want to further investigate, create or deploy such type of methods. We describe the architecture of the web-based system and how to use it for the MeSH term suggestion task. For the Python library, we describe how the library can be used for advancing further research and experimentation, and we validate the results of the methods contained in the library on standard datasets. Our web-based prototype system is available at http://ielab-mesh-suggest.uqcloud.net, while our Python library is at https://github.com/ielab/meshsuggestlib. 
布尔查询结构通常是医学系统综述文献检索的关键。为了创建一个有效的布尔查询,系统综述研究人员通常要花费数周时间来构建有效的查询术语和组合。创建一个有效的系统综述布尔查询的一个挑战是在查询中选择有效的 MeSH 术语。在我们以前的工作中,我们创建了神经 MeSH 术语建议方法,并将它们与最先进的 MeSH 术语建议方法进行了比较。我们发现神经网络术语推荐方法是非常有效的。在这个演示中,我们通过创建(1)一个基于 Web 的 MeSH 术语建议原型系统,允许用户从一些底层方法中获得建议; (2)一个实现我们和其他人的 MeSH 术语建议方法的 Python 库,目标是希望进一步研究、创建或部署这类方法的研究人员。我们描述了基于 Web 的系统的体系结构,以及如何将其用于 MeSH 术语建议任务。对于 Python 库,我们描述了如何使用该库来推进进一步的研究和实验,并验证了该库中包含的标准数据集方法的结果。我们的基于网络的原型系统可以在 http://ielab-mesh-suggest.uqcloud.net 上使用,而我们的 Python 库则处于 https://github.com/ielab/meshsuggestlib。 code 0
Marginal-Certainty-Aware Fair Ranking Algorithm Tao Yang, Zhichao Xu, Zhenduo Wang, Anh Tran, Qingyao Ai University of Utah, Salt Lake City, UT, USA; Tsinghua University, Beijing, China Ranking systems are ubiquitous in modern Internet services, including online marketplaces, social media, and search engines. Traditionally, ranking systems only focus on how to get better relevance estimation. When relevance estimation is available, they usually adopt a user-centric optimization strategy where ranked lists are generated by sorting items according to their estimated relevance. However, such user-centric optimization ignores the fact that item providers also draw utility from ranking systems. It has been shown in existing research that such user-centric optimization will cause much unfairness to item providers, followed by unfair opportunities and unfair economic gains for item providers. To address ranking fairness, many fair ranking methods have been proposed. However, as we show in this paper, these methods could be suboptimal as they directly rely on the relevance estimation without being aware of the uncertainty (i.e., the variance of the estimated relevance). To address this uncertainty, we propose a novel Marginal-Certainty-aware Fair algorithm named MCFair. MCFair jointly optimizes fairness and user utility, while relevance estimation is constantly updated in an online manner. In MCFair, we first develop a ranking objective that includes uncertainty, fairness, and user utility. Then we directly use the gradient of the ranking objective as the ranking score. We theoretically prove that MCFair based on gradients is optimal for the aforementioned ranking objective. Empirically, we find that on semi-synthesized datasets, MCFair is effective and practical and can deliver superior performance compared to state-of-the-art fair ranking methods. To facilitate reproducibility, we release our code https://github.com/Taosheng-ty/WSDM22-MCFair. 
排名系统在现代互联网服务中无处不在,包括在线市场、社交媒体和搜索引擎。传统的排序系统只关注如何获得更好的相关性估计。当相关性估计可用时,它们通常采用以用户为中心的优化策略,根据估计的相关性对项目进行排序,从而生成排名列表。然而,这种以用户为中心的优化忽略了一个事实,即项目提供者也从排名系统中获取效用。已有的研究表明,这种以用户为中心的优化会给项目提供者带来很大的不公平,其次是不公平的机会和不公平的经济收益。为了解决排序公平问题,人们提出了许多公平排序方法。然而,正如我们在本文中所展示的,这些方法可能是次优的,因为它们直接依赖于相关性估计而不知道不确定性(即,估计的相关性的方差)。针对这种不确定性,我们提出了一种新的边际确定性公平算法 MCFair。MCFair 共同优化公平性和用户效用,而相关性估计是不断更新的在线方式。在 MCFair 中,我们首先开发一个包含不确定性、公平性和用户效用的排名目标。然后直接使用排序目标的梯度作为排序得分。从理论上证明了基于梯度的 MCFair 对于上述排序目标是最优的。通过实验,我们发现在半合成数据集上,MCFair 是有效和实用的,能够提供比最先进的公平排名方法更好的性能。为了便于重现,我们发布了我们的代码 https://github.com/taosheng-ty/wsdm22-mcfair。 code 0
Learning Stance Embeddings from Signed Social Graphs John Pougué-Biyong, Akshay Gupta, Aria Haghighi, Ahmed El-Kishky University of Oxford, Oxford, United Kingdom; Twitter Cortex, Seattle, WA, USA; Meta, London, United Kingdom A key challenge in social network analysis is understanding the position, or stance, of people in the graph on a large set of topics. While past work has modeled (dis)agreement in social networks using signed graphs, these approaches have not modeled agreement patterns across a range of correlated topics. For instance, disagreement on one topic may make disagreement (or agreement) more likely for related topics. We propose the Stance Embeddings Model (SEM), which jointly learns embeddings for each user and topic in signed social graphs with distinct edge types for each topic. By jointly learning user and topic embeddings, SEM is able to perform cold-start topic stance detection, predicting the stance of a user on topics for which we have not observed their engagement. We demonstrate the effectiveness of SEM using two large-scale Twitter signed graph datasets that we open-source. One dataset, TwitterSG, labels (dis)agreements using engagements between users via tweets to derive topic-informed, signed edges. The other, BirdwatchSG, leverages community reports on misinformation and misleading content. On TwitterSG and BirdwatchSG, SEM shows a 39% and 26% error reduction respectively against strong baselines. 社交网络分析中的一个关键挑战是理解人们在大量主题图中的位置或立场。虽然过去的工作已经在社会网络中使用有符号图表建模(dis)一致性,但是这些方法还没有在一系列相关主题中建模一致性模式。例如,在一个话题上的分歧可能会使相关话题更容易产生分歧(或一致意见)。我们提出了立场嵌入模型(Stance Embeddings Model,SEM) ,它共同学习每个用户和每个主题的不同边缘类型的有符号社会图中的主题的嵌入。通过联合学习用户和话题嵌入,SEM 能够执行冷启动话题姿态检测,预测用户在我们没有观察到他们参与的话题上的立场。我们使用两个开源的大型 Twitter 签名图表数据集来证明 SEM 的有效性。一个数据集 TwitterSG 使用用户之间通过 tweet 的约定来获得主题知情的、有符号的边。另一个是 Birdwatch SG,利用社区关于错误信息和误导性内容的报告。在 TwitterSG 和 Birdwatch SG 上,SEM 显示与强基线相比,错误率分别降低了39% 和26% 。 code 0
Range Restricted Route Recommendation Based on Spatial Keyword Hongwei Tang, Detian Zhang Soochow University, Suzhou, China In this paper, we focus on a new route recommendation problem, i.e., when a user gives a keyword and range constraint, the route that contains the maximum number of POIs tagged with the keyword or similar POIs in the range will be returned for him. This is a practical problem when people want to explore a place, e.g., find a route within 2 km containing as many clothing stores as possible. To solve the problem, we first calculate the score of each edge in road networks based on the number and similarity of POIs. Then, we reformulate the problem into finding the path in a graph with the maximum score within the distance constraint problem, which is proved NP-hard. Given this, we not only propose an exact branch and bound (BnB) algorithm, but also devise a more efficient top-k based network expansion (k-NE) algorithm to find the near-optimal solution. Extensive experiments on real datasets not only verify the effectiveness of the proposed route recommendation algorithm, but also show that the efficiency and accuracy of k-NE algorithm are completely acceptable. 本文研究了一个新的路由推荐问题,即当用户给出一个关键字和范围约束时,该路由将返回包含该关键字或范围内相似 POI 标记的最大 POI 数目的路由。这是一个实际的问题,当人们想要探索一个地方,例如,找到一条2公里内的路线,包括尽可能多的服装店。为了解决这个问题,我们首先根据 POI 的个数和相似度计算路网中每个边缘的得分。然后,在距离约束问题中,我们将该问题重新表述为在一个具有最大得分的图中寻找路径,证明了该问题是 NP 难的。鉴于此,我们不仅提出了一种精确的分枝定界(BnB)算法,而且设计了一种更有效的基于 top-k 的网络扩展(k-NE)算法来寻找近似最优解。在实际数据集上的大量实验不仅验证了该算法的有效性,而且表明 k-NE 算法的效率和准确性是完全可以接受的。 code 0
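The reformulated problem in the abstract above (find the path with the maximum total edge score whose length stays within a distance budget) can be sketched as a small depth-first branch and bound. This is an illustrative sketch under assumptions, not the paper's BnB or k-NE algorithm: the graph encoding, the restriction to simple paths, the function name `best_route`, and the upper bound used for pruning are all hypothetical choices.

```python
def best_route(graph, start, max_dist):
    """Exact search for the highest-scoring simple path within max_dist.

    graph maps node -> list of (neighbor, length, score) tuples; undirected
    edges are assumed to be listed in both directions.
    """
    # Optimistic gain per node: the best score of any edge entering it.
    max_in = {}
    for edges in graph.values():
        for nxt, _length, score in edges:
            max_in[nxt] = max(max_in.get(nxt, 0.0), score)

    best = (0.0, [start])

    def dfs(node, dist_left, score, path, visited, optimistic_rest):
        nonlocal best
        if score > best[0]:
            best = (score, list(path))
        # Bound: a simple path enters each unvisited node at most once.
        if score + optimistic_rest <= best[0]:
            return
        for nxt, length, edge_score in graph.get(node, []):
            if nxt in visited or length > dist_left:
                continue  # infeasible branch: revisit or over budget
            visited.add(nxt)
            path.append(nxt)
            dfs(nxt, dist_left - length, score + edge_score, path, visited,
                optimistic_rest - max_in.get(nxt, 0.0))
            path.pop()
            visited.remove(nxt)

    rest = sum(max_in.values()) - max_in.get(start, 0.0)
    dfs(start, max_dist, 0.0, [start], {start}, rest)
    return best


toy = {
    "A": [("B", 1.0, 2.0), ("C", 1.0, 5.0)],
    "B": [("A", 1.0, 2.0), ("D", 1.0, 4.0)],
    "C": [("A", 1.0, 5.0)],
    "D": [("B", 1.0, 4.0)],
}
print(best_route(toy, "A", 2.0))  # (6.0, ['A', 'B', 'D'])
print(best_route(toy, "A", 1.0))  # (5.0, ['A', 'C'])
```

The search is exponential in the worst case, which is consistent with the NP-hardness noted in the abstract; a tighter bound would prune more but is not needed to show the idea.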
NGAME: Negative Mining-aware Mini-batching for Extreme Classification Kunal Dahiya, Nilesh Gupta, Deepak Saini, Akshay Soni, Yajun Wang, Kushal Dave, Jian Jiao, Gururaj K, Prasenjit Dey, Amit Singh, Deepesh Hada, Vidit Jain, Bhawna Paliwal, Anshul Mittal, Sonu Mehta, Ramachandran Ramjee, Sumeet Agarwal, Purushottam Kar, Manik Varma Microsoft Research, Bangalore, India; Microsoft Research & IIT Delhi, Bangalore, India; IIT Delhi, New Delhi, India; IIT Kanpur, Kanpur, India; UT Austin, Austin, TX, USA; Microsoft, Bangalore, India; Microsoft, Bellevue , WA, USA; Linkedin, Sunnyvale , CA, USA; Microsoft, Sunnyvale , CA, USA Extreme Classification (XC) seeks to tag data points with the most relevant subset of labels from an extremely large label set. Performing deep XC with dense, learnt representations for data points and labels has attracted much attention due to its superiority over earlier XC methods that used sparse, hand-crafted features. Negative mining techniques have emerged as a critical component of all deep XC methods that allow them to scale to millions of labels. However, despite recent advances, training deep XC models with large encoder architectures such as transformers remains challenging. This paper identifies that memory overheads of popular negative mining techniques often force mini-batch sizes to remain small and slow training down. In response, this paper introduces NGAME, a light-weight mini-batch creation technique that offers provably accurate in-batch negative samples. This allows training with larger mini-batches offering significantly faster convergence and higher accuracies than existing negative sampling techniques. NGAME was found to be up to 16% more accurate than state-of-the-art methods on a wide array of benchmark datasets for extreme classification, as well as 3% more accurate at retrieving search engine queries in response to a user webpage visit to show personalized ads. 
In live A/B tests on a popular search engine, NGAME yielded up to 23% gains in click-through-rates. 极端分类(XC)试图用来自极大标签集的最相关的标签子集来标记数据点。对数据点和标签进行深层 XC 表示时,由于其优于使用稀疏的手工特性的早期 XC 方法,因此引起了人们的广泛关注。负面挖掘技术已经成为所有深层 XC 方法的关键组成部分,这些方法可以扩展到数百万个标签。然而,尽管最近取得了一些进展,使用大型编码器架构(如变压器)训练深层 XC 模型仍然具有挑战性。本文指出,流行的负面挖掘技术的内存开销往往迫使小批量保持较小的规模和缓慢的训练下来。作为回应,本文介绍了 NGAME,一种轻量级的小批量生成技术,它可以提供可证明的准确的批内阴性样品。与现有的负面采样技术相比,这使得培训可以使用更大的迷你批量,提供更快的收敛速度和更高的准确度。研究发现,NGAME 在一系列基准数据集的极端分类中,比最先进的方法准确率高达16% ,在检索用户访问网页显示个性化广告的搜索引擎查询时,准确率高达3% 。在一个流行搜索引擎的 A/B 测试中,NGAME 的点击率提高了23% 。 code 0
Federated Unlearning for On-Device Recommendation Wei Yuan, Hongzhi Yin, Fangzhao Wu, Shijie Zhang, Tieke He, Hao Wang Microsoft Research Asia, Beijing, China; Alibaba Cloud, Alibaba Group, Hangzhou, China; Nanjing University, Nanjing, China; Tencent, Shenzhen, China; The University of Queensland, Brisbane, Australia The increasing data privacy concerns in recommendation systems have made federated recommendations (FedRecs) attract more and more attention. Existing FedRecs mainly focus on how to effectively and securely learn personal interests and preferences from their on-device interaction data. Still, none of them considers how to efficiently erase a user's contribution to the federated training process. We argue that such a dual setting is necessary. First, from the privacy protection perspective, ``the right to be forgotten'' requires that users have the right to withdraw their data contributions. Without the reversible ability, FedRecs risk breaking data protection regulations. On the other hand, enabling a FedRec to forget specific users can improve its robustness and resistance to malicious clients' attacks. To support user unlearning in FedRecs, we propose an efficient unlearning method FRU (Federated Recommendation Unlearning), inspired by the log-based rollback mechanism of transactions in database management systems. It removes a user's contribution by rolling back and calibrating the historical parameter updates and then uses these updates to speed up federated recommender reconstruction. However, storing all historical parameter updates on resource-constrained personal devices is challenging and even infeasible. In light of this challenge, we propose a small-sized negative sampling method to reduce the number of item embedding updates and an importance-based update selection mechanism to store only important model updates. 
To evaluate the effectiveness of FRU, we propose an attack method to disturb FedRecs via a group of compromised users and use FRU to recover recommenders by eliminating these users' influence. Finally, we conduct experiments on two real-world recommendation datasets with two widely used FedRecs to show the efficiency and effectiveness of our proposed approaches. 推荐系统中日益增长的数据隐私问题使得联邦推荐(FedRecs)越来越受到人们的关注。现有的 FedRecs 主要关注如何有效和安全地从设备上的交互数据中学习个人兴趣和偏好。尽管如此,它们都没有考虑如何有效地删除用户对联合培训过程的贡献。我们认为这种双重设置是必要的。首先,从隐私保护的角度来看,“被遗忘的权利”要求用户有权撤回他们的数据贡献。如果没有这种可逆能力,FedRecs 就有可能违反数据保护规定。另一方面,允许 FedRec 忘记特定用户可以提高其健壮性和对恶意客户端攻击的抵抗力。为了支持联邦推荐系统中的用户去学习,受数据库管理系统中基于日志的事务回滚机制的启发,提出了一种有效的去学习方法 FRU (FederatedRecumentUnlearning)。它通过回滚和校准历史参数更新来消除用户的贡献,然后使用这些更新来加速联邦推荐重建。然而,在资源受限的个人设备上存储所有历史参数更新是具有挑战性的,甚至是不可行的。针对这一挑战,我们提出了一种小规模的负抽样方法来减少嵌入更新项的数量,以及一种基于重要性的更新选择机制来只存储重要的模型更新。为了评估 FRU 的有效性,我们提出了一种通过一组受到攻击的用户来干扰 FedRecs 的攻击方法,并通过消除这些用户的影响,使用 FRU 来恢复推荐信息。最后,我们使用两个广泛使用的 FedRecs 在两个真实世界的推荐数据集上进行实验,以证明我们提出的方法的效率和有效性。 code 0
Cognition-aware Knowledge Graph Reasoning for Explainable Recommendation Qingyu Bing, Qiannan Zhu, Zhicheng Dou Renmin University of China, Beijing, China Knowledge graphs (KGs) have been widely used in recommendation systems to improve recommendation accuracy and interpretability effectively. Recent research usually endows KG reasoning to find the multi-hop user-item connection paths for explaining why an item is recommended. The existing path-finding process is well designed by logic-driven inference algorithms, while there exists a gap between how algorithms and users perceive the reasoning process. Factually, human thinking is a natural reasoning process that can provide more proper and convincing explanations of why particular decisions are made. Motivated by the Dual Process Theory in cognitive science, we propose a cognition-aware KG reasoning model CogER for Explainable Recommendation, which imitates the human cognition process and designs two modules, i.e., System1 (making intuitive judgment) and System2 (conducting explicit reasoning), to generate the actual decision-making process. At each step during the cognition-aware reasoning process, System1 generates an intuitive estimation of the next-step entity based on the user's historical behavior, and System2 conducts explicit reasoning and selects the most promising knowledge entities. These two modules work iteratively and are mutually complementary, enabling our model to yield high-quality recommendations and proper reasoning paths. Experiments on three real-world datasets show that our model achieves better recommendation results with explanations compared with previous methods. 
知识图在推荐系统中得到了广泛的应用,有效地提高了推荐的准确性和可解释性。最近的研究通常使用 KG 推理来寻找多跳用户-项目的连接路径来解释为什么推荐一个项目。现有的路径寻找过程是由逻辑驱动的推理算法设计的,而算法与用户对推理过程的感知存在差距。事实上,人类的思考是一个自然的推理过程,可以提供更恰当和令人信服的解释为什么做出特定的决定。基于认知科学中的二元过程理论,本文提出了一种模仿人类认知过程的认知知觉 KG 推理模型 CogER for Explainable 汪洋推理模型,设计了系统1(直觉判断)和系统2(显性推理)两个模块来生成实际的决策过程。在认知推理过程的每个步骤中,System ~ 1根据用户的历史行为对下一步实体进行直观的估计,System ~ 2进行显式推理并选择最有前途的知识实体。这两个模块迭代工作,相互补充,使我们的模型能够产生高质量的建议和适当的推理路径。在三个实际数据集上的实验结果表明,与以前的方法相比,该模型在解释方面取得了较好的推荐效果。 code 0
AGREE: Aligning Cross-Modal Entities for Image-Text Retrieval Upon Vision-Language Pre-trained Models Xiaodan Wang, Lei Li, Zhixu Li, Xuwu Wang, Xiangru Zhu, Chengyu Wang, Jun Huang, Yanghua Xiao Fudan University & Fudan-Aishu Cognitive Intelligence Joint Research Center, Shanghai, China; East China Normal University, Shanghai, China; Alibaba Group, Hangzhou, China; Fudan University, Shanghai, China Image-text retrieval is a challenging cross-modal task that has attracted much attention. While traditional methods cannot break down the barriers between different modalities, Vision-Language Pre-trained (VLP) models greatly improve image-text retrieval performance based on massive image-text pairs. Nonetheless, VLP-based methods are still prone to producing retrieval results whose entities are not aligned across modalities. Recent efforts try to fix this problem at the pre-training stage, which is not only expensive but also impractical because the full datasets are unavailable. In this paper, we propose a lightweight and practical approach that aligns cross-modal entities for image-text retrieval on top of VLP models, only at the fine-tuning and re-ranking stages. We employ external knowledge and tools to construct extra fine-grained image-text pairs, and then emphasize cross-modal entity alignment through contrastive learning and entity-level mask modeling in fine-tuning. Besides, two re-ranking strategies are proposed, including one specially designed for zero-shot scenarios. Extensive experiments with several VLP models on multiple Chinese and English datasets show that our approach achieves state-of-the-art results in nearly all settings. 
图像-文本检索是一个具有挑战性的跨模态任务,引起了人们的广泛关注。传统的检索方法无法突破不同检索模式之间的障碍,而视觉语言预训练(VLP)模型可以大大提高基于海量图像-文本对的图像-文本检索性能。尽管如此,基于 VLP 的方法仍然容易产生不能与实体进行跨模式对齐的检索结果。最近的努力试图在训练前阶段解决这个问题,这不仅昂贵,而且不切实际,因为没有完整的数据集。本文提出了一种轻量级、实用的方法,仅在微调和重新排序阶段对 VLP 模型上的跨模态实体进行对齐。我们利用外部知识和工具构造超细粒度的图像-文本对,然后通过对比学习和实体级掩模建模进行微调,强调跨模态实体对齐。此外,提出了两种重新排序策略,包括一种专门为零射击场景设计的重新排序策略。在多个中文和英文数据集上对多个 VLP 模型进行的大量实验表明,我们的方法在几乎所有的设置中都取得了最先进的结果。 code 0
Disentangled Representation for Diversified Recommendations Xiaoying Zhang, Hongning Wang, Hang Li AI Lab, Bytedance Inc., Beijing, China; Department of Computer Science, University of Virginia, Charlottesville, VA, USA Accuracy and diversity have long been considered to be two conflicting goals for recommendations. We point out, however, that as the diversity is typically measured by certain pre-selected item attributes, e.g., category as the most popularly employed one, improved diversity can be achieved without sacrificing recommendation accuracy, as long as the diversification respects the user's preference about the pre-selected attributes. This calls for a fine-grained understanding of a user's preferences over items, where one needs to recognize the user's choice is driven by the quality of the item itself, or the pre-selected attributes of the item. In this work, we focus on diversity defined on item categories. We propose a general diversification framework agnostic to the choice of recommendation algorithms. Our solution disentangles the learnt user representation in the recommendation module into category-independent and category-dependent components to differentiate a user's preference over items from two orthogonal perspectives. Experimental results on three benchmark datasets and online A/B test demonstrate the effectiveness of our solution in improving both recommendation accuracy and diversity. In-depth analysis suggests that the improvement is due to our improved modeling of users' categorical preferences and refined ranking within item categories. 长期以来,准确性和多样性一直被认为是提出建议的两个相互冲突的目标。然而,我们指出,由于多样性通常是通过某些预先选择的项目属性来衡量的,例如,类别作为最受欢迎的一个,只要多样性尊重用户对预先选择的属性的偏好,就可以在不牺牲推荐准确性的情况下实现改善的多样性。这需要对用户对项目的偏好有一个细粒度的理解,在这种情况下,人们需要认识到用户的选择是由项目本身的质量或预先选择的项目属性驱动的。在这项工作中,我们关注的多样性定义的项目类别。我们提出了一个与推荐算法的选择无关的通用多样化框架。我们的解决方案将推荐模块中的学习用户表示分解为与类别无关和与类别相关的组件,从两个正交的角度区分用户对项目的偏好。在三个基准数据集上的实验结果和在线 A/B 测试表明了该方案在提高推荐精度和多样性方面的有效性。深入的分析表明,这种改进是由于我们改进了对用户分类偏好的建模,并在项目类别中进行了精确的排名。 code 0
Knowledge-Adaptive Contrastive Learning for Recommendation Hao Wang, Yao Xu, Cheng Yang, Chuan Shi, Xin Li, Ning Guo, Zhiyuan Liu Beijing University of Posts and Telecommunications, Beijing, China; Researcher, Beijing, China; Tsinghua University, Beijing, China By jointly modeling user-item interactions and knowledge graph (KG) information, KG-based recommender systems have shown their superiority in alleviating data sparsity and cold start problems. Recently, graph neural networks (GNNs) have been widely used in KG-based recommendation, owing to their strong ability to capture high-order structural information. However, we argue that existing GNN-based methods have the following two limitations. Interaction domination: the supervision signal of user-item interaction will dominate the model training, and thus the information of the KG is barely encoded in the learned item representations; Knowledge overload: the KG contains much recommendation-irrelevant information, and such noise is amplified during the message aggregation of GNNs. These limitations prevent existing methods from fully utilizing the valuable information in the KG. In this paper, we propose a novel algorithm named Knowledge-Adaptive Contrastive Learning (KACL) to address these challenges. Specifically, we first generate data augmentations from the user-item interaction view and the KG view separately, and perform contrastive learning across the two views. Our design of the contrastive loss forces the item representations to encode information shared by both views, thereby alleviating the interaction domination issue. Moreover, we introduce two learnable view generators to adaptively remove task-irrelevant edges during data augmentation, which helps tolerate the noise brought by knowledge overload. Experimental results on three public benchmarks demonstrate that KACL can significantly improve the performance on top-K recommendation compared with state-of-the-art methods. 
基于 KG 的推荐系统通过联合建模用户-项目交互和知识图(KG)信息,在缓解数据稀疏和冷启动问题方面显示出其优越性。近年来,图神经网络(GNN)由于具有较强的高阶结构信息捕获能力,在 KG 推荐中得到了广泛的应用。然而,我们认为现有的基于 GNN 的方法有以下两个局限性。交互控制: 用户-项目交互的监控信号将主导模型训练,因此 KG 的信息几乎不被编码到学习项目表示中; 知识超载: KG 包含大量与推荐无关的信息,这种噪声在 GNN 的信息聚合过程中会被放大。上述限制使得现有的方法无法充分利用幼儿园的宝贵信息。本文提出了一种新的知识自适应对比学习(KACL)算法来解决这些问题。具体来说,我们首先分别从用户项交互视图和 KG 视图生成数据增强,并在两个视图之间进行对比学习。对比性损失的设计将迫使项目表征对两种视图共享的信息进行编码,从而缓解交互支配问题。此外,本文还引入了两个可学习的视图生成器来自适应地去除数据增强过程中与任务无关的边缘,并有助于抑制知识过载带来的噪声。在三个公共基准上的实验结果表明,KACL 算法能够显著提高 top-K 推荐的性能。 code 0
Calibrated Recommendations as a Minimum-Cost Flow Problem Himan Abdollahpouri, Zahra Nazari, Alex Gain, Clay Gibson, Maria Dimakopoulou, Jesse Anderton, Benjamin A. Carterette, Mounia Lalmas, Tony Jebara Airbnb, San Francisco, CA, USA; Spotify, London, United Kingdom; Spotify, New York, NY, USA Calibration in recommender systems has recently gained significant attention. In the recommended list of items, calibration ensures that the various (past) areas of interest of a user are reflected with their corresponding proportions. For instance, if a user has watched, say, 80 romance movies and 20 action movies, then it is reasonable to expect the recommended list of movies to be comprised of about 80% romance and 20% action movies as well. Calibration is particularly important given that optimizing towards accuracy often leads to the user's minority interests being dominated by their main interests, or by a few overall popular items, in the recommendations they receive. In this paper, we propose a novel approach based on the max flow problem for generating calibrated recommendations. In a series of experiments using two publicly available datasets, we demonstrate the superior performance of our proposed approach compared to the state-of-the-art in generating relevant and calibrated recommendation lists. 推荐系统的校准最近引起了广泛的关注。在推荐的项目列表中,校准确保用户感兴趣的各个(过去的)领域以其相应的比例得到反映。例如,如果一个用户已经看了80部浪漫电影和20部动作片,那么推荐的电影列表应该包括80% 的浪漫电影和20% 的动作片。校准尤其重要,因为对准确性的优化往往导致用户的少数利益被他们的主要利益所主导,或者在他们收到的推荐中被一些整体流行的项目所主导。在本文中,我们提出了一种新的方法基于最大流问题生成校准的建议。在使用两个公开可用数据集的一系列实验中,我们证明了我们提出的方法在生成相关和校准的推荐列表方面的优越性能。 code 0
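The 80/20 example in the abstract above can be made concrete by comparing the genre proportions of a user's history with those of a recommended list. The sketch below only measures miscalibration; it does not implement the paper's flow-based generation method. Total variation distance is used as one simple choice of gap measure (an assumption, not the paper's metric), and the function names are hypothetical.

```python
from collections import Counter


def genre_distribution(genres):
    """Return the proportion of each genre in a list of genre labels."""
    counts = Counter(genres)
    total = sum(counts.values())
    return {genre: count / total for genre, count in counts.items()}


def miscalibration(history_genres, recommended_genres):
    """Total variation distance between history and recommendation genres.

    0.0 means the recommended list mirrors the history proportions exactly;
    1.0 means the two lists share no genre at all.
    """
    p = genre_distribution(history_genres)
    q = genre_distribution(recommended_genres)
    all_genres = set(p) | set(q)
    return 0.5 * sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in all_genres)


history = ["romance"] * 80 + ["action"] * 20
calibrated = ["romance"] * 8 + ["action"] * 2
romance_only = ["romance"] * 10

print(genre_distribution(history))            # {'romance': 0.8, 'action': 0.2}
print(miscalibration(history, calibrated))    # 0.0
print(round(miscalibration(history, romance_only), 6))  # 0.2
```

A list made only of the user's majority genre scores 0.2 here, which is exactly the share of the minority interest that the list fails to represent.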
Generative Slate Recommendation with Reinforcement Learning Romain Deffayet, Thibaut Thonet, JeanMichel Renders, Maarten de Rijke Naver Labs Europe, Meylan, France; University of Amsterdam, Amsterdam, Netherlands; Naver Labs Europe & University of Amsterdam, Meylan, France Recent research has employed reinforcement learning (RL) algorithms to optimize long-term user engagement in recommender systems, thereby avoiding common pitfalls such as user boredom and filter bubbles. They capture the sequential and interactive nature of recommendations, and thus offer a principled way to deal with long-term rewards and avoid myopic behaviors. However, RL approaches are intractable in the slate recommendation scenario - where a list of items is recommended at each interaction turn - due to the combinatorial action space. In that setting, an action corresponds to a slate that may contain any combination of items. While previous work has proposed well-chosen decompositions of actions so as to ensure tractability, these rely on restrictive and sometimes unrealistic assumptions. Instead, in this work we propose to encode slates in a continuous, low-dimensional latent space learned by a variational auto-encoder. Then, the RL agent selects continuous actions in this latent space, which are ultimately decoded into the corresponding slates. By doing so, we are able to (i) relax assumptions required by previous work, and (ii) improve the quality of the action selection by modeling full slates instead of independent items, in particular by enabling diversity. Our experiments performed on a wide array of simulated environments confirm the effectiveness of our generative modeling of slates over baselines in practical scenarios where the restrictive assumptions underlying the baselines are lifted. Our findings suggest that representation learning using generative models is a promising direction towards generalizable RL-based slate recommendation. 
最近的研究使用强化学习算法来优化推荐系统中的长期用户参与,从而避免常见的陷阱,如用户厌烦和过滤气泡。它们捕捉到了建议的连续性和互动性,因此提供了一个处理长期奖励和避免短视行为的原则性方法。然而,由于组合操作空间的存在,RL 方法在平板推荐场景(在每个交互回合中推荐一个项目列表)中是难以处理的。在该设置中,一个操作对应于可能包含任何项目组合的石板。虽然以前的工作提出了精心选择的行动分解方法,以确保易于处理,但这些方法依赖于限制性的、有时不切实际的假设。相反,在这项工作中,我们建议编码石板在一个连续的,低维潜在的空间学习变分自动编码器。然后,RL 代理在这个潜在空间中选择连续的动作,这些动作最终被解码到相应的平板中。通过这样做,我们能够(i)放松以前工作所需要的假设,(ii)通过建模完整的板岩而不是独立的项目来提高行动选择的质量,特别是通过支持多样性。我们在大量的模拟环境中进行的实验证实了我们在基线上的板岩生成模型在实际场景中的有效性,在实际场景中基线的限制性假设被取消。我们的研究结果表明,使用生成模型的表示学习是一个有前途的方向,可推广的 RL 为基础的板岩推荐。 code 0
AutoGen: An Automated Dynamic Model Generation Framework for Recommender System Chenxu Zhu, Bo Chen, Huifeng Guo, Hang Xu, Xiangyang Li, Xiangyu Zhao, Weinan Zhang, Yong Yu, Ruiming Tang City University of Hong Kong, Hong Kong, China; Huawei Noah's Ark Lab, Shenzhen, China; Shanghai Jiao Tong University, Shanghai, China Considering the balance between revenue and resource consumption for industrial recommender systems, intelligent recommendation computing has been emerging recently. Existing solutions deploy the same recommendation model to serve users indiscriminately, which is sub-optimal for total revenue maximization. We propose a multi-model service solution by deploying different-complexity models to serve different-valued users. An automated dynamic model generation framework AutoGen is elaborated to efficiently derive multiple parameter-sharing models with diverse complexities and adequate predictive capabilities. A mixed search space is designed and an importance-aware progressive training scheme is proposed to prevent interference between different architectures, which avoids the model retraining and improves the search efficiency, thereby efficiently deriving multiple models. Extensive experiments are conducted on two public datasets to demonstrate the effectiveness and efficiency of AutoGen. 考虑到工业推荐系统的收益和资源消耗之间的平衡,智能推荐计算近年来兴起。现有的解决方案部署相同的推荐模型来不加区分地为用户服务,这对于总收入最大化是次优的。我们提出了一个多模型服务解决方案,通过部署不同复杂度的模型来服务不同价值的用户。提出了一种自动动态模型生成框架 AutoGen,有效地推导出复杂度不同、预测能力足够的多参数共享模型。设计了一种混合搜索空间,提出了一种重要性感知的渐进训练方案,避免了模型再训练,提高了搜索效率,从而有效地推导出多个模型。为了验证 AutoGen 的有效性和效率,在两个公共数据集上进行了大量的实验。 code 0
Bring Your Own View: Graph Neural Networks for Link Prediction with Personalized Subgraph Selection Qiaoyu Tan, Xin Zhang, Ninghao Liu, Daochen Zha, Li Li, Rui Chen, SooHyun Choi, Xia Hu Rice University, Houston, TX, USA; Samsung Electronics, Mountain View, CA, USA; Samsung Electronics America, Mountain View, CA, USA; Texas A&M University, College Station, TX, USA; University of Georgia, Athens, GA, USA; The Hong Kong Polytechnic University, Hong Kong, Hong Kong Graph neural networks (GNNs) have achieved remarkable success in link prediction (GNNLP) tasks. Existing efforts first predefine the subgraph for the whole dataset and then apply GNNs to encode edge representations by leveraging the neighborhood structure induced by the fixed subgraph. The prominence of GNNLP methods significantly relies on the ad hoc subgraph. Since node connectivity in real-world graphs is complex, one shared subgraph is limited for all edges. Thus, the choices of subgraphs should be personalized to different edges. However, performing personalized subgraph selection is nontrivial since the potential selection space grows exponentially with the number of edges. Besides, the inference edges are not available during training in link prediction scenarios, so the selection process needs to be inductive. To bridge the gap, we introduce a Personalized Subgraph Selector (PS2) as a plug-and-play framework to automatically, personally, and inductively identify optimal subgraphs for different edges when performing GNNLP. PS2 is instantiated as a bi-level optimization problem that can be efficiently solved differentiably. Coupling GNNLP models with PS2, we suggest a brand-new angle towards GNNLP training: first identifying the optimal subgraphs for edges, and then focusing on training the inference model by using the sampled subgraphs.
Comprehensive experiments endorse the effectiveness of our proposed method across various GNNLP backbones (GCN, GraphSage, NGCF, LightGCN, and SEAL) and diverse benchmarks (Planetoid, OGB, and Recommendation datasets). Our code is publicly available at https://github.com/qiaoyu-tan/PS2. 图神经网络(GNNs)在链路预测(GNNLP)任务中取得了显著的成功。现有的工作首先为整个数据集预先定义子图,然后利用固定子图产生的邻域结构,应用 GNN 对边缘表示进行编码。GNNLP 方法的突出性在很大程度上依赖于自组织子图。由于实际图中的节点连通性比较复杂,所以对于所有边,一个共享子图是有限的。因此,子图的选择应该针对不同的边进行个性化。然而,执行个性化的子图选择是不平凡的,因为潜在的选择空间成指数增长的尺度的边。此外,在链路预测场景的训练过程中,推理边不可用,因此选择过程需要归纳。为了弥补这一差距,我们引入了一个个性化子图选择器(PS2)作为即插即用的框架,以便在执行 GNNLP 时自动、个性化和归纳地识别不同边的最优子图。PS2被实例化为一个双层最佳化问题,可以通过不同的方式有效地解决。将 GNNLP 模型与 PS2相结合,提出了一种全新的 GNNLP 训练方法: 首先确定最优边缘子图,然后利用抽样子图对推理模型进行训练。综合实验认可了我们提出的方法在各种 GNNLP 骨干网(GCN,GraphSage,NGCF,LightGCN 和 SEAL)和各种基准(Planetoid,OGB 和推荐数据集)中的有效性。我们的代码可以在 https://github.com/qiaoyu-tan/PS2 上公开获得 code 0
Heterogeneous Graph-based Context-aware Document Ranking Shuting Wang, Zhicheng Dou, Yutao Zhu Renmin University of China, Beijing, China; University of Montreal, Montreal, PQ, Canada Users' complex information needs usually require consecutive queries, which results in sessions with a series of interactions. Exploiting such contextual interactions has been proven to be favorable for result ranking. However, existing studies mainly model the contextual information independently and sequentially. They neglect the diverse information hidden in different relations and structured information of session elements as well as the valuable signals from other relevant sessions. In this paper, we propose HEXA, a heterogeneous graph-based context-aware document ranking framework. It exploits heterogeneous graphs to organize the contextual information and beneficial search logs for modeling user intents and ranking results. Specifically, we construct two heterogeneous graphs, i.e., a session graph and a query graph. The session graph is built from the current session queries and documents. Meanwhile, we sample the current query's k-layer neighbors from search logs to construct the query graph. Then, we employ heterogeneous graph neural networks and specialized readout functions on the two graphs to capture the user intents from local and global aspects. Finally, the document ranking scores are measured by how well the documents are matched with the two user intents. Results on two large-scale datasets confirm the effectiveness of our model. 用户复杂的信息需求通常需要连续的查询,这会导致一系列交互的会话。利用这样的情境互动已被证明是有利于结果排名。然而,现有的研究主要是依次独立地对语境信息进行建模。它们忽视了隐藏在不同关系中的各种信息、会议要素的结构化信息以及其他相关会议的宝贵信号。本文提出了一种基于异构图的上下文感知文档排序框架 HEXA。它利用异构图来组织上下文信息和有益的搜索日志,以建立用户意图和排序结果。具体来说,我们构造了两个异构图,即会话图和查询图。会话图是根据当前会话查询和文档构建的。同时,从搜索日志中抽取当前查询的 k 层邻居,构造查询图。然后,利用异构图形神经网络和专用读出函数,从局部和全局两个方面获取用户意图。最后,文档排名分数通过文档与两个用户意图的匹配程度来衡量。两个大规模数据集的结果证实了模型的有效性。 code 0
Graph Summarization via Node Grouping: A Spectral Algorithm Arpit Merchant, Michael Mathioudakis, Yanhao Wang East China Normal University, Shanghai, China; University of Helsinki, Helsinki, Finland Graph summarization via node grouping is a popular method to build concise graph representations by grouping nodes from the original graph into supernodes and encoding edges into superedges such that the loss of adjacency information is minimized. Such summaries have immense applications in large-scale graph analytics due to their small size and high query processing efficiency. In this paper, we reformulate the loss minimization problem for summarization into an equivalent integer maximization problem. By initially allowing relaxed (fractional) solutions for integer maximization, we analytically expose the underlying connections to the spectral properties of the adjacency matrix. Consequently, we design an algorithm called SpecSumm that consists of two phases. In the first phase, motivated by spectral graph theory, we apply k-means clustering on the k largest (in magnitude) eigenvectors of the adjacency matrix to assign nodes to supernodes. In the second phase, we propose a greedy heuristic that updates the initial assignment to further improve summary quality. Finally, via extensive experiments on 11 datasets, we show that SpecSumm efficiently produces high-quality summaries compared to state-of-the-art summarization algorithms and scales to graphs with millions of nodes. 基于节点分组的图摘要是一种流行的图表示方法,它将原始图中的节点分组成超节点,并将边编码成超边,从而使邻接信息的损失最小化。这种摘要由于体积小、查询处理效率高,在大规模图形分析中有着广泛的应用。本文将汇总损失最小化问题重新表述为等价整数最大化问题。通过最初允许整数最大化的松弛(分数)解,我们分析地揭示了邻接矩阵的光谱特性的潜在联系。因此,我们设计了一个称为 SpecSumm 的算法,该算法由两个阶段组成。在第一阶段,由谱图理论驱动,我们应用 K平均算法对邻接矩阵的 k 个最大(大小)特征向量来分配节点到超节点。在第二阶段,我们提出了一个贪婪的启发式算法,更新初始分配以进一步提高汇总质量。最后,通过对11个数据集的大量实验,我们发现 SpecSumm 与最先进的摘要算法相比,能够有效地生成高质量的摘要,并且可以对具有数百万个节点的图进行缩放。 code 0
Ranking-based Group Identification via Factorized Attention on Social Tripartite Graph Mingdai Yang, Zhiwei Liu, Liangwei Yang, Xiaolong Liu, Chen Wang, Hao Peng, Philip S. Yu University of Illinois at Chicago, Chicago, IL, USA; Salesforce AI Research, Palo Alto, CA, USA; Beihang University, Beijing, China Due to the proliferation of social media, a growing number of users search for and join group activities in their daily life. This develops a need for the study on the ranking-based group identification (RGI) task, i.e., recommending groups to users. The major challenge in this task is how to effectively and efficiently leverage both the item interaction and group participation of users' online behaviors. Though recent developments of Graph Neural Networks (GNNs) succeed in simultaneously aggregating both social and user-item interaction, they however fail to comprehensively resolve this RGI task. In this paper, we propose a novel GNN-based framework named Contextualized Factorized Attention for Group identification (CFAG). We devise tripartite graph convolution layers to aggregate information from different types of neighborhoods among users, groups, and items. To cope with the data sparsity issue, we devise a novel propagation augmentation (PA) layer, which is based on our proposed factorized attention mechanism. PA layers efficiently learn the relatedness of non-neighbor nodes to improve the information propagation to users. Experimental results on three benchmark datasets verify the superiority of CFAG. Additional detailed investigations are conducted to demonstrate the effectiveness of the proposed framework. 
随着社交媒体的普及,越来越多的用户在日常生活中搜索和参加小组活动。这就需要对基于排名的群体识别(RGI)任务进行研究,即向用户推荐群体。这项任务的主要挑战是如何有效地利用项目互动和用户在线行为的群体参与。近年来图形神经网络(GNN)虽然成功地同时聚合了社会交互和用户项目交互,但未能全面解决这一 RGI 任务。本文提出了一种新的基于 GNN 的群体识别框架——上下文分解注意(CFAG)。我们设计三部分图卷积层来聚合来自用户、组和项目之间不同类型的邻域的信息。为了解决数据稀疏的问题,我们在分解注意机制的基础上,设计了一种新的传播增强(PA)层。PA 层有效地学习非邻居节点的相关性,提高信息传播给用户的效率。在三个基准数据集上的实验结果验证了 CFAG 算法的优越性。还进行了更多的详细调查,以证明拟议框架的有效性。 code 0
Graph Sequential Neural ODE Process for Link Prediction on Dynamic and Sparse Graphs Linhao Luo, Gholamreza Haffari, Shirui Pan Griffith University, Brisbane, QLD, Australia; Monash University, Melbourne, VIC, Australia Link prediction on dynamic graphs is an important task in graph mining. Existing approaches based on dynamic graph neural networks (DGNNs) typically require a significant amount of historical data (interactions over time), which is not always available in practice. The missing links over time, which is a common phenomenon in graph data, further aggravates the issue and thus creates extremely sparse and dynamic graphs. To address this problem, we propose a novel method based on the neural process, called Graph Sequential Neural ODE Process (GSNOP). Specifically, GSNOP combines the advantage of the neural process and neural ordinary differential equation that models the link prediction on dynamic graphs as a dynamic-changing stochastic process. By defining a distribution over functions, GSNOP introduces the uncertainty into the predictions, making it generalize to more situations instead of overfitting to the sparse data. GSNOP is also agnostic to model structures that can be integrated with any DGNN to consider the chronological and geometrical information for link prediction. Extensive experiments on three dynamic graph datasets show that GSNOP can significantly improve the performance of existing DGNNs and outperform other neural process variants. 动态图的链接预测是图挖掘中的一项重要任务。基于动态图神经网络(DGNN)的现有方法通常需要大量的历史数据(随着时间的推移相互作用) ,这在实践中并不总是可用的。随着时间的推移,丢失链接是图形数据中常见的现象,这进一步加剧了问题的严重性,从而产生了极其稀疏和动态的图形。为了解决这一问题,我们提出了一种基于神经过程的图序贯神经 ODE 过程(GSNOP)方法。具体来说,GSNOP 结合了神经过程和神经常微分方程的优势,将动态图表上的链接预测建模为一个动态变化的随机过程。通过定义函数上的分布,GSNOP 将不确定性引入到预测中,使其能够推广到更多的情况,而不是对稀疏数据进行过度拟合。GSNOP 也是不可知的模型结构,可以与任何 DGNN 集成,以考虑时间和几何信息的链路预测。在三个动态图数据集上的大量实验表明,GSNOP 可以显著提高现有 DGNN 的性能,并优于其他神经过程变体。 code 0
CL4CTR: A Contrastive Learning Framework for CTR Prediction Fangye Wang, Yingxu Wang, Dongsheng Li, Hansu Gu, Tun Lu, Peng Zhang, Ning Gu Microsoft Research Asia, Shanghai, China; Independent, Seattle, WA, USA; Fudan University, Shanghai, China Many Click-Through Rate (CTR) prediction works focused on designing advanced architectures to model complex feature interactions but neglected the importance of feature representation learning, e.g., adopting a plain embedding layer for each feature, which results in sub-optimal feature representations and thus inferior CTR prediction performance. For instance, low frequency features, which account for the majority of features in many CTR tasks, are less considered in standard supervised learning settings, leading to sub-optimal feature representations. In this paper, we introduce self-supervised learning to produce high-quality feature representations directly and propose a model-agnostic Contrastive Learning for CTR (CL4CTR) framework consisting of three self-supervised learning signals to regularize the feature representation learning: contrastive loss, feature alignment, and field uniformity. The contrastive module first constructs positive feature pairs by data augmentation and then minimizes the distance between the representations of each positive feature pair by the contrastive loss. The feature alignment constraint forces the representations of features from the same field to be close, and the field uniformity constraint forces the representations of features from different fields to be distant. Extensive experiments verify that CL4CTR achieves the best performance on four datasets and has excellent effectiveness and compatibility with various representative baselines. 
许多点进率预测工作集中于设计先进的体系结构来模拟复杂的特征交互,但忽视了特征表示学习的重要性,例如,对每个特征采用一个普通的嵌入层,这导致了次优的特征表示,从而导致了较差的 CTR 预测性能。例如,在许多点击率任务中占大多数的低频特征,在标准的监督式学习设置中很少被考虑,导致了次优的特征表示。本文引入自监督学习,直接生成高质量的特征表示,提出了一种由三个自监督学习信号组成的模型无关的 CTR 对比学习(CL4CTR)框架,用于规范特征表示学习: 对比度丢失、特征对齐和场均匀性。对比模块首先通过数据增强构造正特征对,然后通过对比损失最小化每个正特征对表示之间的距离。特征对齐约束迫使来自同一域的特征表示相近,而场均匀性约束迫使来自不同域的特征表示相距较远。大量的实验证明,CL4CTR 在四个数据集上取得了最好的性能,并且与各种代表性的基线具有良好的效率和兼容性。 code 0
Telecommunication Traffic Forecasting via Multi-task Learning Xiaochuan Gou, Xiangliang Zhang King Abdullah University of Science and Technology, Thuwal, Saudi Arabia; University of Notre Dame & King Abdullah University of Science and Technology, Notre Dame, IN, USA Accurate telecommunication time series forecasting is critical for smart management systems of cellular networks, and has a special challenge in predicting different types of time series simultaneously at one base station (BS), e.g., the SMS, Calls, and Internet. Unlike the well-studied single target forecasting problem for one BS, this distributed multi-target forecasting problem should take advantage of both the intra-BS dependence of different types of time series at the same BS and the inter-BS dependence of time series at different BS. To this end, we first propose a model to learn the inter-BS dependence by aggregating the multi-view dependence, e.g., from the viewpoint of SMS, Calls, and Internet. To incorporate the inter-BS dependence in time series forecasting, we then propose a Graph Gate LSTM (GGLSTM) model that includes a graph-based gate mechanism to unite those base stations with a strong dependence on learning a collaboratively strengthened prediction model. We also extract the intra-BS dependence by an attention network and use it in the final prediction. Our proposed approach is evaluated on two real-world datasets. Experiment results demonstrate the effectiveness of our model in predicting multiple types of telecom traffic at the distributed base stations. 准确的电信时间序列预测对于蜂窝网络的智能管理系统至关重要,并且对于在一个基站(BS)同时预测不同类型的时间序列(如短信、呼叫和因特网)具有特殊的挑战。这种分布式多目标预测问题不同于已有研究的单目标预测问题,它既要利用同一目标点上不同类型时间序列的内部相关性,又要利用不同目标点上时间序列的内部相关性。为此,我们首先从短信、呼叫和互联网的角度提出了一个通过聚合多视图依赖来学习基站间依赖的模型。为了在时间序列预测中考虑基站间的相关性,我们提出了一种基于图的门机制的图门 LSTM (GGLSTM)模型,该模型可以将那些强烈依赖于学习协同增强预测模型的基站联合起来。我们还利用一个注意网络提取了 BS 内部的相关性,并将其应用于最终的预测。我们提出的方法是评估两个真实世界的数据集。实验结果表明,该模型能够有效地预测分布式基站的多种电信业务类型。 code 0
Uncertainty Quantification for Fairness in Two-Stage Recommender Systems Lequn Wang, Thorsten Joachims Cornell University, Ithaca, NY, USA Many large-scale recommender systems consist of two stages. The first stage efficiently screens the complete pool of items for a small subset of promising candidates, from which the second-stage model curates the final recommendations. In this paper, we investigate how to ensure group fairness to the items in this two-stage architecture. In particular, we find that existing first-stage recommenders might select an irrecoverably unfair set of candidates such that there is no hope for the second-stage recommender to deliver fair recommendations. To this end, motivated by recent advances in uncertainty quantification, we propose two threshold-policy selection rules that can provide distribution-free and finite-sample guarantees on fairness in first-stage recommenders. More concretely, given any relevance model of queries and items and a point-wise lower confidence bound on the expected number of relevant items for each threshold-policy, the two rules find near-optimal sets of candidates that contain enough relevant items in expectation from each group of items. To instantiate the rules, we demonstrate how to derive such confidence bounds from potentially partial and biased user feedback data, which are abundant in many large-scale recommender systems. In addition, we provide both finite-sample and asymptotic analyses of how close the two threshold selection rules are to the optimal thresholds. Beyond this theoretical analysis, we show empirically that these two rules can consistently select enough relevant items from each group while minimizing the size of the candidate sets for a wide range of settings. 
许多大型推荐系统由两个阶段组成。第一阶段有效地筛选出一小部分有希望的候选人的完整项目库,第二阶段模型从中筛选出最终的建议。在本文中,我们研究了如何在这两个阶段的体系结构中保证项目的群公平性。特别是,我们发现现有的第一阶段推荐人可能会选择一组不可挽回的不公平候选人,以至于第二阶段推荐人没有希望提供公平的推荐。为此,在不确定性量化研究的最新进展的推动下,我们提出了两个门限策略选择规则,它们可以为第一阶段推荐者的公平性提供无分布和有限样本的保证。更具体地说,给定任何查询和项目的相关性模型,以及对每个阈值策略的相关项目预期数量的逐点置信下限,这两个规则发现在每组项目预期中包含足够相关项目的候选集接近最优。为了实例化这些规则,我们演示了如何从潜在的部分和有偏见的用户反馈数据中推导出这样的置信界限,这些数据在许多大规模的推荐系统中都很丰富。此外,我们还提供了有限样本和渐近分析,如何接近两个阈值选择规则的最佳阈值。除了这个理论分析,我们的经验表明,这两个规则可以一致地选择足够的相关项目从每个组,同时最小化候选集的大小为广泛的设置。 code 0
Revisiting Code Search in a Two-Stage Paradigm Fan Hu, Yanlin Wang, Lun Du, Xirong Li, Hongyu Zhang, Shi Han, Dongmei Zhang Renmin University of China, Beijing, China; The University of Newcastle, Sydney, NSW, Australia; Sun Yat-sen University, Zhuhai, China; Microsoft Research, Beijing, China With a good code search engine, developers can reuse existing code snippets and accelerate the software development process. Current code search methods can be divided into two categories: traditional information retrieval (IR) based and deep learning (DL) based approaches. DL-based approaches include the cross-encoder paradigm and the bi-encoder paradigm. However, both approaches have certain limitations. Inference with IR-based and bi-encoder models is fast but not accurate enough, while cross-encoder models achieve higher search accuracy but consume more time. In this work, we propose TOSS, a two-stage fusion code search framework that can combine the advantages of different code search methods. TOSS first uses IR-based and bi-encoder models to efficiently recall a small number of top-k code candidates, and then uses fine-grained cross-encoders for finer ranking. Furthermore, we conduct extensive experiments on different code candidate volumes and multiple programming languages to verify the effectiveness of TOSS. We also compare TOSS with six data fusion methods. Experimental results show that TOSS is not only efficient, but also achieves state-of-the-art accuracy with an overall mean reciprocal rank (MRR) score of 0.763, compared to the best baseline result on the CodeSearchNet benchmark of 0.713.
有了一个好的代码搜索引擎,开发人员可以重用现有的代码片段,并加快软件开发过程。目前的代码检索方法可分为两类: 传统的基于信息检索的方法和基于深度学习的方法。基于 DL 的方法包括交叉编码器范式和双编码器范式。然而,这两种方法都有一定的局限性。基于红外和双编码器模型的推理速度较快,但不够准确,而交叉编码器模型可以实现更高的搜索精度,但需要更多的时间。在这项工作中,我们提出了 TOSS,一个两阶段的融合代码搜索框架,可以结合不同的代码搜索方法的优点。TOSS 首先使用基于 IR 和双编码器的模型来有效地回忆少量的 top-k 代码候选,然后使用细粒度的交叉编码器进行更精细的排序。此外,我们在不同的代码候选卷和多种编程语言上进行了广泛的实验,以验证 TOSS 的有效性。并将 TOSS 与六种数据融合方法进行了比较。实验结果表明,与 CodeSearchNet 基准测试的最佳基线结果0.713相比,TOSS 不仅有效,而且达到了最先进的准确度,总体平均互惠排名(MRR)得分为0.763。 code 0
MMBench: The Match Making Benchmark Yongsheng Liu, Yanxing Qi, Jiangwei Zhang, Connie Kou, Qiaolin Chen Tencent, Shenzhen, China; Tencent, Singapore, Singapore Video gaming has gained huge popularity over the last few decades. As reported, there are about 2.9 billion gamers globally. Among all genres, competitive games are one of the most popular. Matchmaking is a core problem for competitive games: it determines player satisfaction and hence influences a game's success. Most matchmaking systems group the queuing players into opposing teams with similar skill levels. The key challenge is to accurately rate the players' skills based on their match performances. There has been an increasing amount of effort on developing rating systems such as Elo and Glicko. However, games with different gameplay might have different game modes, which might require an extensive amount of effort for rating system customization. Even though there are many rating system choices and various customization strategies, there is a clear lack of a systematic framework with which different rating systems can be analysed and compared against each other. Such a framework could help game developers to identify the bottlenecks of their matchmaking systems and enhance their performance. To bridge the gap, we present MMBench, the first benchmark framework for evaluating different rating systems. It serves as a fair means of comparison for different rating systems and enables a deeper understanding of them. In this paper, we present how MMBench can benchmark the three major rating systems, Elo, Glicko, and TrueSkill, in the battle modes of 1 vs 1, n vs n, battle royale, and teamed battle royale over both real and synthetic datasets.
在过去的几十年里,视频游戏获得了巨大的普及。据报道,全球大约有29亿玩家。在所有类型中,竞技游戏是最受欢迎的游戏之一。匹配是竞技游戏的核心问题,它决定着玩家的满意度,从而影响着游戏的成功。大多数匹配系统将排队的队员分组成技术水平相似的对立队伍。关键的挑战是根据球员的比赛表现来准确地评价他们的技术。在开发诸如 Elo、 Glicko 之类的评级系统方面,人们付出了越来越多的努力。然而,具有不同游戏玩法的游戏可能有不同的游戏模式,这可能需要大量的工作来评定系统定制。尽管有许多评级制度的选择和各种定制战略,但明显缺乏一个系统框架,用以分析和比较不同的评级制度。这样一个框架可以帮助游戏开发人员识别其匹配系统的瓶颈,并提高其匹配系统的性能。为了弥补差距,我们提出了 MMBench,第一个评估不同评级系统的基准框架。它作为一个公平的手段,比较不同的评级制度,并使不同的评级制度更深入的了解。在这篇文章中,我们将介绍 MMBench 如何在真实和合成数据集上对三个主要的评级系统进行基准测试: Elo,Glicko,Trueskill 在1对1,n 对 n 的战斗模式中,Battle royal 和 team fight royal。 code 0
Trustworthy Algorithmic Ranking Systems Markus Schedl, Emilia Gómez, Elisabeth Lex Graz University of Technology, Graz, Austria; Johannes Kepler University Linz & Linz Institute of Technology, Linz, Austria; European Commission, Joint Research Centre & Universitat Pompeu Fabra, Seville and Barcelona, Spain This tutorial aims at providing its audience an interdisciplinary overview about the topics of fairness and non-discrimination, diversity, and transparency as relevant dimensions of trustworthy AI systems, tailored to algorithmic ranking systems such as search engines and recommender systems. We will equip the mostly technical audience of WSDM with the necessary understanding of the social and ethical implications of their research and development on the one hand, and of recent ethical guidelines and regulatory frameworks addressing the aforementioned dimensions on the other hand. While the tutorial foremost takes a European perspective, starting from the concept of trustworthy AI and discussing EU regulation in this area currently in the implementation stages, we also consider related initiatives worldwide. Since ensuring non-discrimination, diversity, and transparency in retrieval and recommendation systems is an endeavor in which academic institutions and companies in different parts of the world should collaborate, this tutorial is relevant for researchers and practitioners interested in the ethical, social, and legal impact of their work. The tutorial, therefore, targets both academic scholars and practitioners around the globe, by reviewing recent research and providing practical examples addressing these particular trustworthiness aspects, and showcasing how new regulations affect the audience's daily work. 
本教程的目的是为其受众提供一个公平和非歧视,多样性和透明度作为值得信赖的人工智能系统的相关维度的主题的跨学科概述,定制的算法排名系统,如搜索引擎和推荐系统。我们将使主要是技术性的 WSDM 受众一方面对其研究和开发的社会和伦理影响有必要的了解,另一方面对处理上述方面的最新伦理准则和管理框架有必要的了解。虽然最重要的教程从欧洲的角度出发,从可信赖的人工智能的概念出发,讨论欧盟在这一领域目前正处于实施阶段的规章制度,但我们也考虑到世界各地的相关举措。由于确保检索和推荐系统的非歧视性、多样性和透明度是世界不同地区的学术机构和公司应该合作的一项努力,本教程适用于对其工作的伦理、社会和法律影响感兴趣的研究人员和从业人员。因此,本教程通过回顾最近的研究,并提供实际例子,解决这些特定的可信赖性方面,以及展示新的法规如何影响观众的日常工作,面向全球学术界学者和从业人员。 code 0
Proactive Conversational Agents Lizi Liao, Grace Hui Yang, Chirag Shah Singapore Management University, Singapore, Singapore; University of Washington, Seattle, WA, USA; Georgetown University, Washington, DC, USA Conversational agents, or commonly known as dialogue systems, have gained escalating popularity in recent years. Their widespread applications support conversational interactions with users and accomplishing various tasks as personal assistants. However, one key weakness in existing conversational agents is that they only learn to passively answer user queries via training on pre-collected and manually-labeled data. Such passiveness makes the interaction modeling and system-building process relatively easier, but it largely hinders the possibility of being human-like hence lowering the user engagement level. In this tutorial, we introduce and discuss methods to equip conversational agents with the ability to interact with end users in a more proactive way. This three-hour tutorial is divided into three parts and includes two interactive exercises. It reviews and presents recent advancements on the topic, focusing on automatically expanding ontology space, actively driving conversation by asking questions or strategically shifting topics, and retrospectively conducting response quality control. 会话代理,或通常称为对话系统,近年来越来越受欢迎。它们广泛的应用程序支持与用户的对话交互,并作为个人助理完成各种任务。然而,现有会话代理的一个关键弱点是,它们只能通过对预收集和手动标记的数据进行培训,学会被动地回答用户的查询。这种被动性使得交互建模和系统构建过程相对容易,但是它在很大程度上阻碍了人性化的可能性,从而降低了用户参与水平。在本教程中,我们将介绍和讨论使会话代理具备以更主动的方式与最终用户交互的能力的方法。这个三个小时的教程分为三个部分,包括两个互动练习。它回顾并介绍了最近在这一主题上的进展,侧重于自动扩展本体空间,通过提出问题或策略性地转移话题来积极推动会话,以及回顾性地进行回应质量控制。 code 0
AutoML for Deep Recommender Systems: Fundamentals and Advances Ruiming Tang, Bo Chen, Yejing Wang, Huifeng Guo, Yong Liu, Wenqi Fan, Xiangyu Zhao The Hong Kong Polytechnic University, Hong Kong, Hong Kong; City University of Hong Kong, Hong Kong, Hong Kong; Huawei Noah's Ark Lab, Shenzhen, China Recommender systems have become increasingly important in our daily lives since they play an important role in mitigating the information overload problem, especially in many user-oriented online services. Recommender systems aim to identify a set of items that best match users' explicit or implicit preferences, by utilizing the user and item interactions to improve the accuracy. With the fast advancement of deep neural networks (DNNs) in the past few decades, recommendation techniques have achieved promising performance. However, we still face three inherent challenges in designing deep recommender systems (DRS): 1) the majority of existing DRS are developed based on hand-crafted components, which requires ample expert knowledge of recommender systems; 2) human error and bias can lead to suboptimal components, which reduces the recommendation effectiveness; 3) non-trivial time and engineering efforts are usually required to design the task-specific components in different recommendation scenarios. In this tutorial, we aim to give a comprehensive survey on the recent progress of advanced Automated Machine Learning (AutoML) techniques for solving the above problems in deep recommender systems. More specifically, we will present feature selection, feature embedding search, feature interaction search, and whole DRS pipeline model training and comprehensive search for deep recommender systems. In this way, we expect academic researchers and industrial practitioners in related fields to gain deep understanding and accurate insight into the search spaces, stimulate more ideas and discussions, and promote the development of recommendation technologies.
推荐系统在我们的日常生活中越来越重要,因为它们在缓解信息超载问题方面发挥着重要作用,特别是在许多以用户为本的网上服务中。推荐系统旨在通过利用用户和项目的交互来提高准确性,从而识别出一组最符合用户显性或隐性偏好的项目。近几十年来,随着深度神经网络(DNN)的快速发展,推荐技术已经取得了令人满意的性能。然而,在设计深度推荐系统时,我们仍然面临三个内在的挑战: 1)现有的深度推荐系统大部分是基于手工制作的组件开发的,这需要大量的专家知识推荐系统; 2)人为错误和偏差可能导致次优组件,从而降低推荐的有效性; 3)在不同的推荐场景中,通常需要花费大量的时间和工程努力来设计任务特定的组件。在本教程中,我们的目的是给出一个全面的综述,最近的进展,先进的自动机器学习(AutoML)技术,以解决上述问题在深度推荐系统。更具体地说,我们将介绍深度推荐系统的特征选择、特征嵌入搜索、特征交互搜索以及整个 DRS 流水线模型训练和综合搜索。通过这种方式,我们期望学术研究人员和相关领域的行业从业人员能够深入理解和准确洞察空间,激发更多的想法和讨论,并在建议中促进技术的发展。 code 0
DIGMN: Dynamic Intent Guided Meta Network for Differentiated User Engagement Forecasting in Online Professional Social Platforms Feifan Li, Lun Du, Qiang Fu, Shi Han, Yushu Du, Guangming Lu, Zi Li Microsoft Research, Beijing, China; LinkedIn Corp., Beijing, China; Dalian University of Technology, Dalian, China User engagement prediction plays a critical role in designing interaction strategies to grow user engagement and increase revenue in online social platforms. Through in-depth analysis of real-world data from the world's largest professional social platform, i.e., LinkedIn, we find that users exhibit diverse engagement patterns, and a major reason for the differences in user engagement patterns is that users have different intents. That is, people have different intents when using LinkedIn, e.g., applying for jobs, building connections, or checking notifications, which show quite different engagement patterns. Meanwhile, user intents and the corresponding engagement patterns may change over time. Although such pattern differences and dynamics are essential for user engagement prediction, differentiating user engagement patterns based on users' dynamic intents for better user engagement forecasting has not received enough attention in previous works. In this paper, we propose a Dynamic Intent Guided Meta Network (DIGMN), which can explicitly model user intent varying with time and perform differentiated user engagement forecasting. Specifically, we derive some interpretable basic user intents as prior knowledge from data mining and introduce these prior intents in explicitly modeling dynamic user intent. Furthermore, based on the dynamic user intent representations, we propose a meta predictor to perform differentiated user engagement forecasting.
Through a comprehensive evaluation on LinkedIn anonymous user data, our method outperforms state-of-the-art baselines significantly, i.e., 2.96% and 3.48% absolute error reduction, on coarse-grained and fine-grained user engagement prediction tasks, respectively, demonstrating the effectiveness of our method. 用户参与度预测在设计交互策略以增加用户参与度和在线社交平台收入方面起着至关重要的作用。通过对世界上最大的专业社交平台 LinkedIn 的现实数据进行深入分析,我们发现用户暴露了不同的参与模式,用户参与模式不同的一个主要原因是用户有不同的意图。也就是说,人们在使用 LinkedIn 时有不同的意图,比如,申请工作,建立联系,或者查看通知,这些都显示出完全不同的参与模式。同时,用户意图和相应的参与模式可能会随着时间的推移而改变。尽管这些模式差异和动态对于用户参与预测是必不可少的,但是基于用户动态意图区分用户参与模式以获得更好的用户参与预测在以前的工作中没有得到足够的重视。本文提出了一种动态意图引导元网络(DIGMN) ,它可以显式地模拟随时间变化的用户意图,并进行差异化的用户参与预测。具体地说,我们从数据挖掘中推导出一些可解释的基本用户意图作为先验知识,并将先验意图引入到动态用户意图的显式建模中。此外,基于动态用户意图表示,我们提出了一个元预测器来执行差异化的用户参与预测。通过对 LinkedIn 匿名用户数据的综合评价,该方法在粗粒度和细粒度用户参与预测任务上分别显著优于最先进的基线(2.96% 和3.48%) ,证明了该方法的有效性。 code 0
BLADE: Biased Neighborhood Sampling based Graph Neural Network for Directed Graphs Srinivas Virinchi, Anoop Saladi Amazon, Bengaluru, India Directed graphs are ubiquitous and have applications across multiple domains including citation, website, social, and traffic networks. Yet, the majority of research involving graph neural networks (GNNs) focuses on undirected graphs. In this paper, we deal with the problem of node recommendation in directed graphs. Specifically, given a directed graph and a query node as input, the goal is to recommend the top-k nodes that have a high likelihood of a link with the query node. Here we propose BLADE, a novel GNN to model directed graphs. In order to jointly capture link likelihood and link direction, we employ an asymmetric loss function and learn dual embeddings for each node, by appropriately aggregating features from its neighborhood. In order to achieve optimal performance on both low and high-degree nodes, we employ a biased neighborhood sampling scheme that generates locally varying neighborhoods which differ based on a node's connectivity structure. Extensive experimentation on several open-source and proprietary directed graphs shows that BLADE outperforms state-of-the-art baselines by 6-230% in terms of HitRate and MRR for the node recommendation task and 10.5% in terms of AUC for the link direction prediction task. We perform an ablation study to accentuate the importance of biased neighborhood sampling employed in generating higher quality recommendations for both low-degree and high-degree query nodes. Further, BLADE delivers significant improvement in revenue and sales as measured through an A/B experiment.
有向图是无处不在,并有跨多个领域的应用,包括引用,网站,社会和流量网络。然而,大多数涉及图神经网络(GNN)的研究集中在无向图上。本文研究有向图中的节点推荐问题。具体来说,给定一个有向图和查询节点作为输入,目标是推荐与查询节点具有高度可能性的链接的顶部节点。在这里,我们提出 BLADE,一个新的 GNN 模型有向图。为了联合捕获链路可能性和链路方向,我们采用了一种非对称损失函数,通过适当地从每个节点的邻域聚集特征来学习每个节点的对偶嵌入。为了在低度和高度节点上获得最佳的性能,我们采用了一种有偏的邻域抽样方案,根据节点的连通性结构产生局部变化的邻域。对几个开放源码和专有有向图的广泛实验表明,BLADE 在节点推荐任务的 HitRate 和 MRR 方面比最先进的基线表现高出6-230% ,在链路方向预测任务的 AUC 方面高出10.5% 。我们进行消融研究,以强调有偏的邻域抽样的重要性,使用在产生高质量的建议,无论是低度和高度查询节点。此外,BLADE 通过 A/B 实验,在收入和销售方面取得了显著的改善。 code 0
Mining User-aware Multi-relations for Fake News Detection in Large Scale Online Social Networks Xing Su, Jian Yang, Jia Wu, Yuchen Zhang Macquarie University, Sydney, NSW, Australia Users' involvement in creating and propagating news is a vital aspect of fake news detection in online social networks. Intuitively, credible users are more likely to share trustworthy news, while untrusted users have a higher probability of spreading untrustworthy news. In this paper, we construct a dual-layer graph (i.e., the news layer and the user layer) to extract multiple relations of news and users in social networks to derive rich information for detecting fake news. Based on the dual-layer graph, we propose a fake news detection model named Us-DeFake. It learns the propagation features of news in the news layer and the interaction features of users in the user layer. Through the inter-layer in the graph, Us-DeFake fuses the user signals that contain credibility information into the news features, to provide distinctive user-aware embeddings of news for fake news detection. The training process conducts on multiple dual-layer subgraphs obtained by a graph sampler to scale Us-DeFake in large scale social networks. Extensive experiments on real-world datasets illustrate the superiority of Us-DeFake which outperforms all baselines, and the users' credibility signals learned by interaction relation can notably improve the performance of our model. 用户参与创造和传播新闻是在线社交网络虚假新闻检测的一个重要方面。直观地说,可信的用户更可能分享可信的新闻,而不可信的用户传播不可信的新闻的可能性更大。本文构造了一个双层图(即新闻层和用户层)来提取社交网络中新闻和用户之间的多重关系,从而获取丰富的信息来检测虚假新闻。在双层图的基础上,提出了一种假新闻检测模型 Us-DeFake。它学习新闻层中新闻的传播特征和用户层中用户的交互特征。Us-DeFake 通过图中的中间层,将包含可信信息的用户信号融合到新闻特征中,为假新闻检测提供独特的用户感知新闻嵌入。该训练过程对图采样器获得的多个双层子图进行训练,以在大规模社会网络中对 Us-DeFake 进行标度。在实际数据集上进行的大量实验表明,Us-DeFake 的性能优于所有基线,通过交互关系获得的用户可信度信号可以显著提高我们模型的性能。 code 0
Generating Explainable Product Comparisons for Online Shopping Nikhita Vedula, Marcus D. Collins, Eugene Agichtein, Oleg Rokhlenko Amazon, Atlanta, GA, USA; Amazon, Seattle, WA, USA An essential part of making shopping purchase decisions is to compare and contrast products based on key differentiating features, but doing this manually can be overwhelming. Prior methods offer limited product comparison capabilities, e.g., via pre-defined common attributes that may be difficult to understand, or irrelevant to a particular product or user. Automatically generating an informative, natural-sounding, and factually consistent comparative text for multiple product and attribute types is a challenging research problem. We describe HCPC (Human Centered Product Comparison), to tackle two kinds of comparisons for online shopping: (i) product-specific, to describe and compare products based on their key attributes; and (ii) attribute-specific comparisons, to compare similar products on a specific attribute. To ensure that comparison text is faithful to the input product data, we introduce a novel multi-decoder, multi-task generative language model. One decoder generates product comparison text, and a second one generates supportive, explanatory text in the form of product attribute names and values. The second task imitates a copy mechanism, improving the comparison generator, and its output is used to justify the factual accuracy of the generated comparison text, by training a factual consistency model to detect and correct errors in the generated comparative text. We release a new dataset (https://registry.opendata.aws/) of ~15K human generated sentences, comparing products on one or more attributes (the first such data we know of for product comparison). We demonstrate on this data that HCPC significantly outperforms strong baselines, by ~10% using automatic metrics, and ~5% using human evaluation. 
作出购物决定的一个重要部分是比较和对比产品的基础上的关键差异功能,但这样做手动可能是压倒性的。先前的方法提供有限的产品比较能力,例如,通过预定义的公共属性,可能难以理解,或无关的特定产品或用户。为多种产品和属性类型自动生成信息丰富、听起来自然、事实一致的比较文本是一个具有挑战性的研究问题。我们描述了 HCPC (以人为中心的产品比较) ,以解决网上购物的两种比较: (i)产品特定的,基于关键属性描述和比较产品; 和(ii)属性特定的比较,以比较具体属性上的相似产品。为了保证比较文本忠实于输入的产品数据,我们引入了一种新的多解码器、多任务生成语言模型。一个解码器生成产品比较文本,另一个解码器以产品属性名称和值的形式生成支持性的解释性文本。第二个任务模仿复制机制,改进比较生成器,并通过训练一个事实内存一致性模型来检测和纠正生成的比较文本中的错误,将其输出用于证明所生成的比较文本的事实准确性。我们发布了一个新的数据集( https://registry.opendata.aws/) ,包括大约15k 个人类生成的句子,比较产品的一个或多个属性(这是我们所知道的第一个用于产品比较的数据)。我们在这些数据上证明,HCPC 显著优于强基线,使用自动度量的优势约为10% ,使用人工评估的优势约为5% 。 code 0
Never Too Late to Learn: Regularizing Gender Bias in Coreference Resolution Sunyoung Park, Kyuri Choi, Haeun Yu, Youngjoong Ko Sungkyunkwan University, Suwon-si, Republic of Korea Leveraging pre-trained language models (PLMs) as initializers for efficient transfer learning has become a universal approach for text-related tasks. However, the models not only learn the language understanding abilities but also reproduce prejudices for certain groups in the datasets used for pre-training. Recent studies show that the biased knowledge acquired from the datasets affects the model predictions on downstream tasks. In this paper, we mitigate and analyze the gender biases in PLMs with coreference resolution, which is one of the natural language understanding (NLU) tasks. PLMs exhibit two types of gender biases: stereotype and skew. The primary causes for the biases are the imbalanced datasets with more male examples and the stereotypical examples on gender roles. While previous studies mainly focused on the skew problem, we aim to mitigate both gender biases in PLMs while maintaining the model's original linguistic capabilities. Our method employs two regularization terms, Stereotype Neutralization (SN) and Elastic Weight Consolidation (EWC). The models trained with the methods show to be neutralized and reduce the biases significantly on the WinoBias dataset compared to the public BERT. We also invented a new gender bias quantification metric called the Stereotype Quantification (SQ) score. In addition to the metrics, embedding visualizations were used to interpret how our methods have successfully debiased the models. 
利用预训练语言模型(PLM)作为有效迁移学习的初始化方法已经成为文本相关任务的通用方法。然而,这些模型不仅学习了语言理解能力,而且在用于预训练的数据集中重现了某些群体的偏见。最近的研究表明,从数据集中获得的有偏见的知识会影响对下游任务的模型预测。本文采用共指消解的方法来缓解和分析 PLM 中的性别偏见,这是自然语言理解(NLU)的任务之一。PLM 表现出两种类型的性别偏见: 刻板印象和偏见。造成偏见的主要原因是数据集不平衡,男性例子较多,以及性别角色方面的陈规定型例子。虽然以前的研究主要集中在倾斜问题,我们的目标是减轻 PLM 中的性别偏见,同时保持模型的原始语言能力。我们的方法采用两个正则化项,刻板印象中和(SN)和弹性加权固结(EWC)。与公开的 BERT 相比,用这些方法训练的模型在 WinoBias 数据集上显示出中和和减少了偏差。我们还发明了一种新的性别偏见量化指标,称为刻板印象量化(SQ)评分。除了度量之外,嵌入可视化被用来解释我们的方法是如何成功地去偏模型的。 code 0
Learning to Distill Graph Neural Networks Cheng Yang, Yuxin Guo, Yao Xu, Chuan Shi, Jiawei Liu, Chunchen Wang, Xin Li, Ning Guo, Hongzhi Yin Beijing University of Posts and Telecommunications, Beijing, China; The University of Queensland, Brisbane, QLD, Australia; Researcher, Beijing, China Graph Neural Networks (GNNs) can effectively capture both the topology and attribute information of a graph, and have been extensively studied in many domains. Recently, there is an emerging trend that equips GNNs with knowledge distillation for better efficiency or effectiveness. However, to the best of our knowledge, existing knowledge distillation methods applied on GNNs all employed predefined distillation processes, which are controlled by several hyper-parameters without any supervision from the performance of distilled models. Such isolation between distillation and evaluation would lead to suboptimal results. In this work, we aim to propose a general knowledge distillation framework that can be applied on any pretrained GNN models to further improve their performance. To address the isolation problem, we propose to parameterize and learn distillation processes suitable for distilling GNNs. Specifically, instead of introducing a unified temperature hyper-parameter as most previous work did, we will learn node-specific distillation temperatures towards better performance of distilled models. We first parameterize each node's temperature by a function of its neighborhood's encodings and predictions, and then design a novel iterative learning process for model distilling and temperature learning. We also introduce a scalable variant of our method to accelerate model training. Experimental results on five benchmark datasets show that our proposed framework can be applied on five popular GNN models and consistently improve their prediction accuracies with 3.12% relative enhancement on average. 
Besides, the scalable variant enables 8 times faster training speed at the cost of 1% prediction accuracy. 图神经网络(GNN)能够有效地捕获图的拓扑和属性信息,已经在许多领域得到了广泛的研究。最近,有一个新兴的趋势,装备 GNN 的知识提取更好的效率或有效性。然而,据我们所知,现有的应用于 GNN 的知识蒸馏方法都采用了预定义的蒸馏过程,这些过程由多个超参数控制,没有对蒸馏模型的性能进行任何监督。蒸馏和评价之间的这种隔离将导致次优结果。在这项工作中,我们的目标是提出一个通用的知识提取框架,可以应用于任何预先训练的 GNN 模型,以进一步提高其性能。为了解决隔离问题,我们提出参数化和学习蒸馏过程适合蒸馏 GNN。具体来说,我们不会像以前的大多数工作那样引入统一的温度超参数,我们将学习节点特定的蒸馏温度,以提高蒸馏模型的性能。我们首先根据每个节点的邻域编码和预测的函数来参数化每个节点的温度,然后设计一个新的模型提取和温度学习的迭代学习过程。我们还引入了一种可扩展的方法来加速模型训练。在五个基准数据集上的实验结果表明,我们提出的框架可以应用于五个流行的 GNN 模型上,预测精度持续提高,平均相对提高3.12% 。此外,可扩展变量使8倍更快的训练速度的成本1% 的预测准确性。 code 0
S2TUL: A Semi-Supervised Framework for Trajectory-User Linking Liwei Deng, Hao Sun, Yan Zhao, Shuncheng Liu, Kai Zheng University of Electronic Science and Technology of China, ChengDu, China; Peking University, Peking, China; Aalborg University, Aalborg, China Trajectory-User Linking (TUL) aiming to identify users of anonymous trajectories, has recently received increasing attention due to its wide range of applications, such as criminal investigation and personalized recommendation systems. In this paper, we propose a flexible Semi-Supervised framework for Trajectory-User Linking, namely S2TUL, which includes five components: trajectory-level graph construction, trajectory relation modeling, location-level sequential modeling, a classification layer and greedy trajectory-user relinking. The first two components are proposed to model the relationships among trajectories, in which three homogeneous graphs and two heterogeneous graphs are firstly constructed and then delivered into the graph convolutional networks for converting the discrete identities to hidden representations. Since the graph constructions are irrelevant to the corresponding users, the unlabelled trajectories can also be included in the graphs, which enables the framework to be trained in a semi-supervised way. Afterwards, the location-level sequential modeling component is designed to capture fine-grained intra-trajectory information by passing the trajectories into the sequential neural networks. Finally, these two level representations are concatenated into a classification layer to predict the user of the input trajectory. In the testing phase, a greedy trajectory-user relinking method is proposed to assure the linking results satisfy the timespan overlap constraint. We conduct extensive experiments on three public datasets with six representative competitors. The evaluation results demonstrate the effectiveness of the proposed framework. 
轨迹用户链接(TUL)是一种旨在识别匿名轨迹用户的技术,由于其广泛的应用,如刑事调查和个性化推荐系统,近年来受到越来越多的关注。本文提出了一个灵活的 S2TUL 管理框架,即 S2TUL,它包括轨迹层图的构造、轨迹关系建模、位置层顺序建模、分类层和贪婪轨迹-用户重联五个部分。首先构造出三个同质图和两个异质图,然后将其传递到图卷积网络中,将离散恒等式转化为隐藏表示,最后利用图卷积网络模型对轨迹间的关系进行建模。由于图的构造与相应的用户无关,所以图中也可以包含未标记的轨迹,这样就可以用半监督的方式对框架进行训练。然后,设计位置级序列建模组件,通过将轨迹传递给序列神经网络来获取细粒度的轨迹内信息。最后,将这两层表示连接到一个分类层,预测用户的输入轨迹。在测试阶段,提出了一种贪婪的轨迹用户重联方法,以保证链接结果满足时间跨度重叠约束。我们在三个公共数据集上与六个有代表性的竞争对手进行了广泛的实验。评价结果表明了该框架的有效性。 code 0
Ask "Who", Not "What": Bitcoin Volatility Forecasting with Twitter Data M. Eren Akbiyik, Mert Erkul, Killian Kämpf, Vaiva Vasiliauskaite, Nino Antulov-Fantulin ETH Zurich, Zurich, Switzerland Understanding the variations in trading price (volatility), and its response to exogenous information, is a well-researched topic in finance. In this study, we focus on finding stable and accurate volatility predictors for a relatively new asset class of cryptocurrencies, in particular Bitcoin, using deep learning representations of public social media data obtained from Twitter. For our experiments, we extracted semantic information and user statistics from over 30 million Bitcoin-related tweets, in conjunction with 15-minute frequency price data over a horizon of 144 days. Using this data, we built several deep learning architectures that utilized different combinations of the gathered information. For each model, we conducted ablation studies to assess the influence of different components and feature sets over the prediction accuracy. We found statistical evidence for the hypotheses that: (i) temporal convolutional networks perform significantly better than both classical autoregressive models and other deep learning-based architectures in the literature, and (ii) tweet author meta-information, even detached from the tweet itself, is a better predictor of volatility than the semantic content and tweet volume statistics. We demonstrate how different information sets gathered from social media can be utilized in different architectures and how they affect the prediction results. As an additional contribution, we make our dataset public for future research.
理解交易价格(波动性)的变化及其对外部信息的响应是金融学研究的热点。在这项研究中,我们的重点是找到稳定和准确的波动性预测相对较新的资产类别的加密货币,特别是比特币,使用从 Twitter 获得的公共社会媒体数据的深度学习表示。在我们的实验中,我们从超过3000万条与比特币相关的推文中提取了语义信息和用户数据,以及144天内15分钟的频率价格数据。使用这些数据,我们构建了几个深度学习架构,它们利用了所收集信息的不同组合。对于每个模型,我们进行了烧蚀研究,以评估不同组成部分和特征集对预测准确性的影响。我们发现统计学证据的假设: (i)时间卷积网络表现显着优于文献中的经典自回归模型和其他基于深度学习的架构,以及(ii) tweet 作者元信息,即使与 tweet 本身分离,是比语义内容和 tweet 量统计更好的波动性预测器。我们展示了如何在不同的架构中使用从社会媒体收集的不同信息集,以及它们如何影响预测结果。作为额外的贡献,我们将我们的数据集公开以供未来的研究使用。 code 0
Zero to Hero: Exploiting Null Effects to Achieve Variance Reduction in Experiments with One-sided Triggering Alex Deng, Lo-Hua Yuan, Naoya Kanai, Alexandre Salama-Manteau Airbnb, Seattle, WA, USA; Airbnb, San Francisco, CA, USA; Airbnb, Paris, France In online experiments where the intervention is only exposed, or "triggered", for a small subset of the population, it is critical to use variance reduction techniques to estimate treatment effects with sufficient precision to inform business decisions. Trigger-dilute analysis is often used in these situations, and reduces the sampling variance of overall intent-to-treat (ITT) effects by an order of magnitude equal to the inverse of the triggering rate; for example, a triggering rate of $5\%$ corresponds to roughly a $20\times$ reduction in variance. To apply trigger-dilute analysis, one needs to know experimental subjects' triggering counterfactual statuses, i.e., the counterfactual behavior of subjects under both treatment and control conditions. In this paper, we propose an unbiased ITT estimator with reduced variance applicable for experiments where the triggering counterfactual status is only observed in the treatment group. Our method is based on the efficiency augmentation idea of CUPED and draws upon identification frameworks from the principal stratification and instrumental variables literature. The unbiasedness of our estimation approach relies on a testable assumption that the augmentation term used for covariate adjustment equals zero in expectation. Unlike traditional covariate adjustment or principal score modeling approaches, our estimator can incorporate both pre-experiment and in-experiment observations. We demonstrate through a real-world experiment and simulations that our estimator can remain unbiased and achieve precision improvements as large as if triggering status were fully observed, and in some cases can even outperform trigger-dilute analysis.
在在线实验中,干预只暴露或“触发”一小部分人群,使用方差减少技术以足够的精确度来估计治疗效果,以便为业务决策提供信息是至关重要的。触发稀释分析经常用于这些情况下,并减少整体意向治疗(ITT)效应的抽样方差的数量级等于触发率的倒数,例如,5% 的触发率相当于大约 20 倍的方差减少。应用触发稀释分析,需要了解实验对象的触发反事实状态,即在治疗和控制条件下实验对象的反事实行为。在本文中,我们提出了一个无偏的减少方差的 ITT 估计器,适用于实验中的触发反事实状态仅在治疗组中观察到。我们的方法基于 CUPED 的效率增强思想,并借鉴了主要分层和工具变量文献中的识别框架。我们估计方法的无偏性依赖于一个可检验的假设,即用于协变量平差的增广项等于期望值中的零。与传统的协变量调整或主成分模型方法不同,我们的估计器可以结合实验前和实验中的观察。我们通过一个真实世界的实验和模拟表明,我们的估计器可以保持无偏,并实现精度的提高,如果触发状态得到充分观察,在某些情况下甚至可以优于触发稀释分析。 code 0
Unbiased and Efficient Self-Supervised Incremental Contrastive Learning Cheng Ji, Jianxin Li, Hao Peng, Jia Wu, Xingcheng Fu, Qingyun Sun, Philip S. Yu Macquarie University, Sydney, NSW, Australia; University of Illinois at Chicago, Chicago, IL, USA; Beihang University, Beijing, China Contrastive Learning (CL) has been proved to be a powerful self-supervised approach for a wide range of domains, including computer vision and graph representation learning. However, the incremental learning issue of CL has rarely been studied, which brings the limitation in applying it to real-world applications. Contrastive learning identifies the samples with the negative ones from the noise distribution that changes in the incremental scenarios. Therefore, only fitting the change of data without noise distribution causes bias, and directly retraining results in low efficiency. To bridge this research gap, we propose a self-supervised Incremental Contrastive Learning (ICL) framework consisting of (i) a novel Incremental InfoNCE (NCE-II) loss function by estimating the change of noise distribution for old data to guarantee no bias with respect to the retraining, (ii) a meta-optimization with deep reinforced Learning Rate Learning (LRL) mechanism which can adaptively learn the learning rate according to the status of the training processes and achieve fast convergence which is critical for incremental learning. Theoretically, the proposed ICL is equivalent to retraining, which is based on solid mathematical derivation. In practice, extensive experiments in different domains demonstrate that, without retraining a new model, ICL achieves up to 16.7x training speedup and 16.8x faster convergence with competitive results. 
对比学习(CL)已被证明是一种强大的自我监督方法,适用于包括计算机视觉和图形表示学习在内的广泛领域。然而,关于 CL 的在线机机器学习问题很少被研究,这就限制了它在实际应用中的局限性。对比学习从增量情景中噪声分布的变化中识别出样本与负样本。因此,只对无噪声分布的数据变化进行拟合会产生偏差,直接导致再训练效率低下。为了弥补这一研究空白,我们提出了一个自监督增量对比学习(ICL)框架,该框架包括: (i)一个新的增量信息增量对比学习(nce-II)损失函数,通过估计旧数据噪声分布的变化来保证再训练方面没有偏差; (ii)一个元优化与深度增强学习率学习(LRL)机制,它可以根据训练过程的状态自适应地学习学习率,并实现快速收敛,这对于在线机机器学习来说是至关重要的。理论上,本文提出的 ICL 等价于再训练,它是基于固体数学推导的。在实际应用中,不同领域的大量实验表明,在不对新模型进行再训练的情况下,ICL 的训练加速比可达16.7倍,收敛速度可达16.8倍,具有较强的竞争力。 code 0
Reducing the Bias of Visual Objects in Multimodal Named Entity Recognition Xin Zhang, Jingling Yuan, Lin Li, Jianquan Liu Wuhan University of Technology & Engineering Research Center of Digital Publishing Intelligent Service Technology, Ministry of Education, Wuhan, China; NEC Corporation, Tokyo, Japan; Wuhan University of Technology, Wuhan, China Visual information shows to empower accurately named entity recognition in short texts, such as posts from social media. Previous work on multimodal named entity recognition (MNER) often regards an image as a set of visual objects, trying to explicitly align visual objects and entities. However, these methods may suffer the bias introduced by visual objects when they are not identical to entities in quantity and entity type. Different from this kind of explicit alignment, we argue that implicit alignment is effective in optimizing the shared semantic space learning between text and image for improving MNER. To this end, we propose a de-bias contrastive learning based approach for MNER, which studies modality alignment enhanced by cross-modal contrastive learning. Specifically, our contrastive learning adopts a hard sample mining strategy and a debiased contrastive loss to alleviate the bias of quantity and entity type, respectively, which globally learns to align the feature spaces from text and image. Finally, the learned semantic space works with a NER decoder to recognize entities in text. Conducted on two benchmark datasets, experimental results show that our approach outperforms the current state-of-the-art methods. 视觉信息显示,以授权准确命名的实体识别在短文本,如来自社会媒体的帖子。以前关于多模态命名实体识别(MNER)的工作通常将图像视为一组可视对象,试图显式地对齐可视对象和实体。然而,当视觉对象在数量和实体类型上与实体不一致时,这些方法可能会受到视觉对象引入的偏差。与这种显性对齐不同,本文认为隐性对齐可以有效地优化文本与图像之间的共享语义空间学习,从而提高 MNER。为此,我们提出了一种基于去偏差对比学习的 MNER 方法,该方法研究了通过跨模态对比学习增强模态对齐。具体地说,我们的对比学习分别采用硬样本挖掘策略和去偏对比损失策略来减轻数量和实体类型的偏差,从而在全局上学习从文本和图像中对齐特征空间。最后,学习语义空间与 NER 解码器一起工作来识别文本中的实体。在两个基准数据集上进行的实验结果表明,我们的方法优于当前最先进的方法。 code 0
Variance-Minimizing Augmentation Logging for Counterfactual Evaluation in Contextual Bandits Aaron David Tucker, Thorsten Joachims Cornell University, Ithaca, NY, USA Methods for offline A/B testing and counterfactual learning are seeing rapid adoption in search and recommender systems, since they allow efficient reuse of existing log data. However, there are fundamental limits to using existing log data alone, since the counterfactual estimators that are commonly used in these methods can have large bias and large variance when the logging policy is very different from the target policy being evaluated. To overcome this limitation, we explore the question of how to design data-gathering policies that most effectively augment an existing dataset of bandit feedback with additional observations for both learning and evaluation. To this effect, this paper introduces Minimum Variance Augmentation Logging (MVAL), a method for constructing logging policies that minimize the variance of the downstream evaluation or learning problem. We explore multiple approaches to computing MVAL policies efficiently, and find that they can be substantially more effective in decreasing the variance of an estimator than naïve approaches. 离线 A/B 测试和反事实学习方法正迅速被搜索和推荐系统采用,因为它们允许有效地重用现有的日志数据。然而,单独使用现有的测井数据存在基本的局限性,因为当测井策略与被评估的目标策略非常不同时,这些方法中常用的反事实估计量可能会有很大的偏差和很大的方差。为了克服这一局限性,我们探讨了如何设计数据收集政策的问题,以便最有效地增加现有的土匪反馈数据集,并为学习和评估提供额外的观察数据。为此,本文介绍了最小方差增强测井(MVAL) ,一种构造测井策略的方法,使下游评估或学习问题的方差最小化。我们探索了多种有效计算 MVAL 策略的方法,发现它们在减少估计量的方差方面比单纯的方法更有效。 code 0
DisKeyword: Tweet Corpora Exploration for Keyword Selection Sacha Lévy, Reihaneh Rabbany McGill University & Mila, Montreal, PQ, Canada How to accelerate the search for relevant topical keywords within a tweet corpus? Computational social scientists conducting topical studies employ large, self-collected or crowdsourced social media datasets such as tweet corpora. Comprehensive sets of relevant keywords are often necessary to sample or analyze these data sources. However, naively skimming through thousands of keywords can quickly become a daunting task. In this study, we present a web-based application to simplify the search for relevant topical hashtags in a tweet corpus. DisKeyword allows users to grasp high-level trends in their dataset, while iteratively labeling keywords recommended based on their links to prior labeled hashtags. We open-source our code under the MIT license. 如何在 tweet 语料库中加快相关主题关键词的搜索?进行专题研究的计算社会科学家使用大型、自我收集或众包的社会媒体数据集,如 tweet 语料库。为了抽样或分析这些数据源,通常需要相关关键字的综合集合。然而,天真地浏览数以千计的关键字很快就会成为一项艰巨的任务。在这项研究中,我们提出了一个网络应用程序来简化在 tweet 语料库中搜索相关话题标签的过程。DisKeyword 允许用户掌握数据集中的高级趋势,同时根据关键字与之前标记的 # 标签的链接反复标记推荐的关键字。我们在 MIT 许可下开源代码。 code 0
A Tutorial on Domain Generalization Jindong Wang, Haoliang Li, Sinno Jialin Pan, Xing Xie Microsoft Research, Beijing, China; City University of Hong Kong, Hong Kong, Hong Kong; Nanyang Technological University, Singapore, Singapore With the availability of massive labeled training data, powerful machine learning models can be trained. However, the traditional I.I.D. assumption that the training and testing data should follow the same distribution is often violated in reality. While existing domain adaptation approaches can tackle domain shift, it relies on the target samples for training. Domain generalization is a promising technology that aims to train models with good generalization ability to unseen distributions. In this tutorial, we will present the recent advance of domain generalization. Specifically, we introduce the background, formulation, and theory behind this topic. Our primary focus is on the methodology, evaluation, and applications. We hope this tutorial can draw interest of the community and provide a thorough review of this area. Eventually, more robust systems can be built for responsible AI. All tutorial materials and updates can be found online at https://dgresearch.github.io/. 随着海量标记训练数据的可用性,强大的机器学习模型可以训练。然而,传统的 IID 假设训练和测试数据应该遵循相同的分布在现实中经常被违反。虽然现有的域自适应方法可以解决域移位问题,但它依赖于目标样本进行训练。领域推广是一项很有前途的技术,其目标是训练具有良好的对未知分布推广能力的模型。在本教程中,我们将介绍域泛化的最新进展。具体来说,我们将介绍这个主题背后的背景、公式和理论。我们主要关注方法、评估和应用程序。我们希望本教程能够引起社区的兴趣,并提供一个彻底的审查这个领域。最终,可以为负责任的人工智能建立更健壮的系统。所有教程资料及更新可于网上 https://dgresearch.github.io/找到。 code 0
Compliance Analyses of Australia's Online Household Appliances Chang How Tan, Vincent C. S. Lee, Jessie Nghiem, Priya Laxman Monash University, Melbourne, Australia; Energy Safe Victoria, Melbourne, Australia Commercially sold electrical or gas products must comply with the safety standards imposed within a country and get registered and certified by a regulated body. However, with the increasing transition of businesses to e-commerce platforms, it becomes challenging to govern the compliance status of online products. This can increase the risk of purchasing non-compliant products which may be unsafe to use. Additionally, examining the compliance status before purchasing can be strenuous because the relevant compliance information can be ambiguous and not always directly available. Therefore, we collaborated with a regulated body from Australia, Energy Safe Victoria, and conducted compliance analyses for household appliances sold on multiple online platforms. A fully autonomous method shown in this public repository is also introduced to check the compliance status of any online product. In this talk, we discuss the compliance check process, which incorporates fuzzy logic for textual matching and a Convolutional Neural Network (CNN) model to classify the product listing based on the images listed. Subsequently, we studied the results with the business users and found that many online listings are non-compliant, signifying that online-shopping consumers are highly susceptible to buying unsafe products. We hope this talk can inspire more follow-up works that collaborate with regulated bodies to introduce a user-friendly compliance check platform that assists in educating consumers to purchase compliant products. 
商业销售的电气或气体产品必须符合国家规定的安全标准,并获得受管制机构的注册和认证。然而,随着企业向电子商务平台的转型,对在线产品的合规状态进行管理变得越来越具有挑战性。这可能会增加购买不符合要求的产品的风险,因为这些产品使用起来可能不安全。此外,在采购之前检查法规遵循状态可能会很费力,因为相关的法规遵循信息可能含糊不清,并且并不总是直接可用。因此,我们与澳大利亚的能源安全维多利亚监管机构合作,对在多个在线平台上销售的家用电器进行了合规性分析。该公共存储库还引入了一种完全自主的方法来检查任何在线产品的遵从性状态。在这个演讲中,我们讨论了遵从性检查过程,它结合了文本匹配的模糊逻辑和一个卷积神经网络(CNN)模型来根据列出的图像对产品清单进行分类。随后,我们研究了商业用户的结果,发现许多网上列表是不符合的,这意味着网上购物的消费者非常容易购买不安全的产品。我们希望这次讲座可以激发更多跟进工作,与受规管机构合作,推出一个方便用户的合规检查平台,协助教育消费者购买合规产品。 code 0
Learning to Infer Product Attribute Values From Descriptive Texts and Images Pablo Montalvo, Aghiles Salah Rakuten Group, Inc., Paris, France Online marketplaces are able to offer a staggering array of products that no physical store can match. While this makes it more likely for customers to find what they want, in order for online providers to ensure a smooth and efficient user experience, they must maintain well-organized catalogs, which depends greatly on the availability of per-product attribute values such as color, material, brand, to name a few. Unfortunately, such information is often incomplete or even missing in practice, and therefore we have to resort to predictive models as well as other sources of information to impute missing attribute values. In this talk we present the deep learning-based approach that we have developed at Rakuten Group to extract attribute values from product descriptive texts and images. Starting from pretrained architectures to encode textual and visual modalities, we discuss several refinements and improvements that we find necessary to achieve satisfactory performance and meet strict business requirements, namely improving recall while maintaining a high precision (>= 95%). Our methodology is driven by a systematic investigation into several practical research questions surrounding multimodality, which we revisit in this talk. At the heart of our multimodal architecture, is a new method to combine modalities inspired by empirical cross-modality comparisons. We present the latter component in details, point out one of its major limitations, namely exacerbating the issue of modality collapse, i.e., when the model forgets one modality, and describe our mitigation to this problem based on a principled regularization scheme. We present various empirical results on both Rakuten data as well as public benchmark datasets, which provide evidence of the benefits of our approach compared to several strong baselines. 
We also share some insights to characterise the circumstances in which the proposed model offers the most significant improvements. We conclude this talk by criticising the current model and discussing possible future developments and improvements. Our model is successfully deployed in Rakuten Ichiba - a Rakuten marketplace - and we believe that our investigation into multimodal attribute value extraction for e-commerce will benefit other researchers and practitioners alike embarking on similar journeys. 在线市场能够提供一个惊人的产品阵列,没有实体店可以匹配。虽然这使得客户更有可能找到他们想要的东西,为了在线供应商确保一个顺利和有效的用户体验,他们必须保持良好的组织目录,这在很大程度上取决于每个产品属性值的可用性,如颜色,材料,品牌,等等。不幸的是,这些信息往往是不完整的,甚至在实践中缺失,因此,我们不得不求助于预测模型和其他信息来源,以推定缺失的属性值。在这个演讲中,我们介绍了我们在乐天集团开发的基于深度学习的方法,从产品描述性文本和图像中提取属性值。从预先训练的体系结构开始编码文本和视觉模式,我们讨论了几个我们认为必要的细化和改进,以实现令人满意的性能和满足严格的业务需求,即在保持高精度(> = 95%)的同时提高召回率。我们的方法论是由一个系统的调查,对几个实际的研究问题围绕多模态,我们在这个演讲中重新讨论。在我们的多模态结构的核心,是一种新的方法,结合模式的灵感经验交叉模态比较。我们详细介绍了后一个组成部分,指出其主要的局限性之一,即加剧了模态崩溃的问题,即当模型忘记了一个模态,并描述了我们对这个问题的缓解基于一个原则性的正则化方案。我们展示了乐天数据和公共基准数据集的各种实证结果,它们提供了证据,证明我们的方法相对于几个强基线的好处。我们还分享了一些见解,以描述所提议的模型在哪些情况下提供了最重要的改进。我们通过批评目前的模式和讨论未来可能的发展和改进来结束这次演讲。我们的模型已经成功地应用于乐天市场——乐天市场——我们相信,我们对电子商务多模态属性值提取的研究将有利于其他研究人员和从业人员开始类似的旅程。 code 0
Student Behavior Pattern Mining and Analysis: Towards Smart Campuses Teng Guo, Feng Xia Dalian University of Technology, Dalian, China; RMIT University, Melbourne, Australia Understanding student behavior patterns is fundamental to building smart campuses. However, the diversity of student behavior and the complexity of educational data not only bring great obstacles to the relevant research, but also leads to unstable performance and low reliability of current student behavior analysis systems. The emergence of educational big data and the latest advances in deep learning and representation learning provide unprecedented opportunities to tackle the above problems. In this talk, we introduce how we mine and analyze student behavior patterns by overcoming the complexity of educational data. Specifically, we propose a series of algorithmic frameworks, which take advantage of network science, data mining, and machine learning to form a data-driven system for mining and analyzing student behavior patterns. Our research not only fills the gap in the field of student abnormal behavior warning and student status monitoring, but also provides insights into data-driven smart city construction. 理解学生的行为模式是建设智能校园的基础。然而,学生行为的多样性和教育数据的复杂性不仅给相关研究带来了巨大的障碍,而且也导致了现有学生行为分析系统的不稳定性和可靠性低。教育大数据的出现以及深度学习和表征学习的最新进展为解决上述问题提供了前所未有的机遇。在这个演讲中,我们将介绍如何通过克服教育数据的复杂性来挖掘和分析学生的行为模式。具体来说,我们提出了一系列的算法框架,它们利用网络科学、数据挖掘和机器学习的优势,形成了一个数据驱动的系统,用于挖掘和分析学生的行为模式。本研究不仅填补了学生异常行为预警和学生状态监测领域的空白,而且为数据驱动的智能城市建设提供了有益的启示。 code 0
Local Edge Dynamics and Opinion Polarization Nikita Bhalla, Adam Lechowicz, Cameron Musco University of Massachusetts Amherst, Amherst, MA, USA The proliferation of social media platforms, recommender systems, and their joint societal impacts have prompted significant interest in opinion formation and evolution within social networks. We study how local edge dynamics can drive opinion polarization. In particular, we introduce a variant of the classic Friedkin-Johnsen opinion dynamics, augmented with a simple time-evolving network model. Edges are iteratively added or deleted according to simple rules, modeling decisions based on individual preferences and network recommendations. Via simulations on synthetic and real-world graphs, we find that the combined presence of two dynamics gives rise to high polarization: 1) confirmation bias -- i.e., the preference for nodes to connect to other nodes with similar expressed opinions and 2) friend-of-friend link recommendations, which encourage new connections between closely connected nodes. We show that our model is tractable to theoretical analysis, which helps explain how these local dynamics erode connectivity across opinion groups, affecting polarization and a related measure of disagreement across edges. Finally, we validate our model against real-world data, showing that our edge dynamics drive the structure of arbitrary graphs, including random graphs, to more closely resemble real social networks. 社交媒体平台、推荐系统的扩散,以及它们共同的社会影响,促使人们对社交网络中的舆论形成和演变产生了极大的兴趣。我们研究局部边缘动力学如何驱动意见两极分化。特别地,我们引入了经典 Friedkin-Johnsen 观点动力学的一个变体,并辅以一个简单的随时间演化的网络模型。根据简单的规则、基于个人偏好和网络建议的决策建模,可以迭代地添加或删除边缘。通过对合成和真实世界图形的模拟,我们发现两种动力的结合产生了高度极化: 1)确认偏差——即,偏好节点连接到其他有相似表达意见的节点,2)朋友之间的链接推荐,鼓励紧密连接的节点之间建立新的连接。我们表明,我们的模型是易于理论分析,这有助于解释这些局部动态如何侵蚀连通性的意见集团,影响两极分化和相关措施的分歧跨边缘。最后,我们验证了我们的模型对现实世界的数据,表明我们的边缘动力学驱动任意图的结构,包括随机图,更接近真实的社会网络。 code 0
Beyond-Accuracy Goals, Again Maarten de Rijke University of Amsterdam, Amsterdam, Netherlands Improving the performance of information retrieval systems tends to be narrowly scoped. Often, better prediction performance is considered the only metric of improvement. As a result, work on improving information retrieval methods usually focuses on improving the methods' accuracy. Such a focus is myopic. Instead, as researchers and practitioners we should adopt a richer perspective on measuring the performance of information retrieval systems. I am not the first to make this point (see, e.g., [4]), but I want to highlight dimensions that broaden the scope considered so far and offer a number of examples to illustrate what this would mean for our research agendas. First, trustworthiness is a prerequisite for people, organizations, and societies to use AI-based, and, especially, machine learning-based systems in general, and information retrieval systems in particular. Trust can be gained in an intrinsic manner by revealing the inner workings of an AI-based system, i.e., through explainability. Or it can be gained extrinsically by showing, in a principled or empirical manner, that a system upholds verifiable guarantees. Such guarantees should be obtained for the following dimensions (at a minimum): (i) accuracy, including well-defined and explained contexts of usage; (ii) reliability, including exhibiting parity with respect to sensitive attributes; (iii) repeatable and reproducible results, including audit trails; (iv) resilience to adversarial examples, distributional shifts; and (v) safety, including privacy-preserving search and recommendation. Second, in information retrieval, our experiments are mostly conducted in controlled laboratory environments. Extrapolating this information to evaluate the real-world effects often remains a challenge. 
This is particularly true when measuring the impact of information retrieval systems across broader scales, both temporally and spatially. Conducting controlled experimental trials for evaluating real-world impacts of information retrieval systems can result in depicting a snapshot situation, where systems are tailored towards that specific environment. As society is constantly changing, the requirements set for information retrieval systems are changing as well, resulting in short-term and long-term feedback loops with interactions between society and information retrieval systems. 改善信息检索系统的表现往往局限于狭窄的范围。通常,更好的预测性能被认为是改进的唯一指标。因此,改进信息检索方法的工作通常侧重于提高方法的准确性。这样的关注是短视的。相反,作为研究人员和实践者,我们应该采取更加丰富的视角来衡量信息检索系统的表现。我并不是第一个提出这个观点的人(参见,例如,[4]) ,但是我想强调的是目前为止所考虑的扩大范围的维度,并提供一些例子来说明这对我们的研究议程意味着什么。首先,诚信是个人、组织和社会使用基于人工智能的系统,尤其是基于机器学习的系统,特别是信息检索系统的先决条件。信任可以通过揭示基于人工智能的系统的内部工作方式获得,也就是说,通过可解释性。或者,可以通过原则性或经验性的方式表明,一个系统支持可验证的保证,从而获得外在的保证。这样的保证应该获得以下维度(最低限度) : (i)准确性,包括明确定义和解释的使用背景; (ii)可靠性,包括显示相对于敏感属性的同等性; (iii)可重复和可重复的结果,包括审计跟踪; (iv)对抗性例子的弹性,分布转移; 和(v)安全,包括保护隐私的搜索和推荐。其次,在信息检索中,我们的实验大多是在受控的实验室环境中进行的。外推这些信息以评估现实世界的影响往往仍然是一个挑战。在衡量信息检索系统在时间和空间上的影响时尤其如此。为评估信息检索系统在现实世界中的影响而进行的受控试验,可以描绘出一个快照状态,即系统根据特定环境进行调整。随着社会不断变化,对信息检索系统的要求也在不断变化,导致社会和信息检索系统之间的互动产生短期和长期的反馈循环。 code 0
Towards Autonomous Driving YaQin Zhang CentraleSupélec, France; Institute for Infocomm Research (I2R), A∗STAR, Singapore; Singapore University of Technology and Design, Singapore; CVSSP, University of Surrey, UK With the increasing global popularity of self-driving cars, there is an immediate need for challenging real-world datasets for benchmarking and training various computer vision tasks such as 3D object detection. Existing datasets either represent simple scenarios or provide only day-time data. In this paper, we introduce a new challenging A*3D dataset which consists of RGB images and LiDAR data with a significant diversity of scene, time, and weather. The dataset consists of high-density images (≈ 10 times more than the pioneering KITTI dataset), heavy occlusions, a large number of nighttime frames (≈ 3 times the nuScenes dataset), addressing the gaps in the existing datasets to push the boundaries of tasks in autonomous driving research to more challenging highly diverse environments. The dataset contains 39K frames, 7 classes, and 230K 3D object annotations. An extensive 3D object detection benchmark evaluation on the A*3D dataset for various attributes such as high density, day-time/night-time, gives interesting insights into the advantages and limitations of training and testing 3D object detection in real-world setting. 随着自动驾驶汽车在全球日益普及,我们迫切需要挑战现实世界中的数据集,以便对各种计算机视觉任务(如3D 目标检测)进行基准测试和培训。现有数据集要么表示简单的方案,要么只提供日间数据。本文介绍了一个新的具有挑战性的 A * 3D 数据集,该数据集由 RGB 图像和激光雷达数据组成,具有显著的场景、时间和天气多样性。该数据集由高密度图像(约比先驱 KITTI 数据集多10倍) ,重闭塞,大量夜间帧(约为 nuScenes 数据集的3倍)组成,解决现有数据集中的差距,以推动自主驾驶研究中的任务边界更具挑战性的高度多样化的环境。该数据集包含39K 帧、7个类和230K 3D 对象注释。对于高密度、日间/夜间等各种属性的 a * 3 d 数据集,一个广泛的3 d 目标检测基准评估,让我们对在现实世界中训练和测试3 d 目标检测的优势和局限性有了有趣的见解。 code 0
Beyond Digital "Echo Chambers": The Role of Viewpoint Diversity in Political Discussion Rishav Hada, Amir Ebrahimi Fard, Sarah Shugars, Federico Bianchi, Patrícia G. C. Rossini, Dirk Hovy, Rebekah Tromble, Nava Tintarev Microsoft Research India, Bengaluru, India; Stanford University, Stanford, CA, USA; Maastricht University, Maastricht, Netherlands; Rutgers University, New Brunswick, NJ, USA; George Washington University, Washington, DC, USA; Bocconi University, Milan, Italy; University of Glasgow, Glasgow, United Kingdom Increasingly taking place in online spaces, modern political conversations are typically perceived to be unproductively affirming -- siloed in so-called "echo chambers" of exclusively like-minded discussants. Yet, to date we lack sufficient means to measure viewpoint diversity in conversations. To this end, in this paper, we operationalize two viewpoint metrics proposed for recommender systems and adapt them to the context of social media conversations. This is the first study to apply these two metrics (Representation and Fragmentation) to real world data and to consider the implications for online conversations specifically. We apply these measures to two topics -- daylight savings time (DST), which serves as a control, and the more politically polarized topic of immigration. We find that the diversity scores for both Fragmentation and Representation are lower for immigration than for DST. Further, we find that while pro-immigrant views receive consistent pushback on the platform, anti-immigrant views largely operate within echo chambers. We observe less severe yet similar patterns for DST. Taken together, Representation and Fragmentation paint a meaningful and important new picture of viewpoint diversity. 
现代政治对话越来越多地发生在网络空间,人们通常认为这种肯定毫无成效——被孤立在所谓的“回音室”里,里面全是志趣相投的讨论者。然而,到目前为止,我们还缺乏足够的手段来衡量会话中的观点多样性。为此,本文将推荐系统中提出的两个视点度量进行操作化,并将其应用到社交媒体会话的语境中。这是第一个将这两个指标(表示和碎片化)应用于真实世界数据并特别考虑在线对话的含义的研究。我们将这些措施应用于两个主题——日光节约时间(DST) ,作为一种控制手段,以及政治上更加两极分化的移民问题。我们发现,分裂和表征的多样性得分对于移民来说都比对于 DST 来说要低。此外,我们发现支持移民的观点在平台上受到一致的抵制,而反移民的观点在很大程度上是在回音室内运作的。我们观察到不太严重但类似的 DST 模式。综上所述,表示和碎片描绘了一幅有意义的、重要的视点多样性的新图景。 code 0
MM-GNN: Mix-Moment Graph Neural Network towards Modeling Neighborhood Feature Distribution Wendong Bi, Lun Du, Qiang Fu, Yanlin Wang, Shi Han, Dongmei Zhang Microsoft Research Asia, Beijing, China; University of Chinese Academy of Sciences, Beijing, China Graph Neural Networks (GNNs) have shown expressive performance on graph representation learning by aggregating information from neighbors. Recently, some studies have discussed the importance of modeling neighborhood distribution on the graph. However, most existing GNNs aggregate neighbors' features through a single statistic (e.g., mean, max, sum), which loses the information related to the neighbors' feature distribution and therefore degrades the model performance. In this paper, inspired by the method of moments in statistical theory, we propose to model the neighbors' feature distribution with multi-order moments. We design a novel GNN model, namely Mix-Moment Graph Neural Network (MM-GNN), which includes a Multi-order Moment Embedding (MME) module and an Element-wise Attention-based Moment Adaptor module. MM-GNN first calculates the multi-order moments of the neighbors for each node as signatures, and then uses an Element-wise Attention-based Moment Adaptor to assign larger weights to important moments for each node and update node representations. We conduct extensive experiments on 15 real-world graphs (including social networks, citation networks and web-page networks etc.) to evaluate our model, and the results demonstrate the superiority of MM-GNN over existing state-of-the-art models. 图形神经网络(GNN)通过聚集邻居的信息,在图表示学习方面表现出很好的表现。最近,一些研究讨论了在图上建立邻域分布模型的重要性。然而,现有的 GNN 通过单一统计量(如均值、最大值、和)聚集邻居的特征,丢失了与邻居特征分布相关的信息,从而降低了模型的性能。本文借鉴统计理论中的矩方法,提出了用多阶矩来模拟邻居的特征分布。设计了一种新的 GNN 模型,即混合矩图神经网络(MM-GNN) ,它包括一个多阶矩嵌入(MME)模块和一个基于元素注意的矩适配器模块。MM-GNN 首先计算每个节点邻居的多阶矩作为签名,然后使用基于元素的注意力矩适配器为每个节点的重要矩赋予更大的权值,并更新节点表示。我们对15个真实世界的图形(包括社交网络、引文网络和网页网络等)进行了广泛的实验,以评估我们的模型,结果表明 MM-GNN 相对于现有的最先进的模型的优越性。 code 0
Global Counterfactual Explainer for Graph Neural Networks Zexi Huang, Mert Kosan, Sourav Medya, Sayan Ranu, Ambuj K. Singh University of Illinois Chicago, Chicago, IL, USA; University of California, Santa Barbara, Santa Barbara, CA, USA; Indian Institute of Technology Delhi, Delhi, India Graph neural networks (GNNs) find applications in various domains such as computational biology, natural language processing, and computer security. Owing to their popularity, there is an increasing need to explain GNN predictions since GNNs are black-box machine learning models. One way to address this is counterfactual reasoning where the objective is to change the GNN prediction by minimal changes in the input graph. Existing methods for counterfactual explanation of GNNs are limited to instance-specific local reasoning. This approach has two major limitations: it cannot offer global recourse policies, and it overloads human cognitive ability with too much information. In this work, we study the global explainability of GNNs through global counterfactual reasoning. Specifically, we want to find a small set of representative counterfactual graphs that explains all input graphs. Towards this goal, we propose GCFExplainer, a novel algorithm powered by vertex-reinforced random walks on an edit map of graphs with a greedy summary. Extensive experiments on real graph datasets show that the global explanation from GCFExplainer provides important high-level insights into the model behavior and achieves a 46.9% gain in recourse coverage and a 9.5% reduction in recourse cost compared to the state-of-the-art local counterfactual explainers. 
图形神经网络(GNN)在计算生物学、自然语言处理和计算机安全等领域有着广泛的应用。由于 GNN 的流行,人们越来越需要解释 GNN 的预测,因为 GNN 是黑盒机器学习模型。解决这个问题的一种方法是反事实推理,其目标是通过输入图中的最小变化来改变 GNN 预测。现有的 GNN 反事实解释方法仅限于实例特定的局部推理。这种方法有两个主要的局限性,即不能提供全球追索政策和过多的信息使人类的认知能力负担过重。本文通过全局反事实推理来研究 GNN 的全局可解性。具体来说,我们希望找到一小组有代表性的反事实图解释所有输入图。为了实现这一目标,我们提出了 GCFExplainer 算法,这是一种新的算法,它通过在带有贪婪摘要的图的编辑地图上顶点增强的随机游动来实现。对实际图形数据集的大量实验表明,GCFExplainer 的全局解释为模型行为提供了重要的高层次见解,与最先进的本地反事实解释者相比,获得了46.9% 的追索覆盖率和9.5% 的追索成本降低。 code 0
Effective Graph Kernels for Evolving Functional Brain Networks Xinlei Wang, Jinyi Chen, Bing Tian Dai, Junchang Xin, Yu Gu, Ge Yu Singapore Management University, Singapore, Singapore; Northeastern University, Shenyang, China The graph kernel of the functional brain network is an effective method in the field of neuropsychiatric disease diagnosis like Alzheimer's Disease (AD). The traditional static brain networks cannot reflect dynamic changes of brain activities, but evolving brain networks, which are a series of brain networks over time, are able to seize such dynamic changes. As far as we know, the graph kernel method is effective for calculating the differences among networks. Therefore, it has a great potential to understand the dynamic changes of evolving brain networks, which are a series of chronological differences. However, if the conventional graph kernel methods which are built for static networks are applied directly to evolving networks, the evolving information will be lost and accurate diagnostic results will be far from reach. We propose an effective method, called Global Matching based Graph Kernels (GM-GK), which captures dynamic changes of evolving brain networks and significantly improves classification accuracy. At the same time, in order to reflect the natural properties of the brain activity of the evolving brain network neglected by the GM-GK method, we also propose a Local Matching based Graph Kernel (LM-GK), which allows the order of the evolving brain network to be locally fine-tuned. Finally, the experiments are conducted on real data sets and the results show that the proposed methods can significantly improve the neuropsychiatric disease diagnostic accuracy. 
功能性脑网络图核是诊断阿尔茨海默病(AD)等神经精神疾病的有效方法。传统的静态大脑网络不能反映大脑活动的动态变化,而进化的大脑网络是一系列随时间变化的大脑网络,能够抓住这种动态变化。据我们所知,图核方法是计算网络间差异的有效方法。因此,研究进化中的大脑网络的动态变化,即一系列的时间差异,具有很大的潜力。然而,如果将静态网络的传统图核方法直接应用于进化网络,则进化信息将会丢失,远不能得到准确的诊断结果。提出了一种有效的基于全局匹配的图核(GM-GK)方法,该方法能够捕捉大脑网络演化过程中的动态变化,显著提高分类精度。同时,为了反映被 GM-GK 方法忽略的进化脑网络的大脑活动的自然属性,我们还提出了一种基于局部匹配的图核(LM-GK) ,它允许对进化脑网络的顺序进行局部微调。最后,在实际数据集上进行了实验,结果表明所提出的方法可以显著提高神经精神疾病的诊断准确率。 code 0
Self-Supervised Graph Structure Refinement for Graph Neural Networks Jianan Zhao, Qianlong Wen, Mingxuan Ju, Chuxu Zhang, Yanfang Ye University of Notre Dame, Notre Dame, IN, USA; Brandeis University, Waltham, MA, USA Graph structure learning (GSL), which aims to learn the adjacency matrix for graph neural networks (GNNs), has shown great potential in boosting the performance of GNNs. Most existing GSL works apply a joint learning framework where the estimated adjacency matrix and GNN parameters are optimized for downstream tasks. However, GSL is essentially a link prediction task, whose goal may largely differ from the goal of the downstream task. The inconsistency of these two goals prevents GSL methods from learning the potential optimal graph structure. Moreover, the joint learning framework suffers from scalability issues in terms of time and space during the process of estimation and optimization of the adjacency matrix. To mitigate these issues, we propose a graph structure refinement (GSR) framework with a pretrain-finetune pipeline. Specifically, the pre-training phase aims to comprehensively estimate the underlying graph structure by a multi-view contrastive learning framework with both intra- and inter-view link prediction tasks. Then, the graph structure is refined by adding and removing edges according to the edge probabilities estimated by the pre-trained model. Finally, the fine-tuning GNN is initialized by the pre-trained model and optimized toward downstream tasks. With the refined graph structure remaining static in the fine-tuning space, GSR avoids estimating and optimizing graph structure in the fine-tuning phase and thus enjoys great scalability and efficiency. Moreover, the fine-tuning GNN is boosted by both migrating knowledge and refining graphs. 
Extensive experiments are conducted to evaluate the effectiveness (best performance on six benchmark datasets), efficiency, and scalability (13.8x faster using 32.8% GPU memory compared to the best GSL baseline on Cora) of the proposed model. 图结构学习(GSL)旨在学习图神经网络(gnn)的邻接矩阵,在提高 GNN 的性能方面显示出巨大的潜力。大部分现有的政府物流服务工作均采用联合学习架构,为下游工作优化估计的邻接矩阵和 GNN 参数。然而,由于 GSL 本质上是一个链路预测任务,其目标可能与下游任务的目标大不相同。这两个目标的不一致性限制了 GSL 方法学习潜在的最优图结构。此外,联合学习架构在评估和优化邻接矩阵的过程中,在时间和空间方面都存在可扩展性问题。为了缓解这些问题,我们提出了一个图结构细化(GSR)框架与预训练-精细调整流水线。具体来说,预训练阶段的目标是通过一个多视图对比学习框架,同时完成视图内和视图间的链接预测任务,全面估计底层的图结构。然后,根据预训练模型估计的边概率,通过添加和去除边来细化图的结构。最后,通过预训练模型对微调 GNN 进行初始化,并针对下游任务进行优化。由于精化后的图结构在微调空间中保持静态,GSR 避免了在微调阶段对图结构进行估计和优化,具有很强的可扩展性和高效性。此外,知识的迁移和图的精化对微调 GNN 有很大的促进作用。进行了广泛的实验来评估所提出模型的有效性(在六个基准数据集上的最佳性能) ,效率和可伸缩性(使用32.8% GPU 存储器比 Cora 上的最佳 GSL 基线快13.8倍)。 code 0
Marketing Budget Allocation with Offline Constrained Deep Reinforcement Learning Tianchi Cai, Jiyan Jiang, Wenpeng Zhang, Shiji Zhou, Xierui Song, Li Yu, Lihong Gu, Xiaodong Zeng, Jinjie Gu, Guannan Zhang Ant Group, Hangzhou, China; Ant Group, Beijing, China; Tsinghua University, Beijing, China We study the budget allocation problem in online marketing campaigns that utilize previously collected offline data. We first discuss the long-term effect of optimizing marketing budget allocation decisions in the offline setting. To overcome the challenge, we propose a novel game-theoretic offline value-based reinforcement learning method using mixed policies. The proposed method reduces the need to store infinitely many policies in previous methods to only constantly many policies, which achieves nearly optimal policy efficiency, making it practical and favorable for industrial usage. We further show that this method is guaranteed to converge to the optimal policy, which cannot be achieved by previous value-based reinforcement learning methods for marketing budget allocation. Our experiments on a large-scale marketing campaign with tens-of-millions users and more than one billion budget verify the theoretical results and show that the proposed method outperforms various baseline methods. The proposed method has been successfully deployed to serve all the traffic of this marketing campaign. 我们研究预算分配问题在线营销活动,利用以前收集的离线数据。我们首先讨论了线下环境下优化营销预算分配决策的长期效果。为了克服这一挑战,我们提出了一种新的基于游戏理论的离线价值强化学习的混合策略方法。该方法将以往方法中存储无限多策略的需要减少到只需要不断存储多个策略,达到了接近最优的策略效率,具有实用性,有利于工业应用。我们进一步表明,这种方法能够保证收敛到最优策略,这是以前基于价值的强化学习方法无法实现的。我们在拥有数千万用户和超过十亿预算的大规模营销活动中进行的实验验证了理论结果,并表明所提出的方法优于各种基准方法。所提出的方法已成功地应用于服务的所有流量的这个营销活动。 code 0
Reliable Decision from Multiple Subtasks through Threshold Optimization: Content Moderation in the Wild Donghyun Son, Byounggyu Lew, Kwanghee Choi, Yongsu Baek, Seungwoo Choi, Beomjun Shin, Sungjoo Ha, Buru Chang Match Group, Dallas, TX, USA; Hyperconnect, Seoul, Republic of Korea Social media platforms struggle to protect users from harmful content through content moderation. These platforms have recently leveraged machine learning models to cope with the vast amount of user-generated content daily. Since moderation policies vary depending on countries and types of products, it is common to train and deploy the models per policy. However, this approach is highly inefficient, especially when the policies change, requiring dataset re-labeling and model re-training on the shifted data distribution. To alleviate this cost inefficiency, social media platforms often employ third-party content moderation services that provide prediction scores of multiple subtasks, such as predicting the existence of underage personnel, rude gestures, or weapons, instead of directly providing final moderation decisions. However, making a reliable automated moderation decision from the prediction scores of the multiple subtasks for a specific target policy has not been widely explored yet. In this study, we formulate real-world scenarios of content moderation and introduce a simple yet effective threshold optimization method that searches for the optimal thresholds of the multiple subtasks to make a reliable moderation decision in a cost-effective way. Extensive experiments demonstrate that our approach shows better performance in content moderation compared to existing threshold optimization methods and heuristics. 
社交媒体平台努力通过内容管制来保护用户免受有害内容的伤害。这些平台最近利用机器学习模型来处理日常大量的用户生成内容。由于适度政策因国家和产品类型的不同而有所不同,因此通常会根据政策对模型进行培训和部署。然而,这种方法是非常低效的,尤其是当策略发生变化时,需要重新标记数据集和对移动数据分布进行模型再训练。为了降低成本效率,社交媒体平台经常使用第三方内容审核服务,提供多个子任务的预测分数,比如预测未成年人、粗鲁的手势或武器的存在,而不是直接提供最终审核决定。然而,从特定目标策略的多个子任务的预测分数中做出可靠的自动调节决策还没有得到广泛的研究。在这项研究中,我们提出了真实世界中的内容审核场景,并介绍了一个简单而有效的阈值优化方法,搜索多个子任务的最佳阈值,以一种成本效益高的方式作出可靠的审核决策。大量的实验表明,与现有的阈值优化方法和启发式算法相比,我们的方法在内容调节方面表现出更好的性能。 code 0
Few-shot Node Classification with Extremely Weak Supervision Song Wang, Yushun Dong, Kaize Ding, Chen Chen, Jundong Li Arizona State University, Phoenix, AZ, USA; University of Virginia, Charlottesville, VA, USA Few-shot node classification aims at classifying nodes with limited labeled nodes as references. Recent few-shot node classification methods typically learn from classes with abundant labeled nodes (i.e., meta-training classes) and then generalize to classes with limited labeled nodes (i.e., meta-test classes). Nevertheless, on real-world graphs, it is usually difficult to obtain abundant labeled nodes for many classes. In practice, each meta-training class can only consist of several labeled nodes, known as the extremely weak supervision problem. In few-shot node classification, with extremely limited labeled nodes for meta-training, the generalization gap between meta-training and meta-test will become larger and thus lead to suboptimal performance. To tackle this issue, we study a novel problem of few-shot node classification with extremely weak supervision and propose a principled framework X-FNC under the prevalent meta-learning framework. Specifically, our goal is to accumulate meta-knowledge across different meta-training tasks with extremely weak supervision and generalize such knowledge to meta-test tasks. To address the challenges resulting from extremely scarce labeled nodes, we propose two essential modules to obtain pseudo-labeled nodes as extra references and effectively learn from extremely limited supervision information. We further conduct extensive experiments on four node classification datasets with extremely weak supervision to validate the superiority of our framework compared to the state-of-the-art baselines. 
少镜头节点分类旨在将有限标记节点作为参考节点进行分类。最近的少镜头节点分类方法通常学习具有大量标记节点的类(即元训练类) ,然后推广到具有有限标记节点的类(即元测试类)。然而,在现实世界图中,通常很难获得多个类的大量标记节点。在实践中,每个元培训课程只能由几个标记节点组成,这就是所谓的极弱监督问题。在少镜头节点分类中,由于元训练的标记节点非常有限,元训练和元测试之间的泛化差距会变大,从而导致性能的次优。为了解决这个问题,我们研究了一个新的极弱监督的少镜头节点分类问题,并在现有的元学习框架下提出了一个原则框架 X-FNC。具体来说,我们的目标是在监督非常薄弱的情况下,通过不同的元培训任务积累元知识,并将这些知识推广到元测试任务中。为了解决标记节点稀缺带来的挑战,提出了两个基本模块来获取伪标记节点作为额外的参考,并有效地学习极其有限的监督信息。我们进一步在监督极其薄弱的四个节点分类数据集上进行了广泛的实验,以验证我们的框架相对于最先进的基线的优越性。 code 0
Adversarial Autoencoder for Unsupervised Time Series Anomaly Detection and Interpretation Xuanhao Chen, Liwei Deng, Yan Zhao, Kai Zheng University of Electronic Science and Technology of China, Chengdu, China; Aalborg University, Aalborg, Denmark In many complex systems, devices are typically monitored and generate massive multivariate time series. However, due to the complex patterns and the scarcity of useful labeled data, it is a great challenge to detect anomalies from these time series data. Existing methods either rely on few regularizations, or require a large amount of labeled data, leading to poor accuracy in anomaly detection. To overcome the limitations, in this paper, we propose an adversarial autoencoder anomaly detection and interpretation framework named DAEMON, which performs robustly for various datasets. The key idea is to use two discriminators to adversarially train an autoencoder to learn the normal pattern of multivariate time series, and thereafter use the reconstruction error to detect anomalies. The robustness of DAEMON is guaranteed by the regularization of hidden variables and reconstructed data using the adversarial generation method. An unsupervised approach used to detect anomalies is proposed. Moreover, in order to help operators better diagnose anomalies, DAEMON provides anomaly interpretation by computing the gradients of anomalous data. An extensive empirical study on real data offers evidence that the framework is capable of outperforming state-of-the-art methods in terms of the overall F1-score and interpretation accuracy for time series anomaly detection. 
在许多复杂的系统中,设备通常被监控并产生大量的多变量时间序列。然而,由于这些时间序列数据模式复杂,标记数据不多,检测异常是一个很大的挑战。现有的方法要么依赖较少的规范化,要么需要大量的标记数据,导致异常检测的准确性较差。为了克服这些局限性,在本文中,我们提出了一个对抗性的自动编码器异常检测和解释框架 DAEMON,它可以对各种数据集进行鲁棒的处理。其核心思想是利用两个鉴别器对自动编码器进行逆向训练,使其学习多变量时间序列的正态模式,然后利用重构误差检测异常。利用对抗生成方法对隐变量进行正则化和重构数据,保证了 DAEMON 的鲁棒性。提出了一种无监督的异常检测方法。此外,为了帮助操作人员更好地诊断异常,DAEMON 通过计算异常数据的梯度来提供异常解释。对实际数据的广泛实证研究提供的证据表明,该框架能够在时间序列异常检测的整体 f 1得分和解释准确性方面超越最先进的方法。 code 0
Simultaneous Linear Multi-view Attributed Graph Representation Learning and Clustering Chakib Fettal, Lazhar Labiod, Mohamed Nadif Université Paris Cité, Paris, France; Université Paris Cité & Informatique CDC, Paris, France Over the last few years, various multi-view graph clustering methods have shown promising performances. However, we argue that these methods can have limitations. In particular, they are often unnecessarily complex, leading to scalability problems that make them prohibitive for most real-world graph applications. Furthermore, many of them can handle only specific types of multi-view graphs. Another limitation is that the process of learning graph representations is separated from the clustering process, and in some cases these methods do not even learn a graph representation, which severely restricts their flexibility and usefulness. In this paper we propose a simple yet effective linear model that addresses the dual tasks of multi-view attributed graph representation learning and clustering in a unified framework. The model starts by performing a first-order neighborhood smoothing step for the different individual views, then gives each one a weight corresponding to its importance. Finally, an iterative process of simultaneous clustering and representation learning is performed w.r.t. the importance of each view, yielding a consensus embedding and partition of the graph. Our model is generic and can deal with any type of multi-view graph. Finally, we show through extensive experimentation that this simple model consistently achieves competitive performances w.r.t. state-of-the-art multi-view attributed graph clustering models, while at the same time having training times that are shorter, in some cases by orders of magnitude. 
在过去的几年中,各种多视图图聚类方法已经显示出良好的性能。然而,我们认为这些方法可能有局限性。特别是,它们通常不必要地复杂,导致可伸缩性问题,这使得它们对于大多数真实世界的图形应用程序来说都是禁止的。此外,它们中的许多只能处理特定类型的多视图图形。另一个限制是学习图表示的过程与聚类过程是分离的,在某些情况下这些方法甚至不学习图表示,这严重限制了它们的灵活性和有用性。本文提出了一个简单而有效的线性模型,在一个统一的框架下解决了多视图属性图表示学习和聚类的双重任务。该模型首先对不同的单个视图执行一阶邻域平滑步骤,然后给每个视图一个与其重要性相对应的权重。最后,对每个视图的重要性进行了同时聚类和表示学习的迭代过程,得到了图的一致嵌入和划分。我们的模型是通用的,可以处理任何类型的多视图图形。最后,我们通过大量的实验表明,这个简单的模型一致地获得了具有竞争力的性能。 r.t. 最先进的多视图属性图聚类模型,同时具有较短的训练时间,在某些情况下通过数量级。 code 0
DeMEtRIS: Counting (near)-Cliques by Crawling Suman K. Bera, Jayesh Choudhari, Shahrzad Haddadan, Sara Ahmadian Cube Global Ltd., London, United Kingdom; Katana Graph, San Jose, CA, USA; Rutgers Business School and Brown Data Science Initiative, New Brunswick, NJ, USA; Google Research, Mountain View, CA, USA We study the problem of approximately counting cliques and near cliques in a graph, where the access to the graph is only available through crawling its vertices; thus typically seeing only a small portion of it. This model, known as the random walk model or the neighborhood query model has been introduced recently and captures real-life scenarios in which the entire graph is too massive to be stored as a whole or be scanned entirely and sampling vertices independently is non-trivial in it. We introduce DeMEtRIS: Dense Motif Estimation through Random Incident Sampling. This method provides a scalable algorithm for clique and near clique counting in the random walk model. We prove the correctness of our algorithm through rigorous mathematical analysis and extensive experiments. Both our theoretical results and our experiments show that DeMEtRIS obtains a high precision estimation by only crawling a sub-linear portion on vertices, thus we demonstrate a significant improvement over previously known results. 我们研究了近似计算图中的团和近团的问题,其中对图的访问只能通过爬行它的顶点,因此通常只能看到它的一小部分。这种模型被称为随机游走模型或邻域查询模型,最近被引入,它捕获了整个图太大而不能作为一个整体存储或完全被扫描的真实场景,并且独立的采样顶点在其中是不平凡的。我们介绍了 DeMEtRIS: 基于随机事件抽样的稠密基序估计。该方法为随机游走模型中的团簇计数和近团簇计数提供了一种可扩展的算法。通过严格的数学分析和大量的实验证明了算法的正确性。我们的理论结果和我们的实验表明,DeMEtRIS 获得了一个高精度的估计,只爬行一个次线性部分的顶点,因此我们证明了一个显着的改进以前已知的结果。 code 0
A Multi-graph Fusion Based Spatiotemporal Dynamic Learning Framework Xu Wang, Lianliang Chen, Hongbo Zhang, Pengkun Wang, Zhengyang Zhou, Yang Wang University of Science and Technology of China, Hefei, China Spatiotemporal data forecasting is a fundamental task in the field of graph data mining. Typical spatiotemporal data prediction methods usually capture spatial dependencies by directly aggregating features of local neighboring vertices in a fixed graph. However, this kind of aggregator can only capture localized correlations between vertices, and when stacked for a larger receptive field, it falls into the dilemma of over-smoothing. Additionally, from the temporal perspective, traditional methods focus on fixed graphs, while the correlations among vertices can be dynamic. Moreover, the strategies for integrating time series components in traditional spatiotemporal learning methods can hardly handle sequences that change frequently and drastically. To overcome those limitations of existing works, in this paper, we propose a novel multi-graph based dynamic learning framework. First, a novel Dynamic Neighbor Search (DNS) mechanism is introduced to model global dynamic correlations between vertices by constructing a feature graph (FG), where the adjacency matrix is dynamically determined by DNS. Then we further alleviate the over-smoothing issue with our newly designed Adaptive Heterogeneous Representation (AHR) module. Both FG and origin graph (OG) are fed into the AHR modules and fused in our proposed Multi-graph Fusion block. Additionally, we design a Differential Vertex Representation (DVR) module which takes advantage of differential information to model temporal trends. 
Extensive experiments illustrate the superior forecasting performances of our proposed multi-graph based dynamic learning framework on six real-world spatiotemporal datasets from different cities and domains, and this corroborates the solid effectiveness of our proposed framework and its superior generalization ability. 时空数据预测是图形数据挖掘领域的一项基础性工作。典型的时空数据预测方法通常通过直接聚集固定图中局部相邻顶点的特征来捕获空间依赖性。然而,这种聚合器只能捕获顶点之间的局部相关性,当它们被叠加以获得更大的接收场时,它们就陷入了过度平滑的困境。另外,从时间的角度来看,传统的方法主要集中在固定的图上,而顶点之间的相关性可以是动态的。传统时空学习方法中的时间序列分量集成策略难以处理频繁剧烈变化的序列。为了克服现有工作的局限性,本文提出了一种新的基于多图的动态学习框架。首先,引入一种新的动态邻域搜索(dNS)机制,通过构造一个特征图(FG)来模拟顶点之间的全局动态关联,其中邻接矩阵由 DNS 动态确定。然后通过设计自适应异构表示(AHR)模块进一步缓解了过平滑问题。在 AHR 模块中输入 FG 和原点图(OG) ,然后在我们提出的多图融合模块中进行融合。此外,我们还设计了差分顶点表示(DVR)模块,利用差分信息对时间趋势进行建模。大量的实验表明,我们提出的基于多图的动态学习框架对来自不同城市和领域的6个真实世界的时空数据集具有优越的预测性能,这证实了我们提出的框架的有效性及其优越的泛化能力。 code 0
Self-supervised Graph Representation Learning for Black Market Account Detection Zequan Xu, Lianyun Li, Hui Li, Qihang Sun, Shaofeng Hu, Rongrong Ji Tencent Inc., Guangzhou, China; Xiamen University, Xiamen, China Nowadays, Multi-purpose Messaging Mobile App (MMMA) has become increasingly prevalent. MMMAs attract fraudsters and some cybercriminals provide support for frauds via black market accounts (BMAs). Compared to fraudsters, BMAs are not directly involved in frauds and are more difficult to detect. This paper illustrates our BMA detection system SGRL (Self-supervised Graph Representation Learning) used in WeChat, a representative MMMA with over a billion users. We tailor Graph Neural Network and Graph Self-supervised Learning in SGRL for BMA detection. The workflow of SGRL contains a pretraining phase that utilizes structural information, node attribute information and available human knowledge, and a lightweight detection phase. In offline experiments, SGRL outperforms state-of-the-art methods by 16.06%-58.17% on offline evaluation measures. We deploy SGRL in the online environment to detect BMAs on the billion-scale WeChat graph, and it exceeds the alternative by 7.27% on the online evaluation measure. In conclusion, SGRL can alleviate label reliance, generalize well to unseen data, and effectively detect BMAs in WeChat. 目前,多用途消息移动应用(MMMA)已经变得越来越普遍。MMMA 吸引欺诈者,一些网络犯罪分子通过黑市账户(BMAs)为欺诈提供支持。与欺诈者相比,BMA 不直接参与欺诈,更难以发现。本文介绍了我们的 BMA 检测系统 SGRL (自监督图形表示学习)在微信上的应用。在 SGRL 中,我们将图神经网络和图自监督学习用于 BMA 检测。SGRL 的工作流包括一个利用结构信息、节点属性信息和可用人类知识的预训练阶段和一个轻量级检测阶段。在离线实验中,SGRL 的离线评价指标比最先进的方法提高了16.06% -58.17% 。我们在在线环境中部署 SGRL 来检测十亿规模的微信图像中的 BMA,在线评价指标比其他方法高出7.27% 。总之,SGRL 可以减轻标签依赖,对未知数据进行良好的泛化,有效地检测微信中的 BMA。 code 0
Graph Explicit Neural Networks: Explicitly Encoding Graphs for Efficient and Accurate Inference Yiwei Wang, Bryan Hooi, Yozen Liu, Neil Shah Snap Inc., Seattle, WA, USA; National University of Singapore, Singapore, Singapore As the state-of-the-art graph learning models, the message passing based neural networks (MPNNs) implicitly use the graph topology as the "pathways" to propagate node features. This implicit use of graph topology induces the MPNNs' over-reliance on (node) features and high inference latency, which hinders their large-scale applications in industrial contexts. To mitigate these weaknesses, we propose the Graph Explicit Neural Network (GENN) framework. GENN can be flexibly applied to various MPNNs and improves them by providing more efficient and accurate inference that is robust in feature-constrained settings. Specifically, we carefully incorporate recent developments in network embedding methods to efficiently prioritize the graph topology for inference. From this vantage, GENN explicitly encodes the topology as an important source of information to mitigate the reliance on node features. Moreover, by adopting knowledge distillation (KD) techniques, GENN takes an MPNN as the teacher to supervise the training for better effectiveness while avoiding the teacher's high inference latency. Empirical results show that our GENN infers dramatically faster than its MPNN teacher by 40x-78x. In terms of accuracy, GENN yields significant gains (more than 40%) for its MPNN teacher when the node features are limited based on our explicit encoding. Moreover, GENN outperforms the MPNN teacher even in feature-rich settings thanks to our KD design. 
作为最先进的图形学习模型,基于消息传递的神经网络(MPNN)隐含地使用图拓扑作为“路径”来传播节点特征。这种图形拓扑的隐含使用导致了 MPNN 对(节点)特征的过度依赖和高的推理延迟,从而阻碍了它们在工业环境中的大规模应用。为了弥补这些不足,我们提出了图显式神经网络(GENN)框架。GENN 可以灵活地应用于各种 MPNN,并通过提供更有效和准确的推理来改进它们,这种推理在特征约束设置中具有鲁棒性。具体来说,我们小心地结合了网络嵌入方法的最新发展,以有效地优先图拓扑推理。从这个优势出发,GENN 显式地将拓扑编码为一个重要的信息源,以减轻对节点特征的依赖。此外,通过采用知识提取(KD)技术,GENN 以 MPNN 为教师,在避免教师高推理潜伏期的同时,监督训练的有效性。实证结果表明,我们的 GENN 推理速度明显快于其 MPNN 教师的40倍 -78倍。在准确性方面,GENN 产生显着的增益(超过40%)的 MPNN 教师时,节点的特点是有限的基于我们的显式编码。而且,由于我们的 KD 设计,GENN 甚至在功能丰富的设置方面优于 MPNN 教师。 code 0
GOOD-D: On Unsupervised Graph Out-Of-Distribution Detection Yixin Liu, Kaize Ding, Huan Liu, Shirui Pan Griffith University, Gold Coast, SQ, Australia; Arizona State University, Tempe, AZ, USA; Monash University, Melbourne, VIC, Australia Most existing deep learning models are trained based on the closed-world assumption, where the test data is assumed to be drawn i.i.d. from the same distribution as the training data, known as in-distribution (ID). However, when models are deployed in an open-world scenario, test samples can be out-of-distribution (OOD) and therefore should be handled with caution. To detect such OOD samples drawn from unknown distribution, OOD detection has received increasing attention lately. However, current endeavors mostly focus on grid-structured data and its application for graph-structured data remains under-explored. Considering the fact that data labeling on graphs is commonly time-expensive and labor-intensive, in this work we study the problem of unsupervised graph OOD detection, aiming at detecting OOD graphs solely based on unlabeled ID data. To achieve this goal, we develop a new graph contrastive learning framework GOOD-D for detecting OOD graphs without using any ground-truth labels. By performing hierarchical contrastive learning on the augmented graphs generated by our perturbation-free graph data augmentation method, GOOD-D is able to capture the latent ID patterns and accurately detect OOD graphs based on the semantic inconsistency in different granularities (i.e., node-level, graph-level, and group-level). As a pioneering work in unsupervised graph-level OOD detection, we build a comprehensive benchmark to compare our proposed approach with different state-of-the-art methods. The experiment results demonstrate the superiority of our approach over different methods on various datasets. 
大多数现有的深度学习模型都是基于封闭世界的假设进行训练的,其中测试数据被假设是从与训练数据相同的分布中提取出来的,称为内分布(in-distribution,ID)。然而,当模型部署在开放世界场景中时,测试样本可能会超出分布(OOD) ,因此应该谨慎处理。为了检测这些来自未知分布的 OOD 样品,OOD 检测近年来受到越来越多的关注。然而,目前的研究主要集中在网格结构化数据上,而其在图形结构化数据中的应用还有待进一步探索。考虑到图形数据标注通常耗费大量时间和人力,本文研究了无监督图形 OOD 检测问题,目的是仅仅基于未标记的 ID 数据来检测 OOD 图形。为了实现这一目标,我们开发了一个新的图形对比学习框架 Good-D,用于检测 OOD 图,而不使用任何地面真值标签。通过对我们的无扰动图数据增强方法生成的增强图进行分层对比学习,Good-D 能够捕获潜在的 ID 模式,并基于不同粒度(即节点级,图级和组级)的语义不一致性准确检测 OOD 图。作为无监督图级 OOD 检测的开创性工作,我们建立了一个全面的基准来比较我们提出的方法与不同的国家最先进的方法。实验结果证明了该方法在不同数据集上对不同方法的优越性。 code 0
Alleviating Structural Distribution Shift in Graph Anomaly Detection Yuan Gao, Xiang Wang, Xiangnan He, Zhenguang Liu, Huamin Feng, Yongdong Zhang Graph anomaly detection (GAD) is a challenging binary classification problem due to its different structural distribution between anomalies and normal nodes – abnormal nodes are a minority, therefore holding high heterophily and low homophily compared to normal nodes. Furthermore, due to various time factors and the annotation preferences of human experts, the heterophily and homophily can change across training and testing data, which is called structural distribution shift (SDS) in this paper. The mainstream methods are built on graph neural networks (GNNs), benefiting the classification of normals from aggregating homophilous neighbors, yet ignoring the SDS issue for anomalies and suffering from poor generalization. This work solves the problem from a feature view. We observe that the degree of SDS varies between anomalies and normal nodes. Hence to address the issue, the key lies in resisting high heterophily for anomalies meanwhile benefiting the learning of normals from homophily. We tease out the anomaly features on which we constrain to mitigate the effect of heterophilous neighbors and make them invariant. We term our proposed framework as Graph Decomposition Network (GDN). Extensive experiments are conducted on two benchmark datasets, and the proposed framework achieves a remarkable performance boost in GAD, especially in an SDS environment where anomalies have largely different structural distribution across training and testing environments. Codes are open-sourced in https://github.com/blacksingular/wsdm_GDN. 
图形异常检测(GAD)是一个具有挑战性的二元分类问题,因为它在异常节点和正常节点之间的结构分布不同-异常节点是少数,因此与正常节点相比具有高度异质性和低度同质性。此外,由于各种时间因素和人类专家的注释偏好,训练和测试数据之间的异质性和同质性会发生变化,本文称之为结构分布偏移(SDS)。主流的方法都是建立在图神经网络(GNN)之上的,有利于法向量的分类从聚集同调邻居,但忽略了 SDS 问题的异常和普遍性差。这项工作从特征视图解决了这个问题。我们观察到 SDS 的程度在异常节点和正常节点之间变化。因此,要解决这一问题,关键在于抵制异常的高度异质性,同时有利于从同质性中学习规范。我们梳理出异常特征,我们约束,以减轻异质邻居的影响,使他们不变。我们将我们提出的框架称为图分解网络(GDN)。在两个基准数据集上进行了大量的实验,结果表明,该框架在广域设计中取得了显著的性能提升,尤其是在 SDS 环境中,在不同的训练和测试环境中,异常的结构分布有很大的不同。代码在 https://github.com/blacksingular/wsdm_gdn 中是开源的。 code 0
Friendly Conditional Text Generator Noriaki Kawamae NTT Comware, Tokyo, Japan Our goal is to control text generation with more fine-grained conditions at lower computational cost than is possible with current alternatives; these conditions are attributes (i.e., multiple codes and free-text). As large-scale pre-trained language models (PLMs) offer excellent performance in free-form text generation, we explore efficient architectures and training schemes that can best leverage PLMs. Our framework, Friendly Conditional Text Generator (FCTG), introduces a multi-view attention (MVA) mechanism and two training tasks, Masked Attribute Modeling (MAM) and Attribute Linguistic Matching (ALM), to direct various PLMs via modalities between the text and its attributes. The motivation of FCTG is to map texts and attributes into a shared space, and bridge their modality gaps, as the texts and attributes reside in different regions of semantic space. To avoid catastrophic forgetting, modality-free embedded representations are learnt, and used to direct PLMs in this space, FCTG applies MAM to learn attribute representations, maps them in the same space as text through MVA, and optimizes their alignment in this space via ALM. Experiments on publicly available datasets show that FCTG outperforms baselines over higher level conditions at lower computation cost. 我们的目标是用更细粒度的条件来控制文本的生成,这些条件是属性(即多个代码和自由文本) ,计算成本比目前的替代方案更低。由于大规模的预训练语言模型(PLM)在自由形式的文本生成方面提供了出色的性能,我们探索了能够最好地利用 PLM 的高效体系结构和培训方案。我们的框架,友好条件文本生成器(FCTG) ,引入了一个多视图注意(MVA)机制和两个训练任务,掩盖属性建模(MAM)和属性语言匹配(ALM) ,通过文本和属性之间的模式来指导各种 PLM。FCTG 的目的是将文本和属性映射到一个共享的空间中,由于文本和属性位于语义空间的不同区域,从而弥合它们之间的情态差异。为了避免灾难性的遗忘,FCTG 学习了无模态嵌入式表示,并用于指导 PLM 在这个空间中,FCTG 应用 MAM 来学习属性表示,通过 MVA 将它们映射到与文本相同的空间中,并通过 ALM 优化它们在这个空间中的对齐。在公开可用数据集上的实验表明,FCTG 在较低的计算成本下优于较高级别条件下的基线。 code 0
Can Pre-trained Language Models Understand Chinese Humor? Yuyan Chen, Zhixu Li, Jiaqing Liang, Yanghua Xiao, Bang Liu, Yunwen Chen Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University & Fudan-Aishu Cognitive Intelligence Joint Research Center, Shanghai, China; RALI & Mila, Université de Montréal, Montréal, Canada; DataGrand Inc., Shanghai, China; Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, Shanghai, China; School of Data Science, Fudan University, Shanghai, China Humor understanding is an important and challenging research topic in natural language processing. With the growing popularity of pre-trained language models (PLMs), some recent work has made preliminary attempts to adopt PLMs for humor recognition and generation. However, these simple attempts do not substantially answer the question of whether PLMs are capable of humor understanding. This paper is the first work that systematically investigates the humor understanding ability of PLMs. For this purpose, a comprehensive framework with three evaluation steps and four evaluation tasks is designed. We also construct a comprehensive Chinese humor dataset, which can fully meet all the data requirements of the proposed evaluation framework. Our empirical study on the Chinese humor dataset yields some valuable observations, which are of great guiding value for future optimization of PLMs in humor understanding and generation. 幽默理解是自然语言处理领域的一个重要而富有挑战性的研究课题。随着预训练语言模型(PLM)的普及,近年来的一些研究工作对采用 PLM 进行幽默识别和生成进行了初步尝试。然而,这些简单的尝试并没有实质性地回答这个问题: PLM 是否能够理解幽默?本文是第一篇系统研究 PLM 幽默理解能力的论文。为此,设计了一个包含三个评估步骤和四个评估任务的综合框架。我们还构建了一个全面的中文幽默数据集,能够完全满足所提出的评价框架的所有数据需求。我们对中文幽默数据集的实证研究得到了一些有价值的结果,对今后幽默理解和幽默生成的 PLM 优化具有重要的指导意义。 code 0
Robust Training of Graph Neural Networks via Noise Governance Siyi Qian, Haochao Ying, Renjun Hu, Jingbo Zhou, Jintai Chen, Danny Z. Chen, Jian Wu Baidu Research, Beijing, China; Zhejiang University, Hangzhou, China; University of Notre Dame, Notre Dame, IN, USA; Alibaba Group, Hangzhou, China Graph Neural Networks (GNNs) have become widely-used models for semi-supervised learning. However, the robustness of GNNs in the presence of label noise remains a largely under-explored problem. In this paper, we consider an important yet challenging scenario where labels on nodes of graphs are not only noisy but also scarce. In this scenario, the performance of GNNs is prone to degrade due to label noise propagation and insufficient learning. To address these issues, we propose a novel RTGNN (Robust Training of Graph Neural Networks via Noise Governance) framework that achieves better robustness by learning to explicitly govern label noise. More specifically, we introduce self-reinforcement and consistency regularization as supplemental supervision. The self-reinforcement supervision is inspired by the memorization effects of deep neural networks and aims to correct noisy labels. Further, the consistency regularization prevents GNNs from overfitting to noisy labels via mimicry loss in both the inter-view and intra-view perspectives. To leverage such supervisions, we divide labels into clean and noisy types, rectify inaccurate labels, and further generate pseudo-labels on unlabeled nodes. Supervision for nodes with different types of labels is then chosen adaptively. This enables sufficient learning from clean labels while limiting the impact of noisy ones. We conduct extensive experiments to evaluate the effectiveness of our RTGNN framework, and the results validate its consistent superior performance over state-of-the-art methods with two types of label noises and various noise rates. 
图形神经网络(GNN)已成为广泛使用的半监督学习模型。然而,在标签噪声的存在下,GNN 的稳健性仍然是一个很大程度上未被探讨的问题。在本文中,我们考虑了一个重要但具有挑战性的场景,其中图的节点上的标签不仅是有噪声的,而且是稀缺的。在这种情况下,由于标签噪声的传播和不充分的学习,GNN 的性能容易下降。为了解决这些问题,我们提出了一种新的 RTGNN (通过噪声治理的图形神经网络的鲁棒训练)框架,通过学习显式地治理标签噪声来实现更好的鲁棒性。更具体地说,我们引入自我强化和一致性规范作为补充监督。自我强化监督的灵感来自深层神经网络的记忆效应,旨在纠正噪声标签。此外,一致性正则化防止 GNN 过度拟合噪声标签通过模仿损失在视图内和视图内的观点。为了利用这种监督,我们将标签划分为干净和嘈杂的类型,纠正不准确的标签,并进一步在未标记的节点上生成伪标签。然后自适应地选择对具有不同类型标签的节点的监视。这样可以从干净的标签中学到足够的知识,同时减少噪音标签的影响。我们进行了广泛的实验来评估我们的 RTGNN 框架的有效性,结果验证了其一致性优于最先进的方法与两种类型的标签噪声和不同的噪声率。 code 0
Cooperative Explanations of Graph Neural Networks Junfeng Fang, Xiang Wang, An Zhang, Zemin Liu, Xiangnan He, TatSeng Chua University of Science and Technology of China, Hefei, China; National University of Singapore, Singapore, Singapore With the growing success of graph neural networks (GNNs), the explainability of GNN is attracting considerable attention. Current explainers mostly leverage feature attribution and selection to explain a prediction. By tracing the importance of input features, they select the salient subgraph as the explanation. However, their explainability is at the granularity of input features only, and cannot reveal the usefulness of hidden neurons. This inherent limitation makes the explainers fail to scrutinize the model behavior thoroughly, resulting in unfaithful explanations. In this work, we explore the explainability of GNNs at the granularity of both input features and hidden neurons. To this end, we propose an explainer-agnostic framework, Cooperative GNN Explanation (CGE) to generate the explanatory subgraph and subnetwork simultaneously, which jointly explain how the GNN model arrived at its prediction. Specifically, it first initializes the importance scores of input features and hidden neurons with masking networks. Then it iteratively retrains the importance scores, refining the salient subgraph and subnetwork by discarding low-scored features and neurons in each iteration. Through such cooperative learning, CGE not only generates faithful and concise explanations, but also exhibits how the salient information flows by activating and deactivating neurons. We conduct extensive experiments on both synthetic and real-world datasets, validating the superiority of CGE over state-of-the-art approaches. Code is available at https://github.com/MangoKiller/CGE_demo. 
随着图形神经网络(GNN)的日益成功,GNN 的可解释性引起了人们的广泛关注。当前的解释者大多利用特征归属和选择来解释预测。通过跟踪输入特征的重要性,选择显著子图作为解释。然而,它们的可解释性仅限于输入特征的粒度,不能揭示隐藏神经元的有用性。这种固有的局限性使得解释者无法彻底审视模型行为,导致解释失实。在这项工作中,我们探讨了 GNN 的可解释性在粒度的输入特征和隐藏的神经元。为此,我们提出了一个解释者不可知的框架,即同时生成解释子图和子网络(CGE) ,从而共同解释 GNN 模型是如何达到其预测目的的。具体地说,它首先用掩蔽网络初始化输入特征和隐藏神经元的重要性得分。然后迭代地重新训练重要性得分,通过在每次迭代中丢弃低得分特征和神经元来精化显著子图和子网络。通过这样的合作学习,CGE 不仅产生了忠实、简洁的解释,而且展示了显著信息是如何通过激活和去活化神经元而流动的。我们在合成和真实世界的数据集上进行了广泛的实验,验证了 CGE 相对于最先进的方法的优越性。密码可于 https://github.com/mangokiller/cge_demo 索取。 code 0
Towards Faithful and Consistent Explanations for Graph Neural Networks Tianxiang Zhao, Dongsheng Luo, Xiang Zhang, Suhang Wang The Pennsylvania State University, State College, PA, USA; Florida International University, Miami, FL, USA Uncovering rationales behind predictions of graph neural networks (GNNs) has received increasing attention over recent years. Instance-level GNN explanation aims to discover critical input elements, like nodes or edges, that the target GNN relies upon for making predictions. Though various algorithms are proposed, most of them formalize this task by searching the minimal subgraph which can preserve original predictions. However, an inductive bias is deep-rooted in this framework: several subgraphs can result in the same or similar outputs as the original graphs. Consequently, they have the danger of providing spurious explanations and fail to provide consistent explanations. Applying them to explain weakly-performed GNNs would further amplify these issues. To address this problem, we theoretically examine the predictions of GNNs from the causality perspective. Two typical reasons of spurious explanations are identified: confounding effect of latent variables like distribution shift, and causal factors distinct from the original input. Observing that both confounding effects and diverse causal rationales are encoded in internal representations, we propose a simple yet effective countermeasure by aligning embeddings. Concretely, concerning potential shifts in the high-dimensional space, we design a distribution-aware alignment algorithm based on anchors. This new objective is easy to compute and can be incorporated into existing techniques with no or little effort. Theoretical analysis shows that it is in effect optimizing a more faithful explanation objective in design, which further justifies the proposed approach. 
近年来,揭示图神经网络(GNN)预测背后的基本原理越来越受到人们的关注。实例级 GNN 解释旨在发现关键的输入元素,如节点或边,目标 GNN 依赖于这些元素来进行预测。虽然提出了各种算法,但大多数算法都是通过搜索最小子图来形式化这一任务,从而保留了原始的预测。然而,归纳偏差在这个框架中是根深蒂固的: 几个子图可以产生与原始图相同或相似的输出。因此,他们有提供虚假解释的危险,而且无法提供一致的解释。用它们来解释表现不佳的 GNN 将进一步放大这些问题。为了解决这个问题,我们从因果关系的角度对 GNN 的预测进行了理论研究。两个典型的原因虚假的解释被确定: 混杂效应的潜在变量,如分布转移,和因果因素不同于原始输入。观察到混杂效应和不同的因果理由都编码在内部表征中,我们通过排列嵌入提出了一个简单而有效的对策。具体地,针对高维空间中的位移,我们设计了一种基于锚的分布感知对准算法。这个新目标很容易计算,并且可以毫不费力地将其纳入现有技术中。理论分析表明,它实际上是在优化设计中一个更加忠实的解释目标,这进一步证明了所提出的方法。 code 0
Position-Aware Subgraph Neural Networks with Data-Efficient Learning Chang Liu, Yuwen Yang, Zhe Xie, Hongtao Lu, Yue Ding Shanghai Jiao Tong University, Shanghai, China Data-efficient learning on graphs (GEL) is essential in real-world applications. Existing GEL methods focus on learning useful representations for nodes, edges, or entire graphs with "small" labeled data. But the problem of data-efficient learning for subgraph prediction has not been explored. The challenges of this problem lie in the following aspects: 1) It is crucial for subgraphs to learn positional features to acquire structural information in the base graph in which they exist. Although the existing subgraph neural network method is capable of learning disentangled position encodings, the overall computational complexity is very high. 2) Prevailing graph augmentation methods for GEL, including rule-based, sample-based, adaptive, and automated methods, are not suitable for augmenting subgraphs because a subgraph contains fewer nodes but richer information such as position, neighbor, and structure. Subgraph augmentation is more susceptible to undesirable perturbations. 3) Only a small number of nodes in the base graph are contained in subgraphs, which leads to a potential "bias" problem that the subgraph representation learning is dominated by these "hot" nodes. By contrast, the remaining nodes fail to be fully learned, which reduces the generalization ability of subgraph representation learning. In this paper, we aim to address the challenges above and propose a Position-Aware Data-Efficient Learning framework for subgraph neural networks called PADEL. Specifically, we propose a novel node position encoding method that is anchor-free, and design a new generative subgraph augmentation method based on a diffused variational subgraph autoencoder, and we propose exploratory and exploitable views for subgraph contrastive learning. 
Extensive experiment results on three real-world datasets show the superiority of our proposed method over state-of-the-art baselines. 图上的数据有效学习(GEL)在现实应用中是必不可少的。现有的 GEL 方法侧重于学习对带有“小”标记数据的节点、边或整个图的有用表示。但子图预测的数据有效学习问题尚未得到研究。这个问题的挑战在于以下几个方面: 1)子图学习位置特征以获取其所在的基图中的结构信息至关重要。虽然现有的子图神经网络方法能够学习分离位置编码,但整体计算复杂度很高。2)基于规则的、基于样本的、自适应的、自动化的 GEL 图增强方法不适用于子图的增强,因为子图包含的节点较少,但位置、邻居和结构等信息较丰富。子图增广更容易受到不良扰动的影响。3)基图中只有少量的节点包含在子图中,这就导致了子图表示学习受这些“热”节点支配的潜在“偏差”问题。相比之下,剩余的节点不能完全学习,从而降低了子图表示学习的泛化能力。本文针对上述挑战,提出了一种子图神经网络的位置感知数据有效学习框架 PADEL。具体地说,我们提出了一种新的无锚节点位置编码方法,设计了一种基于扩散变分子图自动编码器的生成子图增强方法,并提出了子图对比学习的探索性和可开发性观点。在三个实际数据集上的大量实验结果表明了我们提出的方法相对于最先进的基线的优越性。 code 0
Graph Neural Networks with Interlayer Feature Representation for Image Super-Resolution Shenggui Tang, Kaixuan Yao, Jianqing Liang, Zhiqiang Wang, Jiye Liang Shanxi University, Taiyuan, China Although deep learning has been extensively studied and achieved remarkable performance on single image super-resolution (SISR), existing convolutional neural networks (CNN) mainly focus on broader and deeper architecture design, ignoring the detailed information of the image itself and the potential relationship between the features. Recently, several attempts have been made to address the SISR with graph representation learning. However, existing GNN-based methods learning to deal with the SISR problem are limited to the information processing of the entire image or the relationship processing between different feature images of the same layer, ignoring the interdependence between the extracted features of different layers, which is not conducive to extracting deeper hierarchical features. In this paper, we propose an interlayer feature representation based graph neural network for image super-resolution (LSGNN), which consists of a layer feature graph representation learning module and a channel spatial attention module. The layer feature graph representation learning module mainly captures the interdependence between the features of different layers, which can learn more fine-grained image detail features. In addition, we also unified a channel attention module and a spatial attention module into our model, which takes into account the channel dimension information and spatial scale information, to improve the expressive ability, and achieve high quality image details. Extensive experiments and ablation studies demonstrate the superiority of the proposed model. 
虽然深度学习在单幅图像超分辨率(SISR)方面已经得到了广泛的研究,并取得了显著的效果,但现有的卷积神经网络(CNN)主要集中在更广泛和更深入的结构设计上,忽略了图像本身的详细信息和特征之间的潜在关系。近年来,人们利用图表示学习的方法对 SISR 问题进行了一些尝试。然而,现有的基于 GNN 的 SISR 问题学习方法仅局限于对整幅图像的信息处理或同一层次不同特征图像之间的关系处理,忽略了不同层次提取特征之间的相互依赖性,不利于提取更深层次的特征。本文提出了一种基于层间特征表示的图像超分辨率神经网络(LSGNN) ,它由层间特征图表示学习模块和通道空间注意模块组成。层次特征图表示学习模块主要捕捉不同层次特征之间的相互依赖关系,可以学习更细粒度的图像细节特征。此外,我们还统一了信道注意模块和空间注意模块,该模型考虑了信道尺寸信息和空间尺度信息,以提高表达能力,实现高质量的图像细节。大量的实验和烧蚀研究证明了该模型的优越性。 code 0
CLNode: Curriculum Learning for Node Classification Xiaowen Wei, Xiuwen Gong, Yibing Zhan, Bo Du, Yong Luo, Wenbin Hu JD Explore Academy, Beijing, China; The University of Sydney, Sydney, Australia; Wuhan University, Wuhan, China Node classification is a fundamental graph-based task that aims to predict the classes of unlabeled nodes, for which Graph Neural Networks (GNNs) are the state-of-the-art methods. Current GNNs assume that nodes in the training set contribute equally during training. However, the quality of training nodes varies greatly, and the performance of GNNs could be harmed by two types of low-quality training nodes: (1) inter-class nodes situated near class boundaries that lack the typical characteristics of their corresponding classes. Because GNNs are data-driven approaches, training on these nodes could degrade the accuracy. (2) mislabeled nodes. In real-world graphs, nodes are often mislabeled, which can significantly degrade the robustness of GNNs. To mitigate the detrimental effect of the low-quality training nodes, we present CLNode, which employs a selective training strategy to train GNN based on the quality of nodes. Specifically, we first design a multi-perspective difficulty measurer to accurately measure the quality of training nodes. Then, based on the measured qualities, we employ a training scheduler that selects appropriate training nodes to train GNN in each epoch. To evaluate the effectiveness of CLNode, we conduct extensive experiments by incorporating it in six representative backbone GNNs. Experimental results on real-world networks demonstrate that CLNode is a general framework that can be combined with various GNNs to improve their accuracy and robustness. 
节点分类是一项基于图的基本任务,其目的是预测未标记节点的类别,图神经网络(GNN)是这方面的最新研究成果。当前的 GNN 假设训练集中的节点在训练期间作出同样的贡献。然而,训练节点的质量差异很大,两种类型的低质量训练节点可能会损害 GNN 的性能: (1)位于类边界附近的类间节点缺乏相应类的典型特征。由于 GNN 是数据驱动的方法,在这些节点上进行训练可能会降低精度。(2)标记错误的节点。在现实图中,节点经常被错误标记,这会显著降低 GNN 的鲁棒性。为了减轻低质量训练节点的不利影响,我们提出了 CLNode,它采用一种基于节点质量的选择性训练策略来训练 GNN。具体来说,我们首先设计了一个多视角的难度度量器来准确测量训练节点的质量。然后,根据测量的质量,采用训练调度器,选择合适的训练节点,在每个时代训练 GNN。为了评估 CLNode 的有效性,我们进行了广泛的实验,将其纳入六个具有代表性的骨干 GNN。在实际网络上的实验结果表明,CLNode 是一种通用的网络结构,可以与各种 GNN 结合使用,提高网络的准确性和鲁棒性。 code 0
Learning and Maximizing Influence in Social Networks Under Capacity Constraints Pritish Chakraborty, Sayan Ranu, Krishna Sri Ipsit Mantri, Abir De Indian Institute of Technology, Delhi, New Delhi, India; Indian Institute of Technology, Bombay, Mumbai, India Influence maximization (IM) refers to the problem of finding a subset of nodes in a network through which we could maximize our reach to other nodes in the network. This set is often called the "seed set", and its constituent nodes maximize the social diffusion process. IM has previously been studied in various settings, including under a time deadline, subject to constraints such as that of budget or coverage, and even subject to measures other than the centrality of nodes. The solution approach has generally been to prove that the objective function is submodular, or has a submodular proxy, and thus has a close greedy approximation. In this paper, we explore a variant of the IM problem where we wish to reach out to and maximize the probability of infection of a small subset of bounded capacity K. We show that this problem does not exhibit the same submodular guarantees as the original IM problem, for which we resort to the theory of gamma-weakly submodular functions. Subsequently, we develop a greedy algorithm that maximizes our objective despite the lack of submodularity. We also develop a suitable learning model that out-competes baselines on the task of predicting the top-K infected nodes, given a seed set as input. 影响最大化(IM)是指在网络中寻找一个节点子集,通过这个子集我们可以最大限度地接触到网络中的其他节点。这个集合通常被称为“种子集合”,它的组成节点使社会扩散过程最大化。IM 以前已经被研究在各种环境下,包括在一个时间期限下,受到限制,如预算或覆盖面,甚至受到措施以外的节点的中心性。求解方法一般是证明目标函数是子模的,或者有一个子模代理,因此有一个近似的贪婪近似。在本文中,我们探讨了 IM 问题的一个变种,其中我们希望达到并最大化有界容量 K 的一个小子集的感染概率。我们证明了这个问题并没有展示出与原 IM 问题相同的子模保证,对于这个问题我们采用伽马弱子模函数理论。随后,我们开发了一个贪婪算法,最大化我们的目标,尽管缺乏子模块。我们还开发了一个合适的学习模型,在预测顶部 K 感染节点的任务竞争基线,给定一个种子集作为输入。 code 0
Beyond Individuals: Modeling Mutual and Multiple Interactions for Inductive Link Prediction between Groups Gongzhu Yin, Xing Wang, Hongli Zhang, Chao Meng, Yuchen Yang, Kun Lu, Yi Luo Harbin Institute of Technology, Harbin, China Link prediction is a core task in graph machine learning with wide applications. However, little attention has been paid to link prediction between two group entities. This limits the application of the current approaches to many real-life problems, such as predicting collaborations between academic groups or recommending bundles of items to group users. Moreover, groups are often ephemeral or emergent, forcing the predicting model to deal with challenging inductive scenes. To fill this gap, we develop a framework composed of a GNN-based encoder and neural-based aggregating networks, namely the Mutual Multi-view Attention Networks (MMAN). First, we adopt GNN-based encoders to model multiple interactions among members and groups through propagating. Then, we develop MMAN to aggregate members' node representations into multi-view group representations and compute the final results by pooling pairwise scores between views. Specifically, several view-guided attention modules are adopted when learning multi-view group representations, thus capturing diversified member weights and multifaceted group characteristics. In this way, MMAN can further mimic the mutual and multiple interactions between groups. We conduct experiments on three datasets, including two academic group link prediction datasets and one bundle-to-group recommendation dataset. The results demonstrate that the proposed approach can achieve superior performance on both tasks compared with plain GNN-based methods and other aggregating methods. 
链路预测是图机学习的核心任务,有着广泛的应用。然而,人们很少关注两个群体实体之间的联系预测。这限制了当前方法在许多实际问题上的应用,例如预测学术团体之间的协作,或者向团体用户推荐成捆的项目。此外,群体往往是短暂的或突发的,迫使预测模型处理具有挑战性的归纳场景。为了填补这一空白,我们开发了一个由基于 GNN 的编码器和基于神经元的聚合网络组成的框架,即互多视点注意网络(MMAN)。首先,我们采用基于 GNN 的编码器,通过传播来建模成员和组之间的多重交互。然后,开发 MMAN,将成员的节点表示聚合为多视图组表示,并通过在视图之间汇集成对分数来计算最终结果。具体地说,在学习多视点群体表征时,采用了多个视点引导的注意模块,从而获取多元化的成员权重和多方面的群体特征。这样,MMAN 可以进一步模拟群体之间的相互作用和多重作用。我们在三个数据集上进行实验,包括两个学术群组链接预测数据集和一个捆绑群组推荐数据集。结果表明,与普通的基于 GNN 的聚集方法和其他聚集方法相比,该方法在两种任务上都能获得较好的性能。 code 0
Scalable Adversarial Attack Algorithms on Influence Maximization Lichao Sun, Xiaobin Rui, Wei Chen Lehigh University, Bethlehem, PA, USA; Microsoft Research Asia, Beijing, China; China University of Mining and Technology, Xuzhou, China In this paper, we study the adversarial attacks on influence maximization under dynamic influence propagation models in social networks. In particular, given a known seed set S, the problem is to minimize the influence spread from S by deleting a limited number of nodes and edges. This problem reflects many application scenarios, such as blocking virus (e.g. COVID-19) propagation in social networks by quarantine and vaccination, blocking rumor spread by freezing fake accounts, or attacking competitor's influence by incentivizing some users to ignore the information from the competitor. In this paper, under the linear threshold model, we adapt the reverse influence sampling approach and provide efficient algorithms of sampling valid reverse reachable paths to solve the problem. We present three different design choices on reverse sampling, which all guarantee $1/2 - \varepsilon$ approximation (for any small $\varepsilon > 0$) and an efficient running time. 本文研究了社会网络中动态影响传播模型下影响最大化的对手攻击问题。特别地,给定一个已知的种子集 S,问题是通过删除有限数量的节点和边来最小化来自 S 的影响。这个问题反映了许多应用场景,比如通过隔离和接种疫苗来阻止病毒(例如2019冠状病毒疾病)在社交网络中的传播,通过冻结虚假账户来阻止谣言的传播,或者通过鼓励一些用户忽略竞争对手的信息来攻击竞争对手的影响力。本文在线性阈值模型下,采用反向影响采样方法,提出了有效的反向可达路径采样算法。我们提出了三种不同的反向采样设计选择,它们都保证了 $1/2 - \varepsilon$ 近似(对于任意小的 $\varepsilon > 0$)和有效的运行时间。 code 0
S2GAE: Self-Supervised Graph Autoencoders are Generalizable Learners with Graph Masking Qiaoyu Tan, Ninghao Liu, Xiao Huang, SooHyun Choi, Li Li, Rui Chen, Xia Hu Rice University, Houston, TX, USA; Samsung Electronics, Mountain View, CA, USA; Samsung Electronics America, Mountain View, CA, USA; University of Georgia, Athens, GA, USA; The Hong Kong Polytechnic University, Hong Kong, China; Texas A&M University, College Station, TX, USA Self-supervised learning (SSL) has been demonstrated to be effective in pre-training models that can be generalized to various downstream tasks. Graph Autoencoder (GAE), an increasingly popular SSL approach on graphs, has been widely explored to learn node representations without ground-truth labels. However, recent studies show that existing GAE methods could only perform well on link prediction tasks, while their performance on classification tasks is rather limited. This limitation casts doubt on the generalizability and adoption of GAE. In this paper, for the first time, we show that GAE can generalize well to both link prediction and classification scenarios, including node-level and graph-level tasks, by redesigning its critical building blocks from the graph masking perspective. Our proposal is called Self-Supervised Graph Autoencoder--S2GAE, which unleashes the power of GAEs with minimal yet nontrivial efforts. Specifically, instead of reconstructing the whole input structure, we randomly mask a portion of edges and learn to reconstruct these missing edges with an effective masking strategy and an expressive decoder network. Moreover, we theoretically prove that S2GAE could be regarded as an edge-level contrastive learning framework, providing insights into why it generalizes well. Empirically, we conduct extensive experiments on 21 benchmark datasets across link prediction and node & graph classification tasks. The results validate the superiority of S2GAE against state-of-the-art generative and contrastive methods. 
This study demonstrates the potential of GAE as a universal representation learner on graphs. Our code is publicly available at https://github.com/qiaoyu-tan/S2GAE. 自监督学习(SSL)已被证明是有效的预训练模型,可以推广到各种下游任务。图形自动编码器(Graph Autoencoder,GAE)是一种越来越受欢迎的图形 SSL 方法,已经被广泛研究用于学习没有地面真值标签的节点表示。然而,最近的研究表明,现有的 GAE 方法只能在链路预测任务上表现良好,而在分类任务上的表现相当有限。这种局限性使人们对 GAE 的普遍性和采用性产生了怀疑。本文首次从图掩蔽的角度出发,通过重新设计关键模块,证明了 GAE 可以很好地推广到链路预测和分类场景,包括节点级和图级任务。我们的提议被称为自监督图形自动编码器—— S2GAE,它通过最少的努力释放了 GAE 的威力。具体来说,我们不是重建整个输入结构,而是随机掩蔽一部分边缘,并学习用有效的掩蔽策略和表达式解码器网络重建这些缺失的边缘。此外,我们从理论上证明了 S2GAE 可以被看作是一个边缘层次的对比学习框架,从而为 S2GAE 的推广提供了理论依据。通过实验,我们对21个基准数据集进行了跨链路预测和节点与图分类任务的广泛实验。实验结果验证了 S2GAE 算法相对于最新的生成方法和对比方法的优越性。本研究证明了 GAE 作为图形表示学习者的潜力。我们的代码可以在 https://github.com/qiaoyu-tan/s2gae 上公开获取。 code 0
Dependency-aware Self-training for Entity Alignment Bing Liu, Tiancheng Lan, Wen Hua, Guido Zuccon The University of Queensland, Brisbane, QLD, Australia Entity Alignment (EA), which aims to detect entity mappings (i.e. equivalent entity pairs) in different Knowledge Graphs (KGs), is critical for KG fusion. Neural EA methods dominate current EA research but still suffer from their reliance on labelled mappings. To solve this problem, a few works have explored boosting the training of EA models with self-training, which adds confidently predicted mappings into the training data iteratively. Though the effectiveness of self-training can be glimpsed in some specific settings, we still have very limited knowledge about it. One reason is the existing works concentrate on devising EA models and only treat self-training as an auxiliary tool. To fill this knowledge gap, we change the perspective to self-training to shed light on it. In addition, the existing self-training strategies have limited impact because they introduce either much False Positive noise or a low quantity of True Positive pseudo mappings. To improve self-training for EA, we propose exploiting the dependencies between entities, a particularity of EA, to suppress the noise without hurting the recall of True Positive mappings. Through extensive experiments, we show that the introduction of dependency makes the self-training strategy for EA reach a new level. The value of self-training in alleviating the reliance on annotation is actually much higher than what has been realised. Furthermore, we suggest future study on smart data annotation to break the ceiling of EA performance. 
实体对齐(EA)是 KG 融合的关键技术,其目的是检测不同知识图(KG)中的实体映射(即等价实体对)。神经 EA 方法主导了目前的 EA 研究,但仍然受限于对标记映射的依赖。为了解决这个问题,一些工作已经探索了用自训练的方法来增强 EA 模型的训练,这种方法迭代地将高置信度的预测映射加入训练数据。虽然自训练的有效性可以在一些特定的设置中看到,但我们对它的了解仍然非常有限。原因之一是现有的工作集中在设计 EA 模型,只把自训练作为辅助工具。为了填补这一知识空白,我们将视角转向自训练,以阐明这一点。此外,现有的自训练策略由于引入了大量的假正例噪声或只有少量的真正例伪映射,因而影响有限。为了改进 EA 的自训练,我们提出利用实体之间的依赖关系(EA 的一个特性),在不影响真正例映射召回的前提下抑制噪声。通过大量的实验,我们发现依赖性的引入使得 EA 的自训练策略达到了一个新的水平。自训练在减轻对标注的依赖方面的价值实际上远远高于人们已经认识到的程度。此外,我们建议未来研究智能数据标注,以打破 EA 性能的上限。 code 0
Weakly Supervised Entity Alignment with Positional Inspiration Wei Tang, Fenglong Su, Haifeng Sun, Qi Qi, Jingyu Wang, Shimin Tao, Hao Yang Beijing University of Posts and Telecommunications, Beijing, China; Huawei, Beijing, China; National University of Defense Technology, Changsha, China The current success of entity alignment (EA) is still mainly based on large-scale labeled anchor links. However, the refined annotation of anchor links still consumes a lot of manpower and material resources. As a result, an increasing number of works based on active learning, few-shot learning, or other deep network learning techniques have been developed to address the performance bottleneck caused by a lack of labeled data. These works focus either on the strategy of choosing more informative labeled data or on the strategy of model training, while it remains opaque why existing popular EA models (e.g., GNN-based models) fail the EA task with limited labeled data. To overcome this issue, this paper analyzes the problem of weakly supervised EA from the perspective of model design and proposes a novel weakly supervised learning framework, Position Enhanced Entity Alignment (PEEA). Besides absorbing structural and relational information, PEEA aims to increase the connections between far-away entities and labeled ones by incorporating positional information into the representation learning with a Position Attention Layer (PAL). To fully utilize the limited anchor links, we further introduce a novel position encoding method that considers both anchor links and relational information from a global view. The proposed position encoding will be fed into PEEA as additional entity features. Extensive experiments on public datasets demonstrate the effectiveness of PEEA. 
目前实体对齐(EA)的成功仍然主要基于大规模标记的锚链接。然而,锚链接的精细标注仍然耗费了大量的人力物力。因此,越来越多的基于主动学习、少样本学习或其他深度网络学习技术的工作被开发出来,以解决因缺乏标记数据而造成的性能瓶颈。这些工作要么集中在选择信息量更大的标记数据的策略,要么集中在模型训练的策略上,而为什么现有的流行的 EA 模型(例如,基于 GNN 的模型)在有限的标记数据下无法完成 EA 任务仍然是不透明的。为了克服这个问题,本文从模型设计的角度分析了弱监督 EA 问题,并提出了一种新的弱监督学习框架,即位置增强实体对齐(PEEA)。除了吸收结构信息和关系信息外,PEEA 的目标是通过位置注意层(PAL)将位置信息整合到表征学习中来增加远距离实体和被标记实体之间的联系。为了充分利用有限的锚链接,我们进一步介绍了一种新的位置编码方法,从全局的角度考虑锚链接和关系信息。提出的位置编码将作为额外的实体特征输入 PEEA。在公共数据集上的大量实验证明了 PEEA 的有效性。 code 0
Hansel: A Chinese Few-Shot and Zero-Shot Entity Linking Benchmark Zhenran Xu, Zifei Shan, Yuxin Li, Baotian Hu, Bing Qin Wechat, Tencent, Shanghai, China; Harbin Institute of Technology, Harbin, China; Harbin Institute of Technology, Shenzhen, China; Wechat, Tencent, Shenzhen, China Modern Entity Linking (EL) systems entrench a popularity bias, yet there is no dataset focusing on tail and emerging entities in languages other than English. We present Hansel, a new benchmark in Chinese that fills the vacancy of non-English few-shot and zero-shot EL challenges. The test set of Hansel is human annotated and reviewed, created with a novel method for collecting zero-shot EL datasets. It covers 10K diverse documents in news, social media posts and other web articles, with Wikidata as its target Knowledge Base. We demonstrate that the existing state-of-the-art EL system performs poorly on Hansel (R@1 of 36.6% on Few-Shot). We then establish a strong baseline that scores an R@1 of 46.2% on Few-Shot and 76.6% on Zero-Shot on our dataset. We also show that our baseline achieves competitive results on the TAC-KBP2015 Chinese Entity Linking task. 现代实体链接(EL)系统固化了流行度偏差,然而除了英语以外,没有关注尾部和新兴实体的数据集。我们提出 Hansel,一个新的中文基准,填补了非英语少样本和零样本 EL 挑战的空白。Hansel 的测试集经人工标注和审核,并采用一种收集零样本 EL 数据集的新方法创建。它涵盖了新闻、社交媒体帖子和其他网络文章中的 10K 篇不同文档,以 Wikidata 作为其目标知识库。我们证明了现有的最先进的 EL 系统在 Hansel 上表现很差(在 Few-Shot 上 R@1 为36.6%)。然后,我们建立一个强基线,在我们的数据集上,在 Few-Shot 和 Zero-Shot 上分别得到46.2% 和76.6% 的 R@1。我们的基线在 TAC-KBP2015 中文实体链接任务上也取得了有竞争力的成绩。 code 0
Self-supervised Multi-view Disentanglement for Expansion of Visual Collections Nihal Jain, Praneetha Vaddamanu, Paridhi Maheshwari, Vishwa Vinay, Kuldeep Kulkarni Adobe Research, Bangalore, India; Carnegie Mellon University, Pittsburgh, PA, USA; Stanford University, Stanford, CA, USA Image search engines enable the retrieval of images relevant to a query image. In this work, we consider the setting where a query for similar images is derived from a collection of images. For visual search, the similarity measurements may be made along multiple axes, or views, such as style and color. We assume access to a set of feature extractors, each of which computes representations for a specific view. Our objective is to design a retrieval algorithm that effectively combines similarities computed over representations from multiple views. To this end, we propose a self-supervised learning method for extracting disentangled view-specific representations for images such that the inter-view overlap is minimized. We show how this allows us to compute the intent of a collection as a distribution over views. We show how effective retrieval can be performed by prioritizing candidate expansion images that match the intent of a query collection. Finally, we present a new querying mechanism for image search enabled by composing multiple collections and perform retrieval under this setting using the techniques presented in this paper. 图像搜索引擎可以检索与查询图像相关的图像。在这项工作中,我们考虑的设置,其中相似的图像查询是从一个图像集合派生。对于可视化搜索,可以沿着多个轴或视图(如样式和颜色)进行相似性度量。我们假设访问一组特征提取器,每个特征提取器计算特定视图的表示。我们的目标是设计一个检索算法,有效地结合相似性计算的表示从多个视图。为此,我们提出了一种自监督学习方法来提取图像的非纠缠视点特定表示,从而使视点间的重叠最小化。我们展示了这是如何允许我们将集合的意图作为视图的分布来计算的。我们展示了如何通过对匹配查询集合意图的候选扩展图像进行优先级排序来执行有效的检索。最后,我们提出了一种新的图像搜索查询机制,该机制通过组合多个集合,并使用本文提出的技术在此设置下进行检索。 code 0
Efficient Integration of Multi-Order Dynamics and Internal Dynamics in Stock Movement Prediction Thanh Trung Huynh, Minh Hieu Nguyen, Thanh Tam Nguyen, Phi Le Nguyen, Matthias Weidlich, Quoc Viet Hung Nguyen, Karl Aberer Hanoi University of Science and Technology, Hanoi, Vietnam; Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland; Humboldt-Universität zu Berlin, Berlin, Germany; Griffith University, Gold Coast, Australia Advances in deep neural network (DNN) architectures have enabled new prediction techniques for stock market data. Unlike other multivariate time-series data, stock markets show two unique characteristics: (i) multi-order dynamics, as stock prices are affected by strong non-pairwise correlations (e.g., within the same industry); and (ii) internal dynamics, as each individual stock shows some particular behaviour. Recent DNN-based methods capture multi-order dynamics using hypergraphs, but rely on the Fourier basis in the convolution, which is both inefficient and ineffective. In addition, they largely ignore internal dynamics by adopting the same model for each stock, which implies a severe information loss. In this paper, we propose a framework for stock movement prediction to overcome the above issues. Specifically, the framework includes temporal generative filters that implement a memory-based mechanism onto an LSTM network in an attempt to learn individual patterns per stock. Moreover, we employ hypergraph attentions to capture the non-pairwise correlations. Here, using the wavelet basis instead of the Fourier basis enables us to simplify the message passing and focus on the localized convolution. Experiments with US market data over six years show that our framework outperforms state-of-the-art methods in terms of profit and stability. Our source code and data are available at https://github.com/thanhtrunghuynh93/estimate. 
深度神经网络(DNN)结构的进步使得新的股票市场数据预测技术成为可能。与其他多变量时间序列数据不同,股票市场表现出两个独特的特征: (i)多阶动态,因为股票价格受到强烈的非成对相关性(例如,在同一行业内)的影响; (ii)内部动态,因为每只股票都表现出一些特殊的行为。目前基于 DNN 的方法利用超图来捕获多阶动态,但是依赖于卷积中的傅里叶基,这种方法既低效又效果不佳。此外,它们对每只股票采用相同的模型,在很大程度上忽略了内部动态,这意味着严重的信息损失。针对上述问题,本文提出了一个股票走势预测框架。具体来说,该框架包括时间生成过滤器,它在 LSTM 网络上实现基于记忆的机制,以便学习每只股票的个体模式。此外,我们使用超图注意力来捕捉非成对的相关性。这里,用小波基代替傅里叶基,使我们能够简化消息传递,并集中于局部卷积。对六年美国市场数据的实验表明,我们的框架在利润和稳定性方面优于最先进的方法。我们的源代码和数据可以在 https://github.com/thanhtrunghuynh93/estimate 找到。 code 0
Combining vs. Transferring Knowledge: Investigating Strategies for Improving Demographic Inference in Low Resource Settings Yaguang Liu, Lisa Singh Georgetown University, Washington, DC, USA For some learning tasks, generating a large labeled data set is impractical. Demographic inference using social media data is one such task. While different strategies have been proposed to mitigate this challenge, including transfer learning, data augmentation, and data combination, they have not been explored for the task of user level demographic inference using social media data. This paper explores two of these strategies: data combination and transfer learning. First, we combine labeled training data from multiple data sets of similar size to understand when the combination is valuable and when it is not. Using data set distance, we quantify the relationship between our data sets to help explain the performance of the combination strategy. Then, we consider supervised transfer learning, where we pretrain a model on a larger labeled data set, fine-tune the model on smaller data sets, and incorporate regularization as part of the transfer learning process. We empirically show the strengths and limitations of the proposed techniques on multiple Twitter data sets. 对于一些学习任务,生成一个大的标记数据集是不切实际的。利用社交媒体数据进行人口统计推断就是这样一项任务。虽然已经提出了不同的策略来缓解这一挑战,包括转移学习,数据增强和数据组合,但是还没有探索使用社交媒体数据进行用户级人口推断的任务。本文探讨了其中的两种策略: 数据组合和迁移学习。首先,我们将来自多个大小相似的数据集的标记训练数据进行组合,以了解这种组合在什么时候有价值,什么时候没有价值。利用数据集距离量化数据集之间的关系,有助于解释组合策略的性能。然后,我们考虑有监督的迁移学习,其中我们预训练一个模型在一个较大的标记数据集,微调模型在较小的数据集,并纳入正则化作为迁移学习过程的一部分。我们通过实例展示了在多个 Twitter 数据集上提出的技术的优势和局限性。 code 0
Active Ensemble Learning for Knowledge Graph Error Detection Junnan Dong, Qinggang Zhang, Xiao Huang, Qiaoyu Tan, Daochen Zha, Zihao Zhao Rice University, Houston, TX, USA; The Hong Kong Polytechnic University, Hong Kong, China; Texas A&M University, College Station, TX, USA Knowledge graphs (KGs) could effectively integrate a large number of real-world assertions, and improve the performance of various applications, such as recommendation and search. KG error detection has been intensively studied since real-world KGs inevitably contain erroneous triples. While existing studies focus on developing a novel algorithm dedicated to one or a few data characteristics, we explore advancing KG error detection by assembling a set of state-of-the-art (SOTA) KG error detectors. However, it is nontrivial to develop a practical ensemble learning framework for KG error detection. Existing ensemble learning models heavily rely on labels, while it is expensive to acquire labeled errors in KGs. Also, KG error detection itself is challenging since triples contain rich semantic information and might be false because of various reasons. To this end, we propose to leverage active learning to minimize human efforts. Our proposed framework - KAEL, could effectively assemble a set of off-the-shelf error detection algorithms, by actively using a limited number of manual annotations. It adaptively updates the ensemble learning policy in each iteration based on active queries, i.e., the answers from experts. After all annotation budget is used, KAEL utilizes the trained policy to identify remaining suspicious triples. Experiments on real-world KGs demonstrate that we can achieve significant improvement when applying KAEL to assemble SOTA error detectors. KAEL also outperforms SOTA ensemble learning baselines significantly. 
知识图(KGs)可以有效地集成大量真实世界的断言,并提高各种应用程序(如推荐和搜索)的性能。由于现实生活中的 KG 不可避免地含有错误的三元组,因此对 KG 错误检测进行了深入的研究。现有的研究集中在开发专门针对一个或几个数据特征的新算法,而我们探讨通过组合一组最先进的(SOTA) KG 错误检测器来推进 KG 错误检测。然而,开发一个实用的 KG 错误检测集成学习框架并非易事。现有的集成学习模型严重依赖于标签,而在 KG 中获取带标签的错误成本很高。此外,KG 错误检测本身具有挑战性,因为三元组包含丰富的语义信息,并且可能由于各种原因而出错。为此,我们建议利用主动学习来最大限度地减少人工投入。我们提出的框架 KAEL 可以通过主动使用有限数量的人工标注,有效地组合一组现成的错误检测算法。它根据主动查询(即专家的回答)自适应地更新每次迭代中的集成学习策略。在用完所有标注预算之后,KAEL 使用经过训练的策略来识别剩余的可疑三元组。在实际 KG 上的实验表明,采用 KAEL 组合 SOTA 错误检测器可以取得明显的改进。KAEL 的表现也明显优于 SOTA 集成学习基线。 code 0
Stochastic Solutions for Dense Subgraph Discovery in Multilayer Networks Yasushi Kawase, Atsushi Miyauchi, Hanna Sumita The University of Tokyo, Bunkyo-ku, Japan; Tokyo Institute of Technology, Meguro-ku, Japan Network analysis has played a key role in knowledge discovery and data mining. In many real-world applications in recent years, we are interested in mining multilayer networks, where we have a number of edge sets called layers, which encode different types of connections and/or time-dependent connections over the same set of vertices. Among many network analysis techniques, dense subgraph discovery, aiming to find a dense component in a network, is an essential primitive with a variety of applications in diverse domains. In this paper, we introduce a novel optimization model for dense subgraph discovery in multilayer networks. Our model aims to find a stochastic solution, i.e., a probability distribution over the family of vertex subsets, rather than a single vertex subset, whereas it can also be used for obtaining a single vertex subset. For our model, we design an LP-based polynomial-time exact algorithm. Moreover, to handle large-scale networks, we also devise a simple, scalable preprocessing algorithm, which often reduces the size of the input networks significantly and results in a substantial speed-up. Computational experiments demonstrate the validity of our model and the effectiveness of our algorithms. 网络分析在知识发现和数据挖掘中发挥了重要作用。在最近几年的许多实际应用中,我们对挖掘多层网络感兴趣,其中我们有许多称为层的边集,它们在同一组顶点上编码不同类型的连接和/或依赖于时间的连接。在众多的网络分析技术中,致密子图发现是一种必要的原语,其目的是在网络中寻找致密组件,在不同的领域有着广泛的应用。本文提出了一种新的多层网络稠密子图发现优化模型。我们的模型的目的是找到一个随机解,即,一个概率分布在顶点子集家族,而不是一个单一的顶点子集,然而它也可以用来获得一个单一的顶点子集。对于我们的模型,我们设计了一个基于 LP 的多项式时间精确算法。此外,为了处理大规模的网络,我们还设计了一个简单的,可扩展的预处理算法,这往往大大减少了输入网络的大小,并导致大幅度的加速。计算实验验证了模型的有效性和算法的有效性。 code 0
Differentially Private Temporal Difference Learning with Stochastic Nonconvex-Strongly-Concave Optimization Canzhe Zhao, Yanjie Ze, Jing Dong, Baoxiang Wang, Shuai Li The Chinese University of Hong Kong, Shenzhen, Shenzhen, China; Shanghai Jiao Tong University, Shanghai, China Temporal difference (TD) learning is a widely used method to evaluate policies in reinforcement learning. While many TD learning methods have been developed in recent years, little attention has been paid to preserving privacy and most of the existing approaches might face the concerns of data privacy from users. To enable complex representative abilities of policies, in this paper, we consider preserving privacy in TD learning with nonlinear value function approximation. This is challenging because such a nonlinear problem is usually studied in the formulation of stochastic nonconvex-strongly-concave optimization to gain finite-sample analysis, which would require simultaneously preserving the privacy on primal and dual sides. To this end, we employ a momentum-based stochastic gradient descent ascent to achieve a single-timescale algorithm, and achieve a good trade-off between meaningful privacy and utility guarantees of both the primal and dual sides by perturbing the gradients on both sides using well-calibrated Gaussian noises. As a result, our DPTD algorithm could provide $(\epsilon,\delta)$-differential privacy (DP) guarantee for the sensitive information encoded in transitions and retain the original power of TD learning, with the utility upper bounded by $\widetilde{\mathcal{O}}(\frac{(d\log(1/\delta))^{1/8}}{(n\epsilon)^{1/4}})$ (The tilde in this paper hides the log factor.), where $n$ is the trajectory length and $d$ is the dimension. Extensive experiments conducted in OpenAI Gym show the advantages of our proposed algorithm. 
时序差分(TD)学习是一种在强化学习中广泛使用的策略评估方法。尽管近年来提出了许多 TD 学习方法,但对于保护隐私的研究却很少,现有的方法大多面临着用户对数据隐私的担忧。为了使策略具有复杂的表示能力,本文考虑在具有非线性值函数逼近的 TD 学习中保护隐私。这是一个具有挑战性的问题,因为这样的非线性问题通常在随机非凸-强凹优化的形式下研究以获得有限样本分析,这将需要同时保护原始和对偶两侧的隐私。为此,我们采用基于动量的随机梯度下降上升来实现一个单时间尺度算法,并通过使用校准良好的高斯噪声扰动两侧的梯度,在原始和对偶两侧有意义的隐私和效用保证之间达到一个良好的平衡。因此,我们的 DPTD 算法可以为转移中编码的敏感信息提供 $(\epsilon,\delta)$-差分隐私(DP)保证,并且保留了 TD 学习的原始能力,其效用上界为 $\widetilde{\mathcal{O}}(\frac{(d\log(1/\delta))^{1/8}}{(n\epsilon)^{1/4}})$(本文中的波浪线隐藏了 log 因子),其中 $n$ 是轨迹长度,$d$ 是维数。在 OpenAI Gym 中进行的大量实验表明了我们提出的算法的优点。 code 0
Feature Missing-aware Routing-and-Fusion Network for Customer Lifetime Value Prediction in Advertising Xuejiao Yang, Binfeng Jia, Shuangyang Wang, Shijie Zhang Tencent, Shenzhen, China Nowadays, customer lifetime value (LTV) plays an important role in mobile game advertising, since it can be beneficial to adjust ad bids and ensure that the games are promoted to the most valuable users. Some neural models are utilized for LTV prediction based on the rich user features. However, in the advertising scenario, due to privacy settings, limited length of log retention, etc., most existing approaches suffer from the missing feature problem. Moreover, only a small fraction of purchase behaviours can be observed. The label sparsity inevitably limits model expressiveness. To tackle the aforementioned challenges, we propose a feature missing-aware routing-and-fusion network (MarfNet) to reduce the effect of the missing features while training. Specifically, we calculate the missing states of raw features and feature interactions for each sample. Based on the missing states, two missing-aware layers are designed to route samples into different experts, thus each expert can focus on the real features of samples assigned to it. Finally, we get the missing-aware representation by the weighted fusion of the experts. To alleviate the label sparsity, we further propose a batch-in dynamic discrimination enhanced (Bidden) loss weight mechanism, which can automatically assign greater loss weights to difficult samples in the training process. Both offline experiments and online A/B tests have validated the superiority of our proposed Bidden-MarfNet. 
目前,客户生命周期价值(LTV)在手机游戏广告中扮演着重要角色,它有利于调整广告投标价格,保证手机游戏向最有价值的用户推广。基于丰富的用户特征,利用一些神经模型对 LTV 进行预测。然而,在广告场景中,由于隐私设置或有限的日志保留时间等原因,大多数现有的方法都存在特征缺失的问题。此外,只有一小部分的购买行为可以观察到。标签的稀疏性不可避免地限制了模型的表达能力。为了解决上述问题,我们提出了一种特征缺失感知的路由融合网络(MarfNet) ,以减少训练过程中特征缺失的影响。具体来说,我们计算每个样本的原始特征和特征交互的缺失状态。在缺失状态的基础上,设计了两个缺失感知层,将样本分配给不同的专家,从而使每个专家能够专注于分配给它的样本的真实特征。最后通过专家加权融合得到缺失感知表示。为了缓解标签稀疏性,本文进一步提出了一种批内动态鉴别增强(Bidden)损失权重机制,该机制可以在训练过程中自动为难度较大的样本赋予较大的损失权重。离线实验和在线 A/B 测试都验证了我们提出的 Bidden-MarfNet 的优越性。 code 0
Boosting Advertising Space: Designing Ad Auctions for Augment Advertising Yangsu Liu, Dagui Chen, Zhenzhe Zheng, Zhilin Zhang, Chuan Yu, Fan Wu, Guihai Chen Alibaba Group, Beijing, China; Shanghai Jiao Tong University, Shanghai, China In online e-commerce platforms, sponsored ads are always mixed with non-sponsored organic content (recommended items). To guarantee user experience, online platforms always impose strict limitations on the number of ads displayed, becoming the bottleneck for advertising revenue. To boost advertising space, we introduce a novel advertising business paradigm called Augment Advertising, where once a user clicks on a leading ad on the main page, instead of being shown the corresponding products, a collection of mini-detail ads relevant to the clicked ad is displayed. A key component for augment advertising is to design ad auctions to jointly select leading ads on the main page and mini-detail ads on the augment ad page. In this work, we decouple the ad auction into a two-stage auction, including a leading ad auction and a mini-detail ad auction. We design the Potential Generalized Second Price (PGSP) auction with Symmetric Nash Equilibrium (SNE) for leading ads, and adopt GSP auction for mini-detail ads. We have deployed augment advertising on Taobao advertising platform, and conducted extensive offline evaluations and online A/B tests. The evaluation results show that augment advertising could guarantee user experience while improving the ad revenue and the PGSP auction outperforms baselines in terms of revenue and user experience in augment advertising. 
在在线电子商务平台中,赞助广告总是与非赞助的自然内容(推荐商品)混合在一起。为了保证用户体验,在线平台总是对广告的显示数量进行严格的限制,这成为广告收入的瓶颈。为了扩大广告空间,我们引入了一种新颖的广告商业模式,称为“增强广告”(Augment Advertising) ,一旦用户点击主页上的一个主广告,不再显示相应的产品,而是显示与所点击广告相关的一系列微细节广告。增强广告的一个关键组成部分是设计广告拍卖,联合选择主页上的主广告和增强广告页面上的微细节广告。在这个工作中,我们将广告拍卖分解为两个阶段的拍卖,包括主广告拍卖和微细节广告拍卖。我们为主广告设计了具有对称纳什均衡(SNE)的潜在广义二价(PGSP)拍卖,并为微细节广告采用 GSP 拍卖。我们在淘宝广告平台上部署了增强广告,并进行了广泛的离线评估和在线 A/B 测试。评价结果表明,增强广告可以在保证用户体验的同时提高广告收入,PGSP 拍卖在增强广告的收入和用户体验方面优于基线。 code 0
Long-Document Cross-Lingual Summarization Shaohui Zheng, Zhixu Li, Jiaan Wang, Jianfeng Qu, An Liu, Lei Zhao, Zhigang Chen Soochow University, Suzhou, China; Jilin Kexun Information Technology Co., Ltd., Jilin, China; Fudan University, Shanghai, China Cross-Lingual Summarization (CLS) aims at generating summaries in one language for the given documents in another language. CLS has attracted wide research attention due to its practical significance in the multi-lingual world. Though great contributions have been made, existing CLS works typically focus on short documents, such as news articles, short dialogues and guides. Different from these short texts, long documents such as academic articles and business reports usually discuss complicated subjects and consist of thousands of words, making them non-trivial to process and summarize. To promote CLS research on long documents, we construct Perseus, the first long-document CLS dataset which collects about 94K Chinese scientific documents paired with English summaries. The average length of documents in Perseus is more than two thousand tokens. As a preliminary study on long-document CLS, we build and evaluate various CLS baselines, including pipeline and end-to-end methods. Experimental results on Perseus show the superiority of the end-to-end baseline, outperforming the strong pipeline models equipped with sophisticated machine translation systems. Furthermore, to provide a deeper understanding, we manually analyze the model outputs and discuss specific challenges faced by current approaches. We hope that our work could benchmark long-document CLS and benefit future studies. 
跨语言摘要(CLS)的目的是为一种语言的给定文档生成另一种语言的摘要。CLS 因其在多语言世界中的实际意义而引起了广泛的研究关注。虽然已经作出了巨大的贡献,现有的 CLS 工作通常集中在短文档,如新闻文章,短对话和指南。与这些短文不同的是,学术文章、商务报告等长文档通常涉及复杂的主题,由数千字组成,因此处理和总结起来并非易事。为了促进长文档的 CLS 研究,我们构建了第一个长文档 CLS 数据集 Perseus,该数据集收集了约94K 篇中文科学文献并附有英文摘要。Perseus 中文档的平均长度超过两千个词元。作为对长文档 CLS 的初步研究,我们建立和评估各种 CLS 基线,包括流水线和端到端方法。在 Perseus 上的实验结果显示了端到端基线的优越性,优于配备复杂机器翻译系统的强流水线模型。此外,为了提供更深入的理解,我们手动分析模型输出并讨论当前方法面临的具体挑战。我们希望我们的工作可以作为长文档 CLS 的基准,并有利于未来的研究。 code 0
FineSum: Target-Oriented, Fine-Grained Opinion Summarization Suyu Ge, Jiaxin Huang, Yu Meng, Jiawei Han University of Illinois Urbana-Champaign, Urbana, IL, USA Target-oriented opinion summarization is to profile a target by extracting user opinions from multiple related documents. Instead of simply mining opinion ratings on a target (e.g., a restaurant) or on multiple aspects (e.g., food, service) of a target, it is desirable to go deeper, to mine opinion on fine-grained sub-aspects (e.g., fish). However, it is expensive to obtain high-quality annotations at such fine-grained scale. This leads to our proposal of a new framework, FineSum, which advances the frontier of opinion analysis in three aspects: (1) minimal supervision, where no document-summary pairs are provided, only aspect names and a few aspect/sentiment keywords are available; (2) fine-grained opinion analysis, where sentiment analysis drills down to a specific subject or characteristic within each general aspect; and (3) phrase-based summarization, where short phrases are taken as basic units for summarization, and semantically coherent phrases are gathered to improve the consistency and comprehensiveness of summary. Given a large corpus with no annotation, FineSum first automatically identifies potential spans of opinion phrases, and further reduces the noise in identification results using aspect and sentiment classifiers. It then constructs multiple fine-grained opinion clusters under each aspect and sentiment. Each cluster expresses uniform opinions towards certain sub-aspects (e.g., "fish" in "food" aspect) or characteristics (e.g., "Mexican" in "food" aspect). To accomplish this, we train a spherical word embedding space to explicitly represent different aspects and sentiments. We then distill the knowledge from embedding to a contextualized phrase classifier, and perform clustering using the contextualized opinion-aware phrase embedding. 
Both automatic evaluations on the benchmark and quantitative human evaluation validate the effectiveness of our approach. 面向目标的意见摘要是从多个相关文档中提取用户意见,对目标进行轮廓分析。与其简单地挖掘目标(如餐馆)或目标的多个方面(如食物、服务)的意见评级,不如更深入地挖掘细粒度的子方面(如鱼)的意见。然而,在如此细粒度的规模下获得高质量的注释是非常昂贵的。这导致我们提出了一个新的框架,FineSum,它在三个方面推进了意见分析的前沿: (1)最小的监督,没有提供文档-摘要对,只有方面名称和一些方面/情绪关键词可用; (2)细粒度的意见分析,情绪分析深入到每个一般方面的特定主题或特征; 和(3)基于短语的摘要,短语被作为基本的总结单元,语义连贯的短语被收集起来,以提高摘要的一致性和全面性。给定一个没有注释的大型语料库,FineSum 首先自动识别意见短语的潜在范围,并使用方面和情感分类器进一步降低识别结果中的噪声。然后在各个方面和情绪下构建多个细粒度的意见聚类。每个群组对某些子方面(例如,“食物”方面的“鱼”)或特征(例如,“食物”方面的“墨西哥”)表达统一的意见。为了达到这个目的,我们训练一个球形词嵌入空间来显式地表示不同的方面和情感。然后将嵌入的知识提取到上下文短语分类器中,利用上下文意见感知短语嵌入进行聚类。基准的自动评估和定量的人工评估都验证了该方法的有效性。 code 0
EZInterviewer: To Improve Job Interview Performance with Mock Interview Generator Mingzhe Li, Xiuying Chen, Weiheng Liao, Yang Song, Tao Zhang, Dongyan Zhao, Rui Yan Renmin University of China, Beijing, China; Peking University, Beijing, China; KAUST, Jeddah, Saudi Arabia; Made by DATA, Beijing, China; BOSS Zhipin, Beijing, China The interview has been regarded as one of the most crucial steps in recruitment. To fully prepare for the interview with the recruiters, job seekers usually practice with mock interviews between each other. However, such a mock interview with peers is generally far away from the real interview experience: the mock interviewers are not guaranteed to be professional and are not likely to behave like a real interviewer. Due to the rapid growth of online recruitment in recent years, recruiters tend to have online interviews, which makes it possible to collect real interview data from real interviewers. In this paper, we propose a novel application named EZInterviewer, which aims to learn from the online interview data and provides mock interview services to the job seekers. The task is challenging in two ways: (1) the interview data are now available but still low-resource; (2) generating meaningful and relevant interview dialogs requires thorough understanding of both resumes and job descriptions. To address the low-resource challenge, EZInterviewer is trained on a very small set of interview dialogs. The key idea is to reduce the number of parameters that rely on interview dialogs by disentangling the knowledge selector and dialog generator so that most parameters can be trained with ungrounded dialogs as well as the resume data that are not low-resource. Evaluation results on a real-world job interview dialog dataset indicate that we achieve promising results to generate mock interviews. With the help of EZInterviewer, we hope to make mock interview practice easier for job seekers. 
面试被认为是招聘中最关键的步骤之一。为了充分准备与招聘人员的面试,求职者通常会在彼此之间进行模拟面试练习。然而,这种与同龄人的模拟面试通常远离真正的面试经验: 模拟面试官不能保证是专业的,也不太可能表现得像一个真正的面试官。由于近年来网络招聘的快速增长,招聘人员往往采用网络面试的方式,这使得从真正的面试官那里收集真实的面试数据成为可能。本文提出了一个新颖的应用程序 EZInterviewer,旨在从网上面试数据中学习,为求职者提供模拟面试服务。这项任务在两个方面具有挑战性: (1)面试数据现在已经可用,但仍然是低资源; (2)产生有意义的和相关的面试对话需要对简历和职位描述都有透彻的理解。为了解决资源不足的问题,EZInterviewer 在一个非常小的面试对话集上进行训练。其核心思想是通过分离知识选择器和对话生成器,减少依赖于面试对话的参数数量,使得大多数参数可以通过无依据(ungrounded)的对话和并非低资源的简历数据进行训练。真实工作面试对话数据集上的评估结果表明,我们在生成模拟面试方面取得了令人满意的效果。通过 EZInterviewer 的帮助,我们希望使模拟面试练习对求职者来说更加容易。 code 0
A Framework for Detecting Frauds from Extremely Few Labels YaLin Zhang, YiXuan Sun, Fangfang Fan, Meng Li, Yeyu Zhao, Wei Wang, Longfei Li, Jun Zhou, Jinghua Feng Ant Group, Hangzhou, China; Nanjing University, Nanjing, China; Zhejiang University & Ant Group, Hangzhou, China In this paper, we present a framework to deal with the fraud detection task with extremely few labeled frauds. We involve human intelligence in the loop in a labor-saving manner and introduce several ingenious designs to the model construction process. Namely, a rule mining module is introduced, and the learned rules will be refined with expert knowledge. The refined rules will be used to relabel the unlabeled samples and get the potential frauds. We further present a model to learn with the reliable frauds, the potential frauds, and the remaining normal samples. Note that the label noise problem, class imbalance problem, and confirmation bias problem are all addressed with specific strategies when building the model. Experimental results are reported to demonstrate the effectiveness of the framework. 在本文中,我们提出了一个框架来处理仅有极少标记欺诈样本的欺诈检测任务。我们以节省人力的方式将人类智能融入到这个循环中,并且在模型的构建过程中引入了几个巧妙的设计。即引入规则挖掘模块,利用专家知识对学习到的规则进行细化。改进后的规则将被用于重新标记未标记的样本,并得到潜在的欺诈样本。我们进一步提出了一个模型,利用可靠的欺诈样本、潜在的欺诈样本和其余的正常样本进行学习。注意,在建立模型时,标签噪声问题、类别不平衡问题和确认偏差问题都是通过特定的策略来解决的。实验结果证明了该框架的有效性。 code 0
Concept-Oriented Transformers for Visual Sentiment Analysis QuocTuan Truong, Hady W. Lauw Amazon, Seattle, WA, USA; Singapore Management University, Singapore, Singapore In the richly multimedia Web, detecting sentiment signals expressed in images would support multiple applications, e.g., measuring customer satisfaction from online reviews, analyzing trends and opinions from social media. Given an image, visual sentiment analysis aims at recognizing positive or negative sentiment, and occasionally neutral sentiment as well. A nascent yet promising direction is Transformer-based models applied to image data, whereby Vision Transformer (ViT) establishes remarkable performance on large-scale vision benchmarks. In addition to investigating the fitness of ViT for visual sentiment analysis, we further incorporate concept orientation into the self-attention mechanism, which is the core component of Transformer. The proposed model captures the relationships between image features and specific concepts. We conduct extensive experiments on Visual Sentiment Ontology (VSO) and Yelp.com online review datasets, showing that not only does the proposed model significantly improve upon the base model ViT in detecting visual sentiment but it also outperforms previous visual sentiment analysis models with narrowly-defined orientations. Additional analyses yield insightful results and better understanding of the concept-oriented self-attention mechanism. 在丰富的多媒体网络中,检测图像中表达的情绪信号将支持多种应用,例如,通过在线评论测量客户满意度,分析来自社交媒体的趋势和意见。给定一个图像,视觉情绪分析的目的是识别积极或消极的情绪,有时也包括中性情绪。一个新兴但有前途的方向是应用于图像数据的基于 Transformer 的模型,其中 Vision Transformer(ViT)在大规模视觉基准上取得了显著的性能。除了研究 ViT 对视觉情绪分析的适用性之外,我们还将概念定向引入到自注意力机制中,这是 Transformer 的核心组件。提出的模型捕捉图像特征和特定概念之间的关系。我们在视觉情感本体(VSO)和 Yelp.com 在线评论数据集上进行了广泛的实验,结果表明,所提出的模型不仅在检测视觉情感方面显着改善了基本模型 ViT,而且优于以往采用狭义定向的视觉情感分析模型。更多的分析产生了有洞察力的结果,并加深了对概念导向自注意力机制的理解。 code 0
UnCommonSense in Action! Informative Negations for Commonsense Knowledge Bases Hiba Arnaout, TuanPhong Nguyen, Simon Razniewski, Gerhard Weikum Max Planck Institute for Informatics, Saarbrücken, Germany Knowledge bases about commonsense knowledge, i.e., CSKBs, are crucial in applications such as search and question answering. Prominent CSKBs mostly focus on positive statements. In this paper we show that materializing important negations increases the usability of CSKBs. We present Uncommonsense, a web portal to explore informative negations about everyday concepts: (i) in a research-focused interface, users get a glimpse into results-per-steps of the methodology; (ii) in a trivia interface, users can browse fun negative trivia about concepts of their choice; and (iii) in a query interface, users can submit triple-pattern queries with explicit negated relations and compare results with significantly less relevant answers from the positive-only baseline. It can be accessed at: https://uncommonsense.mpi-inf.mpg.de/. 关于常识知识的知识库,即 CSKB,在诸如搜索和问题回答等应用中是至关重要的。著名的 CSKB 大多侧重于积极的声明。在本文中,我们表明,物化重要的否定增加 CSKB 的可用性。我们提出 Uncommonsense,一个网络门户网站,探索日常概念的信息否定: (i)在一个研究为重点的界面,用户得到一个结果,每一步的方法; (ii)在一个琐事界面,用户可以浏览有趣的负面琐事,他们选择的概念; (iii)在一个查询界面,用户可以提交三重模式的查询明确否定的关系,并比较结果显着相关性较低的答案,从积极的基线。可通过 https://uncommonsense.mpi-inf.mpg.de/ 访问。 code 0
SoCraft: Advertiser-level Predictive Scoring for Creative Performance on Meta Alfred Huang, Qi Yang, Sergey I. Nikolenko, Marlo Ongpin, Ilia Gossoudarev, Ngoc Yen Duong, Kirill Lepikhin, Sergey Vishnyakov, YuYi ChuFarseeva, Aleksandr Farseev ITMO University, Saint Petersburg, United Kingdom; SoMin.ai, London, United Kingdom; ITMO University, Saint Petersburg, Russian Fed. In this technical demonstration, we present SoCraft, a framework to build an advertiser-level multimedia ad content scoring platform for Meta Ads. The system utilizes a multimodal deep neural architecture to score and evaluate advertised content on Meta using both high- and low-level features of its contextual data such as text, image, targeting, and ad settings. In this demo, we present two deep models, SoDeep and SoWide, and validate the effectiveness of SoCraft with a successful real-world case study in Singapore. 在这个技术演示中,我们介绍了 SoCraft,一个为元广告构建广告客户级多媒体广告内容评分平台的框架。该系统利用一个多模态深度神经结构,使用其上下文数据(如文本、图像、目标和广告设置)的高级和低级特征,对 Meta 上的广告内容进行评分和评估。在这个演示中,我们提出了两个深度模型,SoDeep 和 SoWide,并验证了 SoCraft 在新加坡成功的现实案例研究的有效性。 code 0
Privacy Aware Experiments without Cookies Shiv Shankar, Ritwik Sinha, Saayan Mitra, Viswanathan (Vishy) Swaminathan, Sridhar Mahadevan, Moumita Sinha Adobe Research, San Jose, CA, USA; University of Massachusetts, Amherst, MA, USA; Adobe Inc, San Jose, CA, USA Consider two brands that want to jointly test alternate web experiences for their customers with an A/B test. Such collaborative tests are today enabled using third-party cookies, where each brand has information on the identity of visitors to another website. With the imminent elimination of third-party cookies, such A/B tests will become untenable. We propose a two-stage experimental design, where the two brands only need to agree on high-level aggregate parameters of the experiment to test the alternate experiences. Our design respects the privacy of customers. We propose an estimator of the Average Treatment Effect (ATE), show that it is unbiased and theoretically compute its variance. Our demonstration describes how a marketer for a brand can design such an experiment and analyze the results. On real and simulated data, we show that the approach provides a valid estimate of the ATE with low variance and is robust to the proportion of visitors overlapping across the brands. 考虑两个品牌,它们希望通过 A/B 测试为客户联合测试替代的网络体验。如今,这类协作测试依靠第三方 cookie 实现,其中每个品牌都掌握另一网站访问者的身份信息。随着第三方 cookie 即将被淘汰,这种 A/B 测试将变得站不住脚。我们提出了一个两阶段的实验设计,其中两个品牌只需要同意实验的高水平集合参数来测试交替的经验。我们的设计尊重顾客的隐私。我们提出了一个平均处理效应(ATE)的估计器,证明了它是无偏的,并从理论上计算了它的方差。我们的演示描述了一个品牌的营销人员如何设计这样一个实验并分析结果。在实际数据和模拟数据上,我们表明该方法提供了有效的估计 ATE 与低方差,是鲁棒的比例的访问者重叠跨品牌。 code 0
ElasticDL: A Kubernetes-native Deep Learning Framework with Fault-tolerance and Elastic Scheduling Jun Zhou, Ke Zhang, Feng Zhu, Qitao Shi, Wenjing Fang, Lin Wang, Yi Wang Ant Group, Hangzhou, China; Zhejiang University & Ant Group, Hangzhou, China The power of artificial intelligence (AI) models originates with sophisticated model architecture as well as the sheer size of the model. These large-scale AI models impose new and challenging system requirements regarding scalability, reliability, and flexibility. One of the most promising solutions in the industry is to train these large-scale models on distributed deep-learning frameworks. With the power of all distributed computations, it is desired to achieve a training process with excellent scalability, elastic scheduling (flexibility), and fault tolerance (reliability). In this paper, we demonstrate the scalability, flexibility, and reliability of our open-source Elastic Deep Learning (ElasticDL) framework. Our ElasticDL utilizes an open-source system, i.e., Kubernetes, for automating deployment, scaling, and management of containerized application features to provide fault tolerance and support elastic scheduling for DL tasks. 人工智能(AI)模型的力量源于复杂的模型体系结构以及模型的庞大规模。这些大规模的人工智能模型对系统的可伸缩性、可靠性和灵活性提出了新的和具有挑战性的要求。行业中最有前途的解决方案之一是在分布式深度学习框架上对这些大规模模型进行培训。利用所有分布式计算的能力,我们希望能够实现一个具有良好可伸缩性、弹性调度(灵活性)和容错性(可靠性)的训练过程。在本文中,我们展示了我们的开源弹性深度学习(ElasticDL)框架的可伸缩性、灵活性和可靠性。我们的 ElasticDL 利用一个开源系统,即 Kubernetes,来自动部署、扩展和管理容器化应用程序特性,以提供容错能力并支持 DL 任务的弹性调度。 code 0
PiggyBack: Pretrained Visual Question Answering Environment for Backing up Non-deep Learning Professionals Zhihao Zhang, Siwen Luo, Junyi Chen, Sijia Lai, Siqu Long, Hyunsuk Chung, Soyeon Caren Han Fortify Edge, Sydney, NSW, Australia; The University of Sydney, Sydney, NSW, Australia; The University of Sydney & The University of Western Australia, Perth, WA, Australia We propose PiggyBack, a Visual Question Answering platform that allows users to apply state-of-the-art visual-language pretrained models easily. PiggyBack supports the full stack of visual question answering tasks, specifically data processing, model fine-tuning, and result visualisation. We integrate visual-language models, pretrained by HuggingFace, an open-source API platform of deep learning technologies; however, these models cannot be run without programming skills or deep learning understanding. Hence, PiggyBack supports an easy-to-use browser-based user interface with several deep learning visual language pretrained models for general users and domain experts. PiggyBack offers the following benefits: free availability under the MIT License; portability, since it is web-based and thus runs on almost any platform; a comprehensive data creation and processing technique; and ease of use of deep learning-based visual language pretrained models. The demo video is available on YouTube and can be found at https://youtu.be/iz44RZ1lF4s. 我们提出了 PiggyBack,一个可视化问题回答平台,允许用户轻松应用最先进的可视化语言预先训练的模型。PiggyBack 支持完整的可视化问题回答任务堆栈,特别是数据处理、模型微调和结果可视化。我们集成了可视化语言模型,这些模型是由 HuggingFace (深度学习技术的开源 API 平台)预先训练的; 然而,如果没有编程技能或深度学习理解,它就不能运行。因此,PiggyBack 支持一个易于使用的基于浏览器的用户界面,为一般用户和领域专家提供了几个深度学习可视化语言预先训练的模型。PiggyBack 包括以下好处: 麻省理工学院许可证下的免费可用性,基于网络的便携性,因此可以在几乎任何平台上运行,一个全面的数据创建和处理技术,易于使用基于深度学习的可视化语言预训模型。演示视频可以在 YouTube 上找到,也可以在 https://youtu.be/iz44rz1lf4s 上找到。 code 0
Web of Conferences: A Conference Knowledge Graph Shuo Yu, Ciyuan Peng, Chengchuan Xu, Chen Zhang, Feng Xia Chengdu Neusoft University, Chengdu, China; Dalian University of Technology, Dalian, China; RMIT University, Melbourne, Australia Academic conferences have been proven to be significant in facilitating academic activities. To promote information retrieval specific to academic conferences, building complete, systematic, and professional conference knowledge graphs is a crucial task. However, many related systems mainly focus on general knowledge of overall academic information or concentrate services on specific domains. Aiming at filling this gap, this work demonstrates a novel conference knowledge graph, namely Web of Conferences. The system accommodates detailed conference profiles, conference ranking lists, intelligent conference queries, and personalized conference recommendations. Web of Conferences supports detailed conference information retrieval while providing the ranking of conferences based on the most recent data. Conference queries in the system can be implemented via precise search or fuzzy search. Then, according to users' query conditions, personalized conference recommendations are available. Web of Conferences is demonstrated with a user-friendly visualization interface and can serve as a useful information retrieval system for researchers. 事实证明,学术会议在促进学术活动方面具有重要意义。为了推广学术会议的信息检索,建立完整、系统和专业的会议知识图表是一项至关重要的任务。然而,许多相关的系统主要集中于整体学术信息的一般知识或集中于特定领域的服务。为了填补这一空白,本文展示了一个新颖的会议知识图,即会议网络。该系统可容纳详细的会议概况、会议排名列表、智能会议查询和个性化会议推荐。会议网站支持详细的会议信息检索,同时提供基于最新数据的会议排名。系统中的会议查询可以通过精确搜索或模糊搜索来实现。然后,根据用户的查询条件,提供个性化的会议推荐。会议网络是一个用户友好的可视化界面,可以作为一个有用的信息检索系统供研究人员使用。 code 0
Developing and Evaluating Graph Counterfactual Explanation with GRETEL Mario Alfonso PradoRomero, Bardh Prenkaj, Giovanni Stilo University of L'Aquila, L'Aquila, Italy; Sapienza University of Rome, Rome, Italy; Gran Sasso Science Institute, L'Aquila, Italy The black-box nature and the lack of interpretability detract from constant improvements in Graph Neural Networks (GNNs) performance in social network tasks like friendship prediction and community detection. Graph Counterfactual Explanation (GCE) methods aid in understanding the prediction of GNNs by generating counterfactual examples that promote trustworthiness, debiasing, and privacy in social networks. Alas, the literature on GCE lacks standardised definitions, explainers, datasets, and evaluation metrics. To bridge the gap between the performance and interpretability of GNNs in social networks, we discuss GRETEL, a unified framework for GCE methods development and evaluation. We demonstrate how GRETEL comes with fully extensible built-in components that allow users to define ad-hoc explainer methods, generate synthetic datasets, implement custom evaluation metrics, and integrate state-of-the-art prediction models. 黑盒子的特性和缺乏可解释性使得图神经网络(GNN)在社交网络任务(如友谊预测和社区检测)中的性能不断提高。图形反事实解释(GCE)方法通过生成反事实例子来帮助理解 GNN 的预测,这些反事实例子提高了社交网络中的可信度、消除偏见和隐私。遗憾的是,关于 GCE 的文献缺乏标准化的定义、解释者、数据集和评估指标。为了弥合社交网络中 GNN 的性能和可解释性之间的差距,我们讨论 GRETEL,一个用于 GCE 方法开发和评估的统一框架。我们演示 GRETEL 如何带有完全可扩展的内置组件,允许用户定义特别解释器方法,生成合成数据集,实现自定义评估指标,并集成最先进的预测模型。 code 0
DistriBayes: A Distributed Platform for Learning, Inference and Attribution on Large Scale Bayesian Network Yi Ding, Jun Zhou, Qing Cui, Lin Wang, Mengqi Zhang, Yang Dong Ant Group, Beijing, China; Zhejiang University & Ant Group, Hangzhou, China To improve the marketing performance in the financial scenario, it is necessary to develop a trustworthy model to analyze and select promotion-sensitive customers. Bayesian Network (BN) is suitable for this task because of its interpretability and flexibility, but it usually suffers the exponentially growing computation complexity as the number of nodes grows. To tackle this problem, we present a comprehensive distributed platform named DistriBayes, which can efficiently learn, infer and attribute on a large-scale BN all-in-one platform. It implements several score-based structure learning methods, loopy belief propagation with backdoor adjustment for inference, and a carefully optimized search procedure for attribution. Leveraging the distributed cluster, DistriBayes can finish the learning and attribution on Bayesian Network with hundreds of nodes and millions of samples in hours. 为了提高财务情景下的营销绩效,有必要建立一个可信赖的模型来分析和选择促销敏感的客户。贝氏网路因其可解释性和灵活性而适合这项任务,但随着节点数目的增加,计算复杂度通常会呈指数级增长。为了解决这一问题,提出了一个综合分布式平台 DistriBayes,该平台可以在大规模的 BN 一体化平台上进行高效的学习、推理和归属。它实现了多种基于分数的结构化学习方法、带有后门调整的循环信念传播推理以及精心优化的属性搜索过程。通过利用分布式集群,DistriBayes 可以在数小时内完成数百个节点和数百万个样本的贝氏网路学习和归属。 code 0
SimSumIoT: A Platform for Simulating the Summarisation from Internet of Things Wei Emma Zhang, Adnan Mahmood, Lixin Deng, Minhao Zhu Macquarie University, Sydney, Australia; The University of Adelaide, Adelaide, Australia Summarising from the Web could be formed as a problem of multi-document Summarisation (MDS) from multiple sources. In contrast to the current MDS problem that involves working on benchmark datasets which provide well-clustered sets of documents, we envisage to build a pipeline for content Summarisation from the Web, but narrow down to the Social Internet of Things (SIoT) paradigm, starting at data collection from the IoT objects, then applying natural language processing techniques for grouping and summarising the data, to distributing summaries back to the IoT objects. In this paper, we present our simulation tool, SimSumIoT, that simulates the process of data sharing, receiving, clustering, and Summarisation. A Web-based interface is developed for this purpose allowing users to visualize the process through a set of interactions. The Web interface is accessible via http://simsumlot.tk. 从 Web 上进行摘要可以形成一个多文档摘要(MDS)问题。与目前的 MDS 问题相反,我们设想建立一个管道,从网络内容摘要,但缩小到物联网(SIoT)范式,从物联网对象的数据收集开始,然后应用自然语言处理技术分组和总结数据,分发摘要回物联网对象。在本文中,我们提出了我们的模拟工具,模拟数据共享,接收,集群和 Summarisation 的过程。为此开发了一个基于 Web 的界面,允许用户通过一组交互将流程可视化。网页界面可透过 http://simsumlot.tk 进入。 code 0
AntTS: A Toolkit for Time Series Forecasting in Industrial Scenarios Jianping Wei, Zhibo Zhu, Ziqi Liu, Zhiqiang Zhang, Jun Zhou Ant Group, Hangzhou, China; Zhejiang University & Ant Group, Hangzhou, China Time series forecasting is an important ingredient in the intelligence of business and decision processes. In industrial scenarios, the time series of interest are mostly macroscopic time series that are aggregated from microscopic time series, e.g., the retail sales is aggregated from the sales of different goods, and that are also intervened by certain treatments on the microscopic individuals, e.g., issuing discount coupons on some goods to increase the retail sales. These characteristics are not considered in existing toolkits, which just focus on the "natural" time series forecasting that predicts the future value based on historical data, regardless of the impact of treatments. In this paper, we present AntTS, a time series toolkit paying more attention on the forecasting of the macroscopic time series with underlying microscopic time series and certain treatments, besides the "natural" time series forecasting. AntTS consists of three decoupled modules, namely Clustering module, Natural Forecasting module, and Effect module, which are utilized to study the homogeneous groups of microscopic individuals, the "natural" time series forecasting of homogeneous groups, and the treatment effect estimation of homogeneous groups. With the combinations of different modules, it can exploit the microscopic individuals and the interventions on them, to help the forecasting of macroscopic time series. We show that AntTS helps address many typical tasks in the industry. 
时间序列预测是商业和决策过程智能化的重要组成部分。在工业方案中,利息的时间序列大多是宏观时间序列,由微观时间序列聚合而成,例如,零售销售是由不同商品的销售聚合而成,同时也受到某些对微观个体的干预,例如,发行某些商品的折扣券以增加零售销售。现有的工具包没有考虑这些特征,它们只关注基于历史数据预测未来价值的“自然”时间序列预测,而不考虑治疗的影响。本文提出了一个时间序列工具包,除了“自然”时间序列预测之外,它更注重对宏观时间序列的预测,包括底层的微观时间序列和一定的处理方法。AntTS 由聚类模块、自然预测模块和效应模块三个解耦模块组成,用于研究微观个体的同质群体、同质群体的“自然”时间序列预测以及同质群体的治疗效果估计。通过不同模块的组合,可以利用微观个体以及对微观个体的干预,对宏观时间序列进行预测。我们展示了 Ant TS 可以帮助解决行业中的许多典型任务。 code 0
Unsupervised Question Duplicate and Related Questions Detection in e-learning platforms Maksimjeet Chowdhary, Sanyam Goyal, Venktesh V, Mukesh K. Mohania, Vikram Goyal IIIT Delhi, Delhi, India; Indraprastha Institute of Information Technology, Delhi, India Online learning platforms provide diverse questions to gauge the learners' understanding of different concepts. The repository of questions has to be constantly updated to ensure a diverse pool of questions to conduct assessments for learners. However, it is impossible for the academician to manually skim through the large repository of questions to check for duplicates when onboarding new questions from external sources. Hence, we propose a tool QDup in this paper that can surface near-duplicate and semantically related questions without any supervised data. The proposed tool follows an unsupervised hybrid pipeline of statistical and neural approaches for incorporating different nuances in similarity for the task of question duplicate detection. We demonstrate that QDup can detect near-duplicate questions and also suggest related questions for practice with remarkable accuracy and speed from a large repository of questions. The demo video of the tool can be found at https://www.youtube.com/watch?v=loh0_-7XLW4. 在线学习平台提供各种各样的问题来衡量学习者对不同概念的理解。问题库必须不断更新,以确保为学习者进行评估的问题库多样化。但是,当从外部来源获得新的问题时,学者不可能手动浏览大量的问题库来检查重复的问题。因此,本文提出了一个 QDup 工具,它可以在没有任何监督数据的情况下将近重复和语义相关的问题表面化。提议的工具遵循统计学和神经学方法的无监督混合管道,以便在相似性方面纳入不同的细微差别,从而完成问题重复检测任务。我们证明了 QDup 可以检测接近重复的问题,并且可以从大量的问题库中以显著的准确性和速度提出相关的问题供实践使用。该工具的演示视频可以在 https://www.youtube.com/watch?v=loh0_-7xlw4找到。 code 0
"Just To See You Smile": SMILEY, a Voice-Guided GUY GAN Qi Yang, Christos Tzelepis, Sergey Nikolenko, Ioannis Patras, Aleksandr Farseev SoMin.ai Research, London, United Kingdom; Queen Mary University of London, London, United Kingdom; ITMO University, Saint Petersburg, Russian Fed. In this technical demonstration, we present SMILEY, a voice-guided virtual assistant. The system utilizes a deep neural architecture ContraCLIP to manipulate facial attributes using voice instructions, allowing for deeper speaker engagement and smoother customer experience when being used in the "virtual concierge" scenario. We validate the effectiveness of SMILEY and ContraCLIP via a successful real-world case study in Singapore and a large-scale quantitative evaluation. 在这个技术演示中,我们介绍了 SMILEY,一个语音引导的虚拟助手。该系统利用深层神经结构 ContraCLIP,通过语音指令操纵面部属性,当在“虚拟礼宾”场景中使用时,允许更深入的说话者参与和更顺畅的客户体验。通过在新加坡成功的实际案例研究和大规模的定量评估,我们验证了 SMILEY 和 ContraCLIP 的有效性。 code 0
DOCoR: Document-level OpenIE with Coreference Resolution Shan Jie Yong, Kuicai Dong, Aixin Sun Nanyang Technological University, Singapore, Singapore Open Information Extraction (OpenIE) extracts relational fact tuples in the form of <subject, relation, object> from text. Most existing OpenIE solutions operate at sentence level and extract relational tuples solely from a sentence. However, many sentences exist as a part of paragraph or a document, where coreferencing is common. In this demonstration, we present a system which refines the semantic tuples generated by OpenIE with the aid of a coreference resolution tool. Specifically, all coreferential mentions across the entire document are identified and grouped into coreferential clusters. Objects and subjects in the extracted tuples from OpenIE which match any coreferential mentions are then resolved with a suitable representative term. In this way, our system is able to resolve both anaphoric and cataphoric references, to achieve Document-level OpenIE with Coreference Resolution (DOCoR). The demonstration video can be viewed at https://youtu.be/o9ZSWCBvlDs 开放式信息抽取(OpenIE)从文本中提取关系事实元组,其格式为 < subject,relations,object > 。大多数现有的 OpenIE 解决方案都是在句子级别上运行的,并且只从一个句子中提取关系元组。然而,许多句子作为段落或文档的一部分存在,其中共同参照是常见的。在这个演示中,我们提出了一个系统,该系统借助于一个共引用解析工具来提炼 OpenIE 生成的语义元组。具体来说,整个文档中的所有共引用提及都被识别并分组为共引用集群。然后,从 OpenIE 中提取的元组中匹配任何相关提及的对象和主题用一个合适的代表性术语进行解析。通过这种方式,我们的系统能够同时解析照应和照应两种指称,从而实现具有指称解析(DOCoR)的文档级 OpenIE。市民可于 https://youtu.be/o9zswcbvlds 浏览示范短片 code 0
Classification of Different Participating Entities in the Rise of Hateful Content in Social Media Mithun Das Indian Institute of Technology Kharagpur, Kharagpur, India Hateful content is a growing concern across different platforms, whether it is a moderated platform or an unmoderated platform. The public expression of hate speech encourages the devaluation of minority members. It has some consequences in the real world as well. In such a scenario, it is necessary to design AI systems that could detect such harmful entities/elements in online social media and take cautionary actions to mitigate the risk/harm they cause to society. The way individuals disseminate content on social media platforms also deviates. The content can be in the form of texts, images, videos, etc. Hence hateful content in all forms should be detected, and further actions should be taken to maintain the civility of the platform. We first introduced two published works addressing the challenges of detecting low-resource multilingual abusive speech and hateful user detection. Finally, we discuss our ongoing work on multimodal hateful content detection. 不管是一个有节制的平台还是一个没有节制的平台,仇恨内容都越来越受到不同平台的关注。公开发表仇恨言论鼓励贬低少数群体成员。它在现实世界中也有一些后果。在这种情况下,有必要设计人工智能系统,以便能够发现在线社交媒体中的这种有害实体/元素,并采取谨慎行动,减轻它们对社会造成的风险/损害。个人在社交媒体平台上传播内容的方式也有偏差。内容可以是文本、图像、视频等形式。因此,应当发现各种形式的仇恨内容,并采取进一步行动维护该平台的文明。我们首先介绍了两个已发表的工作,解决检测低资源多语言辱骂性言论和仇恨用户检测的挑战。最后,我们讨论了我们正在进行的多通道仇恨内容检测工作。 code 0
Generalizing Graph Neural Network across Graphs and Time Zhihao Wen Singapore Management University, Singapore, Singapore Graph-structured data widely exist in diverse real-world scenarios, and analysis of these graphs can uncover valuable insights about their respective application domains. However, most previous works focused on learning node representation from a single fixed graph, while many real-world scenarios require representations to be quickly generated for unseen nodes, new edges, or entirely new graphs. This inductive ability is essential for high-throughput machine learning systems. However, this inductive graph representation problem is quite difficult compared to the transductive setting, because generalizing to unseen nodes requires new subgraphs containing the new nodes to be aligned to the neural network trained already. Meanwhile, following a message passing framework, graph neural network (GNN) is an inductive and powerful graph representation tool. We further explore inductive GNN from more specific perspectives: (1) generalizing GNN across graphs, in which we tackle the problem of semi-supervised node classification across graphs; (2) generalizing GNN across time, in which we mainly solve the problem of temporal link prediction; (3) generalizing GNN across tasks; (4) generalizing GNN across locations. 图结构化数据广泛存在于各种不同的现实场景中,对这些图的分析可以揭示关于它们各自应用领域的有价值的见解。然而,大多数以前的工作集中在从一个固定的图学习节点表示,而许多现实世界的场景需要表示为看不见的节点,新的边,或完全新的图快速生成。这种归纳能力对于高吞吐量的机器学习系统是必不可少的。然而,这个归纳图表示问题是相当困难的,相对于传导设置,因为这个泛化到看不见的节点需要新的子图包含新的节点对齐的神经网络已经训练。同时,遵循消息传递框架的图形神经网络(GNN)是一种归纳的、功能强大的图形表示工具。我们进一步从更具体的角度探索归纳 GNN: (1)跨图泛化 GNN,解决跨图的半监督节点分类问题; (2)跨时间泛化 GNN,主要解决时间链接预测问题; (3)跨任务泛化 GNN; (4)跨位置泛化 GNN。 code 0
Graphs: Privacy and Generation through ML Rucha Bhalchandra Joshi National Institute of Science Education and Research & Homi Bhabha National Institute, Bhubaneswar & Mumbai, India Graphs are ubiquitous, which makes machine learning on graphs an important research area. While there are many aspects to this field, our research is focused primarily on two aspects of it. The first research question concerns privacy in graphs, where our work primarily focuses on preserving structural privacy in graphs. The second research question is about generating graphs. With applications in various fields such as drug discovery, designing novel proteins, etc., graph generation is emerging as an essential problem. This paper briefly describes the problems and the methodology to address them. 图的普遍存在使得图的机器学习成为一个重要的研究领域。虽然这个领域有很多方面,但是我们的研究主要集中在两个方面。第一个研究问题涉及图中的隐私,我们的工作主要集中在保护图中的结构隐私。第二个研究问题是关于图的生成。随着图形生成技术在药物开发、新型蛋白质设计等领域的应用,图形生成已经成为一个重要的研究课题。本文简要介绍了这些问题和解决这些问题的方法。 code 0
Data-Efficient Graph Learning Meets Ethical Challenges Tao Tang Federation University Australia, Ballarat, Australia Recommender systems have achieved great success in our daily life. In recent years, the ethical concerns of AI systems have gained lots of attention. At the same time, graph learning techniques are powerful in modelling the complex relations among users and items under recommender system applications. These graph learning-based methods are data hungry, which brings a significant data efficiency challenge. In this proposal, I introduce my PhD research from three aspects: 1) Efficient privacy-preserving recommendation for imbalanced data. 2) Efficient recommendation model training for insufficient samples. 3) Explainability in the social recommendation. Challenges and solutions of the above research problems have been proposed in this proposal. 推荐系统在我们的日常生活中取得了巨大的成功。近年来,人工智能系统的伦理问题引起了人们的广泛关注。与此同时,图形学习技术在建立用户和推荐系统项目之间的复杂关系方面非常有效。这些基于图形学习的方法对数据的需求量很大,给数据效率带来了很大的挑战。在这个建议中,我从三个方面介绍了我的博士研究: 1)不平衡数据的有效隐私保护建议。2)针对样本不足的有效推荐模型训练。3)社会推荐中的可解释性。提出了上述研究问题面临的挑战和解决方案。 code 0
From Classic GNNs to Hyper-GNNs for Detecting Camouflaged Malicious Actors Venus Haghighi Macquarie University, Sydney, Australia Graph neural networks (GNNs), which extend deep learning models to graph-structured data, have achieved great success in many applications such as detecting malicious activities. However, GNN-based models are vulnerable to camouflage behavior of malicious actors, i.e., the performance of existing GNN-based models has been hindered significantly. In this research proposal, we follow two research directions to address this challenge. One direction focuses on enhancing the existing GNN-based models and enabling them to identify both camouflaged and non-camouflaged malicious actors. In this regard, we propose to explore an adaptive aggregation strategy, which empowers GNN-based models to handle camouflage behavior of fraudsters. The other research direction concentrates on leveraging hypergraph neural networks (hyper-GNNs) to learn nodes' representation for more effective identification of camouflaged malicious actors. 将深度学习模型扩展到图结构数据的图神经网络(GNN)在检测恶意行为等许多应用中取得了巨大的成功。然而,基于 GNN 的模型容易受到恶意行为者的伪装行为的影响,即现有的基于 GNN 的模型的性能受到了严重的阻碍。在这项研究建议中,我们遵循两个研究方向来应对这一挑战。其中一个方向侧重于增强现有的基于 GNN 的模型,并使它们能够识别伪装和非伪装的恶意行为者。在这方面,我们提出了一种自适应聚合策略,该策略使得基于 GNN 的模型能够处理欺诈者的伪装行为。另一个研究方向集中在利用超图神经网络学习节点的表示,以更有效地识别伪装的恶意行为者。 code 0
Efficient Graph Learning for Anomaly Detection Systems Falih Gozi Febrinanto Federation University Australia, Ballarat, Australia Anomaly detection plays a significant role in preventing the detrimental effects of abnormalities. It brings many benefits in real-world sectors ranging from transportation and finance to cybersecurity. In reality, millions of data points do not stand independently, but may be connected to each other and form graph or network data. A more advanced technique, named graph anomaly detection, is required to model that data type. The current works of graph anomaly detection have achieved state-of-the-art performance compared to regular anomaly detection. However, most models ignore the efficiency aspect, leading to several problems like technical bottlenecks. This project mainly focuses on improving the efficiency aspect of graph anomaly detection while maintaining its performance. 异常检测在预防异常的有害影响方面起着重要作用。它给现实世界带来了许多好处,从交通运输、金融到网络安全。实际上,数以百万计的数据并不是独立存在的,但它们可能相互连接,形成图形或网络数据。需要一种更先进的技术,称为图形异常检测,来对数据类型进行建模。目前图形异常检测的作品与普通异常检测相比已经达到了最先进的水平。然而,大多数模型忽略了效率方面,导致了一些问题,如技术瓶颈。这个项目主要集中在提高图形异常检测的效率方面,同时保持其性能。 code 0
Self-supervision and Controlling Techniques to Improve Counter Speech Generation Punyajoy Saha Indian Institute of Technology Kharagpur, Kharagpur, India Hate speech is a challenging problem in today's online social media. One of the current solutions followed by different social media platforms is detecting hate speech using human-in-the-loop approaches. After detection, they moderate such hate speech by deleting the posts or suspending the users. While this approach can be a short-term solution for reducing the spread of hate, many researchers argue that it stifles freedom of expression. An alternate strategy that does not hamper freedom of expression is counterspeech. Recently, many studies have tried to create generation models to assist counter speakers by providing counterspeech suggestions for combating the explosive proliferation of online hate. This pipeline has two major challenges: 1) how to improve the performance of generation without a large-scale dataset, since building the dataset is costly; 2) how to add control in the counter speech generation to make it more personalized. In this paper, we present our published and proposed research aimed at solving these two challenges. 仇恨言论在当今的网络社交媒体中是一个具有挑战性的问题。目前不同社交媒体平台采用的解决方案之一是使用人在线的方法来检测仇恨言论。在被发现后,他们通过删除帖子或暂停用户来缓和这种仇恨言论。虽然这种方法可以作为减少仇恨传播的短期解决办法,但许多研究人员认为,它扼杀了言论自由。另一种不妨碍言论自由的策略是反言论。最近,许多研究试图建立生成模型,通过提供反言论建议来打击网络仇恨的爆炸性扩散,从而帮助反言论者。该流水线面临两大挑战: 1)如何在没有大规模数据集的情况下提高生成性能,因为建立数据集的成本较高; 2)如何在反语音生成中增加控制,使其更加个性化。在本文中,我们介绍了我们发表和提出的研究,旨在解决这两个挑战。 code 0
Knowledge-Augmented Methods for Natural Language Processing Chenguang Zhu, Yichong Xu, Xiang Ren, Bill Yuchen Lin, Meng Jiang, Wenhao Yu Microsoft Cognitive Services Research, Bellevue, WA, USA; University of Notre Dame, Notre Dame, IN, USA; University of Southern California, Los Angeles, CA, USA Knowledge in natural language processing (NLP) has been a rising trend especially after the advent of large scale pre-trained models. NLP models with attention to knowledge can i) access unlimited amount of external information; ii) delegate the task of storing knowledge from its parameter space to knowledge sources; iii) obtain up-to-date information; iv) make prediction results more explainable via selected knowledge. In this tutorial, we will introduce the key steps in integrating knowledge into NLP, including knowledge grounding from text, knowledge representation and fusing. In addition, we will introduce recent state-of-the-art applications in fusing knowledge into language understanding, language generation and commonsense reasoning. 自然语言处理(NLP)中的知识已经成为一种新兴的趋势,特别是在大规模预训练模型出现之后。注重知识的 NLP 模型可以 i)访问无限量的外部信息; ii)将存储知识的任务从其参数空间委托给知识源; iii)获取最新信息; iv)通过选择的知识使预测结果更加可解释。在本教程中,我们将介绍将知识整合到自然语言处理中的关键步骤,包括从文本的知识基础,知识表示和融合。此外,我们将介绍最新的技术应用在融合知识到语言理解,语言生成和常识推理。 code 0
Hate Speech: Detection, Mitigation and Beyond Punyajoy Saha, Mithun Das, Binny Mathew, Animesh Mukherjee Indian Institute of Technology, Kharagpur, Kharagpur, India Social media sites such as Twitter and Facebook have connected billions of people and given the opportunity to the users to share their ideas and opinions instantly. That being said, there are several negative consequences as well such as online harassment, trolling, cyber-bullying, fake news, and hate speech. Out of these, hate speech presents a unique challenge as it is deeply engraved into our society and is often linked with offline violence. Social media platforms rely on human moderators to identify hate speech and take necessary action. However, with the increase in online hate speech, these platforms are turning toward automated hate speech detection and mitigation systems. This shift brings several challenges to the plate, and hence, is an important avenue to explore for the computational social science community. In this tutorial, we present an exposition of hate speech detection and mitigation in three steps. First, we describe the current state of research in the hate speech domain, focusing on different hate speech detection and mitigation systems that have been developed over time. Next, we highlight the challenges that these systems might carry like bias and the lack of transparency. The final section concretizes the path ahead, providing clear guidelines for the community working in hate speech and related domains. We also outline the open challenges and research directions for interested researchers. 
Twitter 和 Facebook 等社交媒体网站已经连接了数十亿人,并为用户提供了即时分享想法和观点的机会。尽管如此,还是存在一些负面后果,比如网络骚扰、网络钓鱼、网络欺凌、假新闻和仇恨言论。除此之外,仇恨言论是一个独特的挑战,因为它深深地植根于我们的社会,而且往往与线下暴力联系在一起。社交媒体平台依赖人类管理员来识别仇恨言论并采取必要的行动。然而,随着在线仇恨言论的增加,这些平台正在转向自动仇恨言论检测和缓解系统。这种转变给板块带来了一些挑战,因此,是计算社会科学界探索的一个重要途径。在本教程中,我们将分三个步骤阐述仇恨语音检测和缓解。首先,我们描述了仇恨语音领域的研究现状,重点介绍了随着时间推移而发展起来的不同的仇恨语音检测和缓解系统。接下来,我们强调这些系统可能带来的挑战,如偏见和缺乏透明度。最后一部分具体化了前进的道路,为从事仇恨言论和相关领域工作的社区提供了明确的指导方针。我们还概述了开放的挑战和研究方向感兴趣的研究人员。 code 0
Natural and Artificial Dynamics in GNNs: A Tutorial Dongqi Fu, Zhe Xu, Hanghang Tong, Jingrui He University of Illinois at Urbana-Champaign, Urbana, IL, USA In the big data era, the relationship between entities becomes more complex. Therefore, graph (or network) data attracts increasing research attention for carrying complex relational information. For a myriad of graph mining/learning tasks, graph neural networks (GNNs) have been proven as effective tools for extracting informative node and graph representations, which empowers a broad range of applications such as recommendation, fraud detection, molecule design, and many more. However, real-world scenarios bring pragmatic challenges to GNNs. First, the input graphs are evolving, i.e., the graph structure and node features are time-dependent. Integrating temporal information into the GNNs to enhance their representation power requires additional ingenious designs. Second, the input graphs may be unreliable, noisy, and suboptimal for a variety of downstream graph mining/learning tasks. How could end-users deliberately modify the given graphs (e.g., graph topology and node features) to boost GNNs' utility (e.g., accuracy and robustness)? Inspired by the above two kinds of dynamics, in this tutorial, we focus on topics of natural dynamics and artificial dynamics in GNNs and introduce the related works systematically. After that, we point out some promising but under-explored research problems in the combination of these two dynamics. We hope this tutorial could be beneficial to researchers and practitioners in areas including data mining, machine learning, and general artificial intelligence. 
在大数据时代,实体之间的关系变得更加复杂。因此,图形(或网络)数据由于承载复杂的关系信息而越来越受到研究者的关注。对于大量的图形挖掘/学习任务,图形神经网络(GNN)已被证明是提取信息节点和图表示的有效工具,它赋予了广泛的应用,如推荐、欺诈检测、分子设计等等。然而,真实世界的场景给 GNN 带来了实用的挑战。首先,输入图是演化的,即图的结构和节点特征是依赖于时间的。将时间信息整合到 GNN 中以增强它们的表示能力需要额外的巧妙设计。其次,对于各种下游图挖掘/学习任务,输入图可能是不可靠的、有噪声的和次优的。最终用户如何故意修改给定的图(例如,图形拓扑和节点特性)以提高 GNN 的效用(例如,准确性和鲁棒性) ?受上述两种动力学的启发,本教程重点讨论了 GNN 中的自然动力学和人工动力学,并系统地介绍了相关的工作。然后,我们指出了这两种动力学相结合的一些有前途但尚未得到充分探索的研究问题。我们希望本教程能够对数据挖掘、机器学习和一般人工智能等领域的研究人员和从业人员有所帮助。 code 0
Data Democratisation with Deep Learning: The Anatomy of a Natural Language Data Interface George Katsogiannis-Meimarakis, Mike Xydas, Georgia Koutrika Athena Research Center, Athens, Greece In the age of the Digital Revolution, almost all human activities, from industrial and business operations to medical and academic research, are reliant on the constant integration and utilisation of ever-increasing volumes of data. However, the explosive volume and complexity of data makes data querying and exploration challenging even for experts, and makes the need to democratise the access to data, even for non-technical users, all the more evident. It is time to lift all technical barriers, by empowering users to access relational databases through conversation. We consider 3 main research areas that a natural language data interface is based on: Text-to-SQL, SQL-to-Text, and Data-to-Text. The purpose of this tutorial is a deep dive into these areas, covering state-of-the-art techniques and models, and explaining how the progress in the deep learning field has led to impressive advancements. We will present benchmarks that sparked research and competition, and discuss open problems and research opportunities with one of the most important challenges being the integration of these 3 research areas into one conversational system. 在数字革命时代,几乎所有的人类活动,从工业和商业运作到医学和学术研究,都依赖于不断增长的数据量的不断整合和利用。然而,数据的爆炸性数量和复杂性使得数据查询和探索甚至对专家来说都具有挑战性,并使得即使对非技术用户来说也需要使数据的获取更加民主化,这一点更加明显。现在是消除所有技术障碍的时候了,通过授权用户通过对话访问关系数据库。我们考虑了自然语言数据接口所基于的3个主要研究领域: 文本到 SQL、 SQL 到文本和数据到文本。本教程的目的是深入这些领域,涵盖了最先进的技术和模型,并解释了深度学习领域的进展如何导致了令人印象深刻的进步。我们将展示引发研究和竞争的基准,并讨论开放性问题和研究机会,其中最重要的挑战之一是将这三个研究领域整合到一个会话系统中。 code 0
Next-generation Challenges of Responsible Data Integration Fatemeh Nargesian, Abolfazl Asudeh, H. V. Jagadish Univ Illinois, Chicago, IL USA; Univ Michigan, Ann Arbor, MI USA; Univ Rochester, Rochester, NY 14627 USA Data integration has been extensively studied by the data management community and is a core task in the data pre-processing step of ML pipelines. When the integrated data is used for analysis and model training, responsible data science requires addressing concerns about data quality and bias. We present a tutorial on data integration and responsibility, highlighting the existing efforts in responsible data integration along with research opportunities and challenges. In this tutorial, we encourage the community to audit data integration tasks with responsibility measures and develop integration techniques that optimize the requirements of responsible data science. We focus on three critical aspects: (1) the requirements to be considered for evaluating and auditing data integration tasks for quality and bias; (2) the data integration tasks that elicit attention to data responsibility measures and methods to satisfy these requirements; and, (3) techniques, tasks, and open problems in data integration that help achieve data responsibility. 数据集成已经被数据管理界广泛研究,并且是机器学习管道数据预处理的核心任务。当集成数据用于分析和模型训练时,负责任的数据科学需要解决关于数据质量和偏差的问题。我们提供了一个关于数据集成和责任的教程,强调了在负责任的数据集成方面的现有努力以及研究机会和挑战。在本教程中,我们鼓励社区使用责任度量来审计数据集成任务,并开发优化责任数据科学需求的集成技术。我们集中在三个关键的方面: (1)评估和审计数据集成任务的质量和偏差需要考虑的要求; (2)引起注意的数据集成任务的数据责任措施和方法,以满足这些要求; 和(3)技术,任务,以及数据集成中帮助实现数据责任的开放问题。 code 0
Integrity 2023: Integrity in Social Networks and Media Lluís Garcia Pueyo, Panayiotis Tsaparas, Prathyusha Senthil Kumar, Timos Sellis, Paolo Papotti, Sibel Adali, Giuseppe Manco, Tudor Trufinescu, Gireeja Ranade, James Verbus, Mehmet N. Tek, Anthony McCosker EURECOM, Biot, France; Meta, Menlo Park, CA, USA; Meta, Bellevue, WA, USA; Archimedes / Athena Research Center, Athens, Greece; ICAR-CNR, Rende, Italy; Swinburne Social Innovation Research Institute, Melbourne, VIC, Australia; Google, Redwood City, CA, USA; LinkedIn, Sunnyvale, CA, USA; University of Ioannina, Ioannina, Greece; UC Berkeley, Berkeley, CA, USA; Rensselaer Polytechnic Institute, Troy, NY, USA Integrity 2023 is the fourth edition of the successful Workshop on Integrity in Social Networks and Media, held in conjunction with the ACM Conference on Web Search and Data Mining (WSDM) in the past three years. The goal of the workshop is to bring together researchers and practitioners to discuss content and interaction integrity challenges in social networks and social media platforms. The event consists of a combination of invited talks by reputed members of the Integrity community from both academia and industry and peer-reviewed contributed talks and posters solicited via an open call-for-papers. “诚信2023”是过去三年与 ACM 网络搜索和数据挖掘会议(WSDM)联合举办的第四届成功的社交网络和媒体诚信研讨会。研讨会的目标是让研究人员和从业人员聚集一堂,讨论社交网络和社交媒体平台中的内容和互动完整性挑战。这次活动包括邀请来自学术界和工业界的知名人士进行的演讲,以及通过公开征集论文征集到的经过同行评议的贡献演讲和海报。 code 0
Responsible AI for Trusted AI-powered Enterprise Platforms Steven C. H. Hoi Salesforce Research Asia, Singapore, Singapore With the rapidly growing AI market opportunities and the accelerated adoption of AI technologies for a wide range of real-world applications, responsible AI has attracted increasing attention in both academia and industries. In this talk, I will focus on the topics of responsible AI in the industry settings towards building trusted AI-powered enterprise platforms. I will share our efforts and experience of responsible AI for enterprise at Salesforce, from defining the principles to putting them into practice to build trust in AI. Finally, I will also address some emerging challenges and open issues of recent generative AI advances and call for actions of joint responsible AI efforts from academia, industries and governments. 随着人工智能市场机会的迅速增长,以及人工智能技术在现实世界中广泛应用的加速发展,负责任的人工智能已引起学术界和工业界越来越多的关注。在这个演讲中,我将集中讨论在行业环境中建立可信的 AI 驱动的企业平台的负责任的 AI 的主题。我将在 Salesforce 分享我们为企业负责任的人工智能所做的努力和经验,从界定原则到将原则付诸实践,以建立对人工智能的信任。最后,我还将讨论一些新出现的挑战和最近人工智能发展的公开问题,并呼吁学术界、工业界和政府共同采取负责任的人工智能行动。 code 0
Simulating Humans at Scale to Evaluate Voice Interfaces for TVs: the Round-Trip System at Comcast Breck Baldwin, Lauren Reese, Liming Zhang, Jan Neumann, Taylor Cassidy, Michael Pereira, G. Craig Murray, Kishorekumar Sundararajan, Yidnekachew Endale, Pramod Kadagattor, Paul Wolfe, Brian Aiken, Tony Braskich, Donte Jiggetts, Adam Sloan, Esther Vaturi, Crystal Pender, Ferhan Ture Comcast Applied AI, Washington, DC, USA Evaluating large-scale customer-facing voice interfaces involves a variety of challenges, such as data privacy, fairness or unintended bias, and the cost of human labor. Comcast's Xfinity Voice Remote is one such voice interface aimed at users looking to discover content on their TVs. The artificial intelligence (AI) behind the voice remote currently powers multiple voice interfaces, serving tens of millions of requests every day, from users across the globe. In this talk, we introduce a novel Round-Trip system we have built to evaluate the AI serving these voice interfaces in a semi-automated manner, providing a robust and cheap alternative to traditional quality assurance methods. We discuss five specific challenges we have encountered in Round-Trip and describe our solutions in detail. 评估大规模面向客户的语音界面涉及各种挑战,如数据隐私、公平性或意外偏差,以及人力成本。康卡斯特的 Xfinity Voice Remote 就是这样一个语音界面,旨在帮助用户发现电视上的内容。目前,语音遥控器背后的人工智能(AI)为多个语音界面提供动力,每天为全球用户提供数以千万计的请求服务。在这次演讲中,我们介绍了一个新颖的往返系统,我们已经建立了这个系统,以半自动的方式评估服务于这些语音界面的 AI,为传统的质量保证方法提供了一个强大而廉价的替代方案。我们讨论了在往返过程中遇到的五个具体挑战,并详细描述了我们的解决方案。 code 0
Considerations for Ethical Speech Recognition Datasets Orestis Papakyriakopoulos, Alice Xiang Sony AI, Seattle, WA, USA; Sony AI, Zurich, Switzerland Speech AI Technologies are largely trained on publicly available datasets or by the massive web-crawling of speech. In both cases, data acquisition focuses on minimizing collection effort, without necessarily taking the data subjects' protection or user needs into consideration. This results in models that are not robust when used on users who deviate from the dominant demographics in the training set, discriminating against individuals with different dialects, accents, speaking styles, and disfluencies. In this talk, we use automatic speech recognition as a case study and examine the properties that ethical speech datasets should possess towards responsible AI applications. We showcase diversity issues, inclusion practices, and necessary considerations that can improve trained models, while facilitating model explainability and protecting users and data subjects. We argue for the legal & privacy protection of data subjects, targeted data sampling corresponding to user demographics & needs, appropriate metadata that ensures explainability & accountability in cases of model failure, and the sociotechnical & situated model design. We hope this talk can inspire researchers & practitioners to design and use more human-centric datasets in speech technologies and other domains, in ways that empower and respect users, while improving machine learning models' robustness and utility. 语音人工智能技术在很大程度上是通过公开的数据集或大规模的语音网络爬行训练出来的。在这两种情况下,数据采集都侧重于尽可能减少采集工作,而不必考虑数据主体的保护或用户需求。这导致模型在面对偏离训练集主导人群特征的用户时不够稳健,从而歧视具有不同方言、口音、说话风格和不流利表达的个人。在这个演讲中,我们使用自动语音识别作为一个案例研究,并检查伦理语音数据集应具有的特性,以负责任的人工智能应用。我们展示了多样性问题、包容实践和必要的考虑因素,这些因素可以改进经过训练的模型,同时促进模型的可解释性并保护用户和数据主体。我们主张数据主体的法律和隐私保护,针对用户人口统计和需求的有针对性的数据抽样,适当的元数据,以确保模型失败的情况下的可解释性和问责制,以及社会技术和情境模型设计。我们希望这次演讲能够激励研究人员和从业人员在语音技术和其他领域设计和使用更多以人为中心的数据集,以授权和尊重用户的方式,同时提高机器学习模型的健壮性和实用性。 code 0
Under the Hood of Social Media Advertising: How Do We use AI Responsibly for Advertising Targeting and Creative Evaluation Aleksandr Farseev Somin.ai, ITMO University, Singapore, Singapore Digital Advertising is historically one of the most developed areas where Machine Learning and AI have been applied since its origination. From smart bidding to creative content generation and DCO, AI is well-demanded in the modern digital marketing industry and partially serves as a backbone of most of the state-of-the-art computational advertising systems, making it impossible for the AI tech and the programmatic systems to exist apart from one another. At the same time, given the drastic growth of the available AI technology nowadays, the issue of responsible AI utilization as well as the balance between the opportunity of deploying AI systems and the possible borderline ethical and privacy-related consequences are yet to be discussed comprehensively in both business and research communities. Particularly, an important issue of automatic User Profiling use in modern Programmatic systems like Meta Ads as well as the need for responsible application of the creative assessment models to fit into the business ethics guidelines is yet to be described well. Therefore, in this talk, we are going to discuss the technology behind modern programmatic bidding and content scoring systems and the responsible application of AI by SoMin.ai to manage the Advertising targeting and Creative Validation process. 数字广告历来是机器学习和人工智能应用最发达的领域之一。从智能投标到创意内容生成和 DCO,人工智能在现代数字营销行业中的需求很大,并且在一定程度上作为大多数最先进的计算广告系统的骨干,使得人工智能技术和编程系统不可能彼此分离。与此同时,鉴于目前可用的人工智能技术急剧增长,负责任地利用人工智能的问题,以及在部署人工智能系统的机会与可能的边缘伦理和与隐私有关的后果之间的平衡问题,仍有待商界和研究界全面讨论。特别是,在 Meta Ads 等现代程序化系统中使用自动用户画像这一重要问题,以及负责任地应用创意评估模型以符合商业伦理准则的必要性,尚未得到很好的阐述。因此,在本次演讲中,我们将讨论现代程序投标和内容评分系统背后的技术,以及 SoMin.ai 对人工智能负责任的应用,以管理广告定位和创意验证过程。 code 0
An Open-Source Suite of Causal AI Tools and Libraries Emre Kiciman Microsoft Research, Redmond, WA, USA We propose to accelerate use-inspired basic research in causal AI through a suite of causal tools and libraries that simultaneously provides core causal AI functionality to practitioners and creates a platform for research advances to be rapidly deployed. In this presentation, we describe our contributions towards an open-source causal AI suite. We describe some of their applications, the lessons learned from their usage, and what is next. 我们建议通过一套因果工具和库,加速因果 AI 的基础研究,这些工具和库同时为从业者提供核心因果 AI 功能,并为快速部署研究进展创建一个平台。在这个演讲中,我们描述了我们对开源因果 AI 套件的贡献。我们描述了它们的一些应用程序,从使用中学到的经验教训,以及下一步是什么。 code 0
Privacy in the Time of Language Models Charith Peris, Christophe Dupuy, Jimit Majmudar, Rahil Parikh, Sami Smaili, Richard S. Zemel, Rahul Gupta Columbia University, New York, NY, USA; Amazon Alexa, Cambridge, MA, USA; Amazon Alexa, Toronto, ON, USA; Amazon Alexa, Sunnyvale, CA, USA Pretrained large language models (LLMs) have consistently shown state-of-the-art performance across multiple natural language processing (NLP) tasks. These models are of much interest for a variety of industrial applications that use NLP as a core component. However, LLMs have also been shown to memorize portions of their training data, which can contain private information. Therefore, when building and deploying LLMs, it is of value to apply privacy-preserving techniques that protect sensitive data. In this talk, we discuss privacy measurement and preservation techniques for LLMs that can be applied in the context of industrial applications and present case studies of preliminary solutions. We discuss select strategies and metrics relevant for measuring memorization in LLMs that can, in turn, be used to measure privacy-risk in these models. We then discuss privacy-preservation techniques that can be applied at different points of the LLM training life-cycle; including our work on an algorithm for fine-tuning LLMs with improved privacy. In addition, we discuss our work on privacy-preserving solutions that can be applied to LLMs during inference and are feasible for use at run time. 经过预先训练的大型语言模型(LLM)在多个自然语言处理(NLP)任务中始终表现出最先进的性能。这些模型对于使用 NLP 作为核心组件的各种工业应用程序非常有意义。然而,LLM 也被证明能够记住它们训练数据的一部分,这些数据可以包含私人信息。因此,在构建和部署 LLM 时,应用保护敏感数据的隐私保护技术是有价值的。在这个演讲中,我们讨论了 LLM 的隐私度量和保护技术,这些技术可以应用于工业应用的背景下,并提出了初步解决方案的案例研究。我们讨论与测量 LLM 记忆相关的选择策略和指标,这些策略和指标反过来又可以用来测量这些模型中的隐私风险。然后,我们将讨论可以应用于 LLM 培训生命周期不同阶段的隐私保护技术; 包括我们在具有改进隐私的 LLM 微调算法方面的工作。此外,我们还讨论了我们在隐私保护解决方案方面的工作,这些解决方案可以在推理期间应用于 LLM,并且在运行时使用是可行的。 code 0
Incorporating Fairness in Large Scale NLU Systems Rahul Gupta, Lisa Bauer, Kai-Wei Chang, Jwala Dhamala, Aram Galstyan, Palash Goyal, Qian Hu, Avni Khatri, Rohit Parimi, Charith Peris, Apurv Verma, Richard S. Zemel, Prem Natarajan Amazon Alexa, Cambridge, MA, USA; Amazon Alexa, Sunnyvale, CA, USA; Amazon Alexa, New York, NY, USA; Amazon Alexa, Los Angeles, CA, USA NLU models power several user-facing experiences such as conversational agents and chat bots. Building NLU models typically consists of 3 stages: a) building or fine-tuning a pre-trained model, b) distilling or fine-tuning the pre-trained model to build task-specific models, and c) deploying the task-specific model to production. In this presentation, we will identify fairness considerations that can be incorporated in the aforementioned three stages in the life-cycle of NLU model building: (i) selection/building of a large scale language model, (ii) distillation/fine-tuning of the large model into a task-specific model, and (iii) deployment of the task-specific model. We will present select metrics that can be used to quantify fairness in NLU models and fairness enhancement techniques that can be deployed in each of these stages. Finally, we will share some recommendations to successfully implement fairness considerations when building an industrial scale NLU system. NLU 模型支持多种用户体验,如会话代理和聊天机器人。构建 NLU 模型通常包括3个阶段: a)构建或微调预先训练的模型,b)蒸馏或微调预先训练的模型以构建特定于任务的模型,c)将特定于任务的模型部署到生产环境中。在这个介绍中,我们将确定公平性的考虑,可以纳入上述三个阶段的 NLU 模型建设的生命周期: (i)选择/建立一个大规模的语言模型,(ii)蒸馏/微调大型模型到任务特定的模型,以及(iii)部署任务特定的模型。我们将介绍可用于量化 NLU 模型中的公平性的选择指标,以及可在每个阶段部署的公平性增强技术。最后,我们将分享一些建议,以便在构建工业规模的 NLU 系统时成功落实公平性考量。 code 0
Social Public Health Infrastructure for a Smart City Citizen Patient: Advances and Opportunities for AI Driven Disruptive Innovation Ankur Teredesai University of Washington & CueZen Inc., Seattle, WA, USA Promoting health, preventing disease, and prolonging life are central to the success of any smart city initiative. Today, wireless communication, data infrastructure, and low-cost sensors such as lifestyle and activity trackers are making it increasingly possible for cities to collect, collate, and innovate on developing a smart infrastructure. Combining this with AI driven disruptions for human behavior change can fundamentally transform delivery of public health for the citizen patient[4]. Most urban development government bodies consider such infrastructure to be a distributed ecosystem consisting of physical infrastructure, institutional infrastructure, social infrastructure and economic infrastructure[1]. In this talk, we will focus mostly on the social health infrastructure component and first showcase some of the recent initiatives created with the purpose of addressing health, which is a key social goal, as a smart city goal. Specifically, we will discuss how technical advances in IoT, recommendation systems, geospatial computing, and digital health therapeutics are creating a new future. Yet, such advances are not addressing the issues of making healthy behavior change sustainable with broad health equity for those that need it most[3]. The overwhelming nature of siloed apps and digital health solutions often leaves citizens overwhelmed and those most in need underserved[2]. This talk highlights the advances and opportunities created when behavioral economics and public health combine with AI and cloud infrastructure to make smart public health initiatives personalized to each individual citizen patient. 
促进健康、预防疾病和延长寿命是任何智慧城市倡议成功的核心。如今,无线通信、数据基础设施以及生活方式和活动跟踪器等低成本传感器使得城市越来越有可能在开发智能基础设施方面进行收集、整理和创新。将这一点与人工智能驱动的人类行为改变的破坏相结合,可以从根本上改变为公民患者提供公共卫生服务[4]。大多数城市发展政府机构认为这些基础设施是一个分布式的生态系统,包括有形基础设施、制度基础设施、社会基础设施和经济基础设施[1]。在这次演讲中,我们将主要侧重于社会卫生基础设施部分,并首先展示最近为解决卫生问题而采取的一些举措,这是一个关键的社会目标,也是一个智慧城市的目标。具体来说,我们将讨论物联网、推荐系统、地理空间计算和数字健康治疗的技术进步是如何创造一个新的未来的。然而,这些进步并没有解决问题,使健康的行为改变可持续与广泛的健康公平,为那些最需要它[3]。孤立的应用程序和数字健康解决方案的压倒性本质,往往让公民不堪重负,最需要帮助的人得不到充分的服务[2]。这次演讲强调了当行为经济学和公共卫生与人工智能和云基础设施相结合,使智能公共卫生倡议个性化到每个公民病人时所创造的进步和机遇。 code 0
Recent Advances on Deep Learning based Knowledge Tracing Zitao Liu, Jiahao Chen, Weiqi Luo TAL Education Group, Beijing, China; Jinan University, Guangzhou, China Knowledge tracing (KT) is the task of using students' historical learning interaction data to model their knowledge mastery over time so as to make predictions on their future interaction performance. Recently, remarkable progress has been made in using various deep learning techniques to solve the KT problem. However, the success behind deep learning based knowledge tracing (DLKT) approaches is still somewhat unknown, and proper measurement and analysis of these DLKT approaches remain a challenge. In this talk, we will comprehensively review recent developments of applying state-of-the-art deep learning approaches in KT problems, with a focus on real-world educational data. Beyond introducing the recent advances of various DLKT models, we will discuss how to guarantee valid comparisons across DLKT methods via thorough evaluations on several publicly available datasets. More specifically, we will talk about (1) KT related psychometric theories; (2) the general DLKT modeling framework that covers recently developed DLKT approaches from different categories; (3) the general DLKT benchmark that makes existing approaches comparable on public KT datasets; (4) the broad application of algorithmic assessment and personalized feedback. Participants will learn about recent trends and emerging challenges in this topic, representative tools and learning resources to obtain ready-to-use models, and how related models and techniques benefit real-world KT applications. 
知识追踪(KT)是利用学生的历史学习互动数据,建立学生随时间变化的知识掌握模型,从而预测学生未来的互动表现的任务。近年来,利用各种深度学习技术解决 KT 问题的研究取得了显著的进展。然而,基于深度学习的知识跟踪(DLKT)方法的成功仍然是未知的,对这些 DLKT 方法的正确测量和分析仍然是一个挑战。在这个演讲中,我们将全面回顾在 KT 问题中应用最先进的深度学习方法的最新进展,重点放在那些真实世界的教育数据上。除了介绍各种 DLKT 模型的最新进展之外,我们还将讨论如何通过对几个公开数据集的全面评估来保证跨 DLKT 方法的有效比较。更具体地说,我们将讨论(1)与 KT 相关的心理测量学理论; (2)涵盖最近从不同类别开发的 DLKT 方法的通用 DLKT 建模框架; (3)允许现有方法在公共 KT 数据集上具有可比性的通用 DLKT 基准; (4)算法评估和个性化反馈的广泛应用。与会者将了解这一主题的最新趋势和新出现的挑战,获得现成可用模型的代表性工具和学习资源,以及相关模型和技术如何有利于现实世界的 KT 应用。 code 0
SmartCityBus - A Platform for Smart Transportation Systems Georgios Bouloukakis, Chrysostomos Zeginis, Nikolaos Papadakis, Kostas Magoutis, George Christodoulou, Chrysanthi Kosyfaki, Konstantinos Lampropoulos, Nikos Mamoulis Foundation for Research and Technology - Hellas (FORTH) & University of Crete, Heraklion, Greece; Télécom SudParis, Institut Polytechnique de Paris, Paris, France; University of Ioannina, Ioannina, Greece; Foundation for Research and Technology - Hellas (FORTH), Heraklion, Greece With the growth of the Internet of Things (IoT), Smart(er) Cities have been a research goal of researchers, businesses and local authorities willing to adopt IoT technologies to improve their services. Among them, Smart Transportation [7,8], the integrated application of modern technologies and management strategies in transportation systems, refers to the adoption of new IoT solutions to improve urban mobility. These technologies aim to provide innovative solutions related to different modes of transport and traffic management and enable users to be better informed and make safer and 'smarter' use of transport networks. This talk presents SmartCityBus, a data-driven intelligent transportation system (ITS) whose main objective is to use online and offline data in order to provide accurate statistics and predictions and improve public transportation services in the short and medium/long term. 随着物联网的发展,智能城市已成为研究人员、企业和地方政府愿意采用物联网技术改善其服务的研究目标。其中,智能交通[7,8] ,现代技术和管理战略在交通系统的综合应用,指采用新的物联网解决方案,以改善城市流动性。这些技术旨在提供与不同运输和交通管理模式相关的创新解决方案,使用户能够更好地了解情况,更安全、更“智能”地使用运输网络。是次讲座介绍数据驱动的智能交通系统「智慧城市巴士」(SmartCityBus) ,其主要目的是利用在线和离线数据提供准确的统计数据和预测,并在短、中/长期内改善公共交通服务。 code 0
Towards an Event-Aware Urban Mobility Prediction System Zhaonan Wang, Renhe Jiang, Zipei Fan, Xuan Song, Ryosuke Shibasaki The University of Tokyo, Tokyo, Japan Today, thanks to the rapidly developing mobile and sensor networks in IoT (Internet of Things) systems, spatio-temporal big data are being constantly generated. They have brought us a data-driven possibility to sense and understand crowd mobility on a city scale. A fundamental task towards the next-generation mobility services, such as Intelligent Transportation Systems (ITS) and Mobility-as-a-Service (MaaS), is spatio-temporal predictive modeling of the geo-sensory signals. There is a recent line of research leveraging deep learning techniques to boost the forecasting performance on such tasks. While simulating the regularity of mobility behaviors (e.g., routines, periodicity) in a more sophisticated way, the existing studies ignore an important part of urban activities, i.e., events. Including holidays, extreme weather, pandemics, and accidents, various urban events happen from time to time and cause non-stationary phenomena, which by nature make the spatio-temporal forecasting task challenging. We thereby envision an event-aware urban mobility prediction model that is capable of fast adapting and making reliable predictions in different scenarios, which is crucial to decision making towards emergency response and urban resilience. 今天,由于物联网系统中快速发展的移动和传感器网络,时空大数据不断产生。他们给我们带来了一种数据驱动的可能性,在城市规模上感知和理解人群的流动性。智能交通系统(ITS)、移动即服务(MaaS)等下一代移动服务的基本任务是对地理感知信号进行时空预测建模。最近有一项研究利用深度学习技术来提高这类任务的预测性能。虽然现有的研究以一种更复杂的方式模拟流动行为的规律性(例如,例行公事,周期性) ,但是忽略了城市活动的一个重要组成部分,即事件。包括节假日、极端天气、流行病、事故、各种城市事件时有发生,引起非平稳现象,使得时空预测工作具有挑战性。因此,我们设想了一个能够意识到事件的城市流动性预测模型,该模型能够在不同的情况下快速适应并作出可靠的预测,这对于决策应对紧急情况和城市复原力至关重要。 code 0
Metropolitan-scale Mobility Digital Twin Zipei Fan, Renhe Jiang, Ryosuke Shibasaki University of Tokyo, Kashiwa, Chiba, Japan and Southern University of Science and Technology, Shenzhen, Guangdong, China; University of Tokyo, Kashiwa, Chiba, Japan Knowing "what is happening" and "what will happen" of the mobility in a city is the building block of a data-driven smart city system. In recent years, the mobility digital twin, which makes a virtual replica of human mobility and predicts or simulates the fine-grained movements of subjects in a virtual space at a metropolitan scale in near real-time, has shown great potential in modern urban intelligent systems. However, few studies have provided practical solutions. The main difficulties are fourfold: 1) the daily variation of human mobility is hard to model and predict; 2) the transportation network enforces complex constraints on human mobility; 3) generating a rational fine-grained human trajectory is challenging for existing machine learning models; and 4) making a fine-grained prediction incurs high computational costs, which is challenging for an online system. Bearing these difficulties in mind, in this paper we propose a two-stage human mobility predictor that stratifies the coarse and fine-grained level predictions. In the first stage, to encode the daily variation of human mobility at a metropolitan level, we automatically extract citywide mobility trends as crowd contexts and predict long-term and long-distance movements at a coarse level. In the second stage, the coarse predictions are resolved to a fine-grained level via a probabilistic trajectory retrieval method, which offloads most of the heavy computations to the offline phase. 
We tested our method using a real-world mobile phone GPS dataset in the Kanto area in Japan, and achieved good prediction accuracy and a time efficiency of about 2 min in predicting future 1h movements of about 220K mobile phone users on a single machine to support higher-level analysis of mobility prediction. 了解城市中移动性的“正在发生的事情”和“将要发生的事情”是数据驱动智能城市系统的组成部分。近年来,移动性数字孪生在现代城市智能系统中显示出巨大的潜力,它可以对人类的移动性进行虚拟复制,并在城市尺度的虚拟空间中近实时地预测或模拟主体的细粒度移动。然而,很少有研究提供切实可行的解决方案。主要的困难有四个方面: 1)人类流动性的日变化很难建模和预测; 2)交通网络对人类流动性施加了复杂的约束; 3)生成一个合理的细粒度人类轨迹对现有的机器学习模型是一个挑战; 4)做出细粒度预测会带来高计算成本,这对在线系统是一个挑战。考虑到这些困难,在本文中,我们提出了一个两阶段的人类流动性预测器,分层粗粒度和细粒度的水平预测。在第一阶段,为了编码城市层面上人口流动的日变化,我们自动提取城市范围内的人口流动趋势作为人群背景,并在一个粗略的层面上预测长期和长距离的流动。在第二阶段,通过概率轨迹检索方法将粗预测分解到细粒度水平,将大部分繁重的计算转移到离线阶段。我们在日本关东地区使用一个真实世界的手机 GPS 数据集测试了我们的方法,在一台机器上预测未来1小时内约220k 手机用户的移动时,取得了很好的预测精度和大约2分钟的时间效率,以支持更高层次的移动性预测分析。 code 0
Interpretable Research Interest Shift Detection with Temporal Heterogeneous Graphs Qiang Yang, Changsheng Ma, Qiannan Zhang, Xin Gao, Chuxu Zhang, Xiangliang Zhang Brandeis University, Massachusetts, MA, USA; King Abdullah University of Science and Technology, Jeddah, Saudi Arabia; University of Notre Dame, Indiana, IN, USA Researchers dedicate themselves to research problems they are interested in and often have evolving research interests in their academic careers. The study of research interest shift detection can help to find facts relevant to scientific training paths, scientific funding trends, and knowledge discovery. Existing methods define specific graph structures like author-conference-topic networks, and co-citing networks to detect research interest shift. They either ignore the temporal factor or miss heterogeneous information characterizing academic activities. More importantly, the detection results lack the interpretations of how research interests change over time, thus reducing the model's credibility. To address these issues, we propose a novel interpretable research interest shift detection model with temporal heterogeneous graphs. We first construct temporal heterogeneous graphs to represent the research interests of the target authors. To make the detection interpretable, we design a deep neural network to parameterize the generation process of interpretation on the predicted results in the form of a weighted sub-graph. Additionally, to improve the training process, we propose a semantic-aware negative data sampling strategy to generate non-interesting auxiliary shift graphs as contrastive samples. Extensive experiments demonstrate that our model outperforms the state-of-the-art baselines on two public academic graph datasets and is capable of producing interpretable results. code -1
Learning to Understand Audio and Multimodal Content Rosie Jones Spotify, Boston, MA, USA Music, podcasts and audiobooks are rich audio content types with strong listener engagement. Search and recommendation across these content types can be greatly enhanced with a deep understanding of their content; across audio, text, and other multimodal content. In this talk, I discuss some of the challenges and opportunities in understanding this content. This deep understanding of content enables us to delight our users and expand the reach of our content creators. As part of enabling wider academic research into podcast content understanding, Spotify Research [1] has released a podcast dataset [2] with 120,000 hours of podcasts in English [3] and Portuguese [4]. 音乐、播客和有声读物是丰富的音频内容类型,具有强大的听众参与度。通过深入理解这些内容类型的内容,可以大大增强跨这些内容类型的搜索和推荐; 跨音频、文本和其他多通道内容。在这个演讲中,我讨论了理解这些内容的一些挑战和机遇。这种对内容的深刻理解使我们能够取悦我们的用户,并扩大我们的内容创作者的范围。为了使更广泛的学术研究能够理解播客内容,Spotify Research [1]发布了一个播客数据集[2] ,其中包括120,000小时的英语和葡萄牙语播客[3]。 code -1
Preference-Based Offline Evaluation Charles L. A. Clarke, Fernando Diaz, Negar Arabzadeh Google, Montreal, PQ, Canada; University of Waterloo, Waterloo, ON, Canada A core step in production model research and development involves the offline evaluation of a system before production deployment. Traditional offline evaluation of search, recommender, and other systems involves gathering item relevance labels from human editors. These labels can then be used to assess system performance using offline evaluation metrics. Unfortunately, this approach does not work when evaluating highly effective ranking systems, such as those emerging from the advances in machine learning. Recent work demonstrates that moving away from pointwise item and metric evaluation can be a more effective approach to the offline evaluation of systems. This tutorial, intended for both researchers and practitioners, reviews early work in preference-based evaluation and covers recent developments in detail. 生产模型研究和开发的核心步骤包括在生产部署之前对系统进行离线评估。传统的搜索、推荐和其他系统的离线评估包括从人工编辑收集项目相关标签。然后可以使用这些标签使用离线评估指标来评估系统性能。不幸的是,这种方法在评估高效的排名系统时不起作用,例如那些来自机器学习进步的排名系统。最近的工作表明,摆脱逐点的项目和度量评估可以成为一种更有效的系统离线评估方法。本教程面向研究人员和从业人员,回顾了基于偏好的评估的早期工作,并详细介绍了最近的发展。 code -1