The integration of Large Language Models (LLMs) into e-commerce platforms has significantly enhanced user experience through personalized recommendations and automated customer support. However, existing Retrieval-Augmented Generation (RAG) frameworks face challenges when applied to e-commerce product Question Answering (QA), such as handling extensive product catalogs, ensuring timely knowledge updates, and maintaining efficient retrieval performance. In this paper, we propose ItemRAG, a novel framework that combines RAG with item-based knowledge computing to address these challenges. ItemRAG decouples QA templates from specific products by leveraging a dynamic knowledge graph, enabling efficient updates and reducing the size of the knowledge base. The framework includes state analysis to capture user intent and context, grouped indexing for efficient retrieval, and knowledge computing to dynamically generate accurate answers. Experimental results demonstrate that the decoupled ItemRAG significantly outperforms coupled RAG approaches (CoupledRAG) in retrieval accuracy and generation quality, achieving higher precision, recall, F1-score, and factual correctness. Our work highlights the efficacy of integrating knowledge graphs with RAG to enhance LLM-based e-commerce customer service systems.
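As a rough illustration of what decoupling QA templates from specific products could look like, the sketch below keeps product facts in a toy attribute store and fills product-independent templates at answer time. It is not the ItemRAG implementation: the names `PRODUCT_GRAPH`, `QA_TEMPLATES`, and `answer`, the sample products, and the use of a plain dictionary in place of a knowledge graph are all assumptions made for this example.

```python
from string import Template

# Hypothetical toy stand-in for a knowledge graph: product -> attribute -> value.
PRODUCT_GRAPH = {
    "kettle-01": {"capacity": "1.7 L", "warranty": "2 years"},
    "kettle-02": {"capacity": "1.0 L", "warranty": "1 year"},
}

# Product-independent QA templates, keyed by the attribute a question is about.
# One template serves every product, so the template set does not grow with the catalog.
QA_TEMPLATES = {
    "capacity": Template("The $product holds $value."),
    "warranty": Template("The $product comes with a $value warranty."),
}


def answer(product_id: str, attribute: str) -> str:
    """Look up the current attribute value and fill the shared template with it."""
    value = PRODUCT_GRAPH[product_id][attribute]
    return QA_TEMPLATES[attribute].substitute(product=product_id, value=value)


print(answer("kettle-01", "capacity"))

# A knowledge update touches only the attribute store; the templates stay unchanged.
PRODUCT_GRAPH["kettle-01"]["capacity"] = "2.0 L"
print(answer("kettle-01", "capacity"))
```

Under these assumptions, the stored knowledge scales with the number of templates plus the number of product attributes, rather than with one QA pair per product and question.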

Regular Paper Issue

Sequential Cooperative Distillation for Imbalanced Multi-Task Learning

Quan Feng, Jia-Yu Yao, Ming-Kun Xie, Sheng-Jun Huang, Song-Can Chen

Journal of Computer Science and Technology 2024, 39(5): 1094-1106

Published: 05 December 2024

Multi-task learning (MTL) can boost the performance of individual tasks through mutual learning among multiple related tasks. However, when these tasks differ in complexity, their corresponding losses in the MTL objective inevitably compete with each other and ultimately bias the learning towards simple tasks rather than complex ones. To address this imbalanced learning problem, we propose a novel MTL method that can equip multiple existing deep MTL model architectures with a sequential cooperative distillation (SCD) module. Specifically, we first introduce an efficient mechanism to measure the similarity between tasks, and group similar tasks into the same block so that they can learn cooperatively from each other. Based on this, the grouped task blocks are sorted in a queue that determines the learning sequence of the tasks according to their complexities, estimated with the defined performance indicator. Finally, distillation between the individual task-specific models and the MTL model is performed block by block, from complex to simple, achieving a balance between competition and cooperation among the tasks. Extensive experiments demonstrate that our method is significantly more competitive than state-of-the-art methods, ranking first in average performance across multiple datasets, with improvements of 12.95% and 3.72% over OMTL and MTLKD, respectively.
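The grouping and ordering steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' SCD module: the similarity matrix, the threshold, the per-task complexity scores, and the function names `group_tasks` and `order_blocks` are hypothetical, and connected components over a thresholded similarity graph stand in for whatever grouping mechanism the paper actually uses.

```python
def group_tasks(similarity, threshold):
    """Group tasks into blocks.

    Assumption: two tasks belong to the same block when they are linked by a
    chain of pairwise similarities that are all >= threshold (connected components).
    """
    n = len(similarity)
    assigned = [False] * n
    blocks = []
    for start in range(n):
        if assigned[start]:
            continue
        assigned[start] = True
        stack, block = [start], []
        while stack:
            i = stack.pop()
            block.append(i)
            for j in range(n):
                if not assigned[j] and similarity[i][j] >= threshold:
                    assigned[j] = True
                    stack.append(j)
        blocks.append(sorted(block))
    return blocks


def order_blocks(blocks, complexity):
    """Sort blocks from most to least complex, using the mean per-task complexity."""
    def mean_complexity(block):
        return sum(complexity[t] for t in block) / len(block)

    return sorted(blocks, key=mean_complexity, reverse=True)


# Hypothetical inputs: tasks 0 and 1 are similar, as are tasks 2 and 3.
similarity = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
complexity = [0.2, 0.3, 0.9, 0.7]  # higher means harder (assumed indicator)

blocks = group_tasks(similarity, threshold=0.5)
queue = order_blocks(blocks, complexity)
print(queue)  # blocks in the order they would be distilled, complex first
```

In this sketch the resulting queue would then drive the block-by-block distillation; the distillation itself is omitted because the abstract does not specify its form.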

Regular Paper Issue

Preface

Min-Ling Zhang, Sheng-Jun Huang, Ming-Sheng Long

Journal of Computer Science and Technology 2021, 36(3): 588-589

Published: 05 May 2021

Regular Paper Issue

Incremental Multi-Label Learning with Active Queries

Sheng-Jun Huang, Guo-Xiang Li, Wen-Yu Huang, Shao-Yuan Li

Journal of Computer Science and Technology 2020, 35(2): 234-246

Published: 27 March 2020

In multi-label learning, it is rather expensive to label instances since they are simultaneously associated with multiple labels. Therefore, active learning, which reduces the labeling cost by actively querying the labels of the most valuable data, becomes particularly important for multi-label learning. A good multi-label active learning algorithm usually consists of two crucial elements: a reasonable criterion to evaluate the gain of querying the label for an instance, and an effective classification model whose predictions allow the criterion to be computed accurately. In this paper, we first introduce an effective multi-label classification model by combining label ranking with threshold learning, which is incrementally trained to avoid retraining from scratch after every query. Based on this model, we then propose to exploit both uncertainty and diversity in the instance space as well as the label space, and actively query the instance-label pairs that can improve the classification model most. Extensive experiments on 20 datasets demonstrate the superiority of the proposed approach over state-of-the-art methods.
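To make the idea of querying instance-label pairs concrete, here is a minimal sketch of an uncertainty-only selection rule. It is not the criterion from the paper, which also accounts for diversity: the function name `most_uncertain_pair`, the per-instance thresholds, and the use of the distance between a label's score and the threshold as the uncertainty measure are all assumptions made for illustration.

```python
def most_uncertain_pair(scores, thresholds, queried):
    """Return the unqueried (instance, label) pair whose score lies closest to
    that instance's threshold, i.e. the pair the model is least sure about.

    scores[i][k]  : model score of label k for instance i (assumed given)
    thresholds[i] : score separating relevant from irrelevant labels for instance i
    queried       : set of (instance, label) pairs that were already labeled
    """
    best_pair, best_margin = None, float("inf")
    for i, row in enumerate(scores):
        for k, score in enumerate(row):
            if (i, k) in queried:
                continue
            margin = abs(score - thresholds[i])
            if margin < best_margin:
                best_pair, best_margin = (i, k), margin
    return best_pair


# Hypothetical scores for two instances and two labels.
scores = [[0.9, 0.1], [0.55, 0.2]]
thresholds = [0.5, 0.5]

first = most_uncertain_pair(scores, thresholds, queried=set())
second = most_uncertain_pair(scores, thresholds, queried={first})
print(first, second)
```

After each query, an incrementally trained model would update its scores and thresholds before the next pair is selected; that update step is left out here.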
