International publications

Bay Vo, Tuong Le, Frans Coenen, Tzung-Pei Hong; Mining frequent itemsets using the N-list and subsume concepts; International Journal of Machine Learning and Cybernetics, DOI: 10.1007/s13042-014-0252-2 (2014) (SCIE).

Abstract

Frequent itemset mining is a fundamental element with respect to many data mining problems directed at finding interesting patterns in data. Recently the PrePost algorithm, a new algorithm for mining frequent itemsets based on the idea of N-lists, which in most cases outperforms other current state-of-the-art algorithms, has been presented. This paper proposes an improved version of PrePost, the N-list and Subsume-based algorithm for mining Frequent Itemsets (NSFI) algorithm that uses a hash table to enhance the process of creating the N-lists associated with 1-itemsets and an improved N-list intersection algorithm. Furthermore, two new theorems are proposed for determining the “subsume index” of frequent 1-itemsets based on the N-list concept. Using the subsume index, NSFI can identify groups of frequent itemsets without determining the N-list associated with them. The experimental results show that NSFI outperforms PrePost in terms of runtime and memory usage and outperforms dEclat in terms of runtime.

Keywords
  • Data mining
  • Pattern mining
  • Frequent itemset
  • N-list
  • Subsume
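
Illustrative sketch

The abstract says that the subsume index lets NSFI identify groups of frequent itemsets without determining their N-lists. The following minimal sketch illustrates that idea only, and it makes several assumptions that are not taken from the paper. It uses plain sets of transaction ids instead of N-lists, it assumes the usual definition in which a frequent item A subsumes item B when every transaction containing A also contains B, and all function names (`tidsets`, `subsume_index`, `expand_with_subsume`) are hypothetical. The paper's two theorems, its hash table, and its N-list intersection are not reproduced here.

```python
from itertools import combinations


def tidsets(transactions):
    """Map each item to the set of ids of the transactions that contain it."""
    result = {}
    for tid, items in enumerate(transactions):
        for item in items:
            result.setdefault(item, set()).add(tid)
    return result


def subsume_index(transactions, min_support):
    """For each frequent item a, list the other frequent items b whose
    transaction set contains a's transaction set (assumed definition)."""
    tids = tidsets(transactions)
    frequent = {i: t for i, t in tids.items() if len(t) >= min_support}
    return {
        a: sorted(b for b in frequent if b != a and frequent[a] <= frequent[b])
        for a in frequent
    }


def expand_with_subsume(item, subsumed, support):
    """Combine `item` with every non-empty subset of `subsumed`.

    Each resulting itemset occurs in exactly the transactions that contain
    `item`, so it inherits `support` and needs no further counting.
    """
    out = []
    for r in range(1, len(subsumed) + 1):
        for combo in combinations(subsumed, r):
            out.append((frozenset((item,) + combo), support))
    return out


# Toy database (made up for illustration, not from the paper).
transactions = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}, {"a", "b", "c"}]
index = subsume_index(transactions, min_support=2)
group_a = expand_with_subsume("a", index["a"], support=3)
```

In this toy database, item `a` appears in transactions 0, 1 and 3, all of which also contain `b`, so `index["a"]` is `["b"]` and the itemset `{a, b}` is known to have support 3 without any further intersection. Two items with identical transaction sets subsume each other in this simplified version; a real miner would need an item ordering to avoid generating the same itemset twice.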