MINING TOP-K FREQUENT SEQUENTIAL PATTERN IN ITEM INTERVAL EXTENDED SEQUENCE DATABASE

Duong Huy Tran, Thang Truong Nguyen, Thi Duc Vu, Anh The Tran
Authors

  • Duong Huy Tran Institute of Information Technology - Vietnam Academy of Science and Technology
  • Thang Truong Nguyen Institute of Information Technology - Vietnam Academy of Science and Technology
  • Thi Duc Vu Information Technology Institute, Vietnam National University
  • Anh The Tran Institute of Information Technology - Vietnam Academy of Science and Technology

DOI:

https://doi.org/10.15625/1813-9663/34/3/13053

Keywords:

sequential pattern, time, item interval, top-K, weighted

Abstract

Frequent sequential pattern mining in item interval extended sequence databases (iSDB) has been an interesting task in recent years. Unlike classic frequent sequential pattern mining, pattern mining in iSDB also considers the item interval between successive items; thus, it may extract sequential patterns that are more meaningful in real-life applications. Most previous algorithms for frequent sequential pattern mining in iSDB need a minimum support threshold (minsup) to perform the mining. However, it is not easy for users to provide an appropriate threshold in practice. A minsup value that is too high leads to missing valuable patterns, while one that is too low may generate too many useless patterns. To address this problem, we propose an algorithm, TopKWFP (top-k weighted frequent sequential pattern mining in item interval extended sequence database). Our algorithm does not require a fixed minsup value; instead, the minsup value is raised dynamically during the mining process.
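The idea of a dynamically rising minsup can be illustrated with a small sketch. This is not the TopKWFP algorithm itself, whose details are not given on this page; it only shows, under stated assumptions, how a top-k miner can start with minsup at zero and raise it as better patterns are found. The function name `top_k_with_rising_minsup` and the input format (an iterable of pattern and support pairs whose support values are already computed) are hypothetical.

```python
import heapq
from typing import Iterable, List, Tuple


def top_k_with_rising_minsup(
    candidates: Iterable[Tuple[str, float]], k: int
) -> Tuple[List[Tuple[str, float]], float]:
    """Keep the k best-supported patterns, raising minsup as the heap fills.

    Assumption: each candidate is a (pattern, support) pair whose support
    (plain or weighted) has already been computed by some miner.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    heap: List[Tuple[float, str]] = []  # min-heap keyed on support
    minsup = 0.0  # no user-supplied threshold; nothing is rejected at first

    for pattern, support in candidates:
        if len(heap) < k:
            # Fewer than k patterns so far: accept unconditionally.
            heapq.heappush(heap, (support, pattern))
        elif support > minsup:
            # Better than the weakest kept pattern: swap it in.
            heapq.heapreplace(heap, (support, pattern))
        if len(heap) == k:
            # The k-th best support seen so far becomes the new threshold.
            minsup = heap[0][0]

    # Report patterns from highest to lowest support.
    result = sorted(((p, s) for s, p in heap), key=lambda x: (-x[1], x[0]))
    return result, minsup


if __name__ == "__main__":
    stream = [("a", 5), ("ab", 3), ("b", 4), ("abc", 1), ("bc", 6)]
    patterns, final_minsup = top_k_with_rising_minsup(stream, k=3)
    print(patterns, final_minsup)
```

In a full miner the rising threshold would also be used to prune the search space (for example, to skip extending prefixes whose support is already below minsup); the sketch above only filters a stream of finished candidates.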

Published

23-11-2018

How to Cite

[1]
D. H. Tran, T. T. Nguyen, T. D. Vu, and A. T. Tran, “MINING TOP-K FREQUENT SEQUENTIAL PATTERN IN ITEM INTERVAL EXTENDED SEQUENCE DATABASE”, JCC, vol. 34, no. 3, p. 249–263, Nov. 2018.

Section

Computer Science