Rule Generation from Several Types of Table Data Sets and Its Application: Decision-Making with Transparency and an Improved Execution Environment

  • Hiroshi Sakai, Kyushu Institute of Technology
  • Zhiwen Jian, Kyushu Institute of Technology
Keywords: rule generation, Apriori algorithm, data mining, decision support, rough sets

Abstract

This paper addresses rule generation from table data sets and applies the obtained rules to decision support. Two types of table data sets are considered. The first is specified as a Deterministic Information System (DIS). The second is specified as a Non-deterministic Information System (NIS), which handles incomplete information. Two rule generation algorithms are refined and newly implemented in Python. Every obtained rule serves as evidence for decision-making, so the reasoning process remains transparent, an essential characteristic for Explainable AI. The decision support environment is strengthened by the improvements described in the paper and is also reimplemented in Python. Videos of the Python programs running are available on the accompanying web page. The framework applies to almost any table data set and can generate rules from it. Because it is based on discrete data, the framework complements statistical data analysis based on numerical data.
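To make the idea of rule generation from a table concrete, the following is a minimal sketch in Python and not the authors' implementation. It assumes a small hypothetical DIS (every attribute has exactly one known value) and uses plain brute-force enumeration of condition sets rather than Apriori-style pruning; the table, the attribute names, and the functions `generate_rules` and `supporting_rules` are all illustrative assumptions. A rule is kept when its support and accuracy reach the given thresholds, and the kept rules are then looked up as evidence for a new case.

```python
from itertools import combinations

# Hypothetical toy table (a DIS): each attribute holds one known value.
TABLE = [
    {"temp": "high", "headache": "yes", "flu": "yes"},
    {"temp": "high", "headache": "no", "flu": "yes"},
    {"temp": "normal", "headache": "yes", "flu": "no"},
    {"temp": "normal", "headache": "no", "flu": "no"},
    {"temp": "high", "headache": "yes", "flu": "yes"},
]


def generate_rules(table, decision, min_support, min_accuracy, max_len=2):
    """Enumerate rules 'conditions => decision' meeting both thresholds.

    support  = (rows matching conditions and decision) / (all rows)
    accuracy = (rows matching conditions and decision) / (rows matching conditions)
    Brute-force enumeration for illustration; no Apriori pruning is done.
    """
    n = len(table)
    condition_attrs = [a for a in table[0] if a != decision]
    rules = []
    for k in range(1, max_len + 1):
        for attrs in combinations(condition_attrs, k):
            cond_counts = {}   # condition values -> row count
            joint_counts = {}  # (condition values, decision value) -> row count
            for row in table:
                cond = tuple(row[a] for a in attrs)
                cond_counts[cond] = cond_counts.get(cond, 0) + 1
                key = (cond, row[decision])
                joint_counts[key] = joint_counts.get(key, 0) + 1
            for (cond, dec_value), count in joint_counts.items():
                support = count / n
                accuracy = count / cond_counts[cond]
                if support >= min_support and accuracy >= min_accuracy:
                    rules.append(
                        (tuple(zip(attrs, cond)), (decision, dec_value),
                         support, accuracy)
                    )
    return rules


def supporting_rules(case, rules):
    """Return the rules whose conditions all hold for the given case."""
    return [
        rule for rule in rules
        if all(case.get(attr) == value for attr, value in rule[0])
    ]


if __name__ == "__main__":
    found = generate_rules(TABLE, "flu", min_support=0.4, min_accuracy=1.0)
    for conditions, conclusion, support, accuracy in found:
        print(conditions, "=>", conclusion, support, accuracy)
    # Rules matching a new case act as the evidence for its decision.
    print(supporting_rules({"temp": "high", "headache": "no"}, found))
```

Because each decision is backed by an explicit list of matching rules together with their support and accuracy, a user can inspect why a conclusion was suggested. Handling a NIS would additionally require considering every possible table obtainable by resolving the non-deterministic values, which this sketch does not attempt.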

Published
2023-02-28
Section
Technical Papers