RIDDLE: Rule Induction with Deep Learning


  • Cosimo Persia, University of Bergen
  • Ricardo Guimarães, University of Bergen




Keywords: Rule Induction, Deep Learning, Possibilistic Logic


Numerous applications rely on the efficiency of Deep Learning models to address complex classification tasks in critical decision-making. However, we may not know how each feature contributes to a model's prediction. In contrast, Rule Induction algorithms extract interpretable patterns from data, but traditional approaches scale poorly. In this work, we bridge Deep Learning and Rule Induction and define the RIDDLE (Rule Induction with Deep Learning) architecture. An empirical evaluation shows that RIDDLE achieves state-of-the-art performance in Rule Induction.
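To illustrate the general idea of combining rule induction with differentiable models (this is a generic sketch of a soft conjunction over binary features, not the actual RIDDLE architecture; the function name and weighting scheme are hypothetical), a rule can be represented as a product in which a learnable weight in [0, 1] decides whether each literal participates:

```python
import numpy as np

def soft_and(x, w):
    # Soft conjunction: literal j participates in the rule with weight
    # w[j] in [0, 1]. The output is close to 1 only when every selected
    # literal is satisfied. For binary x and binary w this reduces to a
    # logical AND over the selected literals, yet it stays differentiable
    # in w, so the rule can be trained by gradient descent.
    return np.prod(1.0 - w * (1.0 - x), axis=-1)

# Two binary input features; the (hypothetical) rule selects only the first.
x = np.array([[1.0, 0.0],
              [0.0, 0.0]])
w = np.array([1.0, 0.0])
print(soft_and(x, w))
```

Here the rule fires on the first row (its selected literal holds) and not on the second; rounding the learned weights to {0, 1} after training would recover a crisp, interpretable conjunction.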

