Ontology-based data mining model management for self-service knowledge discovery

Information Systems Frontiers

Abstract

Data mining (DM) models are knowledge-intensive information products that enable knowledge creation and discovery. As large volumes of data are generated at high velocity from a variety of sources, there is a pressing need to place DM model selection and self-service knowledge discovery in the hands of business users. However, existing knowledge discovery and data mining (KDDM) approaches do not sufficiently address key elements of data mining model management (DMMM), such as model sharing, selection, and reuse. Furthermore, they are framed mainly from a knowledge engineer’s perspective, so the requirements of business users are often lost. To bridge these semantic gaps, we propose an ontology-based DMMM approach for self-service model selection and knowledge discovery. We develop a DM3 ontology that translates business requirements into model selection criteria and measurements, provide a detailed deployment architecture for integrating it within an organization’s KDDM application, and use the example of a student loan company to demonstrate the utility of the DM3 ontology.
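
The idea of translating business requirements into model selection criteria and measurements can be illustrated with a minimal sketch. It is not the DM3 ontology itself: the requirement phrases, measure names, thresholds, and the `select_models` helper below are all hypothetical, and a plain dictionary stands in for the ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    measure: str    # name of a model measurement (hypothetical), e.g. "recall"
    minimum: float  # smallest acceptable value for that measurement


@dataclass
class Model:
    name: str
    measurements: dict  # measure name -> recorded value


# Hypothetical stand-in for the ontology: each business requirement
# maps to one or more measurable selection criteria.
REQUIREMENT_TO_CRITERIA = {
    "minimize loan defaults": [Criterion("recall", 0.80)],
    "explain decisions to regulators": [Criterion("interpretability", 0.70)],
}


def select_models(requirements, models):
    """Return the names of models that satisfy every criterion implied
    by the given business requirements."""
    criteria = [c for r in requirements for c in REQUIREMENT_TO_CRITERIA[r]]
    return [
        m.name
        for m in models
        # A missing measurement is treated as 0.0, so it fails the criterion.
        if all(m.measurements.get(c.measure, 0.0) >= c.minimum for c in criteria)
    ]


models = [
    Model("decision_tree", {"recall": 0.82, "interpretability": 0.90}),
    Model("neural_net", {"recall": 0.90, "interpretability": 0.30}),
]
print(select_models(["minimize loan defaults", "explain decisions to regulators"], models))
```

In this sketch the business user only states requirements; the mapping to criteria and the filtering of stored models happen behind that vocabulary, which is the separation of concerns the abstract describes.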

Notes

  1. The DM3 is available at http://128.172.188.35:8080/webprotege.

Author information

Corresponding author

Correspondence to Yan Li.

Appendix

Table 6 Assessment of DM3 ontology towards its modeling requirements

About this article

Cite this article

Li, Y., Thomas, M.A. & Osei-Bryson, KM. Ontology-based data mining model management for self-service knowledge discovery. Inf Syst Front 19, 925–943 (2017). https://doi.org/10.1007/s10796-016-9637-y

  • DOI: https://doi.org/10.1007/s10796-016-9637-y
