Abstract
The paper addresses the search for and study of optimum algorithms for the classification and semantic annotation of textual network content, with the aim of filling and updating nuclear knowledge graphs in Russian and English. The algorithms under study are tested by cross-validation. The novelty of the research lies in applying the Pareto optimality principle to the multi-criteria evaluation and ranking of the machine learning algorithms when no a priori information about the relative importance of the criteria is available. The paper also discusses features of the software implementation of efficient classification and semantic annotation algorithms as part of a scalable semantic web portal hosted on a cloud platform. The proposed software solutions rely on cloud computing with the DBaaS and PaaS service models to ensure the scalability of data warehouses and network services.
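As an illustration of ranking by Pareto optimality without criterion weights, the following minimal Python sketch peels successive Pareto fronts from a set of classifiers scored on several criteria. It is not the authors' implementation. The function names (`dominates`, `pareto_front`, `pareto_layers`), the classifier labels, the choice of precision, recall and accuracy as criteria, and all numeric scores are hypothetical and chosen only for the example. Higher values are assumed to be better on every criterion.

```python
from typing import Dict, List, Tuple

# Maps a classifier label to its scores on each criterion (higher is better).
Scores = Dict[str, Tuple[float, ...]]


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """Return True if a is no worse than b on every criterion
    and strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Scores) -> List[str]:
    """Return the labels that no other entry dominates, sorted by name."""
    return sorted(
        name
        for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    )


def pareto_layers(scores: Scores) -> List[List[str]]:
    """Rank entries by repeatedly removing the current Pareto front.
    Layer 0 holds the non-dominated entries, layer 1 the entries that
    become non-dominated once layer 0 is removed, and so on."""
    remaining = dict(scores)
    layers: List[List[str]] = []
    while remaining:
        front = pareto_front(remaining)
        layers.append(front)
        for name in front:
            del remaining[name]
    return layers


# Hypothetical cross-validated (precision, recall, accuracy) scores,
# invented for illustration only.
example: Scores = {
    "naive_bayes": (0.81, 0.78, 0.80),
    "max_entropy": (0.86, 0.80, 0.84),
    "svm": (0.84, 0.85, 0.85),
    "baseline": (0.70, 0.72, 0.71),
}

if __name__ == "__main__":
    for rank, layer in enumerate(pareto_layers(example)):
        print(rank, layer)
```

In this sketch no criterion is weighted against another: two classifiers share a layer whenever each beats the other on at least one criterion, which is the situation the absence of a priori importance information leaves unresolved.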
Notes
https://cds.cern.ch.
http://cdfe.sinp.msu.ru/index.en.html.
https://www.iaea.org/topics/nuclear-knowledge-management.
http://www.innov-rosatom.ru/suz-rosatoma/.
https://www.iaea.org/topics/nuclear-knowledge-management.
http://www.innov-rosatom.ru/suz-rosatoma/.
https://www.mathnet.ru/php/person.phtml?&personid=29853.
https://www.w3.org/TR/rdf-schema/.
https://www.w3.org/TR/owl2-overview/.
http://www.w3.org/TR/sparql11-query.
https://plato.stanford.edu/archives/spr2019/entries/bayes-theorem/.
https://www.newworldencyclopedia.org/entry/Vilfredo_Pareto.
https://www.iaea.org/.
https://cds.cern.ch.
https://eng.mephi.ru/.
https://phys.msu.ru/eng/.
http://nrcki.ru/.
http://cdfe.sinp.msu.ru/index.en.html.
https://nlp.stanford.edu/software/.
https://web.mit.edu/.
https://www.uniba.it/it/ricerca/dipartimenti/informatica.
https://www.mathcs.uni-leipzig.de/ifi.
https://www.cs.manchester.ac.uk/.
https://www.w3.org/standards/semanticweb/.
https://www.ibm.com/cloud/watson-studio.
https://developers.google.com/learn/topics/datascience.
https://aws.amazon.com/en/comprehend/features/.
https://aws.amazon.com/ru/machine-learning/.
https://datasphere.yandex.ru/.
https://www.mathworks.com/solutions/machine-learning.html.
https://nlp.stanford.edu/software/.
https://scikit-learn.org/stable/.
https://nti2035.ru/technology/competence_centers/mipt.php.
https://2030.itmo.ru/mplatform2.
https://cs.msu.ru/en.
https://www.ispras.ru/en/.
https://www.huawei.ru/.
https://rscf.ru/project/22-21-00182/
Funding
The study was supported by the Russian Science Foundation, grant no. 22-21-00182 (footnote 38).
Additional information
(Submitted by E. K. Lipachev)
Cite this article
Telnov, V.P., Korovin, Y.A. & Odintsov, K.V. On the Issue of Optimum Machine Learning Methods for Filling and Updating Nuclear Knowledge Graphs. Lobachevskii J Math 44, 227–236 (2023). https://doi.org/10.1134/S1995080223010419
DOI: https://doi.org/10.1134/S1995080223010419