Abstract
This paper proposes an effective quality analysis of XML web data based on a combined clustering and classification approach. As XML is becoming a standard for data representation, supporting keyword search over XML databases is attractive. A keyword search looks for words anywhere in a record, and it has emerged as a leading paradigm for finding data on the web. The most important requirement for keyword search is to rank the results of a query so that the most relevant results appear first. In the proposed method, a collection of XML documents is first gathered, and features are then extracted from them. Because the extracted features include both relevant and irrelevant ones, the irrelevant features must be filtered out; a probability-based feature selection method is used to select the relevant features. The relevant features are then clustered on the basis of keywords using a weighted fuzzy c-means clustering algorithm. To assess XML data quality, an optimal neural network (ONN) classifier is used, in which the optimal weights are selected by the whale optimization algorithm. In this way, the web pages are ranked effectively. The efficiency of the proposed method is assessed using clustering accuracy, classification accuracy, RMSE, and search time. The proposed method is implemented in Java.
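To make the clustering step more concrete, the following is a minimal sketch of the membership update used in standard (unweighted) fuzzy c-means. The abstract does not describe the weighting scheme of the weighted variant, so that part is deliberately left out. The class name `FuzzyMembershipSketch`, the method names, and the sample vectors are hypothetical and are not taken from the paper.

```java
import java.util.Arrays;

// Illustrative sketch only: standard fuzzy c-means membership update.
// The paper's weighted variant is not specified in the abstract and is not reproduced here.
class FuzzyMembershipSketch {

    // Squared Euclidean distance between two feature vectors of equal length.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Membership of one point in each cluster:
    // u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1)), with fuzzifier m > 1.
    static double[] memberships(double[] point, double[][] centers, double m) {
        int c = centers.length;
        double[] dist2 = new double[c];
        for (int j = 0; j < c; j++) {
            dist2[j] = squaredDistance(point, centers[j]);
            if (dist2[j] == 0.0) {
                // The point coincides with a center: full membership in that cluster.
                double[] crisp = new double[c];
                crisp[j] = 1.0;
                return crisp;
            }
        }
        // Exponent 1 / (m - 1) because it is applied to squared distances.
        double exponent = 1.0 / (m - 1.0);
        double[] u = new double[c];
        for (int j = 0; j < c; j++) {
            double denom = 0.0;
            for (int k = 0; k < c; k++) {
                denom += Math.pow(dist2[j] / dist2[k], exponent);
            }
            u[j] = 1.0 / denom;
        }
        return u;
    }

    public static void main(String[] args) {
        // Hypothetical two-dimensional feature vectors, for illustration only.
        double[][] centers = {{0.0, 0.0}, {4.0, 0.0}};
        double[] point = {1.0, 0.0};
        System.out.println(Arrays.toString(memberships(point, centers, 2.0)));
    }
}
```

Each membership lies between 0 and 1, and the memberships of a point sum to 1, so a point closer to one center receives a proportionally larger share of that cluster. A weighted variant would typically scale the distances or the memberships by per-feature or per-sample weights, but how the paper does this is not stated in the abstract.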
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Gopianand, M., Jaganathan, P. An effective quality analysis of XML web data using hybrid clustering and classification approach. Soft Comput 24, 2139–2150 (2020). https://doi.org/10.1007/s00500-019-04045-9
DOI: https://doi.org/10.1007/s00500-019-04045-9