An effective quality analysis of XML web data using hybrid clustering and classification approach

Gopianand, M.; Jaganathan, P.

doi:10.1007/s00500-019-04045-9

Methodologies and Application
Published: 27 May 2019

Volume 24, pages 2139–2150, (2020)
Soft Computing

M. Gopianand¹ &
P. Jaganathan¹

Abstract

This paper proposes a hybrid clustering and classification approach for analysing the quality of XML web data. Because XML is becoming a standard for data representation, supporting keyword search over XML databases is attractive. A keyword search looks for words anywhere in a record, and it has emerged as the dominant paradigm for finding information on the web. The most important requirement for keyword search is to rank the results of a query so that the most relevant ones appear first. In the proposed method, a collection of XML documents is gathered first, and features are then extracted from them. Since the extracted features include both relevant and irrelevant ones, the irrelevant features must be filtered out; a probability-based feature selection method is used to select the relevant features. The relevant features are then clustered on the basis of keywords using a weighted fuzzy c-means clustering algorithm. To assess XML data quality, an optimal neural network (ONN) classifier is used, whose weights are selected by the whale optimization algorithm. The web pages are thereby ranked. The proposed method is evaluated in terms of clustering accuracy, classification accuracy, RMSE, and search time, and it is implemented in Java.
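
To make the clustering and evaluation steps more concrete, the following is a minimal Java sketch of the textbook fuzzy c-means updates together with an RMSE helper. The abstract does not state how the "weighted" variant assigns its weights, so the per-feature weights in the distance function are an assumption made for illustration, as are the class name, method names, and toy data. It should not be read as the authors' implementation.

```java
import java.util.Arrays;

// Illustrative sketch only: standard fuzzy c-means with hypothetical
// per-feature weights in the distance, plus RMSE. Not the paper's code.
final class FuzzyCMeansSketch {

    /** Weighted squared Euclidean distance; w[k] is an assumed per-feature weight. */
    static double weightedSquaredDistance(double[] x, double[] c, double[] w) {
        double sum = 0.0;
        for (int k = 0; k < x.length; k++) {
            double d = x[k] - c[k];
            sum += w[k] * d * d;
        }
        return sum;
    }

    /** Membership update: u[i][j] = 1 / sum_l (d2_ij / d2_il)^(1/(m-1)), with m > 1. */
    static double[][] memberships(double[][] points, double[][] centers, double[] w, double m) {
        double exponent = 1.0 / (m - 1.0); // applied to squared distances
        double[][] u = new double[points.length][centers.length];
        for (int i = 0; i < points.length; i++) {
            double[] d2 = new double[centers.length];
            int exact = -1;
            for (int j = 0; j < centers.length; j++) {
                d2[j] = weightedSquaredDistance(points[i], centers[j], w);
                if (d2[j] == 0.0) exact = j;
            }
            if (exact >= 0) { // point coincides with a center: full membership there
                u[i][exact] = 1.0;
                continue;
            }
            for (int j = 0; j < centers.length; j++) {
                double denom = 0.0;
                for (int l = 0; l < centers.length; l++) {
                    denom += Math.pow(d2[j] / d2[l], exponent);
                }
                u[i][j] = 1.0 / denom;
            }
        }
        return u;
    }

    /** Center update: c_j = sum_i u_ij^m x_i / sum_i u_ij^m (assumes a nonzero denominator). */
    static double[][] updateCenters(double[][] points, double[][] u, int clusters, double m) {
        int dim = points[0].length;
        double[][] centers = new double[clusters][dim];
        for (int j = 0; j < clusters; j++) {
            double total = 0.0;
            for (int i = 0; i < points.length; i++) {
                double um = Math.pow(u[i][j], m);
                total += um;
                for (int k = 0; k < dim; k++) centers[j][k] += um * points[i][k];
            }
            for (int k = 0; k < dim; k++) centers[j][k] /= total;
        }
        return centers;
    }

    /** Root mean squared error between predicted and actual scores. */
    static double rmse(double[] predicted, double[] actual) {
        double sum = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double e = predicted[i] - actual[i];
            sum += e * e;
        }
        return Math.sqrt(sum / predicted.length);
    }

    public static void main(String[] args) {
        // Toy feature vectors standing in for keyword-based XML features.
        double[][] points = {{0.0, 0.1}, {0.2, 0.0}, {4.0, 4.1}, {4.2, 3.9}};
        double[][] centers = {{0.5, 0.5}, {3.5, 3.5}};
        double[] weights = {1.0, 1.0}; // hypothetical feature weights
        double m = 2.0;                // fuzzifier
        double[][] u = null;
        for (int iter = 0; iter < 10; iter++) {
            u = memberships(points, centers, weights, m);
            centers = updateCenters(points, u, centers.length, m);
        }
        System.out.println("centers: " + Arrays.deepToString(centers));
        System.out.println("memberships: " + Arrays.deepToString(u));
        System.out.println("RMSE: " + rmse(new double[]{0.9, 0.2}, new double[]{1.0, 0.0}));
    }
}
```

In this sketch each row of the membership matrix sums to 1, so a document can belong partly to several keyword clusters. Placing the weights inside the distance leaves the center update unchanged, which is one simple way to weight features; other weighting schemes are equally plausible given what the abstract says.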

Author information

Authors and Affiliations

PSNA College of Engineering and Technology, Dindigul, India
M. Gopianand & P. Jaganathan

Corresponding author

Correspondence to M. Gopianand.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gopianand, M., Jaganathan, P. An effective quality analysis of XML web data using hybrid clustering and classification approach. Soft Comput 24, 2139–2150 (2020). https://doi.org/10.1007/s00500-019-04045-9

Published: 27 May 2019
Issue Date: February 2020
DOI: https://doi.org/10.1007/s00500-019-04045-9
