Abstract
In this work, we propose an efficient, less resource-intensive strategy for parsing and analyzing switching points in code-mixed data. Specifically, we explore the rules of code-switching in Hindi–English code-mixed data. The work involves extracting code-mixed text, translating the extracted text to its pure form, forming word pairs, annotating these pairs with part-of-speech tags, and recognizing the rules that govern switching in code-mixed text. We created three models, viz. a baseline model, a lexicon-based model, and a machine learning-based model, and measured the accuracy of each.
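To make the notion of a switching point concrete, the following is a minimal sketch, not the chapter's implementation. It assumes each token already carries a language tag and a part-of-speech tag; the example sentence, its tags, and the helper names `switch_points` and `pos_patterns` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import List, Tuple

# A token is (word, language tag, POS tag).
# The tags below are hand-assigned for illustration, not taken from the chapter.
Token = Tuple[str, str, str]


def switch_points(tokens: List[Token]) -> List[Tuple[Token, Token]]:
    """Return adjacent token pairs whose language tags differ."""
    return [(a, b) for a, b in zip(tokens, tokens[1:]) if a[1] != b[1]]


def pos_patterns(tokens: List[Token]) -> Counter:
    """Count the (POS before, POS after) pattern at each switching point."""
    return Counter((a[2], b[2]) for a, b in switch_points(tokens))


# Hypothetical romanized Hindi-English sentence: "main kal office jaunga"
sentence: List[Token] = [
    ("main", "hi", "PRON"),
    ("kal", "hi", "ADV"),
    ("office", "en", "NOUN"),
    ("jaunga", "hi", "VERB"),
]

for before, after in switch_points(sentence):
    print(f"{before[0]} ({before[1]}/{before[2]}) -> {after[0]} ({after[1]}/{after[2]})")
```

Counting the part-of-speech patterns over many such sentences is one simple way candidate switching rules could be surfaced; how the chapter's three models actually derive or score rules is not specified in the abstract.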
Acknowledgements
This work is supported by Media Lab Asia, MeitY, Government of India, under the Visvesvaraya Ph.D. Scheme for Electronics & IT.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Mahata, S.K., Makhija, S., Agnihotri, A., Das, D. (2020). Analyzing Code-Switching Rules for English–Hindi Code-Mixed Text. In: Mandal, J., Bhattacharya, D. (eds) Emerging Technology in Modelling and Graphics. Advances in Intelligent Systems and Computing, vol 937. Springer, Singapore. https://doi.org/10.1007/978-981-13-7403-6_14
DOI: https://doi.org/10.1007/978-981-13-7403-6_14
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-7402-9
Online ISBN: 978-981-13-7403-6
eBook Packages: Engineering, Engineering (R0)