research-article

APIRO: A Framework for Automated Security Tools API Recommendation

Published: 13 February 2023

Abstract

Security Orchestration, Automation, and Response (SOAR) platforms integrate and orchestrate a wide variety of security tools to accelerate the operational activities of a Security Operation Center (SOC). Integration of security tools in a SOAR platform is mostly done manually using APIs, plugins, and scripts. SOC teams need to navigate through the API calls of different security tools to find a suitable API to define or update an incident response action. Automatically identifying the security tool APIs specific to a particular task requires analyzing various types of API documentation with diverse formats and presentation structures, which involves significant challenges such as data availability, data heterogeneity, and semantic variation. Given that these challenges can have a negative impact on a SOC team’s ability to handle security incidents effectively and efficiently, we consider it important to devise suitable automated support solutions to address them. We propose APIRO, a novel learning-based framework for automated security tool API Recommendation for security Orchestration, automation, and response. To mitigate the data availability constraint, APIRO enriches security tool API descriptions by applying a wide variety of data augmentation techniques. To learn the data heterogeneity of the security tools and the semantic variation in API descriptions, APIRO consists of an API-specific word embedding model and a Convolutional Neural Network (CNN) model that are used to predict the top three relevant APIs for a task. We experimentally demonstrate the effectiveness of APIRO in recommending APIs for different tasks using three security tools and 36 augmentation techniques. Our experimental results demonstrate the feasibility of APIRO, which achieves 91.9% Top-1 Accuracy. Compared to the state-of-the-art baseline, APIRO improves Top-1, Top-2, and Top-3 Accuracy by 26.93%, 23.03%, and 20.87%, respectively, and outperforms the baseline by 23.7% in terms of Mean Reciprocal Rank (MRR).
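
The abstract reports results as Top-k Accuracy and Mean Reciprocal Rank (MRR). As a reading aid, the following minimal sketch shows how these two metrics are conventionally computed for ranked recommendations. It is not the authors' evaluation code; the API names, the ranked lists, and the helper function names are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def top_k_accuracy(ranked: Sequence[Sequence[str]],
                   truth: Sequence[str], k: int) -> float:
    """Fraction of tasks whose correct API is among the first k recommendations."""
    hits = sum(1 for preds, gold in zip(ranked, truth) if gold in preds[:k])
    return hits / len(truth)


def mean_reciprocal_rank(ranked: Sequence[Sequence[str]],
                         truth: Sequence[str]) -> float:
    """Average of 1/rank of the correct API; a task contributes 0 if it is absent."""
    total = 0.0
    for preds, gold in zip(ranked, truth):
        if gold in preds:
            total += 1.0 / (list(preds).index(gold) + 1)
    return total / len(truth)


# Hypothetical top-three recommendations for four tasks (names are made up).
RANKED = [
    ["get_event", "add_event", "search"],        # correct API at rank 1
    ["add_event", "isolate_host", "get_event"],  # correct API at rank 2
    ["search", "add_event", "get_event"],        # correct API not recommended
    ["add_rule", "search", "block_ip"],          # correct API at rank 3
]
TRUTH = ["get_event", "isolate_host", "block_ip", "block_ip"]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"Top-{k} accuracy:", top_k_accuracy(RANKED, TRUTH, k))
    print("MRR:", mean_reciprocal_rank(RANKED, TRUTH))
```

Top-k Accuracy only asks whether the correct API appears within the first k positions, whereas MRR also rewards placing it higher in the list, which is why the two are usually reported together.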


          Published in

          ACM Transactions on Software Engineering and Methodology, Volume 32, Issue 1
          January 2023
          954 pages
          ISSN: 1049-331X
          EISSN: 1557-7392
          DOI: 10.1145/3572890
          Editor: Mauro Pezzè

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 13 February 2023
          • Online AM: 31 March 2022
          • Accepted: 19 January 2022
          • Revised: 14 December 2021
          • Received: 27 May 2021

          Qualifiers

          • research-article
          • Refereed