An NLP-Based Architecture for the Autocompletion of Partial Domain Models

  • Conference paper
Advanced Information Systems Engineering (CAiSE 2021)

Abstract

Domain models capture the key concepts and relationships of a business domain. Typically, domain models are manually defined by software designers in the initial phases of a software development cycle, based on their interactions with the client and their own domain expertise. Given the key role of domain models in the quality of the final system, it is important that they properly reflect the reality of the business.

To facilitate the definition of domain models and improve their quality, we propose moving towards a more assisted domain modeling process, in which an NLP-based assistant provides autocomplete suggestions for the partial model under construction. These suggestions are based on the automatic analysis of the textual information available for the project (contextual knowledge) and/or its related business domain (general knowledge). The process also takes into account the feedback collected from the designer’s interaction with the assistant. We have developed a proof-of-concept tool and performed a preliminary evaluation that shows promising results.
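As a rough illustration of the idea only, and not the authors' tool, the following Python sketch ranks candidate terms that co-occur with elements already in a partial model, drawing on a project text (contextual knowledge) and a domain text (general knowledge), and filters out suggestions the designer has rejected. The function names, the stopword list, the sentence-level co-occurrence scoring, and the heavier weight on contextual knowledge are all assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption); a real pipeline would
# rely on proper linguistic preprocessing instead.
STOPWORDS = {"a", "an", "the", "each", "has", "to", "of", "and"}


def cooccurring_terms(text, known_terms):
    """Count words that share a sentence with a term already in the model."""
    known = {t.lower() for t in known_terms}
    counts = Counter()
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = set(re.findall(r"[a-z]+", sentence))
        if words & known:  # the sentence mentions a known model element
            counts.update(words - known - STOPWORDS)
    return counts


def suggest(partial_model, contextual_text, general_text,
            rejected=(), contextual_weight=2.0, top_k=3):
    """Rank candidate terms for a partial model (hypothetical scoring)."""
    ctx = cooccurring_terms(contextual_text, partial_model)
    gen = cooccurring_terms(general_text, partial_model)
    rejected = {t.lower() for t in rejected}  # designer feedback
    scores = {
        term: contextual_weight * ctx[term] + gen[term]
        for term in set(ctx) | set(gen)
        if term not in rejected
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]
```

Calling `suggest({"customer", "order"}, project_text, domain_text)` returns up to `top_k` candidate terms, and passing previously declined terms through `rejected` keeps them out of later suggestions. Word counts are used here only to keep the sketch self-contained; they stand in for whatever NLP models an actual assistant would query.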

Supported by Spanish project TIN2016-75944-R and CEA’s initiative Modelia.

Notes

  1. According to the Cambridge dictionary: “information on many different subjects that you collect gradually, from reading, television, etc., rather than detailed information on subjects that you have studied formally”.

  2. Note that “NLP model” and “domain model” do not refer to the same type of model. In the NLP field, a model is the result of analyzing a textual corpus (it could be a trained neural network, a statistical model, etc.). To avoid confusion, in this work we always refer to an NLP model as “NLP model” and never as “model” alone.

  3. https://nlp.stanford.edu/projects/glove/, https://wikipedia2vec.github.io/wikipedia2vec/pretrained/, https://code.google.com/archive/p/word2vec/.

  4. Note that, for each model, there is a finite number of slices.

  5. In linguistics, lemmatization is the process of grouping together the inflected forms of a word so they can be analysed as a single item, identified by the word’s lemma, or dictionary form.

  6. https://github.com/modelia/model-autocompletion.

  7. https://www.eclipse.org/modeling/emf/.

  8. These documents are not publicly available due to industrial property rights. Nevertheless, the software artifacts derived from them are available in our Git repository.

Author information

Corresponding author

Correspondence to Loli Burgueño.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Burgueño, L., Clarisó, R., Gérard, S., Li, S., Cabot, J. (2021). An NLP-Based Architecture for the Autocompletion of Partial Domain Models. In: La Rosa, M., Sadiq, S., Teniente, E. (eds) Advanced Information Systems Engineering. CAiSE 2021. Lecture Notes in Computer Science, vol 12751. Springer, Cham. https://doi.org/10.1007/978-3-030-79382-1_6

  • DOI: https://doi.org/10.1007/978-3-030-79382-1_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-79381-4

  • Online ISBN: 978-3-030-79382-1

  • eBook Packages: Computer Science, Computer Science (R0)
