KBSET – Knowledge-Based Support for Scholarly Editing and Text Processing with Declarative Markup and a Core Written in SWI-Prolog

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12057)

Abstract

KBSET is an environment that supports scholarly editing in two flavors. First, as a practical tool, KBSET/Letters accompanies the development of editions of correspondences (in particular from the 18th and 19th centuries), covering the complete path from source documents to PDF and HTML presentations. Second, as a prototypical tool, KBSET/NER serves to experimentally investigate novel forms of working on editions that are centered around automated named entity recognition. KBSET can process declarative application-specific markup that is expressed in LaTeX notation and can incorporate large external fact bases that are typically provided in RDF. KBSET includes specially developed LaTeX styles and a core system written in SWI-Prolog, which is used there in many roles and thereby realizes the potential of Prolog as a unifying language.

Notes

  1. The Oxygen XML Editor. See also https://en.wikipedia.org/wiki/Comparison_of_XML_editors, accessed Nov 19 2019.

  2. In fact, [14, Sect. iv] notes that “the TEI encoding scheme itself does not depend on this language [XML]; it was originally formulated in terms of SGML (the ISO Standard Generalized Markup Language), a predecessor of XML, and may in future years be re-expressed in other ways as the field of markup develops and matures”.

  3. http://www.dnb.de/gnd. The GND is maintained by the German-speaking library community and contains information about various entities, in particular about more than 11 million persons in more than 160 million fact triples. It is in the public domain (CC0) and can be downloaded as an RDF/XML document whose decompressed size is more than 18 GB.

  4. http://kalliope-verbund.info.

  5. A specification draft is available from the KBSET home page.

  6. We do not fully adopt the principles of the GND for choosing preferred names, as forms such as “Colombo, Cristoforo” or “Homerus” are unusual in German texts.

  7. Before printing in high quality, documents in general need manual adjustments in places that cannot be handled satisfactorily by the automated layout processor.

  8. In Microsoft Windows, these scripts can be called from the Cygwin shell.

  9. This requires a syntactic conversion as “:” has a special meaning in URIs.

  10. The original fact bases used for the example application are also archived on the KBSET home page, as none of them has a persistent URL.

  11. In the sense of the discipline Artificial Intelligence, not as a synecdoche for its subfield Machine Learning.

  12. See for example the DFG (German Research Foundation) document Förderkriterien für wissenschaftliche Editionen in der Literaturwissenschaft, Ausgabe 11/2015, https://www.dfg.de/download/pdf/foerderung/grundlagen_dfg_foerderung/informationen_fachwissenschaften/geisteswissenschaften/foerderkriterien_editionen_literaturwissenschaft.pdf.

  13. Scholarly editions of correspondences that offer an openly available TEI/XML presentation include Alfred-Escher Briefedition (https://www.briefedition.alfred-escher.ch), Briefe und Texte aus dem intellektuellen Berlin um 1800 (https://www.berliner-intellektuelle.eu), Digitale Edition der Korrespondenz August Wilhelm Schlegels (https://august-wilhelm-schlegel.de), hallerNet (http://hallernet.org), and edition humboldt digital (https://edition-humboldt.de).

  14. Actually, the authors were not able (as of November 2019) to find any correspondence edition where a formal specification of the customized schema used is referenced from the TEI/XML documents or specified on the Web site. Informal edition guidelines can be found, for example, on the Web sites of Alfred-Escher Briefedition, Briefe und Texte aus dem intellektuellen Berlin um 1800 and hallerNet.

  15. The well-intentioned postulation “Um die Austauschbarkeit und Nachnutzung zu ermöglichen, werden die projektspezifisch verwendeten XML-Elemente und Attribut-Wert-Paare im TEI-Header dokumentiert” in the DFG document mentioned in footnote 12 technically cannot refer to the teiHeader element.

  16. Some of the functions of KBSET can additionally be invoked from Bash shell scripts. A Bash shell can be presupposed on Unix-like platforms and can be added, for example with Cygwin, to Microsoft Windows platforms.

  17. Considering that there is an ISO standard for Prolog, such fact bases are actually in a standardized format. However, the ISO standard for Prolog addresses only ASCII encoding. Modern implementations like SWI-Prolog support UTF-8.

  18. https://www.swi-prolog.org/features.html, accessed Nov 21 2019.
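
As an illustration of note 17, the following minimal sketch shows how fact triples could be written out as Prolog facts in a UTF-8 encoded file. It is not taken from KBSET: the function names `quote_atom`, `triple_to_fact` and `write_fact_base`, the functor `triple`, the predicate name `preferredName` and the example URI are assumptions made for this example, and Python is used only for illustration.

```python
def quote_atom(text: str) -> str:
    """Return text as a single-quoted Prolog atom with basic escapes."""
    escaped = (
        text.replace("\\", "\\\\")  # backslash first, so later escapes survive
        .replace("'", "\\'")        # quote character inside the atom
        .replace("\n", "\\n")       # raw newlines are not allowed in quoted atoms
    )
    return "'" + escaped + "'"


def triple_to_fact(subject: str, predicate: str, obj: str,
                   functor: str = "triple") -> str:
    """Format one (subject, predicate, object) triple as a Prolog fact."""
    args = ", ".join(quote_atom(a) for a in (subject, predicate, obj))
    return functor + "(" + args + ")."


def write_fact_base(triples, path: str) -> None:
    """Write triples as one fact per line; UTF-8 keeps non-ASCII names intact."""
    with open(path, "w", encoding="utf-8") as out:
        for subject, predicate, obj in triples:
            out.write(triple_to_fact(subject, predicate, obj) + "\n")


if __name__ == "__main__":
    # Hypothetical triple; the name form is the one quoted in note 6.
    print(triple_to_fact("http://example.org/person/1",
                         "preferredName",
                         "Colombo, Cristoforo"))
```

A file produced this way consists of plain facts and could be loaded by a Prolog system that reads UTF-8 source text, although a fact base of the size mentioned in note 3 would call for a streaming converter rather than an in-memory list of triples.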

References

  1. Benedikt, M., Leblay, J., ten Cate, B., Tsamoura, E.: Generating Plans from Proofs: The Interpolation-based Approach to Query Reformulation. Morgan & Claypool, San Rafael (2016)

  2. Craig, W.: Three uses of the Herbrand-Gentzen theorem in relating model theory and proof theory. J. Symbolic Logic 22(3), 269–285 (1957)

  3. Eide, O.: Ontologies, data modeling, and TEI. J. Text Encoding Initiative 8 (2015)

  4. Finkel, J.R., Grenager, T., Manning, C.: Incorporating non-local information into information extraction systems by Gibbs sampling. In: ACL 2005, pp. 363–370. ACL (2005)

  5. Hoffart, J., Suchanek, F.M., Berberich, K., Weikum, G.: YAGO2: a spatially and temporally enhanced knowledge base from Wikipedia. Artif. Intell. 194, 28–61 (2013)

  6. Kittelmann, J., Wernhard, C.: Semantik, Web, Metadaten und digitale Edition: Grundlagen und Ziele der Erschließung neuer Quellen des Branitzer Pückler-Archivs. In: Krebs, I., et al. (eds.) Resonanzen. Pücklerforschung im Spannungsfeld zwischen Wissenschaft und Kunst, pp. 179–202. trafo Verlag (2013)

  7. Kittelmann, J., Wernhard, C.: Knowledge-based support for scholarly editing and text processing. In: DHd 2016, pp. 178–181. Nisaba verlag (2016)

  8. Kittelmann, J., Wernhard, C.: Von der Transkription zur Wissensbasis. Zum Zusammenspiel von digitalen Editionstechniken und Formen der Wissensrepräsentation am Beispiel von Korrespondenzen Johann Georg Sulzers. In: Kittelmann, J., Purschwitz, A. (eds.) Aufklärungsforschung digital. Konzepte, Methoden, Perspektiven, IZEA - Kleine Schriften, vol. 10/2019, pp. 84–114. Mitteldeutscher Verlag (2019)

  9. Lehmann, J., et al.: DBpedia - a large-scale, multilingual knowledge base extracted from Wikipedia. Semant. Web 6(2), 167–195 (2015)

  10. O’Keefe, R.A.: The Craft of Prolog. The MIT Press, Cambridge (1990)

  11. Plachta, B.: Editionswissenschaft: Eine Einführung in Methode und Praxis der Edition neuerer Texte. Reclam (1997)

  12. Sahle, P.: Digitale Editionsformen. Zum Umgang mit der Überlieferung unter den Bedingungen des Medienwandels, 3 volumes, Schriften des Instituts für Dokumentologie und Editorik, vol. 7–9. Books on Demand (2013)

  13. Sulzer, J.G.: Gesammelte Schriften. Kommentierte Ausgabe. Adler, H., Décultot, E. (eds.). Schwabe (2014–2021)

  14. The TEI Consortium: TEI P5: Guidelines for Electronic Text Encoding and Interchange, Version 3.6.0. Text Encoding Initiative Consortium (2019). http://www.tei-c.org/Guidelines/P5/

  15. Toman, D., Weddell, G.: Fundamentals of Physical Design and Query Compilation. Morgan & Claypool, San Rafael (2011)

  16. Wernhard, C.: Facets of the PIE environment for proving, interpolating and eliminating on the basis of first-order logic. In: Hofstedt, P., et al. (eds.) DECLARE 2019. LNCS (LNAI), vol. 12057, pp. 160–177. Springer, Heidelberg (2020)

  17. Wielemaker, J., Schrijvers, T., Triska, M., Lager, T.: SWI-Prolog. Theory Pract. Log. Program. 12(1–2), 67–96 (2012)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Kittelmann, J., Wernhard, C. (2020). KBSET – Knowledge-Based Support for Scholarly Editing and Text Processing with Declarative Markup and a Core Written in SWI-Prolog. In: Hofstedt, P., Abreu, S., John, U., Kuchen, H., Seipel, D. (eds) Declarative Programming and Knowledge Management. INAP 2019, WLP 2019, WFLP 2019. Lecture Notes in Computer Science, vol 12057. Springer, Cham. https://doi.org/10.1007/978-3-030-46714-2_12

  • DOI: https://doi.org/10.1007/978-3-030-46714-2_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-46713-5

  • Online ISBN: 978-3-030-46714-2

  • eBook Packages: Computer Science, Computer Science (R0)
