CN-DBpedia: A Never-Ending Chinese Knowledge Extraction System

Xu, Bo; Xu, Yong; Liang, Jiaqing; Xie, Chenhao; Liang, Bin; Cui, Wanyun; Xiao, Yanghua

doi:10.1007/978-3-319-60045-1_44

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10351)

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems

Abstract

Great efforts have been dedicated to harvesting knowledge bases from online encyclopedias. These knowledge bases play important roles in enabling machines to understand texts. However, most current knowledge bases are in English, and non-English knowledge bases, especially Chinese ones, are still very rare. Many previous systems that extract knowledge from online encyclopedias, although applicable to building a Chinese knowledge base, still suffer from two challenges. The first is that constructing an ontology and building a supervised knowledge extraction model require great human effort. The second is that the knowledge bases are updated very slowly. To address these challenges, we propose a never-ending Chinese knowledge extraction system, CN-DBpedia, which automatically generates a knowledge base that is ever-increasing in size and constantly updated. Specifically, we reduce human costs by reusing the ontology of existing knowledge bases and building an end-to-end fact extraction model. We further propose a smart active update strategy that keeps our knowledge base fresh at little human cost. The 164 million API calls to the published services attest to the success of our system.

This paper was supported by the National Key Basic Research Program of China under No. 2015CB358800, by the National NSFC (Nos. 61472085, U1509213), by the Shanghai Municipal Science and Technology Commission foundation key project under No. 15JC1400900, and by the Shanghai Municipal Science and Technology project under No. 16511102102.

Author information

Authors and Affiliations

Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, Shanghai, China
Bo Xu, Yong Xu, Jiaqing Liang, Chenhao Xie, Bin Liang, Wanyun Cui & Yanghua Xiao
Data Eyes Research, Shanghai, China
Jiaqing Liang & Chenhao Xie
Shanghai Internet Big Data Engineering and Technology Center, Shanghai, China
Yanghua Xiao

Corresponding author

Correspondence to Yanghua Xiao.

Editor information

Editors and Affiliations

Artois University, Lens, France
Salem Benferhat
Artois University, Lens, France
Karim Tabia
Texas State University, San Marcos, Texas, USA
Moonis Ali

About this paper

Cite this paper

Xu, B. et al. (2017). CN-DBpedia: A Never-Ending Chinese Knowledge Extraction System. In: Benferhat, S., Tabia, K., Ali, M. (eds) Advances in Artificial Intelligence: From Theory to Practice. IEA/AIE 2017. Lecture Notes in Computer Science, vol 10351. Springer, Cham. https://doi.org/10.1007/978-3-319-60045-1_44

DOI: https://doi.org/10.1007/978-3-319-60045-1_44
Published: 03 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60044-4
Online ISBN: 978-3-319-60045-1
eBook Packages: Computer Science, Computer Science (R0)
