Named Entity Recognition in Indonesian History Textbook Using BERT Model
DOI: https://doi.org/10.31154/cogito.v11i1.880.140-151
Keywords: NER, IOB, BERT, History
Abstract
History is no longer taught as an explicit subject in some primary and secondary schools, which raises concern about the younger generation's knowledge of their nation's history. Although history textbooks are available in digital form and contain a wealth of information, they present it in an unstructured way that is difficult to process. This research develops a model for extracting historical entities from textbooks using a Named Entity Recognition (NER) approach based on BERT (Bidirectional Encoder Representations from Transformers). The text data is drawn from the history chapter of the Grade 8 Social Science textbook published by the Ministry of Education. The research stages comprise data extraction, preprocessing, IOB labeling, entity identification with the BERT model, and performance evaluation. Preprocessing successfully reduced irrelevant words and improved analysis efficiency. The BERT model achieved high performance, with a precision of 88.68%, a recall of 74.60%, and an F1-score of 81.03%. In addition, training time fluctuated between epochs, influenced by entity variation and sentence complexity. Overall, this research shows that the model can extract historical entities automatically and accurately, potentially enriching students' and society's understanding of history through Natural Language Processing technology.
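The IOB labeling stage mentioned in the abstract can be illustrated with a minimal sketch. The sentence, entity spans, and tag names below are illustrative examples, not taken from the paper's dataset; the F1 check at the end simply verifies that the reported score is consistent with the standard harmonic-mean formula applied to the reported precision and recall.

```python
# Sketch: converting token-level entity spans to IOB tags, plus the
# precision/recall/F1 relation used in the paper's evaluation.

def to_iob(tokens, spans):
    """Convert (start, end, label) half-open token spans to IOB tags."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # continuation tokens
    return tags

# Hypothetical Indonesian history sentence with PER and LOC entities
tokens = ["Proklamasi", "dibacakan", "oleh", "Ir.", "Soekarno", "di", "Jakarta"]
spans = [(3, 5, "PER"), (6, 7, "LOC")]
print(to_iob(tokens, spans))
# ['O', 'O', 'O', 'B-PER', 'I-PER', 'O', 'B-LOC']

# The reported F1-score follows from the reported precision and recall:
p, r = 0.8868, 0.7460
f1 = 2 * p * r / (p + r)
print(round(f1 * 100, 2))  # 81.03
```

In practice the BERT tokenizer splits words into subword pieces, so these word-level IOB tags must additionally be aligned to subword tokens before fine-tuning.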
License
Copyright (c) 2025 CogITo Smart Journal

This work is licensed under a Creative Commons Attribution 4.0 International License.