Cosine Similarity-Based Evidence Selection for Fact Verification Using SBERT on the FEVER Dataset
DOI:
https://doi.org/10.31154/cogito.v11i1.917.52-66

Keywords:
Semantic Similarity, Cosine Similarity, Fact Verification, Evidence Selection

Abstract
The spread of misinformation on digital platforms has underscored the urgent need for automated fact verification systems. However, selecting the most semantically relevant evidence to support or refute a claim remains a challenge, especially within the widely used FEVER dataset. Traditional approaches such as TF-IDF often fall short in capturing the contextual relationship between claims and evidence. This study addresses the problem by comparing TF-IDF with Sentence-BERT (SBERT) for measuring semantic similarity. The novelty of this research lies in embedding both claims and evidence using SBERT, then calculating cosine similarity to quantify their semantic relevance. Before embedding, standard preprocessing steps were applied, including tokenization, stemming, lowercasing, and stopword removal. A quantitative approach was used to compute cosine similarity between claim-evidence pairs using both TF-IDF and SBERT embeddings, and similarity analysis, distribution statistics, and t-tests were conducted to evaluate the two methods. The results show that SBERT achieves higher similarity for the "SUPPORTS" category (0.65) and stronger negative similarity for "NOT ENOUGH INFO" (-0.90), compared to TF-IDF (0.49 and -0.62, respectively). SBERT also exhibits more stable score distributions and significantly higher t-test values across all label comparisons, indicating stronger semantic discrimination. These findings confirm that SBERT outperforms TF-IDF in identifying the most relevant evidence. The resulting dataset can serve as a foundation for future fact verification model development.
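The claim-evidence scoring step described in the abstract can be sketched as follows. This is a minimal illustration of the TF-IDF baseline: each text is vectorized and evidence is ranked by cosine similarity to the claim. In the SBERT variant, the TF-IDF vectors would be replaced by sentence embeddings from a SentenceTransformer model; the example sentences below are illustrative, not drawn from FEVER.

```python
# Baseline sketch: rank candidate evidence by cosine similarity to a claim.
# TF-IDF vectors are used here; the paper's SBERT variant would substitute
# dense sentence embeddings before the same cosine-similarity step.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

claim = "The Eiffel Tower is located in Paris."
evidence = [
    "The Eiffel Tower is a wrought-iron tower in Paris, France.",
    "Bananas are an important food crop in tropical regions.",
]

# Fit TF-IDF on the claim and evidence together so they share one vocabulary;
# lowercasing and stopword removal mirror the preprocessing in the abstract.
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
vectors = vectorizer.fit_transform([claim] + evidence)

# Cosine similarity between the claim (row 0) and each evidence sentence
scores = cosine_similarity(vectors[0], vectors[1:]).flatten()
best = max(range(len(evidence)), key=lambda i: scores[i])
print(best)  # index of the most semantically relevant evidence sentence
```

Swapping in SBERT only changes the vectorization step: embeddings from a model such as `all-MiniLM-L6-v2` (a common checkpoint; the paper does not name one) would replace `vectors`, while the cosine-similarity ranking stays identical.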
License
Copyright (c) 2025 CogITo Smart Journal

This work is licensed under a Creative Commons Attribution 4.0 International License.