Library Document Clustering Using Machine Learning Based on the K-Means Method

Authors

  • Odaro Osayande, Edo State University, Iyamho, Edo State, Nigeria

Keywords:

Unsupervised learning, Library books clustering, K-means, keyword-frequency vectors, dimensional term-frequency (TF)

Abstract

Automatic clustering of library materials remains a central task in digital libraries and information-retrieval systems. In this study we investigate the viability of unsupervised clustering for grouping books based solely on keyword-frequency vectors extracted from their metadata and full-text abstracts. A corpus of 100 to 400 books from three distinct disciplines (History, Computer Science, and Biology) was used, with each book represented by a 250-dimensional term-frequency (TF) vector built from a curated controlled vocabulary. The k-means clustering algorithm was applied, and its performance was measured by computational efficiency (runtime and memory consumption). Results show that the computational efficiency of k-means depends on the number of books being clustered. The findings demonstrate that keyword-frequency vectors, even in a modest-size collection, provide sufficient discriminative power for reliable unsupervised learning, and that lightweight clustering (k-means) is adequate for most library-automation scenarios.
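
The pipeline described in the abstract (term-frequency vectors over a controlled vocabulary, k-means clustering, and a runtime and memory measurement) can be illustrated with a minimal sketch. This is not the author's implementation: the nine-term vocabulary, the six toy documents, and the helper names `tf_vector`, `kmeans`, and `measure` are all hypothetical, and a plain Lloyd's-algorithm k-means with random initialization is assumed because the abstract does not say which variant was used.

```python
import random
import time
import tracemalloc
from collections import Counter

# Hypothetical toy vocabulary; the study uses a curated 250-term vocabulary.
VOCAB = ["empire", "war", "treaty",
         "algorithm", "compiler", "network",
         "cell", "gene", "species"]


def tf_vector(text, vocab=VOCAB):
    """Count how often each vocabulary term occurs in the text."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocab]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm (assumed variant); returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [-1] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def measure(vectors, k):
    """Run k-means and report wall-clock time and peak traced memory."""
    tracemalloc.start()
    start = time.perf_counter()
    labels, _ = kmeans(vectors, k)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return labels, elapsed, peak


# Hypothetical stand-ins for book abstracts from the three disciplines.
docs = [
    "empire war treaty war",
    "treaty empire empire",
    "algorithm compiler network algorithm",
    "compiler network network",
    "cell gene species gene",
    "species cell cell",
]
vectors = [tf_vector(d) for d in docs]
labels, elapsed, peak = measure(vectors, k=3)
print(labels, f"{elapsed:.6f} s", f"{peak} bytes")
```

Calling `measure` on collections of increasing size is one way to observe how runtime and peak memory change with the number of books. Because the initial centroids are drawn at random, the resulting grouping can vary with the `seed` argument.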

Published

2026-03-03

How to Cite

Library Document Clustering Using Machine Learning Based on the K-Means Method. (2026). Journal of Science Computing and Applied Engineering Research, 2(2), 33-41. https://jcaes.net/index.php/jce/article/view/35
