Clustering is one of the most important concepts in unsupervised machine learning. Although numerous clustering algorithms exist, many of them, including the popular k-means algorithm, require the number of clusters to be specified in advance, which is a significant drawback. Some studies use the silhouette coefficient to determine the optimal number of clusters. In this study, we introduce a novel algorithm called Powered Outer Probabilistic Clustering, show how it works through back-propagation (starting with many clusters and ending with an optimal number of clusters), and show that the algorithm converges to the expected (optimal) number of clusters on theoretical examples.
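For context, the sketch below illustrates the silhouette-coefficient approach mentioned above (sweeping over candidate values of k and picking the best-scoring one), not the Powered Outer Probabilistic Clustering algorithm itself, which instead starts from many clusters and reduces them. It assumes scikit-learn and a synthetic dataset generated with make_blobs; all names and parameters here are illustrative choices, not taken from the referenced papers.

```python
# Illustrative sketch (not POPC): selecting the number of clusters for
# k-means using the silhouette coefficient, the baseline approach
# mentioned in the text. Assumes scikit-learn is available.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data with a known number of clusters (4) for reference.
X, _ = make_blobs(n_samples=500, centers=4, cluster_std=0.8, random_state=0)

scores = {}
for k in range(2, 11):  # the silhouette coefficient is undefined for k = 1
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print("silhouette scores:", {k: round(s, 3) for k, s in scores.items()})
print("selected number of clusters:", best_k)
```

Unlike this exhaustive search over k, the referenced algorithm arrives at the number of clusters by starting with many clusters and ending with an optimal number, as described above.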
References:
- P. Taraba: Powered Outer Probabilistic Clustering, Proceedings of the World Congress on Engineering and Computer Science, IAENG, October 2017 [best paper award]
- P. Taraba: Clustering for binary featured datasets, Transactions on Engineering Technologies: WCECS 2017, Springer, 2019
- D.J. Brunner, P. Taraba, A. Ankolekar: Personal data fusion, US11017343B2, 2018, 2021