
Browsing by Author "Mambo, Shadrack"

Now showing 1 - 3 of 3
    K-Hyperparameter Tuning in High-Dimensional Space Clustering: Solving Smooth Elbow Challenges Using an Ensemble Based Technique of a Self-Adapting Autoencoder and Internal Validation Indexes
    (Tech Science Press, 2023-10-26) Gikera, Rufus; Mwaura, Jonathan; Muuro, Elizaphan; Mambo, Shadrack
    k-means is a popular clustering algorithm because of its simplicity and scalability to large datasets. However, one of its setbacks is the challenge of identifying the correct k-hyperparameter value; tuning this value correctly is critical for building effective k-means models. Use of the traditional elbow method to identify this value has a long-standing literature. However, with certain datasets this method produces smooth curves, making the k-value hard to identify. On the other hand, the various internal validation indexes proposed as a solution to this issue may be inconsistent. Although various techniques for solving the smooth-elbow challenge exist, k-hyperparameter tuning in high-dimensional spaces remains intractable and an open research issue. In this paper, we first review the existing techniques for solving smooth-elbow challenges. The identified research gaps are then utilized in the development of a new technique, referred to as the ensemble-based technique of a self-adapting autoencoder and internal validation indexes, which is then validated in high-dimensional space clustering. The optimal k-value, tuned by this technique using a voting scheme, is a trade-off between the number of clusters visualized in the autoencoder's latent space, the k-value from the ensemble internal validation index score, and the one that yields a value of 0 (or close to 0) at the elbow for the curvature-derivative numerator f'''(k)(1 + f'(k)²) − 3f''(k)²f'(k). Experimental results based on Cochran's Q test, ANOVA, and McNemar's score indicate a relatively good performance of the newly developed technique in k-hyperparameter tuning.
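The criterion quoted in the abstract can be sketched numerically: the expression f'''(k)(1 + f'(k)²) − 3f''(k)²f'(k) is the numerator of the derivative of the curvature of the score curve f(k), so a candidate elbow is the k where a finite-difference estimate of that expression is closest to zero. A minimal sketch, assuming f is an inertia (within-cluster sum of squares) curve sampled at integer k; the function name and the toy curve are illustrative, not from the paper:

```python
import numpy as np

def elbow_by_curvature(ks, f):
    """Pick the k where the curvature-derivative numerator
    f'''(k)(1 + f'(k)^2) - 3 f''(k)^2 f'(k) is closest to zero,
    using finite-difference estimates of the derivatives."""
    ks = np.asarray(ks, dtype=float)
    f = np.asarray(f, dtype=float)
    d1 = np.gradient(f, ks)   # f'(k)
    d2 = np.gradient(d1, ks)  # f''(k)
    d3 = np.gradient(d2, ks)  # f'''(k)
    crit = d3 * (1 + d1**2) - 3 * d2**2 * d1
    inner = np.arange(2, len(ks) - 2)  # skip one-sided boundary estimates
    return int(ks[inner[np.argmin(np.abs(crit[inner]))]])

# Toy inertia curve with a smooth, hard-to-eyeball elbow
ks = np.arange(1, 13)
inertia = 100.0 / ks + 2.0 * ks
print(elbow_by_curvature(ks, inertia))
```

Note that triple finite differencing amplifies noise, which is one reason the paper combines this criterion with latent-space visualization and validation indexes in a voting scheme rather than relying on it alone.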
    Optimized K-Means clustering algorithm using an intelligent stable-plastic variational autoencoder with self-intrinsic cluster validation mechanism
    (ICONIC, 2020-09-24) Gikera, Rufus; Mambo, Shadrack; Mwaura, Jonathan
    Clustering is one of the most important tasks in exploratory data analysis [1, 55, 59]. K-means is the most popular clustering algorithm [51, 61] because of its ability to adapt to new examples and to scale up to large datasets; it is also easy to understand and computationally fast [57, 60, 3, 62]. However, the number of clusters, K, has to be specified by the user [50], and a random search for an appropriate number of clusters, run until convergence, is the norm [53, 5]. Several variants of the k-means algorithm have been proposed, geared towards optimal selection of K [8, 48]. The objective of this paper is to analyze the scaling-up problems associated with these variants for optimizing K in the k-means clustering algorithm. Finally, a more enhanced hybrid autoencoder-based k-means is developed and evaluated against the existing variants.
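As a point of reference for the "optimal selection of K" that these variants target, the baseline loop can be sketched as a grid search over candidate K values scored by a penalized within-cluster sum of squares. This is only a minimal stand-in, not the paper's stable-plastic variational autoencoder or its self-intrinsic validation mechanism; the penalty weight and the farthest-point initialization are illustrative choices:

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point seeding.
    K must be supplied up front -- the limitation the paper's variants
    aim to remove."""
    centers = [X[0]]
    for _ in range(k - 1):  # seed each new center at the farthest point
        d = ((X[:, None] - np.array(centers)) ** 2).sum(-1).min(1)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):  # Lloyd iterations: assign, then re-center
        labels = ((X[:, None] - centers) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(0)
    return labels, centers

def choose_k(X, k_range, penalty=50.0):
    """Grid-search K by within-cluster sum of squares plus a
    per-cluster penalty (an illustrative constant, not a tuned one)."""
    scores = {}
    for k in k_range:
        labels, centers = kmeans(X, k)
        wcss = ((X - centers[labels]) ** 2).sum()
        scores[k] = wcss + penalty * k
    return min(scores, key=scores.get)

# Three well-separated 2-D blobs; the search should recover K = 3
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(c, 0.3, size=(50, 2))
                    for c in [(0, 0), (10, 0), (0, 10)]])
print(choose_k(X, range(2, 7)))
```

The exhaustive loop over candidate K is exactly the scaling-up cost the paper analyzes: each candidate requires a full clustering run, which autoencoder-based variants try to avoid.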
    Trends and Advances on The K-Hyperparameter Tuning Techniques In High-Dimensional Space Clustering
    (IJAIDM, 2023-09) Gikera, Rufus Kinyua; Mwaura, Jonathan; Maina, Elizaphan; Mambo, Shadrack
    Clustering is one of the tasks performed during exploratory data analysis, with an extensive and wealthy history in a variety of disciplines; its application in computational medicine is one such use that has proliferated in the recent past. K-means algorithms are the most popular because of their ability to adapt to new examples and to scale up to large datasets; they are also easy to understand and implement. However, with k-means algorithms, k-hyperparameter tuning is a long-standing challenge, and the sparse, redundant nature of high-dimensional datasets makes tuning in high-dimensional space clustering even more difficult. Proper k-hyperparameter tuning has a significant effect on the clustering results. A number of state-of-the-art k-hyperparameter tuning techniques for high-dimensional spaces have been proposed; however, these techniques perform differently across high-dimensional datasets and data-dimensionality reduction methods. This article uses a five-step methodology to investigate the trends and advances in state-of-the-art k-hyperparameter tuning techniques in high-dimensional space clustering, the data-dimensionality reduction methods used with these techniques, their tuning strategies, the nature of the datasets they are applied to, and the challenges associated with cluster analysis in high-dimensional spaces. The metrics used in evaluating these techniques are also reviewed. The results of this review, elaborated in the discussion section, make it easier for data science researchers to undertake an empirical study among these techniques, a study that subsequently forms the basis for creating improved solutions to the k-hyperparameter tuning problem.

DSpace software copyright © 2002-2025 LYRASIS
