Browsing by Author "Mwaura, Jonathan"
Now showing 1 - 3 of 3
Item
K-Hyperparameter Tuning in High-Dimensional Genomics Using Joint Optimization of Deep Differential Evolutionary Algorithm and Unsupervised Transfer Learning from Intelligent GenoUMAP Embeddings (Int. j. inf. tecnol, 2024-07)
Gikera, Rufus; Maina, Elizaphan; Mambo, Shadrack Maina; Mwaura, Jonathan

K-hyperparameter optimization in high-dimensional genomics remains a critical challenge, impacting the quality of clustering. Improved clustering quality can enhance models for predicting patient outcomes and identifying personalized treatment plans. In turn, these enhanced models can facilitate the discovery of biomarkers, which can be essential for early diagnosis, prognosis, and treatment response in cancer research. Our paper addresses this challenge through a four-fold approach. Firstly, we empirically evaluate the k-hyperparameter optimization algorithms in genomics analysis using a correlation-based feature selection method and a stratified k-fold cross-validation strategy. Secondly, we evaluate the performance of the best optimization algorithm from the first step using a variety of dimensionality reduction methods applied to reduce the hyperparameter search spaces in genomics. Building on these two steps, we propose a novel algorithm for this optimization problem in the third step, employing a joint optimization of a Deep Differential Evolutionary Algorithm and Unsupervised Transfer Learning from Intelligent GenoUMAP (Uniform Manifold Approximation and Projection). Finally, we compare it with the existing algorithms and validate its effectiveness. Our approach leverages a UMAP-pretrained special autoencoder and integrates a deep differential evolutionary algorithm for tuning k. These choices are based on empirical analysis results. The novel algorithm balances population size for exploration and exploitation, helping to find diverse solutions and the global optimum. The learning rate balances iterations and convergence speed, leading to stable convergence towards the global optimum. UMAP's superior performance, demonstrated by short whiskers and higher median values in the comparative analysis, informs its choice for training the special autoencoder in the new algorithm. The algorithm enhances clustering by balancing reconstruction accuracy, local structure preservation, and cluster compactness. The comprehensive loss function optimizes clustering quality, promotes hyperparameter diversity, and facilitates effective knowledge transfer. This algorithm's multi-objective joint optimization makes it effective in genomics data analysis. The validation of this algorithm on three genomic datasets demonstrates superior clustering scores. Additionally, the convergence plots indicate relatively smoother curves and an excellent fitness landscape. These findings hold significant promise for advancing cancer research and computational genomics at large.

Item
K-Hyperparameter Tuning in High-Dimensional Space Clustering: Solving Smooth Elbow Challenges Using an Ensemble-Based Technique of a Self-Adapting Autoencoder and Internal Validation Indexes (Tech Science Press, 2023-10-26)
Gikera, Rufus; Mwaura, Jonathan; Muuro, Elizaphan; Mambo, Shadrack

k-means is a popular clustering algorithm because of its simplicity and scalability to handle large datasets. However, one of its setbacks is the challenge of identifying the correct k-hyperparameter value. Tuning this value correctly is critical for building effective k-means models. The use of the traditional elbow method to identify this value has a long-standing literature. However, when using this method with certain datasets, smooth curves may appear, making it challenging to identify the k-value because the elbow is unclear. On the other hand, various internal validation indexes, which are proposed as a solution to this issue, may be inconsistent. Although various techniques for solving smooth elbow challenges exist, k-hyperparameter tuning in high-dimensional spaces still remains intractable and an open research issue. In this paper, we first review the existing techniques for solving smooth elbow challenges. The identified research gaps are then utilized in the development of the new technique. The new technique, referred to as the ensemble-based technique of a self-adapting autoencoder and internal validation indexes, is then validated in high-dimensional space clustering. The optimal k-value, tuned by this technique using a voting scheme, is a trade-off between the number of clusters visualized in the autoencoder's latent space, the k-value from the ensemble internal validation index score, and the k-value that drives the derivative of the curvature, f'''(k)(1 + f'(k)^2)^(-3/2) - 3 f'(k) f''(k)^2 (1 + f'(k)^2)^(-5/2), to 0 or close to 0 at the elbow. Experimental results based on Cochran's Q test, ANOVA, and McNemar's score indicate a relatively good performance of the newly developed technique in k-hyperparameter tuning.
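For illustration, here is a minimal numerical sketch of the curvature-derivative criterion mentioned in the abstract above, assuming f(k) is the within-cluster sum of squares (inertia) reported by scikit-learn's KMeans and that the derivatives are approximated with finite differences. The synthetic data, the min-max normalization of f, and the candidate range of k are assumptions of this sketch, not details taken from the paper.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for a high-dimensional dataset (illustrative only).
X, _ = make_blobs(n_samples=500, n_features=50, centers=4, random_state=0)

ks = np.arange(1, 11)
# f(k): within-cluster sum of squares for each candidate k (the elbow curve).
f = np.array([KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_ for k in ks])

# Min-max normalize f so slope and curvature are on comparable scales
# (an assumption of this sketch, not prescribed by the paper).
fn = (f - f.min()) / (f.max() - f.min())

# Finite-difference approximations of f'(k), f''(k), f'''(k).
f1 = np.gradient(fn, ks)
f2 = np.gradient(f1, ks)
f3 = np.gradient(f2, ks)

# Curvature kappa(k) = f''(k) (1 + f'(k)^2)^(-3/2) and its derivative,
# kappa'(k) = f'''(k)(1 + f'(k)^2)^(-3/2) - 3 f'(k) f''(k)^2 (1 + f'(k)^2)^(-5/2).
kappa = f2 * (1 + f1**2) ** -1.5
dkappa = f3 * (1 + f1**2) ** -1.5 - 3 * f1 * f2**2 * (1 + f1**2) ** -2.5

# The elbow is where curvature peaks, i.e. where kappa'(k) crosses zero.
elbow = ks[np.argmax(kappa)]
print("candidate elbow k:", elbow, "| kappa'(k) there:", dkappa[np.argmax(kappa)])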
Item
Optimized K-Means clustering algorithm using an intelligent stable-plastic variational autoencoder with self-intrinsic cluster validation mechanism (ICONIC, 2020-09-24)
Gikera, Rufus; Mambo, Shadrack; Mwaura, Jonathan

Clustering is one of the most important tasks in exploratory data analysis [1, 55, 59]. K-means is among the most popular clustering algorithms [51, 61]. This is because of its ability to adapt to new examples and to scale up to large datasets; it is also easy to understand and computationally fast [57, 60, 3, 62]. However, the number of clusters, K, has to be specified by the user [50]. A random search for an appropriate number of clusters, repeated until convergence, is the norm [53, 5]. Several variants of the k-means algorithm have been proposed, geared towards optimal selection of K [8, 48]. The objective of this paper is to analyze the scaling-up problems associated with these variants for optimizing K in the k-means clustering algorithm. Finally, an enhanced hybrid autoencoder-based k-means algorithm will be developed and evaluated against the existing variants.
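The hybrid autoencoder-based k-means and its self-intrinsic validation mechanism are not specified in detail in this abstract. The sketch below only illustrates the general idea of selecting K by majority vote across internal validation indexes computed on a low-dimensional embedding; PCA stands in for the paper's variational autoencoder latent space, and the particular indexes (silhouette, Calinski-Harabasz, Davies-Bouldin), the candidate range of K, and the data are assumptions of the sketch.

import numpy as np
from collections import Counter
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score, calinski_harabasz_score, davies_bouldin_score

# Illustrative data; PCA stands in for the autoencoder's latent space.
X, _ = make_blobs(n_samples=600, n_features=40, centers=5, random_state=1)
Z = PCA(n_components=5, random_state=1).fit_transform(X)

candidate_ks = range(2, 11)
scores = {"silhouette": {}, "calinski_harabasz": {}, "davies_bouldin": {}}

for k in candidate_ks:
    labels = KMeans(n_clusters=k, n_init=10, random_state=1).fit_predict(Z)
    scores["silhouette"][k] = silhouette_score(Z, labels)
    scores["calinski_harabasz"][k] = calinski_harabasz_score(Z, labels)
    scores["davies_bouldin"][k] = davies_bouldin_score(Z, labels)

# Each index votes for its preferred K; Davies-Bouldin is minimised,
# the other two are maximised. Ties are broken by the first index listed.
votes = [
    max(scores["silhouette"], key=scores["silhouette"].get),
    max(scores["calinski_harabasz"], key=scores["calinski_harabasz"].get),
    min(scores["davies_bouldin"], key=scores["davies_bouldin"].get),
]

best_k, _ = Counter(votes).most_common(1)[0]
print("votes:", votes, "-> selected K:", best_k)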