Genetic algorithms in feature selection | |
Author | Nidapan Sureerattanan |
Call Number | AIT Diss. no.CS-02-02 |
Subject(s) | Genetic algorithms Computer algorithms |
Note | A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Technical Science, School of Advanced Technologies |
Publisher | Asian Institute of Technology |
Series Statement | Dissertation ; no. CS-02-02 |
Abstract | In recent years, the use of Genetic Algorithms (GAs) has grown rapidly, and they are now extensively used in various fields. GAs are a search method based on the paradigm of natural selection and natural genetics. With their special characteristics (a coding of the parameter set, searching from a population, an appropriate measure of fitness, and probabilistic transition rules), GAs can evolve a solution for several types of problems. Although the basic operators, crossover and mutation, work well in the vast majority of genetic algorithm implementations, an important problem remains in balancing exploration and exploitation in genetic search. This problem concerns selection pressure and population diversity. Selection pressure can have a decisive effect on the outcome of an evolutionary search. The higher the pressure, the faster the convergence, but perhaps to a local optimum. Conversely, the lower the pressure, the slower the convergence, but the greater the variation in the population, which provides raw material for adaptation. Three new computational operators for GAs are proposed. To give the characters on the worst chromosomes a chance to avoid complete loss, self-adaptive-inversion and upgrading operators are presented. A translocation operator is also introduced as another way, alongside the mutation operator, of altering values on population chromosomes. The performance of each combination of the basic GAs and the proposed operators is compared with that of the pure basic GAs. Results on the optimization functions and on real applications in the feature selection problem provide evidence that the proposed methods perform more robustly than the basic GAs, within a comparable execution time. A classification system requires the selection of a subset of relevant attributes or features from a large data set to represent the pattern to be classified.
In applying GAs to solve a specific problem, it is important to design an appropriate fitness function to guide the genetic search well. With the proposed multi-objective fitness function using multiple correlation, the empirical results on a number of real data sets demonstrate the effectiveness of the proposed GAs. Their effectiveness involves (1) handling data sets with large numbers of features, mixed types of attributes, and multi-class data; (2) eliminating irrelevant and redundant attributes while keeping the discriminating power of the original data; and (3) improving the discriminating power after elimination. |
Year | 2002 |
Corresponding Series Added Entry | Asian Institute of Technology. Dissertation ; no. CS-02-02 |
Type | Dissertation |
School | School of Advanced Technologies (SAT) |
Department | Department of Information and Communications Technologies (DICT) |
Academic Program/FoS | Computer Science (CS) |
Chairperson(s) | H.N. Phien |
Examination Committee(s) | Sadananda, R.; Tang, John S. C.; Mastorakis, Nikos El; Aekavute Sujarae |
Degree | Thesis (Ph.D.) - Asian Institute of Technology, 2002 |