AIT Asian Institute of Technology

Genetic algorithms in feature selection

Author: Nidapan Sureerattanan
Call Number: AIT Diss. no. CS-02-02
Subject(s): Genetic algorithms; Computer algorithms

Note: A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Technical Science, School of Advanced Technologies
Publisher: Asian Institute of Technology
Series Statement: Dissertation ; no. CS-02-02
Abstract: In recent years, Genetic Algorithms (GAs) have grown rapidly and are now used extensively in various fields. GAs are a search method based on the paradigm of natural selection and natural genetics. With their special characteristics (a coding of the parameter set, searching from a population, an appropriate measure of fitness, and probabilistic transition rules), GAs can evolve solutions to several types of problems. Although the basic operators, crossover and mutation, work well in the vast majority of genetic algorithm implementations, an important problem remains: balancing exploration and exploitation in genetic search. This problem concerns selection pressure and population diversity. Selection pressure can have a decisive effect on the outcome of an evolutionary search. The higher the pressure, the faster the convergence, but possibly to a local optimum. Conversely, the lower the pressure, the slower the convergence, but the greater the variation in the population, which provides raw material for adaptation. Three new computational operators for GAs are proposed. To give the characters carried by the worst chromosomes a chance of avoiding complete loss, self-adaptive-inversion and upgrading operators are presented. A translocation operator is also introduced as another way, alongside the mutation operator, of altering values on population chromosomes. The performance of each combination of the basic GA with a proposed operator is compared with that of the pure basic GA. Results on optimization functions and on real applications in the feature selection problem provide valuable evidence that the proposed methods perform more robustly than the basic GAs within a comparable execution time.
A classification system requires the selection of a subset of relevant attributes, or features, from a large data set to represent the pattern to be classified. In applying GAs to a specific problem, it is important to design an appropriate fitness function that guides the genetic search well. With the proposed multi-objective fitness function using multiple correlation, empirical results on a number of real data sets demonstrate the effectiveness of the proposed GAs. Their effectiveness involves (1) handling data sets with large numbers of features, mixed types of attributes, and multi-class data; (2) eliminating irrelevant and redundant attributes while keeping the discriminating power of the original data; and (3) improving the discriminating power after elimination.
Year: 2002
Corresponding Series Added Entry: Asian Institute of Technology. Dissertation ; no. CS-02-02
Type: Dissertation
School: School of Advanced Technologies (SAT)
Department: Department of Information and Communications Technologies (DICT)
Academic Program/FoS: Computer Science (CS)
Chairperson(s): H.N. Phien
Examination Committee(s): Sadananda, R.; Tang, John S. C.; Mastorakis, Nikos El; Aekavute Sujarae
Degree: Thesis (Ph.D.) - Asian Institute of Technology, 2002
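
Illustrative sketch (not part of the catalog record or the dissertation): the abstract's point about selection pressure can be made concrete with a basic GA over feature bit masks. Everything here is an assumption made for illustration: the names `fitness`, `tournament`, `run_ga`, `relevant`, and `penalty` are hypothetical, tournament selection is a stand-in because the record does not say which selection scheme the dissertation uses, and the toy score replaces the dissertation's multiple-correlation fitness, which the record does not specify. The three proposed operators are not reproduced.

```python
import random

def fitness(mask, relevant, penalty=0.1):
    """Toy two-objective score (assumption): reward selected features that are
    in the hypothetical `relevant` set, and penalise the size of the subset."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in relevant)
    return hits - penalty * sum(mask)

def tournament(population, scores, k, rng):
    """Return the best of k randomly drawn individuals.
    A larger k means higher selection pressure."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]

def crossover(a, b, rng):
    """One-point crossover of two equal-length bit masks."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(mask, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in mask]

def run_ga(n_features, relevant, k=2, pop_size=30, generations=40,
           rate=0.02, seed=0):
    """Basic generational GA; returns the best mask seen during the run."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=lambda m: fitness(m, relevant))
    for _ in range(generations):
        scores = [fitness(m, relevant) for m in population]
        population = [
            mutate(crossover(tournament(population, scores, k, rng),
                             tournament(population, scores, k, rng), rng),
                   rate, rng)
            for _ in range(pop_size)
        ]
        gen_best = max(population, key=lambda m: fitness(m, relevant))
        if fitness(gen_best, relevant) > fitness(best, relevant):
            best = gen_best
    return best
```

Calling `run_ga(8, {0, 3, 5}, k=2)` and `run_ga(8, {0, 3, 5}, k=8)` runs the same search under low and high selection pressure; tracking how many distinct masks survive in each generation is one way to observe the trade-off between convergence speed and population diversity that the abstract describes.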

