Selection of a Minimal Number of Significant Porcine SNPs by an Information Gain and Genetic Algorithm Hybrid Model

05/22/2019
by Wanthanee Rathasamuth, et al.

A large panel of common single nucleotide polymorphisms (SNPs) distributed across the entire porcine genome has been widely used to represent the genetic variability of pigs. With the advent of SNP-array technology, a genome-wide genetic profile of a specimen can be obtained easily. Within this large set of variations there exists a much smaller subset of SNPs that can identify the corresponding breed equally well. This work presents a SNP selection heuristic that finds such a reduced subset while remaining effective for breed classification. Feature selection combines a filter method (information gain) with a wrapper method (a genetic algorithm), followed by a feature frequency selection step; classification is performed by a support vector machine. The approach reduced the number of significant SNPs to 0.86% of the total in a swine dataset and achieved a high classification accuracy of 94.80%.
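The three selection stages named in the abstract (information gain filter, genetic algorithm wrapper, feature frequency step) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: every function name, the toy genotype data, and all parameter values (`top_k`, `runs`, `min_freq`, population size, mutation rate) are hypothetical, and a leave-one-out 1-nearest-neighbour classifier stands in for the support vector machine so that the sketch needs only the Python standard library.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(column, labels):
    """IG = H(labels) - sum_v p(v) * H(labels | genotype == v)."""
    n = len(labels)
    remainder = 0.0
    for value in set(column):
        subset = [y for x, y in zip(column, labels) if x == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

def loo_accuracy(X, y, features):
    """Leave-one-out 1-NN accuracy (Hamming distance).
    Assumption: a simple stand-in for the SVM used in the paper."""
    if not features:
        return 0.0
    hits = 0
    for i, row in enumerate(X):
        nearest = min((j for j in range(len(X)) if j != i),
                      key=lambda j: sum(row[f] != X[j][f] for f in features))
        hits += y[nearest] == y[i]
    return hits / len(X)

def ga_select(X, y, candidates, rng, pop_size=12, generations=15, p_mut=0.1):
    """Tiny genetic algorithm over bit masks of `candidates` (wrapper step)."""
    def fitness(mask):
        chosen = [f for f, bit in zip(candidates, mask) if bit]
        # Reward accuracy and lightly penalise subset size (hypothetical weight).
        return loo_accuracy(X, y, chosen) - 0.01 * len(chosen)

    pop = [[rng.random() < 0.5 for _ in candidates] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(candidates))   # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = parents + children
    best = max(pop, key=fitness)
    return [f for f, bit in zip(candidates, best) if bit]

def select_snps(X, y, top_k=6, runs=5, min_freq=3, seed=0):
    """Filter by information gain, run the GA several times,
    then keep SNPs chosen in at least `min_freq` runs."""
    n_features = len(X[0])
    gains = [information_gain([row[f] for row in X], y) for f in range(n_features)]
    candidates = sorted(range(n_features), key=lambda f: gains[f], reverse=True)[:top_k]
    counts = Counter()
    for r in range(runs):
        counts.update(ga_select(X, y, candidates, random.Random(seed + r)))
    return sorted(f for f, c in counts.items() if c >= min_freq)

def toy_data(seed=1, n_per_breed=15, n_snps=20):
    """Synthetic genotypes coded 0/1/2; not real porcine data."""
    rng = random.Random(seed)
    X, y = [], []
    for breed, genotype in (("A", 0), ("B", 2)):
        for _ in range(n_per_breed):
            row = [rng.choice([0, 1, 2]) for _ in range(n_snps)]
            row[0] = genotype  # SNP 0 perfectly separates the two breeds
            X.append(row)
            y.append(breed)
    return X, y

X, y = toy_data()
selected = select_snps(X, y)
print("selected SNP indices:", selected)
```

The filter step keeps the genetic algorithm's search space small, and counting how often each SNP survives across independent runs is one plausible reading of the "feature frequency selection step"; the paper itself should be consulted for the exact procedure and settings.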
