Developing a genetic algorithm to construct efficient binary decision trees
Deep venous thrombosis (DVT), an intrinsic disease in which blood clots form in a deep vein of the body, is a major concern for physicians. Because DVT has a high mortality rate, predicting it early is important, and simple prediction models are desirable for both potential patients and physicians. Binary decision trees are simple and practical prediction models, but trees built by conventional algorithms such as ID3 or C4.5 often suffer from excessive complexity and can even be incomprehensible. Here a genetic algorithm is used to construct binary decision trees that are simpler, more accurate, and more efficient than those constructed by ID3 or C4.5. This dissertation makes four main contributions. First, two sets of attributes that are accessible to patients and physicians are carefully selected for DVT prediction from two large databases. Second, attributes of heterogeneous types are carefully converted into binary attributes. Third, a genetic algorithm is developed to construct efficient binary decision trees, and it finds shorter and more accurate decision trees in both datasets. Fourth, an improved definition of an efficient binary decision tree is proposed and evaluated: instead of simply counting the nodes in a tree, efficiency is measured by the average number of questions the tree asks over all database entries.
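The fourth contribution can be illustrated with a minimal sketch. The abstract does not give the dissertation's exact formulation, so the code below is one plausible reading: a "question" is taken to be each internal node visited while classifying a record, and the measure is the mean of that count over all records. The names `Node`, `count_nodes`, `questions_asked`, and `average_questions`, and the toy records, are hypothetical and not taken from the dissertation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """Binary decision tree node (illustrative structure, not the author's).

    An internal node tests the binary attribute at index `attr`;
    a leaf carries a class `label` and has no children.
    """
    attr: Optional[int] = None
    no: Optional["Node"] = None    # branch taken when the attribute is 0
    yes: Optional["Node"] = None   # branch taken when the attribute is 1
    label: Optional[int] = None


def count_nodes(tree: Node) -> int:
    """Conventional size measure: total number of nodes, leaves included."""
    if tree.label is not None:
        return 1
    return 1 + count_nodes(tree.no) + count_nodes(tree.yes)


def questions_asked(tree: Node, record: Sequence[int]) -> int:
    """Number of internal nodes visited before `record` reaches a leaf."""
    asked = 0
    node = tree
    while node.label is None:
        asked += 1
        node = node.yes if record[node.attr] else node.no
    return asked


def average_questions(tree: Node, records: Sequence[Sequence[int]]) -> float:
    """Assumed efficiency measure: mean questions asked over all records."""
    return sum(questions_asked(tree, r) for r in records) / len(records)


# Toy tree: ask attribute 0 first; only if it is 1, ask attribute 1.
tree = Node(
    attr=0,
    no=Node(label=0),
    yes=Node(attr=1, no=Node(label=0), yes=Node(label=1)),
)

uniform = [[0, 0], [0, 1], [1, 0], [1, 1]]
skewed = [[0, 0], [0, 0], [0, 1], [1, 1]]

size = count_nodes(tree)                      # same for both datasets
avg_uniform = average_questions(tree, uniform)
avg_skewed = average_questions(tree, skewed)
```

Under these assumptions the node count is fixed by the tree alone, while the average number of questions also depends on how the records are distributed over the branches, so two trees of equal size can differ in efficiency on the same database.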
"Developing a genetic algorithm to construct efficient binary decision trees"
(January 1, 2010).
ETD Collection for Pace University.