Genetic Program Based Data Mining of Fuzzy Decision Trees and Methods of Improving Convergence and Reducing Bloat
A data mining procedure for automatic determination of fuzzy decision tree structure using a genetic program (GP) is discussed. A GP is an algorithm that evolves other algorithms or mathematical expressions. Innovative methods for accelerating convergence of the data mining procedure and reducing bl...
Main authors: | Smith, III, James F ; Nguyen, ThanhVu H |
---|---|
Format: | Report |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Smith, III, James F ; Nguyen, ThanhVu H |
description | A data mining procedure for automatic determination of fuzzy decision tree structure using a genetic program (GP) is discussed. A GP is an algorithm that evolves other algorithms or mathematical expressions. Innovative methods for accelerating convergence of the data mining procedure and reducing bloat are given. In genetic programming, bloat refers to excessive tree growth. It has been observed that the trees in the evolving GP population will grow by a factor of three every 50 generations. When evolving mathematical expressions much of the bloat is due to the expressions not being in algebraically simplest form. So a bloat reduction method based on automated computer algebra has been introduced. The effectiveness of this procedure is discussed. Also, rules based on fuzzy logic have been introduced into the GP to accelerate convergence, reduce bloat and produce a solution more readily understood by the human user. These rules are discussed as well as other techniques for convergence improvement and bloat control. Comparisons between trees created using a genetic program and those constructed solely by interviewing experts are made. A new co-evolutionary method that improves the control logic evolved by the GP by having a genetic algorithm evolve pathological scenarios is discussed. The effect on the control logic is considered. Finally, additional methods that have been used to validate the data mining algorithm are referenced.
Presented at the SPIE Defense + Security Conference 2007, Orlando, FL on 9-13 Apr 2007 and published in Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2007, SPIE Proceedings v6570 article 65700A. |
format | Report |
fullrecord | <record><control><sourceid>dtic_1RU</sourceid><recordid>TN_cdi_dtic_stinet_ADA524204</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>ADA524204</sourcerecordid><originalsourceid>FETCH-dtic_stinet_ADA5242043</originalsourceid><addsrcrecordid>eNqFjj0KwkAQRtNYiHoDi7mAIDEeID9GLQIi6cOwO4kLyYzubgLm9CZib_UV733wlsHrTEzeKLhZaSx2kKAjDRl6hMKw4QakhrwfxzdkpIwzwlBaIgfIGgryD9Fudq7d08owH1LhgWxDrOgr3Un3agZJK-jXwaLG1tHmt6tgm5_K9LLTU0blvJl6qjiLj2EU7qPDH_wBJPU_Ug</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>report</recordtype></control><display><type>report</type><title>Genetic Program Based Data Mining of Fuzzy Decision Trees and Methods of Improving Convergence and Reducing Bloat</title><source>DTIC Technical Reports</source><creator>Smith, III, James F ; Nguyen, ThanhVu H</creator><creatorcontrib>Smith, III, James F ; Nguyen, ThanhVu H ; NAVAL RESEARCH LAB WASHINGTON DC SURFACE ELECTRONIC WARFARE SYSTEMS BRANCH</creatorcontrib><description>A data mining procedure for automatic determination of fuzzy decision tree structure using a genetic program (GP) is discussed. A GP is an algorithm that evolves other algorithms or mathematical expressions. Innovative methods for accelerating convergence of the data mining procedure and reducing bloat are given. In genetic programming, bloat refers to excessive tree growth. It has been observed that the trees in the evolving GP population will grow by a factor of three every 50 generations. When evolving mathematical expressions much of the bloat is due to the expressions not being in algebraically simplest form. So a bloat reduction method based on automated computer algebra has been introduced. The effectiveness of this procedure is discussed. Also, rules based on fuzzy logic have been introduced into the GP to accelerate convergence, reduce bloat and produce a solution more readily understood by the human user. 
These rules are discussed as well as other techniques for convergence improvement and bloat control. Comparisons between trees created using a genetic program and those constructed solely by interviewing experts are made. A new co-evolutionary method that improves the control logic evolved by the GP by having a genetic algorithm evolve pathological scenarios is discussed. The effect on the control logic is considered. Finally, additional methods that have been used to validate the data mining algorithm are referenced.
Presented at the SPIE Defense + Security Conference 2007, Orlando, FL on 9-13 Apr 2007 and published in Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2007, SPIE Proceedings v6570 article 65700A.</description><language>eng</language><subject>Computer Programming and Software ; CONVERGENCE ; Cybernetics ; DATA MINING ; FUZZY DECISION TREES ; FUZZY LOGIC ; GENETIC ALGORITHMS ; GP(GENETIC PROGRAM) ; INFORMATION RETRIEVAL ; KNOWLEDGE DISCOVERY ; RESOURCE ALLOCATION PROGRAMS ; RESOURCE MANAGEMENT ; RM(RESOURCE MANAGERS) ; Statistics and Probability ; SYMPOSIA ; TOPOLOGY</subject><creationdate>2007</creationdate><rights>Approved for public release; distribution is unlimited.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,776,881,27544,27545</link.rule.ids><linktorsrc>$$Uhttps://apps.dtic.mil/sti/citations/ADA524204$$EView_record_in_DTIC$$FView_record_in_$$GDTIC$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Smith, III, James F</creatorcontrib><creatorcontrib>Nguyen, ThanhVu H</creatorcontrib><creatorcontrib>NAVAL RESEARCH LAB WASHINGTON DC SURFACE ELECTRONIC WARFARE SYSTEMS BRANCH</creatorcontrib><title>Genetic Program Based Data Mining of Fuzzy Decision Trees and Methods of Improving Convergence and Reducing Bloat</title><description>A data mining procedure for automatic determination of fuzzy decision tree structure using a genetic program (GP) is discussed. A GP is an algorithm that evolves other algorithms or mathematical expressions. Innovative methods for accelerating convergence of the data mining procedure and reducing bloat are given. In genetic programming, bloat refers to excessive tree growth. 
It has been observed that the trees in the evolving GP population will grow by a factor of three every 50 generations. When evolving mathematical expressions much of the bloat is due to the expressions not being in algebraically simplest form. So a bloat reduction method based on automated computer algebra has been introduced. The effectiveness of this procedure is discussed. Also, rules based on fuzzy logic have been introduced into the GP to accelerate convergence, reduce bloat and produce a solution more readily understood by the human user. These rules are discussed as well as other techniques for convergence improvement and bloat control. Comparisons between trees created using a genetic program and those constructed solely by interviewing experts are made. A new co-evolutionary method that improves the control logic evolved by the GP by having a genetic algorithm evolve pathological scenarios is discussed. The effect on the control logic is considered. Finally, additional methods that have been used to validate the data mining algorithm are referenced.
Presented at the SPIE Defense + Security Conference 2007, Orlando, FL on 9-13 Apr 2007 and published in Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2007, SPIE Proceedings v6570 article 65700A.</description><subject>Computer Programming and Software</subject><subject>CONVERGENCE</subject><subject>Cybernetics</subject><subject>DATA MINING</subject><subject>FUZZY DECISION TREES</subject><subject>FUZZY LOGIC</subject><subject>GENETIC ALGORITHMS</subject><subject>GP(GENETIC PROGRAM)</subject><subject>INFORMATION RETRIEVAL</subject><subject>KNOWLEDGE DISCOVERY</subject><subject>RESOURCE ALLOCATION PROGRAMS</subject><subject>RESOURCE MANAGEMENT</subject><subject>RM(RESOURCE MANAGERS)</subject><subject>Statistics and Probability</subject><subject>SYMPOSIA</subject><subject>TOPOLOGY</subject><fulltext>true</fulltext><rsrctype>report</rsrctype><creationdate>2007</creationdate><recordtype>report</recordtype><sourceid>1RU</sourceid><recordid>eNqFjj0KwkAQRtNYiHoDi7mAIDEeID9GLQIi6cOwO4kLyYzubgLm9CZib_UV733wlsHrTEzeKLhZaSx2kKAjDRl6hMKw4QakhrwfxzdkpIwzwlBaIgfIGgryD9Fudq7d08owH1LhgWxDrOgr3Un3agZJK-jXwaLG1tHmt6tgm5_K9LLTU0blvJl6qjiLj2EU7qPDH_wBJPU_Ug</recordid><startdate>200704</startdate><enddate>200704</enddate><creator>Smith, III, James F</creator><creator>Nguyen, ThanhVu H</creator><scope>1RU</scope><scope>BHM</scope></search><sort><creationdate>200704</creationdate><title>Genetic Program Based Data Mining of Fuzzy Decision Trees and Methods of Improving Convergence and Reducing Bloat</title><author>Smith, III, James F ; Nguyen, ThanhVu H</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-dtic_stinet_ADA5242043</frbrgroupid><rsrctype>reports</rsrctype><prefilter>reports</prefilter><language>eng</language><creationdate>2007</creationdate><topic>Computer Programming and Software</topic><topic>CONVERGENCE</topic><topic>Cybernetics</topic><topic>DATA MINING</topic><topic>FUZZY DECISION TREES</topic><topic>FUZZY 
LOGIC</topic><topic>GENETIC ALGORITHMS</topic><topic>GP(GENETIC PROGRAM)</topic><topic>INFORMATION RETRIEVAL</topic><topic>KNOWLEDGE DISCOVERY</topic><topic>RESOURCE ALLOCATION PROGRAMS</topic><topic>RESOURCE MANAGEMENT</topic><topic>RM(RESOURCE MANAGERS)</topic><topic>Statistics and Probability</topic><topic>SYMPOSIA</topic><topic>TOPOLOGY</topic><toplevel>online_resources</toplevel><creatorcontrib>Smith, III, James F</creatorcontrib><creatorcontrib>Nguyen, ThanhVu H</creatorcontrib><creatorcontrib>NAVAL RESEARCH LAB WASHINGTON DC SURFACE ELECTRONIC WARFARE SYSTEMS BRANCH</creatorcontrib><collection>DTIC Technical Reports</collection><collection>DTIC STINET</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Smith, III, James F</au><au>Nguyen, ThanhVu H</au><aucorp>NAVAL RESEARCH LAB WASHINGTON DC SURFACE ELECTRONIC WARFARE SYSTEMS BRANCH</aucorp><format>book</format><genre>unknown</genre><ristype>RPRT</ristype><btitle>Genetic Program Based Data Mining of Fuzzy Decision Trees and Methods of Improving Convergence and Reducing Bloat</btitle><date>2007-04</date><risdate>2007</risdate><abstract>A data mining procedure for automatic determination of fuzzy decision tree structure using a genetic program (GP) is discussed. A GP is an algorithm that evolves other algorithms or mathematical expressions. Innovative methods for accelerating convergence of the data mining procedure and reducing bloat are given. In genetic programming, bloat refers to excessive tree growth. It has been observed that the trees in the evolving GP population will grow by a factor of three every 50 generations. When evolving mathematical expressions much of the bloat is due to the expressions not being in algebraically simplest form. So a bloat reduction method based on automated computer algebra has been introduced. The effectiveness of this procedure is discussed. 
Also, rules based on fuzzy logic have been introduced into the GP to accelerate convergence, reduce bloat and produce a solution more readily understood by the human user. These rules are discussed as well as other techniques for convergence improvement and bloat control. Comparisons between trees created using a genetic program and those constructed solely by interviewing experts are made. A new co-evolutionary method that improves the control logic evolved by the GP by having a genetic algorithm evolve pathological scenarios is discussed. The effect on the control logic is considered. Finally, additional methods that have been used to validate the data mining algorithm are referenced.
Presented at the SPIE Defense + Security Conference 2007, Orlando, FL on 9-13 Apr 2007 and published in Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2007, SPIE Proceedings v6570 article 65700A.</abstract><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | |
ispartof | |
issn | |
language | eng |
recordid | cdi_dtic_stinet_ADA524204 |
source | DTIC Technical Reports |
subjects | Computer Programming and Software ; CONVERGENCE ; Cybernetics ; DATA MINING ; FUZZY DECISION TREES ; FUZZY LOGIC ; GENETIC ALGORITHMS ; GP(GENETIC PROGRAM) ; INFORMATION RETRIEVAL ; KNOWLEDGE DISCOVERY ; RESOURCE ALLOCATION PROGRAMS ; RESOURCE MANAGEMENT ; RM(RESOURCE MANAGERS) ; Statistics and Probability ; SYMPOSIA ; TOPOLOGY |
title | Genetic Program Based Data Mining of Fuzzy Decision Trees and Methods of Improving Convergence and Reducing Bloat |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T16%3A51%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-dtic_1RU&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=unknown&rft.btitle=Genetic%20Program%20Based%20Data%20Mining%20of%20Fuzzy%20Decision%20Trees%20and%20Methods%20of%20Improving%20Convergence%20and%20Reducing%20Bloat&rft.au=Smith,%20III,%20James%20F&rft.aucorp=NAVAL%20RESEARCH%20LAB%20WASHINGTON%20DC%20SURFACE%20ELECTRONIC%20WARFARE%20SYSTEMS%20BRANCH&rft.date=2007-04&rft_id=info:doi/&rft_dat=%3Cdtic_1RU%3EADA524204%3C/dtic_1RU%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
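
The description field above attributes much of the bloat in evolved mathematical expressions to their not being in algebraically simplest form, and says a reduction method based on automated computer algebra was introduced. The record does not describe that method, so the following is only a minimal sketch of the general idea, under stated assumptions: expression trees are modelled as nested tuples `(op, left, right)`, and the handful of rewrite rules, the `size` and `simplify` helpers, and the sample tree are all hypothetical illustrations rather than the authors' implementation.

```python
from typing import Union

# Hypothetical representation: a constant, a variable name, or (op, left, right).
Expr = Union[int, float, str, tuple]


def size(expr: Expr) -> int:
    """Count the nodes in an expression tree (a simple bloat measure)."""
    if isinstance(expr, tuple):
        _, left, right = expr
        return 1 + size(left) + size(right)
    return 1


def simplify(expr: Expr) -> Expr:
    """Apply a few algebraic identities bottom-up (illustrative rules only)."""
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = simplify(left), simplify(right)
    both_const = isinstance(left, (int, float)) and isinstance(right, (int, float))
    if op == "+":
        if both_const:
            return left + right      # constant folding
        if left == 0:
            return right             # 0 + e -> e
        if right == 0:
            return left              # e + 0 -> e
    elif op == "-":
        if both_const:
            return left - right
        if right == 0:
            return left              # e - 0 -> e
        if left == right:
            return 0                 # e - e -> 0
    elif op == "*":
        if both_const:
            return left * right
        if left == 0 or right == 0:
            return 0                 # 0 * e -> 0
        if left == 1:
            return right             # 1 * e -> e
        if right == 1:
            return left              # e * 1 -> e
    return (op, left, right)


# A bloated tree encoding x*1 + (y - y)*(x + 3), which is just x.
bloated = ("+", ("*", "x", 1), ("*", ("-", "y", "y"), ("+", "x", 3)))
reduced = simplify(bloated)
print(size(bloated), "->", size(reduced), reduced)
```

In this toy case an 11-node tree collapses to the single node `x`. For scale, the growth the abstract reports (a factor of three every 50 generations) corresponds to roughly 2.2% growth per generation, since 3^(1/50) ≈ 1.022.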