Security Relevant Methods of Android's API Classification: A Machine Learning Empirical Evaluation

The Android operating system provides functions and methods to handle sensitive data to secure users' data. The Android security literature extracts binary features from a method and classifies the method into one of the Security Relevant Method's classes, adding information about how the...

Detailed description


Bibliographic details
Published in:	IEEE transactions on computers, 2023-11, Vol.72 (11), p.1-13
Main authors:	Rodrigues, Walber M.; Walmsley, Felipe N.; Cavalcanti, George D. C.; Cruz, Rafael M. O.
Format:	Article
Language:	eng
Subjects:	Algorithms; Ambiguity; Android security; API Classification; Binary features; Classification; Classification algorithms; Classifiers; Codes; Collision rates; Decision analysis; Decision trees; Embedding; Empirical analysis; Feature extraction; Handles; Machine learning; Machine learning algorithms; Multiple Classifier Systems; Pipelines; Security

container_end_page	13
container_issue	11
container_start_page	1
container_title	IEEE transactions on computers
container_volume	72
creator	Rodrigues, Walber M.; Walmsley, Felipe N.; Cavalcanti, George D. C.; Cruz, Rafael M. O.
description	The Android operating system provides functions and methods that handle sensitive data in order to secure users' data. The Android security literature extracts binary features from a method and classifies the method into one of the Security Relevant Method's classes, adding information about how the method handles sensitive data. However, the usage of binary features hinders the performance of some classifiers due to the high collision rate between instances. Although previous works have explored Security Relevant Method classification, an extensive study of machine learning algorithms over this problem has not been conducted. This work fills this gap, analyzing Monolithic classifiers, Multiple Classifier Systems, and Embedding algorithms to transform binary features into real-valued features, aiming to facilitate the classifier's work by minimizing the ambiguity promoted by the collision. Our analyses show that META-DES, using a pool of Decision Trees trained with the Random Forest algorithm, statistically has the best results. We also find that, in general, distance-based classifiers are at a disadvantage on binary features. Moreover, embedding techniques such as deep metric learning with triplet loss can reduce geometrical instance ambiguity, improving the performance of the weakest learning algorithms. However, their usage was detrimental to the performance of more robust techniques, such as dynamic ensemble models better suited for handling difficult cases. The dataset and code used for the experiments are available in the following repository: https://github.com/walbermr/android-srm-ml-evaluation
doi_str_mv	10.1109/TC.2023.3291998
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2875578352</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10183829</ieee_id><sourcerecordid>2875578352</sourcerecordid><originalsourceid>FETCH-LOGICAL-c244t-89b1ab546bcba02e8ef4ca2f5346cb5c64410b895ed2e5892679ba65d71779413</originalsourceid><addsrcrecordid>eNpNkD1PwzAARC0EEqUwszBYYmBK68_EZouiApVagaDMlu041FWaFDup1H9PShmYbnl3Jz0AbjGaYIzkdFVMCCJ0QonEUoozMMKcZ4mUPD0HI4SwSCRl6BJcxbhBCKUEyREwH872wXcH-O5qt9dNB5euW7dlhG0F86YMrS8fIszf5rCodYy-8lZ3vm0eYQ6X2q594-DC6dD45gvOtjsfBqCGs72u-1_wGlxUuo7u5i_H4PNptipeksXr87zIF4kljHWJkAZrw1lqrNGIOOEqZjWpOGWpNdymjGFkhOSuJI4LSdJMGp3yMsNZJhmmY3B_2t2F9rt3sVObtg_NcKmIyAYXgnIyUNMTZUMbY3CV2gW_1eGgMFJHkWpVqKNI9SdyaNydGt4594_Gggoi6Q_2dW4q</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2875578352</pqid></control><display><type>article</type><title>Security Relevant Methods of Android's API Classification: A Machine Learning Empirical Evaluation</title><source>IEEE Electronic Library (IEL)</source><creator>Rodrigues, Walber M. ; Walmsley, Felipe N. ; Cavalcanti, George D. C. ; Cruz, Rafael M. O.</creator><creatorcontrib>Rodrigues, Walber M. ; Walmsley, Felipe N. ; Cavalcanti, George D. C. ; Cruz, Rafael M. O.</creatorcontrib><description>The Android operating system provides functions and methods to handle sensitive data to secure users' data. The Android security literature extracts binary features from a method and classifies the method into one of the Security Relevant Method's classes, adding information about how the method handles sensitive data. However, the usage of binary features hinders the performance of some classifiers due to the high collision rate between instances. Although previous works have explored Security Relevant Method classification, an extensive study of machine learning algorithms over this problem has not been conceived. 
This work fills this gap, analyzing Monolithic classifiers, Multiple Classifier Systems, and Embedding algorithms to transform binary features into real-valued features, aiming to facilitate the classifier's work by minimizing the ambiguity promoted by the collision. Our analyzes show that META-DES, using a pool of Decision Trees trained with the Random Forest algorithm, statistically has the best results. We also find that, in general, distance-based classifiers have a disadvantage in binary features. Moreover, embedding techniques such as deep metric learning with triplet loss can reduce geometrical instance ambiguity, improving the performance of the weakest learning algorithms. However, its usage was detrimental to the performance of more robust techniques, such as dynamic ensemble models better suited for handling difficult cases. The dataset and code used for the experiments are available in the following repository: https://github.com/walbermr/android-srm-ml-evaluation .</description><identifier>ISSN: 0018-9340</identifier><identifier>EISSN: 1557-9956</identifier><identifier>DOI: 10.1109/TC.2023.3291998</identifier><identifier>CODEN: ITCOB4</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Algorithms ; Ambiguity ; Android security ; API Classification ; Binary features ; Classification ; Classification algorithms ; Classifiers ; Codes ; Collision rates ; Decision analysis ; Decision trees ; Embedding ; Empirical analysis ; Feature extraction ; Handles ; Machine learning ; Machine learning algorithms ; Multiple Classifier Systems ; Pipelines ; Security</subject><ispartof>IEEE transactions on computers, 2023-11, Vol.72 (11), p.1-13</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2023</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c244t-89b1ab546bcba02e8ef4ca2f5346cb5c64410b895ed2e5892679ba65d71779413</cites><orcidid>0000-0001-9446-1040 ; 0000-0003-1410-1059 ; 0000-0001-7714-2283 ; 0000-0002-8809-6304</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10183829$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27903,27904,54736</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10183829$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Rodrigues, Walber M.</creatorcontrib><creatorcontrib>Walmsley, Felipe N.</creatorcontrib><creatorcontrib>Cavalcanti, George D. C.</creatorcontrib><creatorcontrib>Cruz, Rafael M. O.</creatorcontrib><title>Security Relevant Methods of Android's API Classification: A Machine Learning Empirical Evaluation</title><title>IEEE transactions on computers</title><addtitle>TC</addtitle><description>The Android operating system provides functions and methods to handle sensitive data to secure users' data. The Android security literature extracts binary features from a method and classifies the method into one of the Security Relevant Method's classes, adding information about how the method handles sensitive data. However, the usage of binary features hinders the performance of some classifiers due to the high collision rate between instances. Although previous works have explored Security Relevant Method classification, an extensive study of machine learning algorithms over this problem has not been conceived. 
This work fills this gap, analyzing Monolithic classifiers, Multiple Classifier Systems, and Embedding algorithms to transform binary features into real-valued features, aiming to facilitate the classifier's work by minimizing the ambiguity promoted by the collision. Our analyzes show that META-DES, using a pool of Decision Trees trained with the Random Forest algorithm, statistically has the best results. We also find that, in general, distance-based classifiers have a disadvantage in binary features. Moreover, embedding techniques such as deep metric learning with triplet loss can reduce geometrical instance ambiguity, improving the performance of the weakest learning algorithms. However, its usage was detrimental to the performance of more robust techniques, such as dynamic ensemble models better suited for handling difficult cases. The dataset and code used for the experiments are available in the following repository: https://github.com/walbermr/android-srm-ml-evaluation .</description><subject>Algorithms</subject><subject>Ambiguity</subject><subject>Android security</subject><subject>API Classification</subject><subject>Binary features</subject><subject>Classification</subject><subject>Classification algorithms</subject><subject>Classifiers</subject><subject>Codes</subject><subject>Collision rates</subject><subject>Decision analysis</subject><subject>Decision trees</subject><subject>Embedding</subject><subject>Empirical analysis</subject><subject>Feature extraction</subject><subject>Handles</subject><subject>Machine learning</subject><subject>Machine learning algorithms</subject><subject>Multiple Classifier 
Systems</subject><subject>Pipelines</subject><subject>Security</subject><issn>0018-9340</issn><issn>1557-9956</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkD1PwzAARC0EEqUwszBYYmBK68_EZouiApVagaDMlu041FWaFDup1H9PShmYbnl3Jz0AbjGaYIzkdFVMCCJ0QonEUoozMMKcZ4mUPD0HI4SwSCRl6BJcxbhBCKUEyREwH872wXcH-O5qt9dNB5euW7dlhG0F86YMrS8fIszf5rCodYy-8lZ3vm0eYQ6X2q594-DC6dD45gvOtjsfBqCGs72u-1_wGlxUuo7u5i_H4PNptipeksXr87zIF4kljHWJkAZrw1lqrNGIOOEqZjWpOGWpNdymjGFkhOSuJI4LSdJMGp3yMsNZJhmmY3B_2t2F9rt3sVObtg_NcKmIyAYXgnIyUNMTZUMbY3CV2gW_1eGgMFJHkWpVqKNI9SdyaNydGt4594_Gggoi6Q_2dW4q</recordid><startdate>20231101</startdate><enddate>20231101</enddate><creator>Rodrigues, Walber M.</creator><creator>Walmsley, Felipe N.</creator><creator>Cavalcanti, George D. C.</creator><creator>Cruz, Rafael M. O.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0001-9446-1040</orcidid><orcidid>https://orcid.org/0000-0003-1410-1059</orcidid><orcidid>https://orcid.org/0000-0001-7714-2283</orcidid><orcidid>https://orcid.org/0000-0002-8809-6304</orcidid></search><sort><creationdate>20231101</creationdate><title>Security Relevant Methods of Android's API Classification: A Machine Learning Empirical Evaluation</title><author>Rodrigues, Walber M. ; Walmsley, Felipe N. ; Cavalcanti, George D. C. ; Cruz, Rafael M. 
O.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c244t-89b1ab546bcba02e8ef4ca2f5346cb5c64410b895ed2e5892679ba65d71779413</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Algorithms</topic><topic>Ambiguity</topic><topic>Android security</topic><topic>API Classification</topic><topic>Binary features</topic><topic>Classification</topic><topic>Classification algorithms</topic><topic>Classifiers</topic><topic>Codes</topic><topic>Collision rates</topic><topic>Decision analysis</topic><topic>Decision trees</topic><topic>Embedding</topic><topic>Empirical analysis</topic><topic>Feature extraction</topic><topic>Handles</topic><topic>Machine learning</topic><topic>Machine learning algorithms</topic><topic>Multiple Classifier Systems</topic><topic>Pipelines</topic><topic>Security</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Rodrigues, Walber M.</creatorcontrib><creatorcontrib>Walmsley, Felipe N.</creatorcontrib><creatorcontrib>Cavalcanti, George D. C.</creatorcontrib><creatorcontrib>Cruz, Rafael M. 
O.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on computers</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Rodrigues, Walber M.</au><au>Walmsley, Felipe N.</au><au>Cavalcanti, George D. C.</au><au>Cruz, Rafael M. O.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Security Relevant Methods of Android's API Classification: A Machine Learning Empirical Evaluation</atitle><jtitle>IEEE transactions on computers</jtitle><stitle>TC</stitle><date>2023-11-01</date><risdate>2023</risdate><volume>72</volume><issue>11</issue><spage>1</spage><epage>13</epage><pages>1-13</pages><issn>0018-9340</issn><eissn>1557-9956</eissn><coden>ITCOB4</coden><abstract>The Android operating system provides functions and methods to handle sensitive data to secure users' data. The Android security literature extracts binary features from a method and classifies the method into one of the Security Relevant Method's classes, adding information about how the method handles sensitive data. However, the usage of binary features hinders the performance of some classifiers due to the high collision rate between instances. 
Although previous works have explored Security Relevant Method classification, an extensive study of machine learning algorithms over this problem has not been conceived. This work fills this gap, analyzing Monolithic classifiers, Multiple Classifier Systems, and Embedding algorithms to transform binary features into real-valued features, aiming to facilitate the classifier's work by minimizing the ambiguity promoted by the collision. Our analyzes show that META-DES, using a pool of Decision Trees trained with the Random Forest algorithm, statistically has the best results. We also find that, in general, distance-based classifiers have a disadvantage in binary features. Moreover, embedding techniques such as deep metric learning with triplet loss can reduce geometrical instance ambiguity, improving the performance of the weakest learning algorithms. However, its usage was detrimental to the performance of more robust techniques, such as dynamic ensemble models better suited for handling difficult cases. The dataset and code used for the experiments are available in the following repository: https://github.com/walbermr/android-srm-ml-evaluation .</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TC.2023.3291998</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0001-9446-1040</orcidid><orcidid>https://orcid.org/0000-0003-1410-1059</orcidid><orcidid>https://orcid.org/0000-0001-7714-2283</orcidid><orcidid>https://orcid.org/0000-0002-8809-6304</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 0018-9340
ispartof	IEEE transactions on computers, 2023-11, Vol.72 (11), p.1-13
issn	0018-9340 1557-9956
language	eng
recordid	cdi_proquest_journals_2875578352
source	IEEE Electronic Library (IEL)
subjects	Algorithms; Ambiguity; Android security; API Classification; Binary features; Classification; Classification algorithms; Classifiers; Codes; Collision rates; Decision analysis; Decision trees; Embedding; Empirical analysis; Feature extraction; Handles; Machine learning; Machine learning algorithms; Multiple Classifier Systems; Pipelines; Security
title	Security Relevant Methods of Android's API Classification: A Machine Learning Empirical Evaluation
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T18%3A34%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Security%20Relevant%20Methods%20of%20Android's%20API%20Classification:%20A%20Machine%20Learning%20Empirical%20Evaluation&rft.jtitle=IEEE%20transactions%20on%20computers&rft.au=Rodrigues,%20Walber%20M.&rft.date=2023-11-01&rft.volume=72&rft.issue=11&rft.spage=1&rft.epage=13&rft.pages=1-13&rft.issn=0018-9340&rft.eissn=1557-9956&rft.coden=ITCOB4&rft_id=info:doi/10.1109/TC.2023.3291998&rft_dat=%3Cproquest_RIE%3E2875578352%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2875578352&rft_id=info:pmid/&rft_ieee_id=10183829&rfr_iscdi=true