BLR-D: applying bilinear logistic regression to factored diagnosis problems

In this paper, we address a pattern of diagnosis problems in which each of J entities produces the same K features, yet we are only informed of overall faults from the ensemble. Furthermore, we suspect that only certain entities and certain features are leading to the problem. The task, then, is to...

Bibliographic details
Published in: Operating systems review 2012-01, Vol.45 (3), p.31-38
Main authors: Basu, Sumit; Dunagan, John; Duh, Kevin; Muniswamy-Reddy, Kiran-Kumar
Format: Article
Language: eng
Online access: Full text
container_end_page 38
container_issue 3
container_start_page 31
container_title Operating systems review
container_volume 45
creator Basu, Sumit
Dunagan, John
Duh, Kevin
Muniswamy-Reddy, Kiran-Kumar
description In this paper, we address a pattern of diagnosis problems in which each of J entities produces the same K features, yet we are only informed of overall faults from the ensemble. Furthermore, we suspect that only certain entities and certain features are leading to the problem. The task, then, is to reliably identify which entities and which features are at fault. Such problems are particularly prevalent in the world of computer systems, in which a datacenter with hundreds of machines, each with the same performance counters, occasionally produces overall faults. In this paper, we present a means of using a constrained form of bilinear logistic regression for diagnosis in such problems. The bilinear treatment allows us to represent the scenarios with J+K instead of JK parameters, resulting in more easily interpretable results and far fewer false positives compared to treating the parameters independently. We develop statistical tests to determine which features and entities, if any, may be responsible for the labeled faults, and use false discovery rate (FDR) analysis to ensure that our values are meaningful. We show results in comparison to ordinary logistic regression (with L1 regularization) on two scenarios: a synthetic dataset based on a model of faults in a datacenter, and a real problem of finding problematic processes/features based on user-reported hangs.
doi_str_mv 10.1145/2094091.2094100
format Article
fulltext fulltext
identifier ISSN: 0163-5980
ispartof Operating systems review, 2012-01, Vol.45 (3), p.31-38
issn 0163-5980
language eng
recordid cdi_crossref_primary_10_1145_2094091_2094100
source ACM Digital Library Complete
title BLR-D: applying bilinear logistic regression to factored diagnosis problems
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T16%3A22%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=BLR-D:%20applying%20bilinear%20logistic%20regression%20to%20factored%20diagnosis%20problems&rft.jtitle=Operating%20systems%20review&rft.au=Basu,%20Sumit&rft.date=2012-01-11&rft.volume=45&rft.issue=3&rft.spage=31&rft.epage=38&rft.pages=31-38&rft.issn=0163-5980&rft_id=info:doi/10.1145/2094091.2094100&rft_dat=%3Ccrossref%3E10_1145_2094091_2094100%3C/crossref%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
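
The description's parameter-count claim (J+K instead of JK) can be illustrated with a minimal sketch. The abstract does not give the model equations, so everything here is an assumption made for illustration: the rank-1 form `a^T X b + c`, the bias term `c`, the function names, and the toy data. The constraints the authors place on the parameters are not stated in the abstract and are not modeled.

```python
import numpy as np

def bilinear_logit(X, a, b, c=0.0):
    """Logit a^T X_n b + c for each sample (assumed form, not from the paper).

    X: array of shape (N, J, K), K features for each of J entities.
    a: array of shape (J,), one weight per entity.
    b: array of shape (K,), one weight per feature.
    """
    return np.einsum("j,njk,k->n", a, X, b) + c

def fault_probability(X, a, b, c=0.0):
    """Logistic function applied to the bilinear logit."""
    return 1.0 / (1.0 + np.exp(-bilinear_logit(X, a, b, c)))

# Toy setup: J = 4 entities, K = 3 features, N = 5 observations.
J, K, N = 4, 3, 5
rng = np.random.default_rng(0)
X = rng.normal(size=(N, J, K))

# Hypothetical weights: only entity 1 and feature 2 matter.
a = np.array([0.0, 1.0, 0.0, 0.0])
b = np.array([0.0, 0.0, 2.0])

p = fault_probability(X, a, b)

# The bilinear model is equivalent to an ordinary logistic regression whose
# J*K weight matrix is constrained to the rank-1 outer product of a and b.
W = np.outer(a, b)
full_logit = (X * W).sum(axis=(1, 2))

n_bilinear = a.size + b.size   # J + K weights (bias excluded)
n_full = W.size                # J * K weights (bias excluded)
```

With these toy sizes the bilinear form uses 7 weights where the unconstrained form uses 12; the gap widens quickly as J and K grow, which is the interpretability argument the description makes.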