DEVICE AND METHOD FOR CONSTRUCTING DIFFERENTIALLY PRIVATE DECISION TREES

Bibliographic details
Main authors: BAEK INCHUL, CHUNG YON DOHN
Format: Patent
Language: eng; kor
Description
Summary: Disclosed are a method and an apparatus for generating a decision tree based on differential privacy, which can train a differentially private explainable boosting machine (DP-EBM) model that provides high accuracy while protecting personal privacy at a high level. The method is performed by a computing device including at least a processor, and comprises the steps of: generating a histogram for each feature of the data; and training the DP-EBM model by using the histograms. Training the DP-EBM model includes the steps of: calculating a feature score, which is the degree to which each feature contributes to predicting the correct answer, and a noise score, which is the degree to which noise contributes to predicting the correct answer; performing feature pruning based on the feature score and the noise score; and reallocating the privacy budgets allocated to the pruned features.
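The pruning and budget reallocation steps named in the summary can be illustrated with a minimal Python sketch. The summary does not state the pruning criterion or the reallocation rule, so both are assumptions here: a feature is pruned when its feature score does not exceed its noise score, and the freed privacy budget is split evenly among the remaining features. The function name `prune_and_reallocate` and the example feature names and values are hypothetical and do not come from the patent.

```python
def prune_and_reallocate(feature_scores, noise_scores, budgets):
    """Prune features that score no better than noise and share their budget.

    Assumed rule (not from the patent): keep a feature only if its feature
    score is strictly greater than its noise score, then divide the budget
    of the pruned features evenly among the kept features.
    """
    kept = [f for f in feature_scores if feature_scores[f] > noise_scores[f]]
    pruned = [f for f in feature_scores if f not in kept]
    if not kept:
        # Nothing survives pruning, so there is nothing to reallocate to.
        return {}, pruned
    freed = sum(budgets[f] for f in pruned)
    bonus = freed / len(kept)
    new_budgets = {f: budgets[f] + bonus for f in kept}
    return new_budgets, pruned


# Hypothetical scores and per-feature privacy budgets (epsilon).
feature_scores = {"age": 0.42, "income": 0.31, "zip": 0.05}
noise_scores = {"age": 0.10, "income": 0.10, "zip": 0.10}
budgets = {"age": 1.0, "income": 1.0, "zip": 1.0}

new_budgets, pruned = prune_and_reallocate(feature_scores, noise_scores, budgets)
print(pruned)
print(new_budgets)
```

Under this assumed rule the total privacy budget is unchanged by pruning; only its distribution across the remaining features changes.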