Data complexity: An FCA-based approach

Bibliographic details
Published in: International journal of approximate reasoning, 2024-02, Vol. 165, p. 109084, Article 109084
Main authors: Buzmakov, Alexey; Dudyrev, Egor; Kuznetsov, Sergei O.; Makhalova, Tatiana; Napoli, Amedeo
Format: Article
Language: English
Description
Abstract: In this paper we propose different indices for measuring the complexity of a dataset in terms of Formal Concept Analysis (FCA). We extend the line of research on the "closure structure" and the "closure index", which are based on minimum generators of intents (also known as closed itemsets). We aim to capture statistical properties of a dataset, not just extremal characteristics such as the size of a passkey. To do so, we introduce an alternative approach in which we measure the complexity of a dataset w.r.t. five significant elements that can be computed in a concept lattice, namely intents (closed sets of attributes), pseudo-intents, proper premises, keys (minimal generators), and passkeys (minimum generators). We then define several original indices for estimating the complexity of a dataset. Moreover, we study the distribution of these elements and indices in various real-world and synthetic datasets. Finally, we investigate the relations among these significant elements and indices, as well as their relations with implications and association rules.
ISSN: 0888-613X
DOI: 10.1016/j.ijar.2023.109084
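
Illustration: three of the five elements named in the abstract (intents, keys, and passkeys) can be enumerated by brute force on a very small formal context. The following Python sketch is not the authors' implementation and does not compute any of their indices. The toy context, the names `CONTEXT`, `closure`, `is_key`, and the exhaustive subset enumeration are all assumptions made for illustration, and the approach is exponential in the number of attributes. Pseudo-intents and proper premises are left out.

```python
from itertools import combinations

# Hypothetical toy context (an assumption, not data from the paper):
# each object is mapped to the set of attributes it has.
CONTEXT = {
    "g1": {"a", "b", "c"},
    "g2": {"b"},
    "g3": {"c"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def closure(attrs):
    """Return the attributes shared by every object that has all of attrs."""
    attrs = frozenset(attrs)
    rows = [set(row) for row in CONTEXT.values() if attrs <= row]
    if not rows:
        # No object has all of attrs: the closure is the full attribute set.
        return ATTRIBUTES
    return frozenset(set.intersection(*rows))


def all_subsets(attrs):
    """Yield every subset of attrs as a frozenset, smallest first."""
    items = sorted(attrs)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_key(attrs):
    """A key (minimal generator) loses its closure when any attribute is dropped."""
    target = closure(attrs)
    return all(closure(attrs - {m}) != target for m in attrs)


subsets = list(all_subsets(ATTRIBUTES))

# Intents are the closed attribute sets.
intents = {s for s in subsets if closure(s) == s}

# Keys are the subset-minimal generators of some intent.
keys = [s for s in subsets if is_key(s)]

# Passkeys are the minimum-cardinality generators of each intent.
passkeys = {}
for intent in intents:
    generators = [k for k in keys if closure(k) == intent]
    smallest = min(len(k) for k in generators)
    passkeys[intent] = [k for k in generators if len(k) == smallest]

for intent in sorted(intents, key=lambda s: (len(s), sorted(s))):
    print(sorted(intent), "passkeys:", [sorted(p) for p in passkeys[intent]])
```

In this toy context the set {b, c} is a key of the intent {a, b, c} but not a passkey, because the smaller set {a} generates the same intent. This is the distinction the abstract draws between minimal and minimum generators.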