Data complexity: An FCA-based approach
Published in: | International journal of approximate reasoning 2024-02, Vol.165, p.109084, Article 109084 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | English |
Abstract: | In this paper we propose several indices for measuring the complexity of a dataset in terms of Formal Concept Analysis (FCA). We extend the line of research on the "closure structure" and the "closure index", which are based on minimum generators of intents (also known as closed itemsets). We aim to capture statistical properties of a dataset, not just extremal characteristics such as the size of a passkey. To do so, we introduce an alternative approach in which we measure the complexity of a dataset with respect to five significant elements that can be computed in a concept lattice, namely intents (closed sets of attributes), pseudo-intents, proper premises, keys (minimal generators), and passkeys (minimum generators). We then define several original indices for estimating the complexity of a dataset. Moreover, we study the distribution of these elements and indices in various real-world and synthetic datasets. Finally, we investigate the relations among these significant elements and indices, as well as their relations with implications and association rules. |
---|---|
ISSN: | 0888-613X |
DOI: | 10.1016/j.ijar.2023.109084 |
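
To make three of the elements named in the abstract concrete (intents, keys, and passkeys), the following is a minimal sketch on a toy formal context. The context `CONTEXT`, the helper names, and the brute-force enumeration over all attribute subsets are assumptions made for illustration only; they are not taken from the paper, which may compute these elements quite differently, and the enumeration is only feasible for a handful of attributes.

```python
from itertools import combinations

# Hypothetical toy context (not from the paper): object -> set of attributes.
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"a", "c"},
    "o3": {"a", "b", "c"},
    "o4": {"d"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def closure(attrs):
    """Return the closure of an attribute set: the attributes shared by
    every object that has all attributes in `attrs`."""
    attrs = frozenset(attrs)
    rows = [frozenset(row) for row in CONTEXT.values() if attrs <= row]
    if not rows:
        # No object matches, so the closure is the full attribute set.
        return ATTRIBUTES
    return frozenset.intersection(*rows)


def generators_by_intent():
    """Group every attribute subset by its closure. The dictionary keys
    are exactly the intents (closed sets) of the context."""
    groups = {}
    attrs = sorted(ATTRIBUTES)
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            groups.setdefault(closure(subset), []).append(frozenset(subset))
    return groups


def keys_and_passkeys(generators):
    """Keys are the inclusion-minimal generators of an intent;
    passkeys are the keys of minimum cardinality."""
    keys = [g for g in generators if not any(h < g for h in generators)]
    smallest = min(len(k) for k in keys)
    passkeys = [k for k in keys if len(k) == smallest]
    return keys, passkeys


if __name__ == "__main__":
    for intent, gens in generators_by_intent().items():
        keys, passkeys = keys_and_passkeys(gens)
        print(sorted(intent), [sorted(k) for k in keys],
              [sorted(p) for p in passkeys])
```

Each printed row lists an intent followed by its keys and its passkeys. Pseudo-intents and proper premises, the other two elements named in the abstract, are not covered by this sketch.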