Advancing UN Comtrade for Physical Trade Flow Analysis: Addressing the Issue of Outliers

Detailed description

Bibliographic details
Published in: Resources, Conservation and Recycling, 2022-11, Vol. 186, p. 106524, Article 106524
Main authors: Jiang, Zhihan, Chen, Chuke, Li, Nan, Wang, Heming, Wang, Peng, Zhang, Chao, Ma, Fengmei, Zhang, Zhihe, Huang, Yuanyi, Qi, Jianchuan, Chen, Wei-Qiang
Format: Article
Language: English
Description
Summary:
• Outliers exist in UN Comtrade for almost all reporters (207 of 209), all commodities, and all years.
• Most outliers (92% of the total) have incorrect net weight values.
• Outliers may be few in number but cause significant biases in physical trade flow analysis.
• Because it is based on potential causes, our outlier handling could retain most critical information.
• Addressing the outlier issue would improve UN Comtrade quality and benefit related analysis.

UN Comtrade is one of the most widely used data sources for physical trade analysis. However, outliers in the data can lead to misleading interpretations and biased results, limiting its applications. Assuming that no deals would be made at unreasonable prices, we define an outlier as a data record whose unit price (trade value divided by net weight) is unusually high or low. To address the outlier issue, we develop a framework that first applies kernel density estimation to detect outliers and then uses different statistical models to handle them according to their potential causes. We then develop a deviation index to assess the impacts of outliers and to present the data quality improvement and the significance of our framework. Finally, we evaluate the framework's performance against previous methods, showing that it adapts better to different commodities' data. Our results reveal that outliers exist for almost all reporters (207 of 209, 99%), all commodities, and all years, and that most outliers (92%) have incorrect net weight values. Reporters with a higher deviation index include Canada, China, and France, among others, while commodities with a higher deviation index include high-price electronic products and clocks, among others. Addressing the outlier issue would greatly improve data quality, thus benefiting UN Comtrade-based physical trade analysis.
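
The detection step described in the summary (compute each record's unit price, then use kernel density estimation to find unusually high or low values) can be illustrated with a small sketch. This is not the authors' implementation: the summary does not state their kernel, bandwidth, or cutoff, so the log10 transform, Silverman's rule-of-thumb bandwidth, the leave-one-out density, the 5% relative cutoff, the function names, and the sample records below are all assumptions chosen for illustration.

```python
import math
import statistics


def silverman_bandwidth(sample):
    """Silverman's rule-of-thumb bandwidth (an assumed default, not the paper's)."""
    q1, _, q3 = statistics.quantiles(sample, n=4)
    spread = min(statistics.stdev(sample), (q3 - q1) / 1.34)
    if spread == 0:
        raise ValueError("sample has no spread; bandwidth is undefined")
    return 0.9 * spread * len(sample) ** (-1 / 5)


def leave_one_out_densities(sample, bandwidth):
    """Gaussian KDE evaluated at each point, excluding that point's own kernel."""
    n = len(sample)
    norm = 1.0 / ((n - 1) * bandwidth * math.sqrt(2 * math.pi))
    densities = []
    for i, x in enumerate(sample):
        total = sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
            for j, s in enumerate(sample)
            if j != i
        )
        densities.append(norm * total)
    return densities


def flag_outliers(records, rel_threshold=0.05):
    """Flag records whose log unit price lies in a low-density region.

    records: list of (trade_value, net_weight) pairs for one commodity.
    rel_threshold: assumed cutoff, as a fraction of the peak density.
    """
    # Unit price = trade value / net weight, as defined in the summary.
    log_prices = [math.log10(value / weight) for value, weight in records]
    densities = leave_one_out_densities(log_prices, silverman_bandwidth(log_prices))
    cutoff = rel_threshold * max(densities)
    return [d < cutoff for d in densities]


# Hypothetical records (trade value in USD, net weight in kg); the last one
# mimics a net weight entered far too small, giving an implausible unit price.
records = [
    (1000, 500), (2100, 1000), (950, 500), (4400, 2000),
    (1800, 1000), (3000, 1500), (2500, 1200), (2000, 1),
]
flags = flag_outliers(records)
print(flags)
```

Working on log unit prices keeps both extremely high and extremely low prices far from the bulk of the distribution, and leaving each point out of its own density estimate prevents an isolated record from propping up its own density in a small sample. Handling the flagged records (for example, deciding whether the trade value or the net weight is wrong) is a separate step that this sketch does not attempt.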
ISSN: 0921-3449, 1879-0658
DOI: 10.1016/j.resconrec.2022.106524