DATA SCIENCE IN DEVELOPMENT ECONOMICS: USING CLUSTER ANALYSIS TO GENERATE A MULTIVARIATE DEVELOPMENT TAXONOMY
Date
2018-05
Publisher
University of Delaware
Abstract
This paper attempts to apply clustering techniques from data science to the
economic problem of generating a country-level development taxonomy. Development
taxonomies currently in use suffer from two key issues. First, the taxonomies are based
on very few variables and therefore cannot properly represent something as complex
and multifaceted as development. Second, the values used to discriminate groups are
chosen arbitrarily. In this work, a univariate analysis is performed using the method of
kernel density estimation to empirically generate a single-valued taxonomy which can
be directly compared with the income group taxonomy published by the World Bank.
Next, a definition of development is derived and a multivariate analysis is performed to
create a comprehensive development taxonomy using two forms of k-means clustering.
The univariate analysis demonstrates the superiority of a data-driven approach to
single-valued taxonomy creation. By contrast, it remains inconclusive whether
cluster analysis can create a well-defined multivariate development taxonomy.
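
The univariate step described above can be illustrated with a minimal sketch. It assumes that group boundaries are placed at the local minima of a Gaussian kernel density estimate, which is one common way to derive data-driven cut points and is not necessarily the procedure used in the paper. The function names, the bandwidth, and the toy values are all hypothetical and are not World Bank data.

```python
import math

def gaussian_kde(sample, bandwidth):
    """Return a function that estimates the density of `sample` with a Gaussian kernel."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)

    return density

def density_cut_points(sample, bandwidth, grid_size=512):
    """Locate local minima of the estimated density on an evenly spaced grid.

    Assumption: each local minimum separates two groups of countries.
    """
    lo, hi = min(sample), max(sample)
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    estimate = gaussian_kde(sample, bandwidth)
    dens = [estimate(x) for x in grid]
    return [
        grid[i]
        for i in range(1, grid_size - 1)
        if dens[i] < dens[i - 1] and dens[i] <= dens[i + 1]
    ]

def assign_groups(sample, cuts):
    """Label each observation by the number of cut points lying below it."""
    return [sum(x > c for c in cuts) for x in sample]

# Hypothetical toy values standing in for the log of an income-like indicator.
toy_log_income = [2.7, 2.8, 2.9, 3.0, 3.1, 3.7, 3.8, 3.9, 4.0, 4.6, 4.7, 4.8]
cuts = density_cut_points(toy_log_income, bandwidth=0.15)
groups = assign_groups(toy_log_income, cuts)
print(cuts, groups)
```

In this sketch the number of groups is not fixed in advance: it follows from the shape of the estimated density, and it depends on the chosen bandwidth. A larger bandwidth smooths away minima and yields fewer groups.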
Keywords
Mathematics and Economics, cluster analysis, multivariate development taxonomy