TY  - JOUR
TI  - Automated Coding Using Machine Learning and Remapping the U.S. Nonprofit Sector: A Guide and Benchmark
AU  - Ma, Ji
T2  - Nonprofit and Voluntary Sector Quarterly
AB  - This research developed a machine learning classifier that reliably automates the coding process using the National Taxonomy of Exempt Entities as a schema and remapped the U.S. nonprofit sector. I achieved 90% overall accuracy for classifying the nonprofits into nine broad categories and 88% for classifying them into 25 major groups. The intercoder reliabilities between algorithms and human coders measured by kappa statistics are in the “almost perfect” range of .80 to 1.00. The results suggest that a state-of-the-art machine learning algorithm can approximate human coders and substantially improve researchers’ productivity. I also reassigned multiple category codes to more than 439,000 nonprofits and discovered a considerable amount of organizational activities that were previously ignored. The classifier is an essential methodological prerequisite for large-N and Big Data analyses, and the remapped U.S. nonprofit sector can serve as an important instrument for asking or reexamining fundamental questions of nonprofit studies. The working directory with all data sets, source codes, and historical versions are available on GitHub (https://github.com/ma-ji/npo_classifier).
DA  - 2021/06/01/
PY  - 2021
DO  - 10.1177/0899764020968153
DP  - SAGE Journals
VL  - 50
IS  - 3
SP  - 662
EP  - 687
J2  - Nonprofit and Voluntary Sector Quarterly
LA  - en
SN  - 0899-7640
ST  - Automated Coding Using Machine Learning and Remapping the U.S. Nonprofit Sector
UR  - https://doi.org/10.1177/0899764020968153
Y2  - 2021/05/22/05:41:29
ER  - 