Data and Software

Knowledge Infrastructure of Nonprofit and Philanthropic Studies (LINK)

How can computational social science (CSS) methods be applied in nonprofit and philanthropic studies? This paper summarizes and explains a range of relevant CSS methods from a research design perspective, and highlights key applications in our field. We define CSS as a set of computationally intensive empirical methods for data management, concept representation, data analysis, and visualization. What makes the computational methods “social” is that the purpose of using these methods is to serve quantitative, qualitative, and mixed-methods social science research, such that theorization can have a solid ground. We illustrate the promise of CSS in our field by using it to construct the largest and most comprehensive database of scholarly references in our field, the Knowledge Infrastructure of Nonprofit and Philanthropic Studies (KINPS). Furthermore, we show that through the application of CSS in constructing and analyzing KINPS, we can better understand and facilitate the intellectual growth of our field. We conclude the article with cautions for using CSS and suggestions for future studies implementing CSS and KINPS.

Citation: Ma, Ji, Islam Akef Ebeid, Arjen de Wit, Meiying Xu, Yongzheng Yang, René Bekkers, and Pamala Wiepking. 2021. “Computational Social Science for Nonprofit Studies: Developing a Toolbox and Knowledge Base for the Field.” VOLUNTAS: International Journal of Voluntary and Nonprofit Organizations, October.

Other relevant articles or resources:

npoclass – Classify nonprofits using NTEE codes (LINK)

This research developed a machine-learning classifier that reliably automates the coding process using the National Taxonomy of Exempt Entities as a schema and remapped the U.S. nonprofit sector. I achieved 90% overall accuracy for classifying the nonprofits into nine broad categories and 88% for classifying them into 25 major groups. The intercoder reliabilities between algorithms and human coders measured by kappa statistics are in the “almost perfect” range of 0.80–1.00. The results suggest that a state-of-the-art machine-learning algorithm can approximate human coders and substantially improve researchers’ productivity. I also reassigned multiple category codes to over 439 thousand nonprofits and discovered a considerable amount of organizational activities that were previously ignored. The classifier is an essential methodological prerequisite for large-N and Big Data analyses, and the remapped U.S. nonprofit sector can serve as an important instrument for asking or reexamining fundamental questions of nonprofit studies.

Citation: Ma, Ji. 2020. “Automated Coding Using Machine-Learning and Remapping the U.S. Nonprofit Sector: A Guide and Benchmark.” Nonprofit and Voluntary Sector Quarterly, forthcoming.
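The npoclass abstract reports intercoder reliability between the algorithm and human coders using kappa statistics. As a rough illustration only, the sketch below computes Cohen's kappa for two raters in plain Python. It assumes the two-rater Cohen variant (the paper may use a different kappa), and the function name `cohen_kappa` and the toy labels are hypothetical, not taken from npoclass.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same items (illustrative helper)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; agreement is total.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy data (made up): category letters assigned by a human coder and a classifier.
human = ["A", "B", "B", "P", "A", "B"]
model = ["A", "B", "P", "P", "A", "B"]
kappa = cohen_kappa(human, model)
print(round(kappa, 2))
```

On this toy input the two raters agree on five of six items, and kappa discounts the agreement expected by chance, which is why it is lower than the raw agreement rate.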

The Research Infrastructure of Chinese Foundations (RICF)

“A database of Chinese foundations, civil society, and social development in general. The structure of the RICF is deliberately designed and normalized according to the Three Normal Forms. The database schema consists of three major themes: foundations’ basic organizational profile (i.e., basic profile, board member, supervisor, staff, and related party tables), program information (i.e., program information, major program, program relationship, and major recipient tables), and financial information (i.e., financial position, financial activities, cash flow, activity overview, and large donation tables).”

Citation: Ma, J., Wang, Q., Dong, C., and Li, H. (2017). The research infrastructure of Chinese foundations, a database for Chinese civil society studies. Scientific Data, 4:170094.

Citing Publications | Download Data

Datasets in “State Power and Elite Autonomy in a Networked Civil Society” (LINK)

“In response to failures of central planning, the Chinese government has experimented not only with free-market trade zones, but with allowing non-profit foundations to operate in a decentralized fashion. A network study shows how these foundations have connected together by sharing board members, in a structural parallel to what is seen in corporations in the United States and Europe. This board interlocking leads to the emergence of an elite group with privileged network positions. While the presence of government officials on non-profit boards is widespread, government officials are much less common in a subgroup of foundations that control just over half of all revenue in the network. This subgroup, associated with business elites, not only enjoys higher levels of within-elite links, but even preferentially excludes government officials from the NGOs with higher degree. The emergence of this structurally autonomous sphere is associated with major political and social events in the state–society relationship. Cluster analysis reveals multiple internal components within this sphere that share similar levels of network influence. Rather than a core-periphery structure centered around government officials, the Chinese non-profit world appears to be a multipolar one of distinct elite groups, many of which achieve high levels of independence from direct government control.”

Citation: Ma, J., & DeDeo, S. (2018). State power and elite autonomy in a networked civil society: The board interlocking of Chinese non-profits. Social Networks, 54, 291–302.
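The abstract above describes foundations connected by shared board members. A minimal sketch of that idea, assuming a simple mapping from foundation to board members, is to project the membership data onto foundation-to-foundation ties and count each foundation's degree. The function names, the input layout, and the toy data below are hypothetical and are not the schema of the published datasets or the authors' analysis code.

```python
from collections import defaultdict
from itertools import combinations


def board_interlocks(boards):
    """Map each pair of foundations to the number of board members they share."""
    seats = defaultdict(set)  # person -> foundations on whose boards they sit
    for foundation, members in boards.items():
        for person in members:
            seats[person].add(foundation)
    shared = defaultdict(int)
    for foundations in seats.values():
        # Every pair of foundations with a common member gains one shared seat.
        for pair in combinations(sorted(foundations), 2):
            shared[pair] += 1
    return dict(shared)


def degree(interlocks):
    """Number of distinct foundations each foundation is interlocked with."""
    deg = defaultdict(int)
    for f1, f2 in interlocks:
        deg[f1] += 1
        deg[f2] += 1
    return dict(deg)


# Toy data (made up): foundation -> set of board member identifiers.
boards = {
    "F1": {"p1", "p2", "p3"},
    "F2": {"p2", "p3", "p4"},
    "F3": {"p4", "p5"},
    "F4": {"p6"},
}
interlocks = board_interlocks(boards)
degrees = degree(interlocks)
print(interlocks)
print(degrees)
```

Foundations with no shared members, such as F4 here, do not appear in the degree map; a fuller analysis would treat them as isolates with degree zero.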

Full-text corpus of People’s Daily (1946-2017)

The corpus includes over 1.6 million full-text records of People’s Daily, the official newspaper of the Chinese Communist Party. It is suited to research in political science, public administration, sociology, nonprofit studies, and related fields. Because of copyright restrictions, the raw dataset cannot be posted publicly.