Subdivisions and Crossroads: Identifying Hidden Community Structures in a Data Archive's Citation Network
Data archives are an important source of high quality data in many fields, making them ideal sites to study data reuse. By studying data reuse through citation networks, we are able to learn how hidden research communities - those that use the same scientific datasets - are organized. This paper analyzes the community structure of an authoritative network of datasets cited in academic publications, which have been collected by a large, social science data archive: the Interuniversity Consortium for Political and Social Research (ICPSR). Through network analysis, we identified communities of social science datasets and fields of research connected through shared data use. We argue that communities of exclusive data reuse form subdivisions that contain valuable disciplinary resources, while datasets at a "crossroads" broadly connect research communities. Our research reveals the hidden structure of data reuse and demonstrates how interdisciplinary research communities organize around datasets as shared scientific inputs. These findings contribute new ways of describing scientific communities in order to understand the impacts of research data reuse.
READ FULL TEXT