Episode 80: Counting Potatoes vs. Computational Mysticism - Using CHAOSS for Research
Thank you to the folks at Sustain for providing the hosting account for CHAOSScast!

CHAOSScast – Episode 80

In this episode, host Georg Link is joined by Daniel, Anita, Sophia, and Sean to discuss their research experiences with CHAOSS metrics and software for open source community health analysis. They cover collecting and interpreting data from different perspectives, privacy and ethics considerations, and the importance of collaboration between academics and industry professionals. They also highlight significant projects and studies that used CHAOSS metrics and software, and share their hopes and concerns for the future direction of research in the field. Finally, they discuss the need to bridge the gap between academia and industry and touch on the importance of linguistics and cultural context when examining data.

[00:02:48] Anita discusses the history of open source software research and how CHAOSS provides a common framework for the various metrics researchers use, and Sean emphasizes that CHAOSS's standardization of metrics aids consistency across research.
[00:04:52] Sophia highlights discrepancies in how metrics are defined and calculated and calls for standard methodologies, especially for non-academic publications, and Daniel reflects on the differences in research approaches between academia and industry, emphasizing the importance of methodological rigor.
[00:08:25] Sean critiques academic papers for often lacking complete method descriptions and calls for more rigorous methodological transparency, and Daniel describes transitioning from academia to industry and the different expectations for communication and results.
[00:10:44] Georg asks about the impact of CHAOSS research capabilities, and Daniel explains that CHAOSS is shaping research by reflecting the interests and observations of its contributors.
[00:12:16] Sean talks about the increased capacity for research offered by CHAOSS, particularly through tools like GrimoireLab and Augur, Anita shares her experience using GrimoireLab to create interventions and dashboards that let open source communities monitor their projects, and Daniel adds historical context and mentions the importance of tools that allow analyses to be replicated in research.
[00:17:10] Georg introduces a study using CHAOSS metrics and software that hasn’t been officially published yet, and Sophia shares some details and explains the study’s premise.
[00:21:00] Anita raises a philosophical point about the potential limitations of metrics, suggesting that they may only reflect what is observable and could lead to gamification if people optimize their behavior based on the metrics.
[00:22:14] Sean speaks about the importance of deep field engagement and of combining social science with data mining to fully understand the human behavior underlying the data. Sophia shares her perspective from market research, discussing survey design, the selection bias inherent in data collection, and the importance of understanding the population excluded by the research filters used.
[00:25:56] Anita discusses the challenges of academic surveys, and Daniel discusses the bias that may arise from the data available.
[00:28:10] Sophia contemplates the behavioral nuances dictated by different platforms’ processes, and Sean suggests focusing on software engineering processes that are common across different tools and advocates for social scientific research in open source to better understand the human aspects.
[00:30:32] Georg transitions to survey methodologies and their relation to CHAOSS metrics, and Anita shares her experiences with survey design and implementation for the international Apache Software Foundation community.
[00:33:10] Daniel reflects on the collaborative effort with the ASF community to ensure the survey’s terms and questions were appropriately adapted for an intern