Book description
The Art and Science of Analyzing Software Data provides valuable information on analysis techniques often used to derive insight from software data. This book shares best practices in the field generated by leading data scientists, collected from their experience training software engineering students and practitioners to master data science.
The book covers topics such as the analysis of security data, code reviews, app stores, log files, and user telemetry. It presents a wide variety of techniques, such as co-change analysis, text analysis, topic analysis, and concept analysis, as well as advanced topics such as release planning and the generation of source code comments. It also includes stories from the trenches, in which expert data scientists illustrate how to apply data analysis in industry and open source, present results to stakeholders, and drive decisions.
- Presents best practices, hints, and tips to analyze data and apply tools in data science projects
- Presents research methods and case studies that have emerged over the past few years to further understanding of software data
- Shares stories from the trenches of successful data science initiatives in industry
Table of contents
- Cover image
- Title page
- Table of Contents
- Copyright
- List of Contributors
- Chapter 1: Past, Present, and Future of Analyzing Software Data
- Part 1: Tutorial-Techniques
- Chapter 2: Mining Patterns and Violations Using Concept Analysis
- Chapter 3: Analyzing Text in Software Projects
- Chapter 4: Synthesizing Knowledge from Software Development Artifacts
- Chapter 5: A Practical Guide to Analyzing IDE Usage Data
- Chapter 6: Latent Dirichlet Allocation: Extracting Topics from Software Engineering Data
- Chapter 7: Tools and Techniques for Analyzing Product and Process Data
- Part 2: Data/Problem Focussed
- Part 3: Stories from the Trenches
- Chapter 12: Applying Software Data Analysis in Industry Contexts: When Research Meets Reality
- Chapter 13: Using Data to Make Decisions in Software Engineering: Providing a Method to our Madness
- Abstract
- 13.1 Introduction
- 13.2 Short History of Software Engineering Metrics
- 13.3 Establishing Clear Goals
- 13.4 Review of Metrics
- 13.5 Challenges with Data Analysis on Software Projects
- 13.6 Example of Changing Product Development Through the Use of Data
- 13.7 Driving Software Engineering Processes with Data
- Chapter 14: Community Data for OSS Adoption Risk Management
- Chapter 15: Assessing the State of Software in a Large Enterprise: A 12-Year Retrospective
- Abstract
- Acknowledgments
- 15.1 Introduction
- 15.2 Evolution of the Process and the Assessment
- 15.3 Impact Summary of the State of Avaya Software Report
- 15.4 Assessment Approach and Mechanisms
- 15.5 Data Sources
- 15.6 Examples of Analyses
- 15.7 Software Practices
- 15.8 Assessment Follow-up: Recommendations and Impact
- 15.9 Impact of the Assessments
- 15.10 Conclusions
- 15.11 Appendix
- Author Biographies
- Chapter 16: Lessons Learned from Software Analytics in Practice
- Part 4: Advanced Topics
- Chapter 17: Code Comment Analysis for Improving Software Quality
- Chapter 18: Mining Software Logs for Goal-Driven Root Cause Analysis
- Abstract
- 18.1 Introduction
- 18.2 Approaches to Root Cause Analysis
- 18.3 Root Cause Analysis Framework Overview
- 18.4 Modeling Diagnostics for Root Cause Analysis
- 18.5 Log Reduction
- 18.6 Reasoning Techniques
- 18.7 Root Cause Analysis for Failures Induced by Internal Faults
- 18.8 Root Cause Analysis for Failures due to External Threats
- 18.9 Experimental Evaluations
- 18.10 Conclusions
- Chapter 19: Analytical Product Release Planning
- Abstract
- Acknowledgments
- 19.1 Introduction and Motivation
- 19.2 Taxonomy of Data-intensive Release Planning Problems
- 19.3 Information Needs for Software Release Planning
- 19.4 The Paradigm of Analytical Open Innovation
- Analysis phase
- Synthesize phase
- 19.5 Analytical Release Planning—A Case Study
- 19.6 Summary and Future Research
- 19.7 Appendix: Feature Dependency Constraints
- Part 5: Data Analysis at Scale (Big Data)
- Chapter 20: Boa: An Enabling Language and Infrastructure for Ultra-Large-Scale MSR Studies
- Abstract
- 20.1 Objectives
- 20.2 Getting Started with Boa
- 20.3 Boa’s Syntax and Semantics
- 20.4 Mining Project and Repository Metadata
- 20.5 Mining Source Code with Visitors
- 20.6 Guidelines for Replicable Research
- 20.7 Conclusions
- 20.8 Practice Problems
- Project and Repository Metadata Problems
- Source Code Problems
- Chapter 21: Scalable Parallelization of Specification Mining Using Distributed Computing
Product information
- Title: The Art and Science of Analyzing Software Data
- Author(s):
- Release date: September 2015
- Publisher(s): Morgan Kaufmann
- ISBN: 9780124115439