Book description
Data Science for Software Engineering: Sharing Data and Models presents guidance and procedures for reusing data and models between projects to produce results that are useful and relevant. Starting with a background section of practical lessons and warnings for beginning data scientists in software engineering, this edited volume proceeds to identify critical questions of contemporary software engineering related to data and models. Learn how to adapt data from other organizations to local problems, mine privatized data, prune spurious information, simplify complex results, update models for new platforms, and more. Chapters share widely applicable experimental results, discussed with a blend of practitioner-focused domain expertise and commentary. Each chapter is written by a prominent expert and offers a state-of-the-art solution to an identified problem facing data scientists in software engineering. Throughout, the editors share best practices collected from their experience training software engineering students and practitioners to master data science, and highlight the methods that are most useful and applicable to the widest range of projects.
- Shares the specific experience of leading researchers and techniques developed to handle data problems in the realm of software engineering
- Explains how to start a project of data science for software engineering as well as how to identify and avoid likely pitfalls
- Provides a wide range of useful qualitative and quantitative principles, ranging from very simple to cutting-edge research
- Addresses current challenges with software engineering data, such as lack of local data, access issues due to data privacy, and improving data quality by cleaning spurious chunks of data
Table of contents
- Cover image
- Title page
- Table of Contents
- Copyright
- Why this book?
- Foreword
- List of Figures
- Chapter 1: Introduction
- Part I: Data Mining for Managers
- Part II: Data Mining: A Technical Tutorial
- Part III: Sharing Data
- Chapter 11: Sharing Data: Challenges and Methods
- Chapter 12: Learning Contexts
- Chapter 13: Cross-Company Learning: Handling The Data Drought
- Abstract
- 13.1 Motivation
- 13.2 Setting the ground for analyses
- 13.3 Analysis #1: can CC data be useful for an organization?
- 13.4 Analysis #2: how to cleanup CC data for local tuning?
- 13.5 Analysis #3: how much local data does an organization need for a local model?
- 13.6 How trustworthy are these results?
- 13.7 Are these useful in practice or just number crunching?
- 13.8 What's new on cross-learning?
- 13.9 What's the takeaway?
- Chapter 14: Building Smarter Transfer Learners
- Chapter 15: Sharing Less Data (Is a Good Thing)
- Chapter 16: How To Keep Your Data Private
- Chapter 17: Compensating for Missing Data
- Chapter 18: Active Learning: Learning More With Less
- Part IV: Sharing Models
- Chapter 19: Sharing Models: Challenges and Methods
- Chapter 20: Ensembles of Learning Machines
- Chapter 21: How to Adapt Models in a Dynamic World
- Chapter 22: Complexity: Using Assemblies of Multiple Models
- Chapter 23: The Importance of Goals in Model-Based Reasoning
- Chapter 24: Using Goals in Model-Based Reasoning
- Abstract
- 24.1 Multilayer Perceptrons
- 24.2 Multiobjective evolutionary algorithms
- 24.3 HaD-MOEA
- 24.4 Using MOEAs for creating SEE models
- 24.5 Experimental setup
- 24.6 The relationship among different performance measures
- 24.7 Ensembles based on concurrent optimization of performance measures
- 24.8 Emphasizing particular performance measures
- 24.9 Further analysis of the model choice
- 24.10 Comparison against other types of models
- 24.11 Summary
- Chapter 25: A Final Word
- Bibliography
Product information
- Title: Sharing Data and Models in Software Engineering
- Author(s):
- Release date: December 2014
- Publisher(s): Morgan Kaufmann
- ISBN: 9780124173071