Book description
Preparing and cleaning data is notoriously expensive, error-prone, and time-consuming: the process accounts for roughly 80% of the total time spent on analysis. As this O’Reilly report points out, enterprises have already invested billions of dollars in big data analytics, so there’s great incentive to modernize methods for cleaning, combining, and transforming data.
Author Federico Castanedo, Chief Data Scientist at WiseAthena.com, details best practices for reducing the time it takes to convert raw data into actionable insights. With these tools and techniques in mind, your organization will be well positioned to translate big data into big decisions.
- Explore the problems organizations face today with traditional prep and integration
- Define the business questions you want to address before selecting, prepping, and analyzing data
- Learn new methods for preparing raw data, including date-time and string data
- Understand how some cleaning actions (like replacing missing values) affect your analysis
- Examine data curation products: modern approaches that scale
- Consider your business audience when choosing ways to deliver your analysis
Product information
- Title: Data Preparation in the Big Data Era
- Author(s): Federico Castanedo
- Release date: October 2015
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491938942