Predictive Analytics Business Cases in RapidMiner
Published by O'Reilly Media, Inc.
A short course for busy professionals who want to get more out of their data
Regardless of your current role, chances are you work with some sort of data. You might be part of the team that generates data at your company, or you might be a team lead, supervisor, or manager who has data that describes the quality, quantity, and nature of the work done by the good people you work with. Maybe you’re just a person who likes to keep track of things: how much you spend, what you spend it on, where you go, how often, etc.
No matter your relationship with data, this course will help you make more sense of it. Using RapidMiner Studio Free, a powerful software platform for building analytic models and visualizing your results, Matthew North walks you through six common analytic techniques that can help you understand your data: linear regression, logistic regression, naïve Bayes, neural networks, decision trees, and text analytics. Join Matthew to learn when and how to employ each technique to (responsibly) predict future outcomes and see how industries from customer service, subscription services, and retail sales to finance, insurance, and manufacturing have implemented them for their own business use cases.
What you’ll learn and how you can apply it
By the end of this live, online course, you’ll understand:
- How and when to employ linear regression, logistic regression, naïve Bayes, neural networks, decision trees, and text analytics for predictive analytics
- How to use RapidMiner’s tools to perform predictive analytics
And you’ll be able to:
- Ask targeted questions of your data and build a predictive model to apply to that data
- Select an appropriate predictive modeling technique for your business use case
- Answer business questions, improve decision support, and enhance business intelligence using six common techniques in predictive analytics
This live event is for you because...
- You are a business professional who wants to better understand and mitigate risk.
- You are a supervisor or manager who wants to use data to improve decision making, operational efficiency, or business execution.
- You are a planner who wants to anticipate and prepare for future organizational events.
- You are an analyst who wants to use data to enhance business understanding and action.
Prerequisites
- A solid understanding of data organization and basic statistics
- Familiarity with techniques often used in predictive analytics, including linear regression, logistic regression, naïve Bayes, neural networks, decision trees, and text analytics (recommended)
Materials and downloads needed:
- A machine with RapidMiner 7 (free edition) installed
- Course datasets downloaded prior to course
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Setup and understanding training/scoring datasets (10 minutes)
- Setting up a functional working environment
- The RapidMiner interface
Linear regression: How much lettuce should I order for next week? (25 minutes)
- Predicting numeric outcomes based on historical data
- Example: Predicting how much lettuce we should order for a given week at a grocery store produce department, based on past weeks’ sales and spoilage data
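As a rough illustration of the idea behind this session (the course itself builds the model in RapidMiner's visual interface), the sketch below fits a one-variable least-squares line in plain Python. The sales figures, the choice of last week's sales as the only predictor, and the function names `fit_line` and `predict` are all hypothetical assumptions made for the example, not course materials.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for one predictor: y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def predict(slope, intercept, x):
    """Score a new observation with the fitted line."""
    return intercept + slope * x

# Hypothetical training data: heads of lettuce sold in one week (x)
# and heads sold in the following week (y).
previous_week = [100, 120, 140, 160, 180]
following_week = [110, 128, 151, 169, 192]

slope, intercept = fit_line(previous_week, following_week)

# Hypothetical scoring case: 200 heads sold last week.
forecast = predict(slope, intercept, 200)
print(f"slope={slope:.3f}, intercept={intercept:.1f}, forecast={forecast:.1f}")
```

A real ordering model would likely use several predictors, spoilage among them, which is where a tool such as RapidMiner takes over the bookkeeping.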
Logistic regression: Should I loan you money? (25 minutes)
- Predicting a binary outcome (yes or no, up or down, pass or fail)
- Example: Determining whether a loan applicant represents a good or a bad risk for a credit union
- Bonus: Determining how reliable each of our predictions is
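To make the binary outcome and the reliability bonus concrete, here is a minimal sketch of logistic regression trained by gradient descent in plain Python. It is not how RapidMiner implements the technique. The applicant features (a debt-to-income ratio and years in the current job divided by 10), the labels, and the learning-rate and epoch settings are hypothetical assumptions. The predicted probability stands in for the confidence attached to each prediction.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Probability that the applicant belongs to class 1 (good risk)."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Batch gradient descent on the average log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Hypothetical training data: (debt-to-income ratio, years in job / 10).
applicants = [(0.1, 0.9), (0.2, 0.7), (0.3, 0.8),
              (0.7, 0.2), (0.8, 0.1), (0.9, 0.3)]
good_risk = [1, 1, 1, 0, 0, 0]  # 1 = good risk, 0 = bad risk

weights, bias = fit_logistic(applicants, good_risk)

# Hypothetical scoring cases.
p_low_debt = predict_proba(weights, bias, (0.15, 0.8))
p_high_debt = predict_proba(weights, bias, (0.85, 0.2))
for p in (p_low_debt, p_high_debt):
    decision = "good risk" if p >= 0.5 else "bad risk"
    confidence = max(p, 1 - p)
    print(f"{decision} (confidence {confidence:.2f})")
```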
Break (10 minutes)
Naïve Bayes: Are you gold, silver, or bronze? (25 minutes)
- Using existing data about customer value (repeat customers, high-yield customers, high-maintenance customers, etc.) to determine the category of new consumers
- Example: Anticipating a customer’s category, allowing you to more appropriately allocate (or de-allocate) resources to better care for them
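The categorization idea can be sketched outside RapidMiner with a small categorical naïve Bayes classifier. The customer attributes, the tier labels assigned to each row, and the function names `train_nb` and `predict_nb` are hypothetical assumptions. The add-one (Laplace) smoothing is one common choice for handling unseen combinations, not necessarily what the course uses.

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    # feature_counts[label][attribute_index][value] -> count
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    seen_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[label][i][value] += 1
            seen_values[i].add(value)
    return class_counts, feature_counts, seen_values

def predict_nb(model, row):
    """Return the class with the highest smoothed log posterior."""
    class_counts, feature_counts, seen_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        log_p = math.log(count / total)  # class prior
        for i, value in enumerate(row):
            numerator = feature_counts[label][i][value] + 1  # add-one smoothing
            denominator = count + len(seen_values[i])
            log_p += math.log(numerator / denominator)
        scores[label] = log_p
    return max(scores, key=scores.get)

# Hypothetical history: (repeat purchaser?, support ticket volume) -> tier.
customers = [("yes", "low"), ("yes", "low"), ("yes", "high"),
             ("no", "low"), ("no", "high"), ("no", "high")]
tiers = ["gold", "gold", "silver", "silver", "bronze", "bronze"]

model = train_nb(customers, tiers)
loyal_tier = predict_nb(model, ("yes", "low"))
costly_tier = predict_nb(model, ("no", "high"))
print(loyal_tier, costly_tier)
```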
Neural networks: To ship or not to ship? (25 minutes)
- Using neural networks and a dataset of product testing outcomes to predict the primary points of failure for a company’s product
- Example: Predicting points of failure to decide whether or not a product is ready to ship, protecting a company from lost revenue, potential liability, and a decrease in brand confidence in the event of a product failure
Break (10 minutes)
Decision trees: How risky is it to sell you insurance? (25 minutes)
- Determining how and when prospective clients shift from one risk group to another
- Example: Deciding what premium to charge a prospective insurance buyer so that the company brings in more money in premium payments than it pays out in claims
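A decision tree is built from splits like the one found below. This sketch searches a single numeric attribute for the threshold that best separates two risk groups, using Gini impurity as the split criterion. The prior-claims data, the risk labels, and the function names are hypothetical assumptions. A full tree would repeat this search recursively over many attributes, and RapidMiner offers several split criteria.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Find the threshold on one numeric attribute with the lowest weighted Gini."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Hypothetical data: number of prior claims and the assigned risk group.
prior_claims = [0, 0, 1, 1, 2, 3, 4, 5]
risk_group = ["low", "low", "low", "low", "high", "high", "high", "high"]

threshold, impurity = best_split(prior_claims, risk_group)
print(f"clients shift to the high-risk group above {threshold} prior claims")
```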
Text analysis: Are you mad enough to leave me? (25 minutes)
- Extracting meaning from large amounts of written communication
- Example: Analyzing customer complaints to determine larger trends, in the process learning what the company can do to proactively anticipate customer needs and address them before any complaints are even lodged
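One simple way to surface larger trends in written complaints is to count how many documents mention each term. The sketch below does that in plain Python. The complaint texts, the stopword list, and the function names are hypothetical assumptions, and real text analytics (in RapidMiner or elsewhere) typically adds stemming, weighting, and a much larger stopword list.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"a", "the", "and", "was", "i", "my", "again", "took",
             "never", "want", "weeks"}

def tokenize(text):
    """Lowercase the text, keep alphabetic words, and drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def top_terms(documents, n=3):
    """Rank terms by the number of documents in which they appear."""
    document_frequency = Counter()
    for doc in documents:
        document_frequency.update(set(tokenize(doc)))
    return document_frequency.most_common(n)

# Hypothetical customer complaints.
complaints = [
    "My order arrived late and the box was damaged.",
    "Late delivery again. I want a refund.",
    "The refund took weeks and support never replied.",
    "Delivery was late, and support was rude.",
]

most_common_term = top_terms(complaints, 1)[0]
print(most_common_term)
```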
Q&A (30 minutes)
Your Instructor
Dr. Matthew North
Matthew North is a professor of information systems at Utah Valley University with expertise in analytics and database systems. Previously, Matthew was a software engineer and risk analyst at eBay, where he created automated fraud detection and prevention programs using predictive modeling for risk categorization, and a consultant to private enterprises, nonprofits, and government organizations. He has lectured at universities in Europe, South America, Asia, and the United States and completed a Fulbright appointment at Universidad Tecnológica Nacional in Santa Fe, Argentina, where he taught data mining and analytics for fraud prevention. Matthew is the author of two books, including Data Mining for the Masses (now in its second edition), and numerous professional, academic, and popular publications. In addition to teaching with RapidMiner, Matthew has worked with R, SAS, and Weka.