Machine Learning Security Principles

Book description

Thwart hackers by preventing, detecting, and misdirecting access before they can plant malware, obtain credentials, engage in fraud, modify data, poison models, corrupt users, eavesdrop, and otherwise ruin your day

Key Features

  • Discover how hackers rely on misdirection and deepfakes to fool even the best security systems
  • Retain the usefulness of your data by detecting unwanted and invalid modifications
  • Develop application code to meet the security requirements related to machine learning

Book Description

Businesses are leveraging the power of AI to make undertakings that used to be complicated and pricey much easier, faster, and cheaper. The first part of this book explores these processes in more depth, helping you understand the role security plays in machine learning.

As you progress to the second part, you’ll learn more about the environments where ML is commonly used and, with the help of code, graphics, and real-world references, dive into the security threats that plague them.

The next part of the book will guide you through the process of detecting hacker behaviors in the modern computing environment, where fraud takes many forms in ML, from gaining sales through fake reviews to destroying an adversary’s reputation. Once you’ve understood hacker goals and detection techniques, you’ll learn about the ramifications of deepfakes, followed by mitigation strategies.

This book also takes you through best practices for embracing ethical data sourcing, which reduces the security risk associated with data. You’ll see how the simple act of removing personally identifiable information (PII) from a dataset lowers the risk of social engineering attacks.
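As a rough illustration of what removing PII can look like in practice, here is a minimal Python sketch. It is not taken from the book: the field names (name, email, phone) and the strip_pii helper are hypothetical, and the records are made-up sample data.

```python
# Hypothetical set of PII field names; adjust to the dataset at hand.
PII_FIELDS = {"name", "email", "phone"}


def strip_pii(records, pii_fields=PII_FIELDS):
    """Return copies of the records with the PII fields removed."""
    return [
        {key: value for key, value in record.items() if key not in pii_fields}
        for record in records
    ]


# Made-up sample records for illustration only.
records = [
    {"name": "Ann Example", "email": "ann@example.com", "age": 34, "purchases": 7},
    {"name": "Bob Sample", "email": "bob@example.com", "age": 51, "purchases": 2},
]

cleaned = strip_pii(records)
print(cleaned)
```

The original records are left untouched, so the full data can stay in a restricted store while only the cleaned copy is used for training. Dropping direct identifiers does not by itself guarantee anonymity, because the remaining traits can sometimes be combined to re-identify a person.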

By the end of this machine learning book, you’ll have an increased awareness of the various attacks and the techniques to secure your ML systems effectively.

What you will learn

  • Explore methods to detect and prevent illegal access to your system
  • Implement detection techniques when access does occur
  • Employ machine learning techniques to determine motivations
  • Mitigate hacker access once security is breached
  • Perform statistical measurement and behavior analysis
  • Repair damage to your data and applications
  • Use ethical data collection methods to reduce security risks

Who this book is for

Whether you’re a data scientist, researcher, or manager working with machine learning techniques in any capacity, this security book is a must-have. While most resources available on this topic are written in a language more suitable for experts, this guide presents security in an easy-to-understand way, employing a host of diagrams to explain concepts to visual learners. Familiarity with machine learning concepts is assumed, and knowledge of Python and programming in general will be useful.

Table of contents

  1. Machine Learning Security Principles
  2. Foreword
  3. Contributors
  4. About the author
  5. Acknowledgements
  6. About the reviewers
  7. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Conventions used
    6. Get in touch
    7. Share Your Thoughts
    8. Download a free PDF copy of this book
  8. Part 1 – Securing a Machine Learning System
  9. Chapter 1: Defining Machine Learning Security
    1. Building a picture of ML
      1. Why is ML important?
      2. Identifying the ML security domain
      3. Distinguishing between supervised and unsupervised
      4. Using ML from development to production
    2. Adding security to ML
      1. Defining the human element
      2. Compromising the integrity and availability of ML models
      3. Describing the types of attacks against ML
      4. Considering what ML security can achieve
    3. Setting up for the book
      1. What do you need to know?
      2. Considering the programming setup
    4. Summary
  10. Chapter 2: Mitigating Risk at Training by Validating and Maintaining Datasets
    1. Technical requirements
    2. Defining dataset threats
      1. Learning about the kinds of database threats
      2. Considering dataset threat sources
      3. Delving into data change
      4. Delving into data corruption
      5. Uncovering feature manipulation
      6. Examining source modification
      7. Thwarting privacy attacks
    3. Detecting dataset modification
      1. An example of relying on traditional methods
      2. Working with hashes and larger files
      3. Using a data version control system example
    4. Mitigating dataset corruption
      1. The human factor in missingness
      2. An example of recreating the dataset
      3. Using an imputer
      4. Handling missing or corrupted data
    5. Summary
  11. Chapter 3: Mitigating Inference Risk by Avoiding Adversarial Machine Learning Attacks
    1. Defining adversarial ML
      1. Categorizing the attack vectors
      2. Examining the hacker mindset
    2. Considering security issues in ML algorithms
      1. Defining attacker motivations
      2. Employing CAPTCHA bypass techniques
      3. Considering common hacker goals
      4. Relying on trial and error
      5. Avoiding helping the hacker
      6. Integrating new research quickly
      7. Understanding the Black Swan Theory
    3. Describing the most common attack techniques
      1. Evasion attacks
      2. Model poisoning
      3. Understanding membership inference attacks
      4. Understanding Trojan attacks
      5. Understanding backdoor (neural) attacks
      6. Seeing adversarial attacks in action
    4. Mitigating threats to the algorithm
      1. Developing principles that help protect against every threat
      2. Detecting and mitigating an evasion attack
      3. Detecting and mitigating a model poisoning attack
      4. Detecting and mitigating a membership inference attack
      5. Detecting and mitigating a Trojan attack
      6. Detecting and mitigating backdoor (neural) attacks
    5. Summary
    6. Further reading
  12. Part 2 – Creating a Secure System Using ML
  13. Chapter 4: Considering the Threat Environment
    1. Technical requirements
    2. Defining an environment
    3. Understanding business threats
      1. Protecting consumer sites
      2. Understanding malware
      3. Understanding network attacks
      4. Eyeing the small stuff
      5. Dealing with web APIs
      6. Dealing with the hype cycle
    4. Considering social threats
      1. Spam
      2. Identity theft
      3. Unwanted tracking
      4. Remote storage data loss or corruption
      5. Account takeover
    5. Employing ML in security in the real world
      1. Understanding the kinds of application security
      2. Considering the realities of the machine
      3. Adding human intervention
      4. Developing a simple authentication example
      5. Developing a simple spam filter example
    6. Summary
    7. Further reading
  14. Chapter 5: Keeping Your Network Clean
    1. Technical requirements
    2. Defining current network threats
      1. Developing a sense of control over chaos
      2. Implementing access control
      3. Ensuring authentication
      4. Detecting intrusions
      5. Defining localized attacks
      6. Understanding botnets
    3. Considering traditional protections
      1. Working with honeypots
      2. Using data-centric security
      3. Locating subtle intrusion indicators
      4. Using alternative identity strategies
      5. Obtaining data for network traffic testing
    4. Adding ML to the mix
      1. Developing an updated security plan
      2. Determining which features to track
    5. Creating real-time defenses
      1. Using supervised learning example
      2. Using a subprocess in Python example
      3. Working with Flask example
      4. Asking for human intervention
    6. Developing predictive defenses
      1. Defining what is available today
      2. Downsides of predicting the future
      3. Creating a realistic network model
    7. Summary
  15. Chapter 6: Detecting and Analyzing Anomalies
    1. Technical requirements
    2. Defining anomalies
      1. Specifying the causes and effects of anomaly detection
      2. Considering anomaly sources
      3. Understanding when anomalies occur
      4. Using and combining anomaly detection and signature detection
    3. Detecting data anomalies
      1. Checking data validity
      2. Forecasting potential anomalies example
    4. Using anomaly detection effectively in ML
    5. Considering other mitigation techniques
    6. Summary
    7. Further reading
  16. Chapter 7: Dealing with Malware
    1. Technical requirements
    2. Defining malware
      1. Specifying the malware types
      2. Understanding the subtleties of malware
      3. Determining malware goals
    3. Generating malware detection features
      1. Getting the required disassembler
      2. Collecting data about any application
      3. Extracting strings from an executable
      4. Extracting images from an executable
      5. Generating a list of application features
      6. Selecting the most important features
      7. Considering speed of detection
      8. Building a malware detection toolbox
    4. Classifying malware
      1. Obtaining malware samples and labels
      2. Development of a simple malware detection scenario
    5. Summary
    6. Further reading
  17. Chapter 8: Locating Potential Fraud
    1. Technical requirements
    2. Understanding the types of fraud
    3. Defining fraud sources
      1. Considering fraudsters
      2. Considering hackers
      3. Considering other organizations
      4. Considering company insiders
      5. Considering customers (or fraudsters posing as customers)
      6. Obtaining fraud datasets
    4. Considering fraud that occurs in the background
      1. Detecting fraud that occurs when you’re not looking
      2. Building a background fraud detection application
    5. Considering fraud that occurs in real time
      1. Considering the types of real-time fraud
      2. Detecting real-time fraud
    6. Building a fraud detection example
      1. Getting the data
      2. Setting the example up
      3. Splitting the data into train and test sets
      4. Building the model
      5. Performing the analysis
      6. Checking another model
      7. Creating a ROC curve and calculating AUC
    7. Summary
    8. Further reading
  18. Chapter 9: Defending against Hackers
    1. Technical requirements
    2. Considering hacker targets
      1. Hosted systems
      2. Networks
      3. Mobile devices
      4. Customers
      5. Public venues and social media
    3. Defining hacker goals
      1. Data stealing
      2. Data modification
      3. Algorithm modification
      4. System damage
    4. Monitoring and alerting
      1. Considering the importance of lag
      2. An example of detecting behavior
      3. Building and testing an XGBoost regressor
      4. Putting the data in perspective
      5. Predicting new behavior based on the past
      6. Locating other behavioral datasets
    5. Improving security and reliability
    6. Summary
    7. Further reading
  19. Part 3 – Protecting against ML-Driven Attacks
  20. Chapter 10: Considering the Ramifications of Deepfakes
    1. Technical requirements
    2. Defining a deepfake
      1. Modifying media
      2. Common deepfake types
      3. The history of deepfakes
    3. Creating a deepfake computer setup
      1. Installing TensorFlow on a desktop system
      2. Checking for a GPU
    4. Understanding autoencoders
      1. Defining the autoencoder
      2. Working with an autoencoder example
    5. Understanding CNNs and implementing GANs
      1. An overview of a Pix2Pix GAN
      2. Obtaining and viewing the images
      3. Manipulating the images
      4. Developing datasets from the modified images
      5. Creating the generator
      6. Creating the discriminator
      7. Performing optimization of both generator and discriminator
      8. Monitoring the training process
      9. Training the model
    6. Summary
    7. Further reading
  21. Chapter 11: Leveraging Machine Learning for Hacking
    1. Making attacks automatic and personalized
      1. Gaining unauthorized access bypassing CAPTCHA
      2. Automatically harvesting information
    2. Enhancing existing capabilities
      1. Rendering malware less effective using GANs
      2. Putting artificial intelligence in spear-phishing
      3. Generating smart bots for fake news and reviews
    3. Summary
    4. Further reading
  22. Part 4 – Performing ML Tasks in an Ethical Manner
  23. Chapter 12: Embracing and Incorporating Ethical Behavior
    1. Technical requirements
    2. Sanitizing data correctly
      1. Obtaining benefits from data sanitization
      2. Considering the current dataset
      3. Removing PII
      4. Adding traits together to make them less identifiable
      5. Eliminating unnecessary features
    3. Defining data source awareness
      1. Validating user permissions
      2. Using recognizable datasets
      3. Verifying third-party datasets
      4. Obtaining required third-party permissions
    4. Understanding ML fairness
      1. Determining what fairness means
      2. Understanding Simpson’s paradox
      3. Removing personal bias
      4. Defining algorithmic bias
    5. Addressing fairness concerns
      1. Computing fairness indicators with TensorFlow
      2. Solving fairness problems with TensorFlow-constrained optimization
    6. Mitigating privacy risks using federated learning and differential privacy
      1. Distributing data and privacy risks using federated learning
      2. Relying on differential privacy
    7. Summary
    8. Further reading
  24. Index
    1. Why subscribe?
  25. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Share Your Thoughts
    3. Download a free PDF copy of this book

Product information

  • Title: Machine Learning Security Principles
  • Author(s): John Paul Mueller
  • Release date: December 2022
  • Publisher(s): Packt Publishing
  • ISBN: 9781804618851