Learn LLVM 17 - Second Edition

Book description

Learn how to build and use every stage of a real-world compiler, including the frontend, the optimization pipeline, and a new backend, by leveraging the LLVM core libraries.

Key Features

  • Get to grips with using LLVM libraries step by step
  • Understand the high-level design of LLVM compilers and apply these principles to your own compiler
  • Add a new backend to target an unsupported CPU architecture

Book Description

LLVM was built to bridge the gap between the theoretical knowledge found in compiler textbooks and the practical demands of compiler development. With a modular codebase and advanced tools, LLVM empowers developers to build compilers with ease. This book serves as a practical introduction to LLVM, guiding you progressively through complex scenarios and ensuring that you navigate the challenges of building and working with compilers like a pro.

The book starts by showing you how to configure, build, and install LLVM libraries, tools, and external projects. You'll then be introduced to LLVM's design and see how it applies to each compiler stage: frontend, optimizer, and backend. Using a real programming language subset, you'll build a frontend, generate LLVM IR, optimize it through the pipeline, and generate machine code. Advanced chapters extend your expertise, covering topics such as extending LLVM with a new pass, using LLVM tools for debugging, and enhancing the quality of your code. You'll also look at just-in-time (JIT) compilation issues and the current state of JIT compilation support in LLVM. Finally, you'll develop a new backend for LLVM, gaining insights into target description and how instruction selection works.

By the end of this book, you'll have hands-on experience with the LLVM compiler development framework through real-world examples and source code snippets.

What you will learn

  • Configure, compile, and install the LLVM framework
  • Understand how the LLVM source is organized
  • Discover what you need to do to use LLVM in your own projects
  • Explore how a compiler is structured, and implement a tiny compiler
  • Generate LLVM IR for common source language constructs
  • Set up an optimization pipeline and tailor it for your own needs
  • Extend LLVM with transformation passes and clang tooling
  • Add new machine instructions and a complete backend

Who this book is for

This book is for compiler developers, enthusiasts, and engineers new to LLVM. C++ software engineers looking to use compiler-based tools for code analysis and improvement, as well as casual users of LLVM libraries who want to gain more knowledge of LLVM essentials, will also find this book useful. Intermediate-level experience with C++ programming is necessary to understand the concepts covered in this book.

Table of contents

  1. Learn LLVM 17
  2. Contributors
  3. About the authors
  4. About the reviewers
  5. Preface
    1. What’s new in this edition
    2. Who this book is for
    3. What this book covers
    4. To get the most out of this book
    5. Download the example code files
    6. Conventions used
    7. Get in touch
    8. Share Your Thoughts
    9. Download a free PDF copy of this book
  6. Part 1: The Basics of Compiler Construction with LLVM
  7. Chapter 1: Installing LLVM
    1. Compiling LLVM versus installing binaries
    2. Getting the prerequisites ready
      1. Ubuntu
      2. Fedora and RedHat
      3. FreeBSD
      4. OS X
      5. Windows
    3. Cloning the repository and building from source
      1. Configuring Git
      2. Cloning the repository
      3. Creating a build directory
      4. Generating the build system files
      5. Compiling and installing LLVM
    4. Customizing the build process
      1. Variables defined by CMake
      2. Using LLVM-defined build configuration variables
    5. Summary
  8. Chapter 2: The Structure of a Compiler
    1. Building blocks of a compiler
    2. An arithmetic expression language
      1. Formalism for specifying the syntax of a programming language
      2. How does grammar help the compiler writer?
    3. Lexical analysis
      1. A hand-written lexer
    4. Syntactical analysis
      1. A hand-written parser
      2. The abstract syntax tree
    5. Semantic analysis
    6. Generating code with the LLVM backend
      1. Textual representation of LLVM IR
      2. Generating the IR from the AST
      3. The missing pieces – the driver and the runtime library
    7. Summary
  9. Part 2: From Source to Machine Code Generation
  10. Chapter 3: Turning the Source File into an Abstract Syntax Tree
    1. Defining a real programming language
    2. Creating the project layout
    3. Managing the input files for the compiler
    4. Handling messages for the user
    5. Structuring the lexer
    6. Constructing a recursive descent parser
    7. Performing semantic analysis
      1. Handling the scope of names
      2. Using an LLVM-style RTTI for the AST
      3. Creating the semantic analyzer
    8. Summary
  11. Chapter 4: Basics of IR Code Generation
    1. Generating IR from the AST
      1. Understanding the IR code
      2. Learning about the load-and-store approach
      3. Mapping the control flow to basic blocks
    2. Using AST numbering to generate IR code in SSA form
      1. Defining the data structure to hold values
      2. Reading and writing values local to a basic block
      3. Searching the predecessor blocks for a value
      4. Optimizing the generated phi instructions
      5. Sealing a block
      6. Creating the IR code for expressions
      7. Emitting the IR code for a function
      8. Controlling visibility with linkage and name mangling
      9. Converting a type from an AST description into LLVM types
      10. Creating the LLVM IR function
      11. Emitting the function body
    3. Setting up the module and the driver
      1. Wrapping all in the code generator
      2. Initializing the target machine class
      3. Emitting assembler text and object code
    4. Summary
  12. Chapter 5: IR Generation for High-Level Language Constructs
    1. Technical requirements
    2. Working with arrays, structs, and pointers
    3. Getting the application binary interface right
    4. Creating IR code for classes and virtual functions
      1. Implementing single inheritance
      2. Extending single inheritance with interfaces
      3. Adding support for multiple inheritance
    5. Summary
  13. Chapter 6: Advanced IR Generation
    1. Throwing and catching exceptions
      1. Raising an exception
      2. Catching an exception
      3. Integrating the exception handling code into the application
    2. Generating metadata for type-based alias analysis
      1. Understanding the need for additional metadata
      2. Creating TBAA metadata in LLVM
      3. Adding TBAA metadata to tinylang
    3. Adding debug metadata
      1. Understanding the general structure of debug metadata
      2. Tracking variables and their values
      3. Adding line numbers
      4. Adding debug support to tinylang
    4. Summary
  14. Chapter 7: Optimizing IR
    1. Technical requirements
    2. The LLVM pass manager
    3. Implementing a new pass
      1. Developing the ppprofiler pass as a plugin
      2. Adding the pass to the LLVM source tree
    4. Using the ppprofiler pass with LLVM tools
    5. Adding an optimization pipeline to your compiler
      1. Creating an optimization pipeline
      2. Extending the pass pipeline
    6. Summary
  15. Part 3: Taking LLVM to the Next Level
  16. Chapter 8: The TableGen Language
    1. Technical requirements
    2. Understanding the TableGen language
    3. Experimenting with the TableGen language
      1. Defining records and classes
      2. Creating multiple records at once with multiclasses
      3. Simulating function calls
    4. Generating C++ code from a TableGen file
      1. Defining data in the TableGen language
      2. Implementing a TableGen backend
    5. Drawbacks of TableGen
    6. Summary
  17. Chapter 9: JIT Compilation
    1. Technical requirements
    2. LLVM’s overall JIT implementation and use cases
    3. Using JIT compilation for direct execution
      1. Exploring the lli tool
    4. Implementing our own JIT compiler with LLJIT
      1. Integrating the LLJIT engine into the calculator
      2. Code generation changes to support JIT compilation via LLJIT
      3. Building an LLJIT-based calculator
    5. Building a JIT compiler class from scratch
      1. Creating a JIT compiler class
      2. Using our new JIT compiler class
    6. Summary
  18. Chapter 10: Debugging Using LLVM Tools
    1. Technical requirements
    2. Instrumenting an application with sanitizers
      1. Detecting memory access problems with the address sanitizer
      2. Finding uninitialized memory accesses with the memory sanitizer
      3. Pointing out data races with the thread sanitizer
    3. Finding bugs with libFuzzer
      1. Limitations and alternatives
    4. Performance profiling with XRay
    5. Checking the source with the clang static analyzer
      1. Adding a new checker to the clang static analyzer
    6. Creating your own clang-based tool
    7. Summary
  19. Part 4: Roll Your Own Backend
  20. Chapter 11: The Target Description
    1. Setting the stage for a new backend
    2. Adding the new architecture to the Triple class
    3. Extending the ELF file format definition in LLVM
    4. Creating the target description
      1. Adding the register definition
      2. Defining the instruction formats and the instruction information
      3. Creating the top-level file for the target description
    5. Adding the M88k backend to LLVM
    6. Implementing the assembler parser
    7. Creating the disassembler
    8. Summary
  21. Chapter 12: Instruction Selection
    1. Defining the rules of the calling convention
      1. Implementing the rules of the calling convention
    2. Instruction selection via the selection DAG
      1. Implementing DAG lowering – handling legal types and setting operations
      2. Implementing DAG lowering – lowering formal arguments
      3. Implementing DAG lowering – lowering return values
      4. Implementing DAG-to-DAG transformations within instruction selection
    3. Adding register and instruction information
    4. Putting an empty frame lowering in place
    5. Emitting machine instructions
    6. Creating the target machine and the sub-target
      1. Implementing M88kSubtarget
      2. Implementing M88kTargetMachine – defining the definitions
      3. Implementing M88kTargetMachine – adding the implementation
    7. Global instruction selection
      1. Lowering arguments and return values
      2. Legalizing the generic machine instructions
      3. Selecting a register bank for operands
      4. Translating generic machine instructions
      5. Running an example
    8. How to further evolve the backend
    9. Summary
  22. Chapter 13: Beyond Instruction Selection
    1. Adding a new machine function pass to LLVM
      1. Implementing the top-level interface for the M88k target
      2. Adding the TargetMachine implementation for machine function passes
      3. Developing the specifics of the machine function pass
      4. Building newly implemented machine function passes
      5. A glimpse of running a machine function pass with llc
    2. Integrating a new target into the clang frontend
      1. Implementing the driver integration within clang
      2. Implementing ABI support for M88k within clang
      3. Implementing the toolchain support for M88k within clang
      4. Building the M88k target with clang integration
    3. Targeting a different CPU architecture
    4. Summary
  23. Index
    1. Why subscribe?
  24. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Share Your Thoughts
    3. Download a free PDF copy of this book

Product information

  • Title: Learn LLVM 17 - Second Edition
  • Author(s): Kai Nacke, Amy Kwan
  • Release date: January 2024
  • Publisher(s): Packt Publishing
  • ISBN: 9781837631346