Book description
Learn to design your own programming language in a hands-on way by building compilers and using preprocessors, transpilers, and more, in this fully refreshed second edition, written by the creator of the Unicon programming language. Purchase of the print or Kindle book includes a free PDF eBook.
Key Features
- Takes a hands-on approach; learn by building the Jzero language, a subset of Java, with example code shown in both the Java and Unicon languages
- Learn how to create parsers, code generators, scanners, and interpreters
- Target bytecode, native code, and preprocess or transpile code into a high-level language
Book Description
There are many reasons to build a programming language: out of necessity, as a learning exercise, or just for fun. Whatever your reasons, this book gives you the tools to succeed.
You’ll build the frontend of a compiler for your language and generate a lexical analyzer and parser using Lex and YACC tools. Then you’ll explore a series of syntax tree traversals before looking at code generation for a bytecode virtual machine or native code. This edition adds a new chapter that explains the differences between preprocessors and transpilers. Code examples have been modernized, expanded, and rigorously tested, and all content has been thoroughly refreshed. You’ll learn to implement code generation techniques using practical examples, including the Unicon Preprocessor and transpiling Jzero code to Unicon. You’ll then move on to domain-specific language features and learn to create them as built-in operators and functions. You’ll also cover garbage collection.
Dr. Jeffery’s experiences building the Unicon language are used to add context to the concepts, and relevant examples are provided in both Unicon and Java so that you can follow along in your language of choice.
By the end of this book, you'll be able to build and deploy your own domain-specific language.
What you will learn
- Analyze requirements for your language and design syntax and semantics.
- Write grammar rules for common expressions and control structures.
- Build a scanner to read source code and generate a parser to check syntax.
- Implement syntax coloring for your code in IDEs like VS Code.
- Write tree traversals and insert information into the syntax tree.
- Implement a bytecode interpreter and run bytecode from your compiler.
- Write native code and run it after assembling and linking using system tools.
- Preprocess and transpile code into another high-level language.
Who this book is for
This book is for software developers interested in inventing their own language or developing a domain-specific language. Computer science students taking compiler design or construction courses will also find this book highly useful as a practical guide to language implementation that supplements more theoretical textbooks. Intermediate or better proficiency in Java or C++ (or another high-level programming language) is assumed.
Table of contents
- Preface
- Section I: Programming Language Frontends
- Why Build Another Programming Language?
- Motivations for writing your own programming language
- Types of programming language implementations
- Organizing a bytecode language implementation
- Languages used in the examples
- The difference between programming languages and libraries
- Applicability to other software engineering tasks
- Establishing the requirements for your language
- Case study – requirements that inspired the Unicon language
- Summary
- Questions
- Programming Language Design
- Scanning Source Code
- Parsing
- Syntax Trees
- Section II: Syntax Tree Traversals
- Symbol Tables
- Checking Base Types
- Checking Types on Arrays, Method Calls, and Structure Accesses
- Intermediate Code Generation
- Syntax Coloring in an IDE
- Writing your own IDE versus supporting an existing one
- Downloading the software used in this chapter
- Adding support for your language to Visual Studio Code
- Integrating a compiler into a programmer’s editor
- Avoiding reparsing the entire file on every change
- Using lexical information to colorize tokens
- Highlighting errors using parse results
- Summary
- Questions
- Section III: Code Generation and Runtime Systems
- Preprocessors and Transpilers
- Understanding preprocessors
- Code generation in the Unicon preprocessor
- The difference between preprocessors and transpilers
- Transpiling Jzero code to Unicon
- Semantic attributes for transpiling to Unicon
- A code generation model for Jzero
- The Jzero to Unicon transpiler code generation method
- Transpiling the base cases: names and literals
- Handling the dot operator
- Mapping Java expressions to Unicon
- Transpiler code for method calls
- Assignments
- Transpiler code for control structures
- Transpiling Jzero declarations
- Transpiling Jzero block statements
- Transpiling a Jzero class into a Unicon package that contains a class
- Summary
- Questions
- Bytecode Interpreters
- Technical requirements
- Understanding what bytecode is
- Comparing bytecode with intermediate code
- Building a bytecode instruction set for Jzero
- Implementing a bytecode interpreter
- Writing a runtime system for Jzero
- Running a Jzero program
- Examining iconx, the Unicon bytecode interpreter
- Summary
- Questions
- Generating Bytecode
- Technical requirements
- Converting intermediate code to Jzero bytecode
- Adding a class for bytecode instructions
- Mapping intermediate code addresses to bytecode addresses
- Implementing the bytecode generator method
- Generating bytecode for simple expressions
- Generating code for pointer manipulation
- Generating bytecode for branches and conditional branches
- Generating code for method calls and returns
- Handling labels and other pseudo-instructions in intermediate code
- Comparing bytecode assembler with binary formats
- Linking, loading, and including the runtime system
- Unicon example – bytecode generation in icont
- Summary
- Questions
- Native Code Generation
- Technical requirements
- Deciding whether to generate native code
- Introducing the x64 instruction set
- Using registers
- Converting intermediate code to x64 code
- Mapping intermediate code addresses to x64 locations
- Implementing the x64 code generator method
- Generating x64 code for simple expressions
- Generating code for pointer manipulation
- Generating native code for branches and conditional branches
- Generating code for method calls and returns
- Handling labels and pseudo-instructions
- Generating x64 output
- Summary
- Questions
- Leave a review!
- Implementing Operators and Built-In Functions
- Domain Control Structures
- Knowing when a new control structure is needed
- Scanning strings in Icon and Unicon
- Rendering regions in Unicon
- Summary
- Questions
- Garbage Collection
- Final Thoughts
- Section IV: Appendix
- Appendix: Unicon Essentials
- Answers
- Other Books You May Enjoy
- Index
Product information
- Title: Build Your Own Programming Language - Second Edition
- Author(s): Clinton L. Jeffery
- Release date: January 2024
- Publisher(s): Packt Publishing
- ISBN: 9781804618028