Description
Packed with everything you need to understand Unicode, this comprehensive reference guides you in detail through the complex world of characters. Learn how to identify and classify characters, use their properties, and process character data robustly. Other topics include collation and sorting, line-breaking rules, and Unicode encodings. Suitable for both beginning and seasoned programmers.
Table of Contents
Working with Characters
Chapter 1 Characters as Data
- Introduction to Characters and Unicode
- What’s in a Character?
- Variation of Writing Systems
- Glyphs and Fonts
- Definitions of Character Repertoires
- Numbering Characters
- Encoding Characters as Octet Sequences
- Working with Encodings
- Working with Fonts
- Summaries
Chapter 2 Writing Characters
- Method Varieties
- Keyboard Variation and Settings
- Virtual Keyboards
- Program Commands
- Character Maps
- Replacements on the Fly
- Special Techniques
- Escape Sequences
- Specialized Editors
- Exercise
Chapter 3 Character Sets and Encodings
- Good Old ASCII
- ISO 8859 Codes
- Windows Latin 1 and Other Windows Codes
- Other 8-bit Codes
- Unicode and UTF-8
- Encodings for East Asian Languages
- Converters and Transcoding
- Using Character Codes
A Systematic Look at Unicode
Chapter 4 The Structure of Unicode
- Design Principles
- Versions of Unicode
- Coding Space
- Unicode Terms
- Guide to the Unicode Standard
- Unicode and Fonts
- Criticism of Unicode
- Questions and Answers
Chapter 5 Properties of Characters
- Character Classification
- An Overview of Properties
- Compositions and Decompositions
- Normalization
- Case Properties
- Collation and Sorting
- Text Boundaries
- Directionality
- Line-Breaking Properties
- Unicode Conformance Requirements
- Effects on Choosing Characters
Chapter 6 Unicode Encodings
- Unicode Encodings in General
- UTF-32 and UCS-4
- UTF-16 and UCS-2
- UTF-8
- Byte Order
- Conversions Between Unicode Encodings
- Other Encodings
- Auto-Detecting the Encoding
- Choosing an Encoding
Advanced Unicode Topics
Chapter 7 Characters and Languages
- Writing Systems and IT
- Character Requirements of Languages
- Transliteration and Transcription
- Language Metadata
- Languages and Fonts
Chapter 8 Character Usage
- Basics of Character Usage
- ASCII (Basic Latin)
- Latin-1 Supplement (ISO 8859-1)
- Other Latin Letters
- Other European Alphabetic Scripts
- Diacritic Marks
- Letterlike Symbols
- General Punctuation
- Line Structure Control
- Mathematical and Technical Symbols
- Other Blocks
Chapter 9 The Character Level and Above
- Levels of Text Representation and Processing
- Characters and Markup
- Media Types for Text
Chapter 10 Characters in Internet Protocols
- Information About Encoding
- Characters in MIME
- Content Negotiation and Multilingual Sites
- Characters in Protocol Headers
- Characters in Domain Names and URLs
Chapter 11 Characters in Programming
- Characters in Computer Languages
- Character and String Data
- The Preparedness Principle
- Character Input and Output
- Processing Form Data
- Identifiers, Patterns, and Regular Expressions
- International Components for Unicode (ICU)
- Using Locales
Appendix Tables for Writing Characters
Additional Notes
Colophon
Product Details
- Title: Unicode Explained
- By: Jukka K. Korpela
- Publisher: O'Reilly Media
- Formats: Ebook, Safari Books Online
- Print Release: June 2006
- Ebook Release: June 2009
- Pages: 688
- Print ISBN: 978-0-596-10121-3 | ISBN 10: 0-596-10121-X
- Ebook ISBN: 978-0-596-10586-0 | ISBN 10: 0-596-10586-X
