Book description
First published a decade ago, CJKV Information Processing quickly became the unsurpassed source of information on processing text in Chinese, Japanese, Korean, and Vietnamese. It has now been thoroughly updated to provide web and application developers with the latest techniques and tools for disseminating information directly to audiences in East Asia. This second edition reflects the considerable impact that Unicode, XML, OpenType, and newer operating systems such as Windows XP, Vista, Mac OS X, and Linux have had on East Asian text processing in recent years.
Written by its original author, Ken Lunde, a Senior Computer Scientist in CJKV Type Development at Adobe Systems, this book will help you:
- Learn about CJKV writing systems and scripts, and their transliteration methods
- Explore trends and developments in character sets and encodings, particularly Unicode
- Examine the world of typography, specifically how CJKV text is laid out on a page
- Learn information-processing techniques, such as code conversion algorithms and how to apply them using different programming languages
- Process CJKV text using different platforms, text editors, and word processors
- Become more informed about CJKV dictionaries, dictionary software, and machine translation software and services
- Manage CJKV content and presentation when publishing in print or for the Web
Internationalizing and localizing applications is paramount in today's global market, especially for audiences in East Asia, the fastest-growing segment of the computing world. CJKV Information Processing will help you understand how to develop web and other applications effectively in a field that many find difficult to master.
Table of contents
- Foreword
- Preface
- Chapter 1: CJKV Information Processing Overview
- Writing Systems and Scripts
- Character Set Standards
- Encoding Methods
- Input Methods
- Typography
- Basic Concepts and Terminology FAQ
- What Are All These Abbreviations and Acronyms?
- What Are Internationalization, Globalization, and Localization?
- What Are the Multilingual and Locale Models?
- What Is a Locale?
- What Is Unicode?
- How Are Unicode and ISO 10646 Related?
- What Are Row-Cell and Plane-Row-Cell?
- What Is a Unicode Scalar Value?
- Characters Versus Glyphs: What Is the Difference?
- What Is the Difference Between Typeface and Font?
- What Are Half- and Full-Width Characters?
- Latin Versus Roman Characters
- What Is a Diacritic Mark?
- What Is Notation?
- What Is an Octet?
- What Are Little- and Big-Endian?
- What Are Multiple-Byte and Wide Characters?
- Advice to Readers
- Chapter 2: Writing Systems and Scripts
- Chapter 3: Character Set Standards
- NCS Standards
- CCS Standards
- National Coded Character Set Standards Overview
- ASCII
- ASCII Variations
- CJKV-Roman
- Chinese Character Set Standards—China
- Chinese Character Set Standards—Taiwan
- Chinese Character Set Standards—Hong Kong
- Chinese Character Set Standards—Singapore
- Japanese Character Set Standards
- Korean Character Set Standards
- Vietnamese Character Set Standards
- International Character Set Standards
- Character Set Standard Oddities
- Noncoded Versus Coded Character Sets
- Information Interchange and Professional Publishing
- Future Trends and Predictions
- Advice to Developers
- Chapter 4: Encoding Methods
- Chapter 5: Input Methods
- Chapter 6: Font Formats, Glyph Sets, and Font Tools
- Typeface Design
- How Many Glyphs Can a Font Include?
- Bitmapped Font Formats
- Outline Font Formats
- Glyph Sets
- Ruby Glyphs
- Host-Installed, Printer-Resident, and Embedded Fonts
- Font Development Tools
- Gaiji Handling
- The Gaiji Problem
- SING—Smart INdependent Glyphlets
- Ideographic Variation Sequences
- XKP, A Gaiji Handling Initiative—Obsolete
- Adobe Type Composer (ATC)—Obsolete
- Composite Font Functionality Within Applications
- Gaiji Handling Techniques and Tricks
- Creating Your Own Rearranged Fonts
- Acquiring Gaiji Glyphs and Gaiji Fonts
- Advice to Developers
- Chapter 7: Typography
- Chapter 8: Output Methods
- Chapter 9: Information Processing Techniques
- Language, Country, and Script Codes
- CLDR—Common Locale Data Repository
- Programming Languages
- Code Conversion Algorithms
- Java Programming Examples
- Miscellaneous Algorithms
- Byte Versus Character Handling
- Character Sorting
- Natural Language Processing
- Regular Expressions
- Search Engines
- Code-Processing Tools
- Chapter 10: OSes, Text Editors, and Word Processors
- Chapter 11: Dictionaries and Dictionary Software
- Chapter 12: Web and Print Publishing
- Appendix A: Code Conversion Tables
- Appendix B: Notation Conversion Table
- Appendix C: Perl Code Examples
- Appendix D: Glossary
- Appendix E: Vendor Character Set Standards
- Appendix F: Vendor Encoding Methods
- Appendix G: Chinese Character Sets—China
- Appendix H: Chinese Character Sets—Taiwan
- Appendix I: Chinese Character Sets—Hong Kong
- Appendix J: Japanese Character Sets
- Appendix K: Korean Character Sets
- Appendix L: Vietnamese Character Sets
- Appendix M: Miscellaneous Character Sets
- Bibliography
- Index
Product information
- Title: CJKV Information Processing, 2nd Edition
- Author(s): Ken Lunde
- Release date: December 2008
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9780596800925