Commit 18430066 authored by Chong Yidong

* mule.texi (Charsets): Numerous copyedits. Don't discuss the
`charset' property, which is irrelevant to the user manual (Bug#3526).
parent 26581f0e
+2009-10-31 Chong Yidong <cyd@stupidchicken.com>
+* mule.texi (Charsets): Numerous copyedits. Don't discuss the
+`charset' property, which is irrelevant to the user manual (Bug#3526).
 2009-10-14 Juanma Barranquero <lekktu@gmail.com>
 * trouble.texi (DEL Does Not Delete): Fix typo.
......
@@ -1608,51 +1608,50 @@ Use @kbd{C-x 8 C-h} to list all the available @kbd{C-x 8} translations.
 @section Charsets
 @cindex charsets
-Emacs defines most of popular character sets (e.g. ascii,
-iso-8859-1, cp1250, big5, unicode) as @dfn{charsets} and a few of its
-own charsets (e.g. emacs, unicode-bmp, eight-bit). All supported
-characters belong to one or more charsets. Usually you don't have to
-take care of ``charset'', but knowing about it may help understanding
-the behavior of Emacs in some cases.
-One example is a font selection. In each language environment,
-charsets have different priorities. Emacs, at first, tries to use a
-font that matches with charsets of higher priority. For instance, in
-Japanese language environment, the charset @code{japanese-jisx0208}
-has the highest priority (@pxref{Describe Language Environment}). So,
-Emacs tries to use a font whose @code{registry} property is
-``JISX0208.1983-0'' for characters belonging to that charset.
-Another example is a use of @code{charset} text property. When
-Emacs reads a file encoded in a coding systems that uses escape
-sequences to switch charsets (e.g. iso-2022-int-1), the buffer text
-keep the information of the original charset by @code{charset} text
-property. By using this information, Emacs can write the file with
-the same byte sequence as the original.
+In Emacs, @dfn{charset} is short for ``character set''. Emacs
+supports most popular charsets (such as @code{ascii},
+@code{iso-8859-1}, @code{cp1250}, @code{big5}, and @code{unicode}), in
+addition to some charsets of its own (such as @code{emacs},
+@code{unicode-bmp}, and @code{eight-bit}). All supported characters
+belong to one or more charsets.
+Emacs normally ``does the right thing'' with respect to charsets, so
+that you don't have to worry about them. However, it is sometimes
+helpful to know some of the underlying details about charsets.
+One example is font selection (@pxref{Font X}). Each language
+environment (@pxref{Language Environments}) defines a ``priority
+list'' for the various charsets. When searching for a font, Emacs
+initially attempts to find one that can display the highest-priority
+charsets. For instance, in the Japanese language environment, the
+charset @code{japanese-jisx0208} has the highest priority, so Emacs
+tries to use a font whose @code{registry} property is
+@samp{JISX0208.1983-0}.
 @findex list-charset-chars
 @cindex characters in a certain charset
 @findex describe-character-set
-There are two commands for obtaining information about Emacs
+There are two commands that can be used to obtain information about
 charsets. The command @kbd{M-x list-charset-chars} prompts for a
 charset name, and displays all the characters in that character set.
 The command @kbd{M-x describe-character-set} prompts for a charset
-name, and displays information about that charset, including its
+name, and displays information about that charset, including its
 internal representation within Emacs.
 @findex list-character-sets
-To display a list of all the supported charsets, type @kbd{M-x
+To display a list of all supported charsets, type @kbd{M-x
 list-character-sets}. The list gives the names of charsets and
-additional information to identity each charset (see ISO/IEC's this
-page <http://www.itscj.ipsj.or.jp/ISO-IR/> for the detail). In the
-list, charsets are categorized into two; the normal charsets are
-listed first, and the supplementary charsets are listed last. A
-charset in the latter category is used for defining another charset
-(as a parent or a subset), or was used only in Emacs of the older
-versions.
-To find out which charset a character in the buffer belongs to,
-put point before it and type @kbd{C-u C-x =}.
+additional information to identity each charset (see
+@url{http://www.itscj.ipsj.or.jp/ISO-IR/} for details). In this list,
+charsets are divided into two categories: @dfn{normal charsets} are
+listed first, followed by @dfn{supplementary charsets}. A
+supplementary charset is one that is used to define another charset
+(as a parent or a subset), or to provide backward-compatibility for
+older Emacs versions.
+To find out which charset a character in the buffer belongs to, put
+point before it and type @kbd{C-u C-x =} (@pxref{International
+Chars}).
 @ignore
 arch-tag: 310ba60d-31ef-4ce7-91f1-f282dd57b6b3
......