Extended Unix Code (EUC) is a multibyte character encoding system used primarily for Japanese, Korean, and simplified Chinese. The structure of EUC is based on the ISO-2022 standard, which specifies a way to represent character sets containing a maximum of 94 characters, or 8836 (94²) characters, or 830584 (94³) characters, as sequences of 7-bit codes. Only ISO-2022 compliant character sets can have EUC forms. Up to four coded character sets (referred to as G0, G1, G2, and G3 or as code sets 0, 1, 2, and 3) can be represented with the EUC scheme. G0 is almost always an ISO-646 compliant coded character set (e.g. US-ASCII/KS X 1003/ISO 646:KR in EUC-KR and US-ASCII/the lower half of JIS X 0201 in EUC-JP) that is invoked on GL (i.e. with the most significant bit cleared).

Property Value
dbo:abstract
  • Extended Unix Code (EUC) is a multibyte character encoding system used primarily for Japanese, Korean, and simplified Chinese. The structure of EUC is based on the ISO-2022 standard, which specifies a way to represent character sets containing a maximum of 94 characters, or 8836 (94²) characters, or 830584 (94³) characters, as sequences of 7-bit codes. Only ISO-2022 compliant character sets can have EUC forms. Up to four coded character sets (referred to as G0, G1, G2, and G3 or as code sets 0, 1, 2, and 3) can be represented with the EUC scheme. G0 is almost always an ISO-646 compliant coded character set (e.g. US-ASCII/KS X 1003/ISO 646:KR in EUC-KR and US-ASCII/the lower half of JIS X 0201 in EUC-JP) that is invoked on GL (i.e. with the most significant bit cleared). To get the EUC form of an ISO-2022 character, the most significant bit of each 7-bit byte of the original ISO-2022 code is set (equivalently, 128 is added to each byte); this lets software easily tell whether a particular byte in a character string belongs to the ISO-646 code or to the ISO-2022 (EUC) code. The most commonly used EUC codes are variable-width encodings in which a character belonging to G0 (the ISO-646 compliant coded character set) takes one byte and a character belonging to G1 (a 94x94 coded character set) takes two bytes. The EUC-CN form of GB2312 and EUC-KR are examples of such two-byte EUC codes. EUC-JP includes characters represented by up to three bytes, whereas a single character in EUC-TW can take up to four bytes. Modern applications are more likely to use UTF-8, which supports all of the characters of the EUC codes and more, and is generally more portable, with fewer vendor deviations and errors. (en)
  • Extended UNIX Coding (abbreviated EUC) is an 8-bit character encoding used mainly for Chinese, Japanese, and Korean. EUC is a collective name for several encodings that, depending on the country, can encode up to four different character sets. Originally developed by the Open Software Foundation (OSF), Unix International (UI), and Unix System Laboratories Pacific (USLP) as the standard encoding for UNIX systems, this encoding is used less and less today, because it has often been superseded by more widespread local encodings (Shift-JIS, Big5, etc.) and/or Unicode (UTF-8). (de)
  • Extended Unix Coding (EUC) is an 8-bit character encoding used primarily for Japanese and Korean. In Japan, this encoding is used heavily by Unix-like operating systems but rarely elsewhere. EUC is nevertheless the least used of the three main Japanese encodings, behind ISO-2022-JP (JIS) and Shift-JIS. (fr)
  • Extended Unix Code is a multibyte character encoding system used mainly for Japanese, Chinese, and Korean. The structure of Extended Unix Code is based on the ISO-2022 standard. This type of encoding is subdivided into: EUC-CN, an encoding based on the GB2312 standard for simplified Chinese characters; EUC-JP, a variant of the JIS encoding based on three elements, named JIS X 0208, JIS X 0212, and JIS X 0201, for the Japanese language; EUC-KR, a variant of the encodings KS X 1001 (also called KS C 5601) and KS X 1003 (also called KS C 5636)/ISO 646:KR/US-ASCII and KS X 2901 (also called KS C 5861), used for the Korean language; EUC-TW, a variant of the US-ASCII and CNS 11643 encodings, rarely used for traditional Chinese characters because the Big5 encoding is more widespread. (it)
  • Extended Unix Code (EUC) is a character encoding scheme commonly used on UNIX. Its variants include Japanese EUC, both JIS X 0208-based (EUC-JP) and JIS X 0213-based (EUC-JIS-2004); Korean EUC (EUC-KR); Simplified Chinese EUC (EUC-CN); and Traditional Chinese EUC (EUC-TW). (ja)
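  • EUC, in full Extended Unix Code, is a method of representing characters using 8-bit codes. EUC was originally developed for Unix systems by a number of Unix companies and was standardized in 1991. EUC is based on the 7-bit coding standard ISO/IEC 2022, so the single-byte code space is 94 and the double-byte code space (row-cell codes) is 94x94. Each row and cell number is represented by adding 0xA0 to it, so as to conform to ISO 2022. EUC is mainly used to represent and store Chinese, Japanese, and Korean text. EUC defines four separate code sets. Code set 0 always corresponds to 7-bit ASCII (or another nationally defined ISO 646 variant) and includes the values of the C0 and G0 areas defined by ISO 2022. Code sets 1, 2, and 3 represent values in the G1 area. Of these, code set 1 represents unadorned characters. Character codes in code set 2 have 0x8E (a C1 control character, also called SS2) as their first byte. Character codes in code set 3 have 0x8F (another C1 control character, also called SS3) as their first byte. Code set 0 is always encoded as a single byte; code sets 2 and 3 are always encoded as at least 2 bytes; code set 1 is encoded as 1 to 3 bytes. (zh)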
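
The byte-level rules in the abstracts above (set the most significant bit of each 7-bit ISO-2022 byte, and use 0x8E/0x8F as the lead bytes for code sets 2 and 3) can be illustrated with a minimal Python sketch. The function names `kuten_to_euc` and `split_euc_jp` are hypothetical, chosen only for this example. The widths used for code sets 2 and 3 (two and three bytes) are assumed from EUC-JP specifically, not from EUC in general; EUC-TW, for instance, uses up to four bytes.

```python
def kuten_to_euc(row: int, cell: int) -> bytes:
    """Map a 94x94 row/cell position (each 1..94) to its two-byte G1 EUC form."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must each be in 1..94")
    # The 7-bit ISO-2022 bytes are 0x20 + index (0x21..0x7E).
    # EUC sets the most significant bit, i.e. adds 0x80 (128) to each byte,
    # which is the same as adding 0xA0 to the row and cell numbers.
    return bytes([(0x20 + row) | 0x80, (0x20 + cell) | 0x80])


def split_euc_jp(data: bytes) -> list:
    """Split EUC-JP bytes into (code_set, raw_bytes) pairs.

    Assumption (EUC-JP only): code set 2 is SS2 (0x8E) plus one byte,
    code set 3 is SS3 (0x8F) plus two bytes.
    """
    pieces = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            code_set, width = 0, 1      # ISO-646 range, high bit clear
        elif lead == 0x8E:
            code_set, width = 2, 2      # SS2 prefix
        elif lead == 0x8F:
            code_set, width = 3, 3      # SS3 prefix
        elif 0xA1 <= lead <= 0xFE:
            code_set, width = 1, 2      # 94x94 set with high bits set
        else:
            raise ValueError(f"unexpected lead byte 0x{lead:02X} at offset {i}")
        chunk = data[i:i + width]
        if len(chunk) < width:
            raise ValueError(f"truncated sequence at offset {i}")
        pieces.append((code_set, chunk))
        i += width
    return pieces
```

For example, `kuten_to_euc(16, 1)` returns `b'\xb0\xa1'`, and `split_euc_jp(b"A\xb0\xa1\x8e\xb1")` yields one piece each from code sets 0, 1, and 2. Because the high bit alone separates code set 0 from the others, the splitter never needs to look at more than the lead byte to decide a character's width.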
dbo:wikiPageID
  • 546341 (xsd:integer)
dbo:wikiPageRevisionID
  • 727864851 (xsd:integer)
rdfs:label
  • Extended Unix Code (en)
  • Extended UNIX Coding (de)
  • Extended Unix Coding (fr)
  • Extended Unix Code (it)
  • Extended Unix Code (ja)
  • EUC (zh)