Diff

Shows the differences between two versions of this page.

--- snow:nagaoka_tigrinya_corpus [2022/05/12 21:17] – admin
+++ snow:nagaoka_tigrinya_corpus [2022/05/12 21:17] (current) – admin
@@ Line 24: / Line 24: @@
   * Tigrinya Grammar by John Mason (1996)
-==== 4. Format of NTC ====
+=== 4. Format of NTC ===
 Tigrinya uses the Ge'ez script as its writing system. The corpus is available in both Ge'ez script and transliterated Latin script. The [[http://yacob.org/papers%2Fetinet.pdf|SERA]] transliteration scheme was used with a few adjustments. The upper-case 'I' is used exclusively to mark the epenthetic vowel (known as 'sads' in the Ge'ez script). For machine readability and flexible manipulation, the corpus was pre-processed (cleaned) and encoded in the [[https://tei-c.org/release/doc/tei-p5-doc/en/html/CC.html|TEI corpus format]]. The retained punctuation marks are ፡ (two dots), ። (four dots), ፧ (three dots), ?, !, "" and (). The first three are specific to languages that use the Ge'ez script. To normalize the corpus, cliticized words (words joined by an apostrophe) are separated into their constituent parts. For example, ክጽሕፍ’ዩ /kISIHIfI’yu/ ‘he will write’ is a cliticized form of the two words ክጽሕፍ /kISIHIfI/ and እዩ /Iyu/. Cliticization arises because it is customary in writing to replace a laryngeal such as እ ‘I’, ኣ ‘a’ or ኢ ‘i’ with an apostrophe.
@@ Line 31: / Line 31: @@
 NTC 1.0 can be used freely for research purposes.
-  - [[https://www.jnlp.org/cgi-priv/download.cgi?id=SNOW/NagaokaTigrinyaCorpus_1.0_tig_T2.rar|Download NTC 1.0]] - TEI format in Ge'ez script
+  * [[https://www.jnlp.org/cgi-priv/download.cgi?id=SNOW/NagaokaTigrinyaCorpus_1.0_tig_T2.rar|Download NTC 1.0]] - TEI format in Ge'ez script
-  - [[https://www.jnlp.org/cgi-priv/download.cgi?id=SNOW/NagaokaTigrinyaCorpus_1.0_rom_T2.rar|Download NTC 1.0]] - TEI format in Latin script (Transliterated)
+  * [[https://www.jnlp.org/cgi-priv/download.cgi?id=SNOW/NagaokaTigrinyaCorpus_1.0_rom_T2.rar|Download NTC 1.0]] - TEI format in Latin script (Transliterated)
 === 6. Contact us ===
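
The clitic normalization described in section 4 can be illustrated with a minimal Python sketch. This is not the corpus authors' pre-processing code: the function name `split_clitic` and the lookup table `CLITIC_EXPANSIONS` are hypothetical, and the table holds only the single example given on the page. It also assumes that an apostrophe inside a token always marks a clitic boundary, which a real corpus would need to verify.

```python
# Hypothetical table mapping the fragment written after the apostrophe to
# the full word with its masked laryngeal restored. Both entries come from
# the single example on the page; a real table would have to be compiled.
CLITIC_EXPANSIONS = {
    "yu": "Iyu",  # transliterated: kISIHIfI'yu -> kISIHIfI + Iyu
    "ዩ": "እዩ",    # Ge'ez script:   ክጽሕፍ’ዩ -> ክጽሕፍ + እዩ
}

# ASCII apostrophe and the typographic right single quotation mark.
APOSTROPHES = ("'", "\u2019")


def split_clitic(token, expansions=CLITIC_EXPANSIONS):
    """Split a cliticized token at its first apostrophe.

    Returns a list of words. The fragment after the apostrophe is expanded
    through the lookup table when it is known, otherwise kept unchanged.
    Tokens without an apostrophe are returned as a one-element list.
    """
    for mark in APOSTROPHES:
        if mark in token:
            host, _, clitic = token.partition(mark)
            if not host or not clitic:
                # Leading or trailing apostrophe: treat as a plain token.
                return [token]
            return [host, expansions.get(clitic, clitic)]
    return [token]


print(split_clitic("kISIHIfI\u2019yu"))  # ['kISIHIfI', 'Iyu']
print(split_clitic("ክጽሕፍ\u2019ዩ"))        # ['ክጽሕፍ', 'እዩ']
```

A lookup table is used here because the apostrophe alone does not say which laryngeal (እ, ኣ or ኢ) was masked, so the full form cannot be derived from the written fragment.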
