ODD for TEI-based dictionaries
Eduard Drenth (ed.)

Editor: Eduard Drenth
2020-12-17

Schema tei_dictionaries: Elements

<article>

<article> An article (determiner) for a noun
Member of
Contained by
core: note
dictionaries: def
May contain Character data only
Content model
<content>
 <textNode/>
</content>
Schema Declaration
element article { text }

<body>

<body> A body contains either a superEntry grouping several entries (for example, homonyms) or a single entry.
Contained by
May contain
Content model
<content>
 <alternate>
  <elementRef key="superEntry"/>
  <elementRef key="entry"/>
 </alternate>
</content>
Schema Declaration
element body { superEntry | entry }
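A minimal sketch of a body holding two homonyms under one superEntry. The lemma is illustrative data, and the assumption that superEntry simply wraps entry elements is taken from the description above, since its content model is not shown here.

```xml
<body xmlns="http://www.tei-c.org/ns/1.0">
 <!-- assumption: superEntry directly contains the homonymous entries -->
 <superEntry>
  <entry>
   <form type="lemma"><orth>bank</orth></form>
  </entry>
  <entry>
   <form type="lemma"><orth>bank</orth></form>
  </entry>
 </superEntry>
</body>
```

A dictionary file with a single, non-homonymous lemma would place one entry directly inside body instead.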

<cit>

<cit> A cit is a container holding either quotes, forms (@type='translation') or senses (@type='sensegroup'). When @type='translation', multiple quotes are allowed; when @type='collocation', nested cits of @type='example' are allowed. All types can be described using a note; for @type='proverb' and @type='collocation' a definition may be added.
Attributes
type
Status Required
Legal values are:
translation
A cit of this type contains a translation of either a form or a text
example
A cit of this type contains an example
collocation
A cit of this type contains collocations of a lemma, possibly with translations and examples
proverb
A cit of this type contains proverbs using a lemma, possibly with translations and examples
sensegroup
A cit of this type contains senses that can be seen as a group
Contained by
May contain
core: note
dictionaries: def
Schematron

<s:report test="(@type='sensegroup' and not(tei:sense)) or (tei:form and @type!='translation') or ((@type='example' or @type='collocation' or @type='proverb') and not(tei:quote))">
<s:value-of select="ancestor::tei:TEI/tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:msIdentifier/tei:idno"/>: @type=sensegroup requires cit/sense, @type!=translation|sensegroup requires cit/quote,
form requires @type=translation</s:report>
Schematron

<s:report test="count(tei:quote) > 1 and @type!='translation'">
<s:value-of select="ancestor::tei:TEI/tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:msIdentifier/tei:idno"/>: multiple quotes requires @type=translation</s:report>
Schematron

<s:report test="tei:def and @type!='collocation' and @type!='proverb'">
<s:value-of select="ancestor::tei:TEI/tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:msIdentifier/tei:idno"/>: def is only for collocation and proverb</s:report>
Schematron

<s:report test="@type='translation' and not(@xml:lang)">
<s:value-of select="ancestor::tei:TEI/tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:msIdentifier/tei:idno"/>: translation requires @xml:lang</s:report>
Schematron

<s:report test="tei:cit/tei:cit/tei:cit">
<s:value-of select="ancestor::tei:TEI/tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:msIdentifier/tei:idno"/>: maximum nesting depth for cit is 2</s:report>
Schematron

<s:report test="ancestor::tei:cit and @type!='translation' and @type!='example' and @type!='sensegroup'">
<s:value-of select="ancestor::tei:TEI/tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:msIdentifier/tei:idno"/>: nested cit must be translation, example or sensegroup</s:report>
Content model
<content>
 <alternate>
  <sequence>
   <elementRef key="sense"
    maxOccurs="unbounded"/>
   <elementRef key="note" minOccurs="0"/>
  </sequence>
  <sequence>
   <elementRef key="quote"
    maxOccurs="unbounded"/>
   <elementRef key="def" minOccurs="0"
    maxOccurs="unbounded"/>
   <elementRef key="note" minOccurs="0"/>
   <elementRef key="cit" minOccurs="0"
    maxOccurs="unbounded"/>
  </sequence>
  <sequence>
   <elementRef key="form"
    maxOccurs="unbounded"/>
   <elementRef key="note" minOccurs="0"/>
  </sequence>
 </alternate>
</content>
Schema Declaration
element cit
{
   attribute type
   {
      "translation" | "example" | "collocation" | "proverb" | "sensegroup"
   },
   ( ( sense+, note? ) | ( quote+, def*, note?, cit* ) | ( form+, note? ) )
}
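The rules for cit can be illustrated with a hedged sketch of a collocation that carries a definition, a nested example and a nested translation. The text content is invented for illustration, and plain-text content for quote is an assumption because the quote element is not documented in this part of the schema.

```xml
<cit type="collocation" xmlns="http://www.tei-c.org/ns/1.0">
 <!-- a collocation must contain exactly one quote -->
 <quote>at hand</quote>
 <!-- def is only reported as valid for collocation and proverb -->
 <def>nearby, within easy reach</def>
 <!-- nested cits must be translation, example or sensegroup -->
 <cit type="example">
  <quote>Keep a dictionary at hand.</quote>
 </cit>
 <!-- a translation cit requires @xml:lang -->
 <cit type="translation" xml:lang="nl">
  <quote>bij de hand</quote>
 </cit>
</cit>
```

The order of the children follows the second alternative of the content model: quote+, def*, note?, cit*.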

<def>

<def> A def contains a definition of a sense and may contain either text (with hi, q and ref) or at least one of label, gloss or note.
Module dictionaries
Contained by
May contain
core: note
character data
Schematron

<s:report test="not(node())">
<s:value-of select="ancestor::tei:TEI/tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:msIdentifier/tei:idno"/>: def should contain something</s:report>
Content model
<content>
 <alternate>
  <macroRef key="macro.textLike"/>
  <sequence>
   <elementRef key="label" minOccurs="0"/>
   <elementRef key="gloss" minOccurs="0"/>
   <elementRef key="note" minOccurs="0"/>
  </sequence>
 </alternate>
</content>
Schema Declaration
element def { macro.textLike | ( label?, gloss?, note? ) }
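As a sketch of the two alternatives in the def content model, the fragment below places one text-only def and one structured def inside a proverb. The proverb and definitions are illustrative, and plain-text content for gloss and note is an assumption.

```xml
<cit type="proverb" xmlns="http://www.tei-c.org/ns/1.0">
 <quote>A stitch in time saves nine.</quote>
 <!-- first alternative: text content (macro.textLike) -->
 <def>acting early prevents more work later</def>
 <!-- second alternative: label?, gloss?, note? in that order -->
 <def>
  <gloss>timely repair</gloss>
  <note>said as advice</note>
 </def>
</cit>
```

An empty def element would satisfy the content model but trigger the Schematron report above, because not(node()) would be true.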

<entry>

<entry> An entry contains at least a form. It optionally contains sense(s) of that form and translations, examples, proverbs or collocations in cit elements; note and ref can also be used. Cits directly under an entry are assumed to hold information valid for all senses.
Contained by
May contain
core: note
Schematron

<s:assert test="tei:form/@type='lemma'">
<s:value-of select="tei:form/tei:orth"/>: entry/form/@type must be of type "lemma"</s:assert>
Schematron

<s:report test="tei:cit[@type='translation']/tei:quote or tei:sense/tei:cit[@type='translation']/tei:quote">
<s:value-of select="ancestor::tei:TEI/tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:msIdentifier/tei:idno"/>: entry/cit/@type='translation' requires form, not quote <s:value-of select="ancestor::tei:TEI/tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title"/>
</s:report>
Schematron

<s:report test="tei:cit[@type!='translation']/tei:form">
<s:value-of select="tei:cit[@type!='translation']/tei:form/tei:orth"/>: cit/form requires cit/@type='translation'</s:report>
Content model
<content>
 <sequence>
  <elementRef key="form"/>
  <alternate maxOccurs="unbounded"
   minOccurs="0">

   <elementRef key="note"/>
   <elementRef key="sense"/>
   <elementRef key="cit"/>
   <elementRef key="ref"/>
  </alternate>
 </sequence>
</content>
Schema Declaration
element entry { form, ( note | sense | cit | ref )* }
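A minimal sketch of an entry that satisfies the rules above: the first child is a lemma form, and the translation directly under the entry uses form rather than quote. The words are illustrative data, and using type="lemma" on the translated form is an assumption; the schema only requires that form carries some @type.

```xml
<entry xmlns="http://www.tei-c.org/ns/1.0">
 <!-- entry/form/@type must be "lemma" -->
 <form type="lemma">
  <orth>huis</orth>
  <gram>pos.noun</gram>
 </form>
 <!-- a cit directly under entry applies to all senses -->
 <cit type="translation" xml:lang="en">
  <form type="lemma">
   <orth>house</orth>
  </form>
 </cit>
</entry>
```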

<form>

<form> A form contains at least an orth for the word. Grammar information may be added in gram elements, for which Universal Dependencies terminology is used. Several other elements may be used to describe the form, as well as nested form elements, for example for paradigm forms. A form may also contain text nodes, q and hi, but their use is discouraged; prefer note instead.
Attributes att.namekinds (@namekind)
type
Status Required
Legal values are:
lemma
This type indicates a headword, possibly with nested non-lemma forms
paradigm
This type indicates a paradigm form for the containing lemma; it should be within a lemma form and normally doesn't contain nested forms
synonym
This type indicates a synonym form for the containing lemma; it should be within a lemma form
variant
This type indicates a variant for the containing lemma; it should be within a lemma form
compound
This type indicates a compound for the containing lemma; it should be within a lemma form and shouldn't contain nested forms
Contained by
May contain
core: note
character data
Schematron

<s:report test="@namekind and not(@type='lemma')">
<s:value-of select="tei:orth"/>: define @namekind on a lemma form</s:report>
Schematron

<s:report test="@type='lemma' and parent::tei:form">
<s:value-of select="tei:orth"/>: A lemma form must be at the highest level of a form tree</s:report>
Schematron

<s:report test="tei:form/tei:form/tei:form">
<s:value-of select="tei:orth"/>: maximum 2 levels of form nesting allowed</s:report>
Content model
<content>
 <sequence>
  <elementRef key="orth"/>
  <elementRef key="gram" minOccurs="0"
   maxOccurs="unbounded"/>
  <elementRef key="article" minOccurs="0"/>
  <elementRef key="hyph" minOccurs="0"/>
  <alternate minOccurs="0"
   maxOccurs="unbounded">

   <elementRef key="pron"/>
   <elementRef key="stress"/>
   <elementRef key="form"/>
   <elementRef key="etym"/>
   <elementRef key="note"/>
   <elementRef key="q"/>
   <elementRef key="hi"/>
   <textNode/>
  </alternate>
 </sequence>
</content>
Schema Declaration
element form
{
   att.namekinds.attributes,
   attribute type { "lemma" | "paradigm" | "synonym" | "variant" | "compound" },
   (
      orth,
      gram*,
      article?,
      hyph?,
      ( pron | stress | form | etym | note | q | hi | text )*
   )
}
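The element order in the content model (orth, gram*, article?, hyph?, then the remaining children) can be sketched with a lemma form that nests one paradigm form. The Dutch noun is illustrative data; @namekind is left out because its legal values are not listed here.

```xml
<form type="lemma" xmlns="http://www.tei-c.org/ns/1.0">
 <orth>huis</orth>
 <gram>pos.noun</gram>
 <gram>gender.neut</gram>
 <gram>number.sing</gram>
 <article>het</article>
 <!-- non-lemma forms belong inside the lemma form -->
 <form type="paradigm">
  <orth>huizen</orth>
  <gram>number.plur</gram>
 </form>
</form>
```

Placing a form of type "lemma" inside another form would trigger the second Schematron report above.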

<gram>

<gram> A gram provides grammar information for a form using Universal Dependencies terminology. No @type is used for the category; this simplifies querying and makes an index possible.
Member of
Contained by
core: note
dictionaries: def
May contain Empty element
Content model
<content>
 <valList type="closed">
  <valItem ident="islemma.yes">
   <desc/>
  </valItem>
  <valItem ident="abbr.yes">
   <desc/>
  </valItem>
  <valItem ident="poss.yes">
   <desc/>
  </valItem>
  <valItem ident="reflex.yes">
   <desc/>
  </valItem>
  <valItem ident="prefix.yes">
   <desc/>
  </valItem>
  <valItem ident="suffix.yes">
   <desc/>
  </valItem>
  <valItem ident="prontype.prs">
   <desc>personal pronoun or determiner</desc>
  </valItem>
  <valItem ident="prontype.rcp">
   <desc>reciprocal pronoun</desc>
  </valItem>
  <valItem ident="prontype.art">
   <desc>Article is a special case of determiner that bears the feature of definiteness</desc>
  </valItem>
  <valItem ident="prontype.int">
   <desc>interrogative pronoun, determiner, numeral or adverb</desc>
  </valItem>
  <valItem ident="prontype.rel">
   <desc>relative pronoun, determiner, numeral or adverb</desc>
  </valItem>
  <valItem ident="prontype.ind">
   <desc>indefinite pronoun, determiner, numeral or adverb</desc>
  </valItem>
  <valItem ident="prontype.emp">
   <desc>Emphatic pro-adjectives (determiners) emphasize the nominal they depend on.</desc>
  </valItem>
  <valItem ident="prontype.exc">
   <desc>exclamative determiner</desc>
  </valItem>
  <valItem ident="prontype.dem">
   <desc>Demonstrative pronouns are often parallel to interrogatives.</desc>
  </valItem>
  <valItem ident="case.nom">
   <desc>nominative</desc>
  </valItem>
  <valItem ident="case.acc">
   <desc>accusative</desc>
  </valItem>
  <valItem ident="case.dat">
   <desc>dative</desc>
  </valItem>
  <valItem ident="case.gen">
   <desc>genitive</desc>
  </valItem>
  <valItem ident="case.ins">
   <desc>instrumental / instructive</desc>
  </valItem>
  <valItem ident="case.par">
   <desc>partitive</desc>
  </valItem>
  <valItem ident="tense.past">
   <desc>past tense</desc>
  </valItem>
  <valItem ident="tense.pres">
   <desc>present tense</desc>
  </valItem>
  <valItem ident="tense.fut">
   <desc>future tense</desc>
  </valItem>
  <valItem ident="voice.act">
   <desc>The subject of the verb is the doer of the action (agent).</desc>
  </valItem>
  <valItem ident="voice.pass">
   <desc>The subject of the verb is affected by the action (patient).</desc>
  </valItem>
  <valItem ident="number.sing">
   <desc>A singular noun denotes one person, animal or thing.</desc>
  </valItem>
  <valItem ident="number.plur">
   <desc>A plural noun denotes several persons, animals or things.</desc>
  </valItem>
  <valItem ident="number.ptan">
   <desc>Plurale tantum, some nouns appear only in the plural form even though they denote one thing.</desc>
  </valItem>
  <valItem ident="number.coll">
   <desc>Collective or mass or singulare tantum applies to words that use grammatical singular to describe sets of objects.</desc>
  </valItem>
  <valItem ident="person.first">
   <desc>The first person refers just to the speaker / author and in plural one or more additional persons.</desc>
  </valItem>
  <valItem ident="person.second">
   <desc>The second person refers to the addressee(s).</desc>
  </valItem>
  <valItem ident="person.third">
   <desc>The third person refers to one or more persons that are neither speakers nor addressees.</desc>
  </valItem>
  <valItem ident="verbtype.mod">
   <desc>Verbs that take infinitive of another verb as argument and add various modes of possibility, necessity etc.</desc>
  </valItem>
  <valItem ident="verbtype.tense">
   <desc>Verb used to create periphrastic verb forms (tenses, passives etc.).</desc>
  </valItem>
  <valItem ident="verbform.inf">
   <desc>Infinitive is the citation form of verbs in many languages.</desc>
  </valItem>
  <valItem ident="verbform.part">
   <desc>Participle is a non-finite verb form that shares properties of verbs and adjectives.</desc>
  </valItem>
  <valItem ident="verbform.ger">
   <desc>Gerund is a non-finite verb form that shares properties of verbs and nouns.</desc>
  </valItem>
  <valItem ident="verbform.conv">
   <desc>The converb, also called adverbial participle or transgressive, is a non-finite verb form that shares properties of verbs and adverbs.</desc>
  </valItem>
  <valItem ident="polite.infm">
   <desc>usually meant for communication with family members and close friends.</desc>
  </valItem>
  <valItem ident="polite.form">
   <desc>usually meant for communication with strangers and people of higher social status.</desc>
  </valItem>
  <valItem ident="numtype.ord">
   <desc>ordinal number (first, second, ...)</desc>
  </valItem>
  <valItem ident="numtype.card">
   <desc>cardinal number (one, two, many, ...)</desc>
  </valItem>
  <valItem ident="degree.cmp">
   <desc>comparative, second degree</desc>
  </valItem>
  <valItem ident="degree.sup">
   <desc>superlative, third degree</desc>
  </valItem>
  <valItem ident="mood.imp">
   <desc>The speaker uses imperative to order or ask the addressee to do the action of the verb.</desc>
  </valItem>
  <valItem ident="mood.sub">
   <desc>The subjunctive mood is used under certain circumstances in subordinate clauses, typically for actions that are subjective or otherwise uncertain.</desc>
  </valItem>
  <valItem ident="mood.ind">
   <desc>A verb in indicative merely states that something happens, has happened or will happen.</desc>
  </valItem>
  <valItem ident="gender.masc">
   <desc>masculine gender</desc>
  </valItem>
  <valItem ident="gender.fem">
   <desc>feminine gender</desc>
  </valItem>
  <valItem ident="gender.neut">
   <desc>neuter gender</desc>
  </valItem>
  <valItem ident="gender.com">
   <desc>Some languages do not distinguish masculine/feminine but they do distinguish neuter vs. non-neuter. The non-neuter is called common gender.</desc>
  </valItem>
  <valItem ident="pronoun.drop">
   <desc>Not in universaldependencies. pronoun drop, omission of pronouns because they can be inferred</desc>
  </valItem>
  <valItem ident="pronoun.clitic">
   <desc>Not in universaldependencies. pronoun clitic, most personal pronouns have a clitic form, which is the result of either vowel deletion, vowel reduction, monophthongization or schwa deletion, while there are also cases of suppletion.</desc>
  </valItem>
  <valItem ident="diminutive.dim">
   <desc>Not in universaldependencies. diminutive</desc>
  </valItem>
  <valItem ident="inflection.infl">
   <desc>Not in universaldependencies. inflected</desc>
  </valItem>
  <valItem ident="inflection.uninf">
   <desc>Not in universaldependencies. uninflected</desc>
  </valItem>
  <valItem ident="valency.mtran">
   <desc>Not in universaldependencies. a monotransitive verb takes two arguments (of which one object)</desc>
  </valItem>
  <valItem ident="valency.tran">
   <desc>Not in universaldependencies. a transitive verb requires one or more objects</desc>
  </valItem>
  <valItem ident="valency.intran">
   <desc>Not in universaldependencies. an intransitive verb takes one argument (no object)</desc>
  </valItem>
  <valItem ident="valency.ditran">
   <desc>Not in universaldependencies. a ditransitive verb takes three arguments (of which a direct and an indirect object)</desc>
  </valItem>
  <valItem ident="construction.attr">
   <desc>Not in universaldependencies. attributive</desc>
  </valItem>
  <valItem ident="convertedfrom.adj">
   <desc>Not in universaldependencies. adjective used as another category</desc>
  </valItem>
  <valItem ident="convertedfrom.adv">
   <desc>Not in universaldependencies. adverb used as another category</desc>
  </valItem>
  <valItem ident="convertedfrom.ver">
   <desc>Not in universaldependencies. verb used as another category</desc>
  </valItem>
  <valItem ident="convertedfrom.num">
   <desc>Not in universaldependencies. numeral used as another category</desc>
  </valItem>
  <valItem ident="convertedfrom.pro">
   <desc>Not in universaldependencies. pronoun used as another category</desc>
  </valItem>
  <valItem ident="convertedfrom.part">
   <desc>Not in universaldependencies. verbform part used as another category</desc>
  </valItem>
  <valItem ident="predicate.pred">
   <desc>Not in universaldependencies. statement about the subject</desc>
  </valItem>
  <valItem ident="pos.adj">
   <desc>Adjectives are words that typically modify nouns and specify their properties or attributes.</desc>
  </valItem>
  <valItem ident="pos.adp">
   <desc>Adposition is a cover term for prepositions and postpositions.</desc>
  </valItem>
  <valItem ident="pos.adv">
   <desc>Adverbs are words that typically modify verbs for such categories as time, place, direction or manner.</desc>
  </valItem>
  <valItem ident="pos.aux">
   <desc>An auxiliary is a function word that accompanies the lexical verb of a verb phrase and expresses grammatical distinctions not carried by the lexical verb, such as person, number, tense, mood, aspect, voice or evidentiality.</desc>
  </valItem>
  <valItem ident="pos.cconj">
   <desc>A coordinating conjunction is a word that links words or larger constituents without syntactically subordinating one to the other and expresses a semantic relationship between them.</desc>
  </valItem>
  <valItem ident="pos.det">
   <desc>Determiners are words that modify nouns or noun phrases and express the reference of the noun phrase in context.</desc>
  </valItem>
  <valItem ident="pos.intj">
   <desc>An interjection is a word that is used most often as an exclamation or part of an exclamation.</desc>
  </valItem>
  <valItem ident="pos.noun">
   <desc>Nouns are a part of speech typically denoting a person, place, thing, animal or idea.</desc>
  </valItem>
  <valItem ident="pos.num">
   <desc>A numeral is a word, functioning most typically as a determiner, adjective or pronoun, that expresses a number and a relation to the number, such as quantity, sequence, frequency or fraction.</desc>
  </valItem>
  <valItem ident="pos.part">
   <desc>Particles are function words that must be associated with another word or phrase to impart meaning and that do not satisfy definitions of other universal parts of speech.</desc>
  </valItem>
  <valItem ident="pos.pron">
   <desc>Pronouns are words that substitute for nouns or noun phrases, whose meaning is recoverable from the linguistic or extralinguistic context.</desc>
  </valItem>
  <valItem ident="pos.propn">
   <desc>A proper noun is a noun (or nominal content word) that is the name (or part of the name) of a specific individual, place, or object.</desc>
  </valItem>
  <valItem ident="pos.punct">
   <desc>Punctuation marks are non-alphabetical characters and character groups used in many languages to delimit linguistic units in printed text.</desc>
  </valItem>
  <valItem ident="pos.sconj">
   <desc>A subordinating conjunction is a conjunction that links constructions by making one of them a constituent of the other.</desc>
  </valItem>
  <valItem ident="pos.sym">
   <desc>A symbol is a word-like entity that differs from ordinary words by form, function, or both.</desc>
  </valItem>
  <valItem ident="pos.verb">
   <desc>A verb is a member of the syntactic class of words that typically signal events and actions.</desc>
  </valItem>
  <valItem ident="pos.x">
   <desc>The tag X is used for words that for some reason cannot be assigned a real part-of-speech category.</desc>
  </valItem>
 </valList>
</content>
Legal values are:
islemma.yes
abbr.yes
poss.yes
reflex.yes
prefix.yes
suffix.yes
prontype.prs
personal pronoun or determiner
prontype.rcp
reciprocal pronoun
prontype.art
Article is a special case of determiner that bears the feature of definiteness
prontype.int
interrogative pronoun, determiner, numeral or adverb
prontype.rel
relative pronoun, determiner, numeral or adverb
prontype.ind
indefinite pronoun, determiner, numeral or adverb
prontype.emp
Emphatic pro-adjectives (determiners) emphasize the nominal they depend on.
prontype.exc
exclamative determiner
prontype.dem
Demonstrative pronouns are often parallel to interrogatives.
case.nom
nominative
case.acc
accusative
case.dat
dative
case.gen
genitive
case.ins
instrumental / instructive
case.par
partitive
tense.past
past tense
tense.pres
present tense
tense.fut
future tense
voice.act
The subject of the verb is the doer of the action (agent).
voice.pass
The subject of the verb is affected by the action (patient).
number.sing
A singular noun denotes one person, animal or thing.
number.plur
A plural noun denotes several persons, animals or things.
number.ptan
Plurale tantum, some nouns appear only in the plural form even though they denote one thing.
number.coll
Collective or mass or singulare tantum applies to words that use grammatical singular to describe sets of objects.
person.first
The first person refers just to the speaker / author and in plural one or more additional persons.
person.second
The second person refers to the addressee(s).
person.third
The third person refers to one or more persons that are neither speakers nor addressees.
verbtype.mod
Verbs that take infinitive of another verb as argument and add various modes of possibility, necessity etc.
verbtype.tense
Verb used to create periphrastic verb forms (tenses, passives etc.).
verbform.inf
Infinitive is the citation form of verbs in many languages.
verbform.part
Participle is a non-finite verb form that shares properties of verbs and adjectives.
verbform.ger
Gerund is a non-finite verb form that shares properties of verbs and nouns.
verbform.conv
The converb, also called adverbial participle or transgressive, is a non-finite verb form that shares properties of verbs and adverbs.
polite.infm
usually meant for communication with family members and close friends.
polite.form
usually meant for communication with strangers and people of higher social status.
numtype.ord
ordinal number (first, second, ...)
numtype.card
cardinal number (one, two, many, ...)
degree.cmp
comparative, second degree
degree.sup
superlative, third degree
mood.imp
The speaker uses imperative to order or ask the addressee to do the action of the verb.
mood.sub
The subjunctive mood is used under certain circumstances in subordinate clauses, typically for actions that are subjective or otherwise uncertain.
mood.ind
A verb in indicative merely states that something happens, has happened or will happen.
gender.masc
masculine gender
gender.fem
feminine gender
gender.neut
neuter gender
gender.com
Some languages do not distinguish masculine/feminine but they do distinguish neuter vs. non-neuter. The non-neuter is called common gender.
pronoun.drop
Not in universaldependencies. pronoun drop, omission of pronouns because they can be inferred
pronoun.clitic
Not in universaldependencies. pronoun clitic, most personal pronouns have a clitic form, which is the result of either vowel deletion, vowel reduction, monophthongization or schwa deletion, while there are also cases of suppletion.
diminutive.dim
Not in universaldependencies. diminutive
inflection.infl
Not in universaldependencies. inflected
inflection.uninf
Not in universaldependencies. uninflected
valency.mtran
Not in universaldependencies. a monotransitive verb takes two arguments (of which one object)
valency.tran
Not in universaldependencies. a transitive verb requires one or more objects
valency.intran
Not in universaldependencies. an intransitive verb takes one argument (no object)
valency.ditran
Not in universaldependencies. a ditransitive verb takes three arguments (of which a direct and an indirect object)
construction.attr
Not in universaldependencies. attributive
convertedfrom.adj
Not in universaldependencies. adjective used as another category
convertedfrom.adv
Not in universaldependencies. adverb used as another category
convertedfrom.ver
Not in universaldependencies. verb used as another category
convertedfrom.num
Not in universaldependencies. numeral used as another category
convertedfrom.pro
Not in universaldependencies. pronoun used as another category
convertedfrom.part
Not in universaldependencies. verbform part used as another category
predicate.pred
Not in universaldependencies. statement about the subject
pos.adj
Adjectives are words that typically modify nouns and specify their properties or attributes.
pos.adp
Adposition is a cover term for prepositions and postpositions.
pos.adv
Adverbs are words that typically modify verbs for such categories as time, place, direction or manner.
pos.aux
An auxiliary is a function word that accompanies the lexical verb of a verb phrase and expresses grammatical distinctions not carried by the lexical verb, such as person, number, tense, mood, aspect, voice or evidentiality.
pos.cconj
A coordinating conjunction is a word that links words or larger constituents without syntactically subordinating one to the other and expresses a semantic relationship between them.
pos.det
Determiners are words that modify nouns or noun phrases and express the reference of the noun phrase in context.
pos.intj
An interjection is a word that is used most often as an exclamation or part of an exclamation.
pos.noun
Nouns are a part of speech typically denoting a person, place, thing, animal or idea.
pos.num
A numeral is a word, functioning most typically as a determiner, adjective or pronoun, that expresses a number and a relation to the number, such as quantity, sequence, frequency or fraction.
pos.part
Particles are function words that must be associated with another word or phrase to impart meaning and that do not satisfy definitions of other universal parts of speech.
pos.pron
Pronouns are words that substitute for nouns or noun phrases, whose meaning is recoverable from the linguistic or extralinguistic context.
pos.propn
A proper noun is a noun (or nominal content word) that is the name (or part of the name) of a specific individual, place, or object.
pos.punct
Punctuation marks are non-alphabetical characters and character groups used in many languages to delimit linguistic units in printed text.
pos.sconj
A subordinating conjunction is a conjunction that links constructions by making one of them a constituent of the other.
pos.sym
A symbol is a word-like entity that differs from ordinary words by form, function, or both.
pos.verb
A verb is a member of the syntactic class of words that typically signal events and actions.
pos.x
The tag X is used for words that for some reason cannot be assigned a real part-of-speech category.
Schema Declaration
element gram
{
   "islemma.yes"
 | "abbr.yes"
 | "poss.yes"
 | "reflex.yes"
 | "prefix.yes"
 | "suffix.yes"
 | "prontype.prs"
 | "prontype.rcp"
 | "prontype.art"
 | "prontype.int"
 | "prontype.rel"
 | "prontype.ind"
 | "prontype.emp"
 | "prontype.exc"
 | "prontype.dem"
 | "case.nom"
 | "case.acc"
 | "case.dat"
 | "case.gen"
 | "case.ins"
 | "case.par"
 | "tense.past"
 | "tense.pres"
 | "tense.fut"
 | "voice.act"
 | "voice.pass"
 | "number.sing"
 | "number.plur"
 | "number.ptan"
 | "number.coll"
 | "person.first"
 | "person.second"
 | "person.third"
 | "verbtype.mod"
 | "verbtype.tense"
 | "verbform.inf"
 | "verbform.part"
 | "verbform.ger"
 | "verbform.conv"
 | "polite.infm"
 | "polite.form"
 | "numtype.ord"
 | "numtype.card"
 | "degree.cmp"
 | "degree.sup"
 | "mood.imp"
 | "mood.sub"
 | "mood.ind"
 | "gender.masc"
 | "gender.fem"
 | "gender.neut"
 | "gender.com"
 | "pronoun.drop"
 | "pronoun.clitic"
 | "diminutive.dim"
 | "inflection.infl"
 | "inflection.uninf"
 | "valency.mtran"
 | "valency.tran"
 | "valency.intran"
 | "valency.ditran"
 | "construction.attr"
 | "convertedfrom.adj"
 | "convertedfrom.adv"
 | "convertedfrom.ver"
 | "convertedfrom.num"
 | "convertedfrom.pro"
 | "convertedfrom.part"
 | "predicate.pred"
 | "pos.adj"
 | "pos.adp"
 | "pos.adv"
 | "pos.aux"
 | "pos.cconj"
 | "pos.det"
 | "pos.intj"
 | "pos.noun"
 | "pos.num"
 | "pos.part"
 | "pos.pron"
 | "pos.propn"
 | "pos.punct"
 | "pos.sconj"
 | "pos.sym"
 | "pos.verb"
 | "pos.x"
}
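As a hedged illustration of how the category is carried by the value itself rather than by @type, the sketch below attaches three gram values to a verb lemma. The verb is illustrative data.

```xml
<form type="lemma" xmlns="http://www.tei-c.org/ns/1.0">
 <orth>lopen</orth>
 <!-- each gram holds exactly one value from the closed list -->
 <gram>pos.verb</gram>
 <gram>verbform.inf</gram>
 <gram>valency.intran</gram>
</form>
```

With this encoding a single XPath comparison on the element content, for example //tei:gram[. = 'pos.verb'] with the tei prefix bound to the TEI namespace, would select all verb markers without an additional attribute test.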
Legal values are:
islemma.yes
abbr.yes
poss.yes
reflex.yes
prefix.yes
suffix.yes
prontype.prs
personal pronoun or determiner
prontype.rcp
reciprocal pronoun
prontype.art
Article is a special case of determiner that bears the feature of definiteness
prontype.int
interrogative pronoun, determiner, numeral or adverb
prontype.rel
relative pronoun, determiner, numeral or adverb
prontype.ind
indefinite pronoun, determiner, numeral or adverb
prontype.emp
Emphatic pro-adjectives (determiners) emphasize the nominal they depend on.
prontype.exc
exclamative determiner
prontype.dem
Demonstrative pronouns are often parallel to interrogatives.
case.nom
nominative
case.acc
accusative
case.dat
dative
case.gen
genitive
case.ins
instrumental / instructive
case.par
partitive
tense.past
past tense
tense.pres
present tense
tense.fut
future tense
voice.act
The subject of the verb is the doer of the action (agent).
voice.pass
The subject of the verb is affected by the action (patient).
number.sing
A singular noun denotes one person, animal or thing.
number.plur
A plural noun denotes several persons, animals or things.
number.ptan
Plurale tantum, some nouns appear only in the plural form even though they denote one thing.
number.coll
Collective or mass or singulare tantum applies to words that use grammatical singular to describe sets of objects.
person.first
The first person refers just to the speaker / author and in plural one or more additional persons.
person.second
The second person refers to the addressee(s).
person.third
The third person refers to one or more persons that are neither speakers nor addressees.
verbtype.mod
Verbs that take infinitive of another verb as argument and add various modes of possibility, necessity etc.
verbtype.tense
Verb used to create periphrastic verb forms (tenses, passives etc.).
verbform.inf
Infinitive is the citation form of verbs in many languages.
verbform.part
Participle is a non-finite verb form that shares properties of verbs and adjectives.
verbform.ger
Gerund is a non-finite verb form that shares properties of verbs and nouns.
verbform.conv
The converb, also called adverbial participle or transgressive, is a non-finite verb form that shares properties of verbs and adverbs.
polite.infm
usually meant for communication with family members and close friends.
polite.form
usually meant for communication with strangers and people of higher social status.
numtype.ord
ordinal number (first, second,..)
numtype.card
cardinal number (one, two, many,....)
degree.cmp
comparative, second degree
degree.sup
superlative, third degree
mood.imp
The speaker uses imperative to order or ask the addressee to do the action of the verb.
mood.sub
The subjunctive mood is used under certain circumstances in subordinate clauses, typically for actions that are subjective or otherwise uncertain.
mood.ind
A verb in indicative merely states that something happens, has happened or will happen.
gender.masc
masculine gender
gender.fem
feminine gender
gender.neut
neuter gender
gender.com
Some languages do not distinguish masculine/feminine but they do distinguish neuter vs. non-neuter. The non-neuter is called common gender.
pronoun.drop
Not in universaldependencies. pronoun drop, omission of pronouns because they can be infered
pronoun.clitic
Not in universaldependencies. pronoun clitic, most personal pronouns have a clitic form, which is the result of either vowel deletion, vowel reduction, monophthongization or schwa deletion, while there are also cases of suppletion.
diminutive.dim
Not in universaldependencies. diminutive
inflection.infl
Not in universaldependencies. inflected
inflection.uninf
Not in universaldependencies. uninflected
valency.mtran
Not in universaldependencies. a monotransitive verb takes two arguments (of which one object)
valency.tran
Not in universaldependencies. a transitive verb requires one or more objects
valency.intran
Not in universaldependencies. an intransitive verb takes one argument (no object)
valency.ditran
Not in universaldependencies. a ditransitive verb takes three arguments (of which a direct and an indirect object)
construction.attr
Not in universaldependencies. attributive
convertedfrom.adj
Not in universaldependencies. adjective used as another category
convertedfrom.adv
Not in universaldependencies. adverb used as another category
convertedfrom.ver
Not in universaldependencies. verb used as another category
convertedfrom.num
Not in universaldependencies. numeral used as another category
convertedfrom.pro
Not in universaldependencies. pronoun used as another category
convertedfrom.part
Not in universaldependencies. verbform part used as another category
predicate.pred
Not in universaldependencies. statement about the subject
pos.adj
Adjectives are words that typically modify nouns and specify their properties or attributes.
pos.adp
Adposition is a cover term for prepositions and postpositions.
pos.adv
Adverbs are words that typically modify verbs for such categories as time, place, direction or manner.
pos.aux
An auxiliary is a function word that accompanies the lexical verb of a verb phrase and expresses grammatical distinctions not carried by the lexical verb, such as person, number, tense, mood, aspect, voice or evidentiality.
pos.cconj
A coordinating conjunction is a word that links words or larger constituents without syntactically subordinating one to the other and expresses a semantic relationship between them.
pos.det
Determiners are words that modify nouns or noun phrases and express the reference of the noun phrase in context.
pos.intj
An interjection is a word that is used most often as an exclamation or part of an exclamation.
pos.noun
Nouns are a part of speech typically denoting a person, place, thing, animal or idea.
pos.num
A numeral is a word, functioning most typically as a determiner, adjective or pronoun, that expresses a number and a relation to the number, such as quantity, sequence, frequency or fraction.
pos.part
Particles are function words that must be associated with another word or phrase to impart meaning and that do not satisfy definitions of other universal parts of speech.
pos.pron
Pronouns are words that substitute for nouns or noun phrases, whose meaning is recoverable from the linguistic or extralinguistic context.
pos.propn
A proper noun is a noun (or nominal content word) that is the name (or part of the name) of a specific individual, place, or object.
pos.punct
Punctuation marks are non-alphabetical characters and character groups used in many languages to delimit linguistic units in printed text.
pos.sconj
A subordinating conjunction is a conjunction that links constructions by making one of them a constituent of the other.
pos.sym
A symbol is a word-like entity that differs from ordinary words by form, function, or both.
pos.verb
A verb is a member of the syntactic class of words that typically signal events and actions.
pos.x
The tag X is used for words that for some reason cannot be assigned a real part-of-speech category.

<msIdentifier>

<msIdentifier>
Module msdescription
Contained by
May contain Empty element
Schematron

<note>

<note>
Module core
Contained by
dictionaries: def
May contain
character data
Content model
<content>
 <macroRef key="macro.textLike"/>
</content>
Schema Declaration
element note { macro.textLike }

<purpose>

<purpose> Purpose is used to indicate the capabilities of a dictionary. When a certain capability is present, it is viable to query for data belonging to that capability. Clients are free to act on the registered capabilities or to ignore them.
Attributes
type The type attribute denotes a capability
Status Required
Datatype teidata.enumerated
Legal values are:
formtranslation
The dictionary can be used for form translations
texttranslation
The dictionary can be used for text translations
synonyms
The dictionary can be used for synonyms
variants
The dictionary can be used for variants
compounds
The dictionary can be used for compounds
pronunciation
The dictionary can be used for pronunciation
hyphenation
The dictionary can be used for hyphenation
pos
The dictionary can be used for part of speech
paradigm
The dictionary can be used for paradigm
examples
The dictionary can be used for examples
collocations
The dictionary can be used for collocations
proverbs
The dictionary can be used for proverbs
Contained by
May contain Empty element
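
Example: a minimal sketch of how capabilities might be registered, assuming textDesc may hold more than one purpose element (the repetition is an assumption, it is not stated above). Each purpose is empty and carries only the required type attribute with one of the legal values.

```xml
<textDesc>
  <!-- one empty purpose element per capability; @type is required -->
  <purpose type="formtranslation"/>
  <purpose type="synonyms"/>
  <purpose type="examples"/>
</textDesc>
```

A client reading this header could then decide to offer form translation, synonym and example lookups for this dictionary, and nothing else.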

<quote>

<quote> A quote contains text (with hi, q and ref) and optionally etymological or bibliographical information or a note.
Contained by
May contain
core: note
character data
Content model
<content>
 <macroRef key="macro.textLike"/>
 <elementRef key="etym" minOccurs="0"/>
 <elementRef key="bibl" minOccurs="0"/>
 <elementRef key="note" minOccurs="0"/>
</content>
Schema Declaration
element quote { macro.textLike, etym?, bibl?, note? }
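
Example: a sketch of a quote inside a cit of type example, following the content model above (text first, then an optional note). The sentence and the note text are invented for illustration.

```xml
<cit type="example">
  <!-- text content comes first, the optional note last -->
  <quote>The dog barked all night.<note>informal register</note></quote>
</cit>
```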

<sense>

<sense> A sense contains zero or more ref, def, note and translations, examples, proverbs or collocations in cit elements, followed by an optional form
Contained by
May contain
core: note
dictionaries: def
Content model
<content>
 <sequence>
  <alternate minOccurs="0"
   maxOccurs="unbounded">

   <elementRef key="ref"/>
   <elementRef key="def"/>
   <elementRef key="cit"/>
   <elementRef key="note"/>
  </alternate>
  <elementRef key="form" minOccurs="0"/>
 </sequence>
</content>
Schema Declaration
element sense { ( ref | def | cit | note )*, form? }
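
Example: a sketch of a sense that uses def, cit and note in free order, as the content model allows. The wording is invented, and any attributes the schema may require on sense are not shown here.

```xml
<sense>
  <def>a domesticated carnivorous mammal kept as a pet or working animal</def>
  <cit type="example">
    <quote>The dog barked all night.</quote>
  </cit>
  <note>Also used figuratively.</note>
  <!-- an optional form element could follow here as the last child -->
</sense>
```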

<superEntry>

<superEntry> A superEntry contains entries that are grouped together, for example for homonyms
Contained by
May contain
Content model
<content>
 <elementRef key="entry"
  maxOccurs="unbounded"/>

</content>
Schema Declaration
element superEntry { entry+ }

<TEI>

<TEI> Each article is a TEI document with a teiHeader holding at least an msIdentifier/idno for the article. The contents of the article can be found under text/body. msIdentifier/collection/@ref points to metainfo of the dictionary.
Contained by
May contain
Schematron

<s:assert test="tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:msIdentifier/tei:idno">
<s:value-of select="tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title"/>: missing teiHeader/fileDesc/sourceDesc/msDesc/msIdentifier/idno.</s:assert>
Content model
<content>
 <sequence>
  <elementRef key="teiHeader"/>
  <elementRef key="text"/>
 </sequence>
</content>
Schema Declaration
element TEI { teiHeader, text }
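
Example: a skeleton of a single article document, sketched from the description and the Schematron rule above. The idno value, the title and the file name in collection/@ref are hypothetical, the placement of collection before idno follows standard TEI, and other header elements the schema may require are left out.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <!-- the title is what the Schematron message reports -->
        <title>dog</title>
      </titleStmt>
      <sourceDesc>
        <msDesc>
          <msIdentifier>
            <!-- hypothetical reference to the dictionary metainfo -->
            <collection ref="dictionary-meta.xml"/>
            <!-- hypothetical article identifier; its presence is asserted -->
            <idno>article-0001</idno>
          </msIdentifier>
        </msDesc>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <entry>
        <!-- entry content omitted -->
      </entry>
    </body>
  </text>
</TEI>
```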

<teiCorpus>

<teiCorpus> A teiCorpus document contains a teiHeader with meta information about the dictionary. It may contain the articles in nested TEI elements, but the default application expects articles in separate TEI documents where collection/@ref points to the file containing dictionary metainfo in teiCorpus/teiHeader.
Contained by
May contain
Content model
<content>
 <elementRef key="teiHeader"/>
 <elementRef key="TEI"
  maxOccurs="unbounded" minOccurs="0"/>

</content>
Schema Declaration
element teiCorpus { teiHeader, TEI* }

<text>

<text> A text container contains a body
Contained by
May contain
Content model
<content>
 <elementRef key="body"/>
</content>
Schema Declaration
element text { body }

<textDesc>

<textDesc> Use textDesc/purpose to indicate the capabilities of a dictionary, so clients can know what it can be used for
Contained by
May contain Empty element
Schematron

<s:assert test="ancestor::tei:teiHeader[parent::tei:teiCorpus]">
<s:value-of select="ancestor::tei:TEI/tei:teiHeader/tei:fileDesc/tei:sourceDesc/tei:msDesc/tei:msIdentifier/tei:idno"/>: textDesc/purpose should be in teiCorpus/teiHeader
not in TEI/teiHeader</s:assert>
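
Example: a sketch of a placement that would satisfy the assertion above, with textDesc under the teiHeader of the teiCorpus. The profileDesc wrapper is an assumption taken from standard TEI; this schema may place textDesc differently.

```xml
<teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <!-- fileDesc with dictionary metadata omitted -->
    <profileDesc>
      <!-- textDesc has an ancestor teiHeader whose parent is teiCorpus -->
      <textDesc>
        <purpose type="pos"/>
      </textDesc>
    </profileDesc>
  </teiHeader>
  <!-- articles may be nested here as TEI elements or kept in separate documents -->
</teiCorpus>
```

The same textDesc placed in the teiHeader of an individual TEI article would fail the assertion.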

Schema tei_dictionaries: Attribute classes

att.namekinds

att.namekinds 
Module namesdates
Members form
Attributes
namekind A namekind attribute can be used on form elements to indicate what kind of name it is, for example a place or a person.
Status Optional
Datatype teidata.enumerated
Legal values are:
countryName
placeName
see description for placeName element
geoName
see description for geoName element
orgName
see description for orgName element
personName
see description for personName element
animalName
plantName
birdName
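
Example: a sketch of the namekind attribute on a form. The orth child is an assumption borrowed from standard TEI dictionaries, and any other attributes the schema may require on form are not shown.

```xml
<!-- namekind marks this form as a place name -->
<form namekind="placeName">
  <orth>Ljouwert</orth>
</form>
```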

Schema tei_dictionaries: Macros

macro.textLike

macro.textLike Several elements (i.e. def, quote and note) can contain text with simple styling and references. In a note one may want to explain forms; several elements describing characteristics of forms can be used for this.
Used by
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">

  <textNode/>
  <elementRef key="q"/>
  <elementRef key="hi"/>
  <elementRef key="ref"/>
  <elementRef key="pron"/>
  <elementRef key="stress"/>
  <elementRef key="gram"/>
  <elementRef key="article"/>
 </alternate>
</content>
Declaration
macro.textLike = ( text | q | hi | ref | pron | stress | gram | article )*
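
Example: a sketch of mixed content allowed by macro.textLike, here a note that mentions an article. Only text and the article element are used, since article is declared as text-only; the wording is invented.

```xml
<!-- text and article may be interleaved in any order -->
<note>This noun takes the article <article>de</article> in the singular.</note>
```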
Date: 2020-12-17