Dependent Tiers
Status: Reference
Last updated: 2026-06-22 23:33 EDT
Dependent tiers appear on lines beginning with % immediately after an utterance. They provide annotations linked to the main tier content.
CHAT defines four structural categories of dependent tiers:
- Structured linguistic tiers: parsed into typed AST nodes with word-level alignment
- Phon phonological tiers: syllabification and segmental alignment from the Phon project
- Bullet-content tiers: free-form text with optional inline timing markers
- Text tiers: plain text with no structural alignment
Structured Linguistic Tiers
These tiers have rich, parsed representations in the data model. Each token aligns 1-to-1 with an alignable word on the main tier (excluding retraces, pauses, and events). Terminators (., ?, !) must match the main tier terminator.
%mor, Morphological Analysis
The %mor tier carries part-of-speech tags, lemmas, and morphological features for each word on the main tier. See The %mor Tier for full documentation covering the UD-style format, data model, divergences from Universal Dependencies, and migration from traditional CHAT MOR.
Format: POS|lemma[-Feature]*, with ~ separating post-clitics.
*CHI: she's eating cookies .
%mor: PRON|she~AUX|be-Pres-S3 VERB|eat-Prog NOUN|cookie-Plur .
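As a rough illustration of the format above, the following Python sketch splits the content of a %mor line into words, clitic chunks, and features. The function names are hypothetical and are not part of chatter. It assumes that every hyphen after the lemma separates a feature and that `|` and `~` never occur inside a lemma, which the full format may not guarantee.

```python
TERMINATORS = {".", "?", "!"}


def parse_mor_chunk(chunk):
    """Split one chunk such as 'AUX|be-Pres-S3' into (pos, lemma, features)."""
    pos, _, rest = chunk.partition("|")
    # Assumption: hyphens only ever separate features from the lemma.
    lemma, *features = rest.split("-")
    return pos, lemma, features


def parse_mor_line(content):
    """Parse %mor content (without the '%mor:' prefix) into a list of words.

    Each word is a list of chunks; a word with a post-clitic has two or more.
    """
    words = []
    for token in content.split():
        if token in TERMINATORS:
            continue  # the terminator is not a word
        words.append([parse_mor_chunk(chunk) for chunk in token.split("~")])
    return words
```

Applied to the example line, the first word yields two chunks (`PRON|she` and `AUX|be-Pres-S3`), so three words produce four chunks in total.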
%gra, Grammatical Relations
The %gra tier encodes dependency syntax using Universal Dependencies relation labels. Each entry has the format index|head|relation, where indices are 1-based and head 0 indicates ROOT.
*CHI: I want cookies .
%mor: PRON|I VERB|want NOUN|cookie-Plur .
%gra: 1|2|SUBJ 2|0|ROOT 3|2|OBJ 4|2|PUNCT
The %gra tier aligns with %mor chunks (clitics expand into multiple chunks). Validation checks sequential indices (E721), ROOT structure (E722 missing root, E723 multiple roots), and circular dependencies (E724).
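The checks named above can be sketched in a few lines of Python. This is an illustrative reading of the rules, not chatter's validator: the function names are hypothetical, a root is taken to be any entry whose head is 0, and the exact conditions chatter attaches to each error code are an assumption.

```python
def parse_gra(content):
    """Parse '1|2|SUBJ 2|0|ROOT ...' into (index, head, relation) tuples."""
    entries = []
    for token in content.split():
        index, head, relation = token.split("|")
        entries.append((int(index), int(head), relation))
    return entries


def has_cycle(heads):
    """Follow head links from every entry; revisiting an entry means a cycle."""
    for start in heads:
        seen = set()
        node = start
        while node in heads:  # head 0 is not an entry, so the walk stops there
            if node in seen:
                return True
            seen.add(node)
            node = heads[node]
    return False


def validate_gra(content):
    """Return the error codes (as strings) that this sketch would raise."""
    entries = parse_gra(content)
    errors = []
    # Sequential indices: 1, 2, 3, ... in order.
    if [index for index, _, _ in entries] != list(range(1, len(entries) + 1)):
        errors.append("E721")
    # ROOT structure: exactly one entry attaches to head 0.
    roots = [index for index, head, _ in entries if head == 0]
    if not roots:
        errors.append("E722")
    elif len(roots) > 1:
        errors.append("E723")
    # Circular dependencies.
    if has_cycle({index: head for index, head, _ in entries}):
        errors.append("E724")
    return errors
```

The example tier `1|2|SUBJ 2|0|ROOT 3|2|OBJ 4|2|PUNCT` passes all three checks in this sketch.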
%pho / %mod, Phonological Transcription
The %pho tier records actual pronunciation; %mod records target/model pronunciation. Both use the same format: space-separated phonetic tokens aligned 1-to-1 with main tier words.
*CHI: I want three cookies .
%pho: aɪ wɑnt fwi kʊkiz .
%mod: aɪ wɑnt θri kʊkiz .
Phonological tiers support IPA, UNIBET, X-SAMPA, or custom notation systems. They are used for child language, speech disorders, L2 learning, and dialectal variation studies.
Parsing strategy: We deliberately parse only the minimal word/group-level structure in %pho and %mod needed for coarse alignment with the main tier. The full IPA phoneme content is stored as opaque strings; deep phonological analysis is handled by Phon, and we avoid duplicating that work. The Phon extension tiers (%modsyl, %phosyl, %phoaln) follow the same strategy.
%sin, Gesture and Sign Annotation
The %sin tier codes gestures and signs aligned with speech. Each token is either 0 (no gesture) or g:referent:type (e.g., g:ball:dpoint for a deictic point at a ball).
*CHI: that ball .
%sin: g:ball:dpoint 0 .
Multiple simultaneous gestures use bracket grouping: 〔g:toy:hold g:toy:shake〕.
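A minimal Python sketch of the token shapes described above, assuming only the two forms this section mentions (`0` and `g:referent:type`) plus 〔…〕 grouping. The function names are hypothetical and do not reflect chatter's API.

```python
import re

# Either a bracketed group of simultaneous gestures or a single token.
GROUP_OR_TOKEN = re.compile(r"〔([^〕]*)〕|(\S+)")
TERMINATORS = {".", "?", "!"}


def parse_sin_token(token):
    """Return None for '0' (no gesture) or a (referent, type) pair."""
    if token == "0":
        return None
    kind, referent, gesture_type = token.split(":")
    if kind != "g":
        raise ValueError(f"unexpected %sin token: {token!r}")
    return referent, gesture_type


def parse_sin_line(content):
    """Return one slot per aligned word; each slot lists its gestures."""
    slots = []
    for group, single in GROUP_OR_TOKEN.findall(content):
        if single in TERMINATORS:
            continue
        tokens = [single] if single else group.split()
        slots.append([parse_sin_token(t) for t in tokens])
    return slots
```

Each slot corresponds to one main-tier word, so a bracketed group still occupies a single alignment position.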
%wor, Word Timing
The %wor tier carries word-level timing annotations for media synchronization.
Words may include inline bullets with millisecond timestamps. Word text is
display-only (“eye candy”); timing data comes from the bullet fields.
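To make the word/bullet relationship concrete, here is a small Python sketch that pairs each %wor word with the bullet that follows it, if any. It assumes bullets appear as `•START_END•` tokens separated from words by whitespace, as in the examples in this section; the function name is hypothetical and this is not how chatter parses the tier.

```python
import re

BULLET = re.compile(r"•(\d+)_(\d+)•")
TERMINATORS = {".", "?", "!"}


def parse_wor_line(content):
    """Return (word, start_ms, end_ms) triples; timing is None when absent."""
    tokens = content.split()
    result = []
    i = 0
    while i < len(tokens):
        word = tokens[i]
        i += 1
        if word in TERMINATORS:
            continue
        start = end = None
        # A bullet, when present, is the token directly after its word.
        if i < len(tokens):
            match = BULLET.fullmatch(tokens[i])
            if match:
                start, end = int(match.group(1)), int(match.group(2))
                i += 1
        result.append((word, start, end))
    return result
```

A line without bullets, such as `I want cookies .`, simply yields words with no timing attached.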
⚠ IMPORTANT: %wor word text is the cleaned form, by design.
When chatter serializes a %wor word it writes the word’s cleaned text (the spoken form with surface markers removed), NOT the raw main-tier surface form. This is a deliberate convention (see WorTier::write_chat in crates/talkbank-model/src/model/dependent_tier/wor.rs), chosen for human readability and because %wor exists to anchor timing, not to restate the main tier’s orthography. The generated %wor text and the TextGrid export both use this cleaned form.
Consequence you must know: surface markers carried on a word, such as prosodic lengthening (wabe:) and similar in-word notation, are not preserved in %wor output. A main-tier word wabe: becomes wabe on %wor. A %wor line containing such words therefore does not byte-roundtrip (parse → serialize → reparse changes the surface text), and that is expected, not a bug. %wor is a cleaned, timing-only view; the main tier remains the faithful record of surface forms. Do not “fix” the %wor serializer to emit raw text without an explicit decision to change this convention.
%wor is not a flat “all tokens except punctuation” tier. It follows a
word-level alignment rule:
- Regular words count.
- Fillers (&-um, &-uh, &-you_know) count; they are real spoken words with known phoneme sequences.
- Fragments (&+...) do NOT count: incomplete phoneme sequences; the FA engine cannot reliably anchor partial phonological material.
- Nonwords (&~...) do NOT count: interactional/gestural sounds without stable lexical phoneme content for alignment.
- Untranscribed placeholders (xxx, yyy, www) do NOT count: they have no known phoneme sequence; CTC forced alignment cannot produce timings for unknown material.
- Replacements keep the original spoken word slot for %wor; the replacement text matters for %mor, not %wor. If the original slot is untranscribed or a fragment/nonword, it is still excluded.
- Retrace scope does not change %wor membership.
- Overlap markers do not change %wor membership.
%wor is a timing-annotation tier. Its word count equals the number of Wor-domain
words and may differ from a naive main-tier word count. There is no downstream
positional indexing into %wor; the %wor count is not validated against the
main-tier word count.
*CHI: I want cookies .
%wor: I want cookies .
Exact corpus-shaped contrast:
*CHI: <one &+ss> [/] one play ground .
%wor: one •321809_321969• play •322049_322310• ground •322390_322890• .
# &+ss is a fragment, excluded from %wor regardless of retrace context.
*EXP: &+ih <the what> [/] what's letter &+th is this ?
%wor: the •49103_49163• what •49183_50205• what's •50205_50405• letter •50405_50685• is •50946_51046• this •51086_51586• ?
# Fragments &+ih and &+th excluded; regular words remain.
*EXP: what's is dis [: this] ?
%wor: what's •37050_37471• is •37491_37631• dis •37631_38131• ?
*CHI: xxx snack .
%wor: snack •884668_885168• .
# xxx has no phoneme sequence, excluded from %wor; only snack appears.
*CHI: &~um a boat .
%wor: a •1073779_1073799• boat •1076861_1077361• .
# &~um is a nonword, excluded from %wor.
*CHI: &-mm [<] bananas are good .
%wor: mm •1949506_1949566• bananas •1949566_1949766• are •1949846_1949987• good •1950067_1950567• .
# &-mm is a filler, included in %wor (real spoken word with alignable phoneme sequence).
flowchart TD
A["Main-tier word candidate"] --> B{"Timestamp token /\nomission / empty?"}
B -->|Yes| OUT["Excluded from %wor"]
B -->|No| C{"Untranscribed?\n(xxx/yyy/www)"}
C -->|Yes| OUT
C -->|No| D{"Fragment or nonword?\n(&+ or &~)"}
D -->|Yes| OUT
D -->|No| IN["Counts for %wor\n(word or filler &-)"]
style IN fill:#afa,stroke:#333
style OUT fill:#faa,stroke:#333
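The decision flow above can be sketched in Python as follows. The names are hypothetical and the tokenizer is deliberately crude: it only strips the bracketed annotations and angle brackets that occur in the examples in this section, and it treats a leading `0` as the omission marker, which is an assumption about how omissions reach this stage.

```python
import re

BULLET = re.compile(r"•\d+_\d+•")
UNTRANSCRIBED = {"xxx", "yyy", "www"}
TERMINATORS = {".", "?", "!"}


def counts_for_wor(token):
    """Apply the membership rule to one main-tier word candidate."""
    if not token or BULLET.fullmatch(token) or token.startswith("0"):
        return False  # empty, timestamp token, or omission (assumed '0' prefix)
    if token in UNTRANSCRIBED:
        return False  # no known phoneme sequence
    if token.startswith(("&+", "&~")):
        return False  # fragment or nonword
    return True  # regular word or filler (&-)


def wor_words(main_tier):
    """Return the cleaned words that would occupy %wor slots."""
    # Drop [/], [<], [: this] and similar; replacements keep the spoken word.
    text = re.sub(r"\[[^\]]*\]", " ", main_tier)
    # Retrace scope does not affect membership, so the brackets are discarded.
    text = text.replace("<", " ").replace(">", " ")
    words = []
    for token in text.split():
        if token in TERMINATORS or not counts_for_wor(token):
            continue
        # Fillers appear on %wor without their &- marker.
        words.append(token[2:] if token.startswith("&-") else token)
    return words
```

For example, `wor_words("&-mm [<] bananas are good .")` keeps the filler as `mm`, while `wor_words("xxx snack .")` keeps only `snack`.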
Phon Phonological Tiers
These tiers originate from the Phon
project and provide syllable-annotated phonological transcription and segmental
alignment. They were originally serialized as %x-prefixed user-defined tiers
(%xmodsyl, %xphosyl, %xphoaln) and are being promoted to official CHAT
tiers. Phon stores phonological data in its own XML format; the CHAT
representation is generated by PhonTalk.
%modsyl / %phosyl, Syllabified Phonology
%modsyl is a syllabified version of %mod (target pronunciation); %phosyl
is a syllabified version of %pho (actual pronunciation). Each phoneme is
annotated with a syllable position code (N=nucleus, O=onset, C=coda,
etc.). Words are space-separated and align 1-to-1 with the corresponding
%mod or %pho tier.
*CHI: the best .
%mod: ðə bɛst .
%modsyl: ð:Oə:N b:Oɛ:Ns:Ct:C .
%pho: ðə bɛs .
%phosyl: ð:Oə:N b:Oɛ:Ns:C .
Alignment: Content-based. Stripping position codes (:N, :O, :C, etc.)
and stress markers (ˈ, ˌ) from %modsyl should yield the same phonemes
as %mod; the same holds for %phosyl → %pho.
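The content-based relationship can be illustrated with a short Python sketch. It assumes every position code is a colon followed by a single uppercase ASCII letter, which covers the codes named above but may not cover the full inventory; the function name is hypothetical.

```python
import re

# Assumption: a position code is ':' plus one uppercase ASCII letter.
POSITION_CODE = re.compile(r":[A-Z]")
STRESS_MARKERS = str.maketrans("", "", "ˈˌ")


def strip_syllabification(content):
    """Remove position codes and stress markers from %modsyl/%phosyl content."""
    return POSITION_CODE.sub("", content).translate(STRESS_MARKERS)
```

With the example tiers, stripping `ð:Oə:N b:Oɛ:Ns:Ct:C .` gives `ðə bɛst .`, which matches the %mod line.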
%phoaln, Phone Alignment
%phoaln provides segmental alignment between target and actual IPA,
showing phoneme-by-phoneme correspondence. Each pair uses source↔target
notation; in the example below the left side is the %mod phoneme and the
right side is the %pho phoneme. ∅ marks insertions or deletions.
*CHI: the best .
%phoaln: ð↔ð,ə↔ə b↔b,ɛ↔ɛ,s↔s,t↔∅
Alignment: Positional and word-by-word: word N in %phoaln aligns with
word N in both %mod and %pho.
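A small Python sketch of the pair notation, assuming words are space-separated and pairs within a word are comma-separated, as in the example above. The function names are hypothetical. Projecting one side of every pair and dropping ∅ rebuilds that side's words.

```python
EMPTY = "∅"


def parse_phoaln(content):
    """Parse %phoaln content into a list of words, each a list of (left, right) pairs."""
    words = []
    for word in content.split():
        pairs = [tuple(pair.split("↔")) for pair in word.split(",")]
        words.append(pairs)
    return words


def project(words, side):
    """Rebuild one side (0 = left, 1 = right) as plain words, dropping ∅."""
    return [
        "".join(pair[side] for pair in pairs if pair[side] != EMPTY)
        for pairs in words
    ]
```

For the example line, the left projection is `["ðə", "bɛst"]` and the right projection is `["ðə", "bɛs"]`, matching the %mod and %pho tiers word for word.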
Parsing strategy: Same as %pho/%mod. We parse just enough structure
for alignment (word boundaries for %modsyl/%phosyl, alignment pairs for
%phoaln). IPA phoneme content is treated as opaque strings.
Validation (E725-E728)
Because these are derived views, word counts must match between each Phon tier and its parent IPA tier (for %phoaln, both %mod and %pho):
| Check | Error code |
|---|---|
| %modsyl word count ≠ %mod word count | E725 |
| %phosyl word count ≠ %pho word count | E726 |
| %phoaln word count ≠ %mod word count | E727 |
| %phoaln word count ≠ %pho word count | E728 |
These checks are gated on ParseHealth: if either tier in a pair has parse
errors, the alignment check is suppressed to avoid false positives.
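The gating logic can be sketched as below. The `Tier` class and its `parse_ok` flag are hypothetical stand-ins for whatever ParseHealth exposes; only the tier pairs and error codes come from the table above.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    words: list = field(default_factory=list)
    parse_ok: bool = True  # hypothetical stand-in for the ParseHealth signal


# (derived tier, parent tier, error code), as listed in the table above.
CHECKS = [
    ("modsyl", "mod", "E725"),
    ("phosyl", "pho", "E726"),
    ("phoaln", "mod", "E727"),
    ("phoaln", "pho", "E728"),
]


def check_word_counts(tiers):
    """Return error codes for mismatched pairs; `tiers` maps tier name to Tier."""
    errors = []
    for derived, parent, code in CHECKS:
        a, b = tiers.get(derived), tiers.get(parent)
        if a is None or b is None:
            continue  # one side is absent, nothing to compare
        if not (a.parse_ok and b.parse_ok):
            continue  # gated: a parse error on either side suppresses the check
        if len(a.words) != len(b.words):
            errors.append(code)
    return errors
```

Suppressing the check when either side failed to parse keeps a single malformed tier from also producing a misleading count mismatch.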
Known PhonTalk Export Issue
The PhonTalk XML→CHAT converter writes %mod/%pho through a OneToOne
alignment path that maps IPA words to orthography words and silently drops
extras. The Phon extension tiers (%modsyl, %phosyl, %phoaln) bypass
this path and include all IPA words. In child phonology data where children
produce more IPA words than orthographic targets (~4% of Phon corpus files),
this creates tier-to-tier word count mismatches. The mismatches originate in
the Phon XML source data (orthography↔IPA word count discrepancies) and are
inconsistently handled during CHAT export. This is being investigated in
collaboration with the Phon team.
Bullet-Content Tiers
These tiers contain free-form text with optional embedded timing markers (•START_END•) and picture references (•%pic:"file.jpg"•). They do not align word-by-word with the main tier.
| Tier | Purpose |
|---|---|
| %act | Physical actions, gestures, non-verbal behaviors |
| %cod | Research-specific coding (semantic roles, thematic coding, error classification) |
| %com | Comments, annotations, and contextual notes |
| %exp | Explanations or expansions of ambiguous/incomplete speech |
| %add | Addressee identification in multi-party conversations |
| %spa | Speech act coding (request, assertion, question, directive) |
| %sit | Situational context or setting description |
| %gpx | Extended gesture position coding |
| %int | Intonational contours and prosodic patterns |
%cod is bullet-content in the shared TalkBank AST. In the %cod coding
convention, a word selector such as <w4> scopes the code that follows it
(it names which main-tier word the code applies to) rather than being a code
in its own right.
Example:
*CHI: gimme that .
%act: reaches toward shelf
%com: child is pointing to picture
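The two embedded marker forms named at the start of this section, •START_END• and •%pic:"file.jpg"•, can be separated from the surrounding free text with a sketch like the following. The function name is hypothetical, and the regular expression assumes the markers appear exactly in those two shapes.

```python
import re

# Either a timing bullet (two integers) or a picture reference.
BULLET = re.compile(r'•(?:(\d+)_(\d+)|%pic:"([^"]*)")•')


def split_bullets(content):
    """Return (plain_text, timings, pictures) for bullet-content tier text."""
    timings, pictures = [], []
    for match in BULLET.finditer(content):
        if match.group(3) is not None:
            pictures.append(match.group(3))
        else:
            timings.append((int(match.group(1)), int(match.group(2))))
    # Remove the markers and normalize the whitespace they leave behind.
    plain = " ".join(BULLET.sub(" ", content).split())
    return plain, timings, pictures
```

Text without any markers, such as `reaches toward shelf`, comes back unchanged with empty timing and picture lists.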
Text Tiers
These tiers contain plain text with no bullets, timing, or structural alignment:
| Tier | Purpose |
|---|---|
| %alt | Alternative transcriptions |
| %coh | Cohesion annotation |
| %def | Definitions |
| %eng | English translations (for non-English transcripts) |
| %err | Error annotations |
| %fac | Facial expressions |
| %flo | Flow annotation |
| %gls | Glosses |
| %ort | Orthographic representations |
| %par | Paralinguistic information |
| %tim | Timing information |
User-Defined Tiers
Tiers prefixed with %x (e.g., %xcod, %xact) are user-defined dependent tiers. They are preserved during parsing and roundtrip but receive no structural validation beyond basic format checks. Any %x-prefixed tier is always accepted; this is the open extension point for project-specific annotation.
The Supported Set Is Closed
A dependent tier is valid in chatter only if it is one of the standard tiers documented above (the structured, Phon, bullet-content, and text tiers) or a %x-prefixed user-defined tier. Any other %-tier is invalid CHAT, and chatter rejects the file with error E605 (UnsupportedDependentTier). This is a closed set by design: chatter validate is the binding judgment on CHAT validity, so an unrecognized dependent tier is an error, not a warning.
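The closed-set rule amounts to a simple membership test. The Python sketch below lists the standard tiers documented on this page and accepts any %x-prefixed name; the function name is hypothetical, and it models only the accept/reject decision, not chatter's actual diagnostics.

```python
STANDARD_TIERS = {
    # Structured linguistic tiers
    "mor", "gra", "pho", "mod", "sin", "wor",
    # Phon phonological tiers
    "modsyl", "phosyl", "phoaln",
    # Bullet-content tiers
    "act", "cod", "com", "exp", "add", "spa", "sit", "gpx", "int",
    # Text tiers
    "alt", "coh", "def", "eng", "err", "fac", "flo", "gls", "ort", "par", "tim",
}


def check_tier_name(name):
    """Return None if the tier is accepted, or 'E605' if it is unsupported.

    `name` is the tier label including its leading '%', e.g. '%mor'.
    """
    label = name[1:] if name.startswith("%") else name
    if label.startswith("x"):
        return None  # user-defined extension tier
    if label in STANDARD_TIERS:
        return None
    return "E605"  # UnsupportedDependentTier
```

Because no standard tier label begins with `x`, the prefix test and the set lookup never conflict.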
Deliberate Divergence from CLAN: Retired Legacy Tiers
When TalkBank standardized morphology on a single Universal Dependencies %mor tier (plus %gra for relations), several legacy dependent tiers were retired. CLAN’s check still accepts three of them, so on these chatter is intentionally stricter. This is a deliberate, documented divergence:
| Retired tier | CLAN check | chatter |
|---|---|---|
| %trn | accepts | rejects (E605) |
| %tra | accepts | rejects (E605) |
| %grt | accepts | rejects (E605) |
| %umor | rejects | rejects (E605) |
The modern UD-%mor workflow has one morphology tier (%mor) plus %gra; the older training/translation/variant tiers are no longer part of the format chatter validates. %umor is rejected by both validators and is listed only for completeness. Note that %xtra (with the %x prefix) is a perfectly valid user-defined tier; only the bare %tra is retired.
This is one instance of a general principle: where chatter intentionally departs from CLAN/CHECK behavior, the divergence is documented rather than left implicit. See CHECK Parity Audit.