
Issue #19908 has been updated by janosch-x (Janosch Müller).


Isn't [this](https://www.unicode.org/reports/tr29/tr29-43.html#Regex_Definitions) the updated regular expression?

```diff
 ccs-base := [\p{L}\p{N}\p{P}\p{S}\p{Zs}]
 ccs-extend := [\p{M}\p{Join_Control}]
 extended_base := ccs-base | hangul-syllable
-crlf := CR LF
+crlf := CR LF | CR | LF
 legacy-core := hangul-syllable | ri-sequence | xpicto-sequence
 legacy-postcore := [Extend ZWJ]
 core := hangul-syllable | ri-sequence | xpicto-sequence
+| conjunctCluster | [^Control CR LF]
 postcore := [Extend ZWJ SpacingMark]
 precore := Prepend
 hangul-syllable := L* (V+ | LV V* | LVT) T* | L+ | T+
 xpicto-sequence := \p{Extended_Pictographic} (Extend* ZWJ \p{Extended_Pictographic})*
+conjunctCluster := \p{InCB=Consonant} ([\p{InCB=Extend} \p{InCB=Linker}]* \p{InCB=Linker} [\p{InCB=Extend} \p{InCB=Linker}]* \p{InCB=Consonant})+
```

----------------------------------------
Feature #19908: Update to Unicode 15.1
https://bugs.ruby-lang.org/issues/19908#change-106054

* Author: nobu (Nobuyoshi Nakada)
* Status: Assigned
* Priority: Normal
* Assignee: duerst (Martin Dürst)
----------------------------------------
Unicode 15.1 has been released.

The current enc-unicode.rb seems to fail because of the `Indic_Conjunct_Break` properties, which carry values.

I'm not sure how these properties should be handled. Should they be written `/\p{InCB_Linker}/` or `/\p{InCB=Linker}/`, as in the comments in that file?

https://github.com/nobu/ruby/tree/unicode-15.1 is the former.



-- 
https://bugs.ruby-lang.org/
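
For anyone who wants to check how a given Ruby build segments text under the rules quoted above, here is a minimal sketch. It uses only `String#grapheme_clusters` and the `\X` regexp escape. The helper name `cluster_count` and the sample strings are illustrative assumptions, not part of the issue or the branch. The last sample is a Devanagari consonant + virama + consonant sequence; whether it counts as one cluster depends on the Unicode version the running Ruby was built with, so the script prints the result without assuming it.

```ruby
# Hypothetical helper: number of extended grapheme clusters in a string,
# as segmented by the running Ruby's own Unicode tables.
def cluster_count(str)
  str.grapheme_clusters.size
end

# Illustrative samples, one per rule in the quoted grammar.
samples = {
  "crlf (CR LF)"         => "\r\n",
  "combining mark"       => "e\u0301",
  "hangul-syllable L V T" => "\u1100\u1161\u11A8",
  "ri-sequence"          => "\u{1F1EF}\u{1F1F5}",
  # Result varies with the Unicode version (the conjunctCluster rule is new in 15.1).
  "conjunctCluster?"     => "\u0915\u094D\u0937",
}

samples.each do |name, str|
  puts format("%-24s %d cluster(s)", name, cluster_count(str))
end

# \X matches one extended grapheme cluster, so scan gives the same segmentation.
p "a\r\nb".scan(/\X/)
```

Running this before and after a Unicode data update is one rough way to see whether the `conjunctCluster` rule has taken effect.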