ServerSync/lib/wcwidth/__pycache__/grapheme.cpython-314.pyc

153 lines
15 KiB

2026-02-12 02:28:23 +02:00
9<><39>iF4<00><01><><00>Rt^RIHt^RIHt^RIHt^RIHtH t ^RI
H
t ^RI H t HtHtHtHtHtHtHtHtHtHtHtHtHt]'d^RIHt^ t!RR ]4t]!R
R 7R R l4t]!R
R 7RRl4t ]!R
R 7RRl4t!]!R
R 7RRl4t"]!R
R 7RRl4t#!RR] 4t$]!R
R 7RRl4t%RRlt&R%RRllt'RR lt(R!R"lt)R%R#R$llt*R#)&z<>
Grapheme cluster segmentation following Unicode Standard Annex #29.
This module provides pure-Python implementation of the grapheme cluster boundary algorithm as
defined in UAX #29: Unicode Text Segmentation.
https://www.unicode.org/reports/tr29/
)<01> annotations)<01>IntEnum)<01> lru_cache)<02> TYPE_CHECKING<4E>
NamedTuple)<01>bisearch)<0E>
GRAPHEME_L<EFBFBD>
GRAPHEME_T<EFBFBD>
GRAPHEME_V<EFBFBD> GRAPHEME_LV<4C> INCB_EXTEND<4E> INCB_LINKER<45> GRAPHEME_LVT<56>INCB_CONSONANT<4E>GRAPHEME_EXTEND<4E>GRAPHEME_CONTROL<4F>GRAPHEME_PREPEND<4E>GRAPHEME_SPACINGMARK<52>EXTENDED_PICTOGRAPHIC<49>GRAPHEME_REGIONAL_INDICATOR)<01>Iteratorc<01>R<00>]tRt^,tRt^t^t^t^t^t ^t
^t ^t ^t ^ t^
t^ t^ t^ tRtR#)<04>GCBz'Grapheme Cluster Break property values.<2E>N)<14>__name__<5F>
__module__<EFBFBD> __qualname__<5F>__firstlineno__<5F>__doc__<5F>OTHER<45>CR<43>LF<4C>CONTROL<4F>EXTEND<4E>ZWJ<57>REGIONAL_INDICATOR<4F>PREPEND<4E> SPACING_MARK<52>L<>V<>T<>LV<4C>LVT<56>__static_attributes__r<00><00>7/tmp/pip-target-wqrk2shd/lib/python/wcwidth/grapheme.pyrr,sL<00><00>1<> <0A>E<EFBFBD>
<EFBFBD>B<EFBFBD>
<EFBFBD>B<EFBFBD><0F>G<EFBFBD> <0E>F<EFBFBD>
<0B>C<EFBFBD><1A><16><0F>G<EFBFBD><14>L<EFBFBD> <09>A<EFBFBD>
<EFBFBD>A<EFBFBD>
<EFBFBD>A<EFBFBD> <0B>B<EFBFBD>
<0C>Cr.ri)<01>maxsizec<01> <00>V^8<>dQhRRRR/#)<05><00>ucs<63>int<6E>returnrr)<01>formats"r/<00> __annotate__r7Cs<00><00><15><15><13><15><13>r.c<05><><00>V^ 8Xd\P#V^
8Xd\P#VR8Xd\P#\ V\
4'd\P #\ V\4'd\P#\ V\4'd\P#\ V\4'd\P#\ V\4'd\P#\ V\4'd\P #\ V\"4'd\P$#\ V\&4'd\P(#\ V\*4'd\P,#\ V\.4'd\P0#\P2#)z;Return the Grapheme_Cluster_Break property for a codepoint.i )rr r!r$<00> _bisearchrr"rr#rr%rr&rr'rr(r
r)r r*r r+rr,r<00>r3s&r/<00>_grapheme_cluster_breakr;Bs'<00><00>
 <0B>f<EFBFBD>}<7D><12>v<EFBFBD>v<EFBFBD> <0A>
<EFBFBD>f<EFBFBD>}<7D><12>v<EFBFBD>v<EFBFBD> <0A>
<EFBFBD>f<EFBFBD>}<7D><12>w<EFBFBD>w<EFBFBD><0E><10><13>&<26>'<27>'<27><12>{<7B>{<7B><1A><10><13>o<EFBFBD>&<26>&<26><12>z<EFBFBD>z<EFBFBD><19><10><13>1<>2<>2<><12>%<25>%<25>%<25><10><13>&<26>'<27>'<27><12>{<7B>{<7B><1A><10><13>*<2A>+<2B>+<2B><12><1F><1F><1F><10><13>j<EFBFBD>!<21>!<21><12>u<EFBFBD>u<EFBFBD> <0C><10><13>j<EFBFBD>!<21>!<21><12>u<EFBFBD>u<EFBFBD> <0C><10><13>j<EFBFBD>!<21>!<21><12>u<EFBFBD>u<EFBFBD> <0C><10><13>k<EFBFBD>"<22>"<22><12>v<EFBFBD>v<EFBFBD> <0A><10><13>l<EFBFBD>#<23>#<23><12>w<EFBFBD>w<EFBFBD><0E> <0E>9<EFBFBD>9<EFBFBD>r.c<01> <00>V^8<>dQhRRRR/#<00>r2r3r4r5<00>boolr)r6s"r/r7r7fs<00><00>7<>7<>3<EFBFBD>7<>4<EFBFBD>7r.c<05>4<00>\\V\44#)z6Check if codepoint has Extended_Pictographic property.)r>r9rr:s&r/<00>_is_extended_pictographicr@es<00><00> <10> <09>#<23>4<>5<> 6<>6r.c<01> <00>V^8<>dQhRRRR/#r=r)r6s"r/r7r7l<00><00><00>-<2D>-<2D><13>-<2D><14>-r.c<05>4<00>\\V\44#)z,Check if codepoint has InCB=Linker property.)r>r9r r:s&r/<00>_is_incb_linkerrDk<00><00><00> <10> <09>#<23>{<7B>+<2B> ,<2C>,r.c<01> <00>V^8<>dQhRRRR/#r=r)r6s"r/r7r7rs<00><00>0<>0<>C<EFBFBD>0<>D<EFBFBD>0r.c<05>4<00>\\V\44#)z/Check if codepoint has InCB=Consonant property.)r>r9rr:s&r/<00>_is_incb_consonantrHqs<00><00> <10> <09>#<23>~<7E>.<2E> /<2F>/r.c<01> <00>V^8<>dQhRRRR/#r=r)r6s"r/r7r7xrBr.c<05>4<00>\\V\44#)z,Check if codepoint has InCB=Extend property.)r>r9r r:s&r/<00>_is_incb_extendrKwrEr.c<01>0<00>]tRt^}t$RtR]R&R]R&RtR#)<08> BreakResultz*Result of grapheme cluster break decision.r><00> should_breakr4<00>ri_countrN)rrrrr<00>__annotations__r-rr.r/rMrM}s<00><00>4<><16><16><11>Mr.rMc<01>$<00>V^8<>dQhRRRRRR/#)r2<00>prev_gcbr<00>curr_gcbr5zBreakResult | Noner)r6s"r/r7r7<00>s"<00><00>-<10>-<10>#<23>-<10><13>-<10>9K<39>-r.c<05>J<00>V\P8Xd#V\P8Xd\R^R7#V\P\P\P39d\R^R7#V\P\P\P39d\R^R7#V\P
8XdQV\P
\P \P\P39d\R^R7#V\P\P 39d3V\P \P39d\R^R7#V\P\P39d#V\P8Xd\R^R7#V\P8Xd\R^R7#V\P8Xd\R^R7#V\P8Xd\R^R7#R#)z<>
Check simple GCB-pair-based break rules (cacheable).
Returns BreakResult for rules that can be determined from GCB properties alone, or None if
complex lookback rules (GB9c, GB11) need to be checked.
F<EFBFBD>rNrOTN) rr r!rMr"r(r)r+r,r*r#r'r&)rRrSs&&r/<00>_simple_break_checkrV<00>st<00><00><10>3<EFBFBD>6<EFBFBD>6<EFBFBD><19>h<EFBFBD>#<23>&<26>&<26>0<><1A><05><01>:<3A>:<3A><10>C<EFBFBD>K<EFBFBD>K<EFBFBD><13><16><16><13><16><16>0<>0<><1A><04>q<EFBFBD>9<>9<><10>C<EFBFBD>K<EFBFBD>K<EFBFBD><13><16><16><13><16><16>0<>0<><1A><04>q<EFBFBD>9<>9<><10>3<EFBFBD>5<EFBFBD>5<EFBFBD><18>X<EFBFBD>#<23>%<25>%<25><13><15><15><03><06><06><03><07><07>)H<>H<><1A><05><01>:<3A>:<3A><10>C<EFBFBD>F<EFBFBD>F<EFBFBD>C<EFBFBD>E<EFBFBD>E<EFBFBD>?<3F>"<22>x<EFBFBD>C<EFBFBD>E<EFBFBD>E<EFBFBD>3<EFBFBD>5<EFBFBD>5<EFBFBD>><3E>'A<><1A><05><01>:<3A>:<3A><10>C<EFBFBD>G<EFBFBD>G<EFBFBD>S<EFBFBD>U<EFBFBD>U<EFBFBD>#<23>#<23><08>C<EFBFBD>E<EFBFBD>E<EFBFBD>(9<><1A><05><01>:<3A>:<3A><10>3<EFBFBD>:<3A>:<3A><1D><1A><05><01>:<3A>:<3A><10>3<EFBFBD>#<23>#<23>#<23><1A><05><01>:<3A>:<3A><10>3<EFBFBD>;<3B>;<3B><1E><1A><05><01>:<3A>:<3A> r.c <01>0<00>V^8<>dQhRRRRRRRRRRR R
/#) r2rRrrS<00>text<78>str<74>curr_idxr4rOr5rMr)r6s"r/r7r7<00>sL<00><00>@=<3D>@=<3D><11>@=<3D><11>@=<3D> <0E>@=<3D><12> @=<3D>
<12> @=<3D> <11> @=r.c<05><><00>\W4pVeV#V\P8Xd\R^R7#\ W#,4p\ V4'd<>RpV^,
pV^8<>dt\ W(,4p \ V 4'dRpV^,pK6\V 4'd V^,pKR\ V 4'dV'd\R^R7#MV\P8Xd}\V4'dlV^,
pV^8<>d\\ W(,4p \V 4p
V
\P8Xd V^,pKC\V 4'd\R^R7#V\P8XdEV\P8Xd0V^,^8Xd\RV^,R7#\R^R7#V\P8Xd^M^p\RVR7#)z<>
Determine if there should be a grapheme cluster break between prev and curr.
Implements UAX #29 grapheme cluster boundary rules.
FrUT) rVrr$rM<00>ordrHrDrKr@r;r#r%) rRrSrXrZrO<00>result<6C>curr_ucs<63>
has_linker<EFBFBD>i<>prev_ucs<63> prev_props &&&&& r/<00> _should_breakrc<00>s<><00><00>!<21><18> 4<>F<EFBFBD> <0A><19><15> <0A><10>3<EFBFBD>7<EFBFBD>7<EFBFBD><1A><1A><05><01>:<3A>:<3A>
<13>4<EFBFBD>><3E>"<22>H<EFBFBD><19>(<28>#<23>#<23><1A>
<EFBFBD> <14>q<EFBFBD>L<EFBFBD><01><0F>1<EFBFBD>f<EFBFBD><1A>4<EFBFBD>7<EFBFBD>|<7C>H<EFBFBD><1E>x<EFBFBD>(<28>(<28>!<21>
<EFBFBD><11>Q<EFBFBD><06><01> <20><18>*<2A>*<2A><11>Q<EFBFBD><06><01>#<23>H<EFBFBD>-<2D>-<2D><1D>&<26>E<EFBFBD>A<EFBFBD>F<>F<><15><15><10>3<EFBFBD>7<EFBFBD>7<EFBFBD><1A>8<><18>B<>B<> <14>q<EFBFBD>L<EFBFBD><01><0F>1<EFBFBD>f<EFBFBD><1A>4<EFBFBD>7<EFBFBD>|<7C>H<EFBFBD>/<2F><08>9<>I<EFBFBD><18>C<EFBFBD>J<EFBFBD>J<EFBFBD>&<26><11>Q<EFBFBD><06><01>*<2A>8<EFBFBD>4<>4<>"<22><05><01>B<>B<><15><10>3<EFBFBD>)<29>)<29>)<29>h<EFBFBD>#<23>:P<>:P<>.P<> <13>a<EFBFBD><<3C>1<EFBFBD> <1C><1E>E<EFBFBD>H<EFBFBD>q<EFBFBD>L<EFBFBD>I<> I<><1A><04>q<EFBFBD>9<>9<><1D><03> 6<> 6<>6<>q<EFBFBD>A<EFBFBD>H<EFBFBD> <16>D<EFBFBD>8<EFBFBD> <<3C><r.Nc<01>(<00>V^8<>dQhRRRRRRRR/#<00> r2<00>unistrrY<00>startr4<00>endz
int | Noner5z Iterator[str]r)r6s"r/r7r7<00>s6<00><00>A$<24>A$<24> <0F>A$<24> <0E>A$<24>
<14>A$<24><13> A$r.c#<05><>"<00>V'gR#\V4pVfTpW8<>gW8<>dR#\W#4pTp^p\\W,44pV\P
8Xd^p\ V^,V4FRp\\W,44p\WhWV4p V PpV P'd WVx<00>TpTpKT WVx<00>R#5i)a 
Iterate over grapheme clusters in a Unicode string.
Grapheme clusters are "user-perceived characters" - what a user would
consider a single character, which may consist of multiple Unicode
codepoints (e.g., a base character with combining marks, emoji sequences).
:param unistr: The Unicode string to segment.
:param start: Starting index (default 0).
:param end: Ending index (default len(unistr)).
:yields: Grapheme cluster substrings.
Example::
>>> list(iter_graphemes('cafe\u0301'))
['c', 'a', 'f', 'e\u0301']
>>> list(iter_graphemes('\U0001F468\u200D\U0001F469\u200D\U0001F467'))
['\U0001F468\u200D\U0001F469\u200D\U0001F467']
>>> list(iter_graphemes('\U0001F1FA\U0001F1F8'))
['\U0001F1FA\U0001F1F8']
.. versionadded:: 0.3.0
N)
<EFBFBD>len<65>minr;r\rr%<00>rangercrOrN)
rfrgrh<00>length<74> cluster_startrOrR<00>idxrSr]s
&&& r/<00>iter_graphemesrp<00>s<><00><00><00>8 <12><0E> <10><16>[<5B>F<EFBFBD>
<EFBFBD>{<7B><14><03> <0C>|<7C>u<EFBFBD><EFBFBD><0E>
<0A>c<EFBFBD>
<1A>C<EFBFBD><1A>M<EFBFBD><10>H<EFBFBD>'<27>s<EFBFBD>6<EFBFBD>=<3D>'9<>:<3A>H<EFBFBD><10>3<EFBFBD>)<29>)<29>)<29><14><08><14>U<EFBFBD>Q<EFBFBD>Y<EFBFBD><03>$<24><03>*<2A>3<EFBFBD>v<EFBFBD>{<7B>+;<3B><<3C><08><1E>x<EFBFBD>6<EFBFBD><08>I<><06><19>?<3F>?<3F><08> <11> <1E> <1E> <1E><18>s<EFBFBD>+<2B> +<2B><1F>M<EFBFBD><1B><08>%<25> <11>s<EFBFBD>
#<23>#<23>s<00>CCc<01>$<00>V^8<>dQhRRRRRR/#)r2rXrY<00>posr4r5r)r6s"r/r7r7<s!<00><00>1<19>1<19>c<EFBFBD>1<19><03>1<19><03>1r.c<05>t<00>\W^,
,4pV^
8Xd%V^8<>dW^,
,R8Xd
V^,
#V^<5E>8dgV^8<>dWV^ 8<>dP\W^,
,4pV^<5E>8<EFBFBD>d1\V4\P8Xd\ W^,
4#V^,
#V^,
pV^8<>d`W,
\
8dO\W,4p^ Tu;8:d^<5E>8dMMM*\V4\P 8XdM V^,pKfTp\\W,44pV\P8Xd^M^p\V^,V4FLp \\W ,44p
\WzW V4p V PpV P'dT pT
pKN V#)ac
Find the start of the grapheme cluster containing the character before pos.
Scans backwards from pos to find a safe starting point, then iterates forward using standard
break rules to find the actual cluster boundary.
:param text: The Unicode string.
:param pos: Position to search before (exclusive).
:returns: Start position of the grapheme cluster.
<EFBFBD> ) r\r;rr&<00>_find_cluster_start<72>MAX_GRAPHEME_SCANr"r%rlrcrOrN) rXrr<00> target_cp<63>prev_cp<63>
safe_start<EFBFBD>cprn<00>left_gcbrOr`<00> right_gcbr]s && r/ruru<sp<00><00><14>D<EFBFBD>q<EFBFBD><17>M<EFBFBD>"<22>I<EFBFBD><11>D<EFBFBD><18>S<EFBFBD>A<EFBFBD>X<EFBFBD>$<24>Q<EFBFBD>w<EFBFBD>-<2D>4<EFBFBD>*?<3F><12>Q<EFBFBD>w<EFBFBD><0E><11>4<EFBFBD><17> <0E>!<21>8<EFBFBD> <09>T<EFBFBD>)<29><19>$<24>Q<EFBFBD>w<EFBFBD>-<2D>(<28>G<EFBFBD><16>$<24><EFBFBD>#:<3A>7<EFBFBD>#C<>s<EFBFBD>{<7B>{<7B>#R<>*<2A>4<EFBFBD>q<EFBFBD><17>9<>9<><12>Q<EFBFBD>w<EFBFBD><0E><15>q<EFBFBD><17>J<EFBFBD>
<14>q<EFBFBD>.<2E>c<EFBFBD>.<2E>2C<32>C<> <10><14>!<21> "<22><02> <0F>2<EFBFBD> <1C><04> <1C> <11> "<22>2<EFBFBD> &<26>#<23>+<2B>+<2B> 5<> <11><12>a<EFBFBD><0F>
<EFBFBD><1F>M<EFBFBD>&<26>s<EFBFBD>4<EFBFBD>+;<3B>'<<3C>=<3D>H<EFBFBD><1C><03> 6<> 6<>6<>q<EFBFBD>A<EFBFBD>H<EFBFBD> <12>:<3A><01>><3E>3<EFBFBD> '<27><01>+<2B>C<EFBFBD><04><07>L<EFBFBD>9<> <09><1E>x<EFBFBD>D<EFBFBD>X<EFBFBD>F<><06><19>?<3F>?<3F><08> <11> <1E> <1E> <1E><1D>M<EFBFBD><1C><08> (<28> <19>r.c<01>$<00>V^8<>dQhRRRRRR/#)r2rfrYrrr4r5r)r6s"r/r7r7ps!<00><00>><3E>><3E>S<EFBFBD>><3E>s<EFBFBD>><3E>s<EFBFBD>>r.c <05>R<00>V^8:d^#\V\V\V444#)a<>
Find the grapheme cluster boundary immediately before a position.
:param unistr: The Unicode string to search.
:param pos: Position in the string (0 < pos <= len(unistr)).
:returns: Start index of the grapheme cluster containing the character at pos-1.
Example::
>>> grapheme_boundary_before('Hello \U0001F44B\U0001F3FB', 8)
6
>>> grapheme_boundary_before('a\r\nb', 3)
1
.. versionadded:: 0.3.6
)rurkrj)rfrrs&&r/<00>grapheme_boundary_beforerps&<00><00>" <0B>a<EFBFBD>x<EFBFBD><10> <1E>v<EFBFBD>s<EFBFBD>3<EFBFBD><03>F<EFBFBD> <0B>'<<3C> =<3D>=r.c<01>(<00>V^8<>dQhRRRRRRRR/#rer)r6s"r/r7r7<00>s0<00><00>&<1C>&<1C> <0F>&<1C> <0E>&<1C>
<14>&<1C><13> &r.c#<05><>"<00>V'gR#\V4pVfTM
\W#4p\V^4pW8<>gW8<>dR#TpWA8<41>d\W4pWQ8dR#WVx<00>TpK#R#5i)ay
Iterate over grapheme clusters in reverse order (last to first).
:param unistr: The Unicode string to segment.
:param start: Starting index (default 0).
:param end: Ending index (default len(unistr)).
:yields: Grapheme cluster substrings in reverse order.
Example::
>>> list(iter_graphemes_reverse('cafe\u0301'))
['e\u0301', 'f', 'a', 'c']
.. versionadded:: 0.3.6
N)rjrk<00>maxru)rfrgrhrmrrrns&&& r/<00>iter_graphemes_reverser<65><00>so<00><00><00>( <12><0E> <10><16>[<5B>F<EFBFBD><17>K<EFBFBD>&<26>S<EFBFBD><13>%5<>C<EFBFBD> <0F><05>q<EFBFBD>M<EFBFBD>E<EFBFBD> <0C>|<7C>u<EFBFBD><EFBFBD><0E>
<0A>C<EFBFBD>
<0A>+<2B>+<2B>F<EFBFBD>8<> <0A> <18> <20> <11><14>3<EFBFBD>'<27>'<27><1B><03> <16>s<00>A(A*)<02>N)+r<00>
__future__r<00>enumr<00> functoolsr<00>typingrrrr9<00>table_graphemerr r
r r r rrrrrrrr<00>collections.abcrrvrr;r@rDrHrKrMrVrcrprurr<>rr.r/<00><module>r<>s<00><01><04>#<23><19><1F>,<2C>,<2C> :<3A> :<3A> :<3A> :<3A><11>(<28><17><11> <0A>'<27> <0A>, <0B>4<EFBFBD><18><15><19><15>D <0B>4<EFBFBD><18>7<><19>7<>
 <0B>4<EFBFBD><18>-<2D><19>-<2D>
 <0B>4<EFBFBD><18>0<><19>0<>
 <0B>4<EFBFBD><18>-<2D><19>-<2D>
<12>*<2A><12> <0B>4<EFBFBD><18>-<10><19>-<10>`@=<3D>FA$<24>H1<19>h><3E>,&<1C>&r.