The Unicode Bug
In split. Starting in Perl 5.28.0, the split function with a pattern specified as a string containing a single space handles whitespace characters consistently within the scope of unicode_strings. Prior to that, or outside its scope, characters that are whitespace according to Unicode rules but not according to ASCII rules were treated as field contents rather than field separators when they appear in byte-encoded strings.
It is exactly this piling-up of implicit defaults that makes the Perl ecosystem shaky. Then again, perhaps the idea described here is sound and merely poorly documented. In general, though, if you want Unicode whitespace, digits, and so on, it would be more sensible to say so explicitly in the patterns...
$ perl -MDevel::Peek -E '$a = "\N{U+fd}"; $b = "\xfd"; Dump $a; Dump $b; say $a eq $b'
SV = PV(0x7fd2f3806070) at 0x7fd2f382dd48
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK,UTF8)
  PV = 0x7fd2f3401b40 "\303\275"\0 [UTF8 "\x{fd}"]
  CUR = 2
  LEN = 10
  COW_REFCNT = 1
SV = PV(0x7fd2f3806120) at 0x7fd2f382dcd0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x7fd2f3401770 "\375"\0
  CUR = 1
  LEN = 10
  COW_REFCNT = 1
1
, you must write the single byte 0xFD)
"\x{fd}" -> "\303\275"
), or an exception is thrown
$ perl -MDigest::MD5=md5_hex -MDevel::Peek -E '$a = "\N{U+fd}"; $b = "\xfd"; Dump $a; Dump $b; say md5_hex($a); say md5_hex($b)'
SV = PV(0x7f8404001e70) at 0x7f8402811610
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK,UTF8)
  PV = 0x7f8402509720 "\303\275"\0 [UTF8 "\x{fd}"]
  CUR = 2
  LEN = 10
  COW_REFCNT = 1
SV = PV(0x7f8404001f30) at 0x7f840280d700
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x7f84025016b0 "\375"\0
  CUR = 1
  LEN = 10
  COW_REFCNT = 1
da564f38413a243e30e8c8c07fccc5d8
da564f38413a243e30e8c8c07fccc5d8