[ruby-talk:444454] codepoints

Hi,

How do I reverse str.codepoints?

How can I convert codepoints between UTF-8 and UTF-16? E.g.:

    cp1 = [0xf0, 0x9f, 0x8f, 0xb3]
    cp2 = [0xe2, 0x82, 0xac] #=> 0x20ac

Could someone add UTF-8 to pack/unpack? The output seems to be UTF-16 instead of UTF-8 as the documentation says.

How do I show the US flag (the stars-and-stripes flag as one graphical symbol), composed of [0x1f1fa, 0x1f1f8]? More generally, how do I compose Unicode symbols that consist of several codepoints?

Hi,

I'm not sure if you've checked, but there is plenty of online documentation for Ruby.

https://ruby-doc.org/core-3.0.1/String.html#method-i-encode
https://ruby-doc.org/core-3.0.1/Array.html#method-i-pack

On Sun, 21 Apr 2024 at 19:02, Information via ruby-talk <ruby-talk@ml.ruby-lang.org> wrote:
Hi,
How to reverse str.codepoints?
str.codepoints.reverse
How can I convert codepoints between UTF-8 and UTF-16? E.g.: cp1 = [0xf0, 0x9f, 0x8f, 0xb3]; cp2 = [0xe2, 0x82, 0xac] #=> 0x20ac Could someone add UTF-8 to pack/unpack? The output seems to be UTF-16 instead of UTF-8 as the documentation says.
Those are arrays of bytes, not codepoints. A codepoint is usually transported as a single integer.

The most logical way to deal with characters and codepoints is with strings, so I'd start by getting it back from an array of bytes to a string with the correct encoding metadata:

    str1 = cp1.pack('C*').force_encoding('UTF-8') #=> "🏳"
    str2 = cp2.pack('C*').force_encoding('UTF-8') #=> "€"

Then you can convert to UTF-16:

    str1.encode('UTF-16').codepoints #=> [0xFEFF, 0x1F3F3]
    str2.encode('UTF-16').codepoints #=> [0xFEFF, 0x20AC]

My question is: why? What are you trying to do? You seem to be partway down a rabbit hole and have maybe lost track of the actual goal you're trying to achieve?
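The byte-to-string step and the UTF-16 conversion described above can be sketched end to end as follows. Note that String#codepoints always returns whole codepoints regardless of encoding, so to see the actual 16-bit UTF-16 code units (including surrogate pairs) this sketch encodes to UTF-16BE, which carries no byte order mark, and unpacks with 'n*'. The variable names are illustrative only and do not come from the thread.

```ruby
# UTF-8 encoded bytes from the original question (not codepoints).
bytes1 = [0xf0, 0x9f, 0x8f, 0xb3]
bytes2 = [0xe2, 0x82, 0xac]

# pack('C*') builds a binary string; force_encoding only relabels it as UTF-8.
str1 = bytes1.pack('C*').force_encoding('UTF-8')
str2 = bytes2.pack('C*').force_encoding('UTF-8')

# One codepoint each: U+1F3F3 and U+20AC.
codepoints1 = str1.codepoints
codepoints2 = str2.codepoints

# UTF-16 code units: transcode to UTF-16BE, then read big-endian 16-bit values.
units1 = str1.encode('UTF-16BE').unpack('n*') # surrogate pair for U+1F3F3
units2 = str2.encode('UTF-16BE').unpack('n*') # single unit for U+20AC

# Going back: pack the units, label them as UTF-16BE, transcode to UTF-8.
roundtrip = units1.pack('n*').force_encoding('UTF-16BE').encode('UTF-8')

puts codepoints1.map { |c| format('U+%04X', c) }.inspect
puts units1.map { |u| format('0x%04X', u) }.inspect
puts roundtrip == str1
```

The distinction that matters here is between force_encoding, which changes only the label on the bytes, and encode, which actually transcodes them.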
How do I show the US flag (the stars-and-stripes flag as one graphical symbol), composed of [0x1f1fa, 0x1f1f8]? More generally, how do I compose Unicode symbols that consist of several codepoints?
Emoji sequences (and other character sequences) are sequences of codepoints, so you have to transmit them as a sequence. I don't know what you're asking.

    [0x1f1fa, 0x1f1f8].pack 'U*' #=> "🇺🇸"

Cheers
--
Matthew Kerwin [he/him]
https://matthew.kerwin.net.au/
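As a small sketch of the pack 'U*' call above: the 'U' directive takes codepoints and produces a UTF-8 string, and unpack('U*') is its inverse, which is also one way to read "reversing" str.codepoints. Whether the two regional indicators are drawn as a single flag depends on the terminal and font, not on Ruby. The variable names are illustrative.

```ruby
flag_codepoints = [0x1f1fa, 0x1f1f8] # regional indicators U and S

# 'U*' encodes each integer as a UTF-8 character; the result is tagged UTF-8.
flag = flag_codepoints.pack('U*')

encoding_name = flag.encoding.name  # encoding label of the packed string
utf8_bytes    = flag.bytes          # four UTF-8 bytes per codepoint here
back_again    = flag.unpack('U*')   # inverse of pack('U*')
char_count    = flag.length         # counts codepoints, not visible symbols

# On recent Ruby versions a regional-indicator pair is expected to form a
# single grapheme cluster; treat this as an assumption on older versions.
cluster_count = flag.grapheme_clusters.length

puts flag
puts utf8_bytes.map { |b| format('%02x', b) }.join(' ')
puts "#{char_count} codepoints, #{cluster_count} grapheme cluster(s)"
```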

To reverse the code points of a string in Ruby, you can use the codepoints method to get an array of Unicode code points, reverse that array, and use pack to convert it back to a string:

    str = "hello"
    reversed_str = str.codepoints.reverse.pack("U*")
    puts reversed_str

To convert between UTF-8 and UTF-16, note that the arrays in the question hold UTF-8 bytes, not code points, so pack("U*") would treat every byte as a separate code point. Pack them as bytes with "C*", label the result as UTF-8, and let encode do the conversion:

    # UTF-8 bytes to UTF-16 code units
    utf8_bytes = [0xf0, 0x9f, 0x8f, 0xb3]
    utf8_string = utf8_bytes.pack("C*").force_encoding("UTF-8")
    utf16_string = utf8_string.encode("UTF-16BE")
    puts utf16_string.unpack("n*").inspect

    # UTF-16 back to UTF-8 bytes
    puts utf16_string.encode("UTF-8").bytes.inspect

    # The same steps work for the second array
    euro = [0xe2, 0x82, 0xac].pack("C*").force_encoding("UTF-8")
    puts euro.encode("UTF-16BE").unpack("n*").inspect

To show the US flag emoji (🇺🇸) composed of [0x1f1fa, 0x1f1f8], you can use pack to convert these code points to a string:

    us_flag_codepoints = [0x1f1fa, 0x1f1f8]
    us_flag_emoji = us_flag_codepoints.pack("U*")
    puts us_flag_emoji

    # Compose a symbol consisting of multiple code points
    symbol_codepoints = [0x1f468, 0x200d, 0x1f4bb] # Man Technologist
    symbol = symbol_codepoints.pack("U*")
    puts symbol
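One caveat with reversing by code point is that symbols made of several code points get pulled apart. A minimal sketch of the difference, using a combining accent as the example and String#grapheme_clusters as one possible alternative (the sample string and variable names are illustrative, not from the thread):

```ruby
# "café" written with a separate combining acute accent (U+0301) after the "e".
word = "cafe\u0301"

# Reversing code points moves the accent to the front, detached from its "e".
by_codepoint = word.codepoints.reverse.pack('U*')

# Reversing grapheme clusters keeps the "e" and its accent together.
by_cluster = word.grapheme_clusters.reverse.join

puts by_codepoint.codepoints.map { |c| format('U+%04X', c) }.join(' ')
puts by_cluster.codepoints.map { |c| format('U+%04X', c) }.join(' ')
```

The same reasoning applies to flags and ZWJ emoji sequences, although how those are segmented depends on the Unicode version bundled with the Ruby in use.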
participants (3)
- Information
- Matthew Kerwin
- shrishti.hennadsg@gmail.com