[SOLVED] The \U Escape Sequence in C#

Issue

I am experimenting with escape sequences and cannot get the \U sequence (UTF-32) to work.
The code does not compile because the compiler does not recognize the sequence for some reason.
It recognizes it as UTF-16 instead.

Could you please help me?

Console.WriteLine("\U00HHHHHH");

Solution

Your problem is that you copied \U00HHHHHH from the documentation page Strings (C# Programming Guide): String Escape Sequences.

But \U00HHHHHH is not itself a valid UTF-32 escape sequence. It is a mask in which each H marks a position where a hexadecimal digit must be typed. It is not valid because hexadecimal numbers consist of the digits 0-9 and the letters A-F or a-f, and H is not one of these characters. The literal mentioned in the comments, "\U001effff", does not work because 1EFFFF falls outside the range of valid UTF-32 character values specified immediately afterwards in the docs:

(range: 000000 – 10FFFF; example: \U0001F47D = "👽")

The C# compiler checks whether the specified UTF-32 character is valid according to these rules:

// These compile because they're valid Hex numbers in the range 000000 - 10FFFF padded to 8 digits with leading zeros:
Console.WriteLine("\U0001F47D");
Console.WriteLine("\U00000000");
Console.WriteLine("\U0010FFFF");
// But these don't.
// H is not a valid Hex character:
// Compilation error (line 16, col 22): Unrecognized escape sequence
Console.WriteLine("\U00HHHHHH");
// This is outside the range of 000000 - 10FFFF:
// Compilation error (line 19, col 22): Unrecognized escape sequence
Console.WriteLine("\U001effff");

See https://dotnetfiddle.net/KezdTG.
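The same range rule can also be observed at run time. The following is a minimal sketch, not part of the original answer, that uses the standard char.ConvertFromUtf32 and char.ConvertToUtf32 methods; the class name UnicodeEscapeDemo is made up for illustration. It also shows why a \U escape can look like UTF-16: .NET strings are stored as UTF-16, so a code point above U+FFFF becomes a surrogate pair of two chars.

```csharp
using System;

class UnicodeEscapeDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        // "\U0001F47D" is resolved by the compiler; ConvertFromUtf32 builds the same string at run time.
        string fromEscape = "\U0001F47D";
        string fromCodePoint = char.ConvertFromUtf32(0x1F47D);
        Console.WriteLine(fromEscape == fromCodePoint); // True

        // Strings are UTF-16 internally, so this single code point occupies two chars (a surrogate pair).
        Console.WriteLine(fromEscape.Length); // 2
        Console.WriteLine(char.ConvertToUtf32(fromEscape, 0).ToString("X")); // 1F47D

        // A value above 0x10FFFF is rejected at run time, just as the compiler rejects "\U001effff".
        try
        {
            char.ConvertFromUtf32(0x1EFFFF);
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("0x1EFFFF is not a valid code point");
        }
    }
}
```

If the code point is only known at run time, char.ConvertFromUtf32 is the usual alternative to the \U escape, since the escape requires eight literal hex digits in the source.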

As an aside, to properly display Unicode characters in the Windows console, see How to write Unicode characters to the console?.
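One commonly suggested approach is to switch the console's output encoding to UTF-8 before writing. The sketch below assumes that approach and is not taken from the original answer; the class name ConsoleUnicodeDemo is made up for illustration. Whether the glyph actually renders still depends on the terminal and the font it uses.

```csharp
using System;
using System.Text;

class ConsoleUnicodeDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        // Ask the console to emit UTF-8 so characters outside the default code page are not replaced with '?'.
        Console.OutputEncoding = Encoding.UTF8;

        // A valid \U escape; the terminal font must contain the glyph for it to display correctly.
        Console.WriteLine("\U0001F47D");
    }
}
```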

Answered By – dbc

Answer Checked By – Gilberto Lyons (BugsFixing Admin)
