Windows 10: Windows 10 UTF-8 exclusive characters display incorrectly

Discussion in 'Windows 10 Customization' started by Aireyanna Havaska, May 14, 2020.

  1. Windows 10 UTF-8 exclusive characters display incorrectly


    I am using Windows 10 1909 x64 with the display language set to English (United States) and the default font, Segoe UI. One problem has troubled me for a very long time, and its cause is rooted in the design of Windows NT.


    The problem is very simple: text whose character encoding differs from the system default shows up as garbled, completely unintelligible characters. I don't know if I have conveyed my meaning clearly, so here are some screenshots that help explain it.

    [Screenshot 1: File Explorer]

    [Screenshot 2: a Chinese-language program]

    [Screenshot 3: Julia console, help section]

    As you can see, the first is a screenshot of File Explorer, where what should be Chinese characters is displayed as god-knows-what. The second is a screenshot of a program that is in Chinese; again, almost none of the Chinese characters are displayed correctly. The third and last is a screenshot of the console of the programming language Julia: in the help section, what should be 17 mathematical symbols are displayed as square, tofu-like substitute characters. These problems are not limited to Chinese.


    Way off topic, I would like to say that I really dislike the Segoe UI font. It is sans-serif and has many issues of its own. For one thing, the upper-case I and the lower-case l look exactly the same (Il); they are homoglyphs, and many sans-serif fonts make no distinction between them, which causes a lot of confusion. In normal text the context usually prevents any confusion, as in Intelligence. However, there is a famous Finnish song, Ievan Polkka, and many people mistakenly believe its first character is an el instead of an i, so it is often misnamed Levan Polkka. In programming there is not much context to help guess what the characters really are. Also, nowadays almost everything is in Unicode (UTF-8), the character set that contains essentially everything, yet Segoe UI seems to be an ASCII font? It has only so many characters.


    Obviously the first two problems are caused by a character mapping failure. The characters are exclusive to UTF-8, beyond the code point limit of ASCII or ANSI. (I am not sure what Windows NT uses as its default character set; nothing I can find online explains that.) Only UTF-8 has the right characters at these code points, and ASCII has far fewer entries. My guess is that the UTF-8 exclusive characters get mistakenly decoded as ASCII, which doesn't have them, so the computer divides the code point by the number of ASCII entries and displays the ASCII characters mapped to the remainders, which aren't the right ones. Figuring out why is easy for me; the difficult part is how. How do I solve this problem?
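
    One common way this kind of garbling arises can be sketched as follows. This is an illustration under assumptions, not a diagnosis of the screenshots: it assumes the text was stored as GBK bytes (the legacy Simplified Chinese code page) and then decoded as Windows-1252 (the legacy code page on English systems). No division or remainder is involved; each byte is simply looked up in the wrong table. The names `garble` and `repair` are hypothetical helpers made up for this sketch.

```python
def garble(text, real="gbk", assumed="cp1252"):
    """Encode text with its real encoding, then decode the bytes with the wrong one."""
    return text.encode(real).decode(assumed, errors="replace")


def repair(garbled, real="gbk", assumed="cp1252"):
    """Undo garble(): recover the original bytes, then decode them correctly.

    This only works if no byte was lost to the replacement character.
    """
    return garbled.encode(assumed).decode(real)


original = "中文"                # two Chinese characters
mojibake = garble(original)      # "ÖÐÎÄ": four unrelated Latin letters
restored = repair(mojibake)      # back to the original two characters
```

    The same two functions also show the UTF-8 case: `garble("中文", real="utf-8")` yields six mismatched characters, because each Chinese character takes three bytes in UTF-8.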


    I have already installed the Chinese (Simplified) language pack and fonts, but Windows does not use the right font and character set when displaying these characters. Instead it uses the horrible Segoe UI as the default font, even for characters that font doesn't support. Obviously, the simplest way to solve the first two problems would be to switch the display language to Chinese, but I really don't want my system to display Chinese by default: almost all of my installed programs (more than 500, with fewer than ten exceptions) are in English. Besides, even if I made my system display Chinese by default, many English programs would display scrambled text, making them unusable. RawTherapee is one example, and this is one of the reasons why I switched my system to English in the first place!


    The last problem, of course, occurs because the code point exists but has no glyph mapped to it: the font doesn't include the character, so a tofu character is displayed instead.


    I installed TeX Live in the hope of solving this problem. It took many hours, longer than installing Windows 10 itself, only for me to find that it is unrelated to the problem.


    How do I change the default font of console programs? How do I change the Windows 10 default character encoding/character set to UTF-8?

    Is there any registry edit that can do this? Please, I am rather good with regedit.
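
    Before changing anything, it may help to see which encodings a process actually reports. The sketch below is a diagnostic only and assumes Python is available (the thread does not name a language). `encoding_report` is a hypothetical helper; `GetACP` and `GetConsoleOutputCP` are Win32 functions in kernel32 and are called only when running on Windows.

```python
import ctypes
import locale
import sys


def encoding_report():
    """Collect the encodings this process sees; code pages stay None off Windows."""
    report = {
        "preferred": locale.getpreferredencoding(False),
        "filesystem": sys.getfilesystemencoding(),
        "ansi_code_page": None,
        "console_output_code_page": None,
    }
    if sys.platform == "win32":
        kernel32 = ctypes.windll.kernel32
        # System ANSI code page, used by non-Unicode programs.
        report["ansi_code_page"] = kernel32.GetACP()
        # Code page the attached console uses for output.
        report["console_output_code_page"] = kernel32.GetConsoleOutputCP()
    return report


if __name__ == "__main__":
    for name, value in encoding_report().items():
        print(f"{name}: {value}")
```

    On an English (United States) system the ANSI code page is typically 1252, and 936 is the Simplified Chinese one; a value of 65001 means UTF-8.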

    :)
     
    Aireyanna Havaska, May 14, 2020
    #1
  2. moskitos Win User

    N95 - problem with character encoding - UTF-7 and UTF-8 and Polish national characters

    I'm using Nokia N95 SMS text messaging with the Polish T9 dictionary. There is a problem with character encoding.

    The sending options include a character encoding setting with the options "Full support" (UTF-8) and "Reduced support" (UTF-7). With Polish characters (ą, ć, ź, ż, ł, ó, ę, ś, ń ...), REDUCED SUPPORT DOESN'T WORK! With Reduced support ON, the phone still encodes in UTF-8 and I have only 70 characters. Of course, without national characters I have 160.

    Device software V 11.0.026

    I was using an N73, and this option worked well there.

    Where should I report this problem? Who can help me? Nokia Poland is ignoring this bug!
    Message edited on 17-Jun-2007 11:40 PM
     
    moskitos, May 14, 2020
    #2
  3. chr109 Win User
    Windows 10 Mail app doesn't display UTF-8 characters

    I'm running the default Mail app that is part of Windows 10 and messages I receive that contain UTF-8 characters are garbled. How do I change the app settings to display UTF-8 characters correctly?

    ***Post moved by the moderator to the appropriate forum category.***
     
    chr109, May 14, 2020
    #3
  4. Trammael Win User

    UTF-8 encoded subjects in Email

    Hello,

    How might I submit a bug report?

    What:

    Subject lines encoded in UTF-8 and received by the IMAP client display as a series of encoded characters instead of the actual content.

    How to reproduce:

    Receive an email with a subject line encoded in UTF-8.

    Version:

    WP8.1 developer on Nokia Lumia 920.

    Thank you
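
    For context, a minimal sketch of what a mail client is expected to do with such a subject line. It assumes the "series of encoded characters" are RFC 2047 encoded-words (for example `=?utf-8?b?...?=`), which the report does not confirm. `decode_subject` is a hypothetical helper; `Header`, `decode_header`, and `make_header` come from Python's standard `email.header` module.

```python
from email.header import Header, decode_header, make_header


def decode_subject(raw_subject):
    """Turn a raw Subject header value into readable text.

    Plain ASCII subjects pass through unchanged; RFC 2047 encoded-words
    are decoded using the charset each one declares.
    """
    return str(make_header(decode_header(raw_subject)))


# Build an encoded subject the way a sending client might.
encoded = Header("中文", "utf-8").encode()   # an ASCII-only encoded-word
decoded = decode_subject(encoded)            # the original text again
```

    A client that skips this decoding step would show the raw `encoded` string, which matches the symptom described above.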
     
    Trammael, May 14, 2020
    #4
  1. Windows 10 UTF-8 exclusive characters display incorrectly - Similar Threads - UTF exclusive characters

  2. Windows Command prompt displays UTF-8 tamil characters in a weird way.

    in Windows 10 Gaming
    Windows Command prompt displays UTF-8 tamil characters in a weird way.: I have attached the reference image. You can see that some characters are too tiny while others are standard size. Is there any way to fix this? https://answers.microsoft.com/en-us/windows/forum/all/windows-command-prompt-displays-utf-8-tamil/b8744bdf-f522-40db-8301-7fe5f89472e2
  3. Windows Command prompt displays UTF-8 tamil characters in a weird way.

    in Windows 10 Software and Apps
    Windows Command prompt displays UTF-8 tamil characters in a weird way.: I have attached the reference image. You can see that some characters are too tiny while others are standard size. Is there any way to fix this? https://answers.microsoft.com/en-us/windows/forum/all/windows-command-prompt-displays-utf-8-tamil/b8744bdf-f522-40db-8301-7fe5f89472e2
  4. UTF-8 encoding

    in Windows 10 Gaming
    UTF-8 encoding: Hello, does Windows support different encoding schemes, like UTF-8 and ISO 8859-1 or so? In Linux, I never experience any issues with emacs, python, files, etc. Is there any way to use different encoding schemes in Windows?...
  5. UTF-8 encoding

    in Windows 10 Software and Apps
    UTF-8 encoding: Hello, does Windows support different encoding schemes, like UTF-8 and ISO 8859-1 or so? In Linux, I never experience any issues with emacs, python, files, etc. Is there any way to use different encoding schemes in Windows?...
  6. Notepad ANSI/UTF-8

    in Windows 10 Network and Sharing
    Notepad ANSI/UTF-8: I saved my file as .txt ANSI and it changed the format to .txt UTF-8. I tried on different computers as well and got the same result. Any idea how to resolve this? https://answers.microsoft.com/en-us/windows/forum/all/notepad-ansiutf-8/929f9241-b392-487b-afbb-b0f52f493346
  7. ENCOD UTF-8

    in Windows 10 Customization
    ENCOD UTF-8: Some programs I have installed show typing errors in special characters when opened, as in the following text, for example: Programa desarrollado por el Servicio de Informacin sobre Sade Pblica de la Direccin Xeral de Sade Pblica de la Consellera de...
  8. Using the UTF-8 Symbols in Windows 10 (Which Came out Incorrectly)

    in Windows 10 Support
    Using the UTF-8 Symbols in Windows 10 (Which Came out Incorrectly): I did this. [ATTACH] When I use a few random symbols (for example, Alt+5+6+2), they come out incorrectly, like this: � Why?
  9. Windows PowerShell utf-8 encoding

    in Windows 10 Software and Apps
    Windows PowerShell utf-8 encoding: Hello! I would like to know if it is possible to configure Windows PowerShell to print utf-8 characters? I searched the web and found multiple solutions, but nothing seems to be working. e.g.: * chcp 65001 * $OutputEncoding = [console]::InputEncoding =...
  10. Keyboard incorrectly displays the wrong characters.

    in Windows 10 Drivers and Hardware
    Keyboard incorrectly displays the wrong characters.: Microsoft Sculpt wireless keyboard 1531. Did a Windows 10 update, now the keyboard will not work. It displays the wrong characters, for instance when I press the "b"-key it displays "By" and when I press the "o"-key it displays ";o". I tried it on two different laptops;...