Windows 10: Windows Command Prompt displays UTF-8 Tamil characters in a weird way.

Discussion in 'Windows 10 Gaming' started by Surya Narayanan1, Apr 7, 2023.

  1. Windows Command Prompt displays UTF-8 Tamil characters in a weird way.


    I have attached the reference image. You can see that some characters are too tiny while others are standard size. Is there any way to fix this?

    :)
     
    Surya Narayanan1, Apr 7, 2023
    #1

  2. Windows 10 UTF-8 exclusive characters display incorrectly

    I am using Windows 10 1909 x64 with the display language set to English (United States) and, of course, the default font Segoe UI. One problem has troubled me for a very long time, and I believe its cause is rooted in the design of Windows NT.

    The problem is simple: text in a character encoding different from the system default shows up as garbled, completely unintelligible characters. I am not sure I have conveyed my meaning well, so here are some screenshots that help explain what I mean.


    [Screenshot 1: File Explorer]

    [Screenshot 2: a program with a Chinese interface]

    [Screenshot 3: Julia console, help section]


    As you can see, the first is a screenshot of File Explorer, where what should be Chinese characters are displayed as god-knows-what. The second is a screenshot of a program whose interface is in Chinese; again, almost none of the Chinese characters are displayed correctly. The third and last is a screenshot of the console of the Julia programming language: in the help section, what should be 17 mathematical symbols are displayed as square, tofu-like substitute characters. These problems are not limited to Chinese.

    Way off topic, I would like to say that I really dislike the Segoe UI font. It is sans-serif and has issues of its own: for one thing, the upper-case I and lower-case l look almost exactly the same (Il). They are homoglyphs, and many sans-serif fonts draw no clear distinction between them, which causes a lot of confusion. In normal context one rarely confuses the two, as in "Intelligence". However, there is a famous Finnish song, Ievan Polkka, and many people mistakenly believe its first character is an el instead of an i, so it is often misnamed Levan Polkka. In programming there is often not much context to guess what the characters really are. Also, nowadays almost everything is in Unicode (UTF-8), the character set that contains essentially everything, whereas Segoe UI, as far as I can tell, covers only a limited set of characters.

    The first two problems are evidently caused by a character-mapping mismatch. The characters in question lie beyond the range of ASCII (or ANSI; I am not sure which character set Windows NT uses by default, and nothing I can find online clarifies that). Only Unicode has the right characters at these code points, and ASCII has far fewer entries. My guess is that these characters get mistakenly decoded as ASCII, which does not contain them, so the computer divides the code point by the number of ASCII entries and displays the characters mapped to the remainders, which are not the right ones. Figuring out why is easy for me; the difficult part is how to solve the problem.
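
    Garbling of this kind usually comes from bytes written in one encoding being decoded with another, rather than from any arithmetic on code points. A minimal sketch in Python, assuming the text was stored in GBK (the legacy Simplified Chinese code page) and is decoded as Windows-1252; both choices are assumptions for illustration, not something the screenshots confirm:

```python
# Assumption: the program stores its text in GBK and the system
# decodes those bytes as Windows-1252 (cp1252).
original = "中文"

raw = original.encode("gbk")        # four bytes: d6 d0 ce c4
garbled = raw.decode("cp1252")      # each byte is shown as one Latin character

print(raw.hex())                    # d6d0cec4
print(garbled)                      # ÖÐÎÄ

# No information is lost: reversing the wrong decoding restores the text.
recovered = garbled.encode("cp1252").decode("gbk")
print(recovered == original)        # True
```

    Because the bytes themselves are intact, the fix lies in making the program or the system pick the right encoding, not in repairing the data.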

    I have already installed the Chinese (Simplified) language pack and fonts, but Windows does not use the right font and character set when displaying these characters. Instead it uses the horrible Segoe UI as the default font to display characters that font does not support. The simplest way to solve the first two problems would be to switch the display language to Chinese, but I really do not want my system to display Chinese by default: almost all of my installed programs (more than 500, with fewer than ten exceptions) are in English. Besides, even if I made my system display Chinese by default, many English programs would display scrambled text and become unusable, for example RawTherapee. This is one of the reasons I switched my system to English in the first place!

    The last problem, of course, arises because the code point exists but the font has no glyph mapped to it; the font does not include the character, so a tofu character is displayed instead.

    I installed TeX Live in the hope of solving this problem. It took many hours, longer than installing Windows 10 itself, only for me to find that it is unrelated to the problem.

    How do I change the default font of console programs? How do I change the Windows 10 default character encoding/character set to UTF-8?

    Is there any registry edit that can do this? Please, I am rather good at regedit.
     
    Aireyanna Havaska, Apr 7, 2023
    #2
  3. Windows 10 UTF-8 exclusive characters display incorrectly

    I now understand that such problems arise because Windows defaults to an ANSI character set. ANSI is limited to one byte (octet) per character; with only eight bits it supports just 2^8 = 256 characters, with code points from 0 to 255. Unicode supports 1,114,112 code points, and its first 256 code points are essentially the same as ANSI.
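
    These figures can be checked with a short Python sketch. It also illustrates one caveat: the first 256 Unicode code points match ISO-8859-1 (Latin-1) exactly, while Windows-1252, the usual "ANSI" code page on English systems, differs in the range 0x80 to 0x9F. Treating Windows-1252 as the code page in question is an assumption.

```python
import sys

# One byte can hold 2**8 distinct values.
single_byte_values = 2 ** 8
print(single_byte_values)                 # 256

# Unicode code points run from 0 to 0x10FFFF inclusive.
unicode_code_points = sys.maxunicode + 1
print(unicode_code_points)                # 1114112

# Latin-1 maps every byte value to the code point with the same number.
latin1_text = bytes(range(256)).decode("latin-1")
latin1_matches = all(ord(ch) == i for i, ch in enumerate(latin1_text))
print(latin1_matches)                     # True

# Windows-1252 differs in 0x80-0x9F: byte 0x80 is the euro sign, U+20AC.
euro_code_point = ord(b"\x80".decode("cp1252"))
print(hex(euro_code_point))               # 0x20ac
```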

    Different programs in Windows treat Unicode characters with code points greater than 255 differently. Some programs treat them directly as Unicode code points, but others, like the examples I mentioned earlier, including Windows itself, treat them as ASCII characters and display the characters corresponding to the original code points modulo 256, which produces scrambled text. That is all I know. What I do not know is how to make programs treat code points greater than 255 as Unicode by default. Is there a registry edit that can set Windows to use UTF-8 by default?
     
    Aireyanna Havaska, Apr 7, 2023
    #3
  4. moskitos Win User

    N95 - problem with character encoding - UTF-7 and UTF-8 and Polish national characters

    I am using Nokia N95 SMS text messaging with the Polish T9 dictionary. There is a problem with character encoding.

    Under Sending options there is a character encoding setting with the options "Full support" (UTF-8) and "Reduced support" (UTF-7). With Polish characters (ą, ć, ź, ż, ł, ó, ę, ś, ń ...) REDUCED SUPPORT DOESN'T WORK! With Reduced support on, the phone still encodes in UTF-8 and I have only 70 characters. Of course, without national characters I have 160.
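
    The 160 versus 70 figures are consistent with a 140-octet SMS payload: 7-bit characters pack 160 to a message, while 16-bit characters allow only 70. A minimal sketch of that arithmetic in Python, assuming the 140-octet payload and assuming that the phone's two modes correspond to a 7-bit and a 16-bit encoding (the post labels them UTF-7 and UTF-8; the names below are illustrative):

```python
SMS_PAYLOAD_OCTETS = 140  # assumed user-data size of a single SMS


def max_chars(bits_per_char: int) -> int:
    """Number of whole characters that fit into one SMS payload."""
    return (SMS_PAYLOAD_OCTETS * 8) // bits_per_char


reduced_support = max_chars(7)    # 7-bit characters
full_support = max_chars(16)      # 16-bit characters

print(reduced_support)            # 160
print(full_support)               # 70
```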

    Device software V 11.0.026

    I was using an N73 before, and there this option worked fine.

    Where should I report this problem? Who can help me? Nokia Poland is ignoring this bug!
    Message edited on 17-Jun-2007 11:40 PM
     
    moskitos, Apr 7, 2023
    #4