MantisBT - Doomseeker
View Issue Details
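ID: 0001739
Project: Doomseeker
Category: [All Projects] Bug
View Status: public
Date Submitted: 2014-03-09 16:02
Last Update: 2018-09-29 14:46
Reporter: Absolute Zero
Assigned To: Zalewa
Priority: normal
Severity: minor
Reproducibility: always
Status: closed
Resolution: fixed
Platform: Microsoft
OS: Windows
OS Version: XP/Vista/7
Product Version: 0.11.1 Beta
Target Version: 0.12 Beta
Fixed in Version: 0.12 Beta
Summary: 0001739: Characters like é, è, á, à and others are broken in IRC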
When a user receives any of those characters (and maybe most Cyrillic ones), they show up broken in the Doomseeker IRC client. I noticed it happens only on the Zandronum network; everything works fine on QuakeNet.

I can't tell if this is a Doomseeker bug or a server-side issue.
No tags attached.
Issue History
2014-03-09 16:02  Absolute Zero  New Issue
2014-03-17 17:29  Zalewa  Note Added: 0008425
2014-03-17 17:29  Zalewa  Assigned To => Zalewa
2014-03-17 17:29  Zalewa  Status: new => acknowledged
2014-03-17 19:49  Blzut3  Note Added: 0008426
2014-03-17 20:49  Zalewa  Note Added: 0008427
2014-03-17 22:27  Zalewa  Note Added: 0008428
2014-03-17 22:28  Zalewa  Note Edited: 0008428 (bug_revision_view_page.php?bugnote_id=8428#r4582)
2014-03-18 19:14  Zalewa  Note Added: 0008431
2014-03-18 19:14  Zalewa  Status: acknowledged => needs review
2014-03-22 15:45  Zalewa  Note Added: 0008441
2014-03-22 15:45  Zalewa  Status: needs review => needs testing
2014-03-22 15:46  Zalewa  Target Version => 0.12 Beta
2014-03-23 03:22  Absolute Zero  Note Added: 0008448
2014-04-01 19:33  Blzut3  Status: needs testing => resolved
2014-04-01 19:33  Blzut3  Fixed in Version => 0.12 Beta
2014-04-01 19:33  Blzut3  Resolution: open => fixed
2018-09-29 14:46  WubTheCaptain  Status: resolved => closed

Notes
(0008425)
Zalewa   
2014-03-17 17:29   
This isn't a simple problem. Although Wikipedia says that UTF-8 is now prevailing when it comes to character encoding, there's no real standard and networks do not enforce anything on the clients. From what I can remember the general consensus was to always limit yourself to the ASCII characters, and Doomseeker is even designed to convert all messages it sends to ASCII. I can change that so UTF-8 is sent instead, but decoding other users' messages is not a trivial matter. I wouldn't expect it to work properly if we were to brew something on our own.
(0008426)
Blzut3   
2014-03-17 19:49   
Would there be a particular problem in assuming UTF-8 from other users? I would think that would handle most cases. Certainly if someone is using some other encoding it would error in UTF-8 just like it does now?
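To illustrate the point about other encodings failing under UTF-8, here is a minimal sketch in plain C++ (standard library only, not Doomseeker or Qt code). The function name `isValidUtf8` is hypothetical. It only checks the lead-byte and continuation-byte structure, so it is a simplification: it does not reject every overlong or surrogate form that a strict decoder would.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `bytes` has the structure of UTF-8.
// Simplified check: lead byte ranges and continuation bytes only.
bool isValidUtf8(const std::string &bytes)
{
    std::size_t i = 0;
    while (i < bytes.size())
    {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra; // number of continuation bytes expected
        if (lead < 0x80)
            extra = 0; // plain ASCII
        else if (lead >= 0xC2 && lead <= 0xDF)
            extra = 1; // 2-byte sequence
        else if ((lead & 0xF0) == 0xE0)
            extra = 2; // 3-byte sequence
        else if (lead >= 0xF0 && lead <= 0xF4)
            extra = 3; // 4-byte sequence
        else
            return false; // stray continuation byte or invalid lead byte

        // The whole sequence must fit in the remaining input.
        if (i + extra >= bytes.size())
            return false;

        // Every following byte must look like 10xxxxxx.
        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char next = static_cast<unsigned char>(bytes[i + k]);
            if ((next & 0xC0) != 0x80)
                return false;
        }
        i += extra + 1;
    }
    return true;
}
```

For example, "café" encoded as UTF-8 ends in the bytes `C3 A9` and passes, while the same word in Latin-1 ends in the single byte `E9`, which looks like the start of a 3-byte sequence with nothing after it and is rejected.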
(0008427)
Zalewa   
2014-03-17 20:49   
I think that is the simplest solution and I agree that it would work in most cases. Doomseeker still does some splitting before sending messages larger than 510 bytes so it's first going to need to be made aware that 1 character doesn't always equal 1 byte.
(0008428)
Zalewa   
2014-03-17 22:27   
(edited on: 2014-03-17 22:28)
I've added something to help with splitting UTF-8 character arrays without ruining them on the edges. I hope I didn't reinvent something that's already implemented in Qt, although I looked and couldn't find anything. Converting the IRC client to UTF-8 should be quite straightforward now.
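The kind of splitting described here can be sketched as follows. This is not the helper that was added to Doomseeker; it is an illustrative version in plain C++ using `std::string` instead of Qt types, and the names `splitUtf8` and `isContinuationByte` are hypothetical. It assumes the input is valid UTF-8 and that the limit is at least 4 bytes, the longest possible UTF-8 sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True if the byte is a UTF-8 continuation byte (bit pattern 10xxxxxx).
static bool isContinuationByte(unsigned char byte)
{
    return (byte & 0xC0) == 0x80;
}

// Hypothetical helper: splits UTF-8 encoded `text` into chunks of at most
// `maxBytes` bytes, moving each cut back so that it never lands inside a
// multi-byte sequence. Assumes `text` is valid UTF-8.
std::vector<std::string> splitUtf8(const std::string &text, std::size_t maxBytes)
{
    assert(maxBytes >= 4); // a UTF-8 sequence is at most 4 bytes long

    std::vector<std::string> chunks;
    std::size_t start = 0;
    while (start < text.size())
    {
        std::size_t end = start + maxBytes;
        if (end >= text.size())
        {
            end = text.size(); // the remainder fits in one chunk
        }
        else
        {
            // If the next chunk would begin with a continuation byte,
            // the cut is inside a character: back up to its lead byte.
            while (end > start &&
                   isContinuationByte(static_cast<unsigned char>(text[end])))
            {
                --end;
            }
            if (end == start)
                end = start + maxBytes; // malformed input: cut at the limit
        }
        chunks.push_back(text.substr(start, end - start));
        start = end;
    }
    return chunks;
}
```

With a limit of 4 bytes, the 7-byte string "abcéde" (where é is the two bytes `C3 A9`) splits into "abc" and "éde" rather than cutting é in half. For IRC the limit would be the payload space left after the command prefix, within the 510 bytes mentioned above.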

(0008431)
Zalewa   
2014-03-18 19:14   
IRC Client will now assume UTF-8 encoding when sending and receiving messages. This doesn't completely solve the IRC encoding clusterfuck, and the system might still be faulty, but, as Blzut3 stated, this should handle most cases.

I don't know if we want to spend extra time fixing problems with other clients that send messages in their own regional encoding, so if such problems emerge, they might remain unfixed.
(0008441)
Zalewa   
2014-03-22 15:45   
Update available on beta update channel. Please test.
(0008448)
Absolute Zero   
2014-03-23 03:22   
Even Russian characters are showing correctly now (well, those were in the channel topic, but I think it counts).

Working as intended (tested in Zandronum network).