READ-ONLY: This project has been archived. For more information see this post.
Issue 148: Broken encoding in offline messages
Status:  Fixed
Owner:  r000ns...@gmail.com
Closed:  Dec 2008


 
Reported by zed.0xff, Dec 21, 2008
I use
http://groups.google.com/group/py-transports/web/pyicqt-0.8.1-git-patched.tar.gz

vCards are fine.
nicknames are fine.
but when someone sends me a message while I'm offline and I then go online, I
receive a lot of Japanese hieroglyphs instead of ANY characters (Russian,
English and digits alike).

screenshot attached.
hieroglyphs.png
89.1 KB   View   Download
Dec 21, 2008
Project Member #1 r000ns...@gmail.com
Does your contact use QIP?
Dec 21, 2008
#2 zed.0xff
no, he uses licq.
I just tried jimm & kopete - got the same results.

update: 
when a message contains English and/or Russian letters, I get hieroglyphs.
but when a message contains only digits, I get those digits OK :)
Dec 21, 2008
#3 Mur...@gmail.com
Confirmed with the Psi client on Linux. Russian messages sent to me while I am
offline arrive as a lot of Japanese hieroglyphs. The senders are using Kopete and QIP.
Dec 22, 2008
#4 zed.0xff
I use PSI too :)
Dec 22, 2008
#5 mail.spy...@gmail.com
Confirmed on Pidgin (Linux) and Pandion (Windows)
Dec 22, 2008
Project Member #6 r000ns...@gmail.com
(No comment was entered for this change.)
Status: Accepted
Owner: r000nster
Labels: -Priority-Medium Priority-High Milestone-0.8.1.1 Version-0.8.1
Dec 22, 2008
Project Member #7 r000ns...@gmail.com
Please add the <detectunicode/> option to your pyicqt.conf.xml and test this build:
http://groups.google.com/group/py-transports/web/pyicqt-0.8.1-git.tar.gz
Dec 23, 2008
#8 egcros...@gmail.com
I have the version mentioned in comment #7, I have <detectunicode/> set, and I see
hieroglyphs instead of Cyrillic offline messages from licq.

This hieroglyph string turns into valid UTF-8 Cyrillic if you run it through "iconv
-t utf16". Note that it's *not* "-f" but "-t"!
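
The "-t" (not "-f") observation fits the case where UTF-8 bytes were decoded as if they were UTF-16 somewhere along the way. A minimal Python sketch of that assumption follows. The function names are hypothetical and this is not PyICQt code; little-endian byte order is assumed because it matches the glyphs quoted in this thread.

```python
def misread_utf8_as_utf16le(text: str) -> str:
    """Simulate the suspected bug: UTF-8 bytes decoded as UTF-16LE."""
    return text.encode("utf-8").decode("utf-16-le")


def undo_misread(garbled: str) -> str:
    """Reverse it: re-encode to UTF-16LE (like iconv -t), then read as UTF-8."""
    return garbled.encode("utf-16-le").decode("utf-8")


garbled = misread_utf8_as_utf16le("абв")
print(garbled)                # three Hangul syllables instead of Cyrillic
print(undo_misread(garbled))  # абв
```

The sketch only round-trips when the UTF-8 byte count is even; with an odd count the UTF-16 decode fails, which may be one reason real messages end up only partly recoverable.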

The case is described in some more detail in the group:
http://groups.google.com/group/py-transports/browse_thread/thread/9f933ac692ce9959/14d7d2b3ee1d42ce#14d7d2b3ee1d42ce
Dec 23, 2008
Project Member #9 r000ns...@gmail.com
egcrosser: I tried to reproduce it (cp1251 as the default in Licq, utf-8 for the user), but
offline messages were recognized correctly for me. Can you post the text of the message here?
Dec 24, 2008
#10 egcros...@gmail.com
In the attached text, the two messages at 08:31:22 were sent while I was offline.
Original:
=====
(08:26:30 PM) crosser@average.org/Work: щас уйду в offline, напиши что-нибудь
(08:31:22 PM) ttanushka: 笠ﯺ﷼਍
(08:31:22 PM) ttanushka: 냐뇐닐돐듐뗐뛐럐룐말뫐믐볐뷐뻐뿐胑臑苑菑蓑藑蛑蟑裑觑諑译賑跑
軑近਍
(08:32:29 PM) ttanushka: ну что?...что-нибудь читаемо?
=====
After conversion "iconv -t ucs-2le":
=====
(08:26:30 PM) crosser@average.org/Work:
I0A C94C 2 offline, =0?8H8 GB>-=81C4L
(08:31:22 PM) ttanushka: ��������������������������

(08:31:22 PM) ttanushka: абвгдежзийклмнопрстуфхцчшщъыьэюя

(08:32:29 PM) ttanushka: =C GB>?...GB>-=81C4L G8B05<>?
=====
I will be happy to do more testing, just tell me what exactly to do.

Eugene
licq-original.utf8
376 bytes   View   Download
licq-conv-to-ucs-2le.utf8
468 bytes   Download
Dec 24, 2008
Project Member #11 r000ns...@gmail.com
Even with the message 'абвгдежзийклмнопрстуфхцчшщъыьэюя' it works for me. More
information is needed.

Please run the transport in debug mode (add the -D parameter to the command line). Then
reproduce the error (receive this message again) and post the log lines that contain
'Received Offline' and 'Converted message'.

Dec 24, 2008
#12 egcros...@gmail.com
Indeed, I could not reproduce it now. I could read both offline messages that were
sent from licq, one with the "windows-1251" setting and one with "utf-8". The difference
was that in this case I stopped the PyICQ process rather than disconnecting the
client from the transport as I did last time.

Could it have anything to do with the licq changing encoding on the fly, and PyICQ
using the information about the peer's capabilities from a previous request that was
done before the change?

At the same time, I realized that the encoding in online communication is not right!
In particular, when the user of licq set encoding "windows-1251", I see her messages,
but she does not see mine. When she sets encoding "utf8", she sees my messages but I
don't see hers. What I get in the latter case is a string that becomes readable when
passed through "iconv -t windows-1251".
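
The online symptom reads like the classic UTF-8 versus windows-1251 mix-up. A short Python sketch, assuming the receiver decodes UTF-8 bytes as windows-1251 (the variable names are illustrative only):

```python
original = "абв"

# Sender encodes as UTF-8, receiver wrongly assumes windows-1251.
seen_by_receiver = original.encode("utf-8").decode("cp1251")

# Equivalent of `iconv -t windows-1251`, then reading the result as UTF-8.
recovered = seen_by_receiver.encode("cp1251").decode("utf-8")

print(seen_by_receiver)  # mojibake, two characters per Cyrillic letter
print(recovered)
```

Not every string survives this round trip in Python, because a few byte values have no windows-1251 mapping there.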

I am attaching two logs, the one produced by PyICQ with -D, the other - regular
Pidgin log. I tried to remove all data that is not related to the particular
conversation for privacy reasons. The last two messages dated 15:58:28 were sent
while the transport was not running, with different encodings; all the rest are
"online". The second from the top message, dated 15:43:22, is an example of wrong
encoding in online conversation.

I gather that you may understand Russian, if so, then the log should be self-explanatory.

I will post if/when I find more information about the problem, and/or you are welcome
to tell me what other experiments I can do.

Thanks,

Eugene
licq_debug.log
349 KB   View   Download
licq_pidgin.log
3.2 KB   View   Download
Dec 24, 2008
#13 egcros...@gmail.com
Another related note: if I replace "<encoding>windows-1251</encoding>" with
"<encoding>utf-8</encoding>", I can normally communicate with licq users (that set
"utf-8" encoding for me) and with most other peers, but not with some users (probably
older versions of official ICQ).
Dec 24, 2008
Project Member #14 r000ns...@gmail.com
Online messages are a bit simpler. Test the updated version; it should work with the utf-8
encoding in Licq:
http://groups.google.com/group/py-transports/web/pyicqt-0.8.1-git.tar.gz

The option for Unicode detection is more fine-grained now (not only enable/disable), so update
your config:
	<!-- Try to detect Unicode:
	    0 - never
	    1 - in offline messages
	    2 - and in nicknames
	    Attention: this solution can be slow on high-load servers
	-->
	<detectunicode>1</detectunicode>
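
For readers unfamiliar with what such detection can look like: one common heuristic is to attempt a strict UTF-8 decode and fall back to the configured national encoding. The sketch below illustrates that general idea only. It is not taken from the PyICQt source, and `decode_incoming` with its parameters is a hypothetical name.

```python
def decode_incoming(raw: bytes, fallback: str = "cp1251", detect: bool = True) -> str:
    """Decode raw message bytes, preferring UTF-8 when it decodes cleanly."""
    if detect:
        try:
            # Strict decoding rejects most byte strings that are not UTF-8.
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            pass
    # Fall back to the encoding configured in <encoding>.
    return raw.decode(fallback, errors="replace")


print(decode_incoming("привет".encode("utf-8")))   # привет
print(decode_incoming("привет".encode("cp1251")))  # привет
```

A trial decode per message costs extra work, which is in line with the warning about high-load servers in the config comment.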
Dec 24, 2008
#15 egcros...@gmail.com
With the version from comment 14, and "<encoding>windows-1251</encoding>" in PyICQ
config, and licq configured for utf8, online messages in both directions pass
correctly. Offline messages from licq are all right. But offline messages from PyICQ
are displayed wrong in licq: 'абв' is displayed as '012'.

(I installed licq 1.3.5 on the local machine for this experiment)
Dec 24, 2008
Project Member #16 r000ns...@gmail.com
That's because the transport sends offline messages in utf-16 (as far as I know, it is
impossible to send utf-8 offline). This is more correct than sending in any national encoding.
But Licq, as you might guess, always sends and receives messages in only one encoding,
the one specified for each user.

When the PyICQt user goes offline, the Licq user would have to change his/her encoding from
utf-8 to utf-16, and change it back again when that user comes back online. This is a
bad solution, but currently I have no better one.
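
The 'абв' displayed as '012' symptom from comment 15 fits UTF-16: the letters а, б, в are U+0430, U+0431 and U+0432, so their low bytes are the ASCII digits '0', '1', '2'. A small Python illustration, assuming (this is a guess, not confirmed in the thread) that the receiving client effectively keeps only the low byte of each UTF-16 code unit:

```python
text = "абв"
utf16be = text.encode("utf-16-be")

print(utf16be.hex())  # 043004310432

# Assumed client behaviour: the high byte (0x04) is dropped, the low byte kept.
low_bytes = utf16be[1::2]
print(low_bytes.decode("ascii"))  # 012
```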
Dec 24, 2008
#17 egcros...@gmail.com
Doesn't it make sense to send outgoing (online and offline) messages in the encoding
that is specified as "<encoding>" in the config file *if* there is no reasonable
indication of the peer's unicode capabilities? I am not sure that it would be the
right thing, but just a thought...

In particular, when both pyicq and licq are explicitly configured to use
"windows-1251", sending messages from pyicqt to licq in utf-8 does not seem logical?
And this is what happens.

	0x00a0:  0000 0000 0001 0000 0001 0022 0061 6263  ...........".abc
	0x00b0:  6465 6667 6869 6a20 d0b0 d0b1 d0b2 d0b3  defghij.........
	0x00c0:  d0b4 d0b5 d0b6 d0b7 d0b8 d0ba d0bb 0000  ................

(the message was "abcdefghij абвгдежзикл")
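
The dump can be checked quickly in Python. The hex string below is copied from the dump above, starting at the ASCII "a" (0x61) and leaving out the surrounding header and trailing zero bytes:

```python
# Payload bytes copied from the packet dump, message text only.
payload = bytes.fromhex(
    "6162636465666768696a20"                        # "abcdefghij "
    "d0b0d0b1d0b2d0b3d0b4d0b5d0b6d0b7d0b8d0bad0bb"  # Cyrillic part
)

print(payload.decode("utf-8"))  # abcdefghij абвгдежзикл

# The same text in windows-1251 would need one byte per Cyrillic letter.
print(len(payload), len(payload.decode("utf-8").encode("cp1251")))  # 33 22
```

Two bytes per Cyrillic letter with a 0xd0 lead byte is what UTF-8 looks like on the wire, not windows-1251.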
Dec 25, 2008
Project Member #18 r000ns...@gmail.com
Hm. Maybe the current version will help (in the windows-1251 configuration)?
http://groups.google.com/group/py-transports/web/pyicqt-0.8.1-git.tar.gz
Dec 25, 2008
#19 egcros...@gmail.com
(pulled source from git instead, commit 95b3c48f00df737cff5b8476657d9b6f4582d4fa)

This time, with licq configured to use cp1251, and <encoding> windows-1251 in
pyicqt.conf, both online and offline messages in both directions are readable. Yay!

I think that licq complained once when it received some sort of status message (not
real conversation), about being unable to convert it to or from ucs2-2le or something
like that but I did not record any details.

I will see how it works with other clients and report any problems I find.

Thanks!

Eugene
Dec 25, 2008
#20 egcros...@gmail.com
OK, so this is how things look with the latest version:

- online conversations are fine with all my peers running different clients

- offline messages that I receive were readable every time I checked

- offline messages that I send are almost never readable by the peers. They are
readable by licq users, and on *some* occasions by ICQ6 users. To other ICQ6 users,
and to the Miranda, QIP and Pidgin users that I checked with, they either look like
garbage or are empty.

If I understand it right, it seems that *offline* messages are better to send in
utf16(?) as you did before the last change; that would leave licq in the cold but
ensure compatibility with the majority.

Eugene
Dec 25, 2008
Project Member #21 r000ns...@gmail.com
The encoding for messages is now chosen in a different way. Instead of the rule 'always send
in Unicode', there are 2 rules:
1. Send in the custom encoding by default
2. Send in Unicode if the contact supports it.
But to check for this support, the transport has to see the contact online at least once after it starts:
1. The transport starts
2. The contact comes online
3. The transport saves info about Unicode support
4. The contact goes offline
5. You send a message
6. The transport sends the message in Unicode
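
The two rules and the six steps above can be sketched roughly as follows. This is an illustration only: `EncodingChooser`, its method names and the choice of "utf-16-be" as the Unicode encoding are assumptions, not the transport's actual code.

```python
from typing import Dict


class EncodingChooser:
    """Pick an outgoing encoding per contact (hypothetical helper)."""

    def __init__(self, custom_encoding: str = "cp1251") -> None:
        self.custom_encoding = custom_encoding
        # Filled only while the transport is running; empty after a restart.
        self.unicode_capable: Dict[str, bool] = {}

    def contact_seen_online(self, uin: str, supports_unicode: bool) -> None:
        # Steps 2-3: remember (and refresh) the capability on every sighting.
        self.unicode_capable[uin] = supports_unicode

    def encoding_for(self, uin: str) -> str:
        # Rule 2: Unicode if the contact is known to support it.
        if self.unicode_capable.get(uin, False):
            return "utf-16-be"  # assumed Unicode encoding
        # Rule 1: otherwise the custom encoding from the config.
        return self.custom_encoding


chooser = EncodingChooser("cp1251")
print(chooser.encoding_for("12345"))  # cp1251
chooser.contact_seen_online("12345", True)
print(chooser.encoding_for("12345"))  # utf-16-be
```

Because the capability table starts empty, a message sent right after a transport restart falls back to the custom encoding until the contact has been seen online.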
Dec 26, 2008
#22 egcros...@gmail.com
re. comment #21: I'd be happy to test it against my peers, just please drop a note
when you commit the changes into the repository.

Upfront comments:
1. Isn't it an overkill? Maybe it's reasonable to sacrifice offline messages to licq
for the sake of simplicity? As long as all other combinations work...
2. Don't forget that the peer may change their software at any time: it's probably a
good idea to refresh our notion of their capabilities every time we see them online.

Thanks,

Eugene
Dec 26, 2008
Project Member #23 r000ns...@gmail.com
:) No, in reality it's a good way. ICQ clients do something like this too.
Dec 31, 2008
Project Member #24 r000ns...@gmail.com
(No comment was entered for this change.)
Status: Fixed
Dec 31, 2008
#25 egcros...@gmail.com
Fixed?
At the very least, it does not work for offline messages to Adium and to ICQ6 (they
cannot read my messages). Works for QIP. More clients to check...

(yes, I did exchange online messages with them, before testing offline)

Eugene
Jan 6, 2009
Project Member #26 r000ns...@gmail.com
:) OK, a separate option for choosing the encoding has been added.
Jan 12, 2009
#27 Mur...@gmail.com
In which version is this fixed? I tried this version today:
http://groups.google.com/group/py-transports/web/pyicqt-0.8.1-git.tar.gz
but the problem is still there.
I receive wrong Russian letters in offline messages:
[09:22:17] <lawrentiy> †ⴠ⃤⃢¥N⸮⃨⃱⃰ﬠ
[09:22:18] <lawrentiy> Ⱐ⃴‭⃯Ⱐ@⃱⋒⃰易⃷⃲⃱⃭㼠ﰠⰠ쇄
⃨⃱ﬠ⃳⃲@

r000nster, can you send me (murznn[at]gmail.com) the version of pyicqt with this
issue fixed, for testing?
Jan 12, 2009
Project Member #28 r000ns...@gmail.com
It's already the latest version. Just add this line to your config:
<detectunicode>1</detectunicode>

 
Jan 16, 2009
#29 egcros...@gmail.com
As of today (git commit 1c2b8a0a3846ed296d0f5ef193294d15f0de8e38), offline messages
from pyicqt to many other clients are unreadable (checked with ICQ6 and QIP). I have
"encoding for outgoing offline messages" set to "auto detect". I'd say things are now
worse than they were before the Dec 25 change.

Should this ticket be reopened, or a new one opened?

Eugene
Jan 28, 2009
#30 Mur...@gmail.com
I am using version 0.8.1.1 of pyicqt with this patch:
http://pyicqt.googlecode.com/issues/attachment?aid=8022030648936962831&name=pyicq-t-0.8-seqnum.patch

And today I got a bad offline message:
[11:40:51] <Nickname> ‡਍ﳫ⃲業慬楶獴湡⹮畲振湯牴汯ഠ爊潯⽴敲楶楳湯
But he sent:
[11:44:51] <Nickname> пароль от xxxx root/xxxxx

I have <detectunicode>1</detectunicode> in the config.
The whole config is:
<pyicqt>
        <jid>icq.xxx.ru</jid>
        <mainServer>127.0.0.1</mainServer>
        <mainServerJID>xxx.ru</mainServerJID>
        <website>http://xxx.ru/</website>
        <port>5347</port>
        <secret>xxx</secret>
        <lang>ru</lang>
        <encoding>cp1251</encoding>
        <icqServer>login.oscar.aol.com</icqServer>
        <icqPort>5190</icqPort>
        <admins>
        <jid>murz@xxx.ru</jid>
        </admins>
        <xdbDriver>xmlfiles</xdbDriver>
        <detectunicode>1</detectunicode>
        <usemd5auth/>
</pyicqt>
Maybe I need to set something else in the config?
