Obsolete
Status Update
Comments
sa...@google.com <sa...@google.com> #2
Thank you for your feedback. We assure you that we are doing our best to address the issue reported, however our product team has shifted work priority that doesn't include this issue. For now, we will be closing the issue as won't fix obsolete. If this issue currently still exists, we request that you log a new issue along with latest bug report here https://goo.gl/TbMiIO .
Description
My test setup currently is as follows:
2* Google Nexus 4 running Android 5.1.1
2* NVidia Shield Tablets running Android 5.1.1
All interconnected through a stable WiFi connection, local address range 10.0.0.x
From a user's point of view, the following problems arise:
*) Devices discover each other in a non-deterministic fashion. If I launch my NsdManager-based app, it works more or less if only two devices are involved. If a third device joins, it will seldomly discover the first two, and the first two will not see the third. If I exit the apps (they gracefully unregister the NSD listeners) and restart them in a different order, the discovery pattern differs and is incomplete again.
*) In my home network, the IP resolution of the discovered devices basically works as expected. At work, sometimes even when only using two devices (A and B), device A will resolve device B's service with A's IP address and B's port and vice-versa. So somehow the IP address and service name somehow seem to get mixed up at a lower level (probably the NsdManager).
Now the developer's perspective:
I have spent multiple full workdays already trying to find out the issue. I checked my code 100 times and made sure that there are no race conditions, threading issues or problems of unsynchronized object writes that could cause this. I added extensive logging that suggests all problems emanate from the NsdManager/NsdService level.
I have now found out in the logs, that the underlying MDnsDS service correctly discovers and resolves all services. However, then, sometimes the resolving step within NsdService does not yield a result, or even the discovery itself is not propagated to my discovery listeners.
I add some exemplary log output here that illustrates the problem (I filtered some messages, the full log is attached to this report):
02-18 16:57:02.327: D/NsdService(628): startMDnsDaemon
02-18 16:57:02.327: D/MDnsDS(187): Starting MDNSD
02-18 16:57:02.529: D/NsdService(628): New client listening to asynchronous messages
02-18 16:57:02.529: D/NsdService(628): New client, channel: com.android.internal.util.AsyncChannel@1fa188ce messenger: android.os.Messenger@cca33ef
02-18 16:57:02.532: D/NsdService(628): Register service
02-18 16:57:02.532: D/NsdService(628): registerService: 106 name: TuSync-0.57392, type: _tusync._tcp., host: /::, port: 57392
02-18 16:57:02.533: D/MDnsDS(187): serviceRegister(106, (null), TuSync-0.57392, _tusync._tcp., (null), (null), 57392, 0, <binary>)
02-18 16:57:02.533: D/MDnsDS(187): serviceRegister successful
02-18 16:57:02.534: D/NsdService(628): Register 1 106
02-18 16:57:04.083: D/MDnsDS(187): register succeeded for 106 as TuSync-0.57392
02-18 16:57:04.087: D/NsdService(628): SERVICE_REGISTERED Raw: 606 106 "TuSync-0.57392"
02-18 16:57:04.109: D/NsdService(628): Discover services
02-18 16:57:04.109: D/NsdService(628): discoverServices: 107 _tusync._tcp.
02-18 16:57:04.110: D/MDnsDS(187): discover((null), _tusync._tcp., (null), 107, 0)
02-18 16:57:04.110: D/MDnsDS(187): discover successful
02-18 16:57:04.110: D/NsdService(628): Discover 2 107_tusync._tcp.
02-18 16:57:04.333: D/MDnsDS(187): Discover found new serviceName TuSync-0.57392, regType _tusync._tcp. and domain local. for 107
02-18 16:57:04.334: D/NsdService(628): SERVICE_FOUND Raw: 603 107 "TuSync-0.57392" _tusync._tcp. local.
02-18 16:57:04.338: D/NsdService(628): Resolve service
02-18 16:57:04.338: D/NsdService(628): resolveService: 108 name: TuSync-0.57392, type: _tusync._tcp., host: null, port: 0
02-18 16:57:04.339: D/MDnsDS(187): resolveService(108, (null), TuSync-0.57392, _tusync._tcp., local.)
02-18 16:57:04.345: D/MDnsDS(187): startMonitoring 108
02-18 16:57:04.345: D/MDnsDS(187): resolveService successful
02-18 16:57:04.346: D/MDnsDS(187): resolve succeeded for 108 finding TuSync-0\.57392._tusync._tcp.local. at Android-3.local.:57392 with txtLen 1
02-18 16:57:04.347: D/NsdService(628): SERVICE_RESOLVED Raw: 608 108 "TuSync-0\\.57392._tusync._tcp.local." "Android-3.local." 57392 1
02-18 16:57:04.347: D/NsdService(628): stopResolveService: 108
02-18 16:57:04.347: D/MDnsDS(187): Stopping resolve with ref 0xb5c4734c
02-18 16:57:04.349: D/NsdService(628): getAdddrInfo: 109
02-18 16:57:04.349: D/MDnsDS(187): getAddrInfo(109, (null) 0, Android-3.local.)
02-18 16:57:04.350: D/MDnsDS(187): getAddrInfo successful
02-18 16:57:04.352: D/MDnsDS(187): getAddrInfo succeeded for 109: 109 "Android-3.local." 120 10.0.0.4
02-18 16:57:04.352: D/MDnsDS(187): getAddrInfo succeeded for 109: 109 "Android-3.local." 120 fe80::204:4bff:fe2c:6c87
02-18 16:57:04.354: D/NsdService(628): SERVICE_GET_ADDR_SUCCESS Raw: 612 109 "Android-3.local." 120 10.0.0.4
02-18 16:57:04.354: D/NsdService(628): stopGetAdddrInfo: 109
02-18 16:57:04.355: D/MDnsDS(187): Stopping getaddrinfo with ref 0xb5c472d4
02-18 16:57:04.364: E/NsdService(628): Unique id with no client mapping: 109
02-18 16:57:04.364: E/NsdService(628): Unhandled { when=-10ms what=393242 obj=com.android.server.NsdService$NativeEvent@86af300 target=com.android.internal.util.StateMachine$SmHandler }
02-18 16:57:04.627: D/MDnsDS(187): Discover found new serviceName TuSync-0.36230, regType _tusync._tcp. and domain local. for 107
02-18 16:57:04.632: D/MDnsDS(187): Discover found new serviceName TuSync-0.60493, regType _tusync._tcp. and domain local. for 107
02-18 16:57:04.633: D/NsdService(628): SERVICE_FOUND Raw: 603 107 "TuSync-0.36230" _tusync._tcp. local.
02-18 16:57:04.634: D/NsdService(628): SERVICE_FOUND Raw: 603 107 "TuSync-0.60493" _tusync._tcp. local.
02-18 16:57:04.635: D/NsdService(628): Resolve service
02-18 16:57:04.635: D/NsdService(628): resolveService: 110 name: TuSync-0.36230, type: _tusync._tcp., host: null, port: 0
02-18 16:57:04.636: D/MDnsDS(187): resolveService(110, (null), TuSync-0.36230, _tusync._tcp., local.)
02-18 16:57:04.637: D/MDnsDS(187): resolve succeeded for 110 finding TuSync-0\.36230._tusync._tcp.local. at Android.local.:36230 with txtLen 1
02-18 16:57:04.638: D/NsdService(628): Resolve service
02-18 16:57:04.638: D/NsdService(628): SERVICE_RESOLVED Raw: 608 110 "TuSync-0\\.36230._tusync._tcp.local." "Android.local." 36230 1
02-18 16:57:04.639: D/NsdService(628): stopResolveService: 110
02-18 16:57:04.639: D/MDnsDS(187): Stopping resolve with ref 0xb5c473c4
02-18 16:57:04.643: D/MDnsDS(187): getAddrInfo succeeded for 111: 111 "Android.local." 120 10.0.0.5
02-18 16:57:04.643: D/MDnsDS(187): getAddrInfo succeeded for 111: 111 "Android.local." 120 fe80::204:4bff:fe26:8483
02-18 16:57:04.644: D/NsdService(628): SERVICE_GET_ADDR_SUCCESS Raw: 612 111 "Android.local." 120 10.0.0.5
02-18 16:57:04.644: D/NsdService(628): stopGetAdddrInfo: 111
02-18 16:57:04.645: D/MDnsDS(187): Stopping getaddrinfo with ref 0xb5c47364
02-18 16:57:04.645: D/MDnsDS(187): Going to poll with pollCount 3
02-18 16:57:04.658: E/NsdService(628): Unique id with no client mapping: 111
02-18 16:57:04.658: E/NsdService(628): Unhandled { when=-14ms what=393242 obj=com.android.server.NsdService$NativeEvent@1d93a739 target=com.android.internal.util.StateMachine$SmHandler }
Some info on the context: obviously, my service type is _tusync._tcp. and I create unique service names for all nodes in the format TuSync-0.[local port number] to prevent naming conflicts and ease debugging. In this scenario, there are three devices. The logging device's IP is 10.0.0.4.
The log shows, that the underlying MDnsDS correctly discovers and resolves all nodes. However, the NsdService above does not resolve all of them. There seems to be an ID conflict at 16:57:04.627, where both of the device's peers (TuSync-0.36230 and TuSync-0.60493) get assigned an internal ID of 107. The discoveryListener I registered with the NsdManager is notified upon the discovery of both nodes, however, resolving only works for one of them, the other triggers an error:
02-18 16:57:04.638: E/NsdHelper(6370): Resolve failed with error code: 3. Service: name: TuSync-0.60493, type: _tusync._tcp., host: null, port: 0
I have also experienced additional cases where after the NsdService emits a "SERVICE_FOUND Raw" message in the logs, the my discovery listener is not notified. An exemplary log (heavily filtered; same test setup as above):
02-18 17:54:06.692: D/MDnsDS(187): Starting MDNSD
02-18 17:54:06.896: D/NsdService(628): registerService: 112 name: TuSync-0.57392, type: _tusync._tcp., host: /::, port: 57392
02-18 17:54:06.896: D/MDnsDS(187): serviceRegister(112, (null), TuSync-0.57392, _tusync._tcp., (null), (null), 57392, 0, <binary>)
02-18 17:54:06.896: D/MDnsDS(187): serviceRegister successful
02-18 17:54:08.802: D/NsdService(628): SERVICE_REGISTERED Raw: 606 112 "TuSync-0.57392"
02-18 17:54:08.820: D/NsdService(628): Discover services
02-18 17:54:09.050: D/MDnsDS(187): Discover found new serviceName TuSync-0.57392, regType _tusync._tcp. and domain local. for 113
02-18 17:54:09.050: D/NsdService(628): SERVICE_FOUND Raw: 603 113 "TuSync-0.57392" _tusync._tcp. local.
02-18 17:54:09.211: D/MDnsDS(187): Discover found new serviceName TuSync-0.60493, regType _tusync._tcp. and domain local. for 113
02-18 17:54:09.212: D/NsdService(628): SERVICE_FOUND Raw: 603 113 "TuSync-0.60493" _tusync._tcp. local.
02-18 17:54:09.215: D/NsdService(628): resolveService: 116 name: TuSync-0.60493, type: _tusync._tcp., host: null, port: 0
02-18 17:54:09.216: D/MDnsDS(187): resolveService(116, (null), TuSync-0.60493, _tusync._tcp., local.)
02-18 17:54:09.217: D/MDnsDS(187): resolve succeeded for 116 finding TuSync-0\.60493._tusync._tcp.local. at Android-2.local.:60493 with txtLen 1
02-18 17:54:09.219: D/NsdService(628): SERVICE_RESOLVED Raw: 608 116 "TuSync-0\\.60493._tusync._tcp.local." "Android-2.local." 60493 1
02-18 17:54:09.228: D/MDnsDS(187): getAddrInfo succeeded for 117: 117 "Android-2.local." 120 10.0.0.6
02-18 17:54:09.228: D/MDnsDS(187): getAddrInfo succeeded for 117: 117 "Android-2.local." 120 fe80::c643:8fff:fec5:5648
02-18 17:54:09.229: D/NsdService(628): SERVICE_GET_ADDR_SUCCESS Raw: 612 117 "Android-2.local." 120 10.0.0.6
02-18 17:54:09.244: D/MDnsDS(187): Discover found new serviceName TuSync-0.36230, regType _tusync._tcp. and domain local. for 113
02-18 17:54:09.251: E/NsdService(628): Unique id with no client mapping: 117
02-18 17:54:09.251: E/NsdService(628): Unhandled { when=-22ms what=393242 obj=com.android.server.NsdService$NativeEvent@1e992653 target=com.android.internal.util.StateMachine$SmHandler }
02-18 17:54:09.255: D/NsdService(628): SERVICE_FOUND Raw: 603 113 "TuSync-0.36230" _tusync._tcp. local.
In this case, the discovered node 10.0.0.5 (port 36230) triggers no discoveryListener notification. After the last log message, nothing happens.
The source of my NsdHelper class is attached for reference. The low amount of similar bug reports makes me wonder if I'm the only one with these problems or if the NsdManager is completely unstable and no-one uses it anyway?
Note, that I even implemented a semaphore that can be set to 1 to prevent resolving multiple services in parallel, as somebody else reported problems with parallel resolution.