SIP Accounts remain unregistered on network restoration - DNS SRV Failure

bug

#21

Here you go, debug-level syslog included. How long can a capture run, and how much storage is available for these debug packages? It’s difficult to catch this after it happens, but I could start the debug at home and then retrieve it once the issue occurs at the office, to hopefully capture what happens when the phone reconnects to AP2 and begins sending queries to the wrong DNS server address.

Also, I am editing my previous post, as I reversed the networks when explaining things.

WP800_1.zip (168.6 KB)


#22

Hi,

I think this debug package was captured a bit early. Please wait until you are back on the office network before extracting the log. We basically want to see the log of the transition from the home network to the office network. As for the packet capture, the device typically has around 500 MB of storage. If you do capture, we just need the packet capture from when it’s back in the office environment. If not, that’s OK, but we do need to know at what times the handset was at the office or at home so we can scan the logs quickly to identify the cause.

We also tried to reproduce the issue here while mimicking your setup. For us, the SRV query seems to use the new network’s DNS server every time. Do you know whether a reboot helps resolve the issue? You do not have an alternate DNS set up on the handset, right?

Thanks


#23

I won’t be travelling between the office and home until Monday, but I will make sure to start the debug prior to moving networks on Monday. I will start one here in a moment before I head home, and maybe we’ll see it happen.

Alternate DNS is not configured, and the phone is relying only on DHCP for DNS server addresses.

A reboot does clear the problem; however, disabling WiFi and re-enabling it does not. I have captured the disable/enable of WiFi and the failure to register due to the SRV DNS failure in the attached debug.

WP800_2.zip (107.8 KB)


#24

Happy Monday,

I was able to reproduce the problem, with a debug running from when I left AP1 to when DNS fails on AP2.

WP800_3.zip (946.9 KB)

Looking around the internet, I also found a very similar sounding issue: https://stackoverflow.com/questions/42444575/android-dns-java-srv-lookup-fails-when-network-connectivity-changes

Not sure if it’s at all related, but sure looks like some underlying process caching the DNS server from the previous network.
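
For anyone who wants to check from a laptop on the same network whether a given DNS server answers the SRV lookup at all, here is a minimal Python sketch using only the standard library. It builds a plain SRV query by hand and sends it over UDP to one specific server, so the result does not depend on whatever resolver the OS has cached. The function names, the server address `192.0.2.53` and the record name `_sip._udp.example.com` are placeholders I made up for illustration; substitute the DNS server handed out by DHCP on each network and your own SIP domain. This is not how the handset does its lookup, just a way to compare servers.

```python
import socket
import struct

SRV_TYPE = 33  # DNS resource record type SRV
IN_CLASS = 1   # DNS class IN (Internet)


def build_srv_query(name: str, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet asking for the SRV records of `name`."""
    # Header: id, flags (recursion desired), 1 question, 0 answer/authority/additional
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", SRV_TYPE, IN_CLASS)


def server_answers_srv(server_ip: str, name: str,
                       timeout: float = 3.0, txid: int = 0x1234) -> bool:
    """Return True if `server_ip` replies with rcode 0 and at least one answer."""
    query = build_srv_query(name, txid)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            reply, _ = sock.recvfrom(4096)
        except OSError:
            # Timeout or unreachable server: treat as "no answer"
            return False
    if len(reply) < 12:
        return False
    reply_id, flags, _qdcount, ancount = struct.unpack("!HHHH", reply[:8])
    return reply_id == txid and (flags & 0x000F) == 0 and ancount > 0


if __name__ == "__main__":
    # Placeholder values: replace with the real DNS server and SIP domain
    print(server_answers_srv("192.0.2.53", "_sip._udp.example.com"))
```

Running it once against the DNS server of the previous network and once against the server of the current network should show which of the two is reachable from where the phone currently sits.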


#25

Hi,

Thanks for the new logs and finding. We will continue to investigate the issue.


#26

Ran another test this morning with the alternate DNS set in the phone, and did not have any issue. When arriving at the new network all DNS queries, A record and SRV record, are sent to the manually defined DNS servers, and the phone registers fine.

Temp.zip (1.1 MB)


#27

@bnelsonfs

Could you please check your PM for the test build we sent to check on this issue?

Thanks


#28

@bnelsonfs

Any feedback on the build we provided to you?

Anyone else have this issue still?

Thanks


#29

Has this been resolved? We saw it as well: the account stays unregistered although the network is up again. It seems to happen mostly in the morning, after the phone has sat overnight. Reproducing it is not so easy for us, though.


#30

Hi,

Yes, a fix was made based on the findings from the logs of the problem device at the time. For your particular issue, did you move from one environment to another as reported by the OP? Or did your handset simply lose registration overnight in the same location?

We would advise first setting syslog to “debug” mode on your handset (Settings->Advanced settings->Syslog). When the problem occurs, please note the handset date and time. After the problem occurs, go to the web UI of the device and, under Maintenance->System Diagnosis->Debug->One-click debugging, press “start”, wait a few minutes, then press “stop”. Then click “list”, download the debug package that was generated, and attach it here. This would give us the device log and a packet capture of the device’s current behavior.

Thanks


#31

It simply lost registration (not network) overnight. It happens much less frequently in beta 1.0.1.16 than in 1.0.1.15.


#32

Hi,

That’s strange, because there wasn’t any improvement in this area from 1.0.1.15 to 1.0.1.16. Please continue to monitor for the issue and follow the steps in the previous post to acquire the log when the issue occurs.

Thanks


#33

OK, in that case setting the keep-alive option had an impact on the situation. I’ve seen cases since then, but only rarely.


#34

Hi,

By default, after the SIP OPTIONS keep-alive fails three times, the handset triggers re-registration. Did the REGISTER fail?
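
To make the default behaviour described above concrete, here is a rough Python sketch of a failure counter that requests re-registration after three failed keep-alives. It is only an illustration of the logic, not the handset firmware: the class name `KeepAliveMonitor` and its method are made up, and the sketch assumes the failures have to be consecutive, with a successful reply resetting the count.

```python
class KeepAliveMonitor:
    """Track SIP OPTIONS keep-alive results and decide when to re-register."""

    def __init__(self, max_failures: int = 3):
        self.max_failures = max_failures  # threshold of three, as stated above
        self.failures = 0

    def record(self, options_ok: bool) -> bool:
        """Record one keep-alive result; return True if re-registration is due."""
        if options_ok:
            # Assumption: any successful reply resets the failure count
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            # Threshold reached: reset and signal that REGISTER should be resent
            self.failures = 0
            return True
        return False


monitor = KeepAliveMonitor()
results = [False, False, True, False, False, False]
triggers = [monitor.record(ok) for ok in results]
print(triggers)
```

In this model the first two failures do not trigger anything because the third reply succeeds; only the later run of three failures in a row does. If the new REGISTER then also fails, the account would stay unregistered, which is why the REGISTER result matters here.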

Thanks