You mean your dns names that you point your phones to? How long are they?
I would honestly switch to DNS-based failover if I were you. SRV is going to continue to be a pain in the rear.
Our record names aren’t long individually, but a single SRV record on our system contains 9 different servers of varying priorities. That means the phone has to make 10 DNS resolutions (the SRV lookup plus one per server) every time it makes a call or performs an action.
This does not affect Yealinks or Polycoms because they actually built DNS caches into their software like a proper manufacturer.
Alright we’re good!
We eliminated 6 A records in each of our SRV records, so that our SRV records only include 1 server node per datacenter (3 nodes per SRV), and all our GS phones are happy as clams.
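To put rough numbers on the change above, here is a minimal sketch of the lookup count before and after trimming. It assumes the phone caches nothing and resolves every target each time, which is how the behaviour is described in this thread. The hostnames, priorities, and helper names are hypothetical, not our real records.

```python
from typing import NamedTuple, Sequence


class SrvTarget(NamedTuple):
    priority: int   # lower value is tried first
    weight: int
    port: int
    target: str     # hostname that still needs its own A lookup


# Hypothetical trimmed record set: one node per datacenter.
TRIMMED = [
    SrvTarget(10, 100, 5060, "dc1.sip.example.com"),
    SrvTarget(20, 100, 5060, "dc2.sip.example.com"),
    SrvTarget(30, 100, 5060, "dc3.sip.example.com"),
]


def uncached_lookups(targets: Sequence) -> int:
    """One SRV query plus one A query per target, assuming no DNS cache."""
    return 1 + len(targets)


def failover_order(targets: Sequence[SrvTarget]) -> list:
    """Hostnames in the order a client would try them, lowest priority first."""
    return [t.target for t in sorted(targets, key=lambda t: t.priority)]


if __name__ == "__main__":
    print("9 targets:", uncached_lookups(range(9)), "lookups")
    print("3 targets:", uncached_lookups(TRIMMED), "lookups")
    print("try order:", failover_order(TRIMMED))
```

A phone with a real DNS cache would only pay this cost once per TTL, which is presumably why the other brands never showed the problem.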
So, for the record, firmware *.108 is golden!
Nope. There is a security hole there.
Fixed access to a root shell (CVE-2018-17565). Check the 117 release for the list of fixes.
Awesome! So far so good here too for our office phones. What type of setup do you have on the backend? Asterisk? FreePBX? For our office, I just have 1 FreePBX box with a SIP trunk, but would like to get more information regarding redundancy, failover, etc. Could you share anything?
We used to use FreePBX a few years ago, and I do love FreePBX, but it wasn’t sufficient for our customers anymore at that time.
So we ended up creating our own software on top of Asterisk, and it worked really well for a couple years, but it provided no redundancy or failover (besides just cloning the virtual machines and backing them up).
So, we invented a new, geographically elastic infrastructure that we are currently patenting. The patent should finish within the next year or two. Our system is basically an AI-powered real-time analysis system that re-routes calls around outages in real time, with full geographic failover within 10 seconds.
We can lose an entire datacenter, and over 2,000 SIP phones throughout the country will very quickly switch to a different datacenter while the customers’ PBX systems simultaneously failover and start back up in alternative datacenters.
It’s quite revolutionary and we’ve gotten a couple of multimillion-dollar buy-out offers because of it.
I think we have a YouTube video demonstrating it somewhere.
That’s great. I love seeing innovations in this space. Nice work. Me being a tech guy, how on earth do you have client PBX system start back up in an alternative data center within 10 seconds? You’re saying you can boot a duplicate PBX in 10 seconds?
Yessir. Not just a single PBX though, we’re talking about 50 to 100 at once. Since we have nearly 300 PBXes in production, the spread ends up being 100 per datacenter (we have 3 DCs for right now), so when a datacenter goes down (fiber cut, hurricane, aliens), 100 PBX systems are failing over to the 2 remaining datacenters and spinning up at once. This process causes minimal-to-no interruption for our customers.
Ah, here we are:
haha we produced that over a year ago when the infrastructure finally became viable, after 6 months of engineering.
That’s great. What is your virtualization platform? KVM? VMware? Something else? If this is proprietary stuff, just tell me to stop asking questions. I’m a tech guy, so I’m curious.
The cluster nodes are all VMs, on VMware ESXi 6.5 hosts in each of our datacenters.
We’re exploring upgrading to 6.7 because of all the awesome new stuff they added, and the improved HTML5 GUI
The code and applications we use for cluster operations are proprietary for now, but once the patent goes through we’ll be able to talk all about it. You can see it in action in the YouTube video I posted earlier.
Awesome stuff. Nice work. HTML 5 GUI is my preference as well.
@lukeescude Since you seem to be an expert on Grandstream phones, what settings do you set on the phones to make them reconnect as quickly as possible after a local internet outage? Our PBX is cloud hosted and offsite, and whenever there is an internet blip and the phones go offline, I want them to reach back out to the server as quickly as possible. Any tips for this?
Fundamentally, this is registration and keep-alive time. The phones assume that the server is there for as long as they are registered (60 minutes by default). If they try to make a call and the server isn’t there, then they get freaked out. So you can change the registration timeout in the SIP parameters for each account to be whatever you want. More time = more chance that the server isn’t really there (or the network is down). Less time means more chance the phone will re-register if the network did go down.
However, there are tradeoffs: less time means more registrations. Your SIP provider may set a limit on that (we do) to prevent DoS-type attacks.
Also, the keep-alive option (same screen) says how long to wait (default is 30 seconds) before sending a low-cost OPTIONS request to the server. This is generally to keep firewall ports open, but if there’s no response within 3 (default) OPTIONS timeouts (total of 90 seconds) then the phone assumes the server is gone and freaks out. You can tweak these settings, too.
I’d be more concerned that your network “blips” frequently and just let the phones adjust accordingly.
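The keep-alive arithmetic above can be sketched as follows. This is only a back-of-the-envelope helper that multiplies the OPTIONS interval by the number of missed responses the phone tolerates. The function and parameter names are made up for illustration and are not Grandstream setting names.

```python
def keepalive_detection_seconds(interval_s: int = 30, max_missed: int = 3) -> int:
    """Approximate time before the phone gives up on the server.

    Assumes the phone sends one OPTIONS request every `interval_s` seconds
    and declares the server gone after `max_missed` unanswered requests.
    """
    return interval_s * max_missed


if __name__ == "__main__":
    # Defaults described above: 30 s interval, 3 timeouts.
    print(keepalive_detection_seconds(), "seconds with the defaults")
    # A shorter interval narrows the window at the cost of more packets.
    print(keepalive_detection_seconds(interval_s=15), "seconds with a 15 s interval")
```

Halving the interval halves the detection window, but it also doubles the number of OPTIONS packets each phone sends.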
SmartVox is correct, it boils down to 2 settings: Registration Expiration time (default is 60 minutes but this is WAY too long, we use 10 Minutes) and OPTIONS KeepAlive packets every 30 seconds.
You can likely go down to 5 minute register times and maybe even 15 second OPTIONS packets if you wanted to, but SmartVox brought up a good point about packet flow monitoring, you’d have to check with your SIP provider if this is going to be an issue.
In our case, we don’t do any form of flow control at all - We allow all packets, and simply drop the ones that are spammy.
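For a rough feel of what shorter register times cost the server, here is a small sketch using the roughly 2,000 phones mentioned earlier in the thread. It assumes each phone sends exactly one REGISTER per expiry interval and that the phones are spread evenly over time. Real phones usually refresh somewhat before expiry and may need an extra round trip for authentication, so treat these as lower bounds. The function name is hypothetical.

```python
def registers_per_second(phones: int, expiry_minutes: float) -> float:
    """Average REGISTER rate, assuming one REGISTER per phone per expiry interval."""
    return phones / (expiry_minutes * 60)


if __name__ == "__main__":
    phones = 2000  # approximate fleet size mentioned earlier in the thread
    for minutes in (60, 10, 5):
        rate = registers_per_second(phones, minutes)
        print(f"{minutes:>2} min expiry: about {rate:.2f} REGISTERs per second")
```

Going from 60 minutes to 5 minutes multiplies the registration load by 12, which is the kind of jump a provider's rate limiting might notice.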
That makes sense. So basically, settings like this:
What about SIP Listening Mode? If I am using TCP…
Personally, I prefer TCP for SIP.
Even if you have the phone set to TCP, only the SIP is TCP - The RTP is still UDP, which is good for audio quality.
TCP gives you a few advantages:
- Packet delivery guarantee (Retransmissions are protocol-level)
- Sockets are handshook so the phone will know immediately if the server goes down
- Faster phone call connection, thanks to the delivery guarantee in the first item.
- A lot of ISPs nowadays have awful support for UDP, and end up having all sorts of weird loss,