Graph stalls: just stops at some point

Started by Mark

Mark

Graph stalls: just stops at some point   10 February 2016, 22:42

The problem with the graph stalling/stopping has popped up again. The graph just stops at some point -- haven't been able to correlate it with anything else -- but the rest of NetWorx still works. Have to exit and re-launch to get it going again. Seems to run OK for quite a while, maybe days, and then at some point I notice it's stopped.

I had reported this last year and I thought it had gotten fixed. But with the 5.5.1 update, it seems to be back. I've tried putting it in compatibility mode as Win 7, but it still stalls. Am running 64-bit NetWorx version 5.5.1 on Win 10 with all the latest updates.
Andrew

Re: Graph stalls   10 February 2016, 22:43

What have you chosen under Monitored Interfaces?
Mark

Re: Graph stalls   10 February 2016, 22:43

Forgot to mention I am using NetWorx in router monitoring mode.
Andrew

Re: Graph stalls   10 February 2016, 22:44

SNMP or UPnP?
Mark

Re: Graph stalls   11 February 2016, 02:23

UPnP.
Andrew

Re: Graph stalls   11 February 2016, 09:35

It could also be a router issue. How reproducible is it? I mean, does it happen in, say, 10 minutes, or 1 week?

I can add some debug code to see what's happening under the hood, but it's probably not worth doing if the graph stalls very infrequently.
Mark

Re: Graph stalls   12 February 2016, 04:57

It had been working OK with my router with the earlier 5.4.2 rev, so maybe not the router. The graph seems to stop after about a day. Depends on how many times I need to reboot Windows that day! ;)
Mark

Re: Graph stalls   13 February 2016, 07:52

Here's another data point on this issue. Just found the graph stalled again and changed the NetWorx setting to monitor my gigabit network connector. When I clicked Apply, the graph started moving again. When I changed it back to the router's name and clicked Apply, the graph stopped again. Router problem or something with the UPnP software?
Andrew

Re: Graph stalls   15 February 2016, 12:42

I am afraid I can't tell with certainty. I can make a test build that writes a log, then we can see what's going on under the hood. Would you be willing to test it?
Mark

Re: Graph stalls   15 February 2016, 21:47

Quote

Andrew

I can make a test build that writes a log, then we can see what's going on under the hood. Would you be willing to test it?


Sure, just give me the link and I'll give it a try.

Simply restarting NetWorx (and not doing anything to the router) fixes this -- for a while. After the graph stalls, NetWorx still works, and if I go into settings and select anything other than the router to monitor, the graph starts moving again. Re-selecting the router causes the graph to stop again. The only way I've found to get it moving again with the router is to re-launch NetWorx.
Andrew

Re: Graph stalls   16 February 2016, 18:38

Sure, here's the debug version. It's the portable build, so just unpack it in a temporary folder, configure it to monitor the router via UPnP and you will see a log file created in the same folder.

When the problem occurs, quit NetWorx and send me the log. It can be quite large over a day, so please compress it and upload here.
Mark

Re: Graph stalls   18 February 2016, 02:20

I'm uploading the full log file, zipped, as requested, but here is the relevant portion:

2/17/2016 16:23:36 Polling UPNP
2/17/2016 16:23:36 Router present
2/17/2016 16:23:36 TotalBytesReceived
2/17/2016 16:23:36 Reply OK
2/17/2016 16:23:36 TotalBytesReceived=105851217
2/17/2016 16:23:36 TotalBytesSent
2/17/2016 16:23:36 Reply OK
2/17/2016 16:23:36 TotalBytesSent=245614844
2/17/2016 16:23:37 Polling UPNP
2/17/2016 16:23:37 Router present
2/17/2016 16:23:37 TotalBytesReceived
2/17/2016 16:24:10 Reply FAIL
2/17/2016 16:24:10 TotalBytesReceived=0
2/17/2016 16:24:10 TotalBytesSent
2/17/2016 16:24:11 Reply FAIL
2/17/2016 16:24:11 TotalBytesSent=0
2/17/2016 16:24:11 Polling UPNP
2/17/2016 16:24:11 Router present
2/17/2016 16:24:11 TotalBytesReceived
2/17/2016 16:24:12 Reply FAIL
2/17/2016 16:24:12 TotalBytesReceived=0
2/17/2016 16:24:12 TotalBytesSent
2/17/2016 16:24:13 Reply FAIL
2/17/2016 16:24:13 TotalBytesSent=0
Andrew

Re: Graph stalls   18 February 2016, 13:31

Thank you, it appears at some point NetWorx is no longer able to retrieve the TotalBytesReceived and TotalBytesSent counters from the router.
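
For anyone reading a log like this later, the point where polling breaks can be found mechanically. Below is a minimal Python sketch, assuming the line layout shown in the excerpt above (date, time, message); the function name summarize_polls is made up for illustration and is not part of NetWorx. It pairs each counter request with the reply that follows and reports how long the reply took.

```python
from datetime import datetime

# Counter names as they appear in the debug log excerpt.
COUNTERS = {"TotalBytesReceived", "TotalBytesSent"}


def summarize_polls(lines):
    """Pair each counter request with its reply.

    Returns a list of (counter, status, seconds_waited) tuples, where
    status is "OK" or "FAIL".
    """
    results = []
    pending = None  # (counter name, time the request was logged)
    for line in lines:
        parts = line.strip().split(" ", 2)
        if len(parts) < 3:
            continue  # blank lines or fragments of a multi-line server reply
        try:
            when = datetime.strptime(parts[0] + " " + parts[1], "%m/%d/%Y %H:%M:%S")
        except ValueError:
            continue  # not a timestamped log line (e.g. XML from the router)
        message = parts[2]
        if message in COUNTERS:
            pending = (message, when)
        elif message.startswith("Reply ") and pending is not None:
            status = "OK" if message.startswith("Reply OK") else "FAIL"
            waited = (when - pending[1]).total_seconds()
            results.append((pending[0], status, waited))
            pending = None
    return results
```

Run against the excerpt above, the first failed request stands out because it waited 33 seconds (16:23:37 to 16:24:10) before failing, while the successful ones were answered within the same second.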

I have now added more logging after 'Reply FAIL'. Kindly download the updated build and run it again until it fails.

Then just post the relevant part again. This time no need to upload the whole thing.
Mark

Re: Graph stalls   18 February 2016, 23:51

2/18/2016 14:41:36 Polling UPNP
2/18/2016 14:41:36 Router present
2/18/2016 14:41:36 TotalBytesReceived
2/18/2016 14:41:36 Reply OK
2/18/2016 14:41:36 TotalBytesReceived=3648062065
2/18/2016 14:41:36 TotalBytesSent
2/18/2016 14:41:36 Reply OK
2/18/2016 14:41:36 TotalBytesSent=1011925312
2/18/2016 14:41:37 Polling UPNP
2/18/2016 14:41:37 Router present
2/18/2016 14:41:37 TotalBytesReceived
2/18/2016 14:42:11 Reply FAIL, error code 0 
2/18/2016 14:42:11 Socket code 10054 Connection reset by peer
2/18/2016 14:42:11 Server reply 

2/18/2016 14:42:11 TotalBytesReceived=0
2/18/2016 14:42:11 TotalBytesSent
2/18/2016 14:42:12 Reply FAIL, error code 500 
2/18/2016 14:42:12 Socket code 10061 Connection refused
2/18/2016 14:42:12 Server reply <?xml version="1.0" encoding="utf-8"?>
<s:Envelope s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Body>
<u:GetTotalBytesSent xmlns:u="urn:schemas-upnp-org:service:WANCommonInterfaceConfig:1" />
</s:Body>
</s:Envelope>

2/18/2016 14:42:12 TotalBytesSent=0
2/18/2016 14:42:12 Discovery
2/18/2016 14:42:12 Received USN uuid:ebf5a0a0-1dd1-11b2-a90f-20aa4b65aafe::urn:schemas-upnp-org:device:InternetGatewayDevice:1
2/18/2016 14:42:12 Updating USN
2/18/2016 14:42:13 Polling UPNP
2/18/2016 14:42:13 Router present
2/18/2016 14:42:13 TotalBytesReceived
2/18/2016 14:42:14 Reply FAIL, error code 500 
2/18/2016 14:42:14 Socket code 10061 Connection refused
2/18/2016 14:42:14 Server reply <?xml version="1.0" encoding="utf-8"?>
<s:Envelope s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Body>
<u:GetTotalBytesReceived xmlns:u="urn:schemas-upnp-org:service:WANCommonInterfaceConfig:1" />
</s:Body>
</s:Envelope>

2/18/2016 14:42:14 TotalBytesReceived=0
2/18/2016 14:42:14 TotalBytesSent
2/18/2016 14:42:15 Reply FAIL, error code 500 
2/18/2016 14:42:15 Socket code 10061 Connection refused
2/18/2016 14:42:16 Server reply <?xml version="1.0" encoding="utf-8"?>
<s:Envelope s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Body>
<u:GetTotalBytesSent xmlns:u="urn:schemas-upnp-org:service:WANCommonInterfaceConfig:1" />
</s:Body>
</s:Envelope>

After this it seems to just repeat.
Andrew

Re: Graph stalls   19 February 2016, 15:28

This is a real mystery. What normally happens is NetWorx sends a request like this to query the router's number of sent and received bytes:
POST /control?WANCommonInterfaceConfig HTTP/1.1
Host: 192.168.1.1:1780
Content-Type: text/xml
Content-Length: 293
SOAPAction: "urn:schemas-upnp-org:service:WANCommonInterfaceConfig:1#GetTotalBytesSent"

<?xml version="1.0" encoding="utf-8"?>
<s:Envelope s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Body>
<u:GetTotalBytesSent xmlns:u="urn:schemas-upnp-org:service:WANCommonInterfaceConfig:1" />
</s:Body>
</s:Envelope>

The router responds like this with the number of sent or received bytes:
HTTP/1.1 200 OK
Content-Length: 340
Content-Type: text/xml; charset="utf-8"
Date: Fri, 19 Feb 2016 05:02:13 GMT
EXT: 
Server: POSIX, UPnP/1.0 linux/5.60.127.2901 
Connection: close

<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<s:Body>
<u:GetTotalBytesSentResponse xmlns:u="urn:schemas-upnp-org:service:WANCommonInterfaceConfig:1">
<NewTotalBytesSent>20746603</NewTotalBytesSent>
</u:GetTotalBytesSentResponse>
</s:Body>
</s:Envelope>

At some point your router starts responding with error code 500 (Internal Server Error) instead of 200 OK. What makes this a mystery is that if something had gone wrong in the router and it started producing errors, restarting NetWorx should have no effect.
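
The exchange above can be reproduced outside NetWorx for testing. What follows is a rough Python sketch, not NetWorx's actual code: the function names build_request, parse_counter and query_counter are invented for illustration, and the control URL passed to query_counter (for example http://192.168.1.1:1780/control?WANCommonInterfaceConfig, taken from the request shown above) is router-specific and may differ on other devices.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

SERVICE = "urn:schemas-upnp-org:service:WANCommonInterfaceConfig:1"


def build_request(action):
    """Return (headers, body) for a SOAP call such as GetTotalBytesSent."""
    body = (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        '<s:Envelope s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" '
        'xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">\n'
        "<s:Body>\n"
        f'<u:{action} xmlns:u="{SERVICE}" />\n'
        "</s:Body>\n"
        "</s:Envelope>\n"
    ).encode("utf-8")
    headers = {
        "Content-Type": "text/xml",
        "SOAPAction": f'"{SERVICE}#{action}"',
    }
    return headers, body


def parse_counter(xml_text, action):
    """Extract the integer counter from a reply, e.g. <NewTotalBytesSent>."""
    field = "New" + action[len("Get"):]
    for element in ET.fromstring(xml_text).iter(field):
        return int(element.text)
    raise ValueError(f"no <{field}> element in reply")


def query_counter(control_url, action, timeout=5.0):
    """POST the SOAP request and return the counter value.

    Raises RuntimeError if the router answers with an HTTP error such as 500.
    """
    headers, body = build_request(action)
    request = urllib.request.Request(
        control_url, data=body, headers=headers, method="POST"
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as reply:
            return parse_counter(reply.read(), action)
    except urllib.error.HTTPError as error:
        raise RuntimeError(f"router answered HTTP {error.code}") from error
```

Calling query_counter with the control URL and "GetTotalBytesSent" should return the same number the router puts in NewTotalBytesSent, and a 500 reply surfaces as a RuntimeError rather than a silent zero.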

I am not sure what we can do at this stage. What router model is it? Could you perhaps try updating its firmware and see if the problem persists?
Mark

Re: Graph stalls   19 February 2016, 20:33

The router is a Linksys EA4500 with the most current firmware. It's set to automatically update, and when I just did a manual check it came back as being at the latest version. I don't think there's anything else unusual about its setup. The computer using NetWorx is cable attached (not Wi-Fi).

Yes, I agree, the mystery is why simply restarting NetWorx gets the graph going again. It would seem to imply the issue is not with the router. Yet changing what NetWorx monitors to, say, the NIC, and then changing it back to the router has no effect. And so I wonder what is the difference, if any, between changing the NetWorx setting to monitor the router and restarting NetWorx. Any kind of initialization thing that might affect this?

Also, is there any possibility the router returns a FAIL because of something not right in the request message it is getting from NetWorx? Something that might get resolved from restarting NetWorx but not if you simply change the setting?

Just tossing out ideas.

Maybe I can try installing NetWorx on my wife's computer and see if it exhibits the same issue. She's running Win 10 x64, too, but not the Pro version, and is Wi-Fi connected to the same router. Can two computers ping the router for this info at the same time? Should I set mine to monitor my NIC while trying to monitor the router from my wife's computer?
Andrew

Re: Graph stalls   19 February 2016, 21:08

That's what I also thought: there may be a part of the request that goes wrong over time, so the router reports an error. Restarting NetWorx obviously fixes it.

If that's not too much to ask, I would recommend installing Wireshark, putting "http" (without quotes) as the filter and recording a few request-response pairs. Save this (working) session to a *.pcapng file. Then wait until it stops working and again capture a few request-response pairs. Save this (not working) session to another *.pcapng file.

By comparing those capture files I may be able to tell if there's any difference and what's going on.

Trying NetWorx on another computer may also be a good idea; you can have two instances polling the router at the same time without any problem.
Mark

Re: Graph stalls   20 February 2016, 02:12

Am currently doing the dual polling. So far nothing to report.

Will attempt the Wireshark (or similar) sniffing over the weekend.
Mark

Re: Graph stalls   21 February 2016, 16:17

I've uploaded the capture file. I hope you will find it useful.
Mark

Re: Graph stalls   29 February 2016, 19:13

It's been a few days since I installed the latest 5.5.2 update, and everything seems to work OK now. I've been monitoring the router and the graph has not stalled once despite several "sleeps" and even a router outage. It seems quite solid now. I don't know what's changed -- I didn't see anything obvious in the change log that would affect this -- but something apparently has and the problem I reported seems to have disappeared, for now. Will update if it returns in the future.
Andrew

Re: Graph stalls   29 February 2016, 19:17

It won't return ;) The issue was that the router at some point decided to listen for UPnP requests on a different port, while NetWorx only requested the router's configuration on startup. Now it does so periodically, so even if the port changes, everything should keep working.
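
For readers curious what periodic rediscovery might look like, here is a hedged Python sketch of the general idea, not the actual NetWorx change. It sends a standard SSDP M-SEARCH to find the gateway's description URL (whose host and port the router may change) and caches it for a limited time. The names parse_location, discover_location and LocationCache, and the 60-second max_age, are assumptions made for illustration; the control URL still has to be read from the description document at the returned location.

```python
import socket
import time

SSDP_ADDR = ("239.255.255.250", 1900)
SEARCH_TARGET = "urn:schemas-upnp-org:device:InternetGatewayDevice:1"


def parse_location(reply_text):
    """Return the LOCATION header value from an SSDP reply, or None."""
    for line in reply_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip().lower() == "location":
            return value.strip()
    return None


def discover_location(timeout=3.0):
    """Send one SSDP M-SEARCH and return the first gateway's LOCATION URL."""
    message = "\r\n".join([
        "M-SEARCH * HTTP/1.1",
        "HOST: 239.255.255.250:1900",
        'MAN: "ssdp:discover"',
        "MX: 2",
        f"ST: {SEARCH_TARGET}",
        "", "",
    ]).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(message, SSDP_ADDR)
        try:
            data, _ = sock.recvfrom(65507)
        except socket.timeout:
            return None
    return parse_location(data.decode("utf-8", errors="replace"))


class LocationCache:
    """Remember the router's location, but rediscover it once it gets stale."""

    def __init__(self, discover=discover_location, max_age=60.0, clock=time.monotonic):
        self._discover = discover
        self._max_age = max_age
        self._clock = clock
        self._location = None
        self._found_at = 0.0

    def get(self):
        now = self._clock()
        if self._location is None or now - self._found_at >= self._max_age:
            found = self._discover()
            if found is not None:  # keep the old value if discovery fails
                self._location, self._found_at = found, now
        return self._location

    def invalidate(self):
        """Force rediscovery on the next get(), e.g. after a failed poll."""
        self._location = None
```

A poller would call get() before each request and invalidate() after a refused connection or HTTP 500, so a port change is picked up without restarting the program.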

Praxis

Re: Graph stalls   29 February 2016, 20:37

Can you suppress checking UPNP when settings are All + Ignore local?

Networx is set to use SNMP to check the router, not UPNP.

Since you added this there are warnings from the firewall every 10 seconds.
Attachment: Networx UPNP attempt.png (18.6 KB)
Andrew

Re: Graph stalls   29 February 2016, 21:06

Quote

Praxis

Can you suppress checking UPNP when settings are All + Ignore local?


Done in the latest build.
Mark

Re: Graph stalls   01 March 2016, 02:43

Quote

Andrew

The issue was that the router at some point decided to listen for UPnP requests on a different port...


Is this particularly unusual for a router, or any UPnP device? Just curious.
Andrew

Re: Graph stalls   01 March 2016, 07:37

Quote

Mark

Is this particularly unusual for a router, or any UPnP device? Just curious.


It's generally unusual. It'd be understandable if the device chose a random port on startup, but I have no idea why the engineers made it change at runtime at seemingly random moments.
Mark

Re: Graph stalls   08 March 2016, 01:09

I've posted this issue on the Linksys support forum. I'll let you know if anything useful comes of that.

Routers are notorious for poor, insecure UPnP implementations. I am wondering if perhaps this has something to do with security and some misguided attempt to deal with it by Linksys.
Mark

Re: Graph stalls   09 March 2016, 02:06

Apparently Linksys did have an issue with the UPnP port changing, and it was supposedly fixed in a firmware update. According to Linksys, "The firmware update that was released in 2014 should have fixed this issue." Well, I already had a firmware update from 2014 and my router was reporting "no new updates available" when I checked manually from the admin console. But it turned out I did not have the latest build -- same version numbers, but different build number -- that came out six months after the one I had been using! I had auto-update on, but it failed to grab this newer firmware, even with a manual check.

Release Date: July 31, 2014
Firmware version: 2.1.41 (Build 162351)

Release Date: December 18, 2014
Firmware version: 2.1.41 (Build 164606)

So now I have downloaded and updated the firmware manually.

Can you point me to an earlier version of NetWorx -- I guess version 5.5.1 should be good -- so I can see if my original problem has gone away? I know you've added a work-around to the software and 5.5.2 works great now, but I am still curious to see if this new firmware really did fix the issue.
Andrew

Re: Graph stalls   09 March 2016, 09:01

No worries, here's the build that did not have the workaround. If it works continuously, then the firmware was fixed.
Mark

Re: Graph stalls   14 March 2016, 18:33

Um, just looked again to make sure, and the About Networx menu entry for the version I downloaded using your link, Andrew, is reporting 5.5.2 (build 16069). Doesn't this version have the fix in it? Shouldn't I be using 5.5.1 or earlier if I want to see if Linksys fixed the problem?
Andrew

Re: Graph stalls   14 March 2016, 18:52

Don't worry about the version number. I have just pulled the problematic revision from SVN and built it as the latest version, so the version number suggests it's the latest one, though it's not. It's the revision that didn't work correctly with your router.
