Bug 267551 - External file change detection via fs_server stalls both local and remote CPUs
Status: RESOLVED FIXED
Product: cnd
Classification: Unclassified
Component: Remote
Version: 8.1
Hardware: PC Windows 7
Priority: P2
Target Milestone: 8.2
Assigned To: Vladimir Kvashin
QA Contact: issues@cnd
Reported: 2016-08-11 18:31 UTC by zabbas
Modified: 2016-10-14 15:03 UTC
Issue Type: DEFECT

Attachments
IDE log (79.44 KB, text/plain)
2016-08-11 18:31 UTC, zabbas
Zip file containing stack traces of fs_server (30.13 KB, application/x-gzip)
2016-08-16 13:43 UTC, zabbas
Profiler Snapshot of IDE when fs_server was consuming high CPU (117.78 KB, application/octet-stream)
2016-08-16 13:45 UTC, zabbas
IDE Profiler snapshot right after high cpu consumption of fs_server (385.52 KB, application/octet-stream)
2016-08-16 13:47 UTC, zabbas
snapshot of fs_server and IDE on another windows machine (2.52 MB, application/zip)
2016-08-16 17:40 UTC, zabbas
fs_server built for 64-bits Linux with debugging info (127.05 KB, application/octet-stream)
2016-08-23 13:04 UTC, Vladimir Kvashin
Here is a binary for 64-bits Linux that was built using glibc 2.4 (version I use in production) (63.68 KB, application/octet-stream)
2016-08-24 09:48 UTC, Vladimir Kvashin
Here is a zip file with (136.20 KB, application/zip)
2016-08-24 09:52 UTC, Vladimir Kvashin
Here is a zip file with new version sources (117.44 KB, application/octet-stream)
2016-09-08 13:59 UTC, Vladimir Kvashin

Description zabbas 2016-08-11 18:31:44 UTC
Product Version = NetBeans IDE 8.1 (Build 201510222201)
Operating System = Windows 7 version 6.1 running on amd64
Java; VM; Vendor = 1.7.0_79
Runtime = Java HotSpot(TM) 64-Bit Server VM 24.79-b02

Reproducibility: Happens sometimes, but not always

STEPS:
I mainly use NetBeans for source code navigation and editing over SSH, connecting to a Linux server. The remote project was created from existing source files with a Makefile. The project's authentication and access settings are as follows:

Access project files via SFTP
Check ACL based remote file permissions
Enable X11 forwarding

Preferred Authentication:
GSS-API
Keyboard interactive
Password

In the IDE I have switched off/disconnected remote Git to avoid slow performance.


ACTUAL:
In the last couple of months I have repeatedly faced the following problem:

At random times the IDE starts checking for external changes in files (sometimes while parsing is performed locally after I change a file in the NetBeans editor). While this happens, CPU usage on the local Windows machine rises to between 80 and 90%. At the same time, on the remote Linux server I see the fs_server process consuming up to 6000% CPU (yes, that is 6k). As a result the IDE gets stuck locally, and other users on the remote server are deprived of CPU. Occasionally things return to normal quickly; otherwise this goes on for a long time and I have to shut down the IDE (mostly forcefully).

EXPECTED:
  Please advise what I can change in the IDE configuration to avoid this issue, or is this a bug outside the IDE's remote development system? Feel free to ask for more information.
Comment 1 zabbas 2016-08-11 18:31:50 UTC
Created attachment 161642 [details]
IDE log
Comment 2 Vladimir Kvashin 2016-08-11 19:55:19 UTC
fs_server is a tiny binary utility that makes the remote file system faster.

I need additional information. Could you please do the following once this happens:
1) get a NetBeans profiling snapshot (there is a "Profile me" button on the toolbar; it looks like a stopwatch)
2) on the remote host, when fs_server starts consuming lots of CPU, get its stacks
On Linux, I usually do this via gdb:
gdb --q --n --ex "thread apply all bt" --batch --pid $PID
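
Because a single stack dump shows only one instant, it can help to take several in a row. The following is a minimal sketch that wraps the gdb command above in a loop. The function name dump_stacks, its arguments, the output file naming and the GDB override variable are assumptions made for illustration; none of them come from fs_server or NetBeans.

```shell
#!/bin/sh
# Hypothetical helper: take COUNT stack dumps of process PID, INTERVAL seconds
# apart, writing each dump to OUTDIR/stacks_<n>.txt.
# The GDB variable is an assumption for dry runs; it defaults to plain gdb.
dump_stacks() {
    pid=$1; count=$2; interval=$3; outdir=$4
    mkdir -p "$outdir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        # Same gdb invocation as quoted above, redirected to a numbered file.
        "${GDB:-gdb}" --q --n --ex "thread apply all bt" --batch --pid "$pid" \
            > "$outdir/stacks_$i.txt" 2>&1
        i=$((i + 1))
        # Pause between dumps, but not after the last one.
        if [ "$i" -le "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

For example, dump_stacks "$PID" 5 2 ./fs_server_stacks would take five dumps two seconds apart. Attaching gdb briefly pauses the target process each time, so a short interval is probably enough.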

You can also try switching ACL off in your host setup; in this case fs_server will just use stat/lstat and interpret its flags instead of calling access() for each file it deals with. I know that on Solaris, access() on broken links or mounts sometimes hangs for a long time (although it does not consume much CPU). But I would ask you to get an fs_server thread dump first; it will show us what happens. Then, if you see that lots of threads stay in access(), try switching ACL off.
Comment 3 zabbas 2016-08-12 11:00:35 UTC
Thanks for quick response.

Even before your response I had switched off ACL as an experiment to see whether this would calm down fs_server and the local process. I didn't observe it for long, but the initial impression is good.

This morning I switched ACL on again to reproduce the issue; so far no success. As soon as the problem appears again I'll profile the IDE and grab the stack trace of fs_server.

Fingers crossed...
Comment 4 zabbas 2016-08-16 13:43:53 UTC
Created attachment 161675 [details]
Zip file containing stack traces of fs_server

Even with the ACL option unchecked for the remote host, I repeatedly noticed fs_server consuming up to 6000% CPU. I took several snapshots using gdb to see any difference in fs_server's activity. All snapshots/stack traces are tarred and zipped into fs_server_gdb.tar.zip

Separately I'll attach the IDE profiling snapshots. This time the IDE was responsive, but on another machine the IDE became unresponsive while consuming up to 85% CPU, although fs_server was calm. In that instance I couldn't activate profiling because of the unresponsiveness.
Comment 5 zabbas 2016-08-16 13:45:21 UTC
Created attachment 161676 [details]
Profiler Snapshot of IDE when fs_server was consuming high CPU

This snapshot is from the time when fs_server was consuming high CPU.
Please see previous comment for more info.
Comment 6 zabbas 2016-08-16 13:47:37 UTC
Created attachment 161677 [details]
IDE Profiler snapshot right after high cpu consumption of fs_server

This snapshot is from an earlier instance of high CPU consumption by fs_server, but unfortunately I was busy capturing stack traces, so the IDE snapshot was taken after fs_server became calm.
Comment 7 zabbas 2016-08-16 17:40:55 UTC
Created attachment 161681 [details]
snapshot of fs_server and IDE on another windows machine

This attachment contains snapshots of fs_server and the IDE, taken when profiling had been running for quite some time, until I hit the high CPU consumption both by the IDE on the Windows machine and by fs_server on the remote Linux machine. It looks like this time the IDE snapshot has much more information (judging by the file size).
Comment 8 Vladimir Kvashin 2016-08-23 13:03:40 UTC
OMG. I forgot that fs_server is stripped. Could I ask you either to use the fs_server binary I'm going to attach to the bug shortly or to build it from sources?

To build from source, you need to do the following on your remote server:
  hg clone http://hg.netbeans.org/cnd-main
  cd cnd-main/dlight.remote.impl/tools/fs_server
  make 64BITS=1 ASSERTIONS=1 clean all-debug
It will tell you the full path to the binary.

The binary should be placed into C:\Program Files\NetBeans 8.1\dlight\bin\Linux-x86_64\fs_server

I'm now trying to understand what's going on without this... not sure I'll succeed...
Comment 9 Vladimir Kvashin 2016-08-23 13:04:53 UTC
Created attachment 161752 [details]
fs_server built for 64-bits Linux with debugging info
Comment 10 zabbas 2016-08-23 17:52:46 UTC
OK, I'll first try the attached binary and will get back to this issue as soon as I face the problem again.
Comment 11 zabbas 2016-08-23 18:40:39 UTC
OK, I tried to use the attached binary but without success (see below), and I do not have Mercurial on the Linux server, so I can't clone the repo.

The fs_server binary had to be (additionally) copied to

C:\Users\<user>\AppData\Roaming\NetBeans\8.1\bin\..

in order for NetBeans to place it on the remote server. Anyway, this time the IDE didn't use fs_server and gave no error or info. I executed fs_server manually to see if it was working, but it seems that it is linked against a version of the library that my remote server doesn't have.


$  /var/tmp/dlight_abbasz/21aa3918/bin/Linux-x86_64/fs_server -h
/var/tmp/dlight_abbasz/21aa3918/bin/Linux-x86_64/fs_server: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /var/tmp/dlight_abbasz/21aa3918/bin/Linux-x86_64/fs_server)

Is there any alternative way to get the sources? Or can you provide a binary for libc version 2.12?

Thanks.
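
A mismatch like the GLIBC_2.14 error above can be checked before deploying a binary by comparing the newest glibc symbol version the binary references with the glibc version of the host. This is only a sketch, assuming GNU binutils (objdump), GNU coreutils (sort -V) and getconf are available on the server. The function names are hypothetical.

```shell
#!/bin/sh
# Print the highest GLIBC_x.y symbol version mentioned on stdin (e.g. "2.14").
max_glibc_required() {
    grep -o 'GLIBC_[0-9][0-9.]*' | sed 's/^GLIBC_//' | sort -V | tail -n 1
}

# Succeed when version $1 is less than or equal to version $2.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Hypothetical wrapper: compare what a binary needs with what this host has.
check_glibc() {
    # objdump -T lists dynamic symbols together with their version tags.
    need=$(objdump -T "$1" | max_glibc_required)
    # getconf prints something like "glibc 2.12"; keep the second field.
    have=$(getconf GNU_LIBC_VERSION | awk '{ print $2 }')
    if version_le "$need" "$have"; then
        echo "OK: needs glibc $need, host has $have"
    else
        echo "TOO NEW: needs glibc $need, host has only $have"
        return 1
    fi
}
```

Running check_glibc on the copied fs_server binary on the remote host would then report whether the loader is likely to reject it.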
Comment 12 Vladimir Kvashin 2016-08-24 09:48:44 UTC
Created attachment 161757 [details]
Here is a binary for 64-bits Linux that was built using glibc 2.4 (version I use in production)
Comment 13 Vladimir Kvashin 2016-08-24 09:52:50 UTC
Created attachment 161758 [details]
Here is a zip file with

To build:
  cd fs_server_src/tools/fs_server
  make 64BITS=1 ASSERTIONS=1 clean all-debug

Please note that you need to put it on the local host into the C:\Program Files\NetBeans 8.1\dlight\bin\Linux-x86_64\fs_server directory; otherwise (if you just put it on the remote host in /var/tmp/dlight_$USER/...) the IDE will overwrite it with the one from the local installation directory.
Comment 14 zabbas 2016-08-24 10:39:07 UTC
Thanks.

The binary was stripped, so I just built it locally from the sources you provided, and it is running now. I'll report back as soon as high CPU usage is observed.
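
Whether a given fs_server binary still carries its symbol table can be checked with the standard file utility, which ends its description of an ELF binary with either "stripped" or "not stripped". This is a small sketch with hypothetical function names. Note that "not stripped" only means the symbol table is present, not that full -g debugging information is included.

```shell
#!/bin/sh
# Succeed when a `file` description read from stdin reports an unstripped binary.
reports_unstripped() {
    grep -q 'not stripped'
}

# Hypothetical wrapper: check a binary on disk.
has_symbols() {
    file "$1" | reports_unstripped
}
```

For example, has_symbols ./fs_server && echo "symbols present" could be run before replacing the binary in the NetBeans installation directory.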
Comment 15 Vladimir Kvashin 2016-09-01 15:19:53 UTC
What I see from the logs, even without debugging info, is that most of the threads are in mutex-related calls (mostly pthread_mutex_lock). They probably consume a lot of CPU in such calls (could that be the case, for example, if they lock/unlock frequently but spend little time in the locked state? I don't know).
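
To see how many threads are sitting in lock-related calls, saved gdb output can be summarised by the function in each thread's innermost frame. This is a rough sketch, assuming the usual backtrace layout in which frame lines start with "#0" followed either by an address and "in", or directly by the function name. The name top_frames is hypothetical.

```shell
#!/bin/sh
# Count threads by the function in their innermost frame (#0).
# Reads gdb "thread apply all bt" output from the named files, or stdin if none.
top_frames() {
    awk '$1 == "#0" {
             # Frame lines look like "#0  0xADDR in func (...)" or "#0  func (...)".
             f = ($3 == "in") ? $4 : $2
             n[f]++
         }
         END { for (f in n) print n[f], f }' "$@" | sort -rn
}
```

Called as top_frames dump.txt, where dump.txt holds one saved dump, it prints one line per function with the thread count first and the most common frame at the top.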

In any case, I faced the same wrong (exaggerated) thread concurrency while solving a different issue (#267715), and I'm going to fix it ASAP. This will probably help with this one as well.

However if you are able to provide fs_server stacks with debugging info, I'd be happy. Thanks in advance.

Vladimir.
Comment 16 zabbas 2016-09-01 21:32:36 UTC
Vladimir, the locking/unlocking was my first guess as well; however, without the full stack trace and code I am not sure that was the case. I have the code now that you sent it, but the problem hasn't occurred in the couple of days since I started using the locally compiled fs_server with debug symbols. Since then I have been busy with training, so I haven't used NetBeans for some days. I have only next week before my vacation to try to reproduce the problem.

Having said that, after observing the unstripped fs_server for a couple of days I am tempted to believe that the problem might be caused by uninitialized variables. Please check for that case when fixing your findings.

Next week I will compile again without -g but without stripping the function names; this way the binary should be closer to the release version and the stack trace will still have enough information.
Comment 17 Vladimir Kvashin 2016-09-08 13:57:07 UTC
I hope this is fixed in trunk (i.e. in daily NetBeans builds and in the upcoming 8.2 release). There was strong contention between threads, which has been fixed.

I encourage you to try it. You can either try a daily build or just use the fs_server utility from it. The utility is compatible with 8.1. It can be downloaded from http://hg.netbeans.org/binaries/C9EE1ACA53AA71183590274D658ABD0A6AD6B2BE-fs_server-1.0.zip

Or you can build it yourself from sources; I'll attach the sources shortly.
(fs_server sources are also available here: http://hg.netbeans.org/cnd-main/file/14cbd8be96fe/dlight.remote.impl/tools/fs_server)
Comment 18 Vladimir Kvashin 2016-09-08 13:59:13 UTC
Created attachment 161972 [details]
Here is a zip file with new version sources

To build:
  cd fs_server_src/tools/fs_server
  make 64BITS=1 ASSERTIONS=1 clean all-debug
Then copy to C:\Program Files\NetBeans 8.1\dlight\bin\Linux-x86_64\fs_server
Comment 19 zabbas 2016-09-08 15:06:49 UTC
Thanks. I'll try this after a month of vacation.
In the last few days I didn't test the fs_server binary you sent earlier much, and for whatever reason the high CPU problem hasn't occurred so far. In any case, if I encounter the problem again I'll reopen this bug.
Comment 20 Vladimir Kvashin 2016-10-14 15:03:14 UTC
I think it is fixed now. If not, please feel free to reopen.

