This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 257458 - SQL statements returning resultdset are executed twice
Summary: SQL statements returning resultdset are executed twice
Status: RESOLVED FIXED
Alias: None
Product: db
Classification: Unclassified
Component: Show Data (show other bugs)
Version: 8.2
Hardware: PC Windows XP
: P2 normal (vote)
Assignee: matthias42
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-01-07 11:59 UTC by NukemBy
Modified: 2016-05-28 22:53 UTC (History)
0 users

See Also:
Issue Type: DEFECT
Exception Reporter:


Attachments
SQL-log.txt (57.76 KB, text/plain)
2016-01-07 19:31 UTC, NukemBy
Details

Note You need to log in before you can comment on or make changes to this bug.
Description NukemBy 2016-01-07 11:59:55 UTC
When I execute this statement in NetBeans

    select put_message(concat('dummy ', now()));

it is actually executed twice, thus causing duplication of rows in target table.

Full SQL for this use-case is following (PostgreSQL):
>>----------------------------------------

    CREATE TABLE dummy_messages ( 
      dm_id                serial,
      dm_text              varchar(250),
      dm_created           timestamp
     );

    CREATE OR REPLACE FUNCTION put_message(IN m_text varchar)
        RETURNS int LANGUAGE plpgsql AS $$
    DECLARE
        msg_id int;
    BEGIN
        insert into dummy_messages(dm_text, dm_created)
            values (m_text, now())
            returning dm_id into msg_id;
        return msg_id;
    END $$;

    select put_message(concat('dummy ', now()));

    -- select * from dummy_messages;

<<----------------------------------------
(Note: $$ will work unless patch from these issues is applied. 
    https://netbeans.org/bugzilla/show_bug.cgi?id=257457
    https://netbeans.org/bugzilla/show_bug.cgi?id=257363
Alternately 'setup' SQL script can be executed in pgAdmin
)

Second execution occurs because of this code:

  package org.netbeans.modules.db.dataview.output;

  class SQLExecutionHelper {
        class Loader implements Runnable, Cancellable {
            ...
            @Override
            public void run() {
                    ...
                    // If total count was not retrieved using scrollable cursors,
                    // compute it now.
                    if (!useScrollableCursors && dataView.getPageContexts().size() > 0) {
                        getTotalCount(isSelect, sql, stmt, dataView.getPageContext(0));
                    }

    private void getTotalCount(boolean isSelect, ...) {
        ...
        // SELECT COUNT(*) FROM (sqlquery) alias
        ResultSet cntResultSet = null;
        try {
            cntResultSet = stmt.executeQuery(
                    SQLStatementGenerator.getCountAsSubQuery(sql));
            setTotalCount(cntResultSet, pageContext);


... which indeed executes statement twice just to get count of rows.

Call stack:
"SQLStatementExecution"
    at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:458)
    at org.postgresql.jdbc.PgStatement.executeQuery(PgStatement.java:374)
    at org.netbeans.modules.db.dataview.output.SQLExecutionHelper.getTotalCount(SQLExecutionHelper.java:909)
    at org.netbeans.modules.db.dataview.output.SQLExecutionHelper.access$900(SQLExecutionHelper.java:84)
    at org.netbeans.modules.db.dataview.output.SQLExecutionHelper$1Loader.run(SQLExecutionHelper.java:182)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.openide.util.RequestProcessor$Task.run(RequestProcessor.java:1443)
    at org.netbeans.modules.openide.util.GlobalLookup.execute(GlobalLookup.java:68)
    at org.openide.util.lookup.Lookups.executeWith(Lookups.java:303)
    at org.openide.util.RequestProcessor$Processor.run(RequestProcessor.java:2058)


I think that the only 'right' way for generic runner of SQL code (the one having no exact knowledge of what is happening server side) to get number of rows in resultset is - read it until the end (most probably with some safe limit) and display number of the fetched rows.

Setting priority to P2 because
- issue causes unwanted hidden destructive affect on data
- there is no workaround
Comment 1 NukemBy 2016-01-07 19:31:30 UTC
After digging deeper into what is happening, implementation seems to be too weird to me:

- resultset is actually not fetched completely - only those fitting into currently used display page size are fetch (for example 3 of 1000). If I click 'goto next page' in the response navigator - SQL statement is just executed once again and first N rows from the previous page are skipped. 

So ... let's use SQL statement which is executed 5 seconds, every click will cause 5 second delay ...

Another case - query over dynamically changed data, i.e. data modified while i click prev/next page. This approach will cause output changing each time I go to another page.

On top of this reading total count of rows as a separate statement is useless, because it also will change over time ...


- I've put a breakpoint in the SQL driver to trace all SQL statements (to learn what is happening behind the scenes) and i see that NetBeans sends 85 SQL statements to database to display 1 resultset 

  select * from dummy_messages

... Weird ... it seems that NB tries to somehow fetch entire model of the database to display single resultset ... 85 statements is for database having almost no datamodel in it ...


I'll attach trace of SQL message (breakpoint was set in 

package org.postgresql.jdbc;
public class PgStatement implements BaseStatement
    public boolean executeWithFlags(String p_sql, int flags) throws SQLException
        execute(simpleQuery, null, QueryExecutor.QUERY_ONESHOT | flags);
). 

Please review implementation of SQL queries in NetBeans. It indeed has number of unwanted side effects.
Comment 2 NukemBy 2016-01-07 19:31:57 UTC
Created attachment 158051 [details]
SQL-log.txt
Comment 3 matthias42 2016-01-07 20:03:13 UTC
First: Thank you for taking care of the db support. I'll look into your packages in the next days. Could you please check, that you signed an OCA (oracle contributer agreement) and point me to the entry?

I wanted to reduce prio to P2 as this is a misuse of SQL features IMHO. SQL 99 offers the correct syntax "CALL ..." to differ a select from a function call. The code in the executer detects this and only tries to fetch row count if a plain select is detected.

I agree with that as I consider the SELECT in the same as GET in http. It should be idempotent. It seems postgresql has a different understanding. I'll have to think about it.

With respect to the SQLs you see - I'd bet you'll see that the origin lies in the postgresql driver itself, as netbeans fetches the db model to be able to supply code completion (metadatamodel for resultset and also db). So seeing information_schema mentioned often this looks like an implementation decision in the postgresql driver.

I'll let it stay on P2 - what I see:

- duplicate executions of statements when doing the initial load of data

What I won't fix:

- SQL is reexecuted on page switch

Reason is: You can't buffer the whole resultset. OOMs are a issue as it is and transfering whole databases is pointless.
Comment 4 matthias42 2016-01-07 20:03:49 UTC
Reassigning to me.
Comment 5 NukemBy 2016-01-07 22:21:13 UTC
- No, I did not sign OCA and I'm posting patches just to speedup problem resolution, not for merging into codebase ASIS from my name. I expect them to be reviewed by someone from NB team and updated as needed to conform to internal requirements to source code.

Is there is something wrong with that expectation?

- 'CALL' does not work for PostgreSQL - they use another syntax (http://www.postgresql.org/docs/9.5/static/sql-syntax-calling-funcs.html). 

Anyway, I suppose any flavor of SQL languages allow returning resultsets/rowsets from stored procedure (function in tems of PostgreSQL), what brings us to the original problem - display the resultset generated in SP.


- Re: origin lies in the postgresql driver itself

Of course, before posting I've checked were these calls come from - all of them originate from various places in org.netbeans.modules.db.metadata.model.api, for example:

	"org.netbeans.modules.db.explorer.DatabaseConnection"
		at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:459)
		at org.postgresql.jdbc.PgStatement.executeQuery(PgStatement.java:374)
		at org.postgresql.jdbc.PgDatabaseMetaData.getIndexInfo(PgDatabaseMetaData.java:3133)
		at org.netbeans.modules.db.metadata.model.MetadataUtilities.getIndexInfo(MetadataUtilities.java:118)
		at org.netbeans.modules.db.metadata.model.jdbc.JDBCTable.createIndexes(JDBCTable.java:206)
		at org.netbeans.modules.db.metadata.model.jdbc.JDBCTable.initIndexes(JDBCTable.java:455)
		at org.netbeans.modules.db.metadata.model.jdbc.JDBCTable.getIndexes(JDBCTable.java:118)
		at org.netbeans.modules.db.metadata.model.api.Table.getIndexes(Table.java:103)
		at org.netbeans.modules.db.explorer.node.ColumnNode$1.run(ColumnNode.java:161)
		at org.netbeans.modules.db.explorer.node.ColumnNode$1.run(ColumnNode.java:134)
		at org.netbeans.modules.db.metadata.model.JDBCConnMetadataModel.runReadAction(JDBCConnMetadataModel.java:97)
		at org.netbeans.modules.db.metadata.model.api.MetadataModel.runReadAction(MetadataModel.java:92)
		at org.netbeans.modules.db.explorer.node.ColumnNode.setupNames(ColumnNode.java:133)
		at org.netbeans.modules.db.explorer.node.ColumnNode.refresh(ColumnNode.java:119)
		at org.netbeans.api.db.explorer.node.NodeProvider.refresh(NodeProvider.java:142)
		at org.netbeans.modules.db.explorer.node.NodeRegistry.refresh(NodeRegistry.java:151)
		at org.netbeans.api.db.explorer.node.BaseNode.refresh(BaseNode.java:189)
		at org.netbeans.api.db.explorer.node.NodeProvider.refresh(NodeProvider.java:142)
		at org.netbeans.modules.db.explorer.node.NodeRegistry.refresh(NodeRegistry.java:151)
		at org.netbeans.api.db.explorer.node.BaseNode.refresh(BaseNode.java:189)
		at org.netbeans.api.db.explorer.node.NodeProvider.refresh(NodeProvider.java:142)
		at org.netbeans.modules.db.explorer.node.NodeRegistry.refresh(NodeRegistry.java:151)
		at org.netbeans.api.db.explorer.node.BaseNode.refresh(BaseNode.java:189)
		at org.netbeans.api.db.explorer.node.NodeProvider.refresh(NodeProvider.java:142)
		at org.netbeans.modules.db.explorer.node.NodeRegistry.refresh(NodeRegistry.java:151)
		at org.netbeans.api.db.explorer.node.BaseNode.refresh(BaseNode.java:189)
		at org.netbeans.api.db.explorer.node.NodeProvider.refresh(NodeProvider.java:142)
		at org.netbeans.modules.db.explorer.node.NodeRegistry.refresh(NodeRegistry.java:151)
		at org.netbeans.api.db.explorer.node.BaseNode.refresh(BaseNode.java:189)
		at org.netbeans.modules.db.explorer.DatabaseConnection$4.run(DatabaseConnection.java:1228)
		at org.openide.util.RequestProcessor$Task.run(RequestProcessor.java:1443)
		at org.netbeans.modules.openide.util.GlobalLookup.execute(GlobalLookup.java:68)
		at org.openide.util.lookup.Lookups.executeWith(Lookups.java:303)
		at org.openide.util.RequestProcessor$Processor.run(RequestProcessor.java:2058)

- Re: What I won't fix: SQL is reexecuted on page switch. Reason is: You can't buffer the whole resultset. OOMs are a issue as it is and transfering whole databases is pointless.

Yes - buffering entire resultset does not make sense when row count is beyond 1000 and more rows. But, it typically has no sense to scroll through more than (let's say) 100 of rows. In rare cases when end user indeed needs to scroll through 100000 rows - tools typically allow user to increase default limit to required number thus letting user to deal with OOM issues on its own - probably by allocating more memory to JVM.

And anyway, if some SQL statement generates millions of rows it will kill either server (if it does not have some safety limits) or NetBeans. Letting the user to work without limits just from NetBeans's side will not work in this case.
Comment 6 Jiri Kovalsky 2016-01-08 09:03:56 UTC
Matthias is unfortunately right Sergey. We cannot accept your contributions unless you sign the Oracle Contributor Agreement [1]. It's very easy though. Just print the document, check one of the options for question #7, identify yourself and scanned version send to the oracle-ca_us@oracle.com e-mail address. I will do the rest. Thanks!

[1] http://wiki.netbeans.org/FaqHowDoIFileACA
Comment 7 NukemBy 2016-01-08 13:30:11 UTC
Sorry, still do not understand why signed OCA is required for resolution of these bugs. I'm not willing to become an Oracle contributor and get some committer/write-access rights anywhere in Oracle's infrastructure nor I care about  seeing myself in some list of Oracle Contributors.

The patches I attached to some bug reports are just code changes I've made to my local copy of sources to make Netbeans work for my particular use-cases. I consider these use-case to be applicable not just to me, but to other users of Netbeans - that's why I've posted bug reports. Attached patches are just "more detailed bug report". I do not expect them to be accepted and pushed into source repository AS IS. 

To my understanding, being 'a contributor' requires rather much time to prepare code which satisfies all internal requirements to code quality (which i'm not aware about) and so on. Sorry - i do not have time for that. 

That's why you may or may not use completely or partially my patches. I just care to save my time in future, i.e. getting my use-cases working in future releases of NetBeans with no need to build NetBeans localy and regularly synchronize my and external changes. You may fix them in some different and rather probably more correct way than I've done in my code.
Comment 8 Jiri Kovalsky 2016-01-08 20:06:45 UTC
In such case it's up to you Matthias to review the patch and if you find the suggestions working integrate the changes as yours.

Sergey, by signing the OCA you neither get push access to any codebase automatically nor you commit to delivering super quality contributions. Anyway, thanks for your honesty.
Comment 9 matthias42 2016-03-14 19:37:30 UTC
At this point I'm out of real solutions - I see only one real option:

Remove the paging option and add the option to limit data based on rowcount/fetchtime/size.

Paging depends on the knowledge of the total rowcount. Removing the reexecution option (modified in different ways), the only remaining option is to do an exhaustive read of the resultset. While this is impractical over a WAN connection, in a LAN this would be doable. I tried that and hit h2, which got me an out-of-memory error for a large table. This was tried with and without scrollable resultsets, both failed.

If there are more ideas, I'd like to hear them - if not I'll go ahead and clean up db.dataview and db.core and remove paging option.
Comment 10 NukemBy 2016-03-17 10:41:51 UTC
I would go with something like this:

- by default - read at most 1000 of rows and keep in memory, this will cover 95% of live cases with reasonable memory usage.

- if resultset has more than 1000 rows - instead of "pagination controls" display a button (or link-button) saying "Row count is more than 1000, click to enable pagination or read entire resultset". Upon clicking - display dialog which explicitly says that "SQL statement will be re-executed" (with checkbox "OK, never show again") - after that revert to current implementation - user will not be disappointed "now and then" by unwanted re-execution of SQL statements.

I guess limit of 1000 rows should be configurable somewhere.

After fetching 1000 of rows it may have sense to use ResultSet.last() just to get total number of rows without reading them all, but JDBC driver has to support it.

Check here for details:
http://stackoverflow.com/questions/19533991/how-to-get-number-of-rows-returned-by-resultset-while-using-hsqldb 
https://examples.javacodegeeks.com/core-java/sql/determine-if-a-database-supports-scrollable-resultsets/
Comment 11 matthias42 2016-05-28 22:53:12 UTC
I decided to go with my original plan and not fetch the total row count. So basicly paging is now removed. The changeset is pushed as:

http://hg.netbeans.org/core-main/rev/10acbb2ec515

The statement is now only executed once to fetch the displayed resultset, possible only fetching a subset.

With regard to limitless fetching: It is already possible by specifying 0 as fetch row count.

When the nightly build is produced, that contains this fix, please check if this works for you and change the status of this bug to RESOLVED-VERIFIED. If you see a wrong behaviour, feel free to reopen (optimizations should go into a new report).