This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 141355 - XML file recognized as PHP file
Summary: XML file recognized as PHP file
Status: VERIFIED FIXED
Alias: None
Product: platform
Classification: Unclassified
Component: Filesystems
Version: 6.x
Hardware: All All
Importance: P1 blocker
Assignee: Jiri Skrivanek
URL:
Keywords: RANDOM, TEST
Duplicates: 141333
Depends on: 199927
Blocks:
Reported: 2008-07-23 14:06 UTC by Ivan Sidorkin
Modified: 2011-07-06 20:39 UTC
CC List: 7 users

See Also:
Issue Type: DEFECT
Exception Reporter:


Attachments
screen shot (35.83 KB, image/png)
2008-07-23 14:06 UTC, Ivan Sidorkin
the same problem with ruby files (121.95 KB, image/png)
2008-07-24 10:20 UTC, rmatous
suggested patch (784 bytes, text/plain)
2008-07-24 10:38 UTC, rmatous

Description Ivan Sidorkin 2008-07-23 14:06:01 UTC
Reproduced randomly on deadlock during a C/V test run.
No specific steps to reproduce.

A newly created XML document is recognized as a PHP file (see attached screenshot).
Comment 1 Ivan Sidorkin 2008-07-23 14:06:54 UTC
Created attachment 65378
screen shot
Comment 2 Jesse Glick 2008-07-23 23:43:29 UTC
Probably should be component 'php', no?
Comment 3 rmatous 2008-07-24 10:20:21 UTC
Created attachment 65500
the same problem with ruby files
Comment 4 rmatous 2008-07-24 10:37:19 UTC
MIME resolvers registered via META-INF often need to look into the content of a file; this is definitely true for the
PHP MIME resolver. The PHP resolver expects to be called only when the declarative resolvers were not able to resolve
the file (forlorn hope). The order also affects performance.

Evaluating this issue, I see that the declarative MIME resolvers:
- were moved to openide.filesystems
- don't follow the logic I expected

A simple fix should be enough: just call the declarative resolvers first (see attachment). I'm not sure whether this
fix is OK and won't lead to other problems.

Reassigned to core; please review the fix and apply it, or reassign back to me (but then please comment and let me know
how to fix it).

Increasing priority.
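
The ordering proposed in this comment can be sketched as follows. This is only an illustration, not the attached patch
or the NetBeans code: the Resolver interface, the two sample resolvers, and the resolve method are hypothetical
stand-ins, and the MIME type strings are assumed for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class ResolverOrderSketch {
    /** Hypothetical stand-in for a MIME resolver: returns a MIME type, or null if the file is not recognized. */
    interface Resolver {
        String findMimeType(String fileName, String content);
    }

    /** Plays the role of a declarative resolver: decides by extension only. */
    static final Resolver XML_BY_EXTENSION =
            (name, content) -> name.endsWith(".xml") ? "text/xml" : null;

    /** Plays the role of a procedural resolver: decides by content only. */
    static final Resolver PHP_BY_CONTENT =
            (name, content) -> content.contains("<?php") ? "text/x-php5" : null;

    /** Asks every declarative resolver before any procedural one; the first non-null answer wins. */
    static String resolve(List<Resolver> declarative, List<Resolver> procedural,
                          String fileName, String content) {
        List<Resolver> ordered = new ArrayList<>(declarative);
        ordered.addAll(procedural);
        for (Resolver resolver : ordered) {
            String mimeType = resolver.findMimeType(fileName, content);
            if (mimeType != null) {
                return mimeType;
            }
        }
        return "content/unknown"; // assumed fallback for unrecognized files
    }
}
```

With this order, an .xml file that happens to contain "<?php" is still claimed by the extension-based resolver; if the
two lists are swapped, the content-based resolver claims it first.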
Comment 5 rmatous 2008-07-24 10:38:13 UTC
Created attachment 65503
suggested patch
Comment 6 rmatous 2008-07-24 10:39:12 UTC
thanks for evaluation
Comment 7 Torbjorn Norbye 2008-07-24 15:49:55 UTC
*** Issue 141333 has been marked as a duplicate of this issue. ***
Comment 8 Jesse Glick 2008-07-24 16:01:23 UTC
I think the suggested patch makes sense. (MIMEResolver.java's Javadoc would also need to be changed.)

BTW I don't follow why the procedural PHP resolver is recognizing XML and Ruby files, regardless of the order in which
it was invoked. This looks like a bug in the php component to me.
Comment 9 rmatous 2008-07-24 16:23:54 UTC
I don't see any bug in the PHP MIME resolver yet.
The PHP MIME resolver doesn't care about the name or extension (and cannot do so reliably); it just looks for the
pattern <?php. When I implemented it, I relied on the fact that Ruby (and other well-known) files are already resolved,
and that for unrecognized files it is OK to say "it's PHP" if the file contains <?php. PHP tags can easily be processed
no matter whether the file is *.php or *.xml (especially when run from the command line, where no Apache config is
needed).
Comment 10 Jesse Glick 2008-07-24 16:29:12 UTC
I don't see any "<?php" in the main.rb in the screenshot, and it would be pretty weird if the XMLDocument.xml in C/V
contained this string. Why would the PHP procedural resolver claim these files?
Comment 11 rmatous 2008-07-24 16:30:10 UTC
The only problem might be that as soon as I resolve a file as PHP, its extension is kept and all other files with that
extension are considered PHP. Two reasons for doing it: 1. performance; 2. editing a file: deleting and recreating the
characters <?php would lead to different MIME types over time, which is something NB isn't ready for, as far as I know.
Comment 12 Jesse Glick 2008-07-24 16:35:43 UTC
I don't follow rmatous's last comment. The only problem with what? Who is "keeping" an extension?

All I asked in my last comment was why files which do not appear to have anything whatsoever to do with PHP are being
claimed by its resolver. Invoking that resolver much later might reduce the symptoms of its bugginess but that would not
make it correct.
Comment 13 rmatous 2008-07-24 16:39:40 UTC
My previous comment is an answer to your last comment about the missing <?php in the screenshot. I could also add that
another *.rb file could contain the <?php sequence, or that the file in the screenshot could even have contained the
missing "hp" to complete "<?php", and so on.

Naturally I can have a bug in the code, but I don't know about it yet.
Comment 14 rmatous 2008-07-24 16:58:45 UTC
> I don't follow rmatous's last comment. The only problem with what? Who is "keeping" an extension?

As soon as the PHP MIME resolver finds the mentioned magic chars, it will claim the file as PHP. Let's say the magic
chars are found in a *.rb file; then all *.rb files will later be resolved as PHP files by this MIME resolver (reasons
explained). The extension is kept by the PHP MIME resolver for later use (as explained above).


> All I asked in my last comment was why files which do not appear to have anything whatsoever to do with PHP are being
> claimed by its resolver.

How should I know which files have something to do with PHP? OK, I can check the extensions "*.rb, *.rhtml" (BTW already
fixed), and what else? Should I enumerate all known extensions? As I mentioned, it could be absolutely OK to treat *.xml
files as PHP (someone even mentioned it once as a real use case). If there were an API that returned all registered
extensions in NB, I wouldn't recognize them as PHP (but that's not necessary if my resolver is called as the last one).

> Invoking that resolver much later might reduce the symptoms of its bugginess but that would not
> make it correct.

I'm not sure I understand which bugginess you have in mind (provided that the order is fixed according to the patch).
If someone finds a bug, I'm ready to fix it.
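
The extension-keeping behavior described in Comments 11 and 14 can be sketched like this. It is a simplified
illustration under assumptions, not the actual PhpMimeResolver: the class name, method names, and the MIME type string
are hypothetical.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class ExtensionCacheSketch {
    /** Extensions of files that were claimed as PHP at least once (hypothetical cache). */
    private final Set<String> claimedExtensions = new HashSet<>();

    /** Returns a PHP MIME type if the file is claimed, or null to let other resolvers decide. */
    String findMimeType(String fileName, String content) {
        String extension = extensionOf(fileName);
        // Once an extension has been claimed, every later file with that extension is treated as PHP.
        if (!extension.isEmpty() && claimedExtensions.contains(extension)) {
            return "text/x-php5";
        }
        if (content.contains("<?php")) {
            if (!extension.isEmpty()) {
                claimedExtensions.add(extension);
            }
            return "text/x-php5";
        }
        return null;
    }

    /** Lower-cased text after the last dot, or an empty string if there is no dot. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }
}
```

In this sketch a plain main.rb is left alone at first, but after any single *.rb file containing "<?php" has been seen,
main.rb is reported as PHP too.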

Comment 15 Jesse Glick 2008-07-24 17:19:26 UTC
Recognizing all *.foo files as PHP just because some.foo happened to contain "<?php" seems dangerous to me.
Comment 16 Torbjorn Norbye 2008-07-24 17:19:57 UTC
In the scenario I ran into, there were no files that contained "php" anywhere inside them (JavaApplication3,
RubyApplication33 in my screenshot).  I think the user directory was new too, but I'm not 100% sure about that.

The reason there are HTML tags and "<?p" in the screenshot is that I suddenly noticed the PHP icon in the file explorer,
and since I thought the file had no syntax highlighting (all text was black with just the Ruby code there), I added some
HTML markup, noticed element syntax highlighting, then added <?p to see if I could get PHP code completion too (which I
did.)

> "How should I know which files have something to do with PHP. OK I can check extensions "*.rb, *.rhtml" (BTW
> already fixed)"

I assume you meant "have -nothing- to do with PHP", in other words, .rb and .rhtml (or any of the 10 other Ruby standard
file extensions referred to in ruby/src/**/RubyMimeResolver.java) will never be treated as PHP, correct?
Comment 17 rmatous 2008-07-24 17:40:51 UTC
> In the scenario I ran into, there were no files that contained "php" anywhere inside
The resolver looks for either "<?php" or "<?[whitespace]" (also called a short tag). The code in the screenshot
definitely isn't recognized as a PHP file for me (thus I cannot reproduce).

> I assume you meant "have -nothing- to do with PHP", 

yes

> in other words, .rb and .rhtml (or any of the 10 other Ruby standard
> file extensions referred in ruby/src/**/RubyMimeResolver.java) will never be treated as PHP, correct?

If your RubyMimeResolver.java is called before the PHP MIME resolver, then these Ruby files will never be treated as
PHP, provided your MIME resolver works fine :) naturally.
If my resolver is called before yours, then it depends on whether "<?php" or "<?[whitespace]" is found in those files
(only in the first 4 kB, which is the limit the performance team permitted me; this can, by the way, also lead to
unreliable resolving).
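
The content check described in this comment can be approximated as below. This is a sketch under assumptions rather
than the resolver's real code: the class, constant, and method names are hypothetical, and the regular expression is
only one way to express "<?php" or "<?" followed by whitespace.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

public class PhpSniffSketch {
    /** Only this many leading bytes are inspected (the 4 kB limit mentioned above). */
    static final int LIMIT = 4096;

    /** Matches "<?php" or the short tag "<?" followed by a whitespace character. */
    static final Pattern OPEN_TAG = Pattern.compile("<\\?(php|\\s)");

    static boolean looksLikePhp(byte[] content) {
        int length = Math.min(content.length, LIMIT);
        // ISO-8859-1 maps every byte to one char, so ASCII tags are found regardless of the file's encoding.
        String head = new String(content, 0, length, StandardCharsets.ISO_8859_1);
        return OPEN_TAG.matcher(head).find();
    }
}
```

Under these assumptions "<?xml version="1.0"?>" is not matched, but any file with "<? " near the top is, and an opening
tag that first appears beyond the 4 kB window is missed.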
Comment 18 Jesse Glick 2008-07-24 17:45:04 UTC
Treating anything with "<? " as a PHP file seems wrong to me. This could just be any XML processing instruction (I think).
Comment 19 rmatous 2008-07-24 17:54:53 UTC
You are right. "<?xml" is common; I'm not sure about "<? ". "<? " is going to be deprecated (not yet) and is still often
used instead of "<?php". If my MIME resolver were the last one, then such a file could be a PHP file (instead of
unknown); actually, why not. I'm not sure whether it is worse to mark a non-PHP file as PHP than to fail to recognize a
PHP file and treat it as unknown.
Comment 20 Jesse Glick 2008-07-24 18:49:49 UTC
Correction:

<?foo bie="bletch"?>

is a valid XML processing instruction, but

<? foo bie="bletch"?>

is not. Still, "<? " seems like a pretty generic character sequence that is not obviously indicative of PHP.
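
The difference between the two forms above can be checked with the JDK's built-in XML parser. The snippet is a small
sketch; the class and method names are made up for illustration, and a <root/> element is appended in the usage below
so that each sample is a complete document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ProcessingInstructionCheck {
    /** Returns true if the given text parses as well-formed XML, false if the parser rejects it. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors as exceptions instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed
        } catch (Exception e) {
            throw new IllegalStateException(e); // parser configuration or I/O problem
        }
    }
}
```

Calling isWellFormed("<?foo bie=\"bletch\"?><root/>") succeeds, while the variant with a space after "<?" is rejected,
because a processing instruction target must follow "<?" immediately.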
Comment 21 rmatous 2008-07-24 21:42:21 UTC
If it seems to be a problem, we could consider not supporting short tags ("<? ") by default, but this is a question for
Petr Pisl. I wouldn't address it before the order is changed, and not until we run into problems related to it.
If someone lets me know what other indication I should use, I will do it (probably someone familiar with PHP; Tomas,
Petr, any idea?).
Comment 22 rmatous 2008-07-24 22:04:57 UTC
> Recognizing all *.foo files as PHP just because some.foo happened to contain "<?php" seems dangerous to me.

Why, if it were guaranteed that nobody else handles *.foo files because the PHP resolver is last, and moreover the file
contains "<?php"?
Comment 23 Jiri Skrivanek 2008-07-25 09:21:45 UTC
Radek defines his resolver in META-INF/services/org.openide.filesystems.MIMEResolver this way:

org.netbeans.modules.php.project.PhpMimeResolver
#position=999

Is there an option to get the position number from lookup and sort accordingly within the declarative resolvers? I am
afraid there isn't, because MetaInfServicesLookup.Item.position is private. In that case it seems appropriate to me to
apply Radek's patch.
Comment 24 Tomas Mysik 2008-07-25 10:01:20 UTC
> not to support short tags("<? ") as default

We _have_ to support short open tags because _many_ existing PHP projects use short open tags, but this is a must for
the editor. I think that we could remove short open tag detection from this MIME resolver because all our templates
use "<?php".

However, the more important issue is IMHO the ordering of MIME resolvers.
Comment 25 Jiri Skrivanek 2008-07-25 11:12:10 UTC
Patch applied and javadoc updated.

http://hg.netbeans.org/core-main/rev/10d30eec11f7
Comment 26 Jesse Glick 2008-07-25 14:16:53 UTC
Right, we can't intermix M-I/s resolvers with Services/MIMEResolver resolvers meaningfully.

MIMESupport Javadoc did not need to be touched; it is package private. But may be helpful for someone reading code.
Comment 27 Petr Pisl 2008-07-25 14:40:19 UTC
I agree with Tomas. For this resolver, mainly <?php is important, because we use only this delimiter in our templates.
Comment 28 rmatous 2008-07-25 15:33:23 UTC
Short tags no longer indicate PHP files; fixed:
http://hg.netbeans.org/main/rev/280928439638
Comment 29 Ivan Sidorkin 2008-08-04 11:49:50 UTC
verified