This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 99058 - I18N - problems in use of multibyte in some project file or dir names of ruby or rails projects
Status: RESOLVED FIXED
Alias: None
Product: ruby
Classification: Unclassified
Component: Project
Version: 6.x
Hardware: All All
Importance: P3 blocker
Assignee: Erno Mononen
URL:
Keywords: I18N
Depends on:
Blocks:
 
Reported: 2007-03-27 00:50 UTC by Ken Frank
Modified: 2009-07-21 08:18 UTC

See Also:
Issue Type: DEFECT
Exception Reporter:


Attachments
image (1.18 KB, image/gif), 2007-03-27 02:10 UTC, Ken Frank
image (1.52 KB, image/gif), 2007-03-27 02:10 UTC, Ken Frank
image (46.13 KB, image/gif), 2007-03-27 02:11 UTC, Ken Frank
image (58.15 KB, image/gif), 2007-03-27 02:31 UTC, Ken Frank

Description Ken Frank 2007-03-27 00:50:37 UTC
The assumption for this issue (which I can split into separate ones) is that, as
for Java and other NB projects, it's OK to have non-ASCII or multibyte in the names of
projects and the files related to them, and also in the data of the files
(where multibyte is allowed by the syntax of the programming language).

For example, in Java it's OK to have mbyte in the names of variables, classes,
functions, etc. (the assumption here is that, as a basic case, the user is using
characters of the locale they are running NB under).

For Ruby, the high-level questions to the team are:

1. Is it legal in the JRuby language and Rails to use multibyte in file names
and in the names of program classes, functions, and variables?

We're not going to test JRuby or Rails for this; we are assuming that kind of
testing is being done by those doing other JRuby or Rails testing.

2. From Tor's comments, I am assuming that for NB projects it should be legal
for Ruby and Rails projects to have mbyte at least as part of the project name?
The answer to #1 will tell us whether it's legal for other parts.


3. We need the same kind of specs for the file names and contents of the
other Ruby or scripting files, such as Ruby class, Rakefile, unit test, and RHTML,
as well as JavaScript and PHP.


4. We will also need some simple JRuby and Rails projects to open,
so we can change some data to use mbyte, or perhaps refactor or rename to
use mbyte. Can we get a contact for that?

A. Here are some things observed; GIFs and attachments will be added.

1. Even though the default project, folder, and some other things are marked
as #NOI18N in the Ruby or Rails bundle files, that does not mean the user cannot
use mbyte for them in wizard panels, and users can run in another locale without
using the localized product.
When using mbyte in the name of the project or of a dir in its path, creating a basic Ruby
project and file seems OK and it runs, but using mbyte in the name of the main file
gives an exception.
See attached file rubymain.mb.

2. when in ja locale and choose to generate an rdoc, it does not generate it
(or perhaps generates it but does not display it) and gives message

Rakefile.rb:
 main.rb:
Generating HTML...
/net/machine/export/home2/0321/nb6.0/jruby-0.9.8/lib/ruby/1.8/rdoc/rdoc.rb:267:in `chdir': /home/user/h\256Joi\256NetBeansProjects\244?/h\256RubyApplication5 is not a directory (Errno::ENOENT)
        from /net/machine/export/home2/0321/nb6.0/jruby-0.9.8/lib/ruby/1.8/rdoc/rdoc.rb:267:in `document'
        from /net/machine/export/home2/0321/nb6.0/jruby-0.9.8/bin/rdoc:63

But it's OK if no mbyte is used in the name of the project or path.

Please note that there is a lot going on in general with encoding handling and
projects in NB6; see 42638 and related issues. NB developers could tell you more
about this and whether scripting needs to do anything related, so some of this
*might* be related to that; I just don't know.

3. I see this exception even when using English project or file names, but with English names
the rdoc seems to generate OK, unlike with mbyte. The attached file is rubydoc.enandja


4. Rails project with an mbyte project name: no files like controllers, etc. are
generated, and the output window has the message
Error opening script file: script/server (ファイルもディレクトリもありません。)
where the mbyte here (Japanese for "No such file or directory") is not from the
pseudo localization; perhaps it is a message from a Unix command (since Solaris ja
has many localized messages)? Or could it be from some NB ext lib that has been localized?

The explorer shows the Rails project without any tabs or trees (since nothing was
generated).

5. Open the context menu on a Rails project with mbyte in the explorer, choose Generate, and
choose, for example, a controller: nothing happens, and the same error messages as in 4 appear.

6. Create a Rails project without mbyte in the project name and all things are generated;
or choose to generate another item, and it appears that the file is generated
(though I don't know if it's complete).
However, the mbyte in the file name, in the output window, and in the file
itself is not correct. See attached GIF genmodel.mb.gif

(This happens in the ja EUC locale or the UTF-8 locale. It's always good to make sure
encoding handling is OK in each, since sometimes things might work OK in one
but not the other. I am using ja here as a representative mbyte locale; the same could
be true for other locales.)
Comment 1 Ken Frank 2007-03-27 02:10:14 UTC
Created attachment 39995
image
Comment 2 Ken Frank 2007-03-27 02:10:59 UTC
Created attachment 39996
image
Comment 3 Ken Frank 2007-03-27 02:11:33 UTC
Created attachment 39997
image
Comment 4 Ken Frank 2007-03-27 02:31:21 UTC
Create a Ruby test in a path that has multibyte and choose Run Test from the editor.

Incorrect mbyte shows in the output window. I don't know if the test should run or if
the output window errors are related to mbyte, but the mbyte is not correct.
See attachment.

ken.frank@sun.com
Comment 5 Ken Frank 2007-03-27 02:31:56 UTC
Created attachment 39998
image
Comment 6 Ken Frank 2007-06-13 21:00:37 UTC
The new implementation of the project encoding property might have an impact on some items mentioned in the original
filing, and might have side effects that cause other situations. I will note some others in this posting:

1. Generate something, a controller for example, whose name has multibyte as part of it: the
multibyte is not shown properly in the explorer (which probably means it won't be found as a file when needed,
due to encoding problems).

a. The .rb file generated also has the multibyte shown incorrectly.


2. project properties - arguments -
please check to see if encoding is handled
ok for arguments specified here that could
have multibyte as part of the argument.


3. a .rb that has multibyte in its name - it can run ok and show mbyte in it ok in output window
(like for a puts command)

4. but a ruby project that has mbyte in the project name

a. can run the project and see mbyte ok

b. but choosing to build it, does not build and get this error
rake aborted!
No such file to load --
/user/NetBeansProjects/??RubyApplication14/lib/Rakefile.rb

where the ?? is multibyte

NOTE: even for a Ruby project without mbyte in its name, choosing Build gives this error
rake aborted!
Don't know how to build task 'default'
I am guessing this is not an I18N situation?


5. Bring up the IRB window: the multibyte that is part of the "Welcome to the JRuby IRB" message shows OK,

but when using puts 'ZZZZ', where ZZZZ is multibyte, the mbyte displays as question marks.


6. To clarify the original post about rdoc: if a Ruby project has mbyte in the project name or in the name of main.rb,
then the rdoc won't be generated, and it gives syntax errors, since it says it can't find the file or project dir and shows
the mbyte it's looking for in those names incorrectly in that message in the output window.
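
A hedged illustration of how the question marks in item 5 can arise: when text is written to a stream whose encoding cannot represent the characters, a "?" is commonly substituted for each one. The sketch below reproduces that effect in Ruby 1.9 or later (not the Ruby 1.8 / JRuby of this report). The helper name lossy_output and the sample string are hypothetical, and whether the IRB window fails in exactly this way is an assumption.

```ruby
# Hypothetical helper: convert text to an encoding that may not be able to
# represent it, substituting "?" for every character that has no mapping.
def lossy_output(text, target_encoding)
  text.encode(target_encoding, invalid: :replace, undef: :replace, replace: "?")
end

# Three Japanese characters have no US-ASCII mapping, so each becomes "?".
shown = lossy_output("日本語", "US-ASCII")
puts shown

# Plain ASCII text passes through unchanged.
puts lossy_output("main.rb", "US-ASCII")
```

If the output side is instead configured for an encoding that covers the characters (for example UTF-8), the same call returns the text unchanged, which is why the symptom depends on the locale and project encoding in use.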

ken.frank@sun.com
Comment 7 Jiri Kovalsky 2007-07-03 14:09:36 UTC
Reassigning this issue to newly created 'ruby' component.
Comment 8 Ken Frank 2007-07-03 22:41:38 UTC
additional case -

A Ruby file that has multibyte as part of its name: the file name appears in a comment in the newly
created Ruby file, and for a Ruby class, the class name is also seeded with the name of the file.
In both these cases, the multibyte does not display OK.

This is the case whether it's in a Ruby project or created separately in another project like a J2SE project,
and when the default project encoding has not been changed from UTF-8.

If the encoding is changed from UTF-8 to the encoding of the system locale, it shows OK, but this is not as per the new
project/file encoding spec.

ken.frank@sun.com
Comment 9 Ken Frank 2007-07-30 22:04:48 UTC
Create a Ruby class with multibyte as part of the name; mbyte is not allowed as the first letter of the name.
Thus the file name and class name have mbyte as part of the creation.

Try to run the file and it gives an error; this does not happen if the same Ruby class has no multibyte in its name.
Actually, the file name can have multibyte, but not the class name; that is what seems to cause the exception below:

Exception in thread "main" java.lang.NumberFormatException: For input string: "?"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
        at java.lang.Integer.parseInt(Integer.java:447)
        at org.jruby.lexer.yacc.RubyYaccLexer.yylex(RubyYaccLexer.java:1411)
        at org.jruby.lexer.yacc.RubyYaccLexer.advance(RubyYaccLexer.java:123)
        at org.jruby.parser.DefaultRubyParser.yyparse(DefaultRubyParser.java:883)
        at org.jruby.parser.DefaultRubyParser.yyparse(DefaultRubyParser.java:835)
        at org.jruby.parser.DefaultRubyParser.parse(DefaultRubyParser.java:3299)
        at org.jruby.parser.Parser.parse(Parser.java:78)
        at org.jruby.Ruby.parse(Ruby.java:1020)
        at org.jruby.Main.getParsedScript(Main.java:238)
        at org.jruby.Main.runInterpreter(Main.java:222)
        at org.jruby.Main.runInterpreter(Main.java:173)
        at org.jruby.Main.run(Main.java:120)
        at org.jruby.Main.main(Main.java:95)

ken.frank@sun.com
Comment 10 Ken Frank 2007-07-31 00:31:00 UTC
For the comment on rdoc: the generation fails if multibyte is anywhere in the path,
whether in the project name, the path to the project dir, or the name of a Ruby file,
and regardless of whether the project uses the euc-ja encoding (for a user in the Solaris ja locale)
or the utf-8 encoding, which is now a valid choice since there are now project
encoding properties.

error msg from gen rdoc is, for example

/net/machine/export/home2/nbinst/ruby1/jruby-1.0/lib/ruby/1.8/rdoc/rdoc.rb:176:in `normalized_file_list': No such file or directory - /home/NetBeansProjects/RubyApplication17/./lib/¤Èäîmain.rb (Errno::ENOENT)
        from /net/machine/export/home2/nbinst/ruby1/jruby-1.0/lib/ruby/1.8/rdoc/rdoc.rb:187:in `each'
        from /net/machine/export/home2/knbinst/ruby1/jruby-1.0/lib/ruby/1.8/rdoc/rdoc.rb:177:in `normalized_file_list'
      ...
      ...
        
in this case, it can't read that there is a file with mbyte in its name
in other cases, it can't read that mbyte is in a path to the project.
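
The garbled characters in the path above are consistent with EUC-JP bytes being read as ISO-8859-1. The sketch below (Ruby 1.9 or later, not the Ruby 1.8 of this report) shows the effect for a single character. That the file name began with the hiragana "と" is an inference from the bytes visible in the log, not something stated in the report, and the variable names are hypothetical.

```ruby
# Assumption: the first character of the file name was the hiragana "to".
name = "と"

# The same character as an EUC-JP byte sequence.
euc = name.encode("EUC-JP")
puts euc.bytes.map { |b| format("%02X", b) }.join(" ")

# Reinterpret those bytes as ISO-8859-1 (no conversion happens here),
# then convert to UTF-8 so the misread characters can be printed.
misread = euc.dup.force_encoding("ISO-8859-1").encode("UTF-8")
puts misread
```

A path assembled from the misread characters names a file that does not exist on disk, which would fit the Errno::ENOENT in the log.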

ken.frank@sun.com
Comment 11 Ken Frank 2007-07-31 02:49:15 UTC
Another case: a variable, or a function created by def, that has multibyte in its name. When running the file, the same
error as seen in the last comment is shown in the output window and it does not run.

I am assuming this would apply to other uses of mbyte in variable, function, and method names in Ruby or Rails.
(What is not known is whether that is legal in Ruby or Rails itself.)

ken.frank@sun.com
Comment 12 Ken Frank 2007-07-31 18:49:48 UTC
2 comments more on rdoc and mbyte

1. with a ruby class file with en name but with mbyte as part of class name,
the rdoc does generate and show - however, the class name is not shown
correctly - only the part before the multibyte shows as the class name.

perhaps this is ruby restriction on use of mbyte at all in class name ?

but see 2 below - it seems the entire class name is accepted if mbyte is not the first
character, but the rdoc does not show it.


2. Or it might be a separate problem in key handling of the class name in the new class
wizard, or it could be that it's enforcing that a class name cannot start with mbyte;
it's not clear.

That is, when adding mbyte as the first part of the class name, it takes the first character
input from the en keyboard while in mbyte input mode (I am on Solaris) and uses that
character (after capitalizing it) as the first letter of the class name; then it proceeds
to allow the mbyte input.

Perhaps there could be a help message in the window about this if there is a restriction
related to the chars of a class name.

ken.frank@sun.com
Comment 13 Masaki Katakai 2007-08-01 03:38:04 UTC
 - Ruby identifiers consist of alphabetic characters, decimal digits, and the underscore character.
 - Class names and module names are constants. They should begin with an uppercase letter ([A-Z]).

I could not find any description written in English, but the Japanese man page says:

 - It is not recommended because of compatibility issues, but by providing the -K option properly,
   identifiers written in Japanese can be used as local variables.
   (http://www.ruby-lang.org/ja/man/?cmd=view;name=%CA%D1%BF%F4%A4%C8%C4%EA%BF%F4)

From these things, I understand:

 - deprecated: Japanese can be used only in local variables, with the proper -K option

Developers usually do not use Japanese in variable names or file names,
so I think the priority of this bug should not be high.
Comment 14 Torbjorn Norbye 2007-08-01 03:47:56 UTC
Thanks. Changing priority to P3. It's unlikely that we can do anything about this on our end; it's a limitation of Ruby and JRuby. I believe Unicode support is
one of the goals for Ruby 2.0, which should address the charset aspects of this bug. On top of that, lots of Rails libraries etc. would need to start using message
catalogs and such.
Comment 15 Ken Frank 2007-08-01 04:01:02 UTC
Tor,

this issue was meant as an early placeholder issue about various things about mbyte -
not just about ruby language but about how nb handles mbyte.

Can you/the Ruby team go through it and note which things are limitations of the Ruby/Rails language/libs

and which are things that should work with mbyte in NB and don't run into the Ruby limitations?

Then we can put the limitations in the docs and file separate issues on what could be fixed
in the NB Ruby/Rails modules.

Also, there could be an issue related to not allowing mbyte and/or extended ASCII
input in places like wizards, code, etc., or at least to showing warning messages so users are not confused.

And if there is limitation to actually using mbyte chars in program data in that they will
not show correctly, that especially can be called out as that is not related to ruby lang
constructs per se.

We are not talking only about japanese here, we have users from all over and expectations
of use of native chars can be different in different countries.

ken.frank@sun.com
Comment 16 Torbjorn Norbye 2007-08-01 17:48:46 UTC
I've added some code which checks if a Ruby name is "safe" ([a-zA-Z0-9_]) and if not, will generate warnings (but not errors) in the create project dialogs 
(for Ruby and Rails projects), in the new file dialog (for classes, modules, tests), and in the Rename refactoring "new name" dialog.

IDE:-------------------------------------------------
IDE: [8/1/07 9:46 AM] Committing started
Checking in refactoring/src/org/netbeans/modules/refactoring/ruby/plugins/RenameRefactoringPlugin.java;
/cvs/ruby/refactoring/src/org/netbeans/modules/refactoring/ruby/plugins/RenameRefactoringPlugin.java,v  <--  RenameRefactoringPlugin.java
new revision: 1.3; previous revision: 1.2
done
Checking in railsprojects/src/org/netbeans/modules/ruby/railsprojects/ui/wizards/PanelProjectLocationVisual.java;
/cvs/ruby/railsprojects/src/org/netbeans/modules/ruby/railsprojects/ui/wizards/PanelProjectLocationVisual.java,v  <--  PanelProjectLocationVisual.java
new revision: 1.3; previous revision: 1.2
done
Checking in editing/src/org/netbeans/modules/ruby/Bundle.properties;
/cvs/ruby/editing/src/org/netbeans/modules/ruby/Bundle.properties,v  <--  Bundle.properties
new revision: 1.6; previous revision: 1.5
done
Checking in editing/src/org/netbeans/modules/ruby/RubyUtils.java;
/cvs/ruby/editing/src/org/netbeans/modules/ruby/RubyUtils.java,v  <--  RubyUtils.java
new revision: 1.5; previous revision: 1.4
done
Checking in projects/src/org/netbeans/modules/ruby/rubyproject/templates/RubyTargetChooserPanel.java;
/cvs/ruby/projects/src/org/netbeans/modules/ruby/rubyproject/templates/RubyTargetChooserPanel.java,v  <--  RubyTargetChooserPanel.java
new revision: 1.4; previous revision: 1.3
done
Checking in projects/src/org/netbeans/modules/ruby/rubyproject/ui/wizards/PanelProjectLocationVisual.java;
/cvs/ruby/projects/src/org/netbeans/modules/ruby/rubyproject/ui/wizards/PanelProjectLocationVisual.java,v  <--  PanelProjectLocationVisual.java
new revision: 1.3; previous revision: 1.2
done
IDE: [8/1/07 9:46 AM] Committing finished
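
For readers without access to the CVS history, here is a minimal sketch of the kind of check described above, written in Ruby for illustration. The actual change was made in the Java sources listed in the commit log (for example RubyUtils.java) and is not shown in this report, so the method names, the constant, and the warning text below are all hypothetical.

```ruby
# Hypothetical pattern for a "safe" name: only [a-zA-Z0-9_], at least one char.
SAFE_RUBY_NAME = /\A[a-zA-Z0-9_]+\z/

# Returns true when every character of the name is in the safe set.
def safe_ruby_name?(name)
  SAFE_RUBY_NAME.match?(name)
end

# Returns a warning string for unsafe names and nil otherwise, mirroring the
# "warn but do not block" behavior described in the comment.
def name_warning(name)
  return nil if safe_ruby_name?(name)
  "Warning: '#{name}' contains characters outside [a-zA-Z0-9_]"
end

puts name_warning("日本語main").inspect
puts name_warning("RubyApplication5").inspect
```

Returning a warning rather than raising keeps the dialog usable for users who knowingly choose such a name, which matches the "warnings (but not errors)" wording above.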
Comment 17 Ken Frank 2007-08-13 04:53:18 UTC
from other discussions, it seems that most if not all use of multibyte
is not safe or supported for ruby (am not talking about any Japanese specific things here)

thus should the warnings about not using extended ascii or mbyte be also for
name of main.rb, ruby filename and classnames, project path, rakefile names ?

ken.frank@sun.com
Comment 18 Ken Frank 2007-09-30 03:39:29 UTC
before verifying, need to know if there will be more warnings as per previous comment which was:

from other discussions, it seems that most if not all use of multibyte
is not safe or supported for ruby (am not talking about any Japanese specific things here)

thus should the warnings about not using extended ascii or mbyte be also for
name of main.rb, ruby filename and classnames, project path, rakefile names ?

--> will there be more warnings ?

also see ruby issue about adding to docs/olh the problems with use of mbyte due to ruby/rails limitations.

ken.frank@sun.com
Comment 19 Ken Frank 2008-03-07 02:42:40 UTC
Since this issue seems to have become a place for noting the many things
related to multibyte and non-ASCII that don't work, either in the Ruby/Rails modules
or, more likely, in the Ruby version being used now, I'll add another one here.
We can separate out specific issues when the next version of Ruby, which is supposed
to have some I18N fixes, is used.

1. If the path to the project has multibyte in it, you can't debug the Ruby program:
the debugger says it can't find the file, since it's not using the correct encoding,
or perhaps the Ruby module is not passing it the correct encoding.
(I am using the euc-jp project encoding here, but am guessing it is the same case for utf-8 or other project
encodings.)



ken.frank@sun.com
Comment 20 Erno Mononen 2009-07-21 08:18:19 UTC
As far as I can tell, the original problem, i.e. the one described in the summary, was fixed long ago. As for the
rest: if there are any remaining I18N problems, it is best to file new issues for each of them (one issue / one problem). I
can't really handle these kinds of convoluted issues. Thanks for understanding.