Bug 201848 - Provide a Common API for Validating/Sanitizing a File Name
Status: NEW
Product: platform
Classification: Unclassified
Component: Filesystems
-S1S-
All All
: P3 with 1 vote (vote)
: TBD
Assigned To: tomwheeler
issues@platform
Blocks: 130554
Reported: 2011-09-08 17:19 UTC by tomwheeler
Modified: 2011-09-16 16:42 UTC (History)
Issue Type: ENHANCEMENT


Description tomwheeler 2011-09-08 17:19:13 UTC
Consider a wizard in a NetBeans Platform application that asks the user to specify a file or directory name.  Since the operating system imposes restrictions on filenames (e.g. a name cannot contain / on UNIX, or \, :, and several other characters on MS Windows), the name provided may not be valid.

Unfortunately, there appears to be no common method one can call to validate a filename according to the native filesystem.  Consequently, it seems that many modules in the IDE sources create their own utility method to validate the name.  Here is an example from the NetBeans Module Project type:

apisupport.ant/src/org/netbeans/modules/apisupport/project/ui/wizard/BasicInfoWizardPanel.java:

    String pattern;
    String forbiddenChars;
    if (Utilities.isWindows()) {
        pattern = ".*[\\/:*?\"<>|].*";    // NOI18N
        forbiddenChars = "\\ / : * ? \" < > |";    // NOI18N
    } else {
        pattern = ".*[\\/].*";    // NOI18N
        forbiddenChars = "\\ /";    // NOI18N
    }

    // #145574: check for forbidden characters in FolderObject
    if (Pattern.matches(pattern, name)) {
        String message = NbBundle.getMessage(BasicInfoWizardPanel.class,
            "MSG_ProjectFolderInvalidCharacters");

        message = String.format(message, forbiddenChars);
        throw new WizardValidationException(
            getVisualPanel().nameValue, message, message);
    }

Likewise, some modules define a method to simply strip a prospective filename of illegal characters, as shown in org.netbeans.modules.glassfish.spi.Utils:

    public static String sanitizeName(String name) {
        if (null == name || name.matches("[\\p{L}\\p{N}_][\\p{L}\\p{N}\\-_./;#:]*")) {
            return name;
        }
        // the string is bad...
        return "_" + name.replaceAll("[^\\p{L}\\p{N}\\-_./;#:]", "_");
    }

    public static final String escapePath(String path) {
        return path.replace("\\", "\\\\").replace("$", "\\$"); // NOI18N
    }

Since both tasks are common in the IDE and in externally developed platform applications, I feel it would be beneficial if the Filesystem API offered methods to validate and sanitize file names, similar to those shown above.  They might best fit into the org.openide.filesystems.FileUtil class.
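
As a rough illustration of the request, here is a minimal sketch of what such a pair of methods could look like. The class name FileNameSketch, both method names, and the boolean "windows" parameter are hypothetical and are not part of FileUtil or any existing NetBeans API. The forbidden character sets are simply the ones named in the forbiddenChars strings of the BasicInfoWizardPanel excerpt, not an authoritative list for either system.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of org.openide.filesystems. */
final class FileNameSketch {
    // Character sets taken from the forbiddenChars strings in the excerpt above.
    private static final Pattern WINDOWS_FORBIDDEN = Pattern.compile("[\\\\/:*?\"<>|]");
    private static final Pattern UNIX_FORBIDDEN = Pattern.compile("[\\\\/]");

    private FileNameSketch() {
    }

    /** Returns true if the name is non-empty and has no forbidden character. */
    static boolean isValidFileName(String name, boolean windows) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        Pattern forbidden = windows ? WINDOWS_FORBIDDEN : UNIX_FORBIDDEN;
        return !forbidden.matcher(name).find();
    }

    /** Replaces every forbidden character with an underscore. */
    static String sanitizeFileName(String name, boolean windows) {
        Pattern forbidden = windows ? WINDOWS_FORBIDDEN : UNIX_FORBIDDEN;
        return forbidden.matcher(name).replaceAll("_");
    }
}
```

A wizard panel could then call FileNameSketch.isValidFileName(name, Utilities.isWindows()) instead of carrying its own regular expression.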
Comment 1 tomwheeler 2011-09-08 19:43:13 UTC
This Wikipedia page lists validity rules for different systems:

   http://en.wikipedia.org/wiki/Filename#Reserved_characters_and_words
Comment 2 tomwheeler 2011-09-08 19:48:42 UTC
And, for MS Windows, I guess this may be the definitive source:

   http://msdn.microsoft.com/en-us/library/aa365247%28VS.85%29.aspx
Comment 3 Jaroslav Tulach 2011-09-16 05:55:29 UTC
Can you donate a patch?
Comment 4 tomwheeler 2011-09-16 13:48:00 UTC
Yes, I think I should be able to implement it, though it will probably be a week or two until I have time.  I am reassigning the issue to myself.
Comment 5 err 2011-09-16 15:06:34 UTC
I suggest considering ways to override which platform the name must be compatible with. For example, if the default behavior is to validate for the platform you are running on, then an option, such as a system property, could require that the filename also be valid on other systems and/or all systems. A method signature that takes a platform as an enum (with a pseudo ALL platform) might be handy.
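
A minimal sketch of that idea, under stated assumptions: the Target enum, the resolveTarget and isValid methods, and the property name "sketch.filename.validation.target" are all hypothetical names invented for this example, and the character sets are the ones listed in the description's BasicInfoWizardPanel excerpt.

```java
import java.util.Locale;

/** Hypothetical sketch; none of these names exist in the NetBeans APIs. */
final class PlatformTargetSketch {
    /** ALL is a pseudo platform meaning "valid on every platform listed here". */
    enum Target { WINDOWS, UNIX, ALL }

    /** Assumed property name, chosen for this example only. */
    static final String OVERRIDE_PROPERTY = "sketch.filename.validation.target";

    private PlatformTargetSketch() {
    }

    /** Uses the system property if it names a Target, else the running OS. */
    static Target resolveTarget() {
        String override = System.getProperty(OVERRIDE_PROPERTY);
        if (override != null) {
            try {
                return Target.valueOf(override.trim().toUpperCase(Locale.ROOT));
            } catch (IllegalArgumentException ignored) {
                // Unknown value: fall back to the running platform below.
            }
        }
        String os = System.getProperty("os.name", "").toLowerCase(Locale.ROOT);
        return os.contains("windows") ? Target.WINDOWS : Target.UNIX;
    }

    static boolean isValid(String name, Target target) {
        switch (target) {
            case WINDOWS:
                return containsNone(name, "\\/:*?\"<>|");
            case UNIX:
                return containsNone(name, "\\/");
            case ALL:
                return isValid(name, Target.WINDOWS) && isValid(name, Target.UNIX);
            default:
                throw new AssertionError(target);
        }
    }

    private static boolean containsNone(String name, String forbidden) {
        for (int i = 0; i < name.length(); i++) {
            if (forbidden.indexOf(name.charAt(i)) >= 0) {
                return false;
            }
        }
        return true;
    }
}
```

Callers that do not care about the target would use isValid(name, resolveTarget()), while a deployment tool could pass Target.ALL explicitly.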
Comment 6 tomwheeler 2011-09-16 16:21:21 UTC
I was thinking the same thing, Ernie.  That would be more flexible and allow someone to handle cases in which they're developing on one platform but will deploy to another.  That would be especially helpful for things like EJB or Web application support, since people commonly develop on MS Windows but deploy to UNIX.

The only problem is that, to avoid duplication, we'd probably want to use the Utilities.OS_* int constants.  Ideally we could replace those with an enum, but that would break backwards compatibility.  I hate writing methods that take special int values, but I can't think of a way offhand to use an enum without either duplicating existing code or breaking compatibility.

One other question is whether we'd need to support any non-UNIX, non-Windows platforms (e.g. VMS or OS/400).  NetBeans system requirements don't show that these are supported any more:

  http://netbeans.org/community/releases/70/relnotes.html#system_requirements

but maybe people still support them for NB Platform applications.  To avoid breaking those, maybe the default behavior should be to return true for isValid and to return the original string for sanitizeFileName (and maybe log a warning) for an unrecognized platform.
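
The lenient default described above could be sketched roughly as follows. The class name and the two int constants are hypothetical stand-ins (the real Utilities.OS_* values are deliberately not reproduced), and the character sets again come from the excerpt in the description.

```java
import java.util.logging.Logger;

/** Hypothetical sketch of a lenient default for unrecognized platforms. */
final class LenientFallbackSketch {
    private static final Logger LOG =
            Logger.getLogger(LenientFallbackSketch.class.getName());

    // Stand-in constants for this example; not the real Utilities.OS_* values.
    static final int OS_WINDOWS = 1;
    static final int OS_UNIX = 2;

    private LenientFallbackSketch() {
    }

    /** Returns the forbidden characters, or null for an unrecognized platform. */
    private static String forbiddenFor(int platform) {
        switch (platform) {
            case OS_WINDOWS:
                return "\\/:*?\"<>|";
            case OS_UNIX:
                return "\\/";
            default:
                LOG.warning("Unrecognized platform " + platform
                        + "; file name is not checked");
                return null;
        }
    }

    /** Unrecognized platform: report the name as valid. */
    static boolean isValid(String name, int platform) {
        String forbidden = forbiddenFor(platform);
        if (forbidden == null) {
            return true;
        }
        for (char c : name.toCharArray()) {
            if (forbidden.indexOf(c) >= 0) {
                return false;
            }
        }
        return true;
    }

    /** Unrecognized platform: return the original string unchanged. */
    static String sanitizeFileName(String name, int platform) {
        String forbidden = forbiddenFor(platform);
        if (forbidden == null) {
            return name;
        }
        StringBuilder result = new StringBuilder(name.length());
        for (char c : name.toCharArray()) {
            result.append(forbidden.indexOf(c) >= 0 ? '_' : c);
        }
        return result.toString();
    }
}
```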
Comment 7 err 2011-09-16 16:42:28 UTC
> use the Utilities.OS_* int constants.  Ideally we could replace those
> with enum, but it would break backwards compatibility.

I know; I thought I'd put it out there anyway. I really like the self-documenting aspect of enums (e.g. getting a list of values for a combo box) and ...

One approach is to embed the magic number in the enum and have something like
    validate(int platform)       // uses OS_* int constants
    validate(Platform platform) { return validate(platform.MAGIC_NUMBER); }

Of course, a proposal to define such a global enum for general use might not get much traction.
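
Spelled out a little further, the two overloads might look like the sketch below. Everything here is an assumption made for illustration: the Platform enum, its magicNumber field, and the OS_WINDOWS/OS_UNIX constants are stand-ins, not the real Utilities.OS_* constants, and the validation rules reuse the character sets from the description.

```java
/** Hypothetical sketch of an enum that wraps existing int constants. */
final class PlatformEnumSketch {
    // Stand-ins for the existing int constants; values chosen for this example.
    static final int OS_WINDOWS = 1;
    static final int OS_UNIX = 2;

    /** Each enum constant carries the int value understood by the old overload. */
    enum Platform {
        WINDOWS(OS_WINDOWS),
        UNIX(OS_UNIX);

        final int magicNumber;

        Platform(int magicNumber) {
            this.magicNumber = magicNumber;
        }
    }

    private PlatformEnumSketch() {
    }

    /** Existing-style overload that takes an int constant. */
    static boolean validate(String name, int platform) {
        String forbidden;
        if (platform == OS_WINDOWS) {
            forbidden = "\\/:*?\"<>|";
        } else if (platform == OS_UNIX) {
            forbidden = "\\/";
        } else {
            return true; // lenient default for an unrecognized platform
        }
        for (int i = 0; i < name.length(); i++) {
            if (forbidden.indexOf(name.charAt(i)) >= 0) {
                return false;
            }
        }
        return true;
    }

    /** Enum overload that only delegates, so no validation logic is duplicated. */
    static boolean validate(String name, Platform platform) {
        return validate(name, platform.magicNumber);
    }
}
```

Platform.values() would then supply the list of choices for a combo box, while code that already holds an int constant keeps working with the int overload.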

