TIP 339: Case-Insensitive Package Names

Login
Bounty program for improvements to Tcl and certain Tcl packages.
Author:         Andreas Kupries <andreask@activestate.com>
State:          Rejected
Type:           Project
Vote:           Done
Created:        14-Nov-2008
Post-History:   
Tcl-Version:    8.6

Abstract

This document proposes to change the package management facilities of the Tcl core to handle package names in a case-insensitive manner, switching back to case-sensitive operation if and only if explicitly requested by the user. The case of package names is preserved in the storage however.

Rationale

The package management facilities of Tcl are an area where the user of packages is currently burdened with more ... than is convenient or easy. A big problem is that Tcl compares package names case-sensitively. This means that Expect and expect are two different things to Tcl. In the real world however having two packages using the same name but different capitalization is extremely rare.

Yet the user of Tcl package facilities has to remember whatever ecclectic capitalization the package creator has chosen.

Changing this, i.e. having Tcl's package management go to case-insensitive comparison, will lift this burden and make things easier, as it is not necessary anymore to get the capitalization right. Most people will likely simply start to write all package names lower-case after this change.

Specification

Script Level

The only command affected by the proposed change is package. More specifically, its sub-commands ifneeded, present, require, and versions.

In detail

  • package require name ...

    This sub-command is changed to accept an additional option -strict.

    If this option is present the sub-command performs an old-style, i.e. case-sensitive search for the package name.

    Otherwise the command is changed to disregard case during the search for name. In that case it is possible that multiple packages are found which have the same name, just with different capitalizations. To resolve this conflict the command will load the package in the matching set which was added first to its memory database via package ifneeded.

    If that resolves to the wrong package the user has to use -strict and specify the exact name of the package required, capitalization and all. Given our rationale for the TIP this should be required seldomly.

  • package present ?-exact? name ?version?

    This sub-command is changed to accept an additional option -strict.

    If this option is present the sub-command performs an old-style, i.e. case-sensitive search for the package name.

    Otherwise the command is changed to disregard case during the search for name. In that case it is possible that multiple packages are found which have the same name, just with different capitalizations. To resolve this conflict the command will return the version of the package in the matching set it found first among the loaded packages.

    If that resolves to the wrong package the user has to use -strict and specify the exact name of the package to look for, capitalization and all. Given our rationale for the TIP this should be required seldomly.

  • package versions name

    This sub-command is changed to accept an additional option -strict.

    If this option is present the sub-command performs an old-style, i.e. case-sensitive search for the package name.

    Otherwise the command is changed to disregard case during the search for name. In that case it is possible that multiple packages are found which have the same name, just with different capitalizations. To resolve this conflict the command will return the version of the package in the matching set which was added first to its memory database via package ifneeded.

    If that resolves to the wrong package the user has to use -strict and specify the exact name of the package to query, capitalization and all. Given our rationale for the TIP this should be required seldomly.

  • package forget name ...

    This sub-command is changed to accept an additional option -strict.

    If this option is present the sub-command performs an old-style, i.e. case-sensitive search for the package name to forget.

    Otherwise the command is changed to disregard case during the search for name. In that case it is possible that multiple packages are found which have the same name, just with different capitalizations. Instead of resolving the conflict by forgetting only the first package found among the loaded this command will forget all of the packages matching the name.

    If that does more than intended the user has to use -strict and specify the exact name(s) of the package to forget, capitalization and all. Given our rationale for the TIP this should be required seldomly.

  • package provide name version

    This form of the command is left unchanged.

  • package provide name

    This form of the command is like package present, except for the different handling of a package not in memory. It is therefore changed in the same manner, i.e:

    This form of the sub-command is changed to accept an additional option -strict.

    If this option is present this form of the sub-command performs an old-style, i.e. case-sensitive search for the package name before returning its version, or the empty string if nothing was found.

    Otherwise this form of the sub-command is changed to disregard case during the search for name. In that case it is possible that multiple packages are found which have the same name, just with different capitalizations. To resolve this conflict the command will return the version of the package in the matching set it found first among the loaded packages.

    If that resolves to the wrong package the user has to use -strict and specify the exact name of the package to look for, capitalization and all. Given our rationale for the TIP this should be required seldomly.

  • package ifneeded name version script

    This form of the command is left unchanged. It keeps storing the package name as given, preserving capitalization, and package provide searches case-sensitively. This means that the name used by provide has to match the name used by the ifneeded, just like the version numbers have to match. While we are working on easing the load on the user of a package with regard to letter case we do assume that the developer of a package does know how it is spelled, capitalization and all, therefore a strict, case-sensitive search is best to detect such avoidable mismatches.

  • package ifneeded name version

    This form of the command is similar to package present or package provide. It is therefore changed in the same manner, i.e:

    This form of the sub-command is changed to accept an additional option -strict.

    If this option is present this form of the sub-command performs an old-style, i.e. case-sensitive search for the package name before returning the script, or the empty string if nothing was found.

    Otherwise this form of the sub-command is changed to disregard case during the search for name. In that case it is possible that multiple packages are found which have the same name, just with different capitalizations, and the same version. To resolve this conflict the command will return the script of the package in the matching set it found first among the loaded packages.

    If that resolves to the wrong package the user has to use -strict and specify the exact name of the package to look for, capitalization and all. Given our rationale for the TIP this should be required seldomly.

C API

At the C-level the only functions to consider are

  • Tcl_PkgPresent

  • Tcl_PkgPresentEx

  • Tcl_PkgRequire

  • Tcl_PkgRequireEx

  • Tcl_PkgRequireProc

The first four of these currently have an integer (boolean) argument exact. This argument is changed to an integer (bitset) flags, with the original exact value residing in bit 0. The strictness of the package name comparison is encoded in bit 1. For easy access to the flags we define

 #define TCL_PKG_EXACT  1  /* Exact version required */
 #define TCL_PKG_STRICT 2  /* Use strict (case-sensitive) package name
                            * comparison */

This change relies on the assumption that all (most) existing users of these functions use the constants 0 and 1 for exact, making them fully backward compatible.

The last function, Tcl_PkgRequireProc, has no arguments we can modify in this manner. Its behaviour is changed to perform case-insensitive searches, as specified above for the associated subcommands of package, and a new function is added for access to the old behaviour, i.e. case-sensitive searches. This new function is

  • Tcl_PkgRequireProcEx

It has the same signature as its pre-TIP counterpart, except that a flags argument is added after the clientData. This argument recognizes TCL_PKG_STRICT as defined above and uses it to switch between case-insensitive and strict search. The old Tcl_PkgRequireProc is reimplemented in terms of Tcl_PkgRequireProcEx.

Unknown handlers

The contract between the package management facilities in the Tcl core and the handler commands registered with package unknown is amended. The handler has to perform case-insensitive search for the package whose name it is invoked with.

Discussions

The contract for the package unknown handler is amended because not doing so would force the package management facilities in the Tcl core to search for a package by iterating over all possible combinations of upper/lower case in the package name. For a name containing N alphabetic characters (i.e. not counting ':'s, and digits) this means to loop 2**N times. This exponentional explosion and consequent running time is not acceptable (The longest name for a package currently found in ActiveState's public TEApot repository is 25 alpabetic characters, forcing over 30 million searches before the core can give up).

Further, of the two known standard package unknown handlers the handler for regular packages need not be modified at all as it always register every possible package it finds. Only the unknown handler for Tcl Modules has to be modified to perform case insensitive filesystem searches, and only for Unix. On Windows the builtin glob command already perform such.

Reference Implementation

An implementation patch is available at SourceForge http://sourceforge.net/support/tracker.php?aid=2316115 .

Copyright

This document has been placed in the public domain.

Please note that any correspondence to the author concerning this TIP is considered in the public domain unless otherwise specifically requested by the individual(s) authoring said correspondence. This is to allow information about the TIP to be placed in a public forum for discussion.

History