Title:

Unicode tip #10 - ANSI vs. Unicode UpperCase

Author: Bob Swart
Posted: 12/16/2008 5:55:29 PM (GMT+1)
Content:

There's an overloaded version of function UpperCase in the SysUtils unit as well as the AnsiStrings unit. To start with the unit AnsiStrings, the definition of UpperCase is as follows:

(* AnsiStrings *)
function UpperCase(const S: AnsiString): AnsiString; overload;
function UpperCase(const S: AnsiString; LocaleOptions: TLocaleOptions): AnsiString;
  overload; inline;

The definition of UpperCase in the SysUtils unit is using String instead of AnsiString, and is defined as follows:

(* SysUtils *)
function UpperCase(const S: string): string; overload;
function UpperCase(const S: string; LocaleOptions: TLocaleOptions): string;
  overload; inline;

Unfortunately, there is also a function called AnsiUpperCase inside SysUtils, which has the following signature:

function AnsiUpperCase(const S: string): string; overload;

Note that it works on (Unicode)Strings and not AnsiStrings! The source comment on UpperCase (and the help?), which points readers to AnsiUpperCase, says the following:
(* UpperCase converts all ASCII characters in the given string to upper case.
The conversion affects only 7-bit ASCII characters between 'a' and 'z'.
To convert 8-bit international characters, use AnsiUpperCase.
*)
This should say "Unicode characters" instead of "8-bit international characters", since AnsiUpperCase calls the Windows API CharUpperBuff, which in Delphi 2009 maps to the W-version (CharUpperBuffW)...
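
To make the difference concrete, here is a minimal console sketch, assuming Delphi 2009 or later (where string is a UnicodeString). The program name and constant names are made up for illustration, and the non-ASCII characters are written as character codes so the source file encoding does not matter.

```delphi
program UpperCaseDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  // 'abc' followed by e-acute (U+00E9) and Cyrillic small zhe (U+0436)
  Mixed       = 'abc'#$00E9#$0436;
  // Expected when only the 7-bit ASCII letters a..z are converted
  AsciiOnlyUp = 'ABC'#$00E9#$0436;
  // Expected when every character with an upper-case form is converted
  FullUp      = 'ABC'#$00C9#$0416;

var
  S: string; // UnicodeString in Delphi 2009 and later
begin
  S := Mixed;

  // SysUtils.UpperCase only touches 'a'..'z'
  if UpperCase(S) = AsciiOnlyUp then
    Writeln('UpperCase: only a..z were converted')
  else
    Writeln('UpperCase: unexpected result');

  // SysUtils.AnsiUpperCase goes through the Windows API and,
  // despite its name, works on Unicode characters
  if AnsiUpperCase(S) = FullUp then
    Writeln('AnsiUpperCase: non-ASCII characters were converted too')
  else
    Writeln('AnsiUpperCase: unexpected result');
end.
```

The comparisons are done in code rather than by printing the strings, because a console window may not display the non-ASCII characters correctly.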

To remedy this potential confusion, I suggest introducing a new function, UnicodeUpperCase (not WideUpperCase, since that one already exists for WideStrings).

The same applies to AnsiLowerCase (calling the CharLowerBuff API), AnsiCompareStr (calling the CompareString API), AnsiSameStr, AnsiCompareText (calling the CompareString API), AnsiSameText, AnsiCompareFileName (calling CompareStr), AnsiLowerCaseFileName (calling AnsiLowerCase), AnsiUpperCaseFileName (calling AnsiUpperCase), and AnsiPos (calling StrPosLen). UnicodeUpperCase should call CharUpperBuffW, while AnsiUpperCase should move to the AnsiStrings unit and call CharUpperBuffA.
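
As a rough sketch of what that split could look like: UnicodeUpperCase below is the hypothetical function proposed in this tip (it is not part of the RTL), and AnsiOnlyUpperCase is a made-up name used here only to avoid clashing with the existing AnsiUpperCase. Both simply copy the input and hand the copy to the matching Windows API, assuming Delphi 2009 or later on Windows.

```delphi
program UnicodeUpperCaseSketch;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Hypothetical: the name suggested in this tip, not an existing RTL function.
function UnicodeUpperCase(const S: UnicodeString): UnicodeString;
var
  Len: Integer;
begin
  Len := Length(S);
  // Make a private copy so the caller's string is not modified in place
  SetString(Result, PWideChar(S), Len);
  if Len > 0 then
    CharUpperBuffW(PWideChar(Result), Len);
end;

// Hypothetical name: an AnsiString-only counterpart calling the "A" API.
function AnsiOnlyUpperCase(const S: AnsiString): AnsiString;
var
  Len: Integer;
begin
  Len := Length(S);
  SetString(Result, PAnsiChar(S), Len);
  if Len > 0 then
    CharUpperBuffA(PAnsiChar(Result), Len);
end;

var
  U: UnicodeString;
  A: AnsiString;
begin
  U := 'abc'#$0436; // 'abc' plus Cyrillic small zhe (U+0436)
  if UnicodeUpperCase(U) = 'ABC'#$0416 then
    Writeln('UnicodeUpperCase converted the Cyrillic letter as well')
  else
    Writeln('UnicodeUpperCase: unexpected result');

  A := 'abc';
  if AnsiOnlyUpperCase(A) = 'ABC' then
    Writeln('AnsiOnlyUpperCase converted the AnsiString')
  else
    Writeln('AnsiOnlyUpperCase: unexpected result');
end.
```

With explicit names like these, the string type and the API variant being called are visible at the call site, which is the point of the suggestion.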

I also suggest a number of UnicodeXXX functions to call the "W"-APIs and a change to the AnsiXXX functions to call the "A"-APIs. The latter is already implemented: the AnsiStrings unit contains the AnsiXXX functions. So the only "strange" thing is the names of the AnsiXXX functions inside the SysUtils unit, which should IMHO be UnicodeXXX functions instead.

The reason for not changing these names from AnsiXXX to UnicodeXXX is probably to avoid breaking existing code. On the other hand, it would be a good idea for people to think explicitly about which function to use...

This tip is the 10th in a series of Unicode tips taken from my Delphi 2009 Development Essentials book published earlier this week on Lulu.com.



1 Comment

Author: Fabricio
Posted: 08/12/18 21:43:33
Comment: These ANSI functions always resided in SysUtils, moving them would break a lot of code



This webpage © 2005-2014 by Bob Swart (aka Dr.Bob - www.drbob42.com). All Rights Reserved.