13.34 <locale>

The <locale> header declares class and function templates for internationalization and localization. It supports conversion between narrow and wide character sets, character classification and collation, formatting and parsing numbers, currency, dates and times, and retrieving messages. For example, every I/O stream has a locale, which it uses to parse formatted input or to format output.

A locale is an embodiment of a set of cultural conventions, including information about the native character set, how dates are formatted, which symbol to use for currency, and so on. Each set of related attributes is called a facet, and facets are grouped into categories.

The categories are fixed and defined by the standard (see Table 13-20, under locale::category, for a complete list), and each category has several predefined facets. For example, one of the facets in the time category is time_get<charT, InputIter>, which specifies rules for parsing a time string. You can define additional facets; see the description of the locale::facet class in this section for details.

Many of the facets come in two flavors: plain and named. The plain versions implement default behavior, and the named versions implement the behavior for a named locale. See the locale class later in this section for a discussion of locale names.

When a program starts, the global locale is initialized to the "C" locale, and the standard I/O streams use this locale for character conversions and formatting. A program can change the locale at any time; see the locale class in this section for details.
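As a minimal sketch of how a stream's locale governs formatting, the following helpers imbue a string stream with a given locale before writing to it. The function names format_with_locale and stream_locale_name are hypothetical, chosen for illustration only; the example sticks to the classic "C" locale because it is the only locale every implementation is certain to provide.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Formats a value through a stream imbued with loc, so the stream's
// own locale (not the global locale) controls the formatting.
std::string format_with_locale(long value, const std::locale& loc)
{
    std::ostringstream out;
    out.imbue(loc);   // replace the stream's locale
    out << value;
    return out.str();
}

// Returns the name of the locale that a stream reports after imbue.
std::string stream_locale_name(const std::locale& loc)
{
    std::ostringstream out;
    out.imbue(loc);
    return out.getloc().name();
}
```

With std::locale::classic( ), numbers are written without digit grouping, and the reported locale name is "C".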

The C++ <locale> header provides more functionality than the C <clocale>, <cctype>, and <cwchar> headers, especially the ability to extend a locale with your own facets. On the other hand, facets and the locale class are more complicated than the C functions. For simple character classification in a single locale, you are probably better off using the C functions. If a program must work with multiple locales simultaneously, use the C++ locale class.

codecvt class template Facet for mapping one character set to another

template <typename internT,typename externT,typename stateT>

class codecvt : public locale::facet, public codecvt_base

{

public:

  typedef internT intern_type;

  typedef externT extern_type;

  typedef stateT state_type;

  explicit codecvt(size_t refs = 0);

  result out(stateT& state, const internT* from, const internT* from_end,

             const internT*& from_next, externT* to, externT* to_limit, 

             externT*& to_next) const;

  result unshift(stateT& state, externT* to, externT* to_limit, 

                 externT*& to_next) const;

  result in(stateT& state, const externT* from, const externT* from_end,

            const externT*& from_next, internT* to, internT* to_limit, 

            internT*& to_next) const;

  int encoding(  ) const throw(  );

  bool always_noconv(  ) const throw(  );

  int length(stateT&, const externT* from, const externT* end, size_t max)

    const;

  int max_length(  ) const throw(  );

  static locale::id id;

protected:

  virtual ~codecvt(  );

  virtual result do_out(stateT& state, const internT* from, 

                        const internT* from_end, const internT*& from_next, 

                        externT* to, externT* to_limit, externT*& to_next)

    const;

  virtual result do_in(stateT& state, const externT* from, 

                       const externT* from_end, const externT*& from_next,

                       internT* to, internT* to_limit, internT*& to_next)

    const;

  virtual result do_unshift(stateT& state, externT* to, externT* to_limit,

                            externT*& to_next) const;

  virtual int do_encoding(  ) const throw(  );

  virtual bool do_always_noconv(  ) const throw(  );

  virtual int do_length(stateT&, const externT* from, const externT* end,

                        size_t max) const;

  virtual int do_max_length(  ) const throw(  );

};

The codecvt template converts characters from one character encoding to another. It is most often used to convert multibyte characters to and from wide characters.

The following template specializations are required by the standard:

codecvt<wchar_t, char, mbstate_t>

Converts multibyte narrow characters to wide characters (in) and wide to multibyte (out)

codecvt<char, char, mbstate_t>

A no-op, "converting" characters to themselves

As with other facets, the public members of codecvt call virtual, protected members with the same name prefaced by do_. Thus, to use the facet, call the public functions, such as in and out, which in turn call do_in and do_out. The descriptions below are for the virtual functions because they do the real work. Imagine that for each virtual function description, there is a corresponding description for a public, nonvirtual function, such as:

bool always_noconv( ) const throw( )

Returns do_always_noconv( )

The following are the virtual, protected members of codecvt:

virtual bool do_always_noconv( ) const throw( )

Returns true if the codecvt object does not actually perform any conversions, that is, in and out are no-ops. For example, for the specialization codecvt<char,char,mbstate_t>, do_always_noconv always returns true.

virtual int do_encoding( ) const throw( )

Returns the number of externT characters needed to represent a single internT character. If this number is not a fixed constant, the return value is 0. The return value is -1 if externT character sequences are state-dependent.
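The two query functions described above can be exercised with the required codecvt<char, char, mbstate_t> specialization, as in this small sketch. The typedef and function names are hypothetical; the facet is fetched from whatever locale the caller supplies.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>

typedef std::codecvt<char, char, std::mbstate_t> identity_codecvt;

// True if the char-to-char facet of loc never performs a conversion.
bool identity_is_noop(const std::locale& loc)
{
    return std::use_facet<identity_codecvt>(loc).always_noconv();
}

// Number of external characters per internal character for the
// char-to-char facet of loc (a fixed, positive value is expected).
int identity_encoding(const std::locale& loc)
{
    return std::use_facet<identity_codecvt>(loc).encoding();
}
```

A caller can test always_noconv( ) first and skip the conversion buffers entirely when it returns true.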

virtual result do_in(stateT& state, const externT* from, const externT* from_end, const externT*& from_next, internT* to, internT* to_limit, internT*& to_next) const

Converts externT characters to internT characters. The characters in the range [from, from_end) are converted and stored in the array starting at to. Conversion stops at an error, when the source range is exhausted, or when the destination array is full, so the number of characters converted is at most the minimum of from_end - from and to_limit - to.

The from_next parameter is set to point to the value in [from, from_end) where the conversion stopped, and to_next points to the value in [to, to_limit) where the conversion stopped. If no conversion was performed, from_next is the same as from, and to_next is equal to to.

The return value is a result, as described in Table 13-19 (under the codecvt_base class).
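The calling convention for in can be sketched as follows. The helper name widen_with_codecvt is hypothetical, and the sketch makes two simplifying assumptions: each narrow character produces at most one wide character (so a buffer of the source length suffices), and any result other than ok is treated as a failure rather than being retried.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Converts a narrow (possibly multibyte) string to a wide string
// with the codecvt<wchar_t, char, mbstate_t> facet of loc.
std::wstring widen_with_codecvt(const std::string& narrow,
                                const std::locale& loc)
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> converter;
    const converter& cvt = std::use_facet<converter>(loc);

    std::mbstate_t state = std::mbstate_t();        // initial shift state
    std::vector<wchar_t> buffer(narrow.size() + 1); // never empty
    const char* from = narrow.data();
    const char* from_next = from;
    wchar_t* to_next = &buffer[0];

    converter::result r = cvt.in(state,
                                 from, from + narrow.size(), from_next,
                                 &buffer[0], &buffer[0] + buffer.size(),
                                 to_next);
    if (r != converter::ok)
        throw std::runtime_error("codecvt::in did not finish the conversion");

    // to_next points one past the last wide character written.
    return std::wstring(&buffer[0], to_next);
}
```

A production version would loop on partial, refilling the source or enlarging the destination, instead of throwing.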

virtual int do_length(stateT&, const externT* from, const externT* from_end, size_t max) const

Returns the number of externT characters in the range [from, from_end) that are used to convert to internT characters. At most, max internT characters are converted.

virtual int do_max_length( ) const throw( )

Returns the maximum number of externT characters needed to represent a single internT character, that is, the maximum value that do_length can return when max is 1.

virtual result do_out(stateT& state, const internT* from, const internT* from_end, const internT*& from_next, externT* to, externT* to_limit, externT*& to_next) const

Converts internT characters to externT characters. The characters in the range [from, from_end) are converted and stored in the array starting at to. Conversion stops at an error, when the source range is exhausted, or when the destination array is full, so the number of characters converted is at most the minimum of from_end - from and to_limit - to.

The from_next parameter is set to point to the value in [from, from_end) where the conversion stopped, and to_next points to the value in [to, to_limit) where the conversion stopped. If no conversion was performed, from_next is the same as from, and to_next is equal to to.

The return value is a result, as described in Table 13-19 (under the codecvt_base class).

virtual result do_unshift(stateT& state, externT* to, externT* to_limit, externT*& to_next) const

Ends a shift state by storing characters in the array starting at to such that the characters undo the state shift given by state. Up to to_limit - to characters are written, and to_next is set to point to one past the last character written into to.

The return value is a result, as described in Table 13-19 (under the codecvt_base class).

See Also

codecvt_base class, codecvt_byname class template, locale::facet class

codecvt_base class Base class for the codecvt template

class codecvt_base {

public:

  enum result { ok, partial, error, noconv };

};

The codecvt_base class is the base class for the codecvt and codecvt_byname class templates. It declares the result type, which is the type returned by the do_in, do_out, and do_unshift conversion functions. Table 13-19 lists the literals of the result enumerated type.

Table 13-19. codecvt_base::result literals

Literal    Description

error      Error in conversion (e.g., invalid state or multibyte character sequence)

noconv     No conversion (or unshift terminated) needed

ok         Conversion finished successfully

partial    Not all source characters converted, or unshift sequence is incomplete
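A caller typically branches on the result value after each conversion call. The following sketch maps each literal to a short diagnostic string; the function name describe_result and the wording of the strings are hypothetical.

```cpp
#include <cassert>
#include <locale>
#include <string>

// Maps a codecvt_base::result value to a short diagnostic string.
const char* describe_result(std::codecvt_base::result r)
{
    switch (r) {
    case std::codecvt_base::ok:      return "ok";
    case std::codecvt_base::partial: return "partial";
    case std::codecvt_base::error:   return "error";
    case std::codecvt_base::noconv:  return "noconv";
    }
    return "unknown";   // not reached for valid enumerators
}
```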

See Also

codecvt class template, codecvt_byname class template

codecvt_byname class template Facet for mapping one character set to another

template<typename internT, typename externT, typename stateT>

class codecvt_byname :

  public codecvt<internT, externT, stateT>

{

public:

  explicit codecvt_byname(const char*, size_t refs = 0);

protected:

  //  . . .  Same virtual functions as in codecvt

};

The codecvt_byname class template converts characters from one character encoding to another using the rules of a named locale. The codecvt_byname<char,char,mbstate_t> and codecvt_byname<wchar_t,char,mbstate_t> instantiations are standard.

See Also

codecvt class template, locale::facet class

collate class template Facet for comparing strings in collation order

template <typename charT>

class collate : public locale::facet

{

public:

  typedef charT char_type;

  typedef basic_string<charT> string_type;

  explicit collate(size_t refs = 0);

  int compare(const charT* low1, const charT* high1, const charT* low2,

              const charT* high2) const;

  string_type transform(const charT* low, const charT* high) const;

  long hash(const charT* low, const charT* high) const;

  static locale::id id;

protected:

  virtual ~collate(  );

  virtual int do_compare(const charT* low1, const charT* high1, 

                         const charT* low2, const charT* high2) const;

  virtual string_type do_transform (const charT* low, const charT* high) const;

  virtual long do_hash (const charT* low, const charT* high) const;

};

The collate class template is a facet used to compare strings. In some locales, the collation order of characters is not the same as the numerical order of their encodings, and some characters might be logically equivalent even if they have different encodings.

You can use a locale object as a comparator for algorithms that need a comparison function; the locale's operator( ) function uses the collate facet to perform the comparison.

The standard mandates the collate<char> and collate<wchar_t> instantiations, which perform lexicographical (element-wise, numerical) comparison. See lexicographical_compare in <algorithm> earlier in this chapter.
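Passing a locale object directly as the comparator might look like the following sketch. The function name sorted_by_locale is hypothetical; the locale's operator( ) compares two strings through the collate facet.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <string>
#include <vector>

// Returns a copy of words sorted in the collation order of loc.
// The locale object itself serves as the comparison function.
std::vector<std::string> sorted_by_locale(std::vector<std::string> words,
                                          const std::locale& loc)
{
    std::sort(words.begin(), words.end(), loc);
    return words;
}
```

In the classic "C" locale the order is the numerical order of the character codes, so in ASCII an uppercase "Apple" sorts before a lowercase "banana".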

As with other facets, the public members call virtual, protected members with the same name prefaced by do_. Thus, to use the facet, call the public functions, such as compare, which calls do_compare. The descriptions below are for the virtual functions because they do the real work. Imagine that for each virtual function description, there is a corresponding description for a public, nonvirtual function, such as:

int compare(const charT* low1, const charT* high1, const charT* low2, const charT* high2) const

Returns do_compare(low1, high1, low2, high2)

The following are the virtual, protected members of collate:

virtual int do_compare(const charT* low1, const charT* high1, const charT* low2, const charT* high2) const

Compares the character sequence [low1, high1) with the character sequence [low2, high2). The return value is one of the following:

  • -1 if sequence 1 is less than sequence 2

  • 0 if the sequences are equal

  • 1 if sequence 1 is greater than sequence 2
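The three return values listed above can be observed through the public compare function, as in this sketch. The wrapper name collate_compare is hypothetical; it only adapts std::string arguments to the pointer ranges that compare expects.

```cpp
#include <cassert>
#include <locale>
#include <string>

// Compares a and b in the collation order of loc.
// Returns -1, 0, or 1, as reported by collate<char>::compare.
int collate_compare(const std::string& a, const std::string& b,
                    const std::locale& loc)
{
    const std::collate<char>& coll = std::use_facet<std::collate<char> >(loc);
    return coll.compare(a.data(), a.data() + a.size(),
                        b.data(), b.data() + b.size());
}
```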

virtual long do_hash (const charT* low, const charT* high) const

Returns a hash value for the character sequence [low, high). If do_compare returns 0 for two character sequences, do_hash returns the same value for the two sequences. The reverse is not necessarily the case.

virtual string_type do_transform(const charT* low, const charT* high) const

Transforms the character sequence [low, high) into a string that can be compared (as a simple lexicographical comparison) with another transformed string to obtain the same result as calling do_compare on the original character sequences. The do_transform function is useful if a program needs to compare the same character sequence many times.
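A sketch of how transform and hash might be wrapped for everyday use follows. The names sort_key and collation_hash are hypothetical. The idea is to compute a key once per string and then compare keys with the ordinary string operators instead of calling compare repeatedly.

```cpp
#include <cassert>
#include <locale>
#include <string>

// Returns a key whose plain lexicographical order matches the
// collation order of s in loc.
std::string sort_key(const std::string& s, const std::locale& loc)
{
    const std::collate<char>& coll = std::use_facet<std::collate<char> >(loc);
    return coll.transform(s.data(), s.data() + s.size());
}

// Returns the collate facet's hash value for s; strings that compare
// equal in loc produce equal hash values.
long collation_hash(const std::string& s, const std::locale& loc)
{
    const std::collate<char>& coll = std::use_facet<std::collate<char> >(loc);
    return coll.hash(s.data(), s.data() + s.size());
}
```

Caching the keys pays off when the same strings take part in many comparisons, for example during a large sort.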

See Also

collate_byname class template, locale class, locale::facet class

collate_byname class template Facet for comparing strings in collation order

template <typename charT>

class collate_byname : public collate<charT>

{

public:

  typedef basic_string<charT> string_type;

  explicit collate_byname(const char*, size_t refs = 0);

protected:

  //  . . .  Same virtual functions as in collate

};

The collate_byname class template compares strings using a named locale's collation order. The collate_byname<char> and collate_byname<wchar_t> instantiations are standard.

See Also

collate class template, locale::facet class

ctype class template Facet for classifying characters

template <typename charT>

class ctype : public locale::facet, public ctype_base

{

public:

  typedef charT char_type;

  explicit ctype(size_t refs = 0);

  bool is(mask m, charT c) const;

  const charT* is(const charT* low, const charT* high, mask* vec) const;

  const charT* scan_is(mask m, const charT* low, const charT* high) const;

  const charT* scan_not(mask m, const charT* low, const charT* high) const;

  charT toupper(charT c) const;

  const charT* toupper(charT* low, const charT* high) const;

  charT tolower(charT c) const;

  const charT* tolower(charT* low, const charT* high) const;

  charT widen(char c) const;

  const char* widen(const char* low, const char* high, charT* to) const;

  char narrow(charT c, char dfault) const;

  const charT* narrow(const charT* low, const charT*, char dfault, char* to)

    const;

  static locale::id id;

protected:

  virtual ~ctype(  );

  virtual bool do_is(mask m, charT c) const;

  virtual const charT* do_is(const charT* low, const charT* high, mask* vec)

    const;

  virtual const charT* do_scan_is(mask m, const charT* low, 

                                  const charT* high) const;

  virtual const charT* do_scan_not(mask m, const charT* low, 

                                   const charT* high) const;

  virtual charT do_toupper(charT) const;

  virtual const charT* do_toupper(charT* low, const charT* high) const;

  virtual charT do_tolower(charT) const;

  virtual const charT* do_tolower(charT* low, const charT* high) const;

  virtual charT do_widen(char) const;

  virtual const char* do_widen(const char* low, const char* high, charT* dest)

    const;

  virtual char do_narrow(charT, char dfault) const;

  virtual const charT* do_narrow(const charT* low, const charT* high, 

                                 char dfault, char* dest) const;

};

The ctype class template is a facet for classifying characters.

The ctype<char> specialization is described in its own section later in this chapter. The standard also mandates the ctype<wchar_t> instantiation. Both instantiations depend on the implementation's native character set.

As with other facets, the public members call virtual, protected members with the same name prefaced by do_. Thus, to use the facet, call the public functions, such as narrow, which calls do_narrow. The descriptions below are for the virtual functions because they do the real work. Imagine that for each virtual function description, there is a corresponding description for a public, nonvirtual function, such as:

bool is(mask m, charT c) const

Returns do_is(m, c)

The following are the virtual, protected members of ctype:

virtual bool do_is(mask m, charT c) const
virtual const charT* do_is(const charT* low, const charT* high, mask* dest) const

Classifies a single character c or a sequence of characters [low, high). The first form tests the classification mask, M, of c and returns (M & m) != 0. The second form determines the mask for each character in the range and stores the mask values in the dest array (which must be large enough to hold high - low masks), returning high. See Table 13-20 (under the ctype_base class) for a description of the mask type.
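Both forms of is can be tried through the public interface, as in this sketch for narrow characters. The function names is_alpha_in and classify are hypothetical.

```cpp
#include <cassert>
#include <locale>
#include <string>
#include <vector>

// Single-character form: tests c against the alpha mask in loc.
bool is_alpha_in(char c, const std::locale& loc)
{
    return std::use_facet<std::ctype<char> >(loc).is(std::ctype_base::alpha, c);
}

// Range form: returns one classification mask per character of text.
std::vector<std::ctype_base::mask> classify(const std::string& text,
                                            const std::locale& loc)
{
    std::vector<std::ctype_base::mask> masks(text.size());
    if (!text.empty()) {
        const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
        ct.is(text.data(), text.data() + text.size(), &masks[0]);
    }
    return masks;
}
```

Each returned mask can then be tested against any literal, for example (masks[i] & std::ctype_base::digit) != 0.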

virtual char do_narrow(charT c, char dfault) const
virtual const charT* do_narrow(const charT* low, const charT* high, char dfault, char* dest) const

Converts a character c or a sequence of characters [low, high) to narrow characters of type char. The first form returns the narrow character, and the second form stores the characters in the array dest (which must be large enough to hold high - low characters), returning high. If a charT source character cannot be converted to a narrow character, the first function returns dfault, and the second function stores dfault in dest as the narrow version of that character.

virtual const charT* do_scan_is(mask m, const charT* low, const charT* high) const

Searches the sequence of characters [low, high) for the first character that matches m, that is, for which do_is(m, c) is true. The return value is a pointer to the first matching character, or high if no characters match m.

virtual const charT* do_scan_not(mask m, const charT* low, const charT* high) const

Searches the sequence of characters [low, high) for the first character that does not match m, that is, for which do_is(m, c) is false. The return value is a pointer to the first nonmatching character, or high if every character matches m.
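The two scanning functions lend themselves to small tokenizing helpers, sketched here for narrow characters. The names skip_spaces and find_digit are hypothetical.

```cpp
#include <cassert>
#include <locale>

// Returns a pointer to the first non-whitespace character in
// [low, high), or high if the whole range is whitespace.
const char* skip_spaces(const char* low, const char* high,
                        const std::locale& loc)
{
    return std::use_facet<std::ctype<char> >(loc)
        .scan_not(std::ctype_base::space, low, high);
}

// Returns a pointer to the first digit in [low, high), or high if
// the range contains no digit.
const char* find_digit(const char* low, const char* high,
                       const std::locale& loc)
{
    return std::use_facet<std::ctype<char> >(loc)
        .scan_is(std::ctype_base::digit, low, high);
}
```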

virtual charT do_tolower(charT c) const
virtual const charT* do_tolower(charT* low, const charT* high) const

Converts a character c or a sequence of characters [low, high) to lowercase. The first form returns the lowercase version of c, or it returns c if c does not have a lowercase counterpart.

The second form modifies the character sequence: each character in [low, high) is replaced by its lowercase counterpart; if a character cannot be converted to lowercase, it is not touched. The function returns high.

virtual charT do_toupper(charT c) const
virtual const charT* do_toupper(charT* low, const charT* high) const

Converts a character c or a sequence of characters [low, high) to uppercase. The first form returns the uppercase version of c, or it returns c if c does not have an uppercase counterpart.

The second form modifies the character sequence: each character in [low, high) is replaced by its uppercase counterpart; if a character cannot be converted to uppercase, it is not touched. The function returns high.
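The range forms of toupper and tolower modify the characters in place, which this sketch uses to convert a whole string. The function name to_upper_copy is hypothetical; it works on a copy so the caller's string is left alone.

```cpp
#include <cassert>
#include <locale>
#include <string>

// Returns a copy of text with every character that has an uppercase
// counterpart in loc converted; other characters are unchanged.
std::string to_upper_copy(std::string text, const std::locale& loc)
{
    if (!text.empty()) {
        const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
        ct.toupper(&text[0], &text[0] + text.size());
    }
    return text;
}
```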

virtual charT do_widen(char c) const
virtual const char* do_widen(const char* low, const char* high, charT* dest) const

Converts a narrow character c or a sequence of narrow characters [low, high) to characters of type charT. The first form returns the new character, and the second form stores the characters in the array dest (which must be large enough to hold high - low characters), returning high.
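The range forms of widen and narrow can be wrapped for whole strings, as in the following sketch, which uses the ctype<wchar_t> facet. The function names are hypothetical. Note how narrow_string substitutes dfault for any wide character that has no narrow equivalent in the locale.

```cpp
#include <cassert>
#include <locale>
#include <string>

// Widens each char of text to wchar_t using the ctype facet of loc.
std::wstring widen_string(const std::string& text, const std::locale& loc)
{
    std::wstring result(text.size(), L'\0');
    if (!text.empty()) {
        const std::ctype<wchar_t>& ct = std::use_facet<std::ctype<wchar_t> >(loc);
        ct.widen(text.data(), text.data() + text.size(), &result[0]);
    }
    return result;
}

// Narrows each wchar_t of text to char; characters that cannot be
// represented are replaced by dfault.
std::string narrow_string(const std::wstring& text, char dfault,
                          const std::locale& loc)
{
    std::string result(text.size(), '\0');
    if (!text.empty()) {
        const std::ctype<wchar_t>& ct = std::use_facet<std::ctype<wchar_t> >(loc);
        ct.narrow(text.data(), text.data() + text.size(), dfault, &result[0]);
    }
    return result;
}
```

These functions map characters one to one; converting multibyte encodings is the job of the codecvt facet.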

See Also

ctype_base class, ctype_byname class template, locale::facet class

ctype<char> class Facet for classifying narrow characters

template <>

class ctype<char> : public locale::facet, public ctype_base

{

  ...

public:

  explicit ctype(const mask* tab = 0, bool del = false, size_t refs = 0);

  static const size_t table_size =  . . . ;

  inline bool is(mask m, char c) const;

  inline const char* is(const char* low, const char* high, mask* vec) const;

  inline const char* scan_is(mask m, const char* low, const char* high) const;

  inline const char* scan_not(mask m, const char* low, const char* high) const;

protected:

  virtual ~ctype(  );

  inline const mask* table(  ) const throw(  );

  inline static const mask* classic_table(  ) throw(  );

};

The ctype<> class template is specialized for type char (but not signed char or unsigned char) so the member functions can be implemented as inline functions. The standard requires the implementation to have the protected member functions table and classic_table. Each of these functions returns an array of mask values indexed by characters cast to unsigned char. The number of elements in a table must be at least table_size, which is an implementation-defined constant value.

The following are the key member functions:

explicit ctype(const mask* tab = 0, bool del = false, size_t refs = 0)

Initializes the table( ) pointer with tab. If tab is a null pointer, table( ) is set to classic_table( ). If tab is not null, and del is true, the ctype object owns the table, and when the ctype destructor is called, it will delete the table. The refs parameter is passed to the base class, as with any facet.

virtual ~ctype( )

If the constructor's del flag was true, and tab was not a null pointer, performs delete[] tab.

inline bool is(mask m, char c) const
inline const char* is(const char* low, const char* high, mask* dest) const

Tests character classifications. The first form returns:

(table(  )[static_cast<unsigned char>(c)] & m) != 0

The second form stores the following in dest for each element c of the range [low, high):

table(  )[static_cast<unsigned char>(c)]

Note that is does not call do_is, so is can be implemented as an inline function.

inline static const mask* classic_table( ) throw( )

Returns a table that corresponds to the "C" locale.

inline const char* scan_is(mask m, const char* low, const char* high) const

Searches the sequence of characters [low, high) for the first character that matches m, that is, for which is(m, c) is true. The return value is a pointer to the first matching character, or high if no characters match m.

inline const char* scan_not(mask m, const char* low, const char* high) const

Searches the sequence of characters [low, high) for the first character that does not match m, that is, for which is(m, c) is false. The return value is a pointer to the first nonmatching character, or high if every character matches m.

inline const mask* table( ) const throw( )

Returns the table pointer: the value that was passed to the constructor as the tab parameter or, if tab was null, classic_table( ).
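The table-driven design makes it fairly easy to derive a facet with a modified classification table. The sketch below is one common way to do so: it copies classic_table( ) and reclassifies the comma as whitespace, so that formatted input skips commas. The class name comma_space_ctype and the function parse_comma_separated are hypothetical, and the table is kept in a function-local static so that the facet does not own it (del is false).

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical facet: like the "C" ctype<char>, except that ','
// is classified as whitespace.
class comma_space_ctype : public std::ctype<char>
{
public:
    explicit comma_space_ctype(std::size_t refs = 0)
    : std::ctype<char>(make_table(), false, refs)  // facet does not own the table
    {}

private:
    static const mask* make_table()
    {
        // Copy the "C" table once, then reclassify the comma.
        static std::vector<mask> table(classic_table(),
                                       classic_table() + table_size);
        table[static_cast<unsigned char>(',')] = space;
        return &table[0];
    }
};

// Reads integers separated by commas and/or ordinary whitespace.
std::vector<int> parse_comma_separated(const std::string& text)
{
    std::istringstream in(text);
    // The locale takes ownership of the new facet (refs == 0).
    in.imbue(std::locale(std::locale::classic(), new comma_space_ctype));
    std::vector<int> values;
    int value;
    while (in >> value)
        values.push_back(value);
    return values;
}
```

Because the stream's whitespace skipping consults the imbued ctype<char> facet, no explicit comma handling is needed in the reading loop.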

See Also

ctype class template, locale::facet class, <cctype>, <cwctype>

ctype_base class Base class for ctype facet

class ctype_base{

public:

  enum mask {

    space, print, cntrl, upper, lower, alpha, digit, punct, xdigit,

    alnum=alpha|digit, graph=alnum|punct

  };

};

The ctype_base class is the base class for the ctype and ctype_byname class templates. It declares the mask enumerated type, which is used for classifying characters. Table 13-20 describes the mask literals and their definitions for the classic "C" locale.

Table 13-20. mask literals for classifying characters

Literal   Description                                                    "C" locale

alpha     Alphabetic (a letter)                                          lower or upper

alnum     Alphanumeric (letter or digit)                                 alpha or digit

cntrl     Control (nonprintable)                                         Not print

digit     '0'-'9'                                                        All locales

graph     Character that occupies graphical space                        print but not space

lower     Lowercase letter                                               'a'-'z'

print     Printable character (alphanumeric, punctuation, space, etc.)   Depends on character set; in ASCII: '\x20'-'\x7e'

space     Whitespace                                                     ' ', '\f', '\n', '\r', '\t', '\v'

upper     Uppercase letter                                               'A'-'Z'

xdigit    Hexadecimal digit ('0'-'9', 'a'-'f', 'A'-'F')                  All locales

See Also

ctype class template, ctype_byname class template

ctype_byname class template Facet for classifying characters

template <typename charT>

class ctype_byname : public ctype<charT>

{

public:

  typedef ctype<charT>::mask mask;

  explicit ctype_byname(const char*, size_t refs = 0);

protected:

  //  . . .  Same virtual functions as in ctype

};

The ctype_byname class template is a facet for classifying characters; it uses a named locale. The ctype_byname<char> and ctype_byname<wchar_t> instantiations are standard.

See Also

ctype class template, ctype_byname<char> class

ctype_byname<char> class Facet for classifying narrow characters

template <>

class ctype_byname<char> : public ctype<char>

{

public:

  explicit ctype_byname(const char*, size_t refs = 0);

protected:

  //  . . .  Same virtual functions as in ctype<char>

};

The ctype_byname<char> class specializes the ctype_byname template for type char. (No specialization exists for signed char and unsigned char.) It derives from ctype<char>, so it inherits its table-driven implementation.

See Also

ctype<char> class, ctype_byname class template

has_facet function template Test for existence of a facet in a locale

template <typename Facet>

bool has_facet(const locale& loc) throw(  );

The has_facet function determines whether the locale loc supports the facet Facet. It returns true if the facet is supported or false if it is not. Call has_facet to determine whether a locale supports a user-defined facet. (Every locale must support the standard facets that are described in this section.) Example 13-24 shows how has_facet is used.

Example

Example 13-24. Testing whether a locale supports a facet
// The units facet is defined under the locale::facet class (later in this

// section).

   

using std::locale;

if (std::has_facet<units>(locale(  ))) {

  // Get a reference to the units facet of the locale.

  const units& u = std::use_facet<units>(locale(  ));

  // Construct a value of 42 cm.

  units::value_t len = u.make(42, units::cm);

  // Print the length (42 cm) in the locale's preferred units.

  u.length_put(std::cout, len);

}
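For a version that compiles on its own, the following sketch defines a tiny user-defined facet and guards its use with has_facet. The facet greeting_facet and the function greeting_or are hypothetical and exist only for this illustration; they are unrelated to the units facet used above.

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical user-defined facet, for illustration only.
class greeting_facet : public std::locale::facet
{
public:
    static std::locale::id id;   // every facet needs a static id member
    explicit greeting_facet(std::size_t refs = 0)
    : std::locale::facet(refs) {}
    const char* greeting() const { return "hello"; }
};

std::locale::id greeting_facet::id;

// Returns the greeting stored in loc, or fallback if loc does not
// contain a greeting_facet.
const char* greeting_or(const std::locale& loc, const char* fallback)
{
    if (std::has_facet<greeting_facet>(loc))
        return std::use_facet<greeting_facet>(loc).greeting();
    return fallback;
}
```

Checking with has_facet first avoids the bad_cast exception that use_facet throws when the facet is missing.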

See Also

locale class, use_facet function template

isalnum function template Determines whether a character is alphanumeric in a locale

template <typename charT>

bool isalnum(charT c, const locale& loc);

The isalnum function determines whether the character c is an alphanumeric character in the locale loc. It returns the following:

use_facet<ctype<charT> >(loc).is(ctype_base::alnum, c)
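As a brief usage sketch, the two-argument isalnum can be applied to each character of a string. The function name count_alnum is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
#include <string>

// Counts the characters of text that are alphanumeric in loc.
std::size_t count_alnum(const std::string& text, const std::locale& loc)
{
    std::size_t count = 0;
    for (std::string::const_iterator it = text.begin(); it != text.end(); ++it) {
        if (std::isalnum(*it, loc))
            ++count;
    }
    return count;
}
```

Each call looks up the ctype facet again; when classifying many characters, fetching the facet once with use_facet and calling is directly is likely to be cheaper.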

See Also

ctype_base class, ctype class template, isalpha function template, isdigit function template

isalpha function template Determines whether a character is a letter in a locale

template <typename charT>

bool isalpha(charT c, const locale& loc);

The isalpha function determines whether the character c is a letter in the locale loc. It returns the following:

use_facet<ctype<charT> >(loc).is(ctype_base::alpha, c)

See Also

ctype_base class, ctype class template, isalnum function template, islower function template, isupper function template

iscntrl function template Determines whether a character is a control character in a locale

template <typename charT>

bool iscntrl(charT c, const locale& loc);

The iscntrl function determines whether the character c is a control character in the locale loc. It returns the following:

use_facet<ctype<charT> >(loc).is(ctype_base::cntrl, c)

See Also

ctype_base class, ctype class template, isprint function template

isdigit function template Determines whether a character is a digit in a locale

template <typename charT>

bool isdigit(charT c, const locale& loc);

The isdigit function determines whether the character c is a digit in the locale loc. It returns the following:

use_facet<c