useful web sites and some useful stuff: SQL Server Collation

Thursday, October 29, 2009

SQL Server Collation

What is collation?

Collation refers to a set of rules that determine how data is sorted and compared.

Character data is sorted using rules that define the correct character sequence, with options for specifying:

1) case sensitivity,

2) accent marks,

3) kana character types, and

4) character width.

A collation specifies the bit patterns that represent each character and the rules by which characters are sorted and compared.

Case sensitivity

If A and a, B and b, and so on are treated as the same character, the collation is case-insensitive. A case-sensitive comparison tells them apart because each letter has a different character code: the ASCII value of A is 65, while a is 97, and the ASCII value of B is 66, while b is 98.
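As a small illustration of the point above, the following T-SQL sketch compares 'A' and 'a' under a case-insensitive and a case-sensitive collation. The collation names Latin1_General_CI_AS and Latin1_General_CS_AS are chosen only for the example; any CI/CS pair should behave the same way.

```sql
-- The COLLATE clause on one operand decides how the comparison is made.
SELECT
    CASE WHEN 'A' = 'a' COLLATE Latin1_General_CI_AS
         THEN 'equal' ELSE 'not equal' END AS case_insensitive,  -- equal
    CASE WHEN 'A' = 'a' COLLATE Latin1_General_CS_AS
         THEN 'equal' ELSE 'not equal' END AS case_sensitive,    -- not equal
    ASCII('A') AS code_upper_a,                                  -- 65
    ASCII('a') AS code_lower_a;                                  -- 97
```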

Accent sensitivity

Accent sensitivity determines whether accented characters, also known as extended characters, are distinguished from their unaccented counterparts. Accented characters are those that carry a diacritical mark, such as the German umlaut (ä) or the Spanish tilde (ñ). For example, accent sensitivity defines whether 'a' is equal or not equal to 'ä'.

If a and á, or o and ó, are treated as the same character, the collation is accent-insensitive. An accent-sensitive comparison tells them apart because they have different character codes. The accented values lie outside 7-bit ASCII: in the Latin-1 code page and in Unicode, a is 97 and á is 225, while o is 111 and ó is 243.
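The same kind of check can be sketched for accents. The example below assumes the Latin1_General_CI_AI and Latin1_General_CI_AS collations and uses Unicode (N'...') literals so the accented character does not depend on the database code page.

```sql
-- AI = accent-insensitive, AS = accent-sensitive.
SELECT
    CASE WHEN N'a' = N'á' COLLATE Latin1_General_CI_AI
         THEN 'equal' ELSE 'not equal' END AS accent_insensitive,  -- equal
    CASE WHEN N'a' = N'á' COLLATE Latin1_General_CI_AS
         THEN 'equal' ELSE 'not equal' END AS accent_sensitive,    -- not equal
    UNICODE(N'a') AS code_a,                                       -- 97
    UNICODE(N'á') AS code_a_acute;                                 -- 225
```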

Kana Sensitivity

When the two Japanese kana character types, Hiragana and Katakana, are treated differently, the collation is kana-sensitive.
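A minimal sketch of kana sensitivity, assuming the Japanese_CI_AS collation (kana-insensitive) and its _KS variant (kana-sensitive). It compares the Hiragana character あ with the Katakana character ア. The first comparison should report the two as equal and the second as different.

```sql
-- KS in the collation name turns on kana sensitivity.
SELECT
    CASE WHEN N'あ' = N'ア' COLLATE Japanese_CI_AS
         THEN 'equal' ELSE 'not equal' END AS kana_insensitive,
    CASE WHEN N'あ' = N'ア' COLLATE Japanese_CI_AS_KS
         THEN 'equal' ELSE 'not equal' END AS kana_sensitive;
```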

Width sensitivity

When a single-byte (half-width) character and the same character represented as a double-byte (full-width) character are treated differently, the collation is width-sensitive.
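Width sensitivity can be sketched the same way. The example assumes the Latin1_General_CI_AS collation (width-insensitive) and its _WS variant (width-sensitive), and builds the full-width letter A (U+FF21) with NCHAR so the script does not depend on file encoding. The first comparison should report the characters as equal and the second as different.

```sql
-- NCHAR(65313) is the full-width Latin capital letter A (U+FF21).
SELECT
    CASE WHEN N'A' = NCHAR(65313) COLLATE Latin1_General_CI_AS
         THEN 'equal' ELSE 'not equal' END AS width_insensitive,
    CASE WHEN N'A' = NCHAR(65313) COLLATE Latin1_General_CI_AS_WS
         THEN 'equal' ELSE 'not equal' END AS width_sensitive;
```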

SQL Server supports two categories of character data types:

* The Unicode data types nchar, nvarchar, and ntext. These data types use the Unicode character representation. Code pages do not apply to these data types.

* The non-Unicode character data types char, varchar, and text. These data types use the character representation scheme defined in a single or double-byte code page.

The easiest way to manage character data in international databases is to always use the Unicode nchar, nvarchar, and ntext data types in place of their non-Unicode equivalents (char, varchar, and text).
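To illustrate why the Unicode types are easier to manage, the sketch below stores the same Japanese text in a varchar and an nvarchar variable. It assumes a database whose default collation uses a Latin code page (for example Latin1_General_CI_AS); under that assumption the varchar value cannot represent the characters and typically comes back as question marks, while the nvarchar value is preserved. The variable names are made up for the example.

```sql
-- Same text, two data types. Requires SQL Server 2008 or later
-- for the DECLARE ... = initializer syntax.
DECLARE @NonUnicode varchar(10)  = N'日本語';  -- converted through the code page
DECLARE @Unicode    nvarchar(10) = N'日本語';  -- stored as Unicode

SELECT @NonUnicode AS varchar_value,
       @Unicode    AS nvarchar_value;
```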

Sort Order

A sort order specifies the rules used by SQL Server to interpret, collate, compare, and present character data. For example, a sort order defines whether 'a' is less than, equal to, or greater than 'b'.
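The effect of a sort order is easiest to see in an ORDER BY clause. The sketch below sorts the same four letters under two collations chosen for the example: Latin1_General_BIN, which orders strictly by character code, and Latin1_General_CI_AS, which uses dictionary rules. The table constructor syntax assumes SQL Server 2008 or later.

```sql
-- Binary order follows the character codes: A (65), B (66), a (97), b (98).
SELECT word
FROM (VALUES ('b'), ('A'), ('a'), ('B')) AS t(word)
ORDER BY word COLLATE Latin1_General_BIN;

-- Dictionary order groups each letter with its other case;
-- the relative order of 'A' and 'a' is not guaranteed here because
-- a case-insensitive collation treats them as equal.
SELECT word
FROM (VALUES ('b'), ('A'), ('a'), ('B')) AS t(word)
ORDER BY word COLLATE Latin1_General_CI_AS;
```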

When you install an instance of SQL Server, you can specify the default collation for that instance during setup. The default collation for the instance also becomes the default collation of the system databases: master, model, tempdb, msdb, and Distribution.
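To see which collations are actually in effect, the instance and database defaults can be queried directly. This is a sketch using the built-in SERVERPROPERTY and DATABASEPROPERTYEX functions and the sys.databases catalog view (SQL Server 2005 or later).

```sql
-- Default collation of the instance.
SELECT SERVERPROPERTY('Collation') AS instance_collation;

-- Collation of a single database, here the master system database.
SELECT DATABASEPROPERTYEX('master', 'Collation') AS master_collation;

-- Collation of every database on the instance.
SELECT name, collation_name
FROM sys.databases
ORDER BY name;
```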

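-- Create a table whose character column uses a French collation.
CREATE TABLE MyTable
(
    PrimaryKey int PRIMARY KEY,
    CharCol varchar(10) COLLATE French_CI_AS NOT NULL
)
GO
-- Change the collation of the existing column to Latin1_General_CI_AS.
ALTER TABLE MyTable ALTER COLUMN CharCol
    varchar(10) COLLATE Latin1_General_CI_AS NOT NULL
GO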
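One way to confirm the result of the statements above is to read the column's collation from the catalog. This sketch assumes the MyTable example exists in the current database and uses the sys.columns catalog view; fn_helpcollations() is included only as a way to list the collation names available on the instance.

```sql
-- Collation currently assigned to each column of MyTable
-- (collation_name is NULL for non-character columns).
SELECT name, collation_name
FROM sys.columns
WHERE object_id = OBJECT_ID('MyTable');

-- List the French collations the instance supports.
SELECT name, description
FROM fn_helpcollations()
WHERE name LIKE 'French%';
```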

After a collation has been assigned to any object other than a column or database, you cannot change the collation except by dropping and re-creating the object.
