|
to secooler:
1 关于你的这篇文章(http://space.itpub.net/519536/viewspace-627797)中这一段,
4.在此环境下,查看默认的汉字排序规则
sec@secooler> select * from t order by x;
X
------------------------------
拆
侯
拉
你
推
做
6 rows selected.
可见,此时的默认汉字排序规则是“拼音”。 对此,我有不同看法
2 目前我们的数据库字符集大多数选择(ZHS16GBK),在这里假设你文章中的数据库也是选择此此字符集。
而该字符集,ORACLE定义,是按二进制排序; 遇上汉字时,则按照汉字的字符编码模式(character encoding schema)排序。
如我的库的字符集信息
SQL>select * from sys.props$;
NAME VALUE$ COMMENT$
---------------------------------------- ---------------------------------------- --------------------------------------------------
...
NLS_CHARACTERSET ZHS16GBK Character set
NLS_CALENDAR GREGORIAN Calendar system
NLS_DATE_FORMAT DD-MON-RR Date format
NLS_DATE_LANGUAGE AMERICAN Date language
NLS_SORT BINARY Linguistic definition
...
NLS_RDBMS_VERSION 10.2.0.4.0 RDBMS version for NLS parameters
WORKLOAD_CAPTURE_MODE CAPTURE implies workload capture is in progress
28 rows selected.
Elapsed: 00:00:00.06
3 测试发现,之所以出现你上面的情况, 是因为恰巧这几个字的字符编码模式值,刚好和该字的拼音大小顺序一致。
也就是,这几个字的字符编码模式值排序顺序为,拆<侯<拉<你<推<做(如下:ASCII出的值),
而在拼音上,也恰巧是 拆(chi)<侯(hou)<拉(la)<你(ni)<推(tui)<做(zuo)。因此,你ORDER BY时,刚好就看到是拼音排序。
NCV31@ncdb>select ascii('拆'), ascii('侯'),ascii('拉'),ascii('你'),ascii('推'),ascii('做') from dual;
ASCII('拆') ASCII('侯') ASCII('拉') ASCII('你') ASCII('推') ASCII('做')
---------------- ---------------- ---------------- ---------------- ---------------- ----------------
45808 47854 49325 50403 52678 55286
6 rows selected.
4 现在我举个反例,假设默认是拼音排序,则下面这四字的排序结果,应该如下:
金(jin) < 睿(rui) < 星(xing) < 优(you)
SQL>select name from (select '金' name from dual union all select '优' name from dual union all
2 select '睿' name from dual union all select '星' name from dual);
NAME
----------------------------------------
金
优
睿
星
4 rows selected.
Elapsed: 00:00:00.01
5 但实际上, “睿” 并没有排在第2位,
SQL>select name from (select '金' name from dual union all select '优' name from dual union all
2 select '睿' name from dual union all select '星' name from dual) order by name;
NAME
----------------------------------------
金
星
优
睿
4 rows selected.
Elapsed: 00:00:00.01
6 之所以“睿” 被排在最后,是因为其编码值最大:61091。因此,真正按照二进制排序时,就出现如上的结果。
SQL>select ascii('金'), ascii('星'), ascii('优'), ascii('睿') from dual;
ASCII('金') ASCII('星') ASCII('优') ASCII('睿')
---------------- ---------------- ---------------- ----------------
48624 53447 54213 61091
Elapsed: 00:00:00.01
SQL>select chr(48624), chr(53447), chr(54213), chr(61091) from dual;
CH CH CH CH
-- -- -- --
金 星 优 睿
Elapsed: 00:00:00.01
SQL>select name, dump(name) dump from (select '金' name from dual union all select '优' name from dual union all
2 select '睿' name from dual union all select '星' name from dual) order by name;
NAME DUMP
---------------------------------------- -------------------------
金 Typ=96 Len=2: 189,240
星 Typ=96 Len=2: 208,199
优 Typ=96 Len=2: 211,197
睿 Typ=96 Len=2: 238,163
4 rows selected.
7 测试版本和时间。
SQL>select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Productio
NLSRTL Version 10.2.0.4.0 - Production
5 rows selected.
Elapsed: 00:00:00.21
SQL>select sysdate from dual;
SYSDATE
-------------------
2010-07-07 10:52:17
Elapsed: 00:00:00.01 |
|