MySQL中无GROUP BY直接HAVING的问题

8月 8th, 2013 | Posted by | Filed under 未分类

今天有同学给我反应,有一张表,id是主键,这样的写法可以返回一条记录:

“SELECT * FROM t HAVING id=MIN(id);”

但是只是把MIN换成MAX,这样返回就是空了:

“SELECT * FROM t HAVING id=MAX(id);”

这是为什么呢?

我们先来做个试验,验证这种情况。
这是表结构,初始化两条记录,然后试验:

root@localhost : plx 10:25:10> show create table t2\G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `a` int(11) DEFAULT NULL,
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8

root@localhost : plx 10:25:15> select * from t2;
+------+----+
| a    | id |
+------+----+
|    1 |  1 |
|    1 |  3 |
+------+----+
2 rows in set (0.00 sec)

root@localhost : plx 10:25:20> SELECT * FROM t2 HAVING id=MIN(id);
+------+----+
| a    | id |
+------+----+
|    1 |  1 |
+------+----+
1 row in set (0.00 sec)

root@localhost : plx 10:25:30> SELECT * FROM t2 HAVING id=MAX(id);
Empty set (0.00 sec)

初看之下,好像真的是这样哎,怎么会这样呢?

我再试一下,把a字段改一个为10,然后试下a字段:

root@localhost : plx 10:26:58> select * from t2;
+------+----+
| a    | id |
+------+----+
|   10 |  1 |
|    1 |  3 |
+------+----+
2 rows in set (0.00 sec)

root@localhost : plx 10:28:20> SELECT * FROM t2 HAVING a=MAX(a);
+------+----+
| a    | id |
+------+----+
|   10 |  1 |
+------+----+
1 row in set (0.00 sec)

root@localhost : plx 10:28:28> SELECT * FROM t2 HAVING a=MIN(a);
Empty set (0.00 sec)

我擦,这回MAX能返回,MIN不能了,这又是为啥呢?

旁白
一般来说,HAVING子句是配合GROUP BY使用的,单独使用HAVING本身是不符合规范的,
但是MySQL会做一个重写,加上一个GROUP BY NULL,”SELECT * FROM t HAVING id=MIN(id)”会被重写为”SELECT * FROM t GROUP BY NULL HAVING id=MIN(id)”,这样语法就符合规范了。

继续……
但是,这个 GROUP BY NULL 会产生什么结果呢?经过查看代码和试验,可以证明,GROUP BY NULL 等价于 LIMIT 1:

root@localhost : plx 10:25:48> SELECT * FROM t2 GROUP BY NULL;
+------+----+
| a    | id |
+------+----+
|   10 |  1 |
+------+----+
1 row in set (0.00 sec)

也就是说,GROUP BY NULL 以后,只会有一个分组,里面就是第一行数据。
但是如果这样,MIN、MAX结果应该是一致的,那也不应该MAX和MIN一个有结果,一个没结果啊,这是为什么呢,再做一个测试。
修改一下数据,然后直接查看MIN/MAX的值:

root@localhost : plx 10:26:58> select * from t2;
+------+----+
| a    | id |
+------+----+
|   10 |  1 |
|    1 |  3 |
+------+----+
2 rows in set (0.00 sec)

root@localhost : plx 10:27:04> SELECT * FROM t2 GROUP BY NULL;
+------+----+
| a    | id |
+------+----+
|   10 |  1 |
+------+----+
1 row in set (0.00 sec)

root@localhost : plx 10:30:21> SELECT MAX(a),MIN(a),MAX(id),MIN(id) FROM t2 GROUP BY NULL;
+--------+--------+---------+---------+
| MAX(a) | MIN(a) | MAX(id) | MIN(id) |
+--------+--------+---------+---------+
|     10 |      1 |       3 |       1 |
+--------+--------+---------+---------+
1 row in set (0.00 sec)

是不是发现问题了?
MAX/MIN函数取值是全局的,而不是LIMIT 1这个分组内的。
因此,当GROUP BY NULL的时候,MAX/MIN函数是取所有数据里的最大和最小值!

所以啊,”SELECT * FROM t HAVING id=MIN(id)”本质上是”SELECT * FROM t HAVING id=1″, 就能返回一条记录,而”SELECT * FROM t HAVING id=MAX(id)”本质上是”SELECT * FROM t HAVING id=3″,当然没有返回记录,这就是问题的根源。

测试一下GROUP BY a,这样就对了,每个分组内只有一行,所以MAX/MIN一样大,这回是取得组内最大和最小值。

root@localhost : plx 11:29:49> SELECT MAX(a),MIN(a),MAX(id),MIN(id) FROM t2 GROUP BY a;
+--------+--------+---------+---------+
| MAX(a) | MIN(a) | MAX(id) | MIN(id) |
+--------+--------+---------+---------+
|      1 |      1 |       3 |       3 |
|     10 |     10 |       5 |       5 |
+--------+--------+---------+---------+
2 rows in set (0.00 sec)

GROUP BY NULL时MAX/MIN的行为,是这个问题的本质,所以啊,尽量使用标准语法,玩花样SQL之前,一定要搞清楚它的行为是否与理解的一致。

Enjoy MySQL!

mysqldump的流程

3月 29th, 2013 | Posted by | Filed under 未分类

前几天看到群里在讨论mysqldump导致锁表的问题,为什么一个表已经dump完了还会被锁住?mysqldump里面到底是怎么处理的,为了解答这些问题,就来看看mysqldump.cc中的实现吧。

重要参数

首先我们把参数和内部变量对应起来,并且看一下它们的注释:

–single-transaction: opt_single_transaction

Creates a consistent snapshot by dumping all tables in a single transaction. Works ONLY for tables stored in storage engines which support multiversioning (currently only InnoDB does); the dump is NOT guaranteed to be consistent for other storage engines. While a –single-transaction dump is in process, to ensure a valid dump file (correct table contents and binary log position), no other connection should use the following statements: ALTER TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE, as consistent snapshot is not isolated from them. Option automatically turns off –lock-tables.

通过将导出操作封装在一个事务内来使得导出的数据是一个一致性快照。只有当表使用支持MVCC的存储引擎(目前只有InnoDB)时才可以工作;其他引擎不能保证导出是一致的。当导出开启了–single-transaction选项时,要确保导出文件有效(正确的表数据和二进制日志位置),就要保证没有其他连接会执行如下语句:ALTER TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE,这会导致一致性快照失效。这个选项开启后会自动关闭lock-tables。

–master-data: opt_master_data

This causes the binary log position and filename to be appended to the output. If equal to 1, will print it as a CHANGE MASTER command; if equal to 2, that command will be prefixed with a comment symbol. This option will turn –lock-all-tables on, unless –single-transaction is specified too (in which case a global read lock is only taken a short time at the beginning of the dump; don’t forget to read about –single-transaction below). In all cases, any action on logs will happen at the exact moment of the dump. Option automatically turns –lock-tables off.

这个选项可以把binlog的位置和文件名添加到输出中,如果等于1,将会打印成一个CHANGE MASTER命令;如果等于2,会加上注释前缀。并且这个选项会自动打开–lock-all-tables,除非同时设置了–single-transaction(这种情况下,全局读锁只会在开始dump的时候加上一小段时间,不要忘了阅读–single-transaction的部分)。在任何情况下,所有日志中的操作都会发生在导出的准确时刻。这个选项会自动关闭–lock-tables。

阅读全文…

InnoDB建表时设定初始大小 (Setting InnoDB table datafile initial size when create new table)

12月 3rd, 2012 | Posted by | Filed under 未分类

InnoDB在写密集的压力时,由于B-Tree扩展,因而也会带来数据文件的扩展,然而,InnoDB数据文件扩展需要使用mutex保护数据文件,这就会导致波动。 丁奇的博客说明了这个问题:http://dinglin.iteye.com/blog/1317874

When InnoDB under heavy write workload, datafiles will extend quickly, because of B-Tree allocate new pages. But InnoDB need to use mutex to protect datafile, so it will cause performance jitter. Xiaobin Lin said this in his blog: http://dinglin.iteye.com/blog/1317874

解决的方法也很简单,只要知道数据文件可能会增长到多大,预先扩展即可。阅读代码可以知道,InnoDB建表后自动初始化大小是FIL_IBD_FILE_INITIAL_SIZE这个常量控制的,而初始化数据文件是由fil_create_new_single_table_tablespace()函数控制的。所以要改变数据文件初始化大小,只要修改fil_create_new_single_table_tablespace的传入值即可,默认是FIL_IBD_FILE_INITIAL_SIZE。

How to solve it? That’s easy. If we know the datafile will extend to which size at most, we can pre-extend it. After reading source code, we can know InnoDB initial datafile size by FIL_IBD_FILE_INITIAL_SIZE, and fil_create_new_single_table_tablespace() function to do it. So if we want to change datafile initial size, we only need to change the initial size parameter in fil_create_new_single_table_tablespace(), the default value is FIL_IBD_FILE_INITIAL_SIZE.

因此,我在建表语法中加上了datafile_initial_size这个参数,例如:
CREATE TABLE test (

) ENGINE = InnoDB DATAFILE_INITIAL_SIZE=100000;
如果设定的值比FIL_IBD_FILE_INITIAL_SIZE还小,就依然传入FIL_IBD_FILE_INITIAL_SIZE给fil_create_new_single_table_tablespace,否则传入datafile_initial_size进行初始化。

So, I add a new parameter for CREATE TABLE, named ‘datafile_initial_size’. For example:
CREATE TABLE test (

) ENGINE = InnoDB DATAFILE_INITIAL_SIZE=100000;
If DATAFILE_INITIAL_SIZE value less than FIL_IBD_FILE_INITIAL_SIZE, I will still pass FIL_IBD_FILE_INITIAL_SIZE to fil_create_new_single_table_tablespace(), otherwise, I pass DATAFILE_INITIAL_SIZE value to fil_create_new_single_table_tablespace() function for initialization.

因此,这个简单安全的patch就有了,可以看 http://bugs.mysql.com/bug.php?id=67792 关注官方的进展:
So, I wrote this simple patch, see http://bugs.mysql.com/bug.php?id=67792:

Index: storage/innobase/dict/dict0crea.c
===================================================================
--- storage/innobase/dict/dict0crea.c	(revision 3063)
+++ storage/innobase/dict/dict0crea.c	(working copy)
@@ -294,7 +294,8 @@
 		error = fil_create_new_single_table_tablespace(
 			space, path_or_name, is_path,
 			flags == DICT_TF_COMPACT ? 0 : flags,
-			FIL_IBD_FILE_INITIAL_SIZE);
+			table->datafile_initial_size < FIL_IBD_FILE_INITIAL_SIZE ? 
+        FIL_IBD_FILE_INITIAL_SIZE : table->datafile_initial_size);
 		table->space = (unsigned int) space;
 
 		if (error != DB_SUCCESS) {
Index: storage/innobase/handler/ha_innodb.cc
===================================================================
--- storage/innobase/handler/ha_innodb.cc	(revision 3063)
+++ storage/innobase/handler/ha_innodb.cc	(working copy)
@@ -7155,6 +7155,7 @@
 			col_len);
 	}
 
+  table->datafile_initial_size= form->datafile_initial_size;
 	error = row_create_table_for_mysql(table, trx);
 
 	if (error == DB_DUPLICATE_KEY) {
@@ -7760,6 +7761,7 @@
 
 	row_mysql_lock_data_dictionary(trx);
 
+  form->datafile_initial_size= create_info->datafile_initial_size;
 	error = create_table_def(trx, form, norm_name,
 		create_info->options & HA_LEX_CREATE_TMP_TABLE ? name2 : NULL,
 		flags);
Index: storage/innobase/include/dict0mem.h
===================================================================
--- storage/innobase/include/dict0mem.h	(revision 3063)
+++ storage/innobase/include/dict0mem.h	(working copy)
@@ -678,6 +678,7 @@
 /** Value of dict_table_struct::magic_n */
 # define DICT_TABLE_MAGIC_N	76333786
 #endif /* UNIV_DEBUG */
+  uint datafile_initial_size; /* the initial size of the datafile */
 };
 
 #ifndef UNIV_NONINL
Index: support-files/mysql.5.5.18.spec
===================================================================
--- support-files/mysql.5.5.18.spec	(revision 3063)
+++ support-files/mysql.5.5.18.spec	(working copy)
@@ -244,7 +244,7 @@
 Version:        5.5.18
 Release:        %{release}%{?distro_releasetag:.%{distro_releasetag}}
 Distribution:   %{distro_description}
-License:        Copyright (c) 2000, 2011, %{mysql_vendor}. All rights reserved. Under %{license_type} license as shown in the Description field.
+License:        Copyright (c) 2000, 2012, %{mysql_vendor}. All rights reserved. Under %{license_type} license as shown in the Description field.
 Source:         http://www.mysql.com/Downloads/MySQL-5.5/%{src_dir}.tar.gz
 URL:            http://www.mysql.com/
 Packager:       MySQL Release Engineering 
Index: sql/table.h
===================================================================
--- sql/table.h	(revision 3063)
+++ sql/table.h	(working copy)
@@ -596,6 +596,7 @@
   */
   key_map keys_in_use;
   key_map keys_for_keyread;
+  uint datafile_initial_size; /* the initial size of the datafile */
   ha_rows min_rows, max_rows;		/* create information */
   ulong   avg_row_length;		/* create information */
   ulong   version, mysql_version;
@@ -1094,6 +1095,8 @@
 #endif
   MDL_ticket *mdl_ticket;
 
+  uint datafile_initial_size;
+
   void init(THD *thd, TABLE_LIST *tl);
   bool fill_item_list(List *item_list) const;
   void reset_item_list(List *item_list) const;
Index: sql/sql_yacc.yy
===================================================================
--- sql/sql_yacc.yy	(revision 3063)
+++ sql/sql_yacc.yy	(working copy)
@@ -906,6 +906,7 @@
 %token  DATABASE
 %token  DATABASES
 %token  DATAFILE_SYM
+%token  DATAFILE_INITIAL_SIZE_SYM
 %token  DATA_SYM                      /* SQL-2003-N */
 %token  DATETIME
 %token  DATE_ADD_INTERVAL             /* MYSQL-FUNC */
@@ -5046,6 +5047,18 @@
             Lex->create_info.db_type= $3;
             Lex->create_info.used_fields|= HA_CREATE_USED_ENGINE;
           }
+        | DATAFILE_INITIAL_SIZE_SYM opt_equal ulonglong_num
+          {
+            if ($3 > UINT_MAX32)
+            {
+              Lex->create_info.datafile_initial_size= UINT_MAX32;
+            }
+            else
+            {
+              Lex->create_info.datafile_initial_size= $3;
+            }
+            Lex->create_info.used_fields|= HA_CREATE_USED_DATAFILE_INITIAL_SIZE;
+          }
         | MAX_ROWS opt_equal ulonglong_num
           {
             Lex->create_info.max_rows= $3;
@@ -12585,6 +12598,7 @@
         | CURSOR_NAME_SYM          {}
         | DATA_SYM                 {}
         | DATAFILE_SYM             {}
+        | DATAFILE_INITIAL_SIZE_SYM{}
         | DATETIME                 {}
         | DATE_SYM                 {}
         | DAY_SYM                  {}
Index: sql/handler.h
===================================================================
--- sql/handler.h	(revision 3063)
+++ sql/handler.h	(working copy)
@@ -387,6 +387,8 @@
 #define HA_CREATE_USED_TRANSACTIONAL    (1L << 20)
 /** Unused. Reserved for future versions. */
 #define HA_CREATE_USED_PAGE_CHECKSUM    (1L << 21)
+/** Used for InnoDB initial table size. */
+#define HA_CREATE_USED_DATAFILE_INITIAL_SIZE (1L << 22)
 
 typedef ulonglong my_xid; // this line is the same as in log_event.h
 #define MYSQL_XID_PREFIX "MySQLXid"
@@ -1053,6 +1055,7 @@
   LEX_STRING comment;
   const char *data_file_name, *index_file_name;
   const char *alias;
+  uint datafile_initial_size; /* the initial size of the datafile */
   ulonglong max_rows,min_rows;
   ulonglong auto_increment_value;
   ulong table_options;
Index: sql/lex.h
===================================================================
--- sql/lex.h	(revision 3063)
+++ sql/lex.h	(working copy)
@@ -153,6 +153,7 @@
   { "DATABASE",		SYM(DATABASE)},
   { "DATABASES",	SYM(DATABASES)},
   { "DATAFILE", 	SYM(DATAFILE_SYM)},
+  { "DATAFILE_INITIAL_SIZE",   SYM(DATAFILE_INITIAL_SIZE_SYM)},
   { "DATE",		SYM(DATE_SYM)},
   { "DATETIME",		SYM(DATETIME)},
   { "DAY",		SYM(DAY_SYM)},

MariaDB 10.x 将包含多主复制功能

10月 17th, 2012 | Posted by | Filed under 未分类

国庆期间与Monty合作,将我开发的多主复制功能合并到了MariaDB主干,将在10.x版本中出现。

Monty专门写了一片博客来介绍多主复制补丁:http://monty-says.blogspot.com/2012/10/multi-source-replication-for-mariadb-is.html

虽然MariaDB 10.x还没正式发布,但是已经可以下载最新的源码树来编译使用:https://code.launchpad.net/~maria-captains/maria/10.0-base

目前已知的问题就是采用多主复制以后,半同步(Semi-sync)会无法使用,这个要fix估计还需要一点时间,如果你不使用半同步,并且急切的需要使用多主复制,那么可以直接采用源码树上的代码,不再需要把我的补丁打到MySQL中再编译了。而且一般来说用多主复制都是为了聚合数据进行分析,而MariaDB的优化器不用多言,在MySQL的分支中是最强大的,正好可以更好的做OLAP。

具体的使用文档看这里:https://kb.askmonty.org/en/multi-source-replication/

值得一提的是,这次合并以后增加了SHOW ALL SLAVES STATUS功能,可以显示所有的通道复制情况。START/STOP ALL SLAVES 也可以一次性启停所有通道。另外一直影响大家使用的无法跳过指定通道错误的问题,也顺便修复了,增加了一个变量,set @@default_master_connection=’connection_name’,这样可以指定一个通道,然后用单通道的Sql_slave_skip_counter就可以了。

当然也要感谢Monty为我review patch,发现那么多隐含问题,并且给我commit权限,希望能给开源做更多的事情,对MySQL做更多的改进。

SVN:合并一个分支到主干

9月 21st, 2012 | Posted by | Filed under 程序设计

原文在此,我只是翻译:http://www.sepcot.com/blog/2007/04/SVN-Merge-Branch-Trunk

这篇文章只是写给我自己备用的,但是写出来可能更多的人会觉得这很有用。

最近在工作中,我被分配了更多的职责。包括部分网站的分支控制工作。我花了一段时间才理清楚如何处理所有的事情,并且大部分在网络上找到的资料对我都没有太大的帮助,所以我会在这里发这篇文章来阐述。

我们采用SVN做代码版本控制,并且代码存在一台可以用SSH访问的服务器上。

阅读全文…

标签: ,