P.Linux Laboratory

Statically Compiling Percona with ICC

January 6th, 2011 | Posted by P.Linux | Filed under Uncategorized

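In my tests, ICC has a clear advantage in floating-point arithmetic, threading, and math functions: native SSE2 instruction support plus Intel's own threading and math libraries give it excellent performance.
I compiled the same pi-calculation code with ICC and with GCC, and the ICC build was 20% faster. In the database, running the same very complex aggregate SQL statement, the ICC build was 34% faster.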

Step 1: Build and install libunwind
wget http://download.savannah.gnu.org/releases/libunwind/libunwind-0.99.tar.gz
tar zxvf libunwind-0.99.tar.gz
cd libunwind-0.99

CC=icc \
CXX=icpc \
LD=xild \
AR=xiar \
CFLAGS="-O3 -no-prec-div -ip -xSSE2 -axSSE2" \
CXXFLAGS="${CFLAGS}" \
./configure && make && make install

Step 2: Build and install tcmalloc
wget http://google-perftools.googlecode.com/files/google-perftools-1.6.tar.gz
tar zxvf google-perftools-1.6.tar.gz
cd google-perftools-1.6

CC=icc \
CXX=icpc \
LD=xild \
AR=xiar \
CFLAGS="-O3 -no-prec-div -ip -xSSE2 -axSSE2" \
CXXFLAGS="${CFLAGS}" \
./configure --disable-debugalloc --enable-frame-pointers && make && make install

echo "/usr/local/lib" > /etc/ld.so.conf.d/usr_local_lib.conf
/sbin/ldconfig

Step 3: Build and install Percona
CC=icc \
CXX=icpc \
LD=xild \
AR=xiar \
CFLAGS="-O3 -unroll2 -ip -mp -restrict -fno-exceptions -fno-rtti -no-prec-div -fno-implicit-templates -static-intel -static-libgcc -xSSE2 -axSSE2" \
CXXFLAGS="${CFLAGS}" \
CPPFLAGS=" -I/usr/alibaba/icc/include " \
LDFLAGS=" -L/usr/alibaba/icc/lib -lrt " \
./configure --prefix=/usr/alibaba/install/percona-custom-5.1.53-12.4 \
--with-server-suffix=-alibaba-edition \
--with-mysqld-user=mysql \
--with-plugins=heap,innodb_plugin,myisam,partition \
--with-charset=utf8 \
--with-collation=utf8_general_ci \
--with-extra-charsets=gbk,utf8,ascii \
--with-big-tables \
--with-fast-mutexes \
--with-zlib-dir=bundled \
--with-readline \
--with-pthread \
--with-mysqld-ldflags='-all-static -ltcmalloc' \
--enable-assembler \
--enable-profiling \
--enable-local-infile \
--enable-thread-safe-client \
--without-embedded-server \
--with-client-ldflags=-all-static \
--without-query-cache \
--without-geometry \
--without-debug \
--without-ndb-binlog \
--without-ndb-debug
Once configure completes, run make && make install.

Tags: Database, ICC, MySQL, Percona, XtraDB

PostgreSQL vs. MySQL, Part 1: Table Organization

December 27th, 2010 | Posted by P.Linux | Filed under Uncategorized


Translated from: http://blogs.enterprisedb.com/2010/11/29/mysql-vs-postgresql-part-1-table-organization/
The original text follows, with translator's notes; corrections are welcome.

I’m going to be starting an occasional series of blog postings comparing MySQL’s architecture to PostgreSQL’s architecture. Regular readers of this blog will already be aware that I know PostgreSQL far better than MySQL, having last used MySQL a very long time ago when both products were far less mature than they are today. So, my discussion of how PostgreSQL works will be based on first-hand knowledge, but discussion of how MySQL works will be based on research and – insofar as I can make it happen – discussion with people who know it better than I do. (Note: If you’re a person who knows MySQL better than I do and would like to help me avoid making stupid mistakes, drop me an email.)

In writing these posts, I’m going to try to avoid making value judgments about which system is “better”, and instead focus on describing how the architecture differs, and maybe a bit about the advantages of each architecture. I can’t promise that it will be entirely unbiased (after all, I am a PostgreSQL committer, not a MySQL committer!) but I’m going to try to make it as unbiased as I can. Also, bearing in mind what I’ve recently been told by Baron Schwartz and Rob Wultsch, I’m going to focus completely on InnoDB and ignore MyISAM and all other storage engines. Finally, I’m going to focus on architectural differences. People might choose to use PostgreSQL because they hate Oracle, or MySQL because it’s easier to find hosting, or either product because they know it better, and that’s totally legitimate and perhaps worth talking about, but – partly in the interests of harmony among communities that ought to be allies – it’s not what I’m going to talk about here.

So, all that having been said, what I’d like to talk about in this post is the way that MySQL and PostgreSQL store tables and indexes on disk. In PostgreSQL, table data and index data are stored in completely separate structures. When a new row is inserted, or when an existing row is updated, the new row is stored in any convenient place in the table. In the case of an update, we try to store the new row on the same page as the old row if there’s room; if there isn’t room or if it’s an insert, we pick a page that has adequate free space and use that, or failing all else extend the table by one page and add the new row there. Once the table row is added, we cycle through all the indexes defined for the table and add an index entry to each one pointing at the physical position of the table row. One index may happen to be the primary key, but that’s a fairly nominal distinction – all indexes are basically the same.

Under MySQL’s InnoDB, the table data and the primary key index are stored in the same data structure. As I understand it, this is what Oracle calls an index-organized table. Any additional ("secondary") indexes refer to the primary key value of the tuple to which they point, not the physical position, which can change as leaf pages in the primary key index are split. Since this architecture requires every table to have a primary key, an internal row ID field is used as the primary key if no explicit primary key is specified.
Translator's notes: the shared structure is the primary key clustered index. There are still some differences from Oracle: an index-organized table is ordered entirely by the index, whereas InnoDB orders only by the primary key. The internally defined primary key is 6 bytes long.
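
As a rough illustration of the two layouts described above, the following Python sketch models a heap table whose indexes store physical row positions and a clustered table whose secondary indexes store primary key values. The class and method names (HeapTable, ClusteredTable, lookup) are invented for this example, and plain dicts stand in for B-trees; it is a toy model under those assumptions, not how either database is implemented.

```python
class HeapTable:
    """PostgreSQL-style layout: rows sit in a heap, every index points at a row position."""

    def __init__(self, indexed_columns):
        self.heap = []                                    # physical row storage
        self.indexes = {c: {} for c in indexed_columns}   # column -> {value: [positions]}

    def insert(self, row):
        self.heap.append(row)
        position = len(self.heap) - 1
        for column, index in self.indexes.items():
            index.setdefault(row[column], []).append(position)

    def lookup(self, column, value):
        # One index probe, then a direct fetch by physical position.
        return [self.heap[p] for p in self.indexes[column].get(value, [])]


class ClusteredTable:
    """InnoDB-style layout: rows live inside the primary key index."""

    def __init__(self, primary_key, secondary_columns):
        self.primary_key = primary_key
        self.clustered = {}                                   # pk value -> full row
        self.secondary = {c: {} for c in secondary_columns}   # column -> {value: [pk values]}

    def insert(self, row):
        pk = row[self.primary_key]
        self.clustered[pk] = row
        for column, index in self.secondary.items():
            index.setdefault(row[column], []).append(pk)

    def lookup(self, column, value):
        if column == self.primary_key:
            row = self.clustered.get(value)
            return [] if row is None else [row]
        # The secondary probe yields primary key values; each needs a second probe.
        return [self.clustered[pk] for pk in self.secondary[column].get(value, [])]


rows = [
    {"id": 1, "city": "Hangzhou"},
    {"id": 2, "city": "Beijing"},
    {"id": 3, "city": "Hangzhou"},
]
heap = HeapTable(["id", "city"])
clustered = ClusteredTable("id", ["city"])
for r in rows:
    heap.insert(r)
    clustered.insert(r)
```

Both tables return the same rows for a lookup on city, but the heap index holds positions (0 and 2) while the clustered table's secondary index holds primary key values (1 and 3), so a row can move without invalidating the secondary index.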

Since Oracle supports both options, they are probably both useful. An index-organized table seems particularly likely to be useful when most lookups are by primary key, and most of the data in each row is part of the primary key anyway, either because the primary key columns are long compared with the remaining columns, or because the rows, overall, are short. Storing the whole row in the index avoids storing the same data twice (once in the index and once in the table), and the gain will be larger when the primary key is a substantial percentage of the total data. Furthermore, in this situation, the index page still holds as many, or almost as many, keys as it would if only a pointer were stored in lieu of the whole row, so one fewer random I/Os will be needed to access a given row.
Translator's notes: the two options are index-organized tables and heap tables. The random I/O saved in the last sentence corresponds to a covering index scan.

When accessing an index-organized table via a secondary index, it may be necessary to traverse both the B-tree in the secondary-index, and the B-tree in the primary index. As a result, queries involving secondary indexes might be slower. However, since MySQL has index-only scans ( PostgreSQL does not ), it can sometimes avoid traversing the secondary index. So in MySQL, adding additional columns to an index might very well make it run faster, if it causes the index to function as a covering index for the query being executed. But in PostgreSQL, we frequently find ourselves telling users to pare down the columns in the index to the minimum set that is absolutely necessary, often resulting in dramatic performance gains. This is an interesting example of how the tuning that is right for one database may be completely wrong for another database.
Translator's notes: an index-only scan means the data can be obtained by reading the index alone. The claim about adding columns is not entirely right: with many indexes, index page splits still cause a lot of physical I/O, so it is better to keep indexes to the minimum that meets your needs unless you can be sure the covering index will be used often.
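
The covering-index condition discussed above can be sketched in a few lines of Python. The function name plan_lookup and the column names are made up for illustration. The sketch relies on the fact that an InnoDB secondary index entry also carries the primary key columns, and it ignores everything else a real optimizer considers.

```python
def plan_lookup(index_columns, primary_key_columns, needed_columns):
    """Decide whether a secondary index alone can answer a query (toy model)."""
    # Columns readable from the secondary index entry itself.
    available = set(index_columns) | set(primary_key_columns)
    if set(needed_columns) <= available:
        return "index-only"            # covering index: no primary index probe
    return "secondary + primary"       # must also traverse the primary index


narrow = plan_lookup(["email"], ["id"], ["email", "name"])
wide = plan_lookup(["email", "name"], ["id"], ["email", "name"])
```

Here the single-column index cannot cover a query that needs name, while the wider index can, which is the trade-off the paragraph describes.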

I’ve recently learned that neither InnoDB nor PostgreSQL supports traversing an index in physical order, only in key order. For InnoDB, this means that ALL scans are performed in key order, since the table itself is, in essence, also an index. As I understand it, this can make a large sequential scan quite slow, by defeating the operating system’s prefetch logic. In PostgreSQL, however, because tables are not index-organized, sequential scans are always performed in physical order, and don’t require looking at the indexes at all; this also means we can skip any I/O or CPU cost associated with examining non-leaf index pages. Traversing in physical order is apparently difficult from a locking perspective, although it must be possible, because Oracle supports it. It would be very useful to see this support in MySQL, and once PostgreSQL has index-only scans, it would be a useful improvement for PostgreSQL, too.
Translator's notes: a full-table read in InnoDB returns rows in primary key order. The claim that this makes large sequential scans slow is rather dubious: when the data is static, PostgreSQL likewise has to follow block pointers to reach the next block, just as InnoDB follows page pointers to reach the next page. As for skipping non-leaf index pages, the author seems to have forgotten what a B+ tree is.

One final difficulty with an index-organized table is that you can’t add, drop, or change the primary key definition without a full-table rewrite. In PostgreSQL, on the other hand, this can be done – even while allowing concurrent read and write activity. This is a fairly nominal advantage for most use cases since the primary key of a table rarely changes – I think it’s happened to me only once or twice in the last ten years – but it is useful when it does come up.

I hope that the above is a fair and accurate summary of the topic, but I’m sure I’ve missed a few things and covered others incompletely or in less detail than might be helpful. Please feel free to respond with a comment below or a blog post of your own if I’ve missed something.

Tags: Database, InnoDB, MySQL, PostgreSQL

InnoDB Master Thread Scheduling Flow

December 15th, 2010 | Posted by P.Linux | Filed under Uncategorized


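InnoDB performs its main I/O operations in the Master Thread (srv0srv.c), so any analysis of InnoDB I/O scheduling has to examine the Master Thread.

Below is a flowchart I drew of the whole Master Thread scheduling flow. The parts in red are improvements that InnoDB Plugin/XtraDB made over the original InnoDB engine.
The parentheses at the bottom of each Process box name the function that performs the operation, so the chart can be read alongside the source code.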

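A quick explanation of the Insert Buffer: to avoid losing too much performance on index maintenance when data changes, InnoDB buffers index updates in a structure called the Insert Buffer. For changes to non-unique secondary indexes (that is, neither the clustered primary key index nor a unique index), InnoDB does not insert into the index page directly every time. It first checks whether the page to be updated is in memory; if not, the change is stored in the Insert Buffer and later merged into the leaf pages of the index according to the Master Thread's scheduling rules, which often reduces the cost of index updates. Why only non-unique indexes? A unique index has to check whether the record already exists, so the affected index page must be read in to verify uniqueness. Since the page has to be read anyway, the Insert Buffer would be pointless, so it applies only to non-unique indexes.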
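
The decision described above can be sketched in Python as follows. This is a simplified model under stated assumptions: the class InsertBufferSketch, its fields, and the page ids are invented for illustration, lists stand in for index pages, and the merge is triggered by hand rather than by the Master Thread.

```python
class InsertBufferSketch:
    """Toy model of buffering changes to non-unique secondary index pages."""

    def __init__(self, pages_in_buffer_pool):
        self.in_memory = set(pages_in_buffer_pool)  # index pages currently cached
        self.index_pages = {}                        # page id -> entries applied to the page
        self.insert_buffer = {}                      # page id -> entries waiting to be merged
        self.disk_reads = 0

    def _read_page(self, page_id):
        # Reading a page that is not cached costs one (random) disk read.
        if page_id not in self.in_memory:
            self.disk_reads += 1
            self.in_memory.add(page_id)

    def insert(self, page_id, entry, unique):
        if unique or page_id in self.in_memory:
            # A unique index must inspect the real page to check for duplicates.
            self._read_page(page_id)
            self.index_pages.setdefault(page_id, []).append(entry)
        else:
            # Non-unique index and page not cached: defer the change.
            self.insert_buffer.setdefault(page_id, []).append(entry)

    def merge(self):
        # Apply all deferred entries, reading each target page only once.
        for page_id, entries in self.insert_buffer.items():
            self._read_page(page_id)
            self.index_pages.setdefault(page_id, []).extend(entries)
        self.insert_buffer.clear()


buf = InsertBufferSketch(pages_in_buffer_pool=[1])
buf.insert(1, "a", unique=False)   # page cached: applied directly
buf.insert(2, "b", unique=False)   # page not cached: buffered
buf.insert(2, "c", unique=False)   # buffered again, same page
reads_before_merge = buf.disk_reads
buf.insert(3, "d", unique=True)    # unique index: page must be read now
buf.merge()
```

In this run the two buffered changes to page 2 cost a single disk read at merge time, whereas the unique-index insert on page 3 has to read its page immediately.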
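The "INSERT BUFFER AND ADAPTIVE HASH INDEX" section of SHOW INNODB STATUS shows how effective the Insert Buffer is.

One correction: after flushing 100 dirty pages, InnoDB assumes the flush has already taken more than one second, so there is no need to wait. It sets skip_sleep=TRUE, skips os_thread_sleep, and goes straight to the next round of checks.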

[Figure: InnoDB Master Thread flowchart]

Tags: Database, InnoDB, Master Thread, MySQL, XtraDB

Risks of Running Slave Commands While the Slave SQL Thread Is Blocked

December 12th, 2010 | Posted by P.Linux | Filed under Uncategorized


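Today, while adding primary keys on a batch of standby servers, I unexpectedly found the following: if some thread blocks the Slave SQL thread from applying the log, leaving Slave SQL in the Locked state, then issuing SLAVE STOP will always make statements such as SHOW SLAVE STATUS and SHOW MASTER STATUS hang.
The only way out is to wait for the thread that is blocking Slave SQL to finish, or to restart the database; I have not found any other fix. I have reproduced this on MySQL 5.0.68 and 5.1.30/34/40.
Searching the bug database turned up this bug: http://bugs.mysql.com/bug.php?id=56676. The problem exists at least in every version before 5.1.50.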

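Reading the source, the problem comes mainly from two locks: mi->run_lock and LOCK_active_mi.
The slave works as follows. start_slave_thread creates the handle_slave_sql thread, which polls the log. handle_slave_sql calls exec_relay_log_event to apply log events. exec_relay_log_event in turn calls apply_event_and_update_pos, which reads one log event, applies it to the storage engine, and updates the relay-log position. Finally, depending on the event type, the XXX_log_event::do_apply_event override of the corresponding class does the real work with the decoded event.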

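The hang arises as follows:
Once the slave SQL thread has started successfully, it holds mi->run_lock. mi is an instance of Master_info, which records the master's information, i.e. the contents of master.info; mi->run_lock being held means the slave for that mi is running. (mi is declared as Master_info *. The comments say that once multi-master support is written, mi will be an array so that each master can hold its own lock, so MySQL is already working on this.) Because only a single master is supported at present, the lock on mi is global, namely LOCK_active_mi. When a statement is in the Locked state, the slave SQL thread holds mi->run_lock and its cond_wait never sees the condition that would let it continue, so execution never reaches the if (!sql_slave_killed(thd,rli)) check. The kill issued by stop_slave therefore goes unnoticed, and SLAVE STOP hangs. SLAVE STOP holds LOCK_active_mi (stopping the slave has to save master.info), while SHOW SLAVE STATUS and SHOW STATUS both start with pthread_mutex_lock(&LOCK_active_mi); so they all block.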
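There is another possible risk. The tables_to_lock list in the Relay_log_info class holds the tables the slave needs to lock. If the slave cannot proceed promptly, the list is not cleaned up promptly, which brings many locking problems and may cause widespread blocking. In an earlier incident where MySQL hung, the likely cause was a script of ours that skips replication errors and runs SHOW SLAVE STATUS and SLAVE START/STOP at high frequency. When a sudden master/standby switchover required a large number of new connections, CPU context switching increased and LOCK_active_mi could not be released fast enough. Other monitoring scripts that collect SHOW SLAVE STATUS quickly blocked, the tables_to_lock list was not released in time, and normal SQL execution was in turn blocked by locks. Because the volume of changes was very large, the blocking spread quickly and lock waits all but hung the database.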
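
The LOCK_active_mi blocking described above can be imitated with a small Python threading sketch. It is only an analogy under stated assumptions: the function names mirror the SQL commands, the events are invented stand-ins for the server's internal state, and a timeout replaces the indefinite wait of the real SHOW SLAVE STATUS so that the example terminates.

```python
import threading

lock_active_mi = threading.Lock()        # stands in for LOCK_active_mi
sql_thread_exited = threading.Event()    # set only once the SQL thread notices the kill
stop_slave_has_lock = threading.Event()  # helper: signals that SLAVE STOP holds the lock


def stop_slave():
    # Takes the global lock, then waits for the SQL thread to exit.
    with lock_active_mi:
        stop_slave_has_lock.set()
        sql_thread_exited.wait()


def show_slave_status(timeout=0.2):
    # The real command would wait forever; the timeout keeps the sketch finite.
    if not lock_active_mi.acquire(timeout=timeout):
        return "blocked"
    lock_active_mi.release()
    return "ok"


def demo():
    stopper = threading.Thread(target=stop_slave, daemon=True)
    stopper.start()
    stop_slave_has_lock.wait()
    while_blocked = show_slave_status()   # SQL thread is still stuck on a locked table
    sql_thread_exited.set()               # the blocking statement finally ends
    stopper.join()
    after_release = show_slave_status()
    return while_blocked, after_release


result = demo()
```

While the simulated SLAVE STOP waits for the SQL thread, every caller that needs the same lock is stuck behind it; once the blocking statement ends, the lock is released and the status command succeeds again.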

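So a reminder to everyone: while a long-running or Locked statement is executing on a slave, never run slave-related commands such as SHOW SLAVE/MASTER STATUS or SLAVE STOP. SHOW PROCESSLIST is the only exception.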

The handle_slave_sql execution loop:
while (!sql_slave_killed(thd,rli))
{
  thd_proc_info(thd, "Reading event from the relay log");
  DBUG_ASSERT(rli->sql_thd == thd);
  THD_CHECK_SENTRY(thd);

  if (saved_skip && rli->slave_skip_counter == 0)
  {
    /* omitted */
  }

  if (exec_relay_log_event(thd,rli))
  {
    DBUG_PRINT("info", ("exec_relay_log_event() failed"));
    // do not scare the user if SQL thread was simply killed or stopped
    if (!sql_slave_killed(thd,rli))
    {
      /* omitted */
    }
    goto err;
  }
}

The Slave_running status handler, which takes LOCK_active_mi first (as SHOW SLAVE STATUS does):
static int show_slave_running(THD *thd, SHOW_VAR *var, char *buff)
{
  var->type= SHOW_MY_BOOL;
  pthread_mutex_lock(&LOCK_active_mi);
  var->value= buff;
  *((my_bool *)buff)= (my_bool) (active_mi &&
                                 active_mi->slave_running == MYSQL_SLAVE_RUN_CONNECT &&
                                 active_mi->rli.slave_running);
  pthread_mutex_unlock(&LOCK_active_mi);
  return 0;
}

clear_tables_to_lock, which clears the list of tables to lock:
void Relay_log_info::clear_tables_to_lock()
{
  while (tables_to_lock)
  {
    uchar* to_free= reinterpret_cast<uchar*>(tables_to_lock);
    if (tables_to_lock->m_tabledef_valid)
    {
      tables_to_lock->m_tabledef.table_def::~table_def();
      tables_to_lock->m_tabledef_valid= FALSE;
    }
    tables_to_lock=
      static_cast<RPL_TABLE_LIST*>(tables_to_lock->next_global);
    tables_to_lock_count--;
    my_free(to_free, MYF(MY_WME));
  }
  DBUG_ASSERT(tables_to_lock == NULL && tables_to_lock_count == 0);
}

Tags: Database, MySQL, Slave

Percona's Improvements over Stock MySQL

December 6th, 2010 | Posted by P.Linux | Filed under Uncategorized


Over the weekend I had time to read through the improvements Percona XtraDB makes over MySQL InnoDB; here is a summary.

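I. Scalability improvements
1. Better Buffer Pool scalability
A well-known problem with the InnoDB Buffer Pool is contention under highly concurrent queries. XtraDB splits the Buffer Pool's global mutex into several mutexes to reduce contention.

2. Better InnoDB I/O scalability
XtraDB adds many variables for tuning I/O, including parameters for checkpointing and for the number of background threads that read and write data files.

3. Multiple rollback segments
To provide consistent reads, InnoDB writes the data modified by transactions to a rollback segment. The rollback segment is protected by a single mutex, which directly limits concurrency in write-intensive workloads. XtraDB lets you change the number of rollback segments (innodb_extra_rsegments), which can greatly improve performance for write-intensive operations.

4. Higher possible concurrency
InnoDB provides only 1024 rollback slots in the rollback segment (my colleague 春哥 once hit this bottleneck). Once the slots are used up, new transactions cannot start until a slot is released.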

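II. Performance improvements
1. Dedicated purge thread
In InnoDB, data modified by a transaction is written to the undo space in the shared tablespace, which is how InnoDB provides consistent reads. When the transaction ends, the corresponding region of the undo space is released. But if there are many transactions and the purge thread cannot clean up fast enough, the shared tablespace grows rapidly (this is probably why the BRMMS shared tablespace is so large). That causes a severe performance drop and may even use up all disk space. XtraDB uses a dedicated thread to purge the undo space, which can speed up cleanup considerably. Although this may lower overall performance slightly, it greatly improves stability, so the trade-off is worth it.

2. Configurable doublewrite buffer
InnoDB uses the doublewrite feature to prevent data corruption: before pages are written to the data files, they are first written sequentially to the shared tablespace. If a corrupted write occurs, InnoDB uses this buffer to recover the data. Although the data is written twice, the performance impact is usually small; in some high-load environments, however, the doublewrite buffer becomes a bottleneck. XtraDB provides an option to place the doublewrite buffer on a separate disk to improve concurrent performance.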
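
The doublewrite idea can be illustrated with a short Python sketch. It assumes a simplified model: dicts stand in for the doublewrite area and the data file, zlib.crc32 stands in for the page checksum, and the names DoublewriteSketch, write_page, and recover are invented for this example. Real InnoDB recovery is considerably more involved.

```python
import zlib


class DoublewriteSketch:
    """Toy model: write each page to a doublewrite area before the data file."""

    def __init__(self):
        self.doublewrite = {}  # page id -> (data, checksum), written first
        self.datafile = {}     # page id -> (data, checksum), written second

    def write_page(self, page_id, data, crash_midway=False):
        record = (data, zlib.crc32(data))
        self.doublewrite[page_id] = record           # step 1: doublewrite area
        if crash_midway:
            # Simulate a torn write: only half of the page reaches the data file.
            self.datafile[page_id] = (data[: len(data) // 2], record[1])
            return
        self.datafile[page_id] = record              # step 2: data file

    def recover(self):
        # Restore every page whose data no longer matches its stored checksum.
        repaired = []
        for page_id, (data, stored_sum) in list(self.datafile.items()):
            if zlib.crc32(data) != stored_sum:
                self.datafile[page_id] = self.doublewrite[page_id]
                repaired.append(page_id)
        return repaired


dw = DoublewriteSketch()
dw.write_page(1, b"A" * 16)
dw.write_page(2, b"B" * 16, crash_midway=True)
repaired_pages = dw.recover()
```

The extra sequential write is the cost mentioned above; the benefit is that a half-written page can be restored from the intact copy.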

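3. Query Cache enhancements
Percona provides additional parameters for configuring the Query Cache, for example ignoring comments in SQL when checking for a cache hit.

4. Fast InnoDB Checksum
InnoDB can checksum every page read from disk as an extra safeguard against data corruption. In XtraDB, Percona improved the checksum algorithm for better performance.

5. Removal of excessive function calls
When MySQL reads data from a socket it makes many fcntl calls (fcntl performs control operations on descriptors), which hurts concurrent performance. Percona removed the redundant calls.

6. Reduced Buffer Pool mutex contention
Mutex contention on the Buffer Pool during InnoDB kernel operations is reduced (by splitting the mutex).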

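III. Flexibility improvements
1. Support for multiple page sizes
Although InnoDB supports several page sizes, the default 16K page size cannot be changed without recompiling. XtraDB provides a system variable (innodb_page_size) to change it. Smaller pages can improve performance for most OLTP workloads; larger pages usually give better OLAP performance.

2. Suppressing replication warnings
Under the default statement-based replication, some statements may be unsafe, for example those using NOW() or RAND(), calls to stored procedures or functions, or an UPDATE that uses LIMIT without ORDER BY. In such cases MySQL issues warning 1592 (the statement is unsafe to log in statement format). Unfortunately, a bug in MySQL 5.1 makes the server issue this warning in some safe cases as well. Although it causes no replication problems, it fills the error log with needless warnings. This improvement lets you avoid them.

3. Handling line terminators in BLOBs
Percona (from 5.1.x-12.x; not supported in 5.1.x-11.x) adds a new option to the MySQL client (no-remove-eol-carret) for handling BLOB fields that contain \r characters.

4. Resuming after a replication stop
When sql_slave_skip_counter is used and an event in the middle of an event group fails, the slave skips all remaining events until the end of that group. This is hard to put into words; Percona's usage example makes it clear.
http://www.percona.com/docs/wiki/percona-server:features:replication_skip_single_statement

5. Fixed read-ahead area
In InnoDB the size of the read-ahead area is computed dynamically, yet it is usually the same value. XtraDB (from 5.1.x-12.x; not supported in 5.1.x-11.x) can fix the size of this area and avoid the needless computation.
This is a patch released by Facebook: http://bazaar.launchpad.net/~mysqlatfacebook/mysqlatfacebook/5.1/revision/3538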

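IV. Reliability improvements
1. Crash-safe replication state
In MySQL, the slave's replication state is stored in two files that are not kept in sync (relay.index and relay.info). If the slave stops because of an error, the files end up out of sync and the last transaction is executed again. Percona adds the replication state to the XtraDB transaction log, so that on restart the slave can use this information to reach a consistent state.
Patch from Google: http://code.google.com/p/google-mysql-tools/wiki/TransactionalReplication
A bug this flaw can cause: http://bugs.mysql.com/bug.php?id=34058

2. Too Many Connections warning
Percona writes the "Too Many Connections" warning to the server-side error log, instead of only reporting the error to the client.

3. Error code compatibility
Percona (from 5.1.x-12.x; not supported in 5.1.x-11.x) provides compatibility with MySQL 5.5 error codes, avoiding problems caused by differing error codes after an upgrade to 5.5.

4. Handling corrupted tables (InnoDB)
In MySQL, once an InnoDB table is corrupted, all InnoDB tables become unavailable. XtraDB improves on this: only the corrupted table is disabled and locked, and the database can still use the other tables.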

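V. Manageability improvements
1. Fast InnoDB Recovery
A long-standing annoyance with InnoDB is that recovering InnoDB tables after a crash is very slow. Because Percona/XtraDB is based on InnoDB Plugin 1.0.8+, it also has the InnoDB Plugin's fast recovery. (Earlier Percona versions also recovered much faster than InnoDB, because early XtraDB used its own Fast Recovery implementation.)
Some tests: http://www.mysqlperformanceblog.com/2009/07/07/improving-innodb-recovery-time/

2. InnoDB data dictionary size limit
InnoDB allocates memory for table definitions in its own table cache; this is called the data dictionary. By default, once a table is opened, the internal object that represents it stays in memory until the table is dropped or the server restarts. With a very large number of tables (say 100,000 or more; Dubbo's logstat database is such a case), this can consume a huge amount of memory, sometimes on the order of gigabytes. Percona changes this policy: a parameter (innodb_dict_size_limit) limits the size of the data dictionary, and InnoDB uses an LRU algorithm to stay within it instead of keeping everything in memory, so memory is not exhausted just because there are too many tables.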
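
A size-limited dictionary with LRU eviction can be sketched in Python with collections.OrderedDict. This is a minimal model under stated assumptions: the class DictCacheSketch, the byte sizes, and the table names are made up, and it ignores details of the real feature such as which entries are eligible for eviction.

```python
from collections import OrderedDict


class DictCacheSketch:
    """Toy data dictionary cache bounded by total definition size (LRU eviction)."""

    def __init__(self, size_limit):
        self.size_limit = size_limit
        self.entries = OrderedDict()  # table name -> definition size, oldest first
        self.used = 0

    def open_table(self, name, definition_size):
        if name in self.entries:
            # Already cached: mark as most recently used.
            self.entries.move_to_end(name)
            return
        self.entries[name] = definition_size
        self.used += definition_size
        # Evict least recently used definitions until back under the limit.
        while self.used > self.size_limit and len(self.entries) > 1:
            _, evicted_size = self.entries.popitem(last=False)
            self.used -= evicted_size


cache = DictCacheSketch(size_limit=300)
for table in ("t1", "t2", "t3"):
    cache.open_table(table, 100)
cache.open_table("t1", 100)   # touch t1 so that t2 becomes the oldest entry
cache.open_table("t4", 100)   # exceeds the limit, so t2 is evicted
```

Without the limit the cache would simply keep growing with the number of tables opened, which is the default behaviour the paragraph describes.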

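3. Expanded table import
Unlike MyISAM, InnoDB does not let you copy a single table's files between servers. With an Xtrabackup export, however, a table can be imported into another XtraDB instance.

4. Buffer Pool in shared memory
When the Buffer Pool is very large, warm-up after a restart requires a lot of disk I/O and takes a long time. Storing the Buffer Pool in shared memory saves this very time-consuming I/O. It does not help when the host itself reboots; for that, use the next feature.

5. Dumping/restoring the Buffer Pool
With a very large Buffer Pool, restarting the database is painful: the InnoDB Buffer Pool usually has to warm up before serving traffic, which can take a long time. XtraDB (from 5.1.x-12.x; not supported in 5.1.x-11.x) provides commands to dump and restore the Buffer Pool contents, so service can resume faster after a restart.
Usage: http://www.percona.com/docs/wiki/percona-server:features:innodb_lru_dump_restore?redirect=1

6. Fast Index Creation
Fast index creation is an InnoDB Plugin feature: as long as the primary key is not changed, modifying indexes is much faster than before. In some scenarios, however, it can lead to corruption. XtraDB provides a parameter (innodb_fast_index_creation) to choose whether Fast Index Creation is enabled; when it is disabled, the original creation method is used.

7. Fast Index Renaming
XtraDB (from 5.1.x-12.x; not supported in 5.1.x-11.x) extends the ALTER TABLE command to rename indexes online without rebuilding them. (This is very useful for us when cleaning up non-standard index names.)

8. Keeping data out of Flashcache
Flashcache improves performance by caching data on SSD. It performs best when the hotter data is what gets cached, so XtraDB provides comment hints for skipping data that need not be cached.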

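VI. Diagnostics improvements
1. Additional INFORMATION_SCHEMA tables
Percona/XtraDB provides additional INFORMATION_SCHEMA tables that expose more detail about database internals, such as buffer pool contents or statistics.

2. Slow query log extensions
Percona provides additional statistics that can be enabled through parameters. They capture as much detail as possible about the events of interest, which makes slow-query analysis easier.

3. InnoDB status display
XtraDB reorganizes the InnoDB Status output for better readability, raises the number of status items from 24 to 48, and prints the amount of memory used by internal hash tables. The output can be configured through new parameters.

4. Counting InnoDB deadlocks
When running a transactional application, deadlocks always occur to some degree; as long as they are infrequent they are not a big problem. The SHOW INNODB STATUS command only gives information about the last deadlock, so it cannot tell us the total number of deadlocks or the number per unit of time. XtraDB adds a status variable that stores the deadlock count, which gives a better picture of the deadlocks occurring on our databases.

5. Logging all server commands (syslog)
Percona can log all commands run on the server to syslog.

6. Response time distribution
Percona provides a report showing the number of queries executed on the server within given intervals. This information can be used to monitor whether database performance is stable.

7. Show Storage Engines
Percona changes the output of SHOW STORAGE ENGINES to indicate whether XtraDB is enabled. (Previously XtraDB was also reported under the name InnoDB.)

8. Query Cache mutex state
The Query Cache can cause problems that are hard to detect. Percona modifies the SHOW PROCESSLIST command so that it can display the "Waiting on query cache mutex" state.

9. Showing mutex names
The "show mutex status" command displays the names of the mutexes currently being contended and their os_wait values.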

Tags: Database, InnoDB, MySQL, Percona, XtraDB
