Databases News

Examining the Oracle Database 10053 Trace Event Dump File
Ever wonder what the Oracle Database Cost Based Optimizer (CBO) was doing behind the scenes or how it comes up with an execution plan? While other tools or settings show us WHAT the CBO comes up with, the 10053 trace event setting tells us HOW the CBO came to its decision (the final execution plan). redir.internet.com | 7/30/10 12:21 AM
North Texans -- vote for your new meeting location
The North Texas MySQL Users Group is now a special interest group withing the Dallas Oracle Users Group. As such, we can meet in oracle's offices in Plano or Irving. In the past there has been demand for meetings in the northern part of the Metroplex and demand for meetings more near the DFW Airport. So we can meet in either office or alternate between the two. Please state you preference by voting on the North Texas MySQL Users Group website. dave-stokes.blogspot.com | 7/29/10 10:04 PM
Data Warehousing Best Practices: Comparing Oracle to MySQL, part 2 (partitioning)
At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer. See part 1 for the introduction and talking about power and hardware. This part will go over the 2nd “P”, partitioning. Learning about Oracle’s partitioning has gotten me more interested in how MySQL’s partitioning works, and I do hope that MySQL partitioning will develop to the level that Oracle partitioning does, because Oracle’s partitioning looks very nice (then again, that’s why it costs so much I guess). Partition – Larger tables or fact tables can benefit from partitioning because it makes data load easier and can increase join performance and use data elimination. Parallel execution can be done with partitioning due to partition pruning. The degree of parallelism should be a power of 2, because of hash-based algorithm in hash partitioning. To translate this to the MySQL world, if you are using LINEAR HASH partitioning, then you should use a degree of parallelism that is a power of 2 (I checked, and indeed. Otherwise, use a degree of parallelism that makes sense given the number of partitions you have. One important note that during Pythian’s testing of MySQL partitioning, we found that all partitions were locked when an INSERT occurs, for the duration of the INSERT. Bulk-loading with MySQL partitioning is not as fast as it would be if MySQL allowed partition pruning for INSERTs. So, what should be partitioned? For the first level of partitioning, the goal is to enable partitioning pruning and simplify data management. The most typical partitioning is range or interval partitioning on a date column. Interval partitioning is you say what the partition is (date, month) and partition is automatically created. MySQL does not have interval partitioning, and I have seen typical first-level partitioning be range or list based on a date or timestamp column. Note that if you use a timestamp field, the partitioning expression is optimized if you use TO_DAYS(timestamp_field) or YEAR(timestamp_field). In my experience, using anything else (such as DATE(timestamp_field)) actually makes partitioning slower than not using partitioning at all. Note that this is based on tests I did a few months ago, and your mileage may vary. So — how do you decide partitioning strategy? Ask yourself: What range of data do the queries touch – a quarter, a year? What is the data loading frequency? Is an incremental load required? How much data is involved, a day, a week, a month? The answers to the above questions will tell you about how big your interval needs to be. The best scenario is that all answers are the same, “we load every day, and people query by day.” If the answers are different weight access a higher priority than loading, because most people care more about query performance than performance of ETL. This is true even if your intervals have different sizes — ie sales per day are much bigger in Dec but that’s OK. However, Maria recommends that the subpartition be as evenly divided as possible. Easier to look at more partitions than to look at a partition that’s too big. But you don’t want too many partitions, max Oracle allows partitions is 1 million partitions, prior to 11g it was 64,000. “Stick closer to 64,000 than 1 million”. MySQL’s limitation is 1024 per table. For the second level of partitioning, also called subpartitioning, the goal is to allow for multi-level pruning and improve join performance. In Oracle, the most typical subpartition is hash or list – in MySQL, you can only subpartition by hash or key. How do you decide subpartitioning strategy? Select the dimension queried most frequently on the fact table OR Pick the common join column For example, if you want to look at sales per day, per store, you would choose “per day” as the partition and “per store” as the subpartition. If you do not have a good partition on logical elements (like grouping), then you can subpartition using hash partitioning on common joins — perhaps surrogate keys, or using join key of the largest table involved in the join. For example, if the sales table is partitioned and another big table is product, you can hash subpartition product_id. Because there’s overhead in partitions (loading metadata, reading metadata), make sure size of partitions and subpartitions is >20 Mb. So better to have a 30 Mb subpartition than a 15 Mb subpartition. [I have no idea if this is true in MySQL or not -- I think the general concept is true, because there is some overhead, but I have no idea about the 20 Mb figure and why that's true for Oracle, nor do I know what is true in MySQL.] One easy calculation is double the # of CPUs, round up to nearest power of 2. If you’re executing in parallel, Oracle will use 2x CPUs. (all this advice, by the way, follows 80/20 rule, this is probably good for about 80% of the environments out there). Of course, MySQL does not do parallel execution very well, so this probably does not apply. Oracle knows it can get partition elimination while it does a join. If 2 tables have the same degree of parallelism (same # of buckets) and are partitioned in the same way on the join column (say, customer_id in a subpartition of sales and a partition of customer), Oracle will match the partitions when joining: sales table joined with customer table can change into 4 small joins: sales sub part 1 joins with customer part 1 sales sub part 2 joins with customer part 2 sales sub part 3 joins with customer part 3 sales sub part 4 joins with customer part 4 And with parallelism, the total time is now reduced to the time it takes to do one of those smaller joins. This is also why you want to have a power of 2 for buckets – because cores/processors come in powers of 2. Partition-wise joins like this can also be done with range or list, assuming both tables in the join have the same buckets. I have no idea if MySQL partitioning works this way, but it’s certainly a functionality that makes sense to me. www.pythian.com | 7/29/10 10:00 PM
Data Warehousing Best Practices: Comparing Oracle to MySQL, part 1 (introduction and power)
At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer. These are my notes from the session, which include comparisons of how Oracle works (which Maria gave) and how MySQL works (which I researched to figure out the difference, which is why this blog post took a month after the conference to write). Note that I am not an expert on data warehousing in either Oracle or MySQL, so these are more concepts to think about than hard-and-fast advice. In some places, I still have questions, and I am happy to have folks comment and contribute what they know. One interesting point brought up: Maria quoted someone (she said the name but I did not grab it) from Forrester saying, “3NF is typically a selfless model used by Enterprise data warehouse, which is used by the whole company. A star schema is a selfish model, used by a department, because it’s already got aggregation in it.” I thought that was an interesting way of pointing that out — most people do not understand why 3NF is not good enough for data warehousing, and I have had a hard time explaining why a star or snowflake schema should be used. Another schema-related topic I had a hard time putting into words before this workshop was the difference between a star and a snowflake schema: compared to a star schema, in a snowflake schema, you have more than one fact table and maybe some dimensions that are not used often. From Maria and the slides: “Oracle says model what will suit your business best. Don’t get lost in academia. Most schemas are not 100% according to the theoretical models. Some examples: 3NF schema with denormalized attributes to avoid costly joins, Star schema with multiple hierarchies in same fact table.” Data warehousing has a 3-step approach — 1) data sources -> staging layer (temp loading layer) 2) staging layer (temp loading layer)-> foundation (logical, data store) layer 3) foundation (logical, data store) layer -> access and performance layer The foundation layer is usually 3NF the access layer is usually a star or snowflake schema. As for the data sources, they can be varied, you would hope that they are in 3NF (and if they are you can skip the first 2 steps) but they are not always that way. The 3 P’s of best practice for data warehousing (on Oracle) are power, partitioning, parallelism. The goal of the data warehousing environment is to minimize the amount of data accessed and use the most efficient joins – so it is not so index focused. This may be based on Oracle’s way of doing joins, I am not so sure if it applies to MySQL as well. Power The weakest link in the chain (the 3 steps above) will define the throughput, so make sure your hardware configuration is balanced. Maria mentioned that as DBAs, “most of the time we don’t have control over this, but we’re still bound to the SLAs.” This includes hardware that immediately comes to mind such as # of CPUs/cores, speed of CPU, amount of RAM, speed of disk as well as what we may not think of immediately: speed of network switches, speed of disk controllers, number and speed of host BUS adapters. Notes on host BUS adapters (HBAs): Know the # of HBA ports you have. 4 Gb HBA does 400 Mb/sec. 2 Gb HBA does 200 Mb/sec. Make sure there’s enough HBA capacity to sustain the CPU throughput (ie, make sure HBA isn’t the bottleneck). Also the speed at which it all talks. If you have a 4 Gb machine but a 2 Gb switch, you end up having 2 Gb throughput. Upgrade the network at the same time you upgrade machines. Because we are talking about data warehousing, it is often not possible to eliminate disk I/O, so the goal is to have the fastest I/O throughput possible. Data warehouses need to be sized on I/O throughput not number of I/O’s. I made a post earlier about how to determine I/O throughput for a system, which used information from this session. Justin Swanhart already pointed out that this is based on the fact that Oracle can do hash joins and MySQL can only do nested loop joins. I wonder, though, if there is indeed no case when using MySQL for which I/O throughput is a more useful metric than iops. Disk arrays that are expensive are usually sized for iops, not throughput, and because they’re expensive the disk array is shared throughout the company. A DBA needs to ask ‘how many connections into the storage array do I have? How many disk controllers do I have? Where are my physical disks, and which controllers are they hanging off of?’ Typical 15k rpm disk can do about 25-35 Mb/sec (per disk) random i/o’s. Disk manufacturers will throw out numbers like 200-300 Mb/sec but that’s sequential I/O and leading edge of the drive. Make sure all your LUNs are not coming off the same set of disks, so that you’re not conflicting on disk seeks. Continue to part 2, partitioning. www.pythian.com | 7/29/10 9:53 PM
Databases, The End User Experience
Does it matter if the end user knows what the database is?Recently I got a wonderful view of a database from the end user perspective.While I was traveling I had found a restaurant where I had decided to let friends who live locally know where I was at. Part way through my food I got a message from a local friend that said "Don't eat there, their food always makes people sick!""Always" is a word that I would think would be a little too strong when applied to a restaurant, right?Nope, the next day I got to feel the full truth of the word.A couple of days later I am telling some friends about this and a local asked me "Where was this, I want to avoid them." I didn't get asked this question once, I got it asked a dozen times.I don't know where the place is. Why is that? Because the system I was using lost the entire day worth of my data. I don't know how often they loose data, but from asking a few other folks it appeared to be that it is more frequent then not.It came up in casual conversation the other day that the site had moved off Postgres to another system recently. Which suddenly made everything make sense, because the particular solution they moved too is not very durable.We talk about databases being "transactional" or not. We talk about them being "durable". What matters in the end, to me as an end user, is that when I put my data in a system, I want a confirmation that the system stored it. I don't want to retype my data, and I don't want to collect it again. If I was the operator for the site? I certainly wouldn't want to be losing my users data.In the MySQL world? MyISAM is the most abandoned storage engine in the stack. People will pick it initially because it is fast, but the first time they discover data corruption or have to deal with multiple hours of recovery time they quickly move away from it.As an operator I wouldn't want to be having to explain to my users or my boss, that we had to wait 12 hours until the database recovered itself (or that it had corrupted itself). "Transactional" systems know how to handle recovery. People will wave their arms about and talk about disk controllers, disk failures, etc... That is hand waving. A properly configured system is redundant and sure it can be hit by lighting, but the real issue is most likely going to be that a plug gets pulled or a program crashes.I look at, and even work with, some of the "no-sql" solutions. Some of them I recommend, and other's of them I don't. I look at scale out needs, usage patterns, and a wide variety of other details.As end user though?I would like to know that my data was stored, and that I will reliably be able to retrieve it when I want. I don't like outages. Of the services online that I pay for or that I have integrated into my life? I can't imagine wanting to deal with a system which was unreliable. A free service which does not work most of the time, is not free. It will consume my time whenever it is not available.There is an end user experience for the database, site operators ought to remember this. krow.livejournal.com | 7/29/10 6:11 PM
Determining I/O throughput for a system
At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. In that session, there was a section on how to determine I/O throughput for a system, because in data warehousing I/O per second (iops) is less important than I/O throughput (how much actual data goes through, not just how many reads/writes). The section contained an Oracle-specific in-database tool, and a standalone tool that can be used on many operating systems, regardless of whether or not a database exists: If Oracle is installed, run DBMS_RESOURCE_MANAGER.CALIBRATE_IO: SET SERVEROUTPUT ON DECLARE lat INTEGER; iops INTEGER; mbps INTEGER; BEGIN -- DBMS_RESOURCE_MANAGER.CALIBRATE_IO(<DISKS>, <MAX_LATENCY>,iops,mbps,lat); DBMS_RESOURCE_MANAGER.CALIBRATE_IO (2, 10, iops, mbps, lat); dbms_output.put_line('max_mbps = ' || mbps); end; For us MySQL folks, or even the Drizzle or NoSQL folks, Oracle offers a free standalone tool called Orion. The example given in the slides was: ./orion –run advanced –testname mytest –num_small 0 –size_large 1024 –type rand –simulate contact –write 0 –duration 60 –matrix column -num_small is 0 because you don’t usually do small transactions in a dw. -type rand for random I/O’s because data warehouse queries usually don’t do sequential reads -write 0 – no writes, because you do not write often to the dw, that is what the ETL is for. -duration is in seconds -matrix column shows you how much you can sustain I would be interested to see how other folks measure I/O throughput, and maybe even do a side-by-side comparison of different tools. Orion is available for: Linux (x86, x86-64, Itanium, Power) Solaris (SPARC64) AIX (PPC64) zLinux HPUX (PA RISC, Itanium) Windows I am working on a larger write-up of the session itself, which had many concise descriptions of data warehousing issues, but I thought that this merited its own post. www.pythian.com | 7/29/10 4:23 PM
Shinguz's Blog (en): MySQL Cluster Local Checkpoint (LCP) and Global Checkpoint (GCP)
MySQL Cluster is mainly an in-memory database. Nevertheless it requires a good I/O system for writing various different information to disk. The information MySQL Cluster writes to disk are the: Global Checkpoints (GCP) which are the transactions. Local Checkpoints (LCP) which is a dirty image of the data. Backup. In the following schema you can see what is related to each other: Please find here the meaning of each parameter: TimeBetweenLocalCheckpoints DataMemory DiskCheckpointSpeed DiskCheckpointSpeedInRestart CompressedLCP Odirect DiskSyncSize RedoBuffer TimeBetweenGlobalCheckpoints NoOfFragmentLogFiles FragmentLogFileSize InitFragmentLogFiles www.fromdual.com | 7/29/10 3:29 PM
Fast Company is searching for 2010's Most Influential Person Online.
I admit this is not closely related to my day-by-day activities, but I find the experiment interesting. MySQL is the most popular open source database, particularly in the Web. Both twitter and facebook are avid users of MySQL. MySQL has influenced plenty of projects, has contributed to the creation of a multiplicity of social networks and it's behind many intriguing entrepreneurial ideas. The question is: how influential are you? Fast Company started a project to measure how much we influence the online communities. Here is the description coming from their website. We started with a simple question: Who are the most influential people online right now? That's what The Influence Project is designed to answer. By participating, you will have your picture appear in the November issue of Fast Company magazine as part of an amazing photo spread. The more influence you demonstrate, the bigger your picture will be. You may discover that you're more influential than you think. Influence is not only about having the most friends or followers. Real influence is about being able to affect the behavior of those you interact with, to get others in your social network to act on a suggestion or recommendation. When you post a link or recommend a site, how many people actually bother to check it out? And what's the likelihood of those people then forwarding it on? How far does your influence spread? This is the type of influence we're looking for. We want to find the most influential person online. Who knows? It might even be you. Participate to the project by clicking here. feedproxy.google.com | 7/29/10 12:57 PM
Building a Data Warehouse Blueprint for Success
One of the most integral components and critical success factors of any enterprise data warehousing initiative is the Solutions Architecture document, a high-level conceptual model of a data warehousing solution. Learn why this collaborative effort that addresses the needs of all major stakeholders, including both the business units and Information Technology (IT), is essential. redir.internet.com | 7/29/10 12:30 AM
I/O Performance Tuning Tools for Oracle Database 11gR2
Oracle Database 11g Release 2 (11gR2) makes it easier than ever to isolate and identify potential root causes of poor I/O performance. This article focuses on the various Oracle and non-Oracle tools to generate sample workloads to provide sufficient metrics for detection and eventual resolution of performance bottlenecks within the database's underlying input/output (I/O) subsystem. redir.internet.com | 7/29/10 12:15 AM
Will Oracle kill MySQL?
I get asked this question often. It was mentioned again recently in a NYTECH executive breakfast with RedHat CIO Lee Congdon. The short answer is No. There is clear evidence that in the short to medium term Oracle will continue to promote and enhance MySQL. Some of these indicators include: EU 10 point commitment in December 2009 – See Oracle Makes Commitments to Customers, Developers and Users of MySQL MySQL Conference April 2010 – Opening keynote by Edward Screven State of the Dolphin Oracle Magazine Jul/Aug 2010 – Interview with Edward Screven Open for Business. It is clear from these sources that Oracle intends to incorporate MySQL into Oracle Backup and Security Vault products. Both a practical and necessary step. There is also a clear mention of focusing on the Microsoft platform, a clear indicator that SQL Server is in their sights without actually saying it. What is unknown is exact how and when features will be implemented. Also important is how much these may cost the end user. Oracle is in the business of selling, now an entire H/W and S/W stack. They also have a complicated pricing model of different components with product offerings. I assume this will continue. There are already two indications, InnoDBbackup included for Enterprise Backup (from April Keynote) and 5.1 enterprise split. (Note: while this split may have existed prior to Oracle, it is now more clearly obvious). MySQL can never be seen as drawing away from any Oracle sales of the core entry level database product. It is likely Oracle will provide a SQL Syntax compatibility layer for MySQL within 2 years, however it will I’m sure be a commercial add-on. Likewise, I would suspect a PL/SQL lite layer within 5 years, but again at a significant cost to offset the potential loss of sales in the low end of the server market. There continues to be active development in the MySQL Enterprise Monitor, MySQL Workbench and MySQL Connectors which is all excellent news for users. Moving forward, how long will this ancillary development of free tools continue? What will happen to the commercial storage engine, OEM and licensing model after the 5 year commitment? How will the MySQL ecosystem survive.? There is active development in Percona, MariaDB and Drizzle forks, however unless all players that want to provide a close MySQL compatible solution work together, progress will continue to be a disappointing disjointed approach. The 2011 conference season will also see a clear line with competing MySQL conferences in April scheduled at the same time, the O’Reilly MySQL conference in Santa Clara California and the Oracle supported(*) Collaborate 2011 in Orlando, Florida. I have a number of predictions on what Oracle ME MySQL may look like in 5 years however this is a topic for a personal discussion. ronaldbradford.com | 7/28/10 10:32 PM
Speaking at Surge Scalability 2010 – Baltimore, MD
I will be joining a great list of quality speakers including John Allspaw, Theo Schlossnagle, Rasmus Lerdorf and Tom Cook at Surge 2010 in Baltimore, Maryland on Thu 30 Sep, and Fri Oct 1st 2010. My presentation on “The most common MySQL scalability mistakes, and how to avoid them.” will include discussing various experiences observed in the field as a MySQL Consultant and MySQL Performance Tuning expert. Abstract: The most common mistakes are easy to avoid however many startups continue to fall prey, with the impact including large re-design costs, delays in new feature releases, lower staff productivity and less then ideal ROI. All growing and successful sites need to achieve higher Availability, seamless Scalability and proven Resilience. Know the right MySQL environments to provide a suitable architecture and application design to support these essential needs. Overview: Some details of the presentation would include: The different types of accessible data  (e.g. R/W, R, none) What limits MySQL availability (e.g software upgrades, blocking statements, locking etc) The three components of scalability – Read Scalability/Write Scalability/Caching Design practices for increasing scalability and not physical resources Disaster is inevitable. Having a tested and functional failover strategy When other products are better (e.g. Static files, Session management via Key/Value store) What a lack of accurate monitoring causes What a lack of breakability testing causes What does “No Downtime” mean to your organization. Implementing a successful “failed whale” approach with preemptive analysis Identifying when MySQL is not your bottleneck ronaldbradford.com | 7/28/10 9:53 PM
db4free.net’s 5th birthday
To my shame I must admit, I missed it. It happened on June 29, 2005 when db4free.net was first available to the public. At that time it was running MySQL 5.0.7 beta. Quite a lot has happened since then, MySQL 5.0 made its way up to 5.0.91 and the current MySQL GA version is 5.1.49, which is also the version db4free.net is running as of today. The first phpMyAdmin version that db4free.net was offering to provide easy access to the user’s databases was 2.6.3. Today I updated phpMyAdmin to 3.3.5. Statistics are not necessarily 100 % accurate, but here is the best I can come up with. Since its launch, db4free.net had 528,900 visits. The ratio registrations per visits is at about 22 %, so more than every fifth visit ends in signing up for a new database. Which means, that about between 110,000 and 120,000 database accounts (meaning database and user) have ever been created. There have been some cleanups since then to make resources available to people who actively use their databases. Which is why the current number of databases is much lower, at slightly above 13,400. Today I must (also) admit that the code behind db4free.net’s web application was initially quite poor and it stayed poor for quite long. Why it was poor from my today’s point of view is probably due to my learning process in these 5 years. This year I did a lot of cleanup on the code behind db4free.net’s web application and removed a few bottlenecks which often made the website painfully slow. Now it’s in a fairly reasonable state again, and I have some ideas in the back of my head how to further improve it. So db4free.net is far from its end of life. Quite the opposite is the case, it’s time to give it a new boost so that the balance after its 10th anniversary is even more impressive than today’s 5 year balance. www.mpopp.net | 7/28/10 8:34 PM
Early results from admission_control
I want to limit the maximum number of concurrent queries by MySQL account. This will allow some accounts to have many concurrent connections without the risk of having many concurrent queries. The feature is enabled by adding admission_control to my.cnf. Early results for it are promising but I have much more testing to do. Results for sysbench doing fetch by primary key with 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048 and 4096 concurrent clients are in this graph. The results for admission control have the label new. Both new and original use MySQL 5.1.47 with the Facebook patch. MySQL with admission control is much better at sustaining peak QPS when the server is saturated. I have a bit of work to make this ready for production. The implementation requires a count of running queries per account. This count must be adjusted when a thread blocks within the server. Things that can make a thread block include LOCK TABLES, GET_LOCK() and InnoDB row lock waits. I need to find all places in the server where a thread might block. When this feature is enabled, this output from SHOW PROCESSLIST displays the result of a query blocked by admission control: | 1182 | root | db1:56546 | test | Query | 0 | NULL | HANDLER sbtest READ `PRIMARY` = (3331584) | | 1188 | root | db1:56552 | test | Sleep | 0 | wait for max concurrent queries | NULL | www.facebook.com | 7/28/10 7:20 PM
Install MySQL from Tar ball
Download your favourite version of MySQL Server from your local mirror. We can use the latest GA (5.1.49) or development release (5.5.5m3). Move your tarball to the /usr/local directory. I'm assuming that you're logged into your box using the root account. If you're not then you might need to utilize sudo.Add the mysql user and group   groupadd mysql   useradd -g mysql mysqlUntar and Create a symbolic link   cd /usr/local/   tar -zxvf mysql-{version}.tar.gz   ln -s mysql-{version} mysqlFile and Directory creation/permissions    mkdir -p /var/lib/mysql *   chown -R mysql:mysql /usr/local/mysql   chown -R mysql:mysql /var/lib/mysqlInstall the system databases and place my.cnf   cp /usr/local/mysql/support-files/my-medium.cnf  /etc/my.cnf **   scripts/mysql_install_db --user=mysql --datadir=/var/lib/mysql --basedir=/usr/local/mysql ***Start mysqld   /usr/local/mysql/bin/mysqld_safe &Check your error log file to keep abreast of any issues that might arise.Some house keeping is needed to ensure that MySQL will start a boot and also adding mysql into the path. These are basic Linux tasks. Please let me know if you'd like to see my recommendations for these tasks.* This command is going to create your datadir location. This might not fit your preferences so this is something that you will choose to include or omit. You should review your my.cnf and make any changes to accommodate your desired location.** This will copy a default my.cnf to the preferred my.cnf location. If you have written your own or have something you wish to use, copy that and not the example file. Your my.cnf is a pretty big deal as it's going configure your server to run like a two-legged dog or Usain Bolt!*** These parameters can be augmented, omitted or kept. You're going to want to use the values in your my.cnf or change your my.cnf to reflect what you use here. blog.mysqlboy.com | 7/28/10 2:00 PM
Hybrid licensing strategies for open source monetization
One of the issues that has arisen from the ongoing debate about the open core licensing strategy is the continuing confusion about open core compared to the use of open source components in a larger proprietary product – such as IBM’s use of Apache within WebSphere. To some people there is no difference between the two (since they both result in products that make use of open source but are not open source), however it is clear to me that while the end result might be the same these are very different strategies that involve different approaches to engaging with open source communities/projects. While open core has a clear definition there is no agreed term or definition for the latter category. Over the years we have used a variety of terms to describe it, including “open and closed”, “embedded open source”, “open inside” and “open complement”, while Jack Repenning has referred to it as “open infrastructure”. Our next categorization of open source-related business strategies is still a work in progress but the current thinking is as follows: There are a variety of complementary strategies employed by vendors to generate revenue from open source software indirectly. The simplest of these is open complement which is selling other products and services that are related to but separate from, and not reliant upon, the open source project. Then there is encouraging open source development on top of proprietary products to retain develop interest in that product. This is known as open edge. Then there is using open source software to create a platform for the provision of SaaS or cloud or social networking services (for example), which I am referring to as open platform. Then there is using open source components as building blocks for a larger proprietary software product, which I am calling an open foundation licensing strategy. (This categorization is a work in progress, we welcome and encourage any feedback) Open core and open foundation have different evolutionary lineages: open core is a variation on dual licensing as practiced by the likes of MySQL and Sleepycat that also borrows heavily on the value-added subscription model as practiced by Red Hat and JBoss. Meanwhile open foundation has its roots in the commercialization of BSD, which pre-dates the concepts of open source and free software, as well as Apache. From a practical perspective, the easiest way to think of the distinction between open core and open foundation is via an example: PostgreSQL is an independent, community-developed open source project. EnterpriseDB offers extensions to the PostgreSQL core, such as Oracle-compatibility, in the form of Postgres Plus Advanced Server. PostgreSQL has also been used by many other vendors to create commercial products. For example Greenplum used PostgreSQL as the foundation of its Greenplum Database (for other examples see this post). This allowed the company to build on proven database technology and avoid reinventing the wheel, but it also involved the creation of an entirely new product, rather than extensions to an open source project (the company initially actually started a new project, Bizgres, and created extensions to that but Bizgres was last seen in August 2008). So while open core involves offering proprietary extensions targeted at a segment of the open source project user base, open foundation involves using open source software to create entirely new products, targeted at a different user base. The example used above highlights three important points to consider when comparing open core and open foundation strategies: 1/ While open core is most readily associated with vendor-controlled projects it can also be used as a strategy to monetize community-controlled projects. 2/ Open core strategies can be used in conjunction with complementary strategies. In the Greenplum example the company’s relationship with Bizgres was open core, while the relationship with PostgreSQL was open foundation. Similarly there is an open core relationship between Actuate’s BIRT products and the Eclipse BIRT project, and an open complement relationship between Actuate 10 and the Eclipse BIRT project. Meanwhile there is an open core relationship between Day Software’s CRX content repository and the Apache Jackrabbit and Sling projects, and a open foundation relationship between CQ5 and Jackrabbit, Felix and Sling – as well as the numerous other Apache projects that Day contributes to. 3/ Open core and open foundation are licensing strategies used as part of a larger business strategy for engaging with and commercializing open source software, which highlights the futility in trying to pigeon-hole companies as “open core vendors” or “open source vendors”. Finally it is worth thinking about the different tensions that the open core and open foundation strategies create with their respective communities. As Jorg Janke notes, “looking for an income stream as an open source vendor always results in some sort of conflict with the community. So, you have to pick the community you want to ‘offend’.” With a vendor-controlled open core strategy the community is a user community, and as we have previously discussed the conflict is in deciding what features belong in the core and what features don’t. With an open foundation strategy the community is the open source project developer community, and the conflict lies in deciding what features and resources to contribute to that project. A community-controlled open core strategy arguable results in conflict with both the user and developer communities, although since the vendor does not own or control the project the relationship is much more comparable to the open foundation strategy. We will be writing more about other strategies for generating revenue from open source software, in a follow-up to our Open Source is Not a Business Model report, which is due to be published latter this year. It will provide more context for the economic motivators and issues involved in the various models, as well as updated research on which vendors are following which strategies, and why, as well as a survey to uncover what software users make of it all. The report will be freely available to CAOS subscribers. For more details of the CAOS research practice, and to apply for trial access, click here. blogs.the451group.com | 7/28/10 11:58 AM
Drizzle, boost::unordered_multimap++
========================================================================================================= REGRESSION REPORT ========================================================================================================= MACHINE: drizzle-build-n01.wc1.dfw1.stabletransit.com RUN ID: 324 RUN DATE: 2010-07-27T21:48:07.932094 WORKLOAD: innodb_1000K_readonly SERVER: drizzled VERSION: lp:drizzle/staging REVISION: 1669 COMMENT: 1669: Brian Aker 2010-07-27 This patch turns the table_cache into boost::unordered_multimap. ========================================================================================================= TRENDING OVER LAST 5 runs Conc TPS % Diff from Avg Diff Norm? Min Max Avg STD ========================================================================================================= 16 1993.10 +1.19% 23.37 within norms 1905.44 2010.28 1969.73 26.56 32 2676.28 +1.75% 46.11 outside norms 2568.20 2685.05 2630.17 33.22 64 2622.82 -2.09% -56.00 outside norms 2613.02 2737.89 2678.82 43.09 128 2621.23 -0.61% -16.15 within norms 2609.74 2662.30 2637.38 16.41 256 2598.18 +5.50% 135.46 outside norms 2406.50 2599.34 2462.72 68.67 512 2343.83 +52.30% 804.91 outside norms 1318.38 2347.49 1538.92 402.80 1024 1722.08 +104.66% 880.65 outside norms 614.65 1725.52 841.43 440.35 Boost::unordered_multimap versus the hand crafted MySQL HASH?Boost wins by a long shot. The above is from sysbench. We run it on each and every push that goes into the main tree looking for regression. Every so often we get to see the opposite happen :)Thanks to this patch, "Table" now joins the ranks of what we call a "trusted object". This means we can start safely assuming that the destructor on it is working (most of the MySQL codebase was crafted in such a way that you can only use a very limited subset of C++). In all of the new code in Drizzle we can easily make use of C++. There are still a few older bits where we cannot. Having "Table" now work means we can safely work in a number of new areas in the server. One of the biggest changes that will be coming soon is the removal of LOCK_open for a number of new cases. krow.livejournal.com | 7/28/10 6:47 AM
Five Handy Tips for MySQL's Powerful UPDATE Statement
Rob Gravelle explores the MySQL UPDATE statement, which is used to modify existing records in a table. Among its many features, he looks at how to update multiple tables, avoiding mistakes, and how to limit how many rows are updated. redir.internet.com | 7/28/10 12:25 AM
If you missed MySQL Idiosyncrasies that BITE
I recently gave a webinar to the LAOUC and NZOUG user groups on MySQL Idiosyncrasies that BITE. For the benefit of many viewers that do not use English as a first language my slides include some additional information from my ODTUG Kaleidoscope presentation in June. Thanks to Francisco Munoz Alvarez for organizing. MySQL Idiosyncrasies That Bite 2010.07 View more presentations from Ronald Bradford. ronaldbradford.com | 7/28/10 12:24 AM
N900 – control all of your accounts with this script
If you own a Nokia N900 cellular device you might be interested in the ability to control all of your IM accounts from the command line. For those that do not know, the N900 runs Maemo Linux and is capable of running MySQL embedded if you so choose. Here’s a quick script I wrote to provide that functionality for IM accounts. It’s at the bottom of the page, called “im-connections”. wiki: http://wiki.maemo.org/N900_Mission_Control#Set_all_SIP_accounts_to_online_or_offline pastebin: http://pastebin.com/qAC57E1N themattreid.com | 7/27/10 11:42 PM
MySQL Connector/Net 6.3.3 (beta 2) has been released
MySQL Connector/Net 6.3.3, a new version of the all-managed .NET driver for MySQL has been released. This is a beta release and is intended to introduce you to the new features and enhancements we are planning. This release should not be used in a production environment.  It is now available in source and binary form from http://dev.mysql.com/downloads/connector/net/6.3.html] and mirror sites (note that not all mirror sites may be up to date at this point of time – if you can’t find this version on some mirror, please try again later or choose another download site.) The new features or changes in this release are: Visual Studio 2010 RTM support New sql editor.  Create a new file with a .mysql extension to see it in action Please check the changelog and release notes for more information What we know may be broken Documentation is not updated yet and is not integrated into VS 2010. Please let us know what else we broke and how we can make it better! www.reggieburnett.com | 7/27/10 10:01 PM
HOWTO screw up launching a free software project
Josh Berkus gave a great talk at linux.conf.au 2010 (the CFP for linux.conf.au 2011 is open until August 7th) entitled “How to destroy your community” (lwn coverage). It was a simple, patented, 10 step program, finely homed over time to have maximum effect. Each step is simple and we can all name a dozen companies that have done at least three of them. Simon Phipps this past week at OSCON talked about Open Source Continuity in practice – specifically mentioning some open source software projects that were at Sun but have since been abandoned by Oracle and different strategies you can put in place to ensure your software survives, and check lists for software you use to see if it will survive. So what can you do to not destroy your community, but ensure you never get one to begin with? Similar to destroying your community, you can just make it hard: “#1 is to make the project depend as much as possible on difficult tools.” #1 A Contributor License Agreement and Copyright Assignment. If you happen to be in the unfortunate situation of being employed, this means you get to talk to lawyers. While your employer may well have an excellent Open Source Contribution Policy that lets you hack on GPL software on nights and weekends without a problem – if you’re handing over all the rights to another company – there gets to be lawyer time. Your 1hr of contribution has now just ballooned. You’re going to use up resources of your employer (hey, lawyers are not cheap), it’s going to suck up your work time talking to them, and if you can get away from this in under several hours over a few weeks, you’re doing amazingly well – especially if you work for a large company. If you are the kind of person with strong moral convictions, this is a non-starter. It is completely valid to not want to waste your employers’ time and money for a weekend project. People scratching their own itch, however small is how free software gets to be so awesome. I think we got this almost right with OpenStack. If you compare the agreement to the Apache License, there’s so much common wording it ends up pretty much saying that you agree you are able to submit things to the project under the Apache license.  This (of course) makes the entire thing pretty redundant as if people are going to be dishonest about submitting things under the Apache licnese there’s no reason they’re not going to be dishonest and sign this too. You could also never make it about people – just make it about your company. #2 Make it all about the company, and never about the project People are not going to show up, do free work for you to make your company big, huge and yourself rich. People are self serving. They see software they want only a few patches away, they see software that serves their company only a few patches away. They see software that is an excellent starting point for something totally different. I’m not sure why this is down at number three… it’s possibly the biggest one for danger signs that you’re going to destroy something that doesn’t even yet exist… #3 Open Core This pretty much automatically means that you’re not going to accept certain patches for reasons of increasing your own company’s short term profit. i.e. software is no longer judged on technical merits, but rather political ones. There is enough politics in free software as it is, creating more is not a feature. So when people ask me about how I think the OpenStack launch went, I really want people to know how amazing it can be to just not fuck it up to begin with. Initial damage is very, very hard to ever undo. The number of Open Source software projects originally coming out of a company that are long running, have a wide variety of contributors and survive the original company are much smaller than you think. PostgreSQL has survived many companies coming and going around it, and is stronger than ever. MySQL only has a developer community around it almost in spite of the companies that have shepherded the project. With Drizzle I think we’ve been doing okay – I think we need to work on some things, but they’re more generic to teams of people working on software in general rather than anything to do with a company. Share this on Facebook Tweet This! Share this on del.icio.us Digg this! Post on Google Buzz www.flamingspork.com | 7/27/10 7:40 PM
Does Size or Type Matter?
MySQL seems to be happy to convert types for you. Developers are rushed to complete their project and if the function works they just move on. But what is the costs of mixing your types? Does it matter if your are running across a million rows or more? Lets find out. Here is what the programmers see. mysql> select 1+1; +-----+ | 1+1 | +-----+ |   2 | +-----+ 1 row in set (0.00 sec) mysql> select "1"+"1"; +---------+ | "1"+"1" | +---------+ |     2 | +---------+ 1 row in set (0.00 sec) Benchmark What if we do a thousand simple loops?  How long does the looping itself take? The BENCHMARK() function executes the expression expr repeatedly count times. It may be used to time how quickly MySQL processes the expression. The result value is always 0. mysql> select benchmark(1000000000, 1); +--------------------------+ | benchmark(1000000000, 1) | +--------------------------+ |                        0 | +--------------------------+ 1 row in set (5.42 sec) mysql> select benchmark(1000000000, "1" ); +-----------------------------+ | benchmark(1000000000, "1" ) | +-----------------------------+ |                           0 | +-----------------------------+ 1 row in set (5.40 sec) So maybe type doesn’t matter? About five seconds just to loop but the type didn’t change it.   What if we add 1+”1″? mysql> select benchmark(1000000000, 1+1); +----------------------------+ | benchmark(1000000000, 1+1) | +----------------------------+ |                          0 | +----------------------------+ 1 row in set (12.65 sec) mysql> select benchmark(1000000000, 1+"1"); +------------------------------+ | benchmark(1000000000, 1+"1") | +------------------------------+ |                            0 | +------------------------------+ 1 row in set (35.58 sec) mysql> select benchmark(1000000000, "1"+"1"); +--------------------------------+ | benchmark(1000000000, "1"+"1") | +--------------------------------+ |                              0 | +--------------------------------+ 1 row in set (51.59 sec) It looks like type does matter.  But does it always matter? mysql> select benchmark(1000000000, sum(1+1)); +---------------------------------+ | benchmark(1000000000, sum(1+1)) | +---------------------------------+ |                               0 | +---------------------------------+ 1 row in set (9.69 sec) mysql> select benchmark(1000000000, sum("1"+"1")); +-------------------------------------+ | benchmark(1000000000, sum("1"+"1")) | +-------------------------------------+ |                                   0 | +-------------------------------------+ 1 row in set (9.94 sec) mysql> select benchmark(1000000000, sum("1.23456789"+"1.23456789")); +-------------------------------------------------------+ | benchmark(1000000000, sum("1.23456789"+"1.23456789")) | +-------------------------------------------------------+ |                                                     0 | +-------------------------------------------------------+ 1 row in set (10.32 sec) So, not all functions are the same.  But it looks like size might matter! mysql> select benchmark(1000000000, 1.1+1.1); +--------------------------------+ | benchmark(1000000000, 1.1+1.1) | +--------------------------------+ |                              0 | +--------------------------------+ 1 row in set (34.90 sec) mysql> select benchmark(1000000000, "1.1"+"1.1"); +------------------------------------+ | benchmark(1000000000, "1.1"+"1.1") | +------------------------------------+ |                                  0 | +------------------------------------+ 1 row in set (1 min 15.32 sec) mysql> select  benchmark(1000000000, "1.123456789"+"1.123456789"); +----------------------------------------------------+ | benchmark(1000000000, "1.123456789"+"1.123456789") | +----------------------------------------------------+ |                                                  0 | +----------------------------------------------------+ 1 row in set (1 min 53.32 sec) Sorry.  Looks like size does matter. This doesn't seem logical. mysql> select benchmark(1000000000, 1=1); +----------------------------+ | benchmark(1000000000, 1=1) | +----------------------------+ |                          0 | +----------------------------+ 1 row in set (12.75 sec) mysql> select benchmark(1000000000, 1="1"); +------------------------------+ | benchmark(1000000000, 1="1") | +------------------------------+ |                            0 | +------------------------------+ 1 row in set (40.78 sec) mysql> select benchmark(1000000000, 1=true); +-------------------------------+ | benchmark(1000000000, 1=true) | +-------------------------------+ |                             0 | +-------------------------------+ 1 row in set (12.73 sec) mysql> select benchmark(1000000000, 1="true"); +---------------------------------+ | benchmark(1000000000, 1="true") | +---------------------------------+ |                               0 | +---------------------------------+ 1 row in set, 65535 warnings (3 min 5.72 sec) mysql> select benchmark(1000000000, "true"="true"); +--------------------------------------+ | benchmark(1000000000, "true"="true") | +--------------------------------------+ |                                    0 | +--------------------------------------+ 1 row in set (57.25 sec) Maybe we should CAST our work? mysql> select benchmark(1000000000, cast("1" as unsigned)); +----------------------------------------------+ | benchmark(1000000000, cast("1" as unsigned)) | +----------------------------------------------+ |                                            0 | +----------------------------------------------+ 1 row in set (32.27 sec) mysql> select benchmark(1000000000, cast("1" as unsigned) + cast("1" as unsigned)); +----------------------------------------------------------------------+ | benchmark(1000000000, cast("1" as unsigned) + cast("1" as unsigned)) | +----------------------------------------------------------------------+ |                                                                    0 | +----------------------------------------------------------------------+ 1 row in set (1 min 7.24 sec) Maybe not! Conclusion:  Be careful with your data types.  If you are taking user input, do the type conversion ONCE in your program.  Don’t let MySQL do the type conversions for you. query = “SELECT * FROM table where $INPUT = 1″;   could be doing your wrong. References: http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_benchmark http://dev.mysql.com/doc/refman/5.0/en/numeric-type-overview.html http://dev.mysql.com/doc/refman/5.0/en/cast-functions.html www.mysqlfanboy.com | 7/27/10 7:08 PM
Online Verification That Master and Slaves are in Sync
In October 2008, Baron posted How to Check MySQL Replication Integrity Continually. Here at Pythian we have developed a method based on that post, and added “verifying that masters and slaves are in sync” to our standard battery of tests. We call it “Continual replication sync checking”. This article will explain how it works, how to test and make the procedure non-blocking, benchmarks from the real world, issues we encountered along the way, and finally Pythian’s procedure of setting up continual replication sync in a new environment. At the 2010 MySQL User Conference & Expo, my co-worker Danil Zburivsky did a presentation about this, and you can get the slides in ODP (Open Office) format or watch the 46-minute video on YouTube. How it works On the master, mk-table-checksum is run. In order to make the checksum operation online, we use the modulo and offset features of mk-table-checksum to checksum only part of the data at a time. The checksum is run (from cron) on the master and replicates to the slave. The results are captured in a result table, and a separate process checks the result table and notifies us of any discrepancies. Testing resource usage for non-blocking replication sync checking Each environment has a different data size, different resources, and a different level of what “intrusive” means. We start with a baseline of the size of the database, which we get from: SELECT SUM(INDEX_LENGTH+DATA_LENGTH)/1024/1024/1024 as sizeGb FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE='BASE TABLE'; I included INDEX_LENGTH because the queries that generate the checksum information may or may not depend on indexes. Note that for InnoDB tables, the metadata is approximate, so this will not necessarily be the exact size of the database, but it will be in the right range. In general, running this query in the INFORMATION_SCHEMA did not consume too many resources, though in many cases the query took a few minutes to complete. On databases with several thousand tables, this query can take hours to complete. Of course there is the caveat that it may take a VERY long time to run this in your environment, and some folks have reported crashing (this is VERY rare, I personally have not seen it even with the high demand client we have had, but others have reported it). So if you prefer to be on the safe side, you could look at the size of your ibdata files (whether you are innodb_file_per_table or not) plus the sizes of all the tables on disk (MyISAM, CSV, etc). That will give you a sense of how big the data is, although the size of the files on disk also includes any fragmentation. In practice, I have seen fragmentation reduce the size of a large database by 40%, so beware that neither of these methods is perfect. Once we get the data size in gigabytes, we can come up with a modulo to test. In general, we have found that mk-table-checksum is quick enough that it is non-blocking when we use a modulo value that corresponds to approximately 1 Gb of data, and a chunk-size of 100. We actually started out using a checksum of 1000, but locking 1000 rows to test a checksum proved to be longer than was acceptable for clients with lots of transactions — the problem was not the locking on the master, the problem was that it created a slave lag on the slaves. We tried using the --sleep option to mk-table-checksum, which worked, but changing the chunksize to be smaller caused less slave lag than adding a sleep parameter. Another issue when using chunk-size is that it requires a numeric (but non-decimal) index. We ran across clients using VARCHAR(32) as a PRIMARY key (yes, on InnoDB, and yes, performance was an issue there), or tables that did not have any indexes at all (such as a logging table). If mk-table-checksum does not find a suitable index it will just do a checksum on the whole table. To find tables that would be problematic, here is an INFORMATION_SCHEMA query that can be run (again, the usual caveats about INFORMATION_SCHEMA apply): SELECT CONCAT("SHOW CREATE TABLE ", table_schema,".",table_name,"\\G") FROM TABLES LEFT JOIN ( SELECT distinct table_schema,table_name FROM STATISTICS INNER JOIN COLUMNS USING (table_schema,table_name,column_name) WHERE (data_type like '%int' or data_type like 'date%' or data_type like 'time%' or data_type='float') and seq_in_index=1 ) AS good_for_checksum USING (table_schema,table_name) WHERE good_for_checksum.table_name IS NULL and table_schema not in ('information_schema'); Frequency of checksum runs The frequency that we run the checksum is also very flexible, so we take the size and translate that into a modulo that is “even” in a time-based way. For example, on a server that reported 113 Gb in size from the INFORMATION_SCHEMA query above, we set the modulo to 120. The checksum took 10 minutes and 6 seconds from the time the master started to the time the slave finished. There was no excessive slave lag caused and other resource checks showed that this test was acceptable, including application response time for non-administrative queries. Given a modulo of 120 that takes about 10 minutes to run and the environment, we decided to run the checksum 9 times per day (hourly for 9 hours during off-peak time). This resulted in the entire data set being checked during a period of just under 2 weeks (120 parts / 9 times per day = 13.333 days). This means that if there is a data discrepancy, it is discovered within 2 weeks in this particular environment. Though that is not ideal, it is much better than not discovering data discrepancies at all, which is how replication currently works. Benchmarks From the Real World The first row in the table below is the example we just went through. The subsequent rows of the table are results from some of the other production environments we run the checksum in. As you can see, we try to keep the data checked at one time to about 1 Gb. Total Data SizeModuloTest timeChecksum RateFrequencyPeriod 113 Gb12010 min 6 seconds1.59 Mb / sec9x / day2 weeks 9 Gb2151 seconds8.4 Mb / sec3x / day1 week 29 Gb219 min 16 seconds2.6 Mb / sec3x / day1 week 70 Gb2128 seconds2650 Mb / sec(data wasfreshly defragmented!)3x / day1 week 5.1 Gb214 min 22 sec0.958 Mb / sec3x / day1 week 314.5 Gb33636 min 3 seconds0.44 Mb / sec16x / day3 weeks In all of these environments, slave lag was 10 seconds or less at any given point on the slaves. Issues Encountered Some of the issues we encountered have workarounds, so I wanted to discuss and explain them here before giving our procedure, which contains the workarounds. Bug 304 – mk-table-checksum deletes all prior checksum results in the result table on every run. There is no fix for this yet, but if you are using mk-table-checksum only for the procedure described in this article (and in particular are not using the --resume-replication option), you can comment out the following code from mk-table-checksum: # Clean out the replication table entry for this table. if ( (my $replicate_table = $final_o->get('replicate')) && !$final_o->get('explain') ) { use_repl_db(%args); # USE the proper replicate db my $del_sql = "DELETE FROM $replicate_table WHERE db=? AND tbl=?"; MKDEBUG && _d($dbh, $del_sql, $db, $table->{table}); $dbh->do($del_sql, {}, $db, $table->{table}); } It is in different places in different versions, but last I checked, searching for "DELETE FROM" in mk-table-checksum only matched three lines of code, and it was pretty clear (due to the inline comment) which block of code to delete. The block shown above is from lines 5154 – 5161 in mk-table-checksum changeset 6647. Running the checksum may cause “statement not safe for replication” errors, especially in 5.1. This is usually OK to ignore, because mk-table-checksum works specifically because you can run the same command on the master and slave and get different results. In MySQL 5.1, CSV tables for the general and slow logs exist by default, even if they are not being written to, and “Statement is not safe to log in statement format” errors show up. You will need to redirect stderr and expect to see those statements in the MySQL error log. Note that mk-table-checksum works regardless of whether you are using statement-based, row-based or mixed replication. mk-table-checksum is not perfect, and sometimes shows false positives and false negatives. This is a hard to deal with problem, and we encourage making bug reports when they are found. However, I will note that if mk-table-checksum finds even one undetected data integrity issue, then it is useful, because right now there is no other way of detecting issues in an automated fashion. As more people use mk-table-checksum and can help the developers figure out how to fix the false positives/false negatives, I am sure it will be even better. Pythian’s procedure to set up continual replication sync Check to make sure all tables have appropriate indexes, as above. If they do not, the nibbling algorithm can be used, though as a caveat I have not tested nibbling with mk-table-checksum. Figure out the modulo value based on data size, as above Decide what database in which to put the tables that mk-table-checksum uses. We either use our monitoring database or a database called “maatkit”. Note that it is important to use a database that actually gets replicated! Get mk-table-checksum and comment out the lines that always delete from the replicated table, as above Create and populate the table mk-table-checksum will need for the modulo value: CREATE TABLE IF NOT EXISTS `checksum_modulo` ( `modulo_offset` smallint(5) unsigned NOT NULL default '0' primary key ) ENGINE=MyISAM DEFAULT CHARSET=utf8; INSERT IGNORE INTO checksum_modulo (modulo_offset) VALUES (0); Do a test run of mk-table-checksum: perl mk-table-checksum -u avail --ask-pass --algorithm BIT_XOR \ --replicate maatkit.checksum --create-replicate-table \ --modulo 120 --chunk-size 100 \ --offset 'modulo_offset FROM maatkit.checksum_modulo' localhost And update the modulo table: update maatkit.checksum_modulo set modulo_offset = ((modulo_offset+1)%21); And check the results on the slave: SELECT * FROM maatkit.checksum WHERE this_crc!=master_crc AND db not in ('maatkit','mysql'); If the test run came out OK, set up a script to run via cron such as: # run checksum table 9 times per day, during off-peak times: 30 0,1,2,3,4,5,21,22,23 * * * /home/pythian/bin/checksumtbls.sh >> /home/pythian/logs/checksum.log 2>&1 And the checksumtbls.sh script looks like: #!/bin/sh DB=maatkit USER=maatkit_user PASS=password REPLTBL="$DB.checksum" MODULO=120 CHUNKSIZE=100 OFFSETTBL="checksum_modulo" ALGORITHM=BIT_XOR LOG=/home/pythian/logs/checksum.log /usr/bin/perl /home/pythian/bin/mk-table-checksum -u $USER -p $PASS \ --modulo $MODULO \ --algorithm $ALGORITHM --chunk-size $CHUNKSIZE \ --offset "modulo_offset FROM $DB.$OFFSETTBL" \ --replicate $REPLTBL --create-replicate-table localhost >> $LOG /usr/bin/mysql -u $USER -p$PASS -e "update $DB.$OFFSETTBL set modulo_offset=((modulo_offset+1)%$MODULO)" >> $LOG And of course, do not forget to periodically check on the slave to see where the issues are: SELECT * FROM maatkit.checksum WHERE this_crc!=master_crc AND db not in ('maatkit','mysql'); I hope this helps; It is extremely important to make sure www.pythian.com | 7/27/10 5:09 PM
451 CAOS Links 2010.07.27
New projects. Old arguments. And more. Follow 451 CAOS Links live @caostheory on Twitter and Identi.ca “Tracking the open source news wires, so you don’t have to.” New projects # Gemini Mobile Technologies released Hibari, a new open source non-relational database for big data. # Lockheed Martin launched the Eureka Streams open source project for enterprise social networking. # Sony Pictures Imageworks expanded its open source initiative with the release of OpenColorIO. Old arguments # Kirk Wylie discussed the importance of natural split in open core , OpenGamma’s approach. # Alan Shimel offered 10 commandments for open core. Mostly sensible, #6 will ruffle some feathers though. # Simon Phipps maintained that open source does not need “monetising”. # Carlo Daffara discussed property and efficiency as the basis of OSS business models. # Jorg Janke continued his discussion of various open source business strategies in relation to Compiere. # Henrik Ingo explained what you can do to help get rid of open core if you are so inclined. # dotCMS went open core with the release of version 1.9. # IBM faces EU antitrust investigation linked to TurboHercules complaint. # The FT reported that IBM is blaming Microsoft for the EU investigation into its mainframe business practices. # TechDirt explained how WordPress and Thesis have settled their differences over themes and the GPL. The best of the rest # Novell introduced SUSE Gallery for publishing and sharing Linux-based appliances. # VoltDB released version 1.1 of its open source database. # EnterpriseDB released Postgres Plus Advanced Server 8.4 and added Rob Bearden to its board. # SAP has adopted Black Duck’s Suite to manage the use of open source software in its software development process. # Oracle provided details of the MySQL Sunday event at Oracle Open World. # SearchEnterpriseLinux reported that Ubuntu is gaining ground as a data center OS at the expense of SUSE Linux. # Physorg.com explained how Georgia Institute of Technology researchers are helping the US military benefit from OSS. # GENIVI Allianced has reportedly opted for MeeGo for its in-vehicle infotainment platform. blogs.the451group.com | 7/27/10 4:58 PM
Implementing SSIS Package Configurations
Marcin Policht demonstrates how to leverage SQL Server Integration Services (SSIS) package variables in order to modify SSIS properties without directly editing package content. redir.internet.com | 7/27/10 12:23 AM
IBM DB2 pureScale: The Next Big Thing or a Solution Looking for a Problem?
In October 2009, IBM announced that some rather neat technology known as data sharing would be delivered for the IBM DB2 for Linux, Unix and Windows product, in an optional facility dubbed pureScale. Julian Stuhler takes a closer look at this technology and the possible implications for the entire IBM DB2 world. redir.internet.com | 7/27/10 12:08 AM
MariaDB and the MySQL Sandbox
Tokutek tests its TokuDB Fractal Tree storage engine with multiple MySQL distributions. We make extensive use of the MySQL Sandbox in our test automation. We tweaked the regular expressions that match binary tarball names in the MySQL Sandbox so that MariaDB releases can be easily loaded by the MySQL Sandbox. These changes can be found in launchpad lp:~prohaska7/mysql-sandbox/mariadb. tokutek.com | 7/26/10 8:49 PM
Webinar: What you need to know for a MySQL 5.0 -> 5.1 upgrade
IOUG has a free series of three webinars on upgrading MySQL. Each webinar is an hour long, and it starts with a webinar by me tomorrow at 12 noon Central time (GMT-5) on “Why and How to Upgrade to MySQL 5.1″. The webinar assumes you are upgrading from MySQL 5.0 to MySQL 5.1, and talks a little bit about the new features, server variables, and what you need to know when upgrading to MySQL 5.1. The software used is GoToWebinar (formerly GoToMeeting), so you will need to install that software. To register, use the links on the IOUG MySQL Upgrade Webinar Series page. The complete list of webinars in the MySQL Upgrade Series is: * MySQL 5.1: Why and How to Upgrade Sheeri Cabral, The Pythian Group Tuesday, July 27, 12:00 p.m. – 1:00 p.m. CT (GMT-5) * MySQL Upgrades With No Downtime Sean Hull, Heavyweight Internet Group Wednesday, July 28, 12:00 p.m. – 1:00 p.m. CT (GMT-5) * MySQL Upgrade Best Practices Matt Yonkovit, Percona Thursday, July 29, 12:00 p.m. – 1:00 p.m. CT (GMT-5) (note, I am not sure if it is free for everyone or just free for IOUG members; my apologies if it is the latter) www.pythian.com | 7/26/10 5:02 PM
UNHEX for Integers in MySQL
The following is a very, very specific subject matter, but I wish this blog post had existed when I was looking for the solution to a problem I was having. Most likely it has no interest to anyone who is not specifically looking for it, so feel free to stop reading...MySQL has a few built-in functions for handling binary data. One of them is HEX which converts any data into hexadecimal representation. The function which you would expect to do the opposite (as the manual states) is UNHEX which takes a hexadecimal representation and turns it into characters.So, if you try to do:SELECT UNHEX(HEX('a'));You get an "a" back. But if you try that on a integer:SELECT UNHEX(HEX(1)); You get back the char corresponding to the ASCII value of 1, which is not what was intended.The correct way to do it, and the real opposite of HEX is the CONV function which converts betweens bases:SELECT CONV(HEX(1), 16, 10);This time the result is the number "1" as expected. blog.shlomoid.com | 7/26/10 2:00 PM
OSCON and OpenStack
The past two weeks have been both exciting and extremely busy, first traveling to Austin, TX for the first OpenStack Design Summit, and then back home to Portland, OR for The O’Reilly Open Source Conference (OSCON) and Community Leadership Summit. The events were great in different ways, and there was some overlap with OpenStack since we announced it on the first day of OSCON and created quite a bit of buzz around the conference. I want to comment on a few things that came up during these two weeks. New Role I’m now focusing on OpenStack related projects at Rackspace. I’m no longer working on Drizzle, but I will still be involved in the MySQL and database ecosystems through future projects and conferences (see you at OpenSQL Camp). I will also still be working on a couple of Gearman related projects in my spare time. At OSCON I gave two presentations on Gearman and Drizzle, you can find the slides here. The Five Steps to Open One question that came up a few times over the past couple weeks is what the term “Open” means when a business or organization decides to adopt the open source philosophy. It turns out this means many different things to folks, and when an organization decides to go open, they need to make a decision on how open they are willing to be. Here are the various layers we’ve seen over the years: Open API – You’ve decided to take the first step to being open and released a well documented API to work with your web service or project. Everything behind the API is still a black-box though. Open Core – Beyond the APIs, you’ve decided to release part of the code open source, but you still keep some of the bits proprietary in an attempt to keep a competitive advantage. This is a hot debate lately on whether it is a viable Open Source business model. Open Source – You’ve decided keeping some code proprietary doesn’t help, and actually even hurts your project or adoption. You put all of the code out in the open for everyone to see. While everyone can see all of the source code, there still isn’t a whole lot of interaction going on. Open Development – Putting the source code out wasn’t enough. You want to enable users and external developers to be able to file bugs, submit patches, and track the development process to see what to expect next. This usually involves running your project on a public project site such as github or Launchpad. Open Decision Making – You’ve postponed the inevitable for long enough. Feature requests and bug reports are pouring in, and the community wants to have a say in what gets prioritized. Should we focus only on stability? Performance? New features? Porting to mobile platforms? Let the community decided the direction of the project. There have been examples of success for organizations who have stopped at each of these steps. Given the proper environment, any can work. My preference is to work on projects that are fully open, where company and organizational boundaries do not exist between developers and users. I’m thrilled to say that we’ve gone all in with OpenStack. We’re hosted on Launchpad and have a governance structure that allows all parties within the community to have a say in the future of the project. Preventing Vendor Lock-in During the Cloud Summit at OSCON, there was a debate titled: “Are Open APIs Enough to Prevent Lock-in?”. Most folks came to the conclusion that the answer is “no,” and I agree. While I feel open APIs are necessary, they are by no means sufficient. Even if a project is open source and allows for open development, it probably will not prevent vendor lock-in. The key is to provide some incentive for vendors to adopt and invest resources within a project. Much like customers don’t want vendor lock-in when choosing a platform, vendors do not want project or feature lock-in when choosing the software to power their business. Each vendor who chooses to participate must have the ability to voice their opinion on the direction of APIs, features, and other project priorities. This is why it is critical that any open source project must take all the steps described above to give the project a chance of being adopted and becoming the de facto standard. There is of course no guarantee that adoption and prevention of vendor lock-in will happen, but I see them as necessary steps. This is another area where OpenStack has done the correct thing. We are planning on having another developer summit in November, and then once every six months after that time. All design discussions and decision making will happen in public forums such as the mailing list and IRC. We want all participants in the community to have a chance to respond to topics being discussed, and we believe the more we have, the more successful the project will be. Having many voices allows the project to be more applicable to different environments. For example, Rackspace and NASA have different requirements for their compute architectures, but they also share many components as well. Through open participation we can ensure all needs are accounted for. Much like the LAMP stack has powered universities, governments, and competing business, we hope OpenStack can do the same. Contributor License Agreement (CLA) During the past couple of weeks a few folks asked what the CLA was all about. When the foundations of OpenStack were forming, the requirement of having a CLA came up from the legal side. Having been involved with open source projects that had very invasive CLAs, initially I had quite a bit of concern. The CLA is actually quite innocuous, and it does NOT require assignment or dual-ownership of copyright. You are the sole owner of code you contribute. For all intents and purposes it is a signed version of the Apache 2.0 license, the CLA just makes these terms more explicit. The CLA is handled through digital signatures, so no papers, pens, or faxing is required. Get Involved! Expect to see more posts on my blog related to OpenStack topics. If you would like to get involved, you can join the IRC channel (#openstack on irc.freenode.net), join the mailing list, or start contributing code! There are even jobs around OpenStack popping up already! oddments.org | 7/26/10 8:33 AM
JavaOne News Update 1
An update on some recent News on JavaOne 2010. As you know JavaOne San Francisco is Sep 19-23, 2010. The Official page has links to the Registration Page and the Online Catalog. News updates include: • A surprisingly useful & manageable Catalog-as-tweets via @javaoneconf • Availability of Schedule Builder (post) • Open enrollment in Java University (post) • Announcement of dates for JavaOne Brazil and JavaOne China (post). • The day before there is a MySQL Sunday! • And, the Duke Awards submissions page seems to still be active. Also, this year will be the 15th anniversary for Java, and the 5th for GlassFish.  Don't know if there will be a BDay party for Java; still hoping we can put something together for GlassFish, we will see! More related news are tagged JavaOne. blogs.sun.com | 7/26/10 1:46 AM
Why don't you use X?
Sometimes I am asked why don't I use X instead of official MySQL. The answer is simple. I like to use it because I have been using it, the MySQL development team (including InnoDB) has done great work this year and because change is expensive. The cost of change includes the cost of evaluating the alternatives and the cost of deploying them. The cost of change also includes the features I won't work on because I am doing an evaluation. I also use it because the quality of new 5.1 releases has been very high this year. I know because I test them and some of the alternatives.My initial evaluation criteria are simple. I don't like compiler or valgrind warnings. The alternative should not introduce new ones. I like regression tests. The alternative should not disable or fail existing tests. If the existing test is somewhat bogus, then it should be fixed. I love buildbot as done by MariaDB, fixes in official MySQL to reduce compiler warnings and all of the work done by Drizzle to not tolerate compiler warnings. When the alternative adds new features it must add new regression tests (hooray for status_user.test in MariaDB).I spent a lot of time debugging valgrind warnings that occur in MySQL 5.1.47. All of them were bogus and I think future versions of MySQL will prevent these. That is good news as I prefer to not repeat that effort the next time I upgrade to a new MySQL release.Percona and MariaDB confront the same issues and more when considering code to incorporate. In addition to what I wrote about above, they must review patches to make sure the code is reasonable. There is a lot of duplicate effort done by groups that patch or fork MySQL. I wish this weren't the case. If the external fork & patch effort is consolidated around MariaDB, then we can reduce some of this. mysqlha.blogspot.com | 7/25/10 8:18 PM
MySQL versions at Wikipedia
More of information about how we handle database stuff can be found in some of my talks. Lately I hear people questioning database software choices we made at Wikipedia, and I’d like to point out, that… Wikipedia database infrastructure needs are remarkably boring. We have worked a lot on having majority of site workload handled by edge HTTP caches, and some of most database intensive code (our parsing pipeline) is well absorbed by just 160G of memcached arena, residing on our web servers. Also, major issue with our databases is finding the right balance between storage space (even though text is stored in ‘external store’, which is just set of machines with lots of large slow disks) – we store information about every revision, every link, every edit – and available I/O performance per dollar for that kind of space needed. As a platform of choice we use X4240s (I advertised it before) – 16 SAS disks in compact 2u package. There’s relatively small hot area (we have 1:10 RAM/DB ratio), and quite a long tail of various stuff we have to serve. The whole database is just six shards, each getting up to 20k read queries a second (single server can handle that), and few hundred writes (binlog is under 100k/s – nothing too fancy). We have overprovisioned some hardware for slightly higher availability – we don’t have always available on-site resources – the slightly humorous logic is we need four servers, in case one goes down, another will be accidentally brought down by fixing person, then you got one to use as a source of recovery and remaining one to run the site. Application doesn’t have too many really expensive queries, and those aren’t the biggest share of our workload. Database by itself is minor part of where application code spends time (looking at profiling now – only 6% of busy application time is inside database, memcached is even less, Lucene is way up with 11%). This is remarkably good shape to be at, and it is much better than what we used to have when we had to deal with insane (“explosive”) growth. I am sure, pretty much anything deployed now (even sqlite!) will work just fine, but what we used has been created during bad times. Bad times didn’t mean that everything was absolutely overloaded, it was more that it could get overloaded very soon, if we don’t take appropriate measures, and our fundraisers were much tinier back then. We were using 6-disk RAID-0 boxes to be able to sustain good performance and have required disk space at the same time (or of course, go expensive SAN route). While the mainstream MySQL development with its leadership back then was headed towards implementing all sorts of features that didn’t mean anything to our environment (and from various discussions I had with lots of people, many many other web environments): utf8 support that didn’t support Unicode Prepared Statements that don’t really make much sense in PHP/C environments unoptimized subqueries, that allow people to write shitty performing queries later in 5.0 – views, stored routines, triggers etc… … nobody was really looking at MySQL performance at that time, and it could have insane performance regressions (“nobody runs these things anyway”, like ‘SHOW STATUS’) and a forest full of low hanging fruits. From operations perspective it wasn’t perfect either – replication didn’t survive crashes, crash recovery was taking forever, etc. Thats when Google released their set of patches for 4.0, which immediately provided incredible amount of fixes (thats what I wrote about it back then). To highlight some of introduced changes: Crash-safe replication (replication position is stored inside InnoDB along with transaction state) – this allowed to run slaves with innodb log flushing turned off on slaves and having consistent recovery, vanilla MySQL doesn’t have that yet, Percona added this to XtraDB at some point in time Guaranteed InnoDB concurrency slot for replication thread – however loaded the server is, replication does not get queued outside and can proceed. This allowed us to have way more load pointed towards MySQL. This is now part of 5.1 Multiple read-ahead and write-behind threads – again, allowed to bypass certain bottlenecks, such as read-ahead slots (though apparently it is wiser just to turn off read-ahead entirely) – now part of InnoDB Plugin Multiple reserved SUPER connections – during overloads systems were way more manageable Running these changes live have been especially successful (and that was way before Mark/Google released their 5.0 patch set which was then taken in parts by OurDelta/Percona/etc) – and I spent quite some time trying to evangelize these changes to MySQL developers (as I would have loved to see that deployed at our customers, way less work then!). Unfortunately, nobody cared, so running reliable and fast replication environments with mainline MySQL didn’t happen (now one has to use either XtraDB or FB build). So, I did some merging work, added few other small fixes and ended up with our 4.0.40 build (also known as four-oh-forever), which still runs half of shards today. It has sufficient in-memory performance for us, it can utilize our disk capacity fully, and it doesn’t have crash history (I used to tell about two 4.0 servers, both whitebox raid0 machines, having unbroken replication session for two years). By todays standards it already misses few things (I miss fast crash recovery mostly, after last full power outage in a datacenter ;-) – and developers would love to abuse SQL features (hehe, recently a read-only subquery locked all rows because of a bug :-) I’m way more conservative when it comes to using certain features live, as when working at MySQL Support I could see all the ways those features break for people, and we used to joke (this one was about RBR :): Which is the stable build for feature X? Next one! Anyway, even knowing that stuff breaks in one way or another, I was running a 5.1 upgrade project, mostly because of peer pressure (“4.0 haha!”, even though that 4.0 is more modern from operations standpoint). As MediaWiki is open-source project, used by many, we already engineer it for wide range of databases – we support MySQL 4.0, we support MySQL 6.0-whatever-is-in-future, and there’s some support for different vendor DBMSes (at various stages – PG, Oracle, MS SQL, etc) – so we can be sure that it works relatively fine on newer versions. Upgrade in short: Dump schema/data Load schema on 5.1 instance Adjust schema, as we can do it, set all varchar to varbinary to maintain 4.0 behavior Load data on 5.1 instance Fix MySQL to replicate from 4.0 (stupid breakage for nothing) Switch master to 5.1 instance We had some 5.0 and 5.1 replicas running for a while to detect any issues, and as there weren’t too many, the switch could be nearly immediate (English Wikipedia was converted 4.0->5.1 over a weekend). I had an engineering effort before to merge Google 5.0 patches into later than 5.0.37 tree, but eventually Mark left Google for Facebook and “Google patch” was abandoned, long live the Facebook patch! :) At first FB-internal efforts were to get the 5.0 environment working properly, so 5.1 development was a bit on hold. At that time I cherry-picked some of Percona’s patch work (mostly to get transactional replication for 5.1, as well as fast crash recovery) – and started deploying this new branch. Of course, once Facebook development focus switched to 5.1, maintaining separate branch is becoming less needed – my plan for the future is getting FB build deployed across all shards. The beauty of FB-build is that development team is remarkably close to operations (and operations team is close to development), and there is lots of focus on making it do the right stuff (make sure you follow mysql@facebook page). The visibility of systems (PMP!) we have at Facebook can be transformed into code fixes nearly instantly, especially when compared with development cycles outside. I’m sure some of those changes will trickle to other trees eventually, but we have those changes in FB builds already here, and they are state of the art of MySQL performance/operations engineering, while maintain great code quality. Yes, at Wikipedia we run a mix of really fresh and also quite old/frozen software (there will be unification, of course), but…. it runs fine. It isn’t as fascinating anymore as years ago, but it allows not paying any attention for years. Which is good, right? Oh, and of course, there’s full data on-disk compatibility with standard InnoDB builds, in case anyone really wants to roll back or switch to the next-best-fork. mituzas.lt | 7/25/10 7:36 PM
MySQL Server Protocol Bug
A few months ago I wrote a tool that verified MySQL and Drizzle protocol compatibility, along with testing for all sorts of edge cases. In analyzing protocol command interactions in mysqld, I found that the MySQL server will happily read an infinite amount of data if you exceed the maximum packet size while using a special sequence of protocol packets. The reasoning behind this behavior is so that the server can be polite and flush your data before sending a “max packet exceeded” error message, but perhaps there should be a limit to one’s politeness. What’s more interesting is that you can do this during the client handshake packet without authorization, so anyone could do this to any open MySQL server. The appropriate thing to do here would be to set some maximum limit of data to read and force a connection close when it is reached, otherwise your bandwidth and CPU could be consumed (essentially a DoS attack). This portion of code was ripped out entirely in Drizzle, so there are no risks there. I submitted this as a bug to MySQL and MariaDB back in February and they both have patches available to fix this as well. You can find the bug here and a patch here. If you have publicly accessible MySQL or MariaDB servers, you probably want to upgrade binaries or patch this. oddments.org | 7/25/10 1:47 AM
Caching could be the last thing you want to do
I recently had a run-in with very popular PHP ecommerce package which makes me want to voice a recurring mistake I see in how many web applications are architected. What is that mistake? The ecommerce package I was working with depended on caching.  Out of the box it couldn’t serve 10 pages/second unless I enabled some features [...] www.mysqlperformanceblog.com | 7/24/10 7:39 PM
Using MySQL Partitioning Instead of MERGE Tables
One common question I get is how to use partitioning instead of MERGE tables. The process I use involves using stored procedures to create and drop partitions. This article will go over the stored procedures I use; special thanks to Roland Bouman for taking a look and giving great feedback to optimize this process. First, a simple table, not partitioned (yet): use test; DROP TABLE IF EXISTS my_part; CREATE TABLE IF NOT EXISTS my_part ( id int NOT NULL, creationDate datetime NOT NULL, PRIMARY KEY (id,creationDate) ) ENGINE=InnoDB; In real, life there is more to the table than just id and creationDate. The most important part is that the partitioned field(s) need to be part of the primary key. Now, add the partition definition. This can be done in the CREATE statement, but I have found that it is easier for me to think about what fields and indexes I want first, and then worry about partitioning, so I naturally gravitate towards defining the fields and indexes first and then altering the table to add partitioning. This also helps me think about how to modify an existing table to add partitioning. ALTER TABLE my_part PARTITION BY RANGE (TO_DAYS(creationDate)) ( partition 2010_07_01 values less than (to_days('2010-07-02')), partition 2010_07_02 values less than (to_days('2010-07-03')), partition 2010_07_03 values less than (to_days('2010-07-04')), partition 2010_07_04 values less than (to_days('2010-07-05')) ); This makes it pretty clear what is happening — the idea is to give the partition names actual dates that they hold, so that it is easy to see what partitions need to be added and deleted. Deleting partitions I find that making stored procedures makes things easy….so I will define a procedure called partition_drop to drop partitions. The partition_drop stored procedure takes in a table name and schema name to drop partitions from, and a date to delete up through and including that date. Here’s the procedure: DELIMITER || DROP PROCEDURE IF EXISTS partition_drop || CREATE PROCEDURE partition_drop (IN through_date date, IN tbl varchar(64), IN db varchar(64)) BEGIN DECLARE delete_me varchar(64); DECLARE notfound BOOL DEFAULT FALSE; DECLARE pname CURSOR FOR SELECT PARTITION_NAME FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME=tbl AND TABLE_SCHEMA=db AND DATE(PARTITION_NAME)!= 0 AND DATE(PARTITION_NAME) IS NOT NULL AND DATE(PARTITION_NAME)<=through_date; DECLARE CONTINUE HANDLER FOR NOT FOUND SET notfound:=TRUE; OPEN pname; cursor_loop: LOOP FETCH pname INTO delete_me; IF notfound THEN LEAVE cursor_loop; END IF; SET @alter_stmt:=CONCAT("ALTER TABLE ",db,".",tbl," DROP PARTITION ",delete_me); # sanity check commented out for production use # SELECT @alter_stmt; PREPARE stmt_alter FROM @alter_stmt; EXECUTE stmt_alter; DEALLOCATE PREPARE stmt_alter; END LOOP; CLOSE pname; END || DELIMITER ; Go ahead and run CALL partition_drop('2010-07-02','my_part','test'); to verify that SHOW CREATE TABLE my_part; shows that the desired partitions have been dropped. Adding partitions Adding partitions is similar to deleting partitions — using a stored procedure that takes in the date to add partitions up to that date. It will not try to add so many partitions that the table will have more than 1024, and it won’t add any partitions that already exist. DELIMITER || DROP PROCEDURE IF EXISTS partition_add || CREATE PROCEDURE partition_add (IN through_date date, IN tbl varchar(64), IN db varchar(64)) BEGIN DECLARE add_me char(10); DECLARE max_new_parts,add_cnt smallint unsigned default 0; SELECT 1024-COUNT(*) AS max_new_parts, SUM(CASE WHEN DATE(PARTITION_NAME)>=through_date then 1 else 0 END) INTO max_new_parts, add_cnt FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_SCHEMA = db AND TABLE_NAME = tbl; IF add_cnt=0 THEN BEGIN SELECT MAX(DATE(PARTITION_NAME)) INTO add_me FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME=tbl and TABLE_SCHEMA=db AND DATE(PARTITION_NAME)<through_date; # to do: declare handler for exceptions here IF DATEDIFF(through_date,add_me)+1 < max_new_parts THEN BEGIN WHILE add_me<through_date do BEGIN SET add_me:=DATE_FORMAT(add_me + INTERVAL 1 DAY,"%Y_%m_%d"); SET @alter_stmt:=CONCAT("ALTER TABLE ",db,".",tbl," ADD PARTITION (PARTITION ",add_me," VALUES LESS THAN (TO_DAYS('",add_me+INTERVAL 1 DAY, "')))" ); PREPARE stmt_alter FROM @alter_stmt; EXECUTE stmt_alter; DEALLOCATE PREPARE stmt_alter; END; END WHILE; END; END IF; END; END IF; END || DELIMITER ; Here’s how to call that stored procedure: CALL partition_add ('2010_07_10', 'my_part','test'); Caveat: This stored procedure will only add in partitions from the maximum partition name/date until the date you add it in, it will not fill in any gaps. However, it is possible to add in another input parameter to be the “start date” and change the stored procedure to add the partition if it does not exist, from the start date through the end date. Note: It was pointed out by Roland Bouman that it would be better to change both the add and drop stored procedures to do one ALTER TABLE instead of sequential ones. He wrote: It’s probably better to generate a single statement to drop / add all partitions. So the general pattern would be: - generate and concatenate lines for each partition - one single sequence of prepare, execute, deallocate to execute one DDL statement. For the drop partition procedure, this approach would allow you to do away with the cursor. You can simply use GROUP_CONCAT in a single SELECT…INTO statement to generate the entire DDL statement. Note: Roland also mentioned that these stored procedures could be generalized to use any interval, for example if each table partition held 7 days. The only limitation is that to get the optimal partition performance with dates, use either TO_DAYS(date_field) or YEAR(date_field) as the partitionining function. Putting it all together The procedure: daily I would run the following in MySQL, to keep only the previous “x” days: CALL partition_drop('CURRENT_DATE()-INTERVAL x DAY', 'tbl_name','schema_name'); To add new partitions, I would do: CALL partition_add('CURRENT_DATE()-INTERVAL 32 DAY', 'tbl_name','schema_name'); It is run daily just in case; the side benefit is that it will catch any gaps if you do not modify the stored procedure to include a start date for when to add. Adding a partition does not take a long time, because there’s no data to reorganize. I would make a daily MySQL event, and then everything is stored nicely in the database, and backed up with the database. Others may choose to run the stored procedures in a cron script or scheduled task (Windows), but that requires a password to be stored somewhere — either in the script, somewhere the script can read, or in a config file such as .my.cnf. www.pythian.com | 7/24/10 3:19 AM
Going Mobile: Data Management Solutions for Mobile Devices
Crunched by the economy and squeezed by competition, businesses and consumers alike are turning to mobile computing. This report wraps up our overview of the top contenders in the mobile database market with a look at Sybase SQL Anywhere and SQLite. redir.internet.com | 7/24/10 12:23 AM
Gain the Competitive Advantage: Data Management Solutions for Mobile Devices
Today's workforce and consumers have become increasingly mobile. Businesses crunched by the economy or squeezed by competition look to mobile computing to gain a competitive advantage. This report examines the top contenders in the mobile database market space, starting with SQL Server Compact, Oracle Database Lite and IBM DB2 Everyplace. redir.internet.com | 7/24/10 12:08 AM
Dynamic Data Driven Variable Hierarchical Structures in SQL
SQL can support the creation of variable hierarchical structures in a number of differently controlled operations that can be combined. Michael M. David describes a powerful and flexible automatic method of dynamic structure generation. redir.internet.com | 7/23/10 11:53 PM
Selling the Business Intelligence Implementation
Increasingly more organizations are turning to Business Intelligence professionals to provide historical, current, and predictive views of business operations on their critical information. As a Business Intelligence consultant, your challenge is not usually the sourcing or delivery of the information, but how that information is used going forward once you have left the organization. redir.internet.com | 7/23/10 11:38 PM
Log Buffer #196, A Carnival of the Vanities for DBAs
Welcome to Log Buffer, the weekly roundup of database industry news. For your reading pleasure this week we have Log Buffer #196: Charles Hooper blogs about an in-depth investigation on what can cause Oracle to ignore a hint. Doug Burns reminds his readers that there are only two weeks left to submit papers for UKOUG. The deadline is Aug. 2. A while back Greg Rahn at Structured Data blog thought that the best way to get results out of Exadata is by changing your application to get the most out of Exadata. He was very happy to see that Pythian thinks the same. On the subject of Exadata and data warehousing, Greg posts this week on the core performance fundamentals of Oracle Data Warehousing – set processing vs row processing. Jonathan Lewis links to instructions and explanations on how to switch to a different UNDO tablespace. It is trickier than it sounds and Jonathan provided additional traps to watch out for. Jonathan also continues his fragmentation series with an explanation of table fragmentation and its causes. Alex Fatkulin explains about ASM mirroring and disk partnership and why you may have less redundancy than you thought. On the same subject, Jeremy Schneider of Ardent Performance blog explains about hot disks, raid and what it means for ASM mirroring. Back to blogging after a recent trip to TechInsights 2010, Edwin Sarmiento answers questions on what needs to be done as part of the installation of a SQL Server 2008 R2 Failover Cluster on Windows Server 2008 R2. A number of things related to Windows Clustering need to be considered. Willie Favero introduces the “IBM zEnterprise System”, on his blog Getting the Most out of DB2 for z/OS and System z. On Join-fu! the Art of SQL blog, Jay Pipes talks about getting started developing Nova on Linux, as he’s involved in a new OpenStack project. Peter Zaitsev, on MySQL Performance Blog, posts about estimating replication capacity so that replication load can be dealt with before slave is unable to catch up. Paul Randal publishes his survey results around the purchase and use of SSIDs. And, if you happen to be attending Oracle OpenWorld, register before July 30 to take advantage of early bird rates. www.pythian.com | 7/23/10 10:09 PM
Database Scalability Patterns - OSCon 2010
Howdy folks, slides are up for my talk, "Database Scalability Patterns", which I gave this week at OSCon 2010. You can get them from the OSCon page, from slideshare, or just watch it below Database Scalability PatternsView more presentations from Robert Treat. www.xzilla.net | 7/23/10 8:42 PM
Building MariaDB with the InnoDB plugin
This post was inspired by a couple events. But I won't explain them other than to say I think there have been too many subjective comments (or FUD) about quality. This is an attempt to quantify whether the grass is greener on the other side.Today I tried to build MariaDB with the InnoDB plugin. I was told his is now supported. The last time I checked XtraDB replaced the InnoDB plugin in MariaDB. MPAB had a reasonable reason for doing this as they don't want to test both the InnoDB plugin and XtraDB. But I prefer choice, despite the many great features in XtraDB.First I tried using the 5.3 release. That failed fast. I prefer fast failures over obscure ones:./configure --enable-thread-safe-client --with-plugins=partition,csv,blackhole,myisam,heap,innodb_plugin --without-plugin-innobase --with-fast-mutexes --with-extra-charsets=all --with-debug C_EXTRA_FLAGS="-fno-omit-frame-pointer -fno-strict-aliasing -Wall"...configure: error: unknown plugin: innodb_pluginThen I tried the 5.2 MariaDB release. This time the configure command worked. After running make, both XtraDB and the InnoDB were compiled. Time to try a test. There were no failures! Alas, all tests were skipped../mysql-test-run.pl --suite=innodb_plugin...innodb_plugin.innodb                     [ skipped ]  No innodb supportinnodb_plugin.innodb-analyze             [ skipped ]  No innodb supportinnodb_plugin.innodb-autoinc             [ skipped ]  No innodb supportinnodb_plugin.innodb-autoinc-44030       [ skipped ]  No innodb supportinnodb_plugin.innodb-consistent          [ skipped ]  No innodb supportinnodb_plugin.innodb-index               [ skipped ]  No innodb supportinnodb_plugin.innodb-index_ucs2          [ skipped ]  No innodb supportinnodb_plugin.innodb-lock                [ skipped ]  No innodb supportinnodb_plugin.innodb-replace             [ skipped ]  No innodb supportinnodb_plugin.innodb-semi-consistent     [ skipped ]  No innodb supportinnodb_plugin.innodb-timeout             [ skipped ]  No innodb supportinnodb_plugin.innodb-use-sys-malloc      [ skipped ]  No innodb supportinnodb_plugin.innodb-zip                 [ skipped ]  No innodb supportinnodb_plugin.innodb_bug21704            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug34053            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug34300            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug35220            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug36169            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug36172            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug38231            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug39438            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug40360            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug40565            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug41904            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug42101            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug42101-nonzero    [ skipped ]  No innodb supportinnodb_plugin.innodb_bug44032            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug44369            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug44571            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug45357            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug46000            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug46676            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug47167            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug47621            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug47622            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug47777            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug51378            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug51920            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug52663            [ skipped ]  No innodb supportinnodb_plugin.innodb_bug52745            [ skipped ]  No innodb supportinnodb_plugin.innodb_file_format         [ skipped ]  No innodb supportinnodb_plugin.innodb_information_schema  [ skipped ]  No innodb supportinnodb_plugin.innodb_trx_weight          [ skipped ]  No innodb supportThen I checked for compiler warnings. I really dislike compiler warnings. MySQL has recently done a lot of work to remove them from 5.1 (thanks Davi). I think all of this work was considered one bug, but it was a lot of work and will make MySQL better. They have also begun to do some builds with Werror. See bug 53445 for all of the details.After compiling storage engines with -Wall, there are no warnings for the official InnoDB plugin (storage/innodb_plugin). There are warnings for XtraDB (storage/xtradb):btr/btr0btr.c:2871: warning: null argument where non-null required (argument 1)btr/btr0cur.c:1841: warning: null argument where non-null required (argument 2)btr/btr0cur.c:1860: warning: null argument where non-null required (argument 1)btr/btr0cur.c:1967: warning: null argument where non-null required (argument 1)fil/fil0fil.c:3106: warning: pointer targets in passing argument 2 of ‘dict_table_get_index_on_name’ differ in signednessibuf/ibuf0ibuf.c:775: warning: null argument where non-null required (argument 1)ibuf/ibuf0ibuf.c:950: warning: null argument where non-null required (argument 1)os/os0file.c:4194: warning: pointer targets in assignment differ in signednessI want to compare the results from running all regression tests with valgrind. But that might take some time. I am able to run all InnoDB tests without valgrind warnings using the Facebook-patched MySQL 5.1.47. That required a few small changes that are likely in the recent 5.1.49 release. Have all regression tests been run under valgrind for MariaDB using either the InnoDB or XtraDB plugin?Stay tuned for part 2. mysqlha.blogspot.com | 7/23/10 6:55 PM
MyTOP 1.9 released
MyTOP is a console-based (non-gui) tool for monitoring the threads and overall performance of a MySQL created by Jeremy Zawodny. For months now I have been adding updates to mytop. When I started using version 1.6 it worked but didn’t return some data fields. After fixing these bugs I began to ideas for improvements. Here is a quick list of what I have done. New ‘!’ command to force fix replication errors. Display rows sorted per second. Added new ‘Cmd’ column to display the state of the query along with the statement. New ‘M’ command to change the [Mode] to ‘status’.  This was in the code but covered up by another command. If you find any bugs or would like to see your own updates added please email me at mark @ grennan.com. Enjoy! www.mysqlfanboy.com | 7/23/10 5:57 PM
How To: Export/Import Large MySQL Database
MySQL is frequently referred to as a database for Web applications. Partially it is really so, because MySQL became popular owing to its simplicity, high speed, and bounding with PHP. Developers of small Web projects often choose MySQL as a back end of their sites. Does this mean that MySQL can be used only for small databases? Not at all. There are lots of databases size of data in which is measured in gigabytes. Besides MySQL servers are frequently clustered to increase their performance. When a DBAs work with large amounts of data, they frequently have to make backup copies correctly and effectively, i. e. to export MySQL databases to SQL (or MySQL backup). It is exteremely important to import MySQL database from SQL correctly is when restoring a corrupted database and when migrating a database from one server to another. What should be taken into account when exporting a large MySQL database? Making a backup copy of a large database takes a lot of time. During this time some user can try to modify data in this database. But we want to get an all-of-a-piece database snapshot. For different tables this can be done in different ways: for InnoDB tables a separate transaction should be started MyISAM tables should be locked by FLUSH TABLES WITH READ LOCK Except taking a lot of time export and import of a MySQL database implies transferring a large amount of data between client and server. The most effective way of reducing the amount of data transferred through the net and the amount of the used disk space is compression. All points mentioned above are taken into account in the utility for making database backups included in the set of tools available in dbForge Studio for MySQL. The rest of the article is written referring to this utility. Step-by-step MySQL backup procedure 1. Set compression for the connection Using compression for a connection allows to reduce net traffic owing to strings compression. The Use Compression option is available on the Advanced page of the Database Connection Properties dialog. But if the amount of string data per record is not large, you won’t gain much this way. 2. Choose a database for export and open Database Backup Wizard In the Database Explorer tree choose the nod of the needed database or connection. Choose Backup Database from the pop-up menu. On the image below we show the main settings of MySQL backup. Pay your attention to the fact that we are using compression to reduce the disk space usage. Besides using compression allows to add comments to the backup. 3. Set options to create an all-of-a-piece database snapshot Earlier in this article we wrote about the necessity of avoiding data corruption while creating a database backup. There are options to achieve this, and they are marked red on the image below: 4. Make a backup of your MySQL database After performing all necessary settings press the Backup button and wait until the application finishes export. During the operation you will be able to see the progress of backup creation and its stages. Step-by-step MySQL import procedure So, we’ve performed database export and compressed the results into a ZIP-archive. And now we have to import this database on the new server. Will we have to decompress a large script and try to execute it manually? Not at all. Database Import can be performed neatly and easily with the help of Database Restore Wizard 1. Choose a database to import and open Database Restore Wizard In the Database Explorer tree choose the nod of the needed database or connection. Choose Restore Database from the pop-up menu. Choose the backup file in the window that opened. After you’ve chosen the file you should see approximately the same as shown on the image below: The program finds the *.sql file in the archive automatically and shows the comment we’ve added when creating the backup. Now we are ready to import our MySQL database. 2. Import the database After pressing the Restore button the program will automatically restore the database from the backup. Conclusion In this article we’ve reviewed some aspects of making backups of MySQL databases and illustrated capabilities of dbForge Studio for MySQL regarding export/import. You can download dbForge Studio for MySQL here. www.devart.com | 7/23/10 5:11 PM
SQL Server Integration Services 2008(SSIS) and MySQL
In my previous article I had created linked server, to access and query a database. However, it is just the basic step to try and bring two different RDBMS systems linked together. In this article I will discuss about creating a SSIS package to transfer data from MS SQL Server 2008 to MySQL 5.5 Prerequisite: ___________________________________________________ SQL Server Business Intelligence Development Studio MySQL Connector Net 5.2.7 You can download this package from http://www.mysql.com/downloads/connector/net/ Connector/Net is a fully-managed ADO.NET driver for MySQL. MySQL Connector/ODBC 5.1 (Which is already installed, in our previous article) http://www.mysql.com/downloads/connector/odbc/ Brief about SSIS ___________________________________________________ SSIS provides a graphical front end to design control flow data processing logic.  Once designed, these ‘packages’ are compiled into ‘.dtsx’ packages which can then be distributed and run on any machine with the SSIS tools installed. Packages contain two main logic flows, a ‘Control Flow’ which defines a sequence of logical operations which are processed in sequence. Each step is completed before the next starts e.g. 1.  Empty out work tables in a database 2.  Populate the work tables with data 3.  Perform calculations and update the values in the work table 4.  Update OLAP cubes with the data from the work tables 5.  Run reports against the OLAP cubes. This level of control also allows processing loops to be defined e.g. For each file in a specified folder, read the contents of the file and write it into a specified table. The second main logic flow is the ‘Data Flow’.  This allows for the processing of data at the record level. Data is read from a ‘Data Source’ and passes down a series of ‘Data Transformations’ to a ‘Data Destination’. These transformations can be as simple as changing the data type of fields e.g. varchar(4000) to varchar(2000) or decimal(18,2) to decimal(8,2), or can be more complex like data merges, joins, pivot tables, multicasts etc. Each transformation is represented by an icon in the designer and the icons are linked together to define the logic path. Creating SSIS package to transfer data from MS SQL Server 2008 to MySQL ___________________________________________________ Follow the below steps to create a SSIS package for data transfer. Go to START>MS SQL SERVER 2008>SQL SERVER BUSINESS INTELLIGENCE DEVELOPMENT STUDIO and Click on File>New>Project Under Business Intelligence Projects, select Integration Services Project and rename the Project Name. This will open a Package Designer Screen, where you can start designing the data flow for your package. Configure Connection Managers for you package ___________________________________________________ Right Click in the area where it shows the Connection Manager tab and select New ADO.Net Connection. You will see the Configuration Manager Editor window Click New Now you need to define connection manager configuration for source (i.e. MS SQL Server 2008) in our case. In the Connection Manager Editor, by default the Provider is set to .Net Providers\SqlClient Data Provider Use your test server name or IP for Server Name and under connect to database, select the database to use. I’m using ssistest. Similarly, you need to create connection manager configuration for Destination (i.e. MySQL) Right Click, under Connection Manager’s tab and Select ADO.NET Connection as mentioned above and click New on Connection Manager Editor. For Provider click on the drop down arrow to select .Net Providers\odbc data Provider. In the previous article we had created System DSN name MySQL. We will use the same here for Use user or system data source name. Enter your login information and Test Connection. It should succeed, and then click Ok. Now, on Configure ADO.NET Connection Manager screen, you can see both the source and destination are configured. It’s now time to add control flows to the package, ensure you are on Control Flow tab and Drag Data Flow Task from the Control Flow Items under ToolBox. Now click on the Data Flow tab above and Drag ADO NET Source and ADO NET Destination as shown below. You can rename the Source and Destination Names from Properties. Right click on ADO NET SOURCE i.e. MS SQL Server 2008 and click EDIT. Make the below changes as mentioned on the screen. As mentioned before I have ssistest database which has few sample tables, that I have exported from AdventureWorks database. I’m selecting one of the tables named HumanResources.Department for this example. In the above screen you can see the test server selected with database and table to transfer the data from. Similarly right click on MySQL (Destination) and click Edit. On this screen, under ADO.NET Connection Manager select MySQL.root from the drop downlist. Since we do not have any destination table to map with the source, we need to click on New. It will show you the below message, click OK Now remove the quotes and Click OK to create a table on destination as shown. It will come back to the editor, Click on the Mappings tab and check if all the columns are mapped and click OK. This will complete the data flow design of our package. Saving the SSIS Package ___________________________________________________ Once the package is created, save the copy of it to SSIS Package store or MSDB. Follow the below steps, Click on File> Save Copy of Package.dtsx As from the menu it will open a window, Select SSIS Package Store as the Package Location and Select the name of your test server. Enter /File System/Export2MySQL as the package path and Click OK. Select File > Save Copy of Package.dtsx as again from the menu and Select SQL Server as the Package Location. Select the name of your test server and fill in your authentication information. Enter Export2MySQL as the package path and Click OK. Once the copies are saved you can see them under Integration Services Stored Packages as shown below. Changes on MySQL ___________________________________________________ Before running the Package, make this change on the MySQL Server mysql> SET GLOBAL sql_mode= ‘ANSI’ Executing the Package ___________________________________________________ You can execute the package either from SSMS and BI Development Studio. I’m doing it from the BI, on the right hand side your BI screen you will find solution explorer, which has the Package.dtsx listed. Right click and select Execute Package. The moment you click execute the debugging process start and the Output is show below on the BI screen.  If the Data Flow Task is changed to Green, it means the package has succeeded, and if it is Red it means somewhere in the flow there was an error. On the Data Flow Tab, you can see the number of rows transferred in the execution. There is a progress tab on the screen, where you can check the steps taken in the flow, it also list errors and warnings if any. Query the MySQL Server to see if the data transfer was successful as show below. This was about transferring data from MS SQL Server 2008 to MySQL, you can do vice verse by changing the source and destination while designing the package on Data Flow tab. Hope this article is useful. dbperf.wordpress.com | 7/23/10 1:34 PM
Install, configure and run MySQL Cluster – demo video
There is a new video available: Demonstration of installing, configuring and running MySQL Cluster (LINUX) to accompany the MySQL Cluster Quick Start guides. The Flash video video lasts for about 7 minutes. If you aren’t able to view Flash on your device then a (poorer quality) version is included here – watch the Flash version if you’re able to! www.clusterdb.com | 7/23/10 11:27 AM
VistA scenarios, and other controversies at the Open Source health care track
The history and accomplishments attributed to VistA, the Veterans Administration's core administrative software, mark it as one of the most impressive software projects in history. Still, lots of smart people in the health care field deprecate VistA and cast doubt that it could ever be widely adopted. feedproxy.google.com | 7/23/10 7:21 AM
How to detect the user’s preferred language – smarter than Google
Sometimes I find Google annoying. Don’t get me wrong, I am not one of the many Google haters out there. They have done amazing things and especially since they have released WebM as Open Source and royalty free video format, they have earned a bonus in my attitude towards them. And even though Google Chrome is not my personal favorite, I have to admit that Chrome has brought a great momentum into the development of web browsers (and therefor the WWW as a whole) which users of any browser benefit from. Last but not least, I use a lot of Google’s services and they do their job quite well. So all in all, Google’s record from my point of view is pretty good. But one thing really sucks. I use an English operating system. I use an English Firefox (locale en-US). All the language preferences in the Google profiles of my accounts are set to English. Nevertheless, I often get pages from Google delivered in German by default. How they do that is quite obvious, or lets say, there is only one explanation how they may do this: they identify my home country based on my IP address. There are services that provide the data and they are easy to implement, sometimes even for free, like MaxMind‘s GeoLite Country database. Also I am using these services and I believe there is nothing wrong in doing so, except for a few things – and one of them is to identify the assumed web page visitor’s language preferences based on his or her home country. Thankfully, I am only a lightweight victim of this practice, because I speak the language which I get served based on their guesses. But how happy would a Japanese tourist on vacation in Austria be to get his pages delivered in German, only because he uses an Austrian IP address? There are many reasons why a person located in one country may not understand the language that is spoken there. I only have to drive about 20 kilometers to get into such a country (the Czech Republic). So to assume that a person’s preferred language is the language spoken by the majority of a country (what about minorities or countries that have more than one language?) is very wrong. What are better solutions? As I mentioned, I use an English operating system and an English browser. Almost every browser sends a user agent string. My currently most frequently sent user agent string for example is “Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.7) Gecko/20100713 Firefox/3.6.7″. Every web server that receives a request from a user agent (usually a browser) can read this user agent string and process it accordingly. What can we read in mine, amongst other information? en-US. So the locale tells a web server that it’s a browser (a Firefox 3.6.7) in American English. Now guess what my preferred language will be? Is it likely that I use a browser in a language which I don’t want? Is it likely that a tourist in a different country uses a browser in a language that he or she doesn’t want? It’s certainly possible (like if there is a terminal in the hotel room or an Internet Café), but nowadays most people have their laptops, notebooks, smart phones etc. And their browser language is still their preferred one, regardless where they are currently located. So the browser language serves as a much better criterion to assume a web site visitor’s preferred language. Alright, no big rocket science so far, and to be honest, there is no big rocket science to come. At one point, when I was annoyed about Google’s failure to identify correctly which language to deliver me, I wondered, how hard would it be to implement the (in my humble opinion) better solution, using the web browser’s locale. The answer is far from being rocket science. The solution can be as easy as this: First, get yourself a recent copy of the Zend Framework and install it so you make its libraries accessible to your code. You can add an __autoload() function, something like this: function __autoload($classname) { if (substr($classname, 0, 4) == "Zend") { $classname = str_replace("_", "/", $classname); require_once $classname . '.php'; } } or even easier, use the Zend Framework AutoLoader: require_once 'Zend/Loader/Autoloader.php'; Zend_Loader_Autoloader::getInstance(); Everything we need to identify the browser language and locale can be found inside Zend_Locale. The constructor of the Zend_Locale class allows to set the desired language or locale. In addition to that, there are three predefined constants named Zend_Locale::BROWSER, Zend_Locale::ENVIRONMENT and Zend_Locale::FRAMEWORK. According to the Zend_Locale introduction manual page they do the following: Zend_Locale::FRAMEWORK – “When Zend Framework has a standardized way of specifying component defaults (planned, but not yet available), then using this constant during instantiation will give preference to choosing a locale based on these defaults. If no matching locale can be found, then preference is given to ENVIRONMENT and lastly BROWSER.” Zend_Locale::ENVIRONMENT – “PHP publishes the host server’s locale via the PHP internal function setlocale(). If no matching locale can be found, then preference is given to FRAMEWORK and lastly BROWSER.” Zend_Locale::BROWSER – “The user’s Web browser provides information with each request, which is published by PHP in the global variable HTTP_ACCEPT_LANGUAGE. if no matching locale can be found, then preference is given to ENVIRONMENT and lastly FRAMEWORK.” And this is what we want! Lets get to the code: $zend_locale = new Zend_Locale(Zend_Locale::BROWSER); // returns en for English, de for German etc. $browser_language = $zend_locale->getLanguage(); // returns en-US for American English, en-GB for British English etc. $browser_locale = $zend_locale->toString(); Three lines of code to get all the information we need. And how to make a choice which language to serve based on that? Lets assume we have a site in English, French and German. French browsers should get French, German browsers should get German and everybody else should get English: $site_language = "en"; switch ($browser_language) { case "de" : $site_language = "de"; break; case "fr" : $site_language = "fr"; } And that’s it. That’s all Google would have to do to make me happy. I assume, their IP address based identification code requires more lines, and is still a poorer solution in my opinion. There are additional benefits to this solution. Want to display date and time according to the locale? Here is what to do: $zend_date = Zend_Date::now(); $date_time = $zend_date->get(Zend_Date::DATETIME_FULL, $zend_locale); print $date_time; My default browser now shows me “Friday, July 23, 2010 2:14:58 AM Europe/Vienna”. If I fire up a German browser, the same page delivers me “Freitag, 23. Juli 2010 02:17:11 Europe/Vienna” and there may be answers like “vendredi 23 juillet 2010 02:17:56 Europe/Vienna” (French), “viernes 23 de julio de 2010 02:18:29 Europe/Vienna” (Spanish), “pátek, 23. července 2010 02:18:53 Europe/Vienna” (Czech) or “2010年7月23日金曜日2時19分17秒 Europe/Vienna” (Japanese) as well. It’s pretty cool how the Zend Framework can make such tasks very simple . Where is a solution like mine being used in the wild? I don’t know where as of the time of this writing, but I know where very soon. The MySQL website will soon launch new localized content, delivered to people who we think really want to see these languages. We try very hard not to annoy people by enforcing a language upon them which they would not like to choose. And remember: to make good guesses is good. The best solution however is to provide choice! If you make assumptions and even if the assumptions are well-founded, always leave people the choice to select what they want. There may be situations which you are not thinking of and because of some unusual circumstances, somebody may still want to choose differently than you would think. www.mpopp.net | 7/23/10 1:21 AM