DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workkloads.

Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • How to Repair Corrupt MySQL Database Tables Step-by-Step
  • Harnessing the Power of AWS Aurora for Scalable and Reliable Databases
  • Optimizing MySQL Performance: Best Practices for Database Efficiency
  • Fine-Tuning Performance, Resolving Common Issues in FinTech Application With MySQL

Trending

  • Create Your Own AI-Powered Virtual Tutor: An Easy Tutorial
  • Rethinking Recruitment: A Journey Through Hiring Practices
  • Java Virtual Threads and Scaling
  • Problems With Angular Migration
  1. DZone
  2. Data Engineering
  3. Databases
  4. Avoiding MySQL ALTER Table Downtime

Avoiding MySQL ALTER Table Downtime

By 
Peter Zaitsev user avatar
Peter Zaitsev
·
Jan. 20, 15 · Interview
Likes (1)
Comment
Save
Tweet
Share
14.7K Views

Join the DZone community and get the full member experience.

Join For Free

Originally Written byAndrew Moore

MySQL table alterations can interrupt production traffic causing bad customer experience or in worst cases, loss of revenue. Not all DBAs, developers, syadmins know MySQL well enough to avoid this pitfall. DBAs usually encounter these kinds of production interruptions when working with upgrade scripts that touch both application and database or if an inexperienced admin/dev engineer perform the schema change without knowing how MySQL operates internally.

Truths
* Direct MySQL ALTER table locks for duration of change (pre-5.6)
* Online DDL in MySQL 5.6 is not always online and may incurr locks
* Even with Percona Toolkit‘s pt-online-schema-change there are several workloads that can experience blocking

Here on the Percona MySQL Managed Services team we encourage our clients to work with us when planning and performing schema migrations. We aim to ensure that we are using the best method available in their given circumstance. Our intentions to avoid blocking when performing DDL on large tables ensures that business can continue as usual whilst we strive to improve response time or add application functionality. The bottom line is that a business relying on access to its data cannot afford to be down during core trading hours.

Many of the installations we manage are still below MySQL 5.6, which requires us to seek workarounds to minimize the amount of disruption a migration can cause. This may entail slave promotion or changing the schema with an ‘online schema change’ tool. MySQL version 5.6 looks to address this issue by reducing the number of scenarios where a table is rebuilt and locked but it doesn’t yet cover all eventualities, for example when changing the data type of a column a full table rebuild is necessary. The topic of 5.6 Online Schema Change was discussed in great detail last year in the post, “Schema changes – what’s new in MySQL 5.6?” by Przemysław Malkowski

With new functionality arriving in MySQL 5.7, we look forward to non-blocking DDL operations such as; OPTIMIZE TABLE and RENAME INDEX. (More info)

The best advice for MySQL 5.6 users is to review the matrix to familiarize with situations where it might be best to look outside of MySQL to perform schema changes, the good news is that we’re on the right path to solving this natively.

Truth be told, a blocking alter is usually going to go unnoticed on a 30MB table and we tend to use a direct alter in this situation, but on a 30GB or 300GB table we have some planning to do. If there is a period of time where activity is low and the this is permissive of locking the table then sometimes it is better execute within this window. Frequently though we are reactive to new SQL statements or a new performance issue and an emergency index is required to reduce load on the master in order to improve the response time.

To pt-osc or not to pt-osc?

As mentioned, pt-online-schema-change is a fixture in our workflow. It’s usually the right way to go but we still have occasions where pt-online-schema-change cannot be used, for example; when a table already uses triggers. It’s an important to remind ourselves of the the steps that pt-online-schema-change traverses to complete it’s job. Lets look at the source code to identify these;

[moore@localhost]$ egrep 'Step' pt-online-schema-change # Step 1: Create the new table. # Step 2: Alter the new, empty table. This should be very quick, # Step 3: Create the triggers to capture changes on the original table and <--(metadata lock) # Step 4: Copy rows. # Step 5: Rename tables: orig -> old, new -> orig <--(metadata lock) # Step 6: Update foreign key constraints if there are child tables. # Step 7: Drop the old table.

[moore@localhost]$egrep'Step'pt-online-schema-change
# Step 1: Create the new table.
# Step 2: Alter the new, empty table. This should be very quick,
# Step 3: Create the triggers to capture changes on the original table and <--(metadata lock)
# Step 4: Copy rows.
# Step 5: Rename tables: orig -> old, new -> orig <--(metadata lock)
# Step 6: Update foreign key constraints if there are child tables.
# Step 7: Drop the old table.

I pick out steps 3 and 5 from above to highlight a source of a source of potential downtime due to locks, but step 6 is also an area for concern since foreign keys can have nested actions and should be considered when planning these actions to avoid related tables from being rebuilt with a direct alter implicitly. There are several ways to approach a table with referential integrity constraints and they are detailed within the pt-osc documentation a good preparation step is to review the structure of your table including the constraints and how the ripples of the change can affect the tables around it.

Recently we were alerted to an incident after a client with a highly concurrent and highly transactional workload ran a standard pt-online-schema-change script over a large table. This appeared normal to them and a few hours later our pager man was notified that this client was experiencing max_connections limit reached. So what was going on? When pt-online-schema-change reached step 5 it tried to acquire a metadata lock to rename the the original and the shadow table, however this wasn’t immediately granted due to open transactions and thus threads began to queue behind the RENAME command. The actual effect this had on the client’s application was downtime. No new connections could be made and all existing threads were waiting behind the RENAME command.

Metadata locks
Introduced in 5.5.3 at server level. When a transaction starts it will acquire a metadata lock (independent of storage engine) on all tables it uses and then releases them when it’s finished it’s work. This ensures that nothing can alter the table definition whilst a transaction is open.

With some foresight and planning we can avoid these situations with non-default pt-osc options, namely –nodrop-new-table and –no-swap-tables. This combination leaves both the shadow table and the triggers inplace so that we can instigate an atomic RENAME when load permits.

EDIT: as of percona-toolkit version 2.2 we have a new variable –tries which in conjunction with –set-vars has been deployed to cover this scenario where various pt-osc operations could block waiting for a metadata lock. The default behaviour of pt-osc (–set-vars) is to set the following session variables when it connects to the server;

wait_timeout=10000
innodb_lock_wait_timeout=1
lock_wait_timeout=60

when using –tries we can granularly identify the operation, try count and the wait interval between tries. This combination will ensure that pt-osc will kill it’s own waiting session in good time to avoid the thread pileup and provide us with a loop to attempt to acquire our metadata lock for triggers|rename|fk management;

–tries swap_tables:5:0.5,drop_triggers:5:0.5

The documentation is here http://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html#cmdoption-pt-online-schema-change–tries

This illustrates that even with a tool like pt-online-schema-change it is important to understand the caveats presented with the solution you think is most adequate. To help decide the direction to take use the flow chart to ensure you’re taking into account some of the caveats of the MySQL schema change. Be sure to read up on the recommended outcome though as there are uncharted areas such as disk space, IO load that are not featured on the diagram.

DDL Decision chart

Choosing the right DDL option

Ensure you know what effect ALTER TABLE will have on your platform and pick the right method to suit your uptime. Sometimes that means delaying the change until a period of lighter use or utilising a tool that will avoid holding a table locked for the duration of the operation. A direct ALTER is sometimes the answer like when you have triggers installed on a table.

– In most cases pt-osc is exactly what we need
– In many cases pt-osc is needed but the way in which it’s used needs tweaking
– In few cases pt-osc isn’t the right tool/method and we need to consider native blocking ALTER or using failovers to juggle the change into place on all hosts in the replica cluster.

If you want to learn more about avoiding avoidable downtime please tune into my webinar Wednesday, November 19 at 10 a.m. PST. It’s titled “Tips from the Trenches: A Guide to Preventing Downtime for the Over-Extended DBA.” Register now!(If you miss it, don’t worry: Catch the recording and download the slides from that same registration page.)

 

Database MySQL

Published at DZone with permission of Peter Zaitsev, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • How to Repair Corrupt MySQL Database Tables Step-by-Step
  • Harnessing the Power of AWS Aurora for Scalable and Reliable Databases
  • Optimizing MySQL Performance: Best Practices for Database Efficiency
  • Fine-Tuning Performance, Resolving Common Issues in FinTech Application With MySQL

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!