Today I learned… how to remove unwanted characters like â€ and Á from WordPress post content using a MySQL query.
Backstory: I recently migrated a blog from shared hosting to a VPS. This experience alone could fuel TILCode content into 2024, but this particular “TIL” article is about a curious character encoding artifact that occurred as I exported the WordPress MySQL data and imported it elsewhere: Á and â€ symbols everywhere!
Removing these unwanted symbols from my WordPress posts required running a few queries on my MySQL databases. Here are the steps I used.
(And if there’s a better/easier way to do this, please let me know in the comments. This was my first foray into MySQL queries, but it got the job done.)
Step 1: Log into PhpMyAdmin
Access phpmyadmin through your cPanel or yourdomain.com/phpmyadmin – access varies by host and setup
Step 2: Navigate to the SQL tab and change collation
Click on the “PhpMyAdmin” logo so that you’re at the “root” level (and not inside a database – this might work from inside a database, but I did it from outside).
Click on the SQL tab.
Here, you’ll get a large window in which you can type queries – operations you perform on your data.
The first query to run is one that will set the character encoding type.
ALTER DATABASE your_db_name_here DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
I learned this from JesseWeb’s helpful tutorial on resolving WordPress character issues, but his code didn’t work for me as-is (I suppose that’s not surprising considering his guide is nearly 5 years old). I had to remove the quotes from the database name.
Your query will look like the image below. Be sure to replace your_db_name_here with your database’s actual name (look in the left column for your db names).
Click the “Go” button on the right to run the query.
Step 3: Remove unwanted characters from existing posts
For this query, you need to navigate to the actual database itself. Click on its name on the left.
Click on the SQL tab. Now the helper text says that queries are run on the database you selected.
Enter this query string to replace all instances of â€ with an empty space:
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€œ', '');
Repeat the query for every unwanted character. Note also that you can stack ’em up in the query window like so and run ’em all at once:
UPDATE wp_posts SET post_content = REPLACE(post_content, 'â€œ', ''); UPDATE wp_posts SET post_content = REPLACE(post_content, 'Á', ''); UPDATE wp_posts SET post_content = REPLACE(post_content, 'foo', 'bar');
You may also need to clean up comments. You can do that with:
UPDATE wp_comments SET comment_content = REPLACE(comment_content, 'Á', '');
See shadez’s question on WordPress.org for more examples – his example was instrumental in helping me understand this issue.
And with that, all the unwanted characters are gone. Phew!