Discussion:
UTF-8, MySQL (MariaDB) and the head-shaped dent in my desk
Ian Evans
2014-02-13 07:01:00 UTC
I recently moved to a new server and took the move time as a chance to
migrate my database from Latin1 to UTF-8.

With my site's data forms I'm able to enter accented characters just fine,
and when PHP grabs the info from the DB the accents display just fine.
Sometimes for quick little changes, it's easier to go into a program like
Adminer or phpMyAdmin.

I've already posted questions about this at Stack Overflow (
http://stackoverflow.com/questions/21439798/utf-8-input-problems-with-adminer-and-phpmyadmin)
and the Adminer forum (
http://sourceforge.net/p/adminer/discussion/960418/thread/33595373/) but so
far no one's been able to crack this particular nut.

With the UTF-8 in place, I wanted to change director Alfonso Cuaron's name
to Cuarón. I went to his entry in Adminer. Edit. Cuar[alt+0243]n. It showed
in the edit box as Cuarón. But when I saved the change, Adminer showed it
as Cuarón. Okay. Looked at page info in Firefox. Says the character
encoding of the page is UTF-8. So all should be well, right?

I went to one of my PHP data entry forms and created a Bob Cuarón. It
showed up fine. I SSH'd into the server, fired up a mysql command line and
ran an UPDATE SQL line with Cuarón. That worked. But trying to change it in
Adminer still kept giving me Cuarón. I installed phpMyAdmin (which was
giving me some issues with my nginx config) but I was able to edit his name
and...sigh...it too gave me Cuarón. I installed SQLBuddy and...success...I
was able to make the changes, but the program is lacking some of the things
I need, like the ability to edit search results.
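
For readers following along, the symptom described here is what double encoding typically looks like. The sketch below is an illustration only: it assumes the tool received UTF-8 bytes, treated them as Latin-1, and converted them to UTF-8 a second time. The garbled spelling it prints is an assumed example, since the archived post does not preserve the exact characters Adminer displayed.

```python
# Hypothetical illustration of double encoding; the garbled form printed
# below is assumed, not copied from the thread.
name = "Cuarón"

utf8_bytes = name.encode("utf-8")          # b'Cuar\xc3\xb3n'
misread = utf8_bytes.decode("latin-1")     # each byte becomes one character
double_encoded = misread.encode("utf-8")   # what would end up being stored

print(misread)                             # CuarÃ³n
print(double_encoded)                      # b'Cuar\xc3\x83\xc2\xb3n'

# Reversing the two steps recovers the original, which is a quick way to
# confirm that a value was double-encoded rather than replaced or truncated.
repaired = double_encoded.decode("utf-8").encode("latin-1").decode("utf-8")
print(repaired == name)                    # True
```

If the reversal succeeds on a stored value, the bytes were most likely converted once too often somewhere between the browser and the database.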

I'm sure I've nailed everything down:

nginx.conf:

charset UTF-8;

my.cnf:

[client]
default-character-set=utf8
[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci
init-connect='SET NAMES utf8'

/etc/php5/fpm/php.ini:

mbstring.language = Neutral
mbstring.internal_encoding = UTF-8
mbstring.encoding_translation = On
mbstring.http_input = auto
mbstring.http_output = UTF-8
mbstring.detect_order = auto
mbstring.substitute_character = none
default_charset = UTF-8

Iconv support is enabled. The implementation is glibc, the library version is
2.17, and the iconv encodings are set to UTF-8.


SHOW VARIABLES LIKE "%character_set%";

+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

I can't see what I could be missing. Both Adminer and phpMyAdmin handle
UTF-8 so I don't know why it's not working. It worked right out of the box
with SQLBuddy, but as I said it's missing some features.

Any thoughts where I should look? I feel like I'm an edge case as the
Adminer devs can't duplicate the issue.
Ashley Sheridan
2014-02-13 08:09:51 UTC
Post by Ian Evans
[snip]
I know this doesn't exactly answer your issue, but as an alternative I quite like SQLyog. There's a community version on Google Code, and it runs just fine under Wine (I use it myself at home). It makes a lot of things much easier than phpMyAdmin (I've not tried Adminer).
Thanks,
Ash
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Ian Evans
2014-02-13 08:29:02 UTC
On Thu, Feb 13, 2014 at 3:09 AM, Ashley Sheridan
[snip]
I know this doesn't exactly answer your issue, but as an alternative I
quite like SQLyog. There's a community version on Google code, and it runs
just fine under Wine (I use it myself at home). It makes a lot of things
much easier than phpMyAdmin (I've not tried Adminer)
Thanks for the pointer! I'll keep it in mind, but I'm more interested in
web-based tools as opposed to programs.
Ian Evans
2014-02-13 08:25:27 UTC
Weirdness.
Try this: Enter via command line, edit in Adminer, but only add ASCII
letters to the end. Does it still get botched?
Next question is what OS are you using?
Just realized our conversation had gone off-list.

Anyway...entered via the command line. Added a plain old 'o' to the name in
Adminer and ended up with Cuaróno, so still botched.

Our web server is running Ubuntu.
Joshua Kehn
2014-02-13 09:07:31 UTC
Interesting. Check to make sure the browser is submitting the form as UTF-8 and that the server isn't mangling it somehow. I know that some form types (like multipart) can have encoding issues unless the encoding is explicitly declared.

Best,

-Josh
___________________________
http://byjakt.com
Currently mobile
Ian Evans
2014-02-13 09:19:00 UTC
Post by Joshua Kehn
Interesting. Check to make sure the browser is submitting the form with
UTF-8 and that the server isn't mangling it somehow? I know that some form
types (like multipart) can have encoding issues unless it's explicitly
declared
How can I check if the browser is submitting the form as UTF-8? I guess I
can see if Firefox's HTTP Live Headers will show any info.

Looking at the adminer source the <form> tags don't have the
accept-charset="UTF-8" that my own forms have, but considering that adminer
_only_ supports UTF-8 (and I appear to be the edge case) I don't know if
that would be the issue.
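
One way to answer that question, sketched under assumptions: copy the raw request body from a tool that shows it unmodified and test whether the percent-decoded bytes are valid UTF-8. The helper names and sample bodies below are hypothetical, and the sketch covers application/x-www-form-urlencoded bodies only. A multipart body carries the raw bytes directly, so only the validity check would apply there.

```python
from urllib.parse import unquote_to_bytes


def field_bytes(raw_body: str, field: str) -> bytes:
    """Return the undecoded bytes of one field from a urlencoded POST body."""
    for pair in raw_body.split("&"):
        key, _, value = pair.partition("=")
        if key == field:
            # '+' means space in form encoding; %XX escapes become raw bytes.
            return unquote_to_bytes(value.replace("+", " "))
    raise KeyError(field)


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical request bodies as they might appear in a traffic viewer.
sent_as_utf8 = "name=Cuar%C3%B3n&save=1"
sent_as_latin1 = "name=Cuar%F3n&save=1"

print(is_valid_utf8(field_bytes(sent_as_utf8, "name")))    # True
print(is_valid_utf8(field_bytes(sent_as_latin1, "name")))  # False
```

If the body already contains %C3%B3 for the accented letter, the browser is sending UTF-8 and any damage happens later, on the server side.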
Joshua Kehn
2014-02-13 11:19:36 UTC
On 13 Feb 2014, at 1:19, Ian Evans wrote:
How can I check if the browser is submitting the form as UTF-8? I guess I
can see if Firefox's HTTP Live Headers will show any info.
Looking at the adminer source the <form> tags don't have the
accept-charset="UTF-8" that my own forms have, but considering that adminer
_only_ supports UTF-8 (and I appear to be the edge case) I don't know if
that would be the issue.
My guess is that the browser is sending the data in some encoding other
than UTF-8, and the server is trying to guess that encoding and convert the
data to UTF-8. It's probably guessing poorly.

Or I'm completely wrong. Just throwing out ideas here 3AM local time ;)

--jk
Aziz Saleh
2014-02-13 14:55:06 UTC
The adminer-4.0.3.php file seems to be ANSI-encoded, not UTF-8. Try
converting it to UTF-8 before use:

http://stackoverflow.com/a/64889/1935500

Aziz
Ian Evans
2014-02-13 20:00:34 UTC
Post by Aziz Saleh
adminer-4.0.3.php files seem to be ANSI encoded not utf8. Try utf8
encoding it before usage:
http://stackoverflow.com/a/64889/1935500
Aziz,

I cp'd the file to indextest.php then ran:

iconv -f ISO-8859-1 -t UTF-8 < indextest.php > index.php

The resulting page was totally screwed up. It would sort of load but the
screen was full of the characters and black diamonds. Is there another way
to convert it that I should try?
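
A possible reason for the garbage, stated as an assumption rather than a diagnosis: if the file already contained UTF-8 bytes, converting it from ISO-8859-1 encodes every multi-byte character a second time. The sketch below (function name hypothetical) converts only when the input is not already valid UTF-8.

```python
def to_utf8(data: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return data as UTF-8, converting only if it is not valid UTF-8 already."""
    try:
        data.decode("utf-8")
        return data  # already UTF-8: leave the bytes untouched
    except UnicodeDecodeError:
        return data.decode(fallback).encode("utf-8")


already_utf8 = "Cuarón".encode("utf-8")
legacy = "Cuarón".encode("iso-8859-1")

# A blind conversion, like running iconv on a file that is UTF-8 already,
# produces double-encoded bytes.
blind = already_utf8.decode("iso-8859-1").encode("utf-8")

print(to_utf8(already_utf8) == already_utf8)  # True
print(to_utf8(legacy) == already_utf8)        # True
print(blind == already_utf8)                  # False
```

The validity check is a heuristic: Latin-1 text containing only ASCII also passes it, which is harmless because ASCII bytes are identical in both encodings.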
Aziz Saleh
2014-02-13 20:07:27 UTC
If you want to use Adminer you would have to download the source, modify
the code to export the file in UTF-8 and re-compile via PHP (assumption).

Alternatively, why not use something established like phpmyadmin? It is
more than 1 file, but pretty flexible.
Ian Evans
2014-02-13 20:30:42 UTC
Post by Aziz Saleh
If you want to use Adminer you would have to download the source, modify
the code to export the file in UTF-8 and re-compile via PHP (assumption).
Alternatively, why not use something established like phpmyadmin? It is
more than 1 file, but pretty flexible.
Well I was having the same problem with phpmyadmin.

Turns out it was some setting in my php.ini. One of the Adminer folks saw
the differences.

Of course I had grabbed those php.ini changes from some UTF-8 checklist
article. Didn't bookmark it, so I'm not sure which one. Anyway...made the
changes back to the defaults and suddenly Adminer and the command line
agree.

The solution is here:

https://sourceforge.net/p/adminer/discussion/960418/thread/33595373/#42df

Thanks to everyone here for lending some eyeballs to the issue!

Camilo Sperberg
2014-02-13 08:38:09 UTC
Post by Ian Evans
[snip]
Are you sure the source files themselves are saved in UTF-8?

Also check that set_names (in PHP) is not overriding character_set_client and that you don't have a meta tag that overrides the header tags.
Apart from that, I can only tell you that the problem you are having is that you are trying to print a UTF-8 character in an ISO-8859-1 encoding.

Just a silly question: you did remember to restart the db and nginx after making the changes in your confs, right?

Greetings.
Ian Evans
2014-02-13 09:00:36 UTC
[Ah gmail...just sending this again so the list gets it...]
Post by Camilo Sperberg
Are you sure the source files itself are saved in UTF-8?
I wget'd the file directly from adminer.org. There was no editing and
re-saving. Adminer is just one php file.
Post by Camilo Sperberg
Also check that set_names (in PHP) is not overriding character_set_client
and that you don't have a meta tag that overrides the header tags.
Set names and character_set_client are both utf-8. The meta tag sent by
adminer is utf-8 and our nginx server sends a utf-8 header. Firefox reports
that the page sent by adminer is utf-8 encoded.
Post by Camilo Sperberg
Just a silly question: you did remember to restart the db and nginx after
making the changes in your confs right?
A million times. :-)