Alain Williams
2014-05-28 09:03:03 UTC
I am trying to use this to validate input that is supposed to be UTF-8 and to
replace any bad characters with something; '?' would do.
I have the test program below. No matter what argument I give to
mb_substitute_character(), the bad input sequence is always removed. I would
like it to be replaced instead.
Thanks in advance.
<?php
mb_internal_encoding("UTF-8");
// I have tried many lines like the 2 below
// (comment out one or the other)
mb_substitute_character((int)0x3013);
mb_substitute_character((int)63); // '?' is ASCII 63
// \xC0\xBC is invalid UTF-8: an overlong encoding of '<', which should be \x3C
$input = "a bad angle bracket \xC0\xBC here";
$valid = mb_convert_encoding($input, "UTF-8", "UTF-8");
// I always find 2 spaces between 'bracket' and 'here'
echo "valid='$valid'\n";
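
For comparison, here is a rough sketch of a workaround that does not rely on mb_substitute_character() at all. It is not an mbstring feature: replace_invalid_utf8() is a hypothetical helper name, and the pattern is a byte-level regular expression for well-formed UTF-8, assumed to run without the /u modifier so that PCRE works on raw bytes. Every byte that is not part of a well-formed sequence is replaced individually, so a two-byte bad sequence yields two replacement characters.

```php
<?php
// Hypothetical helper, not part of mbstring: replaces each byte that is not
// part of a well-formed UTF-8 sequence with $replacement.
function replace_invalid_utf8($input, $replacement = '?')
{
    // Group 1 matches one well-formed UTF-8 sequence (no overlong forms,
    // no surrogates, nothing above U+10FFFF).
    // Group 2 matches any other single byte, i.e. an invalid one.
    $pattern = '/(
          [\x00-\x7F]
        | [\xC2-\xDF][\x80-\xBF]
        | \xE0[\xA0-\xBF][\x80-\xBF]
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}
        | \xED[\x80-\x9F][\x80-\xBF]
        | \xF0[\x90-\xBF][\x80-\xBF]{2}
        | [\xF1-\xF3][\x80-\xBF]{3}
        | \xF4[\x80-\x8F][\x80-\xBF]{2}
    )|(.)/xs';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($replacement) {
            // $m[2] is only set when the "invalid byte" branch matched.
            return (isset($m[2]) && $m[2] !== '') ? $replacement : $m[1];
        },
        $input
    );
}

// Both bytes of the overlong sequence are invalid on their own,
// so this prints: a bad angle bracket ?? here
echo replace_invalid_utf8("a bad angle bracket \xC0\xBC here"), "\n";
```

Valid multi-byte characters pass through unchanged, because they are consumed whole by the first group before the single-byte fallback is tried.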
--
Alain Williams
Linux/GNU Consultant - Mail systems, Web sites, Networking, Programmer, IT Lecturer.
+44 (0) 787 668 0256 http://www.phcomp.co.uk/
Parliament Hill Computers Ltd. Registration Information: http://www.phcomp.co.uk/contact.php
#include <std_disclaimer.h>
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php