Discussion:
preg_match_all to match <img> tags
Ólafur Waage
2007-08-09 23:45:59 UTC
I know this isn't exactly a PHP-related question, but given the
quality of answers I've seen lately I'll give this a shot. (Yes, yes, I'm
buttering up the crowd before the question.)

I have a weblog system that I am creating. The trouble is that if a
user links to an external image wider than 500 pixels, it
messes with the whole layout.

I found some regex code that I'm using at the moment, but it's not good at
matching the entire image tag. It seems to ignore properties after the src
declaration and does not match tags that have properties before the src
declaration.

preg_match_all("/\< *[img][^\>]*[src] *= *[\"\']{0,1}([^\"\'\ >]*)/i",
$data, $matches);
print_r($matches);

This currently gives me two arrays: the source location from all
img tags and a large part of each tag itself, but not the entire tag.

What I do is match the img tag, find the src, get the image
dimensions, and if the width is more than 500, shrink it down and
add width="X" and height="Y" properties to the image tag.

How can I match an image tag correctly so it does not cause any issues
with how the user adds the image?
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
brian
2007-08-10 01:18:22 UTC
<style>
#your_content_div img { max-width: 500px !important; }
</style>

OK, so it won't work with IE6. Screw them.

But if the height is set in the img tag it'll keep that, so the image
could become distorted. So, you could also do something like:

#your_content_div img { visibility: hidden; }

Then run some JavaScript routine onload to properly figure the
dimensions of each image. Adjust the width down to 500px, if necessary,
then scale the height by the same ratio of new width to original width:

var new_height = (500 / original_width) * original_height;

Then, whether you change the dimensions of the image or not, change the
visibility of each to 'visible'.

So, um ... no PHP involved.

brian
Jan Reiter
2007-08-10 03:35:47 UTC
Maybe this is what you are searching for:

$images = array();
$data = "blah <img src=img.png width=\"400\" height='600'> src=blah.png <img src=gg.tiff>";

// match each complete <img ...> tag
preg_match_all("/<\s*img\b[^>]*>/i", $data, $matches);
foreach($matches[0] as $match)
{
    // pull the src, height and width attributes out of this one tag
    preg_match_all("/\b(src|height|width)\s*=\s*[\"']?([^\"'\s>]*)/i", $match, $m);
    $images[] = array_combine($m[1], $m[2]);
}

print_r($images);

It will produce:

Array
(
[0] => Array
(
[src] => img.png
[width] => 400
[height] => 600
)

[1] => Array
(
[src] => gg.tiff
)
)

I wrote it just as an example. So you may modify it for your needs!
Does anyone know if there is a way to put this into ONE regex??
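
On the "ONE regex" question, a possible sketch follows. It is only an illustration, not something posted in the thread: it relies on optional groups that each wrap a lookahead, so the three attributes can appear in any order. The variable names and the choice of PREG_SET_ORDER are assumptions made for the example, and the pattern will still be fooled by unusual markup (for instance an attribute value that itself contains "src=").

```php
<?php
$data = "blah <img src=img.png width=\"400\" height='600'> src=blah.png <img src=gg.tiff>";

// Each optional group holds a lookahead that peeks ahead inside the current
// tag ([^>]* cannot cross the closing ">") and captures one attribute value.
$pattern = '/<\s*img\b'
    . '(?:(?=[^>]*?\bsrc\s*=\s*["\']?([^"\'\s>]+)))?'
    . '(?:(?=[^>]*?\bwidth\s*=\s*["\']?([^"\'\s>]+)))?'
    . '(?:(?=[^>]*?\bheight\s*=\s*["\']?([^"\'\s>]+)))?'
    . '[^>]*>/i';

preg_match_all($pattern, $data, $m, PREG_SET_ORDER);

$images = array();
foreach ($m as $set) {
    $image = array();
    // Groups 1..3 are src, width, height; a missing attribute is absent or ''.
    foreach (array(1 => 'src', 2 => 'width', 3 => 'height') as $i => $name) {
        if (isset($set[$i]) && $set[$i] !== '') {
            $image[$name] = $set[$i];
        }
    }
    $images[] = $image;
}

print_r($images);
```

The two-step version above is easier to read and to extend with more attributes; the single pattern mainly saves the inner preg_match_all call.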

Jan

Ólafur Waage
2007-08-10 12:23:47 UTC
Guys, I would like to thank you for the help.

Jan, I used some of your code to help me get this. There was one issue
with yours: when preg_match_all runs inside a foreach, $m is overwritten
on every pass, so after the loop it only holds the last match.

Here is the code; I hope someone else can use it. It works
in PHP4 and PHP5. I am currently wrapping imageManip() around the
$_POST variable that holds the blog post when I'm posting it to the SQL
database, so getimagesize() only has to run once.

There is one bug: if the user manually resizes the image to something
less than 590, say 400, this program will still resize it to
590.

function imageProportions($image)
{
    // getimagesize() returns false if the image cannot be read
    $imageLoad = @getimagesize($image);
    if($imageLoad === false)
    {
        return false;
    }

    // if the image is wider than 590 pixels, it will resize both
    // height and width; if not it will do nothing to the variables.
    if($imageLoad[0] > 590)
    {
        $proportion = $imageLoad[0] / $imageLoad[1];
        $imageLoad[0] = 590;
        $imageLoad[1] = $imageLoad[0] / $proportion;

        return array($imageLoad[0], ceil($imageLoad[1]));
    }
    else
    {
        return array($imageLoad[0], $imageLoad[1]);
    }
}

function imageManip($data)
{
    // match all image tags (thanks to Jan Reiter)
    preg_match_all("/<\s*img\b[^>]*>/i", $data, $matches);

    if( !empty($matches[0]) )
    {
        // put all those image tags in one string, since preg_match_all
        // needs the data to be in a string format
        $imageMatch = "";
        foreach($matches[0] as $match)
        {
            $imageMatch .= $match;
        }

        // match all source links within the original preg_match_all output
        preg_match_all("/src=\"(.+?)\"/i", $imageMatch, $m);

        // for each tag, take the source with the same key and replace the
        // entire tag with my <img> tag that includes width, height and
        // border="0" (the keys only line up if every tag has a quoted src)
        foreach($matches[0] as $imageTagKey => $imageTag)
        {
            if( !isset($m[1][$imageTagKey]) )
            {
                continue;
            }

            $imageSrc = $m[1][$imageTagKey];
            $imageStats = imageProportions($imageSrc);

            // leave the tag alone if the image size could not be read
            if($imageStats === false)
            {
                continue;
            }

            $data = str_replace($imageTag, "<img src=\"".$imageSrc."\""
                ." width=\"".$imageStats[0]."\" height=\"".$imageStats[1]."\""
                ." border=\"0\" />", $data);
        }
    }

    return $data;
}
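
One way the manual-resize bug mentioned above might be handled is to take the width the user wrote into account before applying the limit. The helper below is a hypothetical sketch: the name fitWidth(), its parameters and the default limit of 590 are assumptions for illustration and are not part of the posted code. It assumes the natural width and height are positive numbers, as returned by a successful getimagesize() call.

```php
<?php
// Hypothetical helper: choose display dimensions for an image.
// $naturalWidth / $naturalHeight come from getimagesize();
// $requestedWidth is the width the user wrote in the tag, or null if none.
function fitWidth($naturalWidth, $naturalHeight, $requestedWidth = null, $maxWidth = 590)
{
    // Start from the user's width when one was given, else the natural width.
    $width = ($requestedWidth !== null && $requestedWidth > 0)
        ? (int) $requestedWidth
        : $naturalWidth;

    // Never exceed the layout limit.
    if ($width > $maxWidth) {
        $width = $maxWidth;
    }

    // Keep the aspect ratio: new height = natural height * new width / natural width.
    $height = (int) ceil($naturalHeight * $width / $naturalWidth);

    return array($width, $height);
}

print_r(fitWidth(1180, 800));       // wide image, no user width
print_r(fitWidth(1180, 800, 400));  // user asked for 400 pixels
print_r(fitWidth(300, 200));        // already narrow enough
```

With something like this, imageManip() would also need to read any existing width attribute from the matched tag and pass it in as the third argument.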
Richard Lynch
2007-08-11 01:48:51 UTC
Scaling the image in the browser is horrible for performance on the
client side...

The entire image still gets downloaded, and then the poor browser has
to scale this monster image down.

You may want to re-think your plan of attack...

You could, for example, force users to only use "registered" images,
and if they "register" an image larger than 500, use http://php.net/gd
to scale it down.

As far as matching the image tag correctly goes, I'd have to suggest
that you try using a DOM to parse the HTML instead of regex.
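
As a rough illustration of the DOM route, here is a minimal sketch using PHP 5's DOM extension (so it would not run on PHP4). The function name extractImages() and the wrapping <div> are assumptions made for this example; warnings are silenced because user-written markup is rarely well formed.

```php
<?php
// Sketch: collect src/width/height from every <img> using the DOM extension.
function extractImages($html)
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // The wrapper element keeps loadHTML() from being handed an empty string.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $images = array();
    foreach ($doc->getElementsByTagName('img') as $img) {
        $attrs = array();
        foreach (array('src', 'width', 'height') as $name) {
            if ($img->hasAttribute($name)) {
                $attrs[$name] = $img->getAttribute($name);
            }
        }
        $images[] = $attrs;
    }

    return $images;
}

$data = "blah <img src=img.png width=\"400\" height='600'> src=blah.png <img src=gg.tiff>";
print_r(extractImages($data));
```

The parser copes with quoted and unquoted attributes in any order, which is the part the regex versions struggle with. Rewriting the tags would then be done on the DOM nodes rather than with str_replace.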

That said, to get the WHOLE img tag in the same vein as you are using
now:
preg_match_all("/\< *[img][^\>]*[src] *= *[\"\']{0,1}([^\"\'\ >]*)[^\>]*\>/i",
$data, $matches);

Note the addition of "[^\>]*\>" after the src capture, which skips any
remaining attributes and matches the closing ">" that marks the end of
the img tag.

The [img] and [src] are kind of wonky, really, as they would also
match this bit of nonsense:

Sometimes <i src="foo">italics</i> could have bogus attributes.

YMMV
--
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some indie artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?
tedd
2007-08-11 17:57:57 UTC
You could set a class attribute for the image in
HTML and scale it in CSS. An img {width: 100%;}
would hold the size to the maximum of the parent.

Cheers,

tedd
--
-------
http://sperling.com http://ancientstones.com http://earthstones.com
Richard Heyes
2007-08-11 18:14:51 UTC
Post by Ólafur Waage
How can i match an image tag correctly so it does not cause any issues
with how the user adds the image.
preg_match_all('/<img[^>]*>/Ui');

Off the top of my head. This wouldn't allow for using the right angle
bracket in the img tag, but that's almost never going to happen in reality.
--
Richard Heyes
+44 (0)844 801 1072
http://www.websupportsolutions.co.uk

Knowledge Base and HelpDesk software
that can cut the cost of online support
Richard Heyes
2007-08-11 18:17:45 UTC
Oops that should be:

preg_match_all('/<img[^>]*>/Ui', $input, $matches);
Tijnema
2007-08-11 18:22:20 UTC
< img src="image.jpg">

Your script doesn't catch above ;)

Tijnema
--
Vote for PHP Color Coding in Gmail! -> http://gpcc.tijnema.info
Richard Heyes
2007-08-11 18:25:18 UTC
Post by Tijnema
< img src="image.jpg">
Your script doesn't catch above ;)
So don't write HTML like that.
Tijnema
2007-08-11 18:27:04 UTC
Post by Richard Heyes
Post by Tijnema
< img src="image.jpg">
Your script doesn't catch above ;)
So don't write HTML like that.
Depends where the HTML is coming from, it might be user input....

Tijnema
Richard Heyes
2007-08-11 18:29:17 UTC
Post by Tijnema
Post by Richard Heyes
Post by Tijnema
< img src="image.jpg">
Your script doesn't catch above ;)
So don't write HTML like that.
Depends where the HTML is coming from, it might be user input....
Ok, add \s* after the initial angle bracket.
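
For reference, the adjusted pattern might look like the sketch below. The \b after "img" and the sample $input string are additions for this illustration, not part of the posts; the U modifier is dropped because [^>]* already stops at the first ">".

```php
<?php
// \s* allows whitespace after "<"; \b keeps names such as <imgfoo> from matching.
$pattern = '/<\s*img\b[^>]*>/i';

$input = 'a <img src="a.png"> b < img src="image.jpg"> c <IMG SRC=c.gif /> d <i>x</i>';
preg_match_all($pattern, $input, $matches);

print_r($matches[0]);
```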
Stut
2007-08-11 19:22:35 UTC
Post by Tijnema
< img src="image.jpg">
Your script doesn't catch above ;)
That's not valid HTML and won't get displayed correctly in most browsers.

-Stut
--
http://stut.net/
Tijnema
2007-08-11 19:28:11 UTC
Post by Stut
That's not valid HTML and won't get displayed correctly in most browsers.
-Stut
hmm.. damn.. you're right :P I always assumed it was just correct...

Tijnema
Ólafur Waage
2007-08-11 20:57:23 UTC
Ive already finished the code and pasted it to this mailing list :)
Just an fyi.

Ólafur Waage