Discussion:
references, arrays, and why does it take up so much memory?
Daevid Vincent
2013-09-03 01:30:50 UTC
I think I'm confused about how references work.

I have a DB result set in an array I'm looping over. All I want to do is
make the array key the "id" of the result set row.

This is the basic gist of it:

private function _normalize_result_set()
{
    foreach($this->tmp_results as $k => $v)
    {
        $id = $v['id'];
        $new_tmp_results[$id] =& $v; //2013-08-29 [dv] using a reference here cuts the memory usage in half!
        unset($this->tmp_results[$k]);

        /*
        if ($i++ % 1000 == 0)
        {
            gc_enable(); // Enable Garbage Collector
            var_dump(gc_enabled()); // true
            var_dump(gc_collect_cycles()); // # of elements cleaned up
            gc_disable(); // Disable Garbage Collector
        }
        */
    }
    $this->tmp_results = $new_tmp_results;
    //var_dump($this->tmp_results); exit;
    unset($new_tmp_results);
}

Without using the =& reference, my data works great:
$new_tmp_results[$id] = $v;

array (size=79552)
  6904 =>
    array (size=4)
      'id' => string '6904' (length=4)
      'studio_id' => string '5' (length=1)
      'genres' => string '34|' (length=3)
  6905 =>
    array (size=4)
      'id' => string '6905' (length=4)
      'studio_id' => string '5' (length=1)
      'genres' => string '6|37|' (length=5)

However it takes a stupid amount of memory for some unknown reason.
MEMORY USED @START: 262,144 - @END: 42,729,472 = 42,467,328 BYTES
MEMORY PEAK USAGE: 216,530,944 BYTES

When using the reference, the memory usage drops drastically to what I'd
EXPECT it to be (which is actually the problem I'm trying to solve).
MEMORY USED @START: 262,144 - @END: 6,029,312 = 5,767,168 BYTES
MEMORY PEAK USAGE: 82,051,072 BYTES

However my array is all kinds of wrong:

array (size=79552)
  6904 => &
    array (size=4)
      'id' => string '86260' (length=5)
      'studio_id' => string '210' (length=3)
      'genres' => string '8|9|10|29|58|' (length=13)
  6905 => &
    array (size=4)
      'id' => string '86260' (length=5)
      'studio_id' => string '210' (length=3)
      'genres' => string '8|9|10|29|58|' (length=13)

Notice that they're all the same values, although the keys seem right. I
don't understand why that happens because
foreach($this->tmp_results as $k => $v)
Should be changing $v each iteration I'd think.

Honestly, I am baffled as to why those unsets() make no difference. All I
can think is that the garbage collector doesn't run. But then I had also
tried to force gc() and that still made no difference. *sigh*

I had some other cockamamie idea where I'd use the same tmp_results array in
a tricky way to avoid a second array. The concept being I'd add 1 million
to the ['id'] (which we want as the new array key), then unset the existing
sequential key, then when all done, loop through and shift all the keys by 1
million thereby they'd be the right index ID. So add one and unset one
immediately after. Clever right? 'cept it too made no difference on memory.
Same thing is happening as above where the gc() isn't running or something
is holding all that memory until the end. *sigh*

Then I tried a different way using array_combine() and noticed something
very disturbing.
http://www.php.net/manual/en/function.array-combine.php


private function _normalize_result_set()
{
    if (!$this->tmp_results || count($this->tmp_results) < 1)
        return;

    $D_start_mem_usage = memory_get_usage();
    foreach($this->tmp_results as $k => $v)
    {
        $id = $v['id'];
        $tmp_keys[] = $id;

        if ($v['genres'])
        {
            $g = explode('|', $v['genres']);
            $this->tmp_results[$k]['g'] = $g; //this causes a massive spike in memory usage
        }
    }
    //var_dump($tmp_keys, $this->tmp_results); exit;
    echo "\nMEMORY USED BEFORE array_combine: ".number_format(memory_get_usage() - $D_start_mem_usage)." PEAK: (".number_format(memory_get_peak_usage(true)).")<br>\n";
    $this->tmp_results = array_combine($tmp_keys, $this->tmp_results);
    echo "\nMEMORY USED FOR array_combine: ".number_format(memory_get_usage() - $D_start_mem_usage)." PEAK: (".number_format(memory_get_peak_usage(true)).")<br>\n";
    var_dump($tmp_keys, $this->tmp_results); exit;
}

Just the simple act of adding that 'g' variable element to the array causes
a massive change in memory usage. WHAT THE F!?

MEMORY USED BEFORE array_combine: 105,315,264 PEAK: (224,395,264)
MEMORY USED FOR array_combine: 106,573,040 PEAK: (224,395,264)

And taking out the
$this->tmp_results[$k]['g'] = $g;

Results in
MEMORY USED BEFORE array_combine: 8,050,456 PEAK: (78,118,912)
MEMORY USED FOR array_combine: 8,050,376 PEAK: (86,507,520)

Just as a wild guess, I also added 'g' to my SQL so that PHP would already
have a placeholder variable there in tmp_results, but that made no
difference. And still used up nearly double the memory as above.
SELECT DISTINCT `id`, sag.`genres`, 'g' FROM.
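
One possible contributor to the jump, offered as an assumption rather than a diagnosis confirmed anywhere in this thread: every array produced by explode() carries its own bookkeeping on top of the strings it holds, so storing the split form for each row costs noticeably more than storing the pipe-delimited string. The sketch below compares the two on made-up data; the variable names and the row count are illustrative only, and the exact byte counts will vary by PHP version.

```php
<?php
// Illustrative row count; the real result set has 79552 rows.
$n = 20000;

// Cost of keeping the genres as one delimited string per row.
$start = memory_get_usage();
$as_strings = array();
for ($i = 0; $i < $n; $i++) {
    $as_strings[] = "8|9|10|29|" . $i;
}
$string_cost = memory_get_usage() - $start;

// Cost of keeping the same data as an exploded array per row.
$start = memory_get_usage();
$as_arrays = array();
for ($i = 0; $i < $n; $i++) {
    $as_arrays[] = explode('|', "8|9|10|29|" . $i);
}
$array_cost = memory_get_usage() - $start;

printf("strings: %s bytes\n", number_format($string_cost));
printf("arrays:  %s bytes\n", number_format($array_cost));
```

If the split form is only needed occasionally, exploding on demand at the point of use may be cheaper than storing it for all rows.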
Jim Giner
2013-09-03 03:13:39 UTC
Are you sure that the data is what you expect? I've never used an
object to hold the results of a query, but I'm picturing that your
foreach may not be working at all unless an object as a result is
totally different than the type of query results I always process.

You're taking each object of the query results and assigning the key and
value to vars. But aren't query results 'keyless'? So what are you
actually assigning to $k and $v?

If I'm wrong and using an object is quite different, forget I said
anything. :)
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Daevid Vincent
2013-09-03 05:09:20 UTC
Positive the data is right. It all works. It's just the memory usage is the
issue.

I'm not assigning an object. I'm assigning an array to a property (array) IN
an object ($this).

The query itself is using mysql_fetch_array() and MYSQL_ASSOC
http://php.net/manual/en/function.mysql-fetch-array.php

Results are certainly NOT keyless as you wouldn't be able to use them
otherwise.

They are a sequential multi-dimensional array starting at [0] ... [x] with
column names as the sub-array hash keys and column values as the
corresponding hash value.
Jim Giner
2013-09-03 13:02:20 UTC
Hard to know that from your code sample.

Please ignore my post. :)

Although - why not just use the query results instead of first putting
them into an array only to put some of them (?) into a second array?
Stuart Dallas
2013-09-03 13:31:15 UTC
Post by Daevid Vincent
I'm confused on how a reference works I think.
I have a DB result set in an array I'm looping over. All I simply want to do
is make the array key the "id" of the result set row.
private function _normalize_result_set()
{
    foreach($this->tmp_results as $k => $v)
    {
        $id = $v['id'];
        $new_tmp_results[$id] =& $v; //2013-08-29 [dv] using a reference here cuts the memory usage in half!
You are assigning a reference to $v. In the next iteration of the loop, $v will be pointing at the next item in the array, as will the reference you're storing here. With this code I'd expect $new_tmp_results to be an array where the keys (i.e. the IDs) are correct, but the data in each item matches the data in the last item from the original array, which appears to be what you describe.
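
A minimal sketch of that effect, using three made-up rows in place of the real result set (the names $rows, $by_ref and $by_val are illustrative, not from the original code). Every element stored with =& is bound to the same variable, so all of them end up showing whatever $v held last. That would presumably also explain the lower memory figure, since the array then holds one row many times over rather than 79552 distinct rows.

```php
<?php
// Hypothetical sample rows standing in for $this->tmp_results.
$rows = array(
    array('id' => '6904',  'genres' => '34|'),
    array('id' => '6905',  'genres' => '6|37|'),
    array('id' => '86260', 'genres' => '8|9|10|29|58|'),
);

// Reference assignment: every element is bound to the single variable $v,
// so each new iteration overwrites what all earlier elements show.
$by_ref = array();
foreach ($rows as $v) {
    $by_ref[$v['id']] =& $v;
}
unset($v); // detach the loop variable before reusing its name

// Plain assignment: each element keeps its own row.
$by_val = array();
foreach ($rows as $v) {
    $by_val[$v['id']] = $v;
}

echo $by_ref[6904]['id'], "\n"; // prints 86260 (the last row)
echo $by_val[6904]['id'], "\n"; // prints 6904
```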
Post by Daevid Vincent
unset($this->tmp_results[$k]);
Doing this for every loop is likely very inefficient. I don't know how the inner workings of PHP process something like this, but I wouldn't be surprised if it's allocating a new chunk of memory for a version of the array without this element. You may find it better to not unset anything until the loop has finished, at which point you can just unset($this->tmp_results).
Post by Daevid Vincent
        /*
        if ($i++ % 1000 == 0)
        {
            gc_enable(); // Enable Garbage Collector
            var_dump(gc_enabled()); // true
            var_dump(gc_collect_cycles()); // # of elements cleaned up
            gc_disable(); // Disable Garbage Collector
        }
        */
    }
    $this->tmp_results = $new_tmp_results;
    //var_dump($this->tmp_results); exit;
    unset($new_tmp_results);
}
Try this:

private function _normalize_result_set()
{
    // Initialise the temporary variable.
    $new_tmp_results = array();

    // Loop around just the keys in the array.
    foreach (array_keys($this->tmp_results) as $k)
    {
        // Store the item in the temporary array with the ID as the key.
        // Note no pointless variable for the ID, and no use of &!
        $new_tmp_results[$this->tmp_results[$k]['id']] = $this->tmp_results[$k];
    }

    // Assign the temporary variable to the original variable.
    $this->tmp_results = $new_tmp_results;
}

I'd appreciate it if you could plug this in and see what your memory usage reports say. In most cases, trying to control the garbage collection through the use of references is the worst way to go about optimising your code. In my code above I'm relying on PHP's copy-on-write feature, where assigned data is only duplicated once one of the copies is changed. No unsets, just using scope to mark a variable as able to be cleaned up.
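
A rough way to watch copy-on-write happen, as a sketch on synthetic data (the 50000-row array and the variable names are made up for illustration). Assigning the array costs next to nothing, and the duplication only shows up at the first write to the copy. Exact byte counts depend on the PHP version, so none are quoted here.

```php
<?php
// Synthetic rows; the shape loosely mirrors the result set in this thread.
$rows = array();
for ($i = 0; $i < 50000; $i++) {
    $rows[] = array('id' => (string) $i, 'genres' => '6|37|');
}

$start = memory_get_usage();
$copy = $rows;                     // assignment only: storage is still shared
$after_assign = memory_get_usage();
$copy[0]['genres'] = 'changed';    // first write: $copy is separated from $rows
$after_write = memory_get_usage();

$assign_cost = $after_assign - $start;
$write_cost  = $after_write - $after_assign;

printf("assign: %s bytes\n", number_format($assign_cost));
printf("first write: %s bytes\n", number_format($write_cost));
```

Note that only the outer array is duplicated on that write; the untouched rows inside it remain shared between $rows and $copy.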

Where is this result set coming from? You'd save yourself a lot of memory/time by putting the data into this format when you read it from the source. For example, if reading it from MySQL, $this->tmp_results[$row['id']] = $row when looping around the result set.
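
A sketch of that idea. To keep it runnable without a database, fetch_rows() is a hypothetical stand-in for the real fetch loop (for instance while ($row = mysql_fetch_array($result, MYSQL_ASSOC))), and it uses a generator, which assumes PHP 5.5 or later.

```php
<?php
/**
 * Hypothetical row source standing in for the database fetch loop.
 * Yields one associative row at a time.
 */
function fetch_rows()
{
    yield array('id' => '6904', 'studio_id' => '5', 'genres' => '34|');
    yield array('id' => '6905', 'studio_id' => '5', 'genres' => '6|37|');
}

// Key each row by its id as it arrives, so no second pass is needed.
$tmp_results = array();
foreach (fetch_rows() as $row) {
    $tmp_results[$row['id']] = $row;
}

print_r(array_keys($tmp_results));
```

PHP converts numeric string keys such as '6904' to integers, which matches the keys shown in the var_dump output earlier in the thread.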

Also, is there any reason why you need to process this full set of data in one go? Can you not break it up into smaller pieces that won't put as much strain on resources?
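
One way that could look, sketched under assumptions: fetch_rows() and in_batches() are hypothetical helpers (not part of the original code), the batch size of 1000 is arbitrary, and generators require PHP 5.5 or later. Only one batch of id-keyed rows is held in memory at a time.

```php
<?php
// Hypothetical row source; in practice this would wrap the database fetch loop.
function fetch_rows($total)
{
    for ($i = 1; $i <= $total; $i++) {
        yield array('id' => (string) $i, 'genres' => '6|37|');
    }
}

// Group incoming rows into id-keyed batches of at most $size rows.
function in_batches($rows, $size)
{
    $batch = array();
    foreach ($rows as $row) {
        $batch[$row['id']] = $row;
        if (count($batch) === $size) {
            yield $batch;
            $batch = array();
        }
    }
    if ($batch) {
        yield $batch; // final, partially filled batch
    }
}

$batch_sizes = array();
foreach (in_batches(fetch_rows(2500), 1000) as $batch) {
    // Real per-batch processing would go here.
    $batch_sizes[] = count($batch);
}

print_r($batch_sizes); // batches of 1000, 1000 and 500 rows
```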

-Stuart
--
Stuart Dallas
3ft9 Ltd
http://3ft9.com/
Harrison, Roberto
2013-09-10 13:17:03 UTC
I have a working project using the above resources, though these are on a
Windows Server 2008 trial installation on a virtual machine in VMWare.

Is there an easy way to transfer the whole project to a legit Windows
Server 2008? So far I have the following backed up:

administration.config
applicationHost.config
redirection.config
Arachnophilia folder
MySQL folder
php folder
phpdebug folder
wwwroot folder

Thanks in advance.


Roberto Harrison, MLIS
Technology Support Librarian
Medical College of Wisconsin Libraries
Link.Learn.Lead

