Talk:Lua object memory sizes


Erasing a table is more expensive?

I noticed the quote "It should be noted that erasing a table is generally more expensive than collecting the table during GC. One may wish to simply allocate a new empty table." in this article. Since I used erasing relatively recently and benchmarked it with better results than recreating the table, I decided to write a quick benchmark to clarify this. In the code below all functions do the same thing: fill and reset a table of idxmax size, trymax times. Functions with "rewrite" in the name reset fields manually, "recreate" creates a new table every time, "collect" initiates GC after every recreate, and "hash" works with non-numeric indexes.

As seen from the benchmark results, with sequential integer indexes it is faster to erase the table manually even before counting GC, and once GC is taken into account the gain becomes huge (compare rewrite->recreate->recreatecollect). In the "hash" case the quoted claim is much closer to reality: manual wiping is slower, but if we take GC into account the speed becomes almost the same (compare rewritehash->recreatehash->recreatecollecthash).

Bottom line: if you use an array-style table, do use manual erasing for both memory and performance gains. If you use a hash-style table you can still erase if you're tight on memory, but at the cost of some performance. Running GC manually seems to help reduce both memory and CPU usage in the "hash" case, but I'm not sure the results would be the same if there were a lot more objects in memory for the GC to check, as there normally are in WoW.

Unless there are objections, I'll edit that sentence accordingly in a day or two.

Code:


dofile('LuaBenchmark.lua')

local trymax=10000
local idxmax=50

function rewrite()
	local table1={}
	for try=1, trymax do
		for idx=1, idxmax do
			table1[idx]=idx
		end
		for idx=1, idxmax do
			table1[idx]=nil
		end
	end
end

function rewritehash()
	local table1={}
	for try=1, trymax do
		for idx=1, idxmax do
			table1["t"..idx]=idx
		end
		for idx=1, idxmax do
			table1["t"..idx]=nil
		end
	end
end

function recreate()
	local table1
	for try=1, trymax do
		table1={}
		for idx=1, idxmax do
			table1[idx]=idx
		end
	end
end

function recreatehash()
	local table1
	for try=1, trymax do
		table1={}
		for idx=1, idxmax do
			table1["t"..idx]=idx
		end
	end
end

function recreatecollect()
	local table1
	for try=1, trymax do
		table1={}
		for idx=1, idxmax do
			table1[idx]=idx
		end
		collectgarbage("collect")
	end
end

function recreatecollecthash()
	local table1
	for try=1, trymax do
		table1={}
		for idx=1, idxmax do
			table1["t"..idx]=idx
		end
		collectgarbage("collect")
	end
end

print "=== rewrite"
LuaBenchmark(rewrite)
print "=== rewritehash"
LuaBenchmark(rewritehash)
print "=== recreate"
LuaBenchmark(recreate)
print "=== recreatecollect"
LuaBenchmark(recreatecollect)
print "=== recreatehash"
LuaBenchmark(recreatehash)
print "=== recreatecollecthash"
LuaBenchmark(recreatecollecthash)
Results:

=== rewrite
Data generated   : 1.03125kb
Garbage collected: 1.03125kb
Memory in use    : 25kb (0kb change)
Time            : 0.234sec

=== recreate
Data generated   : 10312.5kb
Garbage collected: 10312.5kb
Memory in use    : 26kb (0kb change)
Time            : 0.391sec

=== recreatecollect
Data generated   : 1.03125kb
Garbage collected: 1.03125kb
Memory in use    : 26kb (0kb change)
Time            : 1.438sec

=== rewritehash
Data generated   : 4.91796875kb
Garbage collected: 3.91796875kb
Memory in use    : 26kb (1kb change)
Time            : 6.328sec

=== recreatehash
Data generated   : 20314.38671875kb
Garbage collected: 20314.38671875kb
Memory in use    : 26kb (0kb change)
Time            : 4sec

=== recreatecollecthash
Data generated   : 2.9990234375kb
Garbage collected: 2.9990234375kb
Memory in use    : 26kb (0kb change)
Time            : 5.969sec

--Rowaasr13 07:17, 16 July 2007 (UTC)

A few issues I see right off the bat with your code...

  • I don't know what code is contained in LuaBenchmark so...
  • You don't disable GC
  • You do not force a table to resize... this is a biggie here. Your code reuses the same indexes every time, which means the table only has to alloc those indexes the first pass. That's a BIG timesaver for your example. The code on the main page erases the table AND forces it to dealloc back. If you don't do that then your table will bloat out and could potentially never shrink back. This could be good, if you're reusing the same indexes over again as you are, or bad if you're trying to do more generic table recycling. I will try to find time tomorrow to profile a bit for both cases and adjust the article accordingly. However, the point being made here is that table recycling is not always the best method, particularly in the more generic cases. User:Tekkub/Sig 07:48, 16 July 2007 (UTC)
I do stop GC. LuaBenchmark is mostly pretty-printing/formatting code with several checks to see what to use depending on whether we're in WoW or not, but it boils down to collect/stop/count before the test and then count again at the end (with one more collect/count to see how much garbage we generated).
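For reference, a harness boiling down to that collect/stop/count pattern might look roughly like this (a sketch only, not the actual LuaBenchmark.lua; the WoW/standalone checks and most of the pretty-printing are left out):

function LuaBenchmark(func)
	collectgarbage("collect")
	collectgarbage("stop")	-- keep the GC from running mid-test
	local membefore = collectgarbage("count")	-- KiB in use before the test
	local timebefore = os.clock()
	func()
	local timetaken = os.clock() - timebefore
	local memafter = collectgarbage("count")	-- garbage is still uncollected at this point
	collectgarbage("collect")
	local memleft = collectgarbage("count")
	collectgarbage("restart")
	print("Data generated   : " .. (memafter - membefore) .. "kb")
	print("Garbage collected: " .. (memafter - memleft) .. "kb")
	print("Memory in use    : " .. memleft .. "kb (" .. (memleft - membefore) .. "kb change)")
	print("Time             : " .. timetaken .. "sec")
end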
Here's resizing version:
function rewritehashresize()
	local table1={}
	for try=1, trymax do
		for idx=1, idxmax do
			table1["t"..idx]=idx
		end
		for idx=1, idxmax do
			table1["t"..idx]=nil
		end
		table1["r"..try]=1 table1["r"..try]=nil
	end
end
Results for this run indeed indicate some decline in performance:
=== rewritehash
Data generated   : 4.91796875kb
Garbage collected: 3.91796875kb
Memory in use    : 27kb (1kb change)
Time            : 5.422sec
=== rewritehashresize
Data generated   : 2.9990234375kb
Garbage collected: 2.9990234375kb
Memory in use    : 27kb (0kb change)
Time            : 5.89sec
The point about "array"-style tables still stands, though. Resizing is not a problem for them, because by definition the indices stay more or less the same from pass to pass, either refilling already allocated slots or adding new ones only when all previous ones are in use, so there's nothing to shrink. Adding table1[try+idxmax]=1 table1[try+idxmax]=nil to "rewrite" to force this normally unnatural behavior (shown as "rewritegap" below) shows that it is now indeed a bit slower than "recreate", but once we count the postponed GC in by checking the "recreatecollect" results, we see that it is still almost 3 times faster.
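(The modified function, labelled "rewritegap" in the results below, presumably looks like this:)

function rewritegap()
	local table1={}
	for try=1, trymax do
		for idx=1, idxmax do
			table1[idx]=idx
		end
		for idx=1, idxmax do
			table1[idx]=nil
		end
		table1[try+idxmax]=1 table1[try+idxmax]=nil	-- push the table past idxmax each pass
	end
end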
=== rewrite
Data generated   : 1.03125kb
Garbage collected: 1.03125kb
Memory in use    : 27kb (0kb change)
Time            : 0.234sec
=== rewritegap
Data generated   : 0.0625kb
Garbage collected: 0.0625kb
Memory in use    : 28kb (0kb change)
Time            : 0.578sec
=== recreate
Data generated   : 10312.5kb
Garbage collected: 10312.5kb
Memory in use    : 28kb (0kb change)
Time            : 0.516sec
=== recreatecollect
Data generated   : 1.03125kb
Garbage collected: 1.03125kb
Memory in use    : 28kb (0kb change)
Time            : 1.562sec
--Rowaasr13 08:13, 16 July 2007 (UTC)
  • Your "recreatecollect" test also calls collectgarbage("collect") 10k times. Bit flawed, that one.
--Mikk (T) 12:58, 18 July 2007 (UTC)

Mikk's results

Mkay.. I seem to be getting very different results.

function LuaBenchmark(func)
	local time = os.clock()
	collectgarbage("collect")	-- settle the heap before measuring
	local mem = collectgarbage("count")
	func()
	collectgarbage("collect")	-- collect the garbage the test generated
	time = os.clock() - time
	mem = collectgarbage("count") - mem
	print("Mem diff: " .. math.floor(mem*1024))	-- bytes retained after the run
	print("Time    : " .. time)
end


trymax=100000
idxmax=100

function rewrite()
	local table1={}
	for try=1, trymax do
		for idx=1, idxmax do
			table1[idx]=idx
		end
		for idx=1, idxmax do
			table1[idx]=nil
		end
		table1[try] = true	-- touch a fresh index each pass to force a resize
		table1[try] = nil
	end
end

str = [[
function rewritehash()
	local table1 = {}
	for try=1 ,trymax do
    -- do things to the table
]]   
for idx=1,idxmax do 
	str = str .. [[ 
		table1["val]]..idx..[["] = true 
]]
end
str = str .. [[  
		for k in pairs(table1) do
        table1[k] = nil -- this section clears the temporary table
    end
    table1[try] = true
  	table1[try] = nil
  end
end
]]
assert(loadstring(str))()



function recreate()
	for try=1, trymax do
		local table1={}
		for idx=1, idxmax do
			table1[idx]=idx
		end
	end
end	


str = [[
function recreatehash()
	for try=1 ,trymax do
		local table1 = {}
]]   
for idx=1,idxmax do 
	str = str .. [[ 
		table1["val]]..idx..[["] = true 
]]
end
str = str .. [[  
  end
end
]]
assert(loadstring(str))()


print "=== rewrite"
LuaBenchmark(rewrite)
print "=== rewritehash"
LuaBenchmark(rewritehash)
print "=== recreate"
LuaBenchmark(recreate)
print "=== recreatehash"
LuaBenchmark(recreatehash)

Running with idxmax = 100

=== rewrite
Mem diff: -538
Time    : 4.218
=== rewritehash
Mem diff: 816
Time    : 4.485
=== recreate
Mem diff: -672
Time    : 3.625
=== recreatehash
Mem diff: 0
Time    : 5.5

Running with idxmax = 10

=== rewrite
Mem diff: -99
Time    : 0.828
=== rewritehash
Mem diff: 816
Time    : 0.406
=== recreate
Mem diff: -672
Time    : 0.844
=== recreatehash
Mem diff: 0
Time    : 0.89

Discussion

I'm not doing the collect-each-loop trick. That's totally unnatural and of course adds overhead. Anyway... what's up with rewritehash using so little CPU compared to rewrite? Seems a bit odd. --Mikk (T) 10:02, 17 July 2007 (UTC)

Tekkub's tests

Doing a batch of tests myself, and so far here's what I'm seeing... (test code here)

  • Erasing a table costs roughly the same time as erasing and shrinking.
  • Arrays (consecutive integer indexes) of 128 indexes or fewer are cheaper to erase and reshrink than doing a GC.
  • Arrays of 129-256 indexes are about equal to GC on erase.
  • Arrays of 257+ indexes are more expensive to erase and shrink... the cost seems to go up at a steady rate, while the GC cost stays mostly constant.
  • Hashes WILL NOT SHRINK. That means that you must GC them if you want that memory back.
I did not test to see if a mixed hash/array would shrink.

So here are my general conclusions: if you are using nearly the same numeric indexes every time, and your table is fairly small (<128 indexes), it's cheaper to erase and reuse the table. If the number of indexes is large (256+) it's generally cheaper to throw stuff out for GC. If you are using hash-type indexes your table will bloat out and never shrink back, so you might want to toss it out to GC periodically "just in case". User:Tekkub/Sig 05:20, 18 July 2007 (UTC)
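(For anyone following along, the "erase and reshrink" being measured is roughly this pattern; a sketch only with a made-up helper name, not Tekkub's actual test code:)

local freshkey = 0
local function eraseandshrink(t)
	for k in pairs(t) do	-- erase every key
		t[k] = nil
	end
	freshkey = freshkey + 1
	t["shrink"..freshkey] = true	-- insert and remove a brand-new key, the same trick as the
	t["shrink"..freshkey] = nil	-- table1[try]=true/nil lines above, to prompt a resize
	return t
end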


I guess it should be noted that those tests don't take into account the CPU cost of actually storing and retrieving tables from a compost.
Anyway... I _think_ the right way to parse these results for real-life use is:
- As long as you are using hashes with a relatively small number of keys, it's faster to recycle them. The fact that the internal storage space doesn't shrink matters less since the next user will use roughly as many keys anyway.
- For any large table, and for all integer-indexed tables: gogo let the GC munch them.
But even for the first case I'd want to do a full test that looks more like real-life use (actually HAVING a compost, letting the compost have a variable number of tables in it, using keys with different names for each run, etc.).
--Mikk (T) 13:15, 18 July 2007 (UTC)
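(For readers who haven't seen one: a "compost" here is just a cache of discarded tables that are wiped and handed back out instead of allocating new ones. A bare-bones sketch of the idea, with made-up names, not the actual Compost library:)

local compost = {}	-- keyed by the recycled tables themselves, so it's a hash

local function Reclaim(t)
	for k in pairs(t) do t[k] = nil end
	compost[t] = true
end

local function Acquire()
	local t = next(compost)	-- hand back any cached table, or fall back to a new one
	if t then
		compost[t] = nil
		return t
	end
	return {}
end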
Well the conclusions I've come to are these personally...
  • Don't do generic compost, but DO reuse specific tables if you know their behavior fits the pattern (small, generally unchanging indexes)
  • It doesn't matter if it's a 'hash' or an 'array', as long as the total number of indexes is <= 128. In the 129-256 range you're roughly equal to GC, so recycling there is questionable.
  • Shrinking a table is a cool trick, but it doesn't work all the time. Considering that hash indexes appear to never shrink, it doesn't seem too helpful to hold on to those tables for a long time if the indexes are highly variable.
The long and the short of it is, recycle on a per-table basis, don't go for the generic recycling table cache like Compost had. It was a nifty trick, but it's not an all-encompassing solution to improve performance. User:Tekkub/Sig 19:02, 18 July 2007 (UTC)
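(A hypothetical example of the per-table approach, with made-up names: one small scratch table owned by a single code path, wiped in place between uses instead of being thrown to the GC or a shared cache:)

local scratch = {}	-- one table, reused only by this code path

local function DescribeHit(amount, school, source)
	scratch.amount, scratch.school, scratch.source = amount, school, source
	local text = scratch.source .. " hit for " .. scratch.amount .. " (" .. scratch.school .. ")"
	scratch.amount, scratch.school, scratch.source = nil, nil, nil	-- same keys every time, so the table never grows
	return text
end

print(DescribeHit(120, "Fire", "Ragnaros"))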

Important note regarding composts

It is important to remember that the compost table itself also has indices, and that in-game it will keep handling tables with wildly differing addresses (references). This means the compost's index keeps growing all the time, and never really shrinks back, since it's a hash.

To counter this, I've done some tests with linked-list composts, which should totally avoid the problem of indices growing indefinitely. Unfortunately, they're even more expensive CPU-wise than normal composts (partially because we have to keep setmetatable()ing entries added to and removed from the compost), so they are not really viable.
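(For illustration, the general shape of a linked-list compost, with made-up names; this is not Mikk's test code and it leaves out the setmetatable() handling mentioned above. The point is that free tables are chained through a field in each table, so the compost itself has no index that can grow:)

local freelist = nil	-- head of the chain of free tables

local function Release(t)
	for k in pairs(t) do t[k] = nil end
	t.__nextfree = freelist	-- the link lives inside the free table itself
	freelist = t
end

local function Fetch()
	local t = freelist
	if t then
		freelist = t.__nextfree
		t.__nextfree = nil
		return t
	end
	return {}
end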

Added to this, WoW 2.3 will do extra GC runs while you are standing around idling. The answer does seem to remain "usually, you just don't want to do composts at all".

--Mikk (T) 01:41, 27 November 2007 (UTC)

Description of upvalues is a bit flawed

Just a note, the way upvalue memory use is described is somewhat flawed, as it doesn't separate the memory for the upvalue reference (4 bytes per closure) from the memory for the value (28 bytes, IIRC). Lua doesn't care whether the upvalue changes or not (contrary to the wording here); the difference is that in the 'constant' case there's only one VALUE plus a reference in each closure, but in the 'variable' example there's a value for each instance, plus the 4 bytes on each closure. Flickering (talk)
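(A small illustration of the two cases being described; the byte figures come from the post above, not from anything this snippet measures:)

-- 'constant' case: both closures share one upvalue, so there is a single value
-- and each closure only holds a reference to it
local shared = 42
local readers = {}
for i = 1, 2 do
	readers[i] = function() return shared end
end

-- 'variable' case: each call creates a fresh local, so every closure carries its
-- own value in addition to the per-closure reference
local function makeCounter()
	local count = 0
	return function()
		count = count + 1
		return count
	end
end

local c1, c2 = makeCounter(), makeCounter()
print(readers[1](), readers[2]())	-- 42  42  (same shared value)
print(c1(), c1(), c2())	-- 1  2  1  (independent values)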