
Improving the Table class

Hi,

I've been working with tables in Ruby lately using vgvgf's Table class. The advantage of this class is that it handles huge amounts of data better than the Array class, but it can only contain Fixnums, nothing else. The problem I have with that is that I'd like to store Ruby objects in a table, like this:

table[1,2,3] = MyObject.new

It's impossible right now because of the way the Table class is written. But I was wondering: is it possible to modify the class so that it can support any kind of object, just like Array and Hash do?

Here's the source code, written by vgvgf.
[ruby]class Table
  def initialize(x, y = 1, z = 1)
     @xsize, @ysize, @zsize = x, y, z
     @data = Array.new(x * y * z, 0)
  end
  def [](x, y = 0, z = 0)
     @data[x + y * @xsize + z * @xsize * @ysize]
  end
  def []=(*args)
     x = args[0]
     y = args.size > 2 ? args[1] : 0
     z = args.size > 3 ? args[2] : 0
     v = args.pop
     @data[x + y * @xsize + z * @xsize * @ysize] = v
  end
  def _dump(d = 0)
     s = [3].pack('L')
     s += [@xsize].pack('L') + [@ysize].pack('L') + [@zsize].pack('L')
     s += [@xsize * @ysize * @zsize].pack('L')
     for z in 0...@zsize
        for y in 0...@ysize
           for x in 0...@xsize
              s += [@data[x + y * @xsize + z * @xsize * @ysize]].pack('S')
           end
        end
     end
     s
  end
  def self._load(s)
     size = s[0, 4].unpack('L')[0]
     nx = s[4, 4].unpack('L')[0]
     ny = s[8, 4].unpack('L')[0]
     nz = s[12, 4].unpack('L')[0]
     data = []
     pointer = 20
     loop do
        data.push(*s[pointer, 2].unpack('S'))
        pointer += 2
        break if pointer > s.size - 1
     end
     t = Table.new(nx, ny, nz)
     n = 0
     for z in 0...nz
        for y in 0...ny
           for x in 0...nx
              t[x, y, z] = data[n]
              n += 1
           end
        end
     end
     t
  end
  attr_reader(:xsize, :ysize, :zsize, :data)
end
[/ruby]

It seems that the packing of the data is what's problematic here. I'm not very familiar with this, but it looks relatively simple.
So, is there a way to modify this class to do this?

Thanks,
- Dargor
 


You're right, it's in the dump/load methods. vgvgf packs the data as unsigned shorts, which are only 16 bits long, so your objects obviously can't survive that. But that only matters when you use these methods, so temporary tables should still work normally.

Either way, to properly pack your data for dump/load, the best way would be to switch on the class of each object and pack it accordingly. That means you'll need pack and unpack methods for all your classes. That would be the neatest approach.

A quick and dirty solution would be to Marshal your objects into strings and pack those.

I haven't done extensive testing, but a quick solution would be:

Ruby:
class Table
  def initialize(x, y = 1, z = 1)
     @xsize, @ysize, @zsize = x, y, z
     @data = Array.new(x * y * z, 0)
  end
  def [](x, y = 0, z = 0)
     @data[x + y * @xsize + z * @xsize * @ysize]
  end
  def []=(*args)
     x = args[0]
     y = args.size > 2 ? args[1] : 0
     z = args.size > 3 ? args[2] : 0
     v = args.pop
     @data[x + y * @xsize + z * @xsize * @ysize] = v
  end
  def _dump(d = 0)
     s = [3].pack('L')
     s += [@xsize].pack('L') + [@ysize].pack('L') + [@zsize].pack('L')
     s += [@xsize * @ysize * @zsize].pack('L')
     for z in 0...@zsize
        for y in 0...@ysize
           for x in 0...@xsize
              # Dump our object
              dmp = [Marshal.dump(@data[x + y * @xsize + z * @xsize * @ysize])].pack('M*')
              # Encode the size of the dump first as an unsigned short (2 bytes)
              s += [dmp.size].pack('S')
              # Now encode our dumped data
              s += dmp
           end
        end
     end
     s
  end
  def self._load(s)
     size = s[0, 4].unpack('L')[0]
     nx = s[4, 4].unpack('L')[0]
     ny = s[8, 4].unpack('L')[0]
     nz = s[12, 4].unpack('L')[0]
     data = []
     pointer = 20
     loop do
        # Get the size of the encoded object
        tmpSize = s[pointer, 2].unpack('S')[0]
        pointer += 2
        # Push the encoded data
        data.push(s[pointer, tmpSize].unpack('M*')[0])
        pointer += tmpSize
        break if pointer > s.size - 1
     end
     t = Table.new(nx, ny, nz)
     n = 0
     for z in 0...nz
        for y in 0...ny
           for x in 0...nx
              t[x, y, z] = Marshal.load(data[n])
              n += 1
           end
        end
     end
     t
  end
  attr_reader(:xsize, :ysize, :zsize, :data)
end

vgvgf might be able to help you more though.
 
Thanks for the fast reply.

Yes, temporary tables would work but I wouldn't be able to dump them.
I tried your method and it works great, with small tables at least. I tried dumping a 500x500x3 table and it took precisely 318.375 seconds, about five and a half minutes.

But I was expecting this. My guess is that using the Marshal.dump method is what makes it slower. I haven't completely tested dumping the same table containing only Numerics, but I started dumping one and it took way too much time. I canceled it after 5 seconds.

Also, maybe I misunderstood you, but I doubt I would need a dump and load for all my classes. They are very simple classes (like RPG::Item, for example).

Still, this is a big step forward, it helps a lot :)

EDIT:
Also, cycling through 750,000 objects and packing all of them individually might be slower than packing the array itself. :P
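For what it's worth, here's a quick way to test that hunch, assuming the standard Benchmark library (a rough sketch; numbers will vary by machine):

[ruby]require 'benchmark'

data = Array.new(750_000) { rand(65_536) }

puts Benchmark.measure { data.map { |v| [v].pack('S') }.join } # pack entries one by one
puts Benchmark.measure { data.pack('S*') }                     # pack the whole array at once
puts Benchmark.measure { Marshal.dump(data) }                  # plain Marshal, for comparison
[/ruby]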
 


Well, the problem with your objects is that you need some way to pack them in a custom way and to interpret the unpacked data back into that object. Unfortunately for you, I think Marshal.dump is the fastest way to serialize Ruby objects...

So yes, you would need a dump/load method for all classes, because you need to know the exact data structure: when you unpack, you must know how many bytes each packed object takes.

Here's an example class with its own dump/load (or pack/unpack I guess):

Ruby:
 

class Test
  # Constant: PACKSIZE is the number of bytes a packed object of this class takes
  PACKSIZE = 6

  attr_accessor :a, :b, :c

  def initialize
    @a, @b, @c = 1, 2, 3
  end

  def pack
    return [a, b, c].pack('SSS') # Returns each variable packed as an unsigned short (2 bytes each) in a string
  end

  def self.unpack(s) # say s is a pre-truncated string, so our data should be at the front
    return_obj = Test.new

    return_obj.a = s[0, 2].unpack('S')[0] # A short is two bytes, so take the first two bytes of the string and unpack them
    return_obj.b = s[2, 2].unpack('S')[0] # Same for b
    return_obj.c = s[4, 2].unpack('S')[0] # And c

    return return_obj
  end
end

 

Because packing is order-based, you need to know the size of the packed objects, and then the size of each variable. You could make this more generic by looping over every visible attribute of a class and packing each one based on its type, and then you'd know how to unpack them too, but that would probably be more computationally intensive than coding custom pack/unpack methods, which will probably be the fastest way, because all this information is static.
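Just to illustrate that generic idea, here is a rough sketch of the packing side only (the unpack side would have to walk the same attributes in the same order). It assumes every instance variable is either a non-negative integer that fits in 16 bits or a string; GenericPack is a made-up module name, not something from this thread.

[ruby]module GenericPack
  # Packs every instance variable by type: integers as 2-byte unsigned shorts,
  # strings as a 2-byte length prefix followed by the bytes. Anything else raises.
  def generic_pack
    s = ''
    instance_variables.sort.each do |name|
      value = instance_variable_get(name)
      case value
      when Integer
        s << 'i' << [value].pack('S')
      when String
        s << 's' << [value.size].pack('S') << value
      else
        raise TypeError, "don't know how to pack a #{value.class}"
      end
    end
    s
  end
end
[/ruby]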
 
Does RM* ever use any number large enough to flip the MSB of an unsigned short (0x8000)? If not, I suggest constructing a temp array holding each non-Fixnum object, replacing that object's Table entry with its index in the temp array bitwise-OR'ed with 0x8000, and then appending a standard Marshal.dump of that array.

For example, I have a Table instance with dimensions of 2x2x1 with the following data in it:
[[[Object.new], [255]], [[99], [Object.new]]]

Dumping this table with this change would result in something like this:
0x00000003,0x00000002,0x00000002,0x00000001,0x8000,0x00FF,0x0063,0x8001<dumped array of Object.new and Object.new>

Theoretically, this change would be mostly backwards compatible with the original serialization method of the Table class, assuming that it ignores everything after the last data entry and assuming that 0-sized temp arrays are not dumped.
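To make the flagging step concrete, here is a tiny sketch for a single entry (the helper names are made up for illustration):

[ruby]# Replace any non-uint16 entry with (index | 0x8000) and stash the real object
# in a side array; decoding checks the high bit to know which case it was.
def encode_entry(value, extras)
  if value.is_a?(Integer) && value >= 0 && value < 0x8000
    value
  else
    extras << value
    0x8000 | (extras.size - 1)
  end
end

def decode_entry(code, extras)
  (code & 0x8000) != 0 ? extras[code & 0x7FFF] : code
end
[/ruby]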
 
The RGSS Table class has an optimized dump and load for uint16 integers. That's why it uses pack and gets smaller dumps. If it dumped all the table integers with the normal Ruby dump, it would take an extra byte for each number in the table bigger than 127. (Btw, that's an old version of my Table class; the newest one is here: http://www.arpgmaker.com/viewtopic.php?p=624725#p624725)

If you want a table for any kind of object, optimizing for size would be hard. I suggest just creating a new table class (or maybe a multidimensional array) and dumping it as a normal object; here is an example:
[pastebin]72[/pastebin]
No great testing there, but it may work, and it should work for any number of dimensions. Dumping a Table.new(500, 500, 3) was fast (less than 10 seconds), but it was 4.79 MB in size.
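Roughly, that approach looks like this (a sketch of the idea, not necessarily what the linked paste contains): a table-like class with no custom _dump/_load at all, so Marshal serializes @data like any other object. ObjectTable is a made-up name for illustration.

[ruby]class ObjectTable
  attr_reader :xsize, :ysize, :zsize, :data

  def initialize(x, y = 1, z = 1)
     @xsize, @ysize, @zsize = x, y, z
     @data = Array.new(x * y * z, 0)
  end

  def [](x, y = 0, z = 0)
     @data[x + y * @xsize + z * @xsize * @ysize]
  end

  def []=(*args)
     v = args.pop
     x = args[0]
     y = args[1] || 0
     z = args[2] || 0
     @data[x + y * @xsize + z * @xsize * @ysize] = v
  end
end

# Marshal.dump(ObjectTable.new(500, 500, 3)) just works, at the cost of dump size.
[/ruby]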

Edit: If you want to optimize the size of the dumps, it would be hard for normal Ruby objects like integers, strings and floats in the table if you don't know exactly where they are. But for the custom classes, you can do what etheon suggested, though it may be worse in some situations.

It depends on what most of your data in the table will be. When dumped, integers in Ruby take:
for -127..127: 2 bytes
for -256..256: 3 bytes
for -32767..32767: 4 bytes
And so on...

If you know that all integers will be between -127..127 or 0..255, you can save 1 byte with custom dumping. If between -32767..32767 or 0..65535, you can save 2 bytes for numbers greater than 256. However, if the integers are outside the -32767..32767 or 0..65535 ranges, you will need to pack all integers as 4 bytes long, and you will lose 2 bytes compared to Ruby dumping when integers are between -127..127, one byte when they are between -256..256, and save 1 byte when they are outside -32767..32767.
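If you want to check those numbers yourself, a quick way is to compare dump sizes with and without one extra integer (a rough sketch; the exact range boundaries in Marshal differ slightly from the ones listed above):

[ruby]# Marginal cost of one more integer inside a Marshal dump
def marginal_dump_size(n)
  Marshal.dump([n, n]).size - Marshal.dump([n]).size
end

marginal_dump_size(100)    # => 2 bytes
marginal_dump_size(200)    # => 3 bytes
marginal_dump_size(30_000) # => 4 bytes

# Compare with pack, which is always 2 bytes per uint16:
[100, 200, 30_000].pack('S*').size # => 6
[/ruby]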
 


If they're simple structures, it shouldn't be very hard to dump them. It depends on how static your data structure is, but if you always know the types of the variables you're dumping, and they tend to be primitive types, then creating custom dump/load methods for each class is definitely the fastest, most optimized way.
 
Check this one out:
Code:
class Table
    def initialize(x, y = 0, z = 0)
        @dim = 1 + (y > 0 ? 1 : 0) + (z > 0 ? 1 : 0)
        @xsize, @ysize, @zsize = x, [y, 1].max, [z, 1].max
        @data = Array.new(@xsize * @ysize * @zsize, 0)
    end
    def [](x, y = 0, z = 0)
        @data[x + y * @xsize + z * @xsize * @ysize]
    end
    def []=(*args)
        x = args[0]
        y = args.size > 2 ? args[1] : 0
        z = args.size > 3 ? args[2] : 0
        v = args.pop
        @data[x + y * @xsize + z * @xsize * @ysize] = v
    end
    def _dump(d = 0)
        s = [@dim, @xsize, @ysize, @zsize, @xsize * @ysize * @zsize].pack('LLLLL')
        a = []
        @data.each do |value|
            if value.is_a?(Fixnum) && (value < 32768 && value >= 0)
                # Plain uint16 values are packed directly
                s << [value].pack("S")
            else
                # Anything else goes into a side array; its index is stored
                # with the high bit set as a marker
                ni = a.size
                a << value
                s << [0x8000 | ni].pack("S")
            end
        end
        if a.size > 0
            s << Marshal.dump(a)
        end
        s
    end
    def self._load(s)
        size, nx, ny, nz, items = *s[0, 20].unpack('LLLLL')
        # The * breaks apart an array into an argument list
        t = Table.new(*[nx, ny, nz][0, size])
        d = s[20, items * 2].unpack("S#{items}")
        if s.length > (20 + items * 2)
            # The trailing Marshal blob holds the non-uint16 entries
            a = Marshal.load(s[(20 + items * 2)...s.length])
            d.collect! do |i|
                if i & 0x8000 == 0x8000
                    a[i & ~0x8000]
                else
                    i
                end
            end
        end
        t.data = d
        t
    end
    attr_accessor(:xsize, :ysize, :zsize, :data)
end
It can dump non-uint16 items, and is completely backwards compatible with the original Table implementation.
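For anyone trying it, a quick round-trip check (assuming the class above is used in place of the built-in Table):

[ruby]t = Table.new(2, 2)
t[0, 0] = 42               # stays a packed uint16
t[1, 1] = "not a number"   # goes into the marshaled side array

copy = Marshal.load(Marshal.dump(t))
copy[0, 0]  # => 42
copy[1, 1]  # => "not a number"
[/ruby]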
 
Sorry for the delay.

Thanks a lot for the help guys, I really appreciate it. :)

I tried a couple of things based on your suggestions. What works best so far is vgvgf's method (though I haven't tried Yeyinde's yet). It's a lot faster than anything else I tried (it took about 7 seconds to dump a table of 750,000 objects).

To put you in context a bit, I'm making my own map editor for my FF6 SDK. My FF6::Map class contains an array of FF6::Map::Layer objects, each of which contains a tile table (map_width, height, 1) that will hold FF6::Map::Tile objects.

At the moment, I can create, edit, save and load my map using the modified Table class and it works fine. I can also create a map by hand in the script editor in RM, save it and load it, and that works fine too. However, when I create a map in my editor and load it in RM, I get an error saying "dump format error", and I don't know why.

It happens only when trying to load objects with a Table.
My script library is completely new. Nothing is left from RM except the game executable and the RGSS DLL.

I think I'll be using only one specific kind of object in this table, so I know exactly what variables it contains. They are all Fixnums.

I'll check your script, Yeyinde, and I will also try to create a _dump and _load method for my Tile class.

Thanks a lot!
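For reference, here is a rough sketch of what such a _dump/_load pair for a Tile class could look like, following etheon's pack/unpack pattern. The attribute names (tile_id, layer, flags) are made up for illustration; the real FF6::Map::Tile fields may differ, but the idea is the same if they are all Fixnums that fit in 16 bits.

[ruby]class Tile
  attr_accessor :tile_id, :layer, :flags

  def initialize(tile_id = 0, layer = 0, flags = 0)
    @tile_id, @layer, @flags = tile_id, layer, flags
  end

  def _dump(depth = 0)
    # Three Fixnums, each packed as an unsigned short (2 bytes)
    [@tile_id, @layer, @flags].pack('SSS')
  end

  def self._load(s)
    new(*s.unpack('SSS'))
  end
end
[/ruby]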
 
