[WIP] ArzBaller - the CLI for packing dbrs

There is evidently demand for a CLI arz-packer and since I know the format of the arz files, I have begun putting this together.

There are a few points of discussion to be had here that can start while I am busy coding this thing up.

First of all, each record has 2 ints that (I suspect) store the date/time of last modification. When you are going from dbrs (text files) to the arz format, I guess this information is available, but IMO it is there for the AssetManager, not the game, so I will likely just zero out this area and not worry about it. Does anyone know anything different about this?

Second, in the way the data is compressed, there is a flag that describes whether the field value is an integer/float/boolean/string datatype. Going the opposite way (from dbr text back to arz) will require using templates to obtain this information. Are there any modders making new templates? If so, we might need to start thinking about including them in source distributions of mods, because guessing what datatype the field values are could bork things if the code guesses wrong.

After that, I dunno, what else should I be aware of? Besides Davood, are there any other people out there waiting on this?

Attachment: 20160818064533_1.jpg

Arz format consists of 5 blocks of binary data: Header (HDR), Record Table (RT), Record Data (RD), String Table (ST) and the Footer (FTR).

The binary format is LittleEndian with SinglePrecision Floats.

The HDR consists of 6 32-bit ints.

  1. A version number (HDR_VER)
  2. The offset to the Record Table (HDR_RDPOS)
  3. The size of the Record Table (HDR_RDSIZE)
  4. The record count (HDR_RDCNT)
  5. The offset to the ST (HDR_STPOS)
  6. The size of the ST (HDR_STSIZE)
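For anyone who wants to poke at a file without my code, the six header values above can be read roughly like this. This is a minimal Python sketch, not the ArzBaller source (that is Qt/C++); the names `ArzHeader`, `parse_header` and `pack_header` are made up for illustration, and treating the ints as signed is an assumption.

```python
import struct
from typing import NamedTuple

# Six little-endian 32-bit ints, in the order listed above.
# Signed ("i") is an assumption; the post only says "32-bit ints".
HEADER_STRUCT = struct.Struct("<6i")


class ArzHeader(NamedTuple):
    version: int       # HDR_VER
    rt_offset: int     # HDR_RDPOS (offset to the Record Table)
    rt_size: int       # HDR_RDSIZE (size of the Record Table)
    record_count: int  # HDR_RDCNT
    st_offset: int     # HDR_STPOS
    st_size: int       # HDR_STSIZE


def parse_header(data: bytes) -> ArzHeader:
    """Read the header from the first 24 bytes of an arz buffer."""
    return ArzHeader(*HEADER_STRUCT.unpack_from(data, 0))


def pack_header(header: ArzHeader) -> bytes:
    """Serialize the header back to its 24-byte binary form."""
    return HEADER_STRUCT.pack(*header)
```

Usage would be something like `parse_header(open(path, "rb").read())`, where `path` points at whatever arz you are inspecting.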

The RD directly follows the RT and has the same count of records, so offset/size information is not needed for both blocks. I probably should refactor the code to call the HDR values RTPOS and RTSIZE, but whatever.

Strings are stored as int32 (size) + char* (string), referred to as CStrings. In most places, strings are stored as the 32-bit integer index of their position within the String Table (STidx: see below).
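A CString as described here (int32 size, then that many bytes) can be read and written with a few lines. Again a rough Python sketch rather than my actual code; `read_cstring` and `write_cstring` are hypothetical helper names, and decoding the bytes as latin-1 is an assumption, since nothing above says what encoding the game uses.

```python
import struct


def read_cstring(buf: bytes, pos: int) -> tuple[str, int]:
    """Read one CString at pos; return (text, position just past it)."""
    (length,) = struct.unpack_from("<i", buf, pos)  # little-endian int32 size
    start = pos + 4
    raw = buf[start:start + length]
    # latin-1 is an assumption; it maps every byte to a character, so it never fails.
    return raw.decode("latin-1"), start + length


def write_cstring(text: str) -> bytes:
    """Encode text as int32 size followed by the raw bytes (no terminator)."""
    raw = text.encode("latin-1")
    return struct.pack("<i", len(raw)) + raw
```

Returning the new position from `read_cstring` makes it easy to read several CStrings back to back.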

Beginning at HDR_RDPOS you can read in HDR_RDCNT Record Table entries with the following format of 32-bit ints plus a CString.

  1. The record name as an STidx
  2. The record Class as a CString
  3. The compressed data offset
  4. The compressed data size
  5. The uncompressed data size
  6. int32 - A, which I call fileLastWriteTimeLow
  7. int32 - B, which I call fileLastWriteTimeHigh
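Put together, reading the Record Table could look like the sketch below. It is Python for illustration only (the real tool is Qt/C++), the names `RecordEntry`, `read_record_entry` and `read_record_table` are invented here, and the latin-1 decoding of the class CString is an assumption.

```python
import struct
from typing import NamedTuple


class RecordEntry(NamedTuple):
    name_index: int         # STidx of the record name
    record_class: str       # stored inline as a CString
    data_offset: int        # compressed data offset
    compressed_size: int
    uncompressed_size: int
    write_time_low: int     # "A" above
    write_time_high: int    # "B" above


def read_record_entry(buf: bytes, pos: int) -> tuple[RecordEntry, int]:
    """Read one entry at pos; return (entry, position of the next entry)."""
    name_index, class_len = struct.unpack_from("<ii", buf, pos)
    pos += 8
    # Encoding is an assumption; latin-1 never fails on arbitrary bytes.
    record_class = buf[pos:pos + class_len].decode("latin-1")
    pos += class_len
    tail = struct.unpack_from("<5i", buf, pos)  # offset, sizes, A, B
    return RecordEntry(name_index, record_class, *tail), pos + 20


def read_record_table(buf: bytes, rt_offset: int, record_count: int) -> list[RecordEntry]:
    """Read record_count entries starting at rt_offset (HDR_RDPOS / HDR_RDCNT)."""
    entries = []
    pos = rt_offset
    for _ in range(record_count):
        entry, pos = read_record_entry(buf, pos)
        entries.append(entry)
    return entries
```

Entries are variable length because of the inline CString, which is why the reader has to walk them one by one instead of jumping by a fixed stride.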

Once you have read all of these Record Table entries, you have the locations of the compressed data for each record. Using the LZ4 function LZ4_decompress_fast() you can unpack each block of compressed data that you read.

Records are basically just a list of fields or name->value pairs. While each fieldName is stored as STidx, the fieldValues can be of various data types and can be singular values or arrays of these data types.
The uncompressed record data contains a series of fields with the format:

  1. dataType - a 16-bit integer for an enum of Int,String,Bool,Float
  2. valueCount - 16-bit integer for the number of values
  3. nameIndex - 32-bit STidx for the fieldName

This triplet is then followed by valueCount 32-bit integers or single-precision floats. Int,String and Bool are all stored as 32-bit integers. If the dataType is String, then the integer is an STidx. If it is boolean, it is merely 0/1, and if it is integer, then it has the actual value. If the dataType is float, then valueCount floats are stored instead of integers.
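Here is roughly how an uncompressed record could be walked, as a Python sketch and not my real code. Big caveat: I did not list the numeric values of the dataType enum above, so the `TYPE_*` constants below are assumptions that you should check against a real file. `parse_fields` is a made-up name, and `strings` is assumed to be the String Table already loaded into a list.

```python
import struct

# ASSUMED enum values; the write-up above names the types but not their numbers.
TYPE_INT = 0
TYPE_FLOAT = 1
TYPE_STRING = 2
TYPE_BOOL = 3


def parse_fields(data: bytes, strings: list[str]) -> dict[str, list]:
    """Decode an uncompressed record into {fieldName: [values...]}."""
    fields = {}
    pos = 0
    while pos < len(data):
        # dataType (16-bit), valueCount (16-bit), nameIndex (32-bit STidx)
        data_type, value_count, name_index = struct.unpack_from("<HHi", data, pos)
        pos += 8
        if data_type == TYPE_FLOAT:
            values = list(struct.unpack_from(f"<{value_count}f", data, pos))
        else:
            raw = struct.unpack_from(f"<{value_count}i", data, pos)
            if data_type == TYPE_STRING:
                values = [strings[i] for i in raw]   # each int is an STidx
            elif data_type == TYPE_BOOL:
                values = [bool(v) for v in raw]      # stored as 0/1
            else:
                values = list(raw)                   # plain integers
        pos += 4 * value_count  # every value is 4 bytes, whatever its type
        fields[strings[name_index]] = values
    return fields
```

Since every value is 4 bytes wide, a reader that does not recognize a dataType can still skip the field correctly, which is handy when debugging.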

The String Table starts with a 32-bit int that is the number of strings in the table, followed by CString after CString of all the strings. When you read them all into an array, the index of each string becomes its STidx. This is the way most strings are stored: once in the ST, and everywhere else as the STidx of that ST entry.
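In Python terms, loading and writing the String Table is about this much work. It is a sketch for illustration only; `read_string_table` and `build_string_table` are invented names, and the latin-1 encoding is an assumption.

```python
import struct


def read_string_table(buf: bytes, st_offset: int) -> list[str]:
    """Read the ST at st_offset (HDR_STPOS); list index == STidx."""
    (count,) = struct.unpack_from("<i", buf, st_offset)
    pos = st_offset + 4
    strings = []
    for _ in range(count):
        (length,) = struct.unpack_from("<i", buf, pos)  # CString size prefix
        pos += 4
        # Encoding is an assumption; latin-1 accepts any byte value.
        strings.append(buf[pos:pos + length].decode("latin-1"))
        pos += length
    return strings


def build_string_table(strings: list[str]) -> bytes:
    """Write the count followed by each string as a CString."""
    out = [struct.pack("<i", len(strings))]
    for text in strings:
        raw = text.encode("latin-1")
        out.append(struct.pack("<i", len(raw)) + raw)
    return b"".join(out)
```

When packing, you would presumably keep a dict from string to index alongside the list so each distinct string is only added once.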

The Footer is four 32-bit integers from adler32() (zlib) calls on buffers containing various blocks of the data in binary format. Think of it as a bunch of checksums to ensure data integrity.

  1. adler32 for the HDR+RT+RD+ST
  2. adler32 for the ST
  3. adler32 for the RT
  4. adler32 for the RD
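The four checksums can be reproduced with Python's built-in zlib, which has the same adler32. This is only a sketch: `footer_checksums`, `build_footer` and `footer_matches` are hypothetical names, storing the values as unsigned ints is an assumption, and so is concatenating the first buffer in the literal HDR+RT+RD+ST order written above.

```python
import struct
import zlib


def footer_checksums(hdr: bytes, rt: bytes, rd: bytes, st: bytes) -> tuple[int, int, int, int]:
    """Return the four adler32 values in the order listed above."""
    return (
        zlib.adler32(hdr + rt + rd + st),  # everything (concatenation order assumed)
        zlib.adler32(st),
        zlib.adler32(rt),
        zlib.adler32(rd),
    )


def build_footer(hdr: bytes, rt: bytes, rd: bytes, st: bytes) -> bytes:
    """Pack the checksums as four little-endian 32-bit ints (16 bytes)."""
    return struct.pack("<4I", *footer_checksums(hdr, rt, rd, st))


def footer_matches(footer: bytes, hdr: bytes, rt: bytes, rd: bytes, st: bytes) -> bool:
    """True if a 16-byte footer agrees with the given blocks."""
    return struct.unpack("<4I", footer) == footer_checksums(hdr, rt, rd, st)
```

Running `footer_matches` against a file the game accepts is a cheap way to confirm you have the block boundaries right before trying to write your own.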

I determined all this from the source code of TQVault, a modding forum thread from when zlib was switched to LZ4 for RD compression, and Rhis. So do not get your hopes up and think I am some kind of master reverse engineer type. I post this all here in plain English so that, if I am wrong about something, maybe bugs in my code can be fixed without anyone needing to read the code.

Ignore versions 0.0 and 0.1, they are broken. v0.2 is the current testing version. I am awaiting more bug reports and will then move on to optimizing for speed.

v0.1 : Fixed major bug where parser was expecting only comma delimiter and ignored semi-colon. Reformatted Float output to match the 0.000000 format.
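To show what that fix is about, here is a tiny Python sketch (not the actual Qt parser). It assumes a dbr line looks like `fieldName,value1;value2,`, which is my shorthand here and not a full description of the dbr format; `split_dbr_line` and `format_float` are made-up names, and the field name in the usage note is just an example.

```python
import re


def split_dbr_line(line: str) -> tuple[str, list[str]]:
    """Split one dbr text line into (fieldName, [values...]).

    Assumes the first comma ends the field name and that the remaining
    values may be separated by either ',' or ';'.
    """
    name, _, rest = line.strip().partition(",")
    rest = rest.rstrip(",")  # drop the trailing delimiter, if any
    values = re.split(r"[,;]", rest) if rest else []
    return name, values


def format_float(value: float) -> str:
    """Write floats with six decimals, i.e. the 0.000000 style."""
    return f"{value:.6f}"
```

With this, `split_dbr_line("lootWeights,10;20;30,")` gives the name plus three separate values instead of one value containing semi-colons, and `format_float(1.5)` gives `1.500000`.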

So, I am ready to let this out in the wild and see if we cannot get it totally functional as a group.

I have attached the source code to this post. I assume that it will only be usable with QtCreator…as that is what I used to build it.

On Google Drive I put a zip containing the arzBaller.exe and the QtCore.dll, which I know needs to be in the same folder as the exe unless you have Qt installed. In addition there is a pack.bat and an unpack.bat that show the usage, plus the relevant parts of Grimmest used in testing. It is likely that, during my failed attempt to port to Visual Studio C++, I installed dlls that are not in the zip. Post here what is missing and I will re-upload with the dlls included.

This was all done in 2.5 days on a laptop. I had not used Qt since 4.7 and they are now on 5.7, and the move from 4 to 5 broke backwards compatibility, so I just rewrote the whole thing, getting rid of lots of cruft and stuff that was half done in the old version. But on Windows in the QtCreator IDE, I failed to get it uploaded to git. So for now it is just on my Google Drive.

I am a fan of Qt, but could not care less about most aspects of OO programming, so if someone is going to rage that I am not using getters and setters and private and protected and all that jazz, well, have fun with that.

The extraction process is fast; the compiling is very slow. I am sure I can speed this up, but I am on vacation and going to take a break for a day and see what kind of help I can find before continuing. I really never thought I would be getting involved in this. But here we go.

Links to Source v0.1

And test package with Binary v0.1:

Link to source [ OLD v0.0 ]

Link to test:

yeah I guess it stores some data to determine if it should replace that record or move on to the next. But I'd be happy if we have a CLI, even if it takes a min to compile every time rather than a sec because I only changed 1 file… who cares :smiley:

Besides Davood, are there any other people out there waiting on this?

well, when it's out I am sure going to use it :wink:

even if it's just Davood using it, thinking about the possibilities for DAIL makes it worth it. And really for any other mod too: with a CLI it is possible to let the user tweak some data and compile the files without having any modding experience.

You don’t need to care about new templates since they wouldn’t work as there is no code that would work with them.

Looking forward to the packer.

Will you release the source so that we can add stuff like replace / check for existing stuff etc?

I was always planning to write some type of mod installer that could be used globally and configured individually for every mod, but since there was no publicly available CLI I never got to do it…

Source will be available as soon as I rewrite it all and clean it up. (It is in Qt/C++/C)

I might even need help getting it finished, so definitely open sourcing this.

Yes.

/10chars

nn.n.n.n…ni.nig…nightly builds !!!1111111111111111

Thanks for this, parsing this template file is a pain, so I will create a binary object for the code with all the variable types from the arz.

yes!

also, will it be cross platform, or will you be using windows-only stuff?

Qt Source files are inherently cross-platform, if you compile it on your platform it should work. I am not using any Windows-specific trickery.

If you try it (once I post it that is), let me know if it works or not.

when will you post ? :stuck_out_tongue:

Soon…working or not. (purposefully vague, but you know how it is)

I need to know if anyone has seen any mod with records that are not in the subdirectory “records/”.

yep I do, and so does DAIL :slight_smile: I also believe everyone should do it that way, makes it easier to update a mod… .

DAIL/
mod_wanez/

can't you just start looking for all subdirs starting at database rather than records? :slight_smile:

I will now that I know this. Thanks.
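In other words, something along these lines: scan from the top directory and pick up every dbr wherever it lives, instead of only looking under "records/". A quick Python sketch, not the actual ArzBaller code; `find_dbr_files` is a made-up name, and reporting paths relative to the root with forward slashes is an assumption about how record names should look.

```python
import os


def find_dbr_files(root: str) -> list[str]:
    """Collect every .dbr under root, in any subdirectory.

    Paths are returned relative to root with '/' separators
    (an assumption about the record-name convention), sorted.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".dbr"):
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                found.append(rel.replace(os.sep, "/"))
    return sorted(found)
```

That way directories like DAIL/ or mod_wanez/ sitting next to records/ get picked up without any special casing.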

yeah, I ought to :stuck_out_tongue:

see you “soon”, then :smiley:

Soon should be tomorrow.

The code is all re-written, can extract and repack and then re-extract. BUT, the game will not load with my arz…so something is amiss. If I cannot fix it myself by end of tomorrow, I will just post it and see if anyone else can find my error.

can you extract .arz files other than the one you packed? if so, I'm just going to point out the most obvious thing before you go over the code for hours, since most of the time it's a missing wire or something…
is the name of the .arz the name of the mod? :slight_smile:

I have been testing with the game database.arz. And while I am not immune to making idiotic mistakes such as the one you propose, in this case, that is not the problem. I fear the problem is that AssetManager gets more information than just the text files (dbrs) when compiling…

this is great. I'll be back to modding on tuesday hopefully.

i think 0054 of DAIL will self-build and seriously save on the hard drive space.

very exciting :smiley:

thank you so much for doing this jiaco