User:Dr ishmael/Gw.dat

I may as well put all this in one place, since I've been asked about it twice in the past month now.

Gw.dat info
For detailed info on the compression and internal file structure of Gw.dat, the XeNTaX wiki has an article about it, as does Guild Wars Wiki.

Some files have a header that identifies the file type: texture files begin with either ATEX or ATTX, while FFNA and AMAT are probably other game "material" files, i.e. meshes or shaders or something. Sound files are compressed in the mp3 format, and can be identified by an mp3 header of 0xFAFF or 0xFBFF.
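As a rough illustration of the header check described above, here is a minimal Python sketch. The function name `identify_file_type` and the category labels are made up for this example, and it assumes that 0xFAFF and 0xFBFF are little-endian readings of the on-disk bytes FF FA and FF FB.

```python
def identify_file_type(data: bytes) -> str:
    """Guess a file's type from its leading bytes (hypothetical helper)."""
    magic = data[:4]
    if magic in (b"ATEX", b"ATTX"):
        return "texture"
    if magic in (b"FFNA", b"AMAT"):
        # The exact purpose of these files is a guess, so the label is a placeholder.
        return "material"
    # Assumption: 0xFAFF / 0xFBFF are 16-bit little-endian reads of FF FA / FF FB.
    if data[:2] in (b"\xff\xfa", b"\xff\xfb"):
        return "sound"
    return "unknown"


if __name__ == "__main__":
    print(identify_file_type(b"ATEX" + bytes(12)))
    print(identify_file_type(b"\xff\xfb\x90\x00"))
```

Anything that falls through to "unknown" is a candidate text file and needs the further checks described under Text files.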

All files in Gw.dat are little-endian, meaning multi-byte values are stored with the least significant byte first (e.g. 300, or 0x012C, stored as a 16-bit integer would be 0x2C 0x01). This also applies to Unicode text encodings, as described below.
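To make the byte order concrete, this short Python sketch packs 300 as a little-endian integer with the standard `struct` module. The 16-bit and 32-bit widths are chosen only for illustration, since the integer width in question is not stated above.

```python
import struct

value = 300  # 0x012C

# "<" selects little-endian; "H" is an unsigned 16-bit and "I" an unsigned 32-bit integer.
as_u16 = struct.pack("<H", value)
as_u32 = struct.pack("<I", value)

print(as_u16.hex(" "))  # 2c 01
print(as_u32.hex(" "))  # 2c 01 00 00

# The same ordering applies to UTF-16 code units: "A" (U+0041) becomes 41 00.
print("A".encode("utf-16-le").hex(" "))  # 41 00
```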

There is no folder structure within Gw.dat, just a master file table that lists all the files.

Text files
Some of the files are text files, containing all of the in-game text: skill names and descriptions, NPC and quest dialogues (assumed), and text for the in-game store. Text files do not have a defined header to identify them, so they must be identified through their other characteristics.


 * The last 2 bytes of the file designate the language code and file code. Currently there are 12 languages and 88 files.  The languages are:
{| class="stdt"
! Code (hex) !! Language
|-
| 00 || English
|-
| 01 || Korean
|-
| 02 || French
|-
| 03 || German
|-
| 04 || Italian
|-
| 05 || Spanish
|-
| 06 || Traditional Chinese
|-
| 07 || Simplified Chinese
|-
| 08 || Japanese
|-
| 09 || Polish
|-
| 0A || Russian
|-
| 11 || Bork!
|}
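The language table can be turned into a simple lookup, sketched below in Python. The helper name `read_trailer` is hypothetical, and the byte order within the trailer (language code first, file code last) is an assumption, since only "the last 2 bytes" is known for certain.

```python
LANGUAGES = {
    0x00: "English",
    0x01: "Korean",
    0x02: "French",
    0x03: "German",
    0x04: "Italian",
    0x05: "Spanish",
    0x06: "Traditional Chinese",
    0x07: "Simplified Chinese",
    0x08: "Japanese",
    0x09: "Polish",
    0x0A: "Russian",
    0x11: "Bork!",
}


def read_trailer(data: bytes) -> tuple[str, int]:
    """Return (language name, file code) taken from the last two bytes."""
    if len(data) < 2:
        raise ValueError("file is too short to hold a 2-byte trailer")
    # Assumption: the language code comes first and the file code second.
    lang_code, file_code = data[-2], data[-1]
    name = LANGUAGES.get(lang_code, f"unknown (0x{lang_code:02X})")
    return name, file_code
```

If the two bytes turn out to be in the other order, swapping the two indices is the only change needed.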


 * A text file contains exactly 1024 strings, each identified by a 6-byte header.
 * Bytes 0 and 1 encode the length of the string, in bytes, including the header. Thus the smallest length would be 0x06 0x00, for the mandatory 6-byte header.  The longest string in the English files is 2986 bytes, encoded as 0xAA 0x0B.
 * Byte 2 is a mystery. For all readable UTF-16 strings, it is 0x00, but it takes on a wide range of values for the unreadable strings.
 * Byte 3 is 0x00 for all UTF-16 strings and for most of the unreadable ones, but can also be 0xFF.
 * Byte 4 seems to be an indicator for the encoding of the string. For all readable UTF-16 strings, it is 0x10 (16).  Most of the other strings have it set to 0x07, but if that means the string is encoded as UTF-7, I haven't been able to figure out how to read it.
 * Byte 5 is always 0x00.
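The 6-byte header layout above maps directly onto a `struct` format string. The following Python sketch is one way to unpack it. The field names `byte2`, `byte3` and `byte5` are placeholders, since their meaning is unknown, and `parse_header` is a hypothetical helper name.

```python
import struct
from typing import NamedTuple


class StringHeader(NamedTuple):
    length: int    # bytes 0-1: total record length in bytes, header included
    byte2: int     # byte 2: unknown; 0x00 for readable UTF-16 strings
    byte3: int     # byte 3: 0x00 or 0xFF
    encoding: int  # byte 4: 0x10 for UTF-16, often 0x07 otherwise
    byte5: int     # byte 5: always 0x00 so far


def parse_header(raw: bytes) -> StringHeader:
    """Unpack the first 6 bytes of a string record."""
    if len(raw) < 6:
        raise ValueError("a string header needs 6 bytes")
    # "<" = little-endian, "H" = 16-bit length, then four single bytes.
    return StringHeader(*struct.unpack("<HBBBB", raw[:6]))


if __name__ == "__main__":
    # The longest English string: 0xAA 0x0B is 2986 in little-endian.
    print(parse_header(b"\xaa\x0b\x00\x00\x10\x00"))
```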


 * For strings that have header byte 4 = 0x10, the string itself is encoded in implicit (no byte-order marker) little-endian UTF-16.
 * For the English files, this means that every other byte is 0x00, since all standard English letters and punctuation are covered in the first 256 code points of Unicode.
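Putting the header and encoding rules together, a text file can be walked record by record. This Python sketch assumes the records are stored back to back from the start of the file and are followed only by the 2-byte trailer. That layout is a guess based on the description above, and `iter_strings` is a hypothetical name.

```python
import struct


def iter_strings(data: bytes):
    """Yield (index, text) per record; text is None for unreadable strings."""
    end = len(data) - 2  # assumption: the last 2 bytes are the trailer
    offset = 0
    index = 0
    while offset + 6 <= end:
        length, _b2, _b3, encoding, _b5 = struct.unpack_from("<HBBBB", data, offset)
        if length < 6 or offset + length > end:
            break  # malformed record, stop rather than guess
        payload = data[offset + 6 : offset + length]
        if encoding == 0x10:
            # Implicit (no byte-order mark) little-endian UTF-16.
            text = payload.decode("utf-16-le", errors="replace")
        else:
            text = None  # e.g. header byte 4 = 0x07, format unknown
        yield index, text
        offset += length
        index += 1
```

For a well-formed file this should yield 1024 entries, and a shorter count hints that the layout assumption is wrong for that file.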

Extracting from Gw.dat
There are a couple of programs available that can perform the decompression and extraction of files from Gw.dat.


 * GW-Unpacker is a command-line program that decompresses Gw.dat and extracts every single file, sorting them into folders by file type (texture vs sound vs text etc.; textures are further sorted by compression type and level). You can tell it what file index to start from and stop at, but that's not usually very useful, since there's no way of knowing where in Gw.dat the file you're looking for is located.
 * The linked archive also contains a program called ATEXReader that can convert the raw ATEX and ATTX textures into png files.
 * I use this mostly for extracting the text files - I've actually compiled a custom version that ignores any file that can be identified (see above) and only writes out the "Unknown" files. Then, I run a Perl script that sorts out the files, looking at the first 6 and last 2 bytes for the characteristics of a text file, putting all the English-language files in their own folder, which I archive after every game update.  Finally, I use WinMerge to compare the current English folder to the last archived folder to quickly find everything that was changed or added.
 * On my system (Athlon64 3000+ 1.81 GHz), it takes about 1 hour to completely unpack my Gw.dat (3.7 GB), and about 15 minutes for the text-only unpacking.


 * GWDatBrowser is a program that presents a listing of all the files in Gw.dat for you to browse through, and it includes a rudimentary texture renderer and sound player. This can be fun if you just want to kill time, but otherwise it isn't very useful.  "New" files that get downloaded to Gw.dat will usually appear at the end.
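The sorting step described for the GW-Unpacker workflow (checking the first 6 and last 2 bytes of each "Unknown" file) could look roughly like the Python sketch below. It is not the Perl script itself, only an approximation of the same idea. The function names, the folder handling, and the assumption that the language code is the second-to-last byte are all illustrative.

```python
import shutil
import struct
from pathlib import Path

# Language codes from the table under Text files.
KNOWN_LANGUAGE_CODES = {0x00, 0x01, 0x02, 0x03, 0x04, 0x05,
                        0x06, 0x07, 0x08, 0x09, 0x0A, 0x11}
ENGLISH = 0x00


def looks_like_text_file(data: bytes) -> bool:
    """Heuristic check of the first 6 and last 2 bytes."""
    if len(data) < 8:  # at least one 6-byte header plus the 2-byte trailer
        return False
    length, _b2, _b3, _enc, b5 = struct.unpack_from("<HBBBB", data, 0)
    if length < 6 or length > len(data) - 2 or b5 != 0x00:
        return False
    # Assumption: the language code is the second-to-last byte.
    return data[-2] in KNOWN_LANGUAGE_CODES


def copy_english_files(src: Path, dst: Path) -> list[Path]:
    """Copy every English-language text file from src into dst."""
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        data = path.read_bytes()
        if looks_like_text_file(data) and data[-2] == ENGLISH:
            target = dst / path.name
            shutil.copy2(path, target)
            copied.append(target)
    return copied
```

The resulting folder can then be archived and compared against the previous archive with a diff tool such as WinMerge.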

Other methods of obtaining texture files

 * 3D Ripper DX is a wrapper program that hooks into Gw.exe and extracts fully rendered textures from your video card's memory while the game is running. Textures are exported as .dds files, which both Photoshop and GIMP need plugins to read.  This is the only way to get usable skill icons - the raw textures in Gw.dat apparently have shaders or some other processing applied to them before being displayed in-game.  This is also the most convenient way to get inventory icons.  Both types of icons are 5 kB .dds files, and very few other textures are that size, making it easy to find them in the Texture output folder.
 * In order to get textures and numeric data for skills not normally available to characters, I have to use a memory editor to trick the game into displaying them. This only works in the Skills and Attributes window - there's no way to modify your own or your heroes' actual skill bars, so unfortunately, you can't run around using something like Touch of Dhuum everywhere.  Skills and effects are all treated the same by the game engine and are identified internally by integer ids, which are documented at GWW:Guild Wars Wiki:Game integration/Skills.