User:Dr ishmael/Gw.dat

I may as well put all this in one place, since I've been asked about it twice in the past month now.

Gw.dat info
For detailed info on the compression and internal file structure of Gw.dat, the XeNTaX wiki has an article about it, as does Guild Wars Wiki.

Some files have a header that identifies the file type: texture files begin with either ATEX or ATTX, while FFNA and AMAT are probably other game "material" files, i.e. meshes or shaders or something. Sound files are compressed in the mp3 format, and can be identified by an mp3 header of 0xFAFF or 0xFBFF.
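The header checks above can be sketched in a few lines of Python. This is only an illustration: the function name `sniff_file_type` and the category labels are made up here, and it assumes that 0xFAFF and 0xFBFF are 16-bit little-endian values, i.e. the byte pairs FF FA and FF FB at the start of the file.

```python
def sniff_file_type(data: bytes) -> str:
    """Guess the type of an extracted Gw.dat file from its leading bytes."""
    magic = data[:4]
    if magic in (b"ATEX", b"ATTX"):
        return "texture"
    if magic in (b"FFNA", b"AMAT"):
        # The notes above only guess at what these are ("material" files).
        return "material"
    # ASSUMPTION: 0xFAFF / 0xFBFF are read as little-endian 16-bit values,
    # so the bytes on disk are FF FA or FF FB.
    if data[:2] in (b"\xff\xfa", b"\xff\xfb"):
        return "mp3"
    # Text files have no identifying header, so they end up here.
    return "unknown"
```

Anything that comes back as "unknown" still needs the text-file checks described further down.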

All files in Gw.dat are little-endian, meaning the bytes are ordered with the least significant byte first (e.g. 300, or 0x012C, stored as a 16-bit integer would be the bytes 0x2C 0x01). This also applies to Unicode text encodings, as described below.
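To make the byte order concrete, here is a small Python illustration using the standard `struct` module. The integer widths (16 and 32 bits) are picked only for the example; the text above does not say which widths Gw.dat uses where.

```python
import struct

value = 300  # 0x012C

# "<" selects little-endian; "H" is an unsigned 16-bit and "I" an unsigned 32-bit integer.
as_u16 = struct.pack("<H", value)
as_u32 = struct.pack("<I", value)
print(as_u16.hex(" "))  # 2c 01
print(as_u32.hex(" "))  # 2c 01 00 00

# Reading goes the other way: the least significant byte comes first.
(decoded,) = struct.unpack("<H", b"\x2c\x01")

# The same ordering applies to UTF-16 text: each 16-bit code unit is stored low byte first.
utf16_bytes = "Gw".encode("utf-16-le")
print(utf16_bytes.hex(" "))  # 47 00 77 00
```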

There is no folder structure within Gw.dat, just a master file table that lists all the files.

Text files
Some of the files are text files, containing all of the in-game text: skill names and descriptions, NPC and quest dialogues (assumed), and text for the in-game store. Text files do not have a defined header to identify them, so they must be identified through their other characteristics.


 * The last 2 bytes of the file designate the language code and file code. Currently there are 12 languages and 91 (0x00 – 0x60) files.  The languages are:
{| class="stdt"
! Code (hex) !! Language
|-
| 00 || English
|-
| 01 || Korean
|-
| 02 || French
|-
| 03 || German
|-
| 04 || Italian
|-
| 05 || Spanish
|-
| 06 || Traditional Chinese
|-
| 07 || Simplified Chinese
|-
| 08 || Japanese
|-
| 09 || Polish
|-
| 0A || Russian
|-
| 11 || Bork!
|}


 * A text file contains exactly 1024 strings, each identified by a 6-byte header.
 * Bytes 0 and 1 encode the length of the string, in bytes, including the header. Thus the smallest length would be 0x06 0x00, for the mandatory 6-byte header.  The longest string in the English files is 2986 bytes, encoded as 0xAA 0x0B.
 * Byte 2 is a mystery. For all readable UTF-16 strings, it is 0x00, but it takes on a wide range of values for the unreadable strings.
 * Byte 3 is 0x00 for all UTF-16 strings and for most of the unreadable ones, but can also be 0xFF.
 * Byte 4 seems to be an indicator for the encoding of the string. For all readable UTF-16 strings, it is 0x10 (16).  Most of the other strings have it set to 0x07, but if that means the string is encoded as UTF-7, I haven't been able to figure out how to read it.
 * Byte 5 is always 0x00.


 * For strings that have header byte 4 = 0x10, the string itself is encoded in implicit (no byte-order marker) little-endian UTF-16.
 * For the English files, this means that every other byte is 0x00, since all standard English letters and punctuation are covered in the first 256 code points of Unicode (U+0000 to U+00FF).
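The string layout described above can be walked with a short Python sketch. It rests on two assumptions that the notes do not state outright: that the strings are packed back to back starting at offset 0, and that the only other content is the 2-byte trailer at the end of the file. The name `iter_strings` is made up for this example.

```python
import struct
from typing import Iterator, Optional, Tuple

HEADER_SIZE = 6
UTF16_MARKER = 0x10  # header byte 4 for readable UTF-16 strings


def iter_strings(data: bytes) -> Iterator[Tuple[bytes, Optional[str]]]:
    """Yield (header, text) for each string; text is None if byte 4 is not 0x10."""
    # ASSUMPTION: strings start at offset 0 and run up to the 2-byte trailer.
    end = len(data) - 2
    pos = 0
    while pos + HEADER_SIZE <= end:
        header = data[pos:pos + HEADER_SIZE]
        # Bytes 0-1: little-endian total length, including the 6-byte header.
        (length,) = struct.unpack_from("<H", header, 0)
        if length < HEADER_SIZE or pos + length > end:
            raise ValueError(f"bad string length {length} at offset {pos}")
        body = data[pos + HEADER_SIZE:pos + length]
        if header[4] == UTF16_MARKER:
            # Implicit (no BOM) little-endian UTF-16.
            text: Optional[str] = body.decode("utf-16-le", errors="replace")
        else:
            text = None  # unreadable string; encoding unknown
        yield header, text
        pos += length
```

If the layout assumptions hold, `len(list(iter_strings(data)))` should come out as 1024 for a real text file, which makes a handy sanity check.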

Extracting from Gw.dat
Mostly I use GW-Unpacker to decompress and extract the individual files out of Gw.dat. It sorts them by file type (texture vs sound vs text etc.; textures are further sorted by compression method). It comes with a program called ATEXReader that can convert the raw ATEX and ATTX textures into png files.


 * Text files: I've compiled a custom version that ignores any file that can be identified (see above) and only writes out the "Unknown" files. Then, I run a Perl script that sorts out the files, looking at the first 6 and last 2 bytes for the characteristics of a text file, putting all the English-language files in their own folder, which I archive after every game update.  Finally, I use WinMerge to compare the current English folder to the last archived folder to quickly find everything that was changed or added.
 * Inventory icons: All inventory icons can be found tacked on to the end of their associated FFNA file - to get the icon, search for ATEXDXT3, delete everything before that, and you've got a standard ATEX/DXT3 file. After you've seen an item in-game, the icon will also be stored as a separate file.
 * On my system (a 5-year-old Athlon at 1.81 GHz), it takes about 1 hour to completely unpack my Gw.dat (3.7 GB), and about 15 minutes for the text-only unpacking.
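The sorting and icon steps above could look roughly like this in Python. This is not the Perl script mentioned above, just a sketch of the same idea. The function names are made up, the exact header checks are a guess at which "characteristics" are useful, and the order of the two trailer bytes (language first, then file code) is an assumption, since the notes do not say which byte is which.

```python
from typing import Optional

KNOWN_LANGUAGE_CODES = set(range(0x00, 0x0B)) | {0x11}  # from the language table
MAX_FILE_CODE = 0x60  # upper bound quoted for file codes
ICON_MAGIC = b"ATEXDXT3"


def looks_like_text_file(data: bytes) -> bool:
    """Heuristic check on the first 6 and last 2 bytes of an 'Unknown' file."""
    if len(data) < 8:
        return False
    # First string header: plausible length, byte 3 is 0x00 or 0xFF, byte 5 is 0x00.
    length = int.from_bytes(data[0:2], "little")
    if length < 6 or length > len(data) - 2:
        return False
    if data[3] not in (0x00, 0xFF) or data[5] != 0x00:
        return False
    # ASSUMPTION: second-to-last byte is the language code and the last byte
    # is the file code; swap the indices if real files show the opposite.
    language, file_code = data[-2], data[-1]
    return language in KNOWN_LANGUAGE_CODES and file_code <= MAX_FILE_CODE


def extract_icon(ffna: bytes) -> Optional[bytes]:
    """Return everything from the first ATEXDXT3 marker onward, or None if absent."""
    start = ffna.find(ICON_MAGIC)
    return ffna[start:] if start != -1 else None
```

With the assumed byte order, English files are the ones where `data[-2] == 0x00`, so they can be copied into their own folder for diffing.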

Other methods of obtaining texture files

 * 3D Ripper DX is a wrapper program that hooks into Gw.exe and extracts fully rendered textures from your video card's memory while the game is running. Textures are exported as .dds files, which both Photoshop and the GIMP need plugins to read.  This is the only way to get usable skill icons, since the raw textures in Gw.dat apparently have shaders or some other processing applied to them before being displayed in-game.  This is also a fairly direct way to get inventory icons, since 3D Ripper will only rip what's actually displayed on-screen.  Both types of icons are 5 kB .dds files, and very few other textures are that size, making it easy to find them among the 3D Ripper output.
 * In order to get textures and numeric data for skills not normally available to characters, I have to use a memory editor to trick the game into displaying them. This only works in the Skills and Attributes window - there's no way to modify your or your heroes' actual skill bars, so unfortunately, you can't run around using something like Touch of Dhuum everywhere.  Skills and effects are all treated the same by the game engine and are identified internally by integer ids, which are documented at GWW:Guild Wars Wiki:Game integration/Skills.