User:Dr ishmael/Gw.dat

I may as well put all this in one place, since I've been asked about it twice in the past month now.

Gw.dat info
For detailed info on the compression and internal file structure of Gw.dat, the XeNTaX wiki has an article about it, as does Guild Wars Wiki.

Some files begin with a header that identifies the file type: texture files begin with either ATEX or ATTX, while FFNA and AMAT are probably other game "material" files, such as meshes or shaders. Sound files are stored in the MP3 format and can be identified by an MP3 frame header of 0xFAFF or 0xFBFF (read as a little-endian 16-bit value, i.e. the bytes FF FA or FF FB in file order).
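A minimal Python sketch of this kind of header sniffing. The function name `sniff_file_type` and the category labels are made up for illustration, and it assumes the 0xFAFF / 0xFBFF values are meant as little-endian 16-bit reads of the first two bytes.

```python
def sniff_file_type(data: bytes) -> str:
    """Guess the type of a decompressed Gw.dat entry from its first bytes.

    The labels returned here are illustrative, not names used by any tool.
    """
    magic = data[:4]
    if magic in (b"ATEX", b"ATTX"):
        return "texture"
    if magic in (b"FFNA", b"AMAT"):
        return "material"  # "probably" meshes/shaders, per the notes above
    # ASSUMPTION: 0xFAFF / 0xFBFF are the first two bytes read as a
    # little-endian 16-bit integer (bytes FF FA or FF FB on disk).
    if len(data) >= 2 and int.from_bytes(data[:2], "little") in (0xFAFF, 0xFBFF):
        return "sound"
    return "unknown"  # text files land here, since they have no magic header
```

Text files fall through to "unknown" on purpose, since they have to be recognised by other characteristics.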

All files in Gw.dat are little-endian, meaning the bytes are ordered with the least significant byte first (e.g. 300, or 0x012C, stored as a 16-bit integer would be 0x2C 0x01). This also applies to Unicode text encodings, as described below.
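For a quick check of the byte order, Python's standard `struct` module can pack the value 300 as a little-endian integer. The variable names are only for illustration.

```python
import struct

value = 300  # 0x012C

as_u16 = struct.pack("<H", value)  # "<" = little-endian, "H" = unsigned 16-bit
as_u32 = struct.pack("<I", value)  # "I" = unsigned 32-bit

print(as_u16.hex(" "))  # 2c 01
print(as_u32.hex(" "))  # 2c 01 00 00

# The same ordering applies to UTF-16 text: 'A' (U+0041) becomes 41 00.
utf16_a = "A".encode("utf-16-le")
print(utf16_a.hex(" "))  # 41 00
```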

There is no folder structure within Gw.dat, just a master file table that lists all the files.

Text files
Some of the files are text files, containing all of the in-game text: skill names and descriptions, NPC and quest dialogues (assumed), and text for the in-game store. Text files do not have a defined header to identify them, so they must be identified through their other characteristics.


 * The last 2 bytes of the file designate the language code and file code. Currently there are 12 languages and 96 (0x00 – 0x60) files.  The languages are:
{| class="stdt"
! Code (hex) !! Language
|-
| 00 || English
|-
| 01 || Korean
|-
| 02 || French
|-
| 03 || German
|-
| 04 || Italian
|-
| 05 || Spanish
|-
| 06 || Traditional Chinese
|-
| 07 || Simplified Chinese
|-
| 08 || Japanese
|-
| 09 || Polish
|-
| 0A || Russian
|-
| 11 || Bork!
|}


 * A text file contains exactly 1024 strings. Some are plaintext UTF-16, while others are encrypted.
 * Each string is identified by a 6-byte header:
 * Bytes 0 and 1 encode the length of the string, in bytes, including the header. Thus the smallest length would be 0x06 0x00, for the mandatory 6-byte header.  (This is commonly found at the end of the last text file, where those string IDs are not yet in use.)  The longest string in the English files is 4020 bytes, encoded as 0xB4 0x0F.
 * Byte 2 is a bit odd. For all plaintext UTF-16 strings (in all languages), it is 0x00, but it takes on a wide range of values for the encrypted strings (49 known values).  A working hypothesis is that this reflects the lowest byte value of all characters in the (decrypted) string.  (e.g., if the string is "Backpack" then Byte 2 is 0x42, the ASCII code for 'B')  It is unknown what purpose this serves.
 * Byte 3 is almost always 0x00. For about 2.5% of strings, both Byte 2 and Byte 3 are 0xFF, and this is the only time either byte takes this value (i.e., you'll never find 0xFF in one paired with 0x00 in the other).
 * Byte 4 is similar to Byte 2 in that all UTF-16 strings have the same value (in this case, 0x10 or 16), while it takes various other values for the encrypted strings. Unlike Byte 2, the range of values is much smaller (only 6 known values).
 * Byte 5 is always 0x00.
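The header layout above can be sketched in Python as follows. All names here (`StringHeader`, `parse_header`, `iter_strings`, `is_plaintext`) are hypothetical, and the walk assumes the string entries sit back to back starting at offset 0, which the notes above do not state outright. The plaintext test simply combines the Byte 2 and Byte 4 observations and should be read as a heuristic.

```python
import struct
from typing import Iterator, NamedTuple, Tuple

HEADER_SIZE = 6          # the mandatory per-string header
STRINGS_PER_FILE = 1024  # per the description above


class StringHeader(NamedTuple):
    length: int  # bytes 0-1: total length in bytes, header included
    byte2: int   # 0x00 for plaintext; varies for encrypted strings
    byte3: int   # almost always 0x00
    byte4: int   # 0x10 for plaintext; a few other values otherwise
    byte5: int   # always 0x00


def parse_header(buf: bytes, offset: int = 0) -> StringHeader:
    # "<H4B" = one little-endian uint16 followed by four single bytes.
    return StringHeader(*struct.unpack_from("<H4B", buf, offset))


def is_plaintext(hdr: StringHeader) -> bool:
    # Heuristic built from the Byte 2 / Byte 4 observations.
    return hdr.byte2 == 0x00 and hdr.byte4 == 0x10


def iter_strings(buf: bytes) -> Iterator[Tuple[StringHeader, bytes]]:
    # ASSUMPTION: entries are stored back to back starting at offset 0.
    offset = 0
    for _ in range(STRINGS_PER_FILE):
        if offset + HEADER_SIZE > len(buf):
            break
        hdr = parse_header(buf, offset)
        if hdr.length < HEADER_SIZE or offset + hdr.length > len(buf):
            break  # malformed entry; stop rather than guess
        yield hdr, buf[offset + HEADER_SIZE:offset + hdr.length]
        offset += hdr.length
```

As a sanity check on the length field, `parse_header(b"\xb4\x0f\x00\x00\x10\x00").length` gives 4020, matching the longest English string mentioned above.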


 * The readable strings are encoded in implicit (no byte-order marker) little-endian UTF-16.
 * For the English files, this means that every other byte is 0x00, since all standard English letters and punctuation are covered in the first 256 code points of Unicode. This makes it simple to convert the strings to ASCII when extracting them.
 * These strings cover all skill names and descriptions, zone names, most item names, some NPC names, some NPC dialogue, and various other interface text.
 * The encrypted strings appear to contain the remainder of in-game text, including NPC/quest dialogue.
 * It is possible that Byte 2 or Byte 4 identifies the key that the string is encrypted with.
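Decoding one of the readable strings takes very little code. This is a rough Python sketch with made-up function names, and it assumes `body` is the string's bytes with the 6-byte header already removed.

```python
def decode_plaintext(body: bytes) -> str:
    # "utf-16-le" decodes little-endian UTF-16 without expecting a BOM.
    return body.decode("utf-16-le")


def high_bytes_all_zero(body: bytes) -> bool:
    # For English text every second byte (the high byte of each
    # 16-bit code unit) should be 0x00.
    return all(b == 0 for b in body[1::2])


def to_ascii(body: bytes) -> str:
    # Characters outside ASCII are replaced with '?' instead of failing.
    return decode_plaintext(body).encode("ascii", errors="replace").decode("ascii")
```

For example, `decode_plaintext(b"B\x00a\x00g\x00")` returns `"Bag"`. This only applies to the plaintext strings; nothing here attempts to handle the encrypted ones.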

Extracting from Gw.dat
Mostly I use GW-Unpacker to decompress and extract the individual files out of Gw.dat. It sorts them by file type (texture vs sound vs text etc.; textures are further sorted by compression method). It comes with a program called ATEXReader that can convert the raw ATEX and ATTX textures into png files.


 * Text files: I've compiled a custom version of GW-Unpacker that ignores any file that can be identified (see above) and only writes out the "Unknown" files. Then I run a Perl script that sorts those files by checking the first 6 and last 2 bytes for the characteristics of a text file, and puts all the English-language files in their own folder, which I archive after every game update.  Finally, I use WinMerge to compare the current English folder to the last archived folder to quickly find everything that was changed or added.
 * Inventory icons: All inventory icons can be found tacked on to the end of their associated FFNA file - to get the icon, search for ATEXDXT3, delete everything before that, and you've got a standard ATEX/DXT3 file. After you've seen an item in-game, the icon will also be stored as a separate file.
 * On my system (a 5-year-old Athlon at 1.81 GHz), it takes about 1 hour to completely unpack my Gw.dat (3.7 GB), and about 15 minutes for the text-only unpacking.
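The inventory icon step (search for ATEXDXT3 and drop everything before it) can be sketched in Python like this. `extract_icon` is a hypothetical helper name, and it takes the first occurrence of the marker, as the description suggests.

```python
from typing import Optional

ICON_MARKER = b"ATEXDXT3"


def extract_icon(ffna_data: bytes) -> Optional[bytes]:
    """Return the ATEX/DXT3 texture appended to an FFNA file, if any."""
    start = ffna_data.find(ICON_MARKER)  # first occurrence, -1 if absent
    if start == -1:
        return None
    # Everything from the marker to the end of the file is the icon.
    return ffna_data[start:]
```

The returned bytes could then be written out as their own file and handed to a converter such as ATEXReader.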

Other methods of obtaining texture files

 * 3D Ripper DX is a wrapper program that hooks into Gw.exe and extracts fully rendered textures from your video card's memory while the game is running. Textures are exported as .dds files, which both Photoshop and the GIMP need plugins to read.  This is the only way to get usable skill icons, since the raw textures in Gw.dat apparently have shaders or some other processing applied to them before being displayed in-game.  This is also a fairly direct way to get inventory icons, since 3D Ripper will only rip what's actually displayed on-screen.  Both types of icons are 5 kB .dds files, and very few other textures are that size, making it easy to find them among the 3D Ripper output.
 * In order to get textures and numeric data for skills not normally available to characters, I have to use a memory editor to trick the game into displaying them. This only works in the Skills and Attributes window. There's no way to modify your own or your heroes' actual skill bars, so unfortunately, you can't run around using something like Touch of Dhuum everywhere.  Skills and effects are all treated the same by the game engine and are identified internally by integer ids, which are documented at GWW:Guild Wars Wiki:Game integration/Skills.