[Euphoria Dev Team Release] Shaiya Data File Tool v2

castor4878 · 08/08/2014, 00:57

Since update 188 (the current one is 214), the European clients use some tailored sdata files.
Their goal is to provide strings (name, description, and so on) in French, German, Italian, Polish and Turkish.
To do so, a folder "BinarySData" contains several
* xxxData.sdata
files, each of which comes with
xxxText_FRC.sdata, xxxText_GER.sdata, xxxText_ITA.sdata, xxxText_POL.sdata & xxxText_TUR.sdata

The former are encrypted with the SEED cipher; the Text files are in plain text.

Quote:

The encryption is the same, the file format isn't.

actually, both are different.

the SEED key is the same, but the header of the ciphered file is (a bit) different:

for classic files, it is:

Code:

#include <stdint.h>

#pragma pack(push, 1)
struct sdata_header {
	char		signature[40];
	uint32_t	checksum;
	uint32_t	dataSize;	// length of the plain data
	char		padding[16];	// 16 x 00
};
#pragma pack(pop)

where the "signature" is "0001CBCEBC5B2784D3FC9A2A9DB84D1C3FEB6E99"

the 'dataSize' field contains the length of the plain data (because the SEED cipher works with 128-bit blocks, up to 15 padding bytes can be present at the end of the data so that its length is a multiple of 16 before it is encrypted).
the 'padding' bytes contain 16 x 00
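
To make the layout concrete, here is a minimal sketch of reading the classic header and computing the size of the padded (encrypted) payload. The helper names are mine, not from any existing tool, and the fields are read as little endian on the assumption that this header follows the rest of the client data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SDATA_SIGNATURE   "0001CBCEBC5B2784D3FC9A2A9DB84D1C3FEB6E99"
#define SDATA_HEADER_SIZE 64u   /* 40 + 4 + 4 + 16 */
#define SEED_BLOCK_SIZE   16u   /* SEED works on 128-bit blocks */

/* Read a 32-bit little-endian value (assumed byte order). */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Round the plain length up to a whole number of 16-byte blocks. */
static uint32_t sdata_padded_size(uint32_t data_size)
{
    return (data_size + (SEED_BLOCK_SIZE - 1u)) & ~(SEED_BLOCK_SIZE - 1u);
}

/* Returns 1 and fills the outputs if buf starts with a classic header. */
static int sdata_parse_classic(const unsigned char *buf, size_t len,
                               uint32_t *checksum, uint32_t *data_size)
{
    assert(buf != NULL && checksum != NULL && data_size != NULL);
    if (len < SDATA_HEADER_SIZE)
        return 0;
    if (memcmp(buf, SDATA_SIGNATURE, 40) != 0)
        return 0;
    *checksum  = read_le32(buf + 40);
    *data_size = read_le32(buf + 44);
    return 1;
}
```

After decryption, only the first 'dataSize' bytes of the payload are meaningful; the remaining bytes up to sdata_padded_size(dataSize) are padding.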

with these new DBxxxData.sdata files, the header seems to be:

Code:

#include <stdint.h>

#pragma pack(push, 1)
struct sdata_header {
	char		signature[40];
	uint32_t	nullInt32;	// set to 00
	uint32_t	checksum;
	uint32_t	dataSize;	// now 4 bytes further than in classic files
	char		padding[12];
};
#pragma pack(pop)

since the size of the plain data is not at the same location, some (all?) of the tools used to decrypt the sdata files will fail to process them.
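
A tool could cope with both layouts by trying each candidate position of 'dataSize' and keeping the one that agrees with the file size. The sketch below is only a heuristic: it assumes that a file holds nothing but the 64-byte header followed by the payload padded to a multiple of 16, and that the fields are little endian. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sdata_layout { SDATA_UNKNOWN = 0, SDATA_CLASSIC, SDATA_DB };

static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Assumption: file length == 64-byte header + payload padded to 16 bytes. */
static int size_matches(uint32_t data_size, size_t file_len)
{
    uint64_t padded = ((uint64_t)data_size + 15u) & ~(uint64_t)15u;
    return padded + 64u == (uint64_t)file_len;
}

/* hdr points to the first 64 bytes of the file, file_len is the whole file size. */
static enum sdata_layout sdata_guess_layout(const unsigned char *hdr,
                                            size_t file_len,
                                            uint32_t *data_size)
{
    assert(hdr != NULL && data_size != NULL);
    if (file_len < 64)
        return SDATA_UNKNOWN;

    /* DBxxxData layout: null int32 at 40, checksum at 44, dataSize at 48. */
    if (le32(hdr + 40) == 0 && size_matches(le32(hdr + 48), file_len)) {
        *data_size = le32(hdr + 48);
        return SDATA_DB;
    }
    /* Classic layout: checksum at 40, dataSize at 44. */
    if (size_matches(le32(hdr + 44), file_len)) {
        *data_size = le32(hdr + 44);
        return SDATA_CLASSIC;
    }
    return SDATA_UNKNOWN;
}
```

If the files turn out to carry extra trailing data, the size check has to be relaxed and the null int32 at offset 40 becomes the only hint.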

Once the decryption is fixed, the files don't directly yield new client files (I have mainly checked the "DBItemData.sdata" file so far).
Instead they contain a kind of database definition (hmm, aren't they named DBxxx ....), for instance, the DBItemData file reads as: (dump)

Code:

..i.t.e.m.t.y.p.e..i.t.e.m.t.y.p.e.i.d..i.m.a.g.e..i.c.o.n..l.e.v.e.l..c.o.u.n.t.r.y..a.t.t.a.c.k.f.i.g.h.t.e.r..d.e.f.e.n.s.e.f
.i.g.h.t.e.r..p.a.t.r.o.l.r.o.g.u.e..s.h.o.o.t.r.o.g.u.e..a.t.t.a.c.k.m.a.g.e..d.e.f.e.n.s.e.m.a.g.e..g.r.o.w..s.t.r..d.e.x..r.e
.c..i.n.t..w.i.s..l.u.c..v.g..o.g..i.g..r.a.n.g.e..a.t.t.a.c.k.t.i.m.e..a.t.t.r.i.b..s.p.e.c.i.a.l..s.l.o.t..q.u.a.l.i.t.y..e.f.
f.e.c.t.1..e.f.f.e.c.t.2..e.f.f.e.c.t.3..e.f.f.e.c.t.4..c.o.n.s.t.h.p..c.o.n.s.t.s.p..c.o.n.s.t.m.p..c.o.n.s.t.s.t.r..c.o.n.s.t.
d.e.x..c.o.n.s.t.r.e.c..c.o.n.s.t.i.n.t..c.o.n.s.t.w.i.s..c.o.n.s.t.l.u.c..s.p.e.e.d..e.x.p..b.u.y..s.e.l.l..g.r.a.d.e..d.r.o.p.
.s.e.r.v.e.r..c.o.u.n.t..d.u.r.a.t.i.o.n..e.x.t.d.u.r.a.t.i.o.n..s.e.c.o.p.t.i.o.n..o.p.t.i.o.n.r.a.t.e..b.u.y.m.e.t.h.o.d..m.a.
x.l.e.v.e.l..w.e.a.p.o.n.p.a.r.t..d.y.e.i.n.g.t.y.p.e..a.r.g.3..a.r.g.4..u.s.e.c.o.n.t.y.p.e..u.s.e.c.o.n.v.a.r..m.o.n.e.y.t.y.p
.e..i.t.e.m.s.k.i.l.l..i.t.e.m.u.p.g.r.a.d.e..a.r.g.1.0..g.e.n.e.c.o.u.n.t..a.r.g.1.2..s.p.e.l.l.b.o.o.k.e.x.p..s.p.e.l.l.b.o.o.
k.d.u.r.a.b.i.l.i.t.y..b........................................................................................................
..........................................Óa......................................................................°.............

all these labels look like the names of the columns of the item table, they are encoded as Unicode (nothing else in the client files is in Unicode), ~~they are length-prefixed but that length is Big Endian encoded (and not little endian as 100% of the client data)~~ Edit: the lengths are coded on 1 byte.
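
For illustration, a label stored that way could be read as sketched below. This assumes the 1-byte length counts characters (not bytes) and that each character is a 2-byte little-endian UTF-16 unit, which is what the dump suggests but is not confirmed; anything outside ASCII is replaced by '?'. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Decode one label: [1-byte length n][n UTF-16LE code units].
 * Assumption: n counts characters, not bytes.
 * Returns the number of bytes consumed, or 0 if the input is
 * truncated or the output buffer is too small.
 */
static size_t sdata_read_label(const unsigned char *buf, size_t len,
                               char *out, size_t out_cap)
{
    assert(buf != NULL && out != NULL);
    if (len < 1)
        return 0;

    size_t n = buf[0];
    if (len < 1 + 2 * n || out_cap < n + 1)
        return 0;

    for (size_t i = 0; i < n; i++) {
        unsigned lo = buf[1 + 2 * i];
        unsigned hi = buf[2 + 2 * i];
        out[i] = (hi == 0 && lo < 0x80) ? (char)lo : '?';
    }
    out[n] = '\0';
    return 1 + 2 * n;
}
```

Calling it in a loop, advancing by the returned byte count each time, would walk the list of column names until it returns 0.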

so far, I see two possible explanations:
- only the xxxxText.sdata files are used (together with the classical items.sdata) to provide i18n'ed captions.
- or, the client uses an internal "DB reader" (instead of the old sdata parser) to recover each item; the labels are then added once read from the right text file.

both possibilities offer opportunities to all of us, just imagine reading the definitions of the monsters, all their attacks included ...

nubness · 08/08/2014, 13:45

UPDATED:
- Fixed explorer bug
- Fixed name bug
This is the last version of Shaiya Data File Tool v2 that I'm releasing.
If you have any bugs to report, do it today.
Thank you.

castor4878 · 08/09/2014, 00:05

(post lost thanks to the "previous page" button of my mouse ... I rewrite a short version)

Bug: the "Settings" dialog doesn't have a Cancel button or anything (Esc handling) to allow the user to *not* register the .sah file type.

Question: when a file is extracted from an archive and edited with an external tool (launched by ShellExecute(NULL, "open", path, ...)), why not set a "folder-watcher" (such as a call to ReadDirectoryChangesW with a FILE_NOTIFY_CHANGE_SIZE filter) to be notified of external changes and re-insert the file into the archive (automatically or after user approval)?

And (also in short), your tool is the fastest & easiest to use, definitely.

nubness · 08/09/2014, 00:38

Thanks for the feedback, I'll add a cancel button to the settings window.

As for detecting file changes, I didn't wanna complicate my life too much, and I also thought of the event when the user doesn't necessarily want to import the edited file back in the client.

Thanks for the appreciation.

I received some suggestions which will be implemented for the last release, tomorrow.

castor4878 · 08/09/2014, 11:27

Yep, agree with you; the user won't always want an update of the .sah, and even when required, (re)dropping the file into the archive-folder is a piece of cake.

My concern (or simply thought) was actually more complex. I can summarize it as: how to manage the editing of a file which depends on several other files (like item.sdata).
All the scenarios I can imagine have some advantages but too many constraints.
(For instance, expanding the files into a rebuilt file structure instead of the archive root will allow an editor to access .mlt, .dds and all needed files, but will confuse the user; he won't know which files need to be reinserted nor where to find them. Also, when working with an archive, expanded (coz extracted) files should be the exception; most of the time the folder should be clean, with only the working files that will soon be packed staying there.)

The only good option I can think of would be a communication interface that allows third-party applications to request data from a particular file by sending its relative path in the archive and receiving its full content (preferably in plain).

You are currently managing 2 offsets within an opened archive, the offsets of the current and parent directories. This allows very fast access to the files & folders available at that level (as opposed to my strategy (my obligations), which consists of parsing the whole sah data and takes too much time); so having two more offsets and browsing the indexes of the sah file to access files not connected to the user interface should be possible ... but inter-application communication has always been bad under Windows; and it would be much more than a small addition.

nubness · 08/09/2014, 14:40

I received some reports on this application, and most bugs seemed to be either graphical or other irrelevant things. I admit that there might still be some minor bugs, but it's nothing critical. I think it went far enough for what was supposed to be a simple WPF experiment.

#1 post was updated with the last version.

I'm done updating minor bugs. If anyone finds more critical bugs, like the application damaging certain files, I will look into it. Anything else is not worth the waste of time.

JuuF · 08/09/2014, 17:44

~~bar2eldad~~ · 08/12/2014, 20:24

Nubness . THANK YOU !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! !!!!!!!!!

These features are really useful. Importing 1 by 1 killed my life.

~~Penchod-Shaiya~~ · 08/14/2014, 08:38

Awesome release!
#thumbs up!

_Diavolino_ · 08/16/2014, 20:32

Hello,
on some clients I get stuck on this message:

"index and count must refer to a location within the buffer.
Parameter name: bytes" (translated from French, refer to the picture...)

I was already getting that message with lilpro's tool... but maybe you have some solution to "jump" the protection.

Anyway, thanks for reading and for any answer

Regards,

castor4878 · 08/17/2014, 02:54

The issue is not related to a "protection" and it can not be jumped (it can of course be ignored).

Your index file (data.sah) contains one (or more) invalid records; each record defines one file of the data-set.
Your file has records with either an invalid offset at which to start reading the file (offsets greater than the .saf file size are of course invalid) or an invalid length to read (if offset + length > saf size).

It may happen that a small number of records are corrupted (the most likely reason is a buggy updater, or a crash during its run); in such a case, you need a tool that extracts files and ignores errors (instead of stopping).
If the data are yours, it is of course easier to rebuild a .sah index.
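
The two checks above can be sketched as follows. The record structure here is a hypothetical stand-in (the real .sah record layout is not described in this thread); only the offset/length test is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of one .sah record. */
struct sah_record {
    uint64_t offset;   /* where the file starts inside the .saf */
    uint32_t length;   /* number of bytes to read */
};

/* A record is valid if [offset, offset + length) lies within the .saf. */
static int sah_record_is_valid(const struct sah_record *rec, uint64_t saf_size)
{
    assert(rec != NULL);
    if (rec->offset > saf_size)
        return 0;
    /* Same as offset + length <= saf_size, written to avoid overflow. */
    return rec->length <= saf_size - rec->offset;
}

/* Count the records that can be extracted, skipping invalid ones
 * instead of stopping at the first error. */
static size_t sah_count_extractable(const struct sah_record *recs, size_t count,
                                    uint64_t saf_size, size_t *skipped)
{
    assert(recs != NULL && skipped != NULL);
    size_t ok = 0;
    *skipped = 0;
    for (size_t i = 0; i < count; i++) {
        if (sah_record_is_valid(&recs[i], saf_size))
            ok++;
        else
            (*skipped)++;
    }
    return ok;
}
```

An extractor built this way would log the skipped records and carry on, which is the "ignore error" behaviour described above.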

OOT, I was also using imageshack ... it's really **** now. boring thumb generation and no way to get a link to the pict.

_Diavolino_ · 08/17/2014, 10:19

Oh, I see the nature of the problem, which is not really a protection; the explanation helps me understand that error type better.

And thank you for the time spent explaining it.

Quote:

in such a case, you need a tool that extracts files and ignores errors (instead of stopping).

Does a "beast" like that exist?

And about ImageShack, it's lousy now; it was one of the best until some point... I will switch to your provider, I think it's easier to use too

imgur.com ! lol

Regards,

sominus · 08/18/2014, 06:51

Don't know if it still works, but you could try 'quickbms' and its Shaiya script. I've always used that for full extraction.

Truth1010 · 08/18/2014, 09:07

I think QuickBMS gives the same range of errors, it can't read through the header and doesn't fully extract. I remember trying it with the first set of new EP5 files, so I don't think it will work with the newer ones that are also changed.
I may be wrong though, as I hadn't used QuickBMS before and wasn't fully sure how to modify it to read the file properly

~~Penchod-Shaiya~~ · 08/18/2014, 09:07

I can't find out how to "Create the patch" after importing **** loads of stuff...
Am I blind or doesn't this data file tool have it?
Either way I used the normal Data tool, Right Click > Add to patch list (MUCH faster than adding 68 files for adding 1 new set per class).