Completing a personal milestone by tying up a long-time loose end :)
It honestly doesn't feel like it, but it's been exactly 10 years since I first released the [link] project. For anyone who doesn't know or remember, the community was working together in hopes of getting leaked JSRO files fixed and working. Working JSRO files never happened at that time, but VSRO 188 files were released a few months later, and the community finally had what it desired.
Looking back, it was a lot of fun for me, and an enjoyable experience. Nowadays, I see a lot of people talk about the state of the community and just how bad it is. A lot of people feel doing things for the community is a waste because of any number of reasons I won't bother to list. However, I'm of a different opinion. I think the community nowadays is pretty much the same as it was 10 years ago. We're all obviously a lot older and more experienced now, so we might see things differently than we used to, but for the most part, I think the nature of things is the same as it always has been.
I'd like to pre-emptively address the question of "Why come back 10 years later and work on this stuff?" My answer is simple: this is what I find to be fun! That simple answer is what drove me to work on this stuff in the first place, and it remains my primary motivation. Community recognition, acknowledgment, and thanks are always appreciated and make me feel good, but the work itself is what is truly fulfilling.
With that out of the way, I'm going to post my now fully completed certification format, and then talk about a few things I've learned recently re-reversing it. I should note, this is just an initial information thread. An updated tools thread might come later. I have revamped a lot of my old code to be more modern and streamlined, but I'm also working on a lot of things at once, so I'd like to organize the project better for community consumption first, and that takes time.
Just to note, the comments about offsets and sizes are all code generated! I did not manually type all of that out.
Native.cs
Before I start, I'd like to shout out @[user] for his [link] project (which was released 5 years ago).
While I did not look at the details of it until after I had already updated most of my work, I saw he made some advancements on things I had initially missed, such as the SMC security stuff that is a part of the certification format, but not included in any "packt.dat" files. I personally wouldn't go the DB route for trying to manage certification stuff nowadays, but it's still cool to see people trying new and interesting things.
I guess that leads us into the first set of changes from the original certification reversing. At the end of certification data, there's another group of SMC related security data. This data is optional, and not included in any "packt.dat", but GlobalManager will try to parse and load it if it exists.
After I realized I had missed that data, I then saw I had originally misparsed the certification data itself. Luckily, the error was such that it didn't cause any problems, but in my new updated tools, I've finally fixed that. To quickly summarize how it should parse for 0xA003: the first byte is 1 if there's certification data and 0 if not. Then, for each group of certification data (the structs listed above), parse 1 dummy byte, and then 1 byte to know if there's certification data for that group.
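To make that flag layout concrete, here's a minimal Python sketch of just the presence bytes. The group names, the little-endian body parser, and treating a flag value of 1 as "present" for each group are all assumptions for illustration; the real group bodies are the structs from Native.cs and are hidden behind a placeholder callback here.

```python
import io
import struct


def read_u8(stream):
    """Read one unsigned byte, failing loudly on truncated input."""
    raw = stream.read(1)
    if len(raw) != 1:
        raise EOFError("unexpected end of certification data")
    return raw[0]


def parse_certification(stream, group_parsers):
    """Return {group name: parsed body or None}, or None if there is no data.

    group_parsers is an ordered list of (name, parse_body) pairs. Both the
    names and the body parsers are hypothetical stand-ins for this sketch.
    """
    if read_u8(stream) != 1:          # leading byte: 1 = certification data follows
        return None
    result = {}
    for name, parse_body in group_parsers:
        read_u8(stream)               # dummy byte, value ignored
        has_data = read_u8(stream)    # assumed: 1 = this group has data
        result[name] = parse_body(stream) if has_data == 1 else None
    return result


def parse_u32(stream):
    """Placeholder body parser: one little-endian 32-bit value (an assumption)."""
    return struct.unpack("<I", stream.read(4))[0]


# One group with data (value 7), one group without.
blob = bytes([1, 0, 1]) + struct.pack("<I", 7) + bytes([0, 0])
parsed = parse_certification(
    io.BytesIO(blob), [("group_a", parse_u32), ("group_b", parse_u32)]
)
```

With this input, `parsed` maps `group_a` to 7 and `group_b` to `None`, since the second group's flag byte is 0.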
In my original tools, I read the wrong data type at each location, but the total number of bytes read was still correct, so the only consequence was that the (non-existent) security data at the end was never parsed. It's funny it worked out that way, but I can't complain.
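As a small illustration of why that kind of mistake can go unnoticed, the Python sketch below reads the same two bytes once as two separate bytes and once as a single 16-bit value. Which wrong type the original tools actually read isn't stated here, so the 16-bit read is purely a hypothetical example; the point is only that the stream ends up at the same offset either way.

```python
import io
import struct

# Dummy byte, flag byte, then one byte standing in for whatever follows.
data = bytes([0, 1, 0xAA])

# Intended read: two separate bytes.
right = io.BytesIO(data)
dummy, flag = struct.unpack("<BB", right.read(2))

# Mistaken read (hypothetical): one little-endian 16-bit value over the same bytes.
wrong = io.BytesIO(data)
(combined,) = struct.unpack("<H", wrong.read(2))

# Both readers have consumed exactly two bytes, so everything after them
# still lines up even though the mistaken value itself is meaningless.
same_offset = right.tell() == wrong.tell()
```

Here `flag` is 1 while `combined` is 256, yet `same_offset` is true, so any parsing that ignores the value keeps working.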
Next, I had a few wrong data types (such as in NativeNodeLink), and DaxterSoul had those fixed as well. Nothing ended up breaking as a result, simply because of which fields were wrong and because the data after them wasn't being used for static certification. Another lucky break. I went through and tracked down the correct data type of each field by checking memory accesses across the modules.
That leads me into the last big thing to complete the certification format. As it turns out, Joymax used an old school method of sending placeholder data to reserve memory, so server modules can then fill it out with their own values at runtime without having to allocate a different certification object. Typically, network protocols are designed in such a way to minimize the size of data passed around, but for something like server certification, where it only runs once per server startup, it makes for a simpler design.
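Here's a rough Python sketch of the placeholder idea in general terms. The record layout (a 2-byte id followed by a 4-byte runtime slot), the names, and the values are all made up for illustration and are not the actual certification structs; the sender transmits the slot zeroed, and the receiver later writes into the same buffer instead of allocating a second object.

```python
import struct

# Hypothetical record: a 2-byte id set by the sender, plus a 4-byte slot the
# sender leaves zeroed so the receiver can fill it in at runtime.
RECORD = struct.Struct("<HI")
RUNTIME_SLOT_OFFSET = 2


def build_record(record_id):
    """Sender side: serialize the record with the runtime slot as a placeholder."""
    return RECORD.pack(record_id, 0)


def fill_runtime_slot(buffer, value):
    """Receiver side: overwrite the placeholder in place, no reallocation."""
    struct.pack_into("<I", buffer, RUNTIME_SLOT_OFFSET, value)


wire = build_record(42)                 # what travels over the network
in_memory = bytearray(wire)             # receiver keeps the same layout in memory
fill_runtime_slot(in_memory, 0xDEADBEEF)
record_id, runtime_value = RECORD.unpack(in_memory)
```

The trade-off is a few wasted bytes on the wire in exchange for the wire format and the in-memory format being the same thing.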
The reason my original certification worked despite all the extra unknown stuff was that the unknown stuff was pretty much all runtime data! By reversing each of those fields now, I've gotten a better understanding of how certification data is stored in memory and how the server modules use it. I know 10 years ago I was certainly aware of this approach, but it never dawned on me at the time it would be used in this type of scenario.
I probably need to clean up a few naming conventions, but for now I'll leave things as they were, since Joymax uses inconsistent names for things as well. Fixing the parsing bugs and making a new, simpler API to work with certification data is certainly more useful than simply knowing 100% of the format, but I wanted to share it anyways to complete the puzzle.
I think the biggest surprise for me is the runtime data stuff. There are lookup pointers, which make sense, and also other misc data (in NativeNodeLink and NativeShard for example) that is only set in certain modules. I actually had 2 unknown fields I had been trying to reverse the past week before I could make this thread: NativeNodeLink::connection_session and NativeShard::operational_state.
NativeNodeLink::connection_session took some time, because I didn't quite understand how the server modules use the NodeLink data. I do now, and it was really easy to determine once I saw the values in memory and the server logs while restarting connections in different modules.
As it turns out, NativeShard::operational_state is only updated in GatewayServer, so I set HWBPs (hardware breakpoints) on the data in all modules and tried to track it down. It took some time, but I finally noticed it while on the title screen requesting server stats, as I had only been testing data while logged into the server.
Since certification stuff has been working the entire time, despite the flaws, and it's not too hard to make simple tools to build certs for you, there's not much that changes from knowing the exact format. It's nice to have, but besides just wanting to complete this task, my real focus was trying to understand the server modules and why various certification issues exist.
For example, it's known that you need to use a tool like ForceBindIP if you want to use multiple modules on the same physical machine. My question is "why?". Why do the server configs have support for a "certification_ip_bind", yet don't bind to that IP? How can you fix the server modules so you don't have to use ForceBindIP and can just use a correct certification architecture?
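For anyone unsure what "binding to an IP" means in practice: a socket can be pinned to a specific local address before it listens or connects. The Python sketch below shows that on loopback only. It is not the server modules' code and says nothing about how they actually behave; the address and variable names are placeholders.

```python
import socket

LOCAL_IP = "127.0.0.1"  # placeholder for the address a module is meant to use

# A listener bound to one specific local address (port 0 = let the OS pick).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind((LOCAL_IP, 0))
listener.listen(1)
listen_addr = listener.getsockname()

# An outgoing connection whose source address is pinned before connecting.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.bind((LOCAL_IP, 0))
client.connect(listen_addr)

# The listener sees the connection coming from the address the client bound to.
accepted, peer_addr = listener.accept()
source_addr = client.getsockname()

accepted.close()
client.close()
listener.close()
```

If a program skips the explicit bind, the OS chooses the source address for it, and forcing that choice from the outside is the job a tool like ForceBindIP does.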
I finally have the answers to all those questions and more, but they're best for another thread ;)
That about wraps up this thread though. No promises or timelines on updated tools that make use of this stuff. Long ago I wrote a tool to create a VSRO dev server setup (as in, not meant to be run as a commercial project) using a single button click, and I'd like to remake that, as the old version was limited and targeted older Windows/MSSQL versions.
That means finishing up some other tools and putting everything together in a nice package, which just takes time and a lot of testing. I needed to solve some of the cert stuff and module issues to know why the 2nd gameserver setups were so convoluted, so now that it's figured out, I can get back to doing some of the other stuff I wanted.