Debugging 4351 client random closes.

Old   #1
 
elite*gold: 0
Join Date: Jan 2008
Posts: 31
Received Thanks: 0
Debugging 4351 client random closes.

The client doesn't log enough, and all I see server-side is that the client forcibly disconnected. Sometimes I see an unknown client packet logged by the server, but that seems to have no relation to the crashes.

My research suggests it's one of the following:

1) server sending invalid packets, which always causes the client to instantly close.
2) general instability on pre 5017, which is why the servers like to use 5017.
3) weirdness with old clients and new operating systems.
4) router being paranoid and messing with/blocking packets.

The fact that i can't seem to find any private server on the net still using 4351 doesn't help.

This version was deliberately chosen because it's the one before potency, and the source i'm using i have used in the past.

What i've tried.

1) running the server as lan only with 192.168.0.x address space behind the router. Still closes randomly.
2) compatibility mode on client. no difference, though fullscreen works again if i use vista compatibility, which is what the troubleshooter says to use.
3) setting client to one core.
4) disabling SPI on the router.

What i haven't tried yet.
1) 4351 client on another server. (if anyone is actually running one that takes 4351, let me know, so i can test.)
2) placing the server source onto a Windows VPS (needs ODBC for its database). That requires money.
3) placing it on a linux vps and trying to run it with wine (would still need odbc to work somehow). in theory this can be done for free for testing purposes.
4) upgrade source to 5017 or something more stable. (quite the undertaking)
5) logging and validating the packets to figure out if there is a bug in the source. This seems unlikely to get anywhere, since the frequency of the crashes is totally random, but it may be what's required. Every time i actually hit a breakpoint all clients end up disconnecting, so that does make me think it could be bad packets.
6) actually installing xp on something.
7) throttling the app.
8) rewriting the socket system (it's the usual xynetsocket, which is known as xynetshit, so that could very well be to blame)

The source i'm using has the rather unique feature of lua scripting for npcs, items, and even making stuff happen during levelup and as time passes, so scrapping the source completely isn't an option unless someone else has another source that does the same that i can port all the custom npcs and quests into.

Any recommendations?
zaphod77 is offline  
Old 01/07/2024, 19:58   #2


 
CptSky's Avatar
 
elite*gold: 0
Join Date: Jan 2008
Posts: 1,434
Received Thanks: 1,147
Most crashes I've seen client-side are due to invalid packets, packets that are too large, or invalid IDs (e.g. for meshes). On packets that are too large: most sources do not respect the client's max packet size and arbitrarily set a max size server-side, so when you start sending large packets (e.g. using Scatter on a lot of monsters), it can crash the client.

Back in the day, many servers ran on 4351 without issues. It was quite a stable patch for pre-potency stuff. So unless it is a bug introduced by new operating systems (so far, Windows remains quite backward compatible for CO), I'd look at the source for issues.
CptSky is offline  
Old 01/07/2024, 20:20   #3
 
elite*gold: 0
Join Date: Jan 2008
Posts: 31
Received Thanks: 0
I do large scatters just fine. (the lab spawns are pretty huge). i've had crashes just wandering around and attacking melee, and with taos zapping with lightning bolts. That shouldn't be overloading the packets, right?

And nothing gets logged by the client at all. Is there some trick? There is a debug directory, but I'm pretty sure invalid mesh IDs would get logged there.

i sometimes see something about ani index not found, but not always. my most recent crash put nothing in debug at all.

And now had one in the training ground.

It seems something is causing strange and repeatable packet corruption. The impression i'm getting is that an area is showing up wrong in the packets, and i have no idea where the corruption is happening. i'm seeing debug errors for minicons for items that i know for a fact can't be dropping, but it's always the SAME one! There's nothing obvious in the source, and i'm pretty sure that this didn't use to happen.

The client is also doing the random extra characters on the chatline entry thing. I suspect the two are related somehow. I remember it didn't use to do this. Was there ever a fix for that other than using a newer client?

Never mind, using locale stuff made the random characters vanish, but the crashes still happen, so that's clearly not it. Still trying to figure out how to actually trace the packets.
zaphod77 is offline  
Old 01/08/2024, 22:39   #4
 
elite*gold: 0
Join Date: Jan 2008
Posts: 31
Received Thanks: 0
finally tried an xp virtual machine. no change. this is really annoying.

I'm beginning to wonder if the router is somehow to blame. I have vague memories of this not happening when I was running this source off a VPS, but naturally I don't have one of those. Does xynetsocket get confused with LAN IPs or something?

wait a minute...

running the server in compatibility mode for vista sp2 seems to have fixed things. now how do i get it to run in windows 10 or 11?
zaphod77 is offline  
Old 01/08/2024, 22:55   #5
 
Spirited's Avatar
 
elite*gold: 12
Join Date: Jul 2011
Posts: 8,214
Received Thanks: 4,118
It's probably something to do with the socket system and packet lengths, as CptSky mentioned. Or that's what I've seen the most of as well. Running it in compatibility mode could just be forcing performance constraints on it - like single threading or slower IO. It's not the idea of overloading packets, but overloading the buffer in the socket system that stores packets / messages before being split (assuming your server splits the buffer as well, which could be another cause of disconnects). Those messages in the buffer could be incomplete (cut off at the end of the buffer) and need more bytes to be received to complete it.
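
To make the split step concrete, here is a minimal sketch of a receive-side splitter that keeps an incomplete tail until more bytes arrive. It assumes the buffer has already been decrypted and that each message starts with a 2-byte little-endian length covering the whole message (header included); that layout is an assumption to check against your patch. `MessageSplitter` and `Feed` are made-up names, not part of any source mentioned in this thread.

```csharp
using System;
using System.Collections.Generic;

// Sketch only: accumulates received bytes and returns complete messages.
// Assumption: each message begins with a 2-byte little-endian length that
// counts the entire message, header included.
public sealed class MessageSplitter
{
    private readonly List<byte> _pending = new List<byte>();

    // Feed the bytes from one receive call; returns every complete message.
    public List<byte[]> Feed(byte[] buffer, int count)
    {
        for (int i = 0; i < count; i++) _pending.Add(buffer[i]);

        List<byte[]> messages = new List<byte[]>();
        while (_pending.Count >= 2)
        {
            int length = _pending[0] | (_pending[1] << 8);
            if (length < 4) throw new InvalidOperationException("Bad length prefix.");
            if (_pending.Count < length) break; // incomplete: wait for more bytes

            messages.Add(_pending.GetRange(0, length).ToArray());
            _pending.RemoveRange(0, length);
        }
        return messages;
    }
}
```

A message cut off at the end of one receive stays in `_pending` and is completed by the next call to `Feed`, rather than being handed to the handlers half-finished.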
Spirited is offline  
Old 01/09/2024, 00:42   #6
 
elite*gold: 0
Join Date: Jan 2008
Posts: 31
Received Thanks: 0
Okay, compatibility mode didn't fix it after all (though it did seem to help), but limiting the gameserver to one core seems to have fixed it, so it seems the issue is probably thread-related.

The socket system is xynetsocketlib, judging by what the source says, in a SocketXY.cs.

Seems like it's race conditions that running single core seems to solve, but how the heck do i debug that?

And with it forced to a single core, a different problem happens: instead, the server just starts ignoring all packets sent by the client until I press the kick button.

maybe need to do the same to the login server? gah!

Okay, i understand the new problem.

if a packet arrives too close to another packet for whatever reason, the source decides that there was a packet flood, and fails to acknowledge the packet.

since the server ignores the packet, the client waits and waits for the answer, which never comes, and refuses to actually send any more packets.

And it doesn't matter who sent the packet that arrives really close to the same time as the other one.

It seems it was intended to be some anti flood measure, but it doesn't check separately for each connection.

it's odd that this wasn't happening until i fixed the process to one core.

so now i'm going to do a quick sleep and then process the packet anyway instead. either this will fix things, or cause the previous problem to resume happening.
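
A per-connection version of that anti-flood check might look like the following. This is only a sketch: `FloodGuard`, `connectionId`, and the gap threshold are hypothetical, and the real source may identify clients differently.

```csharp
using System;
using System.Collections.Generic;

// Sketch only: remembers the last packet time per connection, so two
// different clients sending at nearly the same moment never trip it.
public sealed class FloodGuard
{
    private readonly TimeSpan _minGap;
    private readonly Dictionary<int, DateTime> _lastSeen = new Dictionary<int, DateTime>();
    private readonly object _sync = new object();

    public FloodGuard(TimeSpan minGap)
    {
        _minGap = minGap;
    }

    // Returns true when this connection's previous packet arrived less
    // than minGap before 'now'. connectionId is a hypothetical key.
    public bool IsFlooding(int connectionId, DateTime now)
    {
        lock (_sync)
        {
            DateTime previous;
            bool seen = _lastSeen.TryGetValue(connectionId, out previous);
            _lastSeen[connectionId] = now;
            return seen && (now - previous) < _minGap;
        }
    }
}
```

Whatever the check decides, the client apparently still expects an answer, so delaying or queueing a flagged packet is probably safer than silently dropping it.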
zaphod77 is offline  
Old 01/09/2024, 14:11   #7
 
elite*gold: 0
Join Date: Sep 2014
Posts: 189
Received Thanks: 48
Packets received should be queued and handled in order AFAIK. Maybe this will solve your issue: enqueue the packets you receive and handle them accordingly.
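
As a rough illustration of that idea, here is a minimal sketch of a per-connection queue that hands packets to a handler strictly one at a time, in arrival order, even when receive callbacks fire on several threads. `OrderedPacketQueue` and the handler delegate are hypothetical names, not taken from any particular source.

```csharp
using System;
using System.Collections.Generic;

// Sketch only: one instance per connection. Whichever thread finds the
// queue idle becomes the drainer; other threads just enqueue and leave.
public sealed class OrderedPacketQueue
{
    private readonly Queue<byte[]> _queue = new Queue<byte[]>();
    private readonly Action<byte[]> _handler;
    private bool _draining;

    public OrderedPacketQueue(Action<byte[]> handler)
    {
        _handler = handler;
    }

    // Safe to call from any receive callback thread.
    public void Enqueue(byte[] packet)
    {
        lock (_queue)
        {
            _queue.Enqueue(packet);
            if (_draining) return; // another thread is already handling packets
            _draining = true;
        }

        while (true)
        {
            byte[] next;
            lock (_queue)
            {
                if (_queue.Count == 0) { _draining = false; return; }
                next = _queue.Dequeue();
            }

            try { _handler(next); } // runs outside the lock, one packet at a time
            catch (Exception ex) { Console.WriteLine(ex); }
        }
    }
}
```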
iBotx is offline  
Old 01/09/2024, 17:55   #8
 
elite*gold: 0
Join Date: Jan 2008
Posts: 31
Received Thanks: 0
The source seems to do a lot of async callbacks, which only seem to work properly on a single CPU core, and naturally I want a simple fix. Which there probably isn't.
zaphod77 is offline  
Old 01/09/2024, 18:53   #9
 
Spirited's Avatar
 
elite*gold: 12
Join Date: Jul 2011
Posts: 8,214
Received Thanks: 4,118
Quote:
Originally Posted by iBotx View Post
Packets received should be queued and handled in order AFAIK. Maybe this will solve your issue: enqueue the packets you receive and handle them accordingly.
That's true if the buffer hasn't been decrypted yet. Once decrypted, messages can be processed in any order. Though, a lot of servers also implement secondary queues per message type to help keep processing fair for attacks and such.

Quote:
Originally Posted by zaphod77 View Post
The source seems to do a lot of async callbacks, which only seem to work properly on a single CPU core, and naturally I want a simple fix. Which there probably isn't.
Which source are you using for this?
Spirited is offline  
Old 01/11/2024, 03:44   #10
 
elite*gold: 0
Join Date: Jan 2008
Posts: 31
Received Thanks: 0
Pioneer Conquer Server, after trinity gaming got ahold of it and switched it to xynetsocket. it no longer seems to actually use pioneer.dll anymore.

I've also discovered that file corruption on the client side can cause the same insta closing.

On 1 CPU core, it's stable. The compatibility troubleshooter says it needs to be XP compat, but that doesn't seem needed. I notice that when shooting arrows the damage shows before the arrow gets there.

The packets do block when being sent, but when multiple CPU cores are used, only one of them gets blocked, and the second one is free to send packets, apparently. That seems to be the issue. Is there a way to force a compiled C# exe to use a single core only, set at compile time?
zaphod77 is offline  
Old 01/11/2024, 09:07   #11
 
Spirited's Avatar
 
elite*gold: 12
Join Date: Jul 2011
Posts: 8,214
Received Thanks: 4,118
Quote:
Originally Posted by zaphod77 View Post
Pioneer Conquer Server, after trinity gaming got ahold of it and switched it to xynetsocket. it no longer seems to actually use pioneer.dll anymore.

I've also discovered that file corruption on the client side can cause the same insta closing.

On 1 CPU core, it's stable. The compatibility troubleshooter says it needs to be XP compat, but that doesn't seem needed. I notice that when shooting arrows the damage shows before the arrow gets there.

The packets do block when being sent, but when multiple CPU cores are used, only one of them gets blocked, and the second one is free to send packets, apparently. That seems to be the issue. Is there a way to force a compiled C# exe to use a single core only, set at compile time?
Forcing it to run on a single core is very doable through Task Manager, but that isn't actually a fix for the problem. If you wanna find the real issue, then maybe we can help identify issues with your socket system. We can't do a whole lot for a private source, so if you wanna post the socket system here, then maybe we can make some recommendations.
Spirited is offline  
Old 01/12/2024, 00:31   #12
 
elite*gold: 0
Join Date: Jan 2008
Posts: 31
Received Thanks: 0
The issue is what i said. it's using threads, and letting windows manage them. you know, cuz c sharp is a managed language.

as i said, it does request locks before it sends a packet, but that only works if you are using a single thread. as long as packets aren't being sent simultaneously to the same client, it's fine.

I am able to force it onto a single core with a shortcut. i want it to stay on a core without having to do that.
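
For the narrower goal of staying on one core without a shortcut, .NET exposes `Process.ProcessorAffinity`, which the program can set on itself at startup. A minimal sketch follows; `AffinityHelper` is a made-up name, and this remains the workaround rather than a fix for the underlying race.

```csharp
using System;
using System.Diagnostics;

// Sketch only: pins the current process to a single logical CPU.
public static class AffinityHelper
{
    // Call once, early in Main, before worker threads are started.
    public static void PinToSingleCore()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            long mask = current.ProcessorAffinity.ToInt64();
            long lowest = mask & -mask; // keep only the lowest CPU bit currently allowed
            current.ProcessorAffinity = (IntPtr)lowest;
        }
    }
}
```

The affinity value is a bit mask, so `1` means CPU 0, `2` means CPU 1, and so on. The sketch picks the lowest CPU the process is already allowed to run on.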

The structure of the source is NOT very suitable for extracting just the packet part out of it. There does seem to be a queue, but it's usually bypassed, as near as I can tell. So there seems to be both a single packet queue and a way to skip the queue, and usually the queue is skipped. Which doesn't work so well on a multicore CPU.

Packets that skip the queue go here.

from PAcketQ.cs

Quote:
public class PacketSend : Instances
{
    private byte[] _packet;
    private COClient _COC;

    public PacketSend(COClient COC, byte[] packet)
    {
        try
        {
            _COC = COC;
            _packet = packet;
            runImpl();
        }
        catch (Exception ex)
        {
            Pioneer.ConsoleColour.SetForeGroundColour(Pioneer.ConsoleColour.ForeGroundColour.Red);
            Console.WriteLine(ex.ToString());
            Pioneer.ConsoleColour.SetForeGroundColour();
        }
    }

    void runImpl()
    {
        try
        {
            if (_packet.Length > 4)
            {
                Connection.Connection sm = new Pioneer.Connection.Connection(_COC, this._packet);
                sm.SendData();
            }
        }
        catch (Exception ex)
        {
            Pioneer.ConsoleColour.SetForeGroundColour(Pioneer.ConsoleColour.ForeGroundColour.Red);
            Console.WriteLine(ex.ToString());
            Pioneer.ConsoleColour.SetForeGroundColour();
        }
    }
}

SendData is this:
Quote:
public void SendData()
{
    try
    {
        if (this == null)
            return;
        if (_data == null)
            return;
        if (_client == null)
            return;

        _data.CopyTo(_Sender, 0);
        System.Threading.Monitor.TryEnter(this, new TimeSpan(0, 0, 0, 8, 0));//8
        _client.SendDataPus(_Sender);
        System.Threading.Monitor.Exit(this);
    }
    catch (Exception ex)
    {
        Pioneer.ConsoleColour.SetForeGroundColour(Pioneer.ConsoleColour.ForeGroundColour.Red);
        Console.WriteLine(ex.ToString());
        World.WinLog.AddErorrMessto(ex.ToString());
        Pioneer.ConsoleColour.SetForeGroundColour();
        System.Threading.Monitor.Exit(this);
    }
}
and SendDataPus is

Quote:
public void SendDataPus(byte[] data)
{
    try
    {
        lock (this)
        {
            if (data.Length > 100 || (data[2] == 0x4d && data[3] == 0x04)) //|| (data[2] == 0xfe && data[3] == 0x03))
            {
                Crypto.Encrypt(ref data);

                if (asyncCallback == null)
                {
                    asyncCallback = new AsyncCallback(OnSendCallback);
                }

                Sock.BeginSend(data, 0, data.Length, System.Net.Sockets.SocketFlags.None, asyncCallback, Sock);
            }
            else
            {
                Senddata_s.Adddata(data);

                if (YY != null || YY.Enabled == true)
                {
                    YY.Stop();
                }
                YY.Start();
            }
        }
    }
    catch (Exception e)
    {
        Console.WriteLine(e);
    }
}

Sock.BeginSend is apparently standard system-level stuff. So what's the error that breaks this on multicore? Is it simply that packets aren't actually queued?
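
One thing visible in the snippets above: `runImpl` builds a new `Connection` for every packet, and `SendData` then calls `Monitor.TryEnter(this, ...)` on that fresh object, so that lock is taken on something no other thread holds and cannot serialize anything. The return value of `TryEnter` is also ignored. Only the `lock (this)` inside `SendDataPus` is on the shared client object. A hedged sketch of the usual alternative follows, with one lock object per client and the `TryEnter` result checked; `ClientSender` and the `send` delegate are hypothetical stand-ins for the real client class and socket call.

```csharp
using System;
using System.Threading;

// Sketch only: one ClientSender per connected client, shared by every
// thread that wants to send to that client.
public sealed class ClientSender
{
    private readonly object _sendLock = new object(); // shared across all sends for this client
    private readonly Action<byte[]> _send;            // stands in for the real encrypt-and-send call

    public ClientSender(Action<byte[]> send)
    {
        _send = send;
    }

    // Returns false if the lock could not be taken within the timeout.
    public bool TrySend(byte[] packet, TimeSpan timeout)
    {
        bool taken = false;
        try
        {
            Monitor.TryEnter(_sendLock, timeout, ref taken);
            if (!taken) return false; // never send, and never Exit, without the lock

            _send(packet);
            return true;
        }
        finally
        {
            if (taken) Monitor.Exit(_sendLock);
        }
    }
}
```

If encryption keeps running state between packets, doing the encrypt step and the send under the same lock keeps the two in the same order, which is presumably what the `lock (this)` in `SendDataPus` is meant to guarantee for the direct-send branch.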
zaphod77 is offline  
Old 01/15/2024, 08:38   #13
 
Spirited's Avatar
 
elite*gold: 12
Join Date: Jul 2011
Posts: 8,214
Received Thanks: 4,118
Okay... well, that's not a helpful code snippet, so there's little that can be identified from that. If the project's socket system can't be extracted, then that's a little shocking... Classes and namespaces should be modular. But you do you. Not much I can think of to help you at this point though if you don't share code.

You can try comparing your socket system's flow of operations with my own, if that helps. A socket system is really nothing special... it just needs to not be awful. So I do encourage you to still post yours if you can't figure out the core issue from this. Good luck.


Spirited is offline  
Old 01/17/2024, 04:05   #14
 
elite*gold: 0
Join Date: Jan 2008
Posts: 31
Received Thanks: 0
I'm trying to share the parts related to the actual packet sending, without sharing the private code.

As I said, with a single core, the client doesn't close unless I scatter a spawn big enough to blow past the packet limit. There was exactly one spawn in game capable of doing that, and it's now been nerfed.

As near as I can tell, the data can't be getting modified while the packet is being composed and sent, yet clearly it's crashing clients on multicore. Every parameter is passed by value in C# unless it's explicitly passed by reference.

That said, it seems Socket.BeginSend is considered a legacy API these days (newer code is steered toward the Task-based methods). It may very well be that on current operating systems it just breaks on multicore processors.



Okay. i think i understand what's actually going on.

socket.beginsend has some "intelligence" in it. that if packets are sent close enough together, they are combined into one tcp/ip packet, and if the windows tcp/ip packet buffer gets filled, the packet that filled it gets split into two.

This breaks the packet completely, and conquer client closes when it sees the partial packet. the force to one core manages to throttle performance enough that the buffer never fills before a packet gets sent, and thus all packets avoid being mangled.

Knowing this, what's the fix? The issue here is that Socket.BeginSend isn't aware of the status of the packet buffer, and even though it's told how much data is being sent, Windows can still go behind the scenes and combine multiple server packets into one TCP/IP packet, and will split a server packet between TCP/IP packets.

Ideally I want to allow Windows to combine server packets into the same TCP/IP packet, but not to split them up between two TCP/IP packets.

Is there a better option than adding a sleep into the netcode to ensure that only one server packet gets put into the buffer before the TCP/IP packet gets sent? A way to force the server to block until the TCP/IP packet actually gets sent?

Ahh. Seems the issue is that the EndSend callback can return when only part of the packet was put into the buffer, and if this happens we need to call BeginSend a second time to send the rest of the data.
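
A defensive way to handle that possibility is to loop until every byte has been accepted. The sketch below is written against a delegate rather than a real socket so the logic can be read (and tested) on its own; `SendHelper`, `SendAll`, and `sendSome` are hypothetical names. Whether a short count can actually occur depends on the socket's blocking mode, so treat this as a precaution rather than a confirmed diagnosis. Keep in mind too that TCP is a byte stream: the receiving side has to reassemble messages regardless of how the sender's writes end up grouped on the wire.

```csharp
using System;

// Sketch only: keeps calling the send function until the whole buffer
// has been accepted.
public static class SendHelper
{
    // sendSome(buffer, offset, count) returns how many bytes it accepted,
    // the same shape as Socket.Send(byte[], int, int, SocketFlags).
    public static void SendAll(byte[] data, Func<byte[], int, int, int> sendSome)
    {
        int offset = 0;
        while (offset < data.Length)
        {
            int sent = sendSome(data, offset, data.Length - offset);
            if (sent <= 0) throw new InvalidOperationException("Connection closed during send.");
            offset += sent;
        }
    }
}
```

With a real socket this could be called as `SendHelper.SendAll(data, (b, o, c) => socket.Send(b, o, c, SocketFlags.None))`, ideally under the same per-client lock as the encryption step so that two packets are never interleaved.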
zaphod77 is offline  
Old 01/17/2024, 06:16   #15
 
Spirited's Avatar
 
elite*gold: 12
Join Date: Jul 2011
Posts: 8,214
Received Thanks: 4,118
Well, as mentioned, the issue is probably with receiving and nothing to do with the theories you have... But if you're not willing to post your socket class or read people's advice... then idk what to tell you. A socket class is some basic stuff... use mine if it gets your server working. Good luck with your private code.
Spirited is offline  
All times are GMT +2. The time now is 03:51.

