[Discussion] Are there any other Silkroad Security APIs out there?

08/02/2020 09:32 vietnguyen09#1
Hi guys,

I've spent the last few days researching packet handling for SRO filters, and I found that almost every filter (standalone application) out there uses SilkroadSecurityAPI from @[user].

While I'm thankful to him for making a simple, lightweight API, I still wonder whether there is any "better" API out there nowadays (sorry @[user], no offence). [link] seems to be the best one, but it looks abandoned since this community is all but dead, and it takes a lot of skill and time to get to grips with.

A recent release that looks great is from @[user], built on .NET Core with some really decent test cases. Can we discuss the weak and strong points of each API in a real server environment?

Any ideas?
08/02/2020 22:03 JellyBitz#2
SilkroadSecurityAPI does everything you need to handle the Blowfish encryption and handshake, and does it pretty well even in asynchronous scenarios. In my opinion it has nothing to do with a bot's or filter's performance.
08/03/2020 18:58 vietnguyen09#3
Quote:
Originally Posted by JellyBitz View Post
SilkroadSecurityAPI does everything you need to handle the Blowfish encryption and handshake, and does it pretty well even in asynchronous scenarios. In my opinion it has nothing to do with a bot's or filter's performance.
Given the test that @[user] just ran, can you tell me what you think about it?
08/03/2020 19:22 b0ykoe#4
Quote:
Originally Posted by vietnguyen09 View Post
Given the test that @[user] just ran, can you tell me what you think about it?
Well, in my filter it's honestly just modified a little to work with .NET Core. Apart from that I didn't change the code at all. I also shrank the buffers a little, since 8k seemed a bit much. 4k was plenty for my case.
08/04/2020 14:04 DaxterSoul#5
Drew's SilkroadSecurityAPI is great at what it does but was probably never intended to be used in a scenario where it becomes part of the horrible choke point we know as filters today.

X-Filter is using regular SilkroadSecurityAPI with a few QoL changes so I wouldn't really call it a different API but there are probably also people who have written another one from scratch.


In order to write a better API you'll first have to understand [link], then where the weaknesses are in SilkroadSecurityAPI, and in particular how it interacts with C#.
Some people have ported the API to C++ for performance reasons, but in reality they're just offsetting the weaknesses instead of working on them, creating more headroom for mistakes made in the higher-level application code.

Before we analyze some of the weaknesses of SilkroadSecurityAPI we have to address why C# plays a big role in this.

C#, or rather the CLR (Common Language Runtime), uses a garbage collector (GC) that is responsible for cleaning up all the objects (Sessions, Packets, Strings) you've created once they're no longer needed. So every time you allocate something on the heap, it has to be collected later. This magic does not come for free, so we have to design around the GC to write high-performance code in C#.

We can achieve this by:
- Pooling Memory
- Pooling Objects
- Avoiding unnecessary copying
- Allocating on the stack instead of heap
- Using value types when appropriate
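
As a rough illustration of the first point (pooling memory), here is a minimal sketch using ArrayPool<byte> from System.Buffers. The class and method names (PooledBufferDemo, ChecksumWithPooledBuffer) are made up for this example and are not part of SilkroadSecurityAPI. The checksum itself is only a placeholder for "some work that needs a scratch buffer".

```csharp
using System;
using System.Buffers;

public static class PooledBufferDemo
{
    // Hypothetical helper: sums the bytes of a payload using a rented
    // scratch buffer instead of allocating a new byte[] per call.
    public static int ChecksumWithPooledBuffer(ReadOnlySpan<byte> payload)
    {
        // Rent may return an array larger than requested,
        // so we slice it down to the length we actually need.
        byte[] rented = ArrayPool<byte>.Shared.Rent(payload.Length);
        try
        {
            Span<byte> scratch = rented.AsSpan(0, payload.Length);
            payload.CopyTo(scratch);

            int sum = 0;
            foreach (byte b in scratch)
                sum += b;
            return sum;
        }
        finally
        {
            // Always hand the buffer back; a buffer that is never returned
            // turns the pool into ordinary allocation again.
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

Rented arrays are not cleared by default, so only the slice you have written yourself should be treated as valid data.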



1. Blowfish

Code:
        public byte[] Encode(byte[] stream, int offset, int length)
        {
            if (length == 0)
                return null;

            var workspace = new byte[GetOutputLength(length)];

            Buffer.BlockCopy(stream, offset, workspace, 0, length);
            for (int x = length; x < workspace.Length; ++x)
                workspace[x] = 0;

            for (int x = 0; x < workspace.Length; x += 8)
            {
                uint l = BitConverter.ToUInt32(workspace, x + 0);
                uint r = BitConverter.ToUInt32(workspace, x + 4);
                Blowfish_encipher(ref r, ref l);
                Buffer.BlockCopy(BitConverter.GetBytes(r), 0, workspace, x + 0, 4);
                Buffer.BlockCopy(BitConverter.GetBytes(l), 0, workspace, x + 4, 4);
            }

            return workspace;
        }
If we look at the Encode method of the Blowfish class, there are several issues.

1. It allocates a new buffer for the output (workspace).
2. It copies the input (stream) into the output (workspace).
3. It clears the part of the array where no data has been copied over. (This is unnecessary, as new arrays are always zero-initialized in C#.)
4. BitConverter.GetBytes allocates a result array, which is then copied back into the output (workspace).

This could be improved by using unsafe pointers, but .NET Core has brought us Span<T>, so we can avoid a lot of these issues while still writing safe code.

Span<T> lets us reinterpret the same memory as a sequence of any blittable data type with little to no overhead, so we can encipher without converting and copying back and forth between parts of the byte[] and uints.

Code:
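        // Requires: using System; using System.Runtime.InteropServices;
        // Returns the number of bytes transformed (complete 8-byte blocks only).
        // The original snippet had no return statement; returning the byte count is an assumption.
        internal int Transform(TransformMode transformMode, in ReadOnlySpan<byte> input, in Span<byte> output, int length)
        {
            // Reinterpret both buffers as 32-bit words; no data is copied here.
            var input32 = MemoryMarshal.Cast<byte, uint>(input);
            var output32 = MemoryMarshal.Cast<byte, uint>(output);

            int blockCount = length / 8;
            int blockIndex = 0;
            for (int i = 0; i < blockCount; i++)
            {
                // Read both halves first so input and output may be the same buffer.
                var left = input32[blockIndex + 0];
                var right = input32[blockIndex + 1];

                if (transformMode == TransformMode.Encode)
                    this.Encipher(ref left, ref right);
                else if (transformMode == TransformMode.Decode)
                    this.Decipher(ref left, ref right);

                output32[blockIndex + 0] = left;
                output32[blockIndex + 1] = right;

                blockIndex += 2;
            }

            return blockCount * 8;
        }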
(Transforming the final, partial block is omitted for simplicity.)

With this method you can also easily encrypt into the same buffer, which gets rid of both the output (workspace) allocation and the input-to-output copy, since the transform writes the output anyway.
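
To make the in-place idea concrete, here is a small self-contained sketch. It is not Blowfish: ToyEncipher is a stand-in (a plain XOR, which is its own inverse) so the example runs without the real key schedule, and the names InPlaceTransformDemo and TransformInPlace are invented for illustration. What it is meant to show is only the MemoryMarshal.Cast pattern operating on a single buffer.

```csharp
using System;
using System.Runtime.InteropServices;

public static class InPlaceTransformDemo
{
    // Stand-in for a real block cipher step. NOT Blowfish.
    // XOR with fixed constants, so applying it twice restores the input.
    private static void ToyEncipher(ref uint left, ref uint right)
    {
        left ^= 0xDEADBEEFu;
        right ^= 0x01234567u;
    }

    // Transforms every complete 8-byte block of the buffer in place and
    // returns the number of bytes processed. Trailing bytes are left untouched.
    public static int TransformInPlace(Span<byte> buffer)
    {
        int usable = buffer.Length & ~7; // round down to a multiple of 8

        // View the same memory as 32-bit words; nothing is allocated or copied.
        Span<uint> words = MemoryMarshal.Cast<byte, uint>(buffer.Slice(0, usable));

        for (int i = 0; i < words.Length; i += 2)
            ToyEncipher(ref words[i], ref words[i + 1]);

        return usable;
    }
}
```

Because the span indexer returns a reference, the two words are modified directly inside the caller's buffer, so no workspace array is needed at all.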

2. Packet

The Packet class is a nice general-purpose class which you can use to read and write your packet data, whether massive or not, but it is not without flaws.
I don't have any fancy tables to prove my point, but some of the write and read methods are rather slow. Furthermore, the underlying MemoryStream in PacketWriter has a default buffer size of 256 bytes, and when it runs out of space it grows by allocating a new buffer double the size of the old one and copying the old buffer into it. Spawn packets can easily exceed this threshold.

It's also probably not a good idea to pool a MemoryStream, as you can't shrink its grown buffer. Given enough time, every pooled Packet instance will have been a spawn packet once and will carry a rather large buffer, wasting memory when the majority of packets are small. But you can try to use the [link].

I'd suggest creating a Packet class that works on a fixed 4096-byte buffer which is allocated from a pool (ArrayPool<T> / MemoryPool<T>) and which itself lives in an object pool as well. Since such a packet can't grow, there is no way to build a massive packet in a single instance anymore. For that you'll need to use multiple regular packets and read and write across them, so a massive packet simply contains a list of packets.
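
A minimal sketch of what such a fixed-size packet could look like is given below. Everything here (the FixedPacket name, the Try* methods, the little-endian field encoding) is an assumption made for illustration and not an existing API. Object pooling and massive-packet chaining are left out to keep it short.

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;

// Hypothetical fixed-capacity packet backed by a pooled array.
public sealed class FixedPacket : IDisposable
{
    public const int Capacity = 4096;

    // Rent may return a larger array; we only ever use the first Capacity bytes.
    private readonly byte[] _buffer = ArrayPool<byte>.Shared.Rent(Capacity);
    private int _length;
    private bool _disposed;

    public int Length => _length;

    // View over the bytes written so far; no copy is made.
    public ReadOnlySpan<byte> Written => _buffer.AsSpan(0, _length);

    // Writes a 16-bit value (little-endian assumed). Returns false when full.
    public bool TryWriteUInt16(ushort value)
    {
        if (_length + sizeof(ushort) > Capacity)
            return false;
        BinaryPrimitives.WriteUInt16LittleEndian(_buffer.AsSpan(_length, sizeof(ushort)), value);
        _length += sizeof(ushort);
        return true;
    }

    // Appends raw bytes. Returns false instead of growing when they do not fit.
    public bool TryWriteBytes(ReadOnlySpan<byte> data)
    {
        if (_length + data.Length > Capacity)
            return false;
        data.CopyTo(_buffer.AsSpan(_length));
        _length += data.Length;
        return true;
    }

    // Makes the instance reusable, e.g. before putting it back into an object pool.
    public void Reset() => _length = 0;

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        ArrayPool<byte>.Shared.Return(_buffer);
    }
}
```

A caller that gets false back from a Try* method would start the next packet in the chain, which is where the "list of packets" for massive data comes in.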

3. Security
I intended to write this section in more detail, but it turned out to be more exhausting than expected.
In essence it comes down to a full rewrite, as you'll need to carefully avoid copying and allocating wherever possible.
- Don't allocate a new list every time you call TransferOutgoing/TransferIncoming.

With a fixed 4096 byte message you can:
- Receive directly into the fixed message buffer.
- Encrypt/decrypt the buffer into itself, avoiding the allocations and copying found in Recv and FormatPacket.

Pointing out every small issue would make an obscenely long list, and not all of it is purely performance-focused.
The SecurityFlags class, for example, can easily be swapped for a [Flags] enum to improve readability and reduce complexity.
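
As a sketch of that last suggestion, a [Flags] enum could look roughly like this. The member names and bit values are assumptions chosen for illustration; check them against the actual SecurityFlags class and the handshake packet before relying on them.

```csharp
using System;

// Illustrative only: names and bit positions are assumed, not taken from the API.
[Flags]
public enum SecurityFlags : byte
{
    Blowfish = 0x02,
    SecurityBytes = 0x04,
    Handshake = 0x08,
    HandshakeResponse = 0x10,
}

public static class SecurityFlagsDemo
{
    // The flag byte from the packet maps directly onto the enum; no per-bit fields needed.
    public static SecurityFlags Parse(byte raw) => (SecurityFlags)raw;

    public static bool Has(SecurityFlags flags, SecurityFlags wanted) => (flags & wanted) == wanted;
}
```

Parsing and serializing then become a single cast in each direction, instead of packing and unpacking individual bit fields.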

4. Other
I'm not saying you should all go and rebuild SilkroadSecurityAPI now, but if you truly care about performance, by all means give it a try.
It won't magically solve all your issues though, as the network and threading design are equally important.

Another problem I've seen with filters is that nobody seems to think about which data structures to use.
Things like List<T> for white/blacklists are easy to avoid where a Dictionary<TKey, TValue> or HashSet<T> is appropriate. Please educate yourself about the [link] so you understand how to judge the [link] and choose accordingly.
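
For example, an opcode blacklist could be backed by a HashSet<ushort>, where Contains is a hash lookup (constant time on average) rather than the linear scan that List<T>.Contains performs. The OpcodeBlacklist class and the opcode values used with it are hypothetical placeholders.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical blacklist wrapper; opcode values passed in are placeholders.
public sealed class OpcodeBlacklist
{
    private readonly HashSet<ushort> _blocked;

    public OpcodeBlacklist(IEnumerable<ushort> opcodes)
    {
        // Duplicates in the input are collapsed automatically by the set.
        _blocked = new HashSet<ushort>(opcodes);
    }

    public int Count => _blocked.Count;

    // Average O(1) lookup, independent of how many opcodes are blocked.
    public bool IsBlocked(ushort opcode) => _blocked.Contains(opcode);
}
```

The check runs for every packet that passes through the filter, so the gap between a linear scan and a hash lookup widens as the list grows.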
08/05/2020 06:56 vietnguyen09#6
This section and I are lucky to have your opinion. I can't overstate how valuable the information you've just written here is; I'm learning so much. Thanks for sharing your knowledge.

I've read your SilkroadSecurity and the Silkroad packet documentation many times. Thanks again for what you've done for us beginners.