Heyho guys,
My question is quite simple, but it seems the answer is really hard (maybe even impossible?).
I am searching for a fast, collision-free one-way function. It does not have to be very secure. Something as simple as a "factorization" of a given number would be enough, if we could make it collision-free.
It is really important that I can guarantee no collisions at all, or only very, very few collisions (if zero collisions is not possible).
The second really important thing is speed. The function must be as fast as possible.
Security is nearly unimportant: if breaking one "hash" takes about 2-5 minutes, that is enough. To break the "complete problem" you would have to break 100 to 200 hashes, which would then take around 200 to 1000 minutes, and this is really more than enough.
Even 10 minutes for 100 successful breaks would be enough.
I know there are several really good and fast perfect hashing algorithms.
But the problem is that I don't know the full set of inputs in advance.
Here is exactly how I want to use it:
1. The input of my program will be a set of strings (length can vary from a few characters to 100 and more). They will all be given as plaintext. So at this point I could still use a perfect hash.
2. After hashing all strings I will write them to a file, one "hashed" value per line.
Then I've got a second program which I want to distribute together with my generated text file.
The input of the second program will be a single word.
This program calculates the hash of that word and searches for it in my generated text file.
The problem is that the user should be able to add new hashed values, too.
So at that point I am no longer able to use a perfect hash.
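To make the workflow above concrete, here is a rough Python sketch of both programs. It is only an illustration under assumptions: SHA-256 from `hashlib` stands in for the hash function I am still looking for, and the names `hash_word`, `write_hash_file`, `contains_word`, `add_word` and the file name `hashes.txt` are made up for this example.

```python
import hashlib
import os
import tempfile


def hash_word(word: str) -> str:
    """Placeholder one-way function (assumption: SHA-256 over the UTF-8 bytes)."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


def write_hash_file(path: str, words: list[str]) -> None:
    """Program 1: hash every plaintext string and write one hash per line."""
    with open(path, "w", encoding="ascii") as f:
        for word in words:
            f.write(hash_word(word) + "\n")


def contains_word(path: str, word: str) -> bool:
    """Program 2: hash the entered word and search for that hash in the file."""
    target = hash_word(word)
    with open(path, encoding="ascii") as f:
        return any(line.strip() == target for line in f)


def add_word(path: str, word: str) -> None:
    """Program 2: let the user append a new hashed value (skipped if already present)."""
    if not contains_word(path, word):
        with open(path, "a", encoding="ascii") as f:
            f.write(hash_word(word) + "\n")


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "hashes.txt")  # hypothetical file name
    write_hash_file(path, ["apple", "banana"])
    print(contains_word(path, "apple"))   # word that was hashed by program 1
    print(contains_word(path, "cherry"))  # word that is not in the file yet
    add_word(path, "cherry")
    print(contains_word(path, "cherry"))  # found after the user added it
```

The point of the sketch is that the file only ever stores hashes, and that `add_word` works without knowing the other inputs, which is exactly what rules out a perfect hash.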
Why can't I use an encrypted database or something similar?
The answer is quite simple:
The second program will be open source, so if I use a fixed password/username/etc. for encryption/authentication, everyone could easily read all hashes and their corresponding values from the database, or decrypt the text file.
So does anyone know a fast and collision-free (or at least nearly collision-free) one-way function for strings? (One for ints would be OK too; then I'll group chars into ints.)
What I thought about was something like this:
Just multiply the value of each character into one big variable, e.g.:
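PHP Code:
// Needs the GMP extension for arbitrary-precision integers.
$product = gmp_init(1); // start at 1; starting at 0 would zero the whole product
$factorized = [];
for ($i = 0; $i < strlen($str); $i++) {
    $code = ord($str[$i]); // numeric value of the character
    // DoFactorization() is my own helper (not shown here)
    $factorized[] = DoFactorization($code);
    $product = gmp_mul($product, $code);
}
// do something with the factorized numbers
// solving the factorization of the calculated bignum is really hard
// but solving the factorization for my small numbers is quite easy, so it should be fast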
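This function is not collision-free at all. So I thought I'd use the generated value as the key for an encryption algorithm.
But this won't guarantee me collision-freeness either,
because two different keys with different plaintexts may result in the same "hashed" value.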
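The collisions of the plain character product are easy to see in a small Python check. This is only a sketch of the idea; the helper name `char_product` is made up for this illustration.

```python
from itertools import permutations


def char_product(s: str) -> int:
    """Multiply the code points of all characters (Python ints are arbitrary precision)."""
    product = 1
    for ch in s:
        product *= ord(ch)
    return product


if __name__ == "__main__":
    # Multiplication is commutative, so every reordering of a string collides.
    products = {char_product("".join(p)) for p in permutations("abc")}
    print(len(products))  # a single distinct value for all six orderings

    # Strings with different characters can collide as well:
    # '!' * 'b' = 33 * 98 and 'B' * '1' = 66 * 49 are the same number.
    print(char_product("!b") == char_product("B1"))
```

So any function built only on this product inherits those collisions, whatever is done with the value afterwards.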
So has anyone got a good suggestion?
Will e.g. using SHA-256 and appending the length of the original string give me enough collision-freeness to safely search for strings in the way I want to? Even if that fits my needs, generating a 256-bit number for strings with only one character isn't what I really want to do, but it's OK if there are no other solutions. And even if this is the only solution, there is one more problem:
I would need a copyright-free/zlib-licensed implementation of this hash function.
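For reference, this is roughly what I mean by SHA-256 plus the length, sketched in Python with the standard `hashlib` module. The function name `sha256_with_length` and the `:` separator are my own assumptions for the example, not a fixed format.

```python
import hashlib


def sha256_with_length(text: str) -> str:
    """SHA-256 hex digest of the UTF-8 bytes, followed by ':' and the byte length."""
    data = text.encode("utf-8")
    return f"{hashlib.sha256(data).hexdigest()}:{len(data)}"


if __name__ == "__main__":
    for text in ["a", "hello", "x" * 1000]:
        digest, length = sha256_with_length(text).split(":")
        # The digest part is always 64 hex characters (256 bits),
        # no matter whether the input has 1 or 1000 characters.
        print(len(digest), length)
```

This also shows the size issue I mentioned: a one-character string still produces a full 256-bit value.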
Just to note:
There will be many of these text files generated (I think around 1,000,000 of them, and more as time goes on). So the hash must be extremely good for nearly any combination of strings from around 1 char to 1000 chars. Once a user has completely worked through the second program, there will be around 1000 words/hashed values in the generated text file.
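To get a feeling for these numbers, here is a rough Python estimate of the chance that one file with 1000 entries contains a collision. It uses the usual birthday approximation and assumes an ideal hash whose outputs are uniformly random, which a real function only approximates; the name `collision_probability` is made up for this sketch.

```python
import math


def collision_probability(n_items: int, bits: int) -> float:
    """Birthday approximation 1 - exp(-n(n-1)/2 / 2^bits) for an ideal hash."""
    pairs = n_items * (n_items - 1) / 2
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-pairs / 2.0 ** bits)


if __name__ == "__main__":
    files = 1_000_000  # number of generated text files from the note above
    for bits in (32, 64, 128, 256):
        per_file = collision_probability(1000, bits)
        print(f"{bits:3d} bits: per file ~ {per_file:.3e}, "
              f"expected files with a collision ~ {files * per_file:.3e}")
```

Under that assumption the per-file probability is on the order of 1e-4 for a 32-bit output and 1e-14 for a 64-bit output, so an output much shorter than 256 bits might already give "very, very few" collisions.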
Thanks in advance for any suggestion. :)