the easiness is important for the initial coding and the maintenance but the efficiency is more important. there, efficiency means payload of stored data, transferred ones & ability to solve the questions thanks to these data.
let check your system, you are proposing 2 possible tables:
Code:
CREATE TABLE CharsUnknown1 (
CharName varchar(30) NOT NULL,
UserID varchar(12) NOT NULL
}
-- or
CREATE TABLE CharsUnknown2 (
CharName varchar(30) NOT NULL,
UserID varchar(12) NOT NULL,
Counter int NOT NULL
}
(the lengths of the varchar fields are defined by their definition in the Chars table)
in first case you will insert a new record each time the CharName + UserID couple fires an event; in second case, you will insert a new record if none odf existing contains the given CharName and increase the Counter field of matching record otherwise.
at this point, the first option is "easier", indeed there is no need to check if a record exists and then insert a new one or update a field of a specific record, but simply to insert a new record, it can't be easier, but meantime it is definitively a waste of space and time !
(also note that a record with Counter=0 can be inserted when the toon is created removing the need to test the existence of the record).
the first point to be addressed is indeed the stored data and their format.
there, the two fields are correlated (the value of UserID is implied by a value of CharName), so UserID is useless; futhermore the varchar records should never be duplicated, nor used for comparison to order rows or to count some of them; of course it is sometime needed but whenever possible the process should use numerical fields and not text fields (substantially a comparaison of 2 int requires 1 test, a comparaison of 2 varchar(30) requires up to 30 tests).
the second point to take into account is the amount of data you will generate.
if you will generate one event and thus one record per day, both table formats are acceptable; but if you generate 1 event every minute, they will generate a huge amount of data (with 46 bytes per record, the second table need +/- 64k per day, 2MB per month).
next point will be how to designate a row / a record within this table?
a SQL table has no intrinsic row identifier, in some shaiya tables the RowID is useless, but only when the tables already have an unique identifier; here you don't provide any row identifier.
if you use the format 1 (table contains n rows with p names present 1 up to n times) there is no way to randomly choose a name! you can generate a number m in [0..n[ but you can not select the row m (no way except a loop over the first m records after you select all, such an algorithm would be very inefficient and will be possible in a client only, not inside a SP)
so the 2 possibles tables should be:
Code:
CREATE TABLE CharsUnknown1 (
RowID int IDENTITY(0,1) NOT NULL,
CharID int NOT NULL
}
-- or
CREATE TABLE CharsUnknown2 (
CharID int NOT NULL,
Counter int NOT NULL
}
-- the 2nd format can use an algorithm not relying on row index
but the question remains, what format is most suitable?
it depends on several criteria, but all things being equal, this will largely depend on the context of achieving.
if all is done in a SQL stored procedure, the first format - that relies on a potentially important table - will be fine because no data are actually extracted & forwarded as long as the process stays in the SP.
one can so code:
Code:
declare @countEvents int = (select count(*) from CharsUnknown1)
declare @anEventRow int = round(rand() * @countEvents, 0)
declare @aWinnerID int = (select CharID from CharsUnknown1 where RowID=@anEventRow)
but if it is done in a script language (php, asp, ...) it means that the data will be actually read from the DB and sent over the net (LAN or internet), the weight of the transfer must be take into account, so we can prefer to use the second table format.
one can for instance code:
Code:
$connectionInfo = array(
'UID' => 'foo',
'PWD' => 'bar',
'Database' => 'PS_UserData'
);
$conn = sqlsrv_connect('127.0.0.1', $connectionInfo);
// get sum of all 'events'
$set = sqlsrv_query($conn, "SELECT SUM(Counter) FROM CharsUnknown2");
$row = sqlsrv_fetch_array($set);
$max = intval($row[0]);
sqlsrv_free_stmt($set);
// randomly choose an event id
$evt = rand(0, $max - 1);
// find selected toon according weights of each records
$charID = 0;
$set = sqlsrv_query($conn, "SELECT CharID,Counter FROM CharsUnknown2 ORDER BY Counter DESC");
$cur = 0;
while ($row = sqlsrv_fetch_array($set)){
$pts = intval($row[1]);
if ($cur + $pts > $evt){
$charID = $row[0];
break;
}
$cur += $pts;
}
sqlsrv_free_stmt($set);
sqlsrv_close($conn);
print "the winner is $charID\n";
this code requires more lines than the previous, but it is not intrinsicly more or less easy, it complies to different constraints and uses another algorithm to perform the operation (and we can imagine several other ways to process).
(my main purpose was not to answer the closed question: "what is the easier?", but to reformulate the constraints and opportunities showing a possible analysis of the problem).