Table public.in_vote format

conseo conseo at polyc0l0r.net
Mon May 20 11:43:48 EDT 2013


Hi Mike,

Problems:
1) The voting table does not store all vote events, but rather updates them in 
place. Hence we have no complete history of vote events, but only compiled 
snapshots of certain intervals. 
2) The current xml string makes it fairly difficult to compile a historical 
vote count, because I need to query by timestamp as well.

Proposal:

Change the table format from its current form: 

servicename | voteremail | xml

where the xml contains at least a timestamp and the candidate-email (can be 
empty), as well as a dart-sector.

to:

timestamp | servicename | voteremail | candidateemail | ds

which basically destructures the xml in plain SQL to make it queryable. If you 
would like to have a collection in a field to flexibly save attribtues like ds 
(dartsector) without schema changes in the future, I'd propose to use the new 
JSON support in Postgres. This would still allow to get some of SQL's query 
functionality for data in this field (e.g. filtering by JSON attributes) as 
well as validation. (1)

I propose a primary key over the tuple (timestamp, servicename, voteremail) 
for now, as these should be unique (you shouldn't cast different votes for one 
voter on the same poll at once). We also should have a B-Tree index on 
timestamp and any index (probably hash-map) on servicename and voteremail to 
make querying fast, but this is not necessary to pin down the data format, 
just taking the benefits of SQL at runtime.

Votes are not updated in place then, but added as new events (rows) with a 
newer timestamp. You can still get the same query behaviour, by just selecting 
the newest vote. You can also trivially compose the current xml form from the 
SQL query, so I can't see any big breaks by this change. We need to recreate 
the table from current data once, I can do so for you.

What do you think? I can help to adjust the Java code, if you don't like to 
put time into the effort.

conseo

(1) http://www.postgresql.org/docs/current/static/functions-json.html
There is also a xml type:
http://www.postgresql.org/docs/current/static/functions-xml.html
but since we use JSON for most communication, I think it is the best fit.
Accessing JSON: 
http://people.planetpostgresql.org/andrew/index.php?/archives/249-Using-PLV8-to-index-JSON.html
>From my side we can also keep the field as XML-String as long as we pull 
timestamp and candidate out of it.



More information about the Votorola mailing list