Table public.in_vote format
conseo
conseo at polyc0l0r.net
Mon May 20 11:43:48 EDT 2013
Hi Mike,
Problems:
1) The voting table does not store all vote events, but rather updates them in
place. Hence we have no complete history of vote events, but only compiled
snapshots of certain intervals.
2) The current xml string makes it fairly difficult to compile a historical
vote count, because I need to query by timestamp as well.
Proposal:
Change the table format from its current form:
servicename | voteremail | xml
where the xml contains at least a timestamp and the candidate-email (can be
empty), as well as a dart-sector.
to:
timestamp | servicename | voteremail | candidateemail | ds
which basically destructures the xml in plain SQL to make it queryable. If you
would like to have a collection in a field to flexibly save attribtues like ds
(dartsector) without schema changes in the future, I'd propose to use the new
JSON support in Postgres. This would still allow to get some of SQL's query
functionality for data in this field (e.g. filtering by JSON attributes) as
well as validation. (1)
I propose a primary key over the tuple (timestamp, servicename, voteremail)
for now, as these should be unique (you shouldn't cast different votes for one
voter on the same poll at once). We also should have a B-Tree index on
timestamp and any index (probably hash-map) on servicename and voteremail to
make querying fast, but this is not necessary to pin down the data format,
just taking the benefits of SQL at runtime.
Votes are not updated in place then, but added as new events (rows) with a
newer timestamp. You can still get the same query behaviour, by just selecting
the newest vote. You can also trivially compose the current xml form from the
SQL query, so I can't see any big breaks by this change. We need to recreate
the table from current data once, I can do so for you.
What do you think? I can help to adjust the Java code, if you don't like to
put time into the effort.
conseo
(1) http://www.postgresql.org/docs/current/static/functions-json.html
There is also a xml type:
http://www.postgresql.org/docs/current/static/functions-xml.html
but since we use JSON for most communication, I think it is the best fit.
Accessing JSON:
http://people.planetpostgresql.org/andrew/index.php?/archives/249-Using-PLV8-to-index-JSON.html
>From my side we can also keep the field as XML-String as long as we pull
timestamp and candidate out of it.
More information about the Votorola
mailing list