Structure of in_vote table

Tue Jun 4 07:58:37 EDT 2013

Conseo and I spoke.  Here's a summary of our conclusions (C, please
correct anything that's wrong).

We're concerned with the problem of information loss; specifically, we
have no voting history.  The solution we agreed on is to add a serial
column to the voter input table.

    serial  (bigserial)

We spoke of adding other columns if the need arises in future.  We
also spoke of one possible advantage of the current minimal structure:
it's compatible with all voting methods and might therefore underpin
the mirroring network and its frontal caches, labeled here as "p2p
caches": http://mail.zelea.com/list/votorola/2013-April/001686.html
With this in mind, we agreed to generalize the other column names:

    serial  (bigserial)
    poll    (character varying)      primary keys
    voter   (character varying)
   -----------------------------
    data    (character varying)

Voting data from any source may be streamed and stored in this common
form regardless of the voting method, provided that each source gets a
separate, dedicated channel/store.  The data "payload" includes the
timestamp (correctness not guaranteed), method-specific content such
as a candidate identifier, and external decorations such as a dart
sector.  The data for a given serial are read-only; they must never
change.
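
To make the intent concrete, here is a rough Java sketch of the table
as an append-only log.  It is an illustration only, not the actual
Votorola code: the class VoteLog, its method names and the in-memory
list are all hypothetical stand-ins for the real table, and the
payload is treated as an opaque string because its format is not
settled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical in-memory stand-in for the generalized voter input table. */
final class VoteLog {

    /** One immutable row: serial, poll, voter, and an opaque data payload. */
    static final class Entry {
        final long serial;
        final String poll;
        final String voter;
        final String data; // assumed opaque; payload format is not defined here

        Entry(long serial, String poll, String voter, String data) {
            this.serial = serial;
            this.poll = poll;
            this.voter = voter;
            this.data = data;
        }
    }

    private final List<Entry> entries = new ArrayList<>();
    private long lastSerial = 0; // plays the role of the bigserial counter

    /** Appends a new row and returns its serial; existing rows are never modified. */
    synchronized long append(String poll, String voter, String data) {
        Entry e = new Entry(++lastSerial, poll, voter, data);
        entries.add(e);
        return e.serial;
    }

    /** Returns the voter's most recent row in the poll with serial <= asOfSerial, if any. */
    synchronized Optional<Entry> latestAsOf(String poll, String voter, long asOfSerial) {
        // Rows are stored in serial order, so scan backwards from the newest.
        for (int i = entries.size() - 1; i >= 0; i--) {
            Entry e = entries.get(i);
            if (e.serial <= asOfSerial && e.poll.equals(poll) && e.voter.equals(voter)) {
                return Optional.of(e);
            }
        }
        return Optional.empty();
    }

    /** Read-only snapshot of the full history in serial order. */
    synchronized List<Entry> history() {
        return List.copyOf(entries);
    }
}
```

A changed vote is a new row with a higher serial rather than an
update, so latestAsOf() can answer "what was this voter's input as of
serial N" without consulting the timestamps at all.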

This is also compatible with mirror-based verification facilities:
http://zelea.com/project/votorola/a/count/verification.xht#rationalization

A given store may nevertheless break out the data into columns.  We'll
do that ourselves if/when the need arises, though it'll be less robust
in the face of structural changes.

Mike

conseo said:
> Hi Mike,
> 
> > 
> > I wasn't objecting in the end to the timestamp datatype, but to the
> > "with time zone" qualifier.  My understanding is that it doesn't store
> > the time zone as implied; instead it converts the writes to UTC and
> > the reads to local time.
> I cannot reproduce your described behaviour through JDBC 
> (java.sql.Timestamp.getTime() returns millisecs since beginning of Unix Time, 
> which is UTC, even though Postgres shows them with MEZ (+01) offset). If we 
> leave out the timezone, then timestamp buys us little more than better SQL 
> semantics. Take the bigint millis since Unix time for simplicity, it is not 
> too important for me, only the actual data is and here the timezone would add 
> information. 
> 
> > But now I think this is all a red herring.
> > There's a larger underlying problem.  We need a clean sequence that
> > isn't liable to information loss.  We borked the solution and it's
> > mostly my fault.  I've made this mistake in the past; I try to solve a
> > sequencing problem with timestamps and instead I create a mess.  The
> > correct solution is a serial counter (probably bigserial).  I think
> > the primary keys should be:
> > 
> >     serial, serviceName, voterEmail
> 
> Whatever you prefer, I will have to transform it anyway. The relevant voter-
> generated data is: 
> <timestamp, globally unique voter-id, globally unique consensus target id>
> Unless this is shared through mirroring, it is of no use, all additional data 
> is nice, but not mandatory and might in fact be just an accidental attribute 
> of computation (like the dart-sector).
> 
> > 
> > All the other data is parked in an xml column for max flex.  And since
> > we have so few servers to overhaul, I think we should take advantage
> > of the overhaul to normalize the name of the poll column:
> > 
> >     serial, pollName, voterEmail    (this is better)
> > 
> > So the serial column will eventually enable us to do historical
> > queries for purposes of verification.  If we ever need to query by
> > calendar date, then we could construct a sister table to cross-index
> > the serial numbers and calendar days, or whatever precision we need
> > (we'll make it reconstructable).  It will deal with any anomalies in
> > the vote timestamps.  Timestamps are not guaranteed to be correct nor
> > even correctly ordered, so we won't depend on them for anything
> > important.  We'll document this in the API.
> > 
> > What do you think?  (please post to list if you prefer)
> Sounds reasonable. I still would not put timestamp and candidate in xml, as 
> they are first-class attributes imo, but it is your engine, mine will have to 
> extract that information first to count, which is supposed to happen anyway in 
> vote-mirroring. I hoped we could share the table and I could code an in-place 
> alternative, but this should still work out, so don't worry. All I really need 
> is the voter-data. This is already in there, so feel free to change as you 
> wish.
> 
> conseo