Ticket #1077 (new enhancement) — at Version 3

Opened 3 years ago

Last modified 23 months ago

Switch to new vdm changeset model

Reported by: rgrp Owned by: kindly
Priority: awaiting triage Milestone: ckan-backlog
Component: ckan Keywords:
Cc: Repository: ckan
Theme: none

Description (last modified by rgrp)

A new "changeset"-based model for revisioning has been developed in vdm. It has several advantages:

  • Much simpler
  • Cleaner separation of continuity from changesets
    • Supports certain operations that are impossible now (e.g. deleting all changes to a particular object, irrespective of whether other objects were changed in the same revisions).
  • Easier support for pending state and similar behaviour
  • No need to introduce new tables (and hence migrations) when making something revisioned (or not).
  • Almost identical API
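
To make the shape of the model concrete, here is a minimal in-memory sketch of a changeset holding change objects keyed by (table, id), each carrying a JSON snapshot. The class layout, field names and the add() helper are assumptions for illustration only; they are not taken from vdm's actual code.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, Tuple

# (table name, primary key), mirroring the "(table, id)" pair used in this ticket
ObjectId = Tuple[str, str]


@dataclass
class ChangeObject:
    object_id: ObjectId
    data: str  # JSON snapshot of the object's values after the change


@dataclass
class Changeset:
    id: str
    message: str = ""
    changes: Dict[ObjectId, ChangeObject] = field(default_factory=dict)

    def add(self, table, pk, values):
        """Record the new state of one object inside this changeset."""
        key = (table, pk)
        self.changes[key] = ChangeObject(key, json.dumps(values, sort_keys=True))
        return self.changes[key]


# Two different kinds of object are recorded without any per-type revision table.
cset = Changeset(id="cs-1", message="rename package")
cset.add("package", "pkg-1", {"name": "annakarenina", "title": "Anna Karenina"})
cset.add("resource", "res-1", {"package_id": "pkg-1", "url": "http://example.com/a.csv"})
```

Because every object type goes through the same generic store, making another type revisioned would need no new table in this sketch, which is the point of the "no new tables" advantage above.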

Possible Disadvantages

  • Difficult to query revision history. Currently we have a way of finding the diffs of particular packages. These diffs *include* changes to objects associated with packages (e.g. a resource attached to a package). With the new model, the only way to get this information is by looking in the JSON stored in the change object, which is very awkward.
    • RP: not sure this is true. You can query on object id very easily in the changeset model. A possible complication is working out which objects are associated with, say, a package (e.g. having to look up the ids of package_tags), but this does not seem more problematic than what you would do in the other model to achieve the same ends.
  • Does not give us anything extra if we simplify our current use of vdm (see Alternative below).
    • RP: not quite true. E.g. pending support and API.
  • A large change to database structure needs to happen.
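
To illustrate RP's point about querying on object id, the sketch below filters a flat list of change rows by (table, id) without opening the stored JSON. The row layout and the history() and package_history() helpers are hypothetical. The caller still has to supply the ids of associated objects, which is the complication noted above.

```python
import json

# Hypothetical flat store of change objects:
# (changeset_id, table, object_id, json_data)
changes = [
    ("cs-1", "package", "pkg-1", json.dumps({"title": "Old"})),
    ("cs-1", "resource", "res-1", json.dumps({"package_id": "pkg-1", "url": "http://example.com/a"})),
    ("cs-2", "package", "pkg-2", json.dumps({"title": "Other"})),
    ("cs-3", "resource", "res-1", json.dumps({"package_id": "pkg-1", "url": "http://example.com/b"})),
]


def history(object_keys):
    """Return the change rows touching any of the given (table, id) keys, in stored order."""
    wanted = set(object_keys)
    return [row for row in changes if (row[1], row[2]) in wanted]


def package_history(package_id, related):
    """History of a package plus the associated objects passed in by the caller."""
    return history([("package", package_id)] + list(related))


rows = package_history("pkg-1", related=[("resource", "res-1")])
changeset_ids = [row[0] for row in rows]
```

In a database the same filter would be a lookup on the table and object id columns, so the JSON payload only needs decoding when a diff is actually displayed.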

Implementation

  • The main challenge with this change is schema and data migration

Migration

Every revisioned object has a revision_id and revision attribute.

Approximate algorithm:

Revision -> Changeset

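for revtype in [PackageRevision, ...]:
    for objrev in revtype:  # every row in this revision table
        # each revisioned object carries the revision_id it belongs to
        changeset = lookupchangeset(objrev.revision_id)
        ChangeObject(changeset, (table, objrev.id), dictize(objrev))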
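
A runnable version of the approximate algorithm, using plain dicts in place of database rows. Everything here is an assumption made for illustration: the sample rows, the one-changeset-per-revision mapping in lookup_changeset(), and a dictize() that simply drops the revision_id column.

```python
import json

# Stand-in revision rows; real rows would come from the *_revision tables.
package_revisions = [
    {"id": "pkg-1", "revision_id": "rev-1", "title": "Old title"},
    {"id": "pkg-1", "revision_id": "rev-2", "title": "New title"},
]
resource_revisions = [
    {"id": "res-1", "revision_id": "rev-2", "url": "http://example.com/a"},
]

changesets = {}  # revision_id -> list of change objects


def lookup_changeset(revision_id):
    """Assume one changeset per old revision, created on first use."""
    return changesets.setdefault(revision_id, [])


def dictize(row):
    """Keep the object's own values; drop the revision bookkeeping column."""
    return {k: v for k, v in row.items() if k != "revision_id"}


def migrate(tables):
    """Turn every revision row into a change object inside its changeset."""
    for table, rows in tables.items():
        for row in rows:
            cset = lookup_changeset(row["revision_id"])
            cset.append({
                "object_id": (table, row["id"]),
                "data": json.dumps(dictize(row), sort_keys=True),
            })


migrate({"package": package_revisions, "resource": resource_revisions})
```

Objects that were changed in the same revision (here the package and the resource in rev-2) end up in the same changeset.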

Question:

  • Does the dictized pkg include tag attributes or not? Or do we have to dictize pkgrev, pkg2tagrev and tag separately? Probably the latter.

Alternative

Instead of restructuring the whole database to fit the new changeset model, it could be adequate to simplify our use of the current vdm by removing the stateful lists/dicts and handling this state ourselves in the logic layer. The vdm would then be just a simple copy-on-write mechanism at the table level. This seems to cover all the advantages/disadvantages above.
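
One possible reading of "copy on write at the table level", sketched with in-memory stand-ins for a table and its revision table. The save() helper and the row layout are assumptions, not vdm's current behaviour: before a row is overwritten, its previous state is copied into the revision table.

```python
import copy

package = {}           # current rows: id -> row
package_revision = []  # history rows, appended before every overwrite


def save(row, revision_id):
    """Copy the existing row into the revision table, then overwrite it."""
    old = package.get(row["id"])
    if old is not None:
        package_revision.append(copy.deepcopy(old))
    package[row["id"]] = dict(row, revision_id=revision_id)


save({"id": "pkg-1", "title": "Old title"}, "rev-1")
save({"id": "pkg-1", "title": "New title"}, "rev-2")
```

Related state such as a package's tags would be saved by the logic layer through the same call, one table at a time, rather than through stateful lists/dicts inside vdm.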

Change History

comment:1 Changed 3 years ago by rgrp

  • Priority changed from awaiting triage to critical
  • Description modified

comment:2 Changed 3 years ago by kindly

  • Description modified

comment:3 Changed 3 years ago by rgrp

  • state set to draft
  • Description modified