
Ticket #1077 (new enhancement) — at Version 4

Opened 3 years ago

Last modified 23 months ago

Switch to new vdm changeset model

Reported by:	rgrp	Owned by:	kindly
Priority:	awaiting triage	Milestone:	ckan-backlog
Component:	ckan	Keywords:
Cc:		Repository:	ckan
Theme:	none

Description (last modified by kindly)

Have developed a new "changeset" based model for revisioning in vdm. This has several advantages:

Much simpler
Cleaner separation of continuity from changesets
- Supports certain operations that are impossible now (e.g. deleting all changes to a particular object irrespective of whether other objects were changed in the same revisions).
Easier support for pending state and similar behaviour
No need to introduce new tables (and hence migrations) when making something revisioned (or not).
Almost identical API

Possible Disadvantages

Difficult to query revision history. Currently we can find the diffs for a particular package, and these diffs *include* changes to objects associated with the package (e.g. a resource attached to it). With the new model the only way to get this information is to look in the JSON stored in the change object, which is very awkward.
- RP: not sure this is true. You can query on object id very easily in the changeset model. A possible complication is working out which objects are associated with, say, a package (e.g. having to look up the ids of package_tags), but this does not seem more problematic than what you would do in the other model to achieve the same ends.
  - DR: When looking for related objects we do joins between revision tables and the main tables. For example, we join the package_extras_revision table to the package table. We could not do this with the new model, as we would need to look into the change object table's dict for the join, which is painful. Also, the object_ids are tuples at the moment, which are difficult to join on.
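To make the two positions above concrete, here is a rough sketch using an in-memory SQLite table. The change_object table, its columns and the sample rows are assumptions made for illustration only and are not the actual vdm schema. It shows that history for a known object id is a plain WHERE clause, while finding changes to related objects means decoding the stored JSON instead of joining.

```python
import json
import sqlite3

# Hypothetical changeset-style storage: one row per changed object,
# with the object's data serialised as JSON (not the real vdm schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE change_object ("
    " changeset_id TEXT, table_name TEXT, object_id TEXT, data TEXT)"
)
rows = [
    ("c1", "package", "p1", {"name": "gold-prices"}),
    ("c1", "package_extra", "e1", {"package_id": "p1", "key": "source", "value": "a"}),
    ("c2", "package_extra", "e1", {"package_id": "p1", "key": "source", "value": "b"}),
    ("c2", "package_extra", "e2", {"package_id": "p2", "key": "source", "value": "c"}),
]
conn.executemany(
    "INSERT INTO change_object VALUES (?, ?, ?, ?)",
    [(c, t, o, json.dumps(d)) for c, t, o, d in rows],
)


def history(table_name, object_id):
    """Changesets touching one object: a simple lookup on the object id."""
    cur = conn.execute(
        "SELECT changeset_id FROM change_object"
        " WHERE table_name = ? AND object_id = ? ORDER BY changeset_id",
        (table_name, object_id),
    )
    return [r[0] for r in cur]


def package_changesets(package_id):
    """Changesets touching a package or its extras.

    The link from an extra to its package lives inside the JSON payload,
    so it has to be decoded row by row rather than joined on in SQL.
    """
    found = set(history("package", package_id))
    cur = conn.execute(
        "SELECT changeset_id, data FROM change_object"
        " WHERE table_name = 'package_extra'"
    )
    for changeset_id, data in cur:
        if json.loads(data).get("package_id") == package_id:
            found.add(changeset_id)
    return sorted(found)
```

Calling history("package_extra", "e1") only needs the object id, whereas package_changesets("p1") has to scan and decode every package_extra change, which is the awkwardness DR describes.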

Does not give us anything extra if we simplify our current use of vdm (see alternative below).
- RP: not quite true. E.g. pending support and API.
  - DR: pending support would be there if we did not use any stateful lists/dicts and used vdm as copy-on-write only, with just a revision_id. I do not know what you mean by API.

Requires a large change to the database structure.

Implementation

The main challenge with this change is schema and data migration.

Migration

Every revisioned object has a revision_id and revision attribute.

Approximate algorithm:

Revision -> Changeset

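for revtype in [PackageRevision, ...]:
    # table is the name of the table that revtype revisions
    for objrev in all rows of revtype:
        # every revisioned object carries its revision (see above)
        cset = lookupchangeset(objrev.revision)
        ChangeObject(cset, (table, objrev.id), dictize(objrev))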

Question:

Does pkg include tag attributes or not? Or do we have to dictize pkgrev, pkg2tagrev, and tag? Probably the latter.
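A minimal runnable sketch of the approximate algorithm above, using plain Python dicts in place of real revision tables. The Changeset and ChangeObject classes, the dictize helper and the sample rows are all hypothetical stand-ins rather than the vdm API. It assumes the "latter" answer to the question, i.e. each revision table is dictized on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Changeset:
    # Hypothetical stand-in for a vdm changeset; one per old revision.
    revision_id: str
    changes: list = field(default_factory=list)


@dataclass
class ChangeObject:
    # One changed object in a changeset: a (table, id) key plus its data.
    object_id: tuple
    data: dict


def dictize(row):
    # Assumed helper: copy the revision row without its revision bookkeeping.
    return {k: v for k, v in row.items() if k != "revision_id"}


def migrate(revision_tables):
    """Convert {table_name: [revision rows]} into changesets keyed by revision_id."""
    changesets = {}
    for table, rows in revision_tables.items():
        for row in rows:
            # Look up (or create) the changeset for this row's old revision.
            cset = changesets.setdefault(
                row["revision_id"], Changeset(row["revision_id"])
            )
            cset.changes.append(ChangeObject((table, row["id"]), dictize(row)))
    return changesets


# Sample data: two package revisions and one package-tag revision.
revision_tables = {
    "package": [
        {"id": "p1", "revision_id": "r1", "name": "gold-prices"},
        {"id": "p1", "revision_id": "r2", "name": "gold-prices-v2"},
    ],
    "package_tag": [
        {"id": "pt1", "revision_id": "r1", "package_id": "p1", "tag_id": "t1"},
    ],
}
changesets = migrate(revision_tables)
```

Each old revision becomes one changeset, and rows from different revision tables that share a revision_id end up as separate change objects inside it.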

Alternative

Instead of restructuring the whole database to fit the new changeset model, simplifying our use of the current vdm could be adequate: remove the stateful lists/dicts and handle this state ourselves in the logic layer. The vdm would then be just a simple copy-on-write mechanism at the table level. This seems to cover all the advantages/disadvantages above.
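As an illustration of what "copy on write at the table level" could mean, here is a small sketch. The CopyOnWriteTable class and its methods are hypothetical and do not reflect how vdm is implemented. Each write stamps the row with a revision_id and first copies the previous version aside, and there are no stateful lists or dicts to track.

```python
import copy


class CopyOnWriteTable:
    """Hypothetical table keeping current rows plus copies of old versions."""

    def __init__(self):
        self.current = {}    # object id -> current row
        self.revisions = []  # superseded rows, each carrying its revision_id

    def write(self, object_id, revision_id, **values):
        old = self.current.get(object_id)
        if old is not None:
            # Copy on write: preserve the previous version before changing it.
            self.revisions.append(copy.deepcopy(old))
        row = dict(old or {}, id=object_id, revision_id=revision_id)
        row.update(values)
        self.current[object_id] = row
        return row

    def history(self, object_id):
        """All versions of one object, oldest first, ending with the current row."""
        old = [r for r in self.revisions if r["id"] == object_id]
        cur = self.current.get(object_id)
        return old + ([cur] if cur else [])


packages = CopyOnWriteTable()
packages.write("p1", "r1", name="gold-prices")
packages.write("p1", "r2", name="gold-prices-v2")
```

Anything relational, such as which tags belong to a package at a given revision, would then be worked out in the logic layer from the revision_id values.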

Change History

comment:1 Changed 3 years ago by rgrp

Priority changed from awaiting triage to critical
Description modified (diff)

comment:2 Changed 3 years ago by kindly

Description modified (diff)

comment:3 Changed 3 years ago by rgrp

state set to draft
Description modified (diff)

comment:4 Changed 3 years ago by kindly

Description modified (diff)
