= Distributed Data and Syncing Between CKAN Instances =

Aim: '''to support pulling and pushing metadata between different CKAN registries while ''preserving'' history.'''

This problem has strong similarities to distributed version control and distributed databases.

= Research =

 * Distributed version control systems:
   * Mercurial: for an overview, see [http://mercurial.selenic.com/wiki/Mercurial?action=AttachFile&do=get&target=Hague2009.pdf Inside a Distributed VCS]
 * Distributed databases. NB: we want to preserve history, not just allow multiple distributed writes.

= Use cases =

== 1. data.gov.uk and ckan.net ==

We want to make the data on data.gov.uk (hmg.ckan.net) available in a public CKAN instance. We will therefore end up with:

 1. The package on data.gov.uk
 2. The package on ckan.net

We need to keep these two representations of the package in sync.

Remarks: this is easy if edits are made in only one place. But what if we want to let the community edit on ckan.net? There are two options:

 1. Have two copies on ckan.net, one community-owned and one locked down.
   * Pro: easy to keep the two separate.
   * Con: terrible user experience, and the two copies can still diverge.
 2. Have one world-editable copy that gets ''merged back'' into the official data every so often.
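
To make the "merged back" option more concrete, here is a minimal sketch of a field-by-field three-way merge. It assumes a package can be treated as a flat dict of metadata fields and that the last synced state is kept as a common base. The function name `merge_package`, the field names and the "official wins on conflict" rule are all illustrative assumptions, not part of CKAN, and the sketch does not yet record any history.

```python
def merge_package(base, official, community):
    """Three-way merge of flat metadata dicts.

    base      -- the state both copies were last synced from
    official  -- current state on the official registry
    community -- current state on the world-editable copy
    Returns (merged, conflicts), where conflicts maps a field name
    to the (official, community) values that disagree.
    """
    merged, conflicts = {}, {}
    for key in sorted(set(base) | set(official) | set(community)):
        b, o, c = base.get(key), official.get(key), community.get(key)
        if o == c:
            value = o                  # both sides agree
        elif o == b:
            value = c                  # only the community copy changed
        elif c == b:
            value = o                  # only the official copy changed
        else:
            conflicts[key] = (o, c)    # both changed, differently
            value = o                  # assumption: official wins until reviewed
        if value is not None:          # None means the field was removed
            merged[key] = value
    return merged, conflicts


# Hypothetical example data
base = {"title": "Road Accidents", "license": "cc-by", "notes": ""}
official = {"title": "Road Accidents 2009", "license": "cc-by", "notes": ""}
community = {"title": "Road Accidents", "license": "cc-by",
             "notes": "CSV, one row per accident", "tags": "transport"}

merged, conflicts = merge_package(base, official, community)
```

Here the official title change and the community's notes and tags all survive, because each field was changed on only one side. A field changed differently on both sides would be reported in `conflicts` rather than silently overwritten.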

= Detailed Problem Description =

Consider three CKAN registries: A, B and C.

We may want information to travel between them as follows:

{{{
A -------
         \
          C
A --> B--/
}}}

In words:
 1. Changes go directly from A to C.
 2. Changes go directly from A to B. Changes from B are then pulled to C.
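
The two routes above can be sketched with a toy model. It assumes every change is stored as a revision with a globally unique id, so that a registry can tell whether it has already received a revision by another route. The `Registry` class, its methods and the `name:counter` id scheme are hypothetical, chosen only to illustrate the idea borrowed from distributed version control.

```python
class Registry:
    """Toy registry that keeps every revision it has ever seen."""

    def __init__(self, name):
        self.name = name
        self.revisions = {}   # revision id -> (package name, changed fields)
        self.order = []       # revision ids in the order they arrived here
        self._counter = 0     # number of revisions created locally

    def commit(self, package, changes):
        """Record a local change and return its revision id."""
        self._counter += 1
        rev_id = "%s:%d" % (self.name, self._counter)  # assumed id scheme
        self.revisions[rev_id] = (package, dict(changes))
        self.order.append(rev_id)
        return rev_id

    def pull(self, other):
        """Copy revisions not yet seen here; return the ids that were new."""
        new = [r for r in other.order if r not in self.revisions]
        for rev_id in new:
            self.revisions[rev_id] = other.revisions[rev_id]
            self.order.append(rev_id)
        return new


a, b, c = Registry("A"), Registry("B"), Registry("C")
a.commit("road-accidents", {"title": "Road Accidents"})

direct = c.pull(a)          # route 1: A to C
b.pull(a)                   # route 2, first hop: A to B
b.commit("road-accidents", {"notes": "Edited on B"})
via_b = c.pull(b)           # route 2, second hop: B to C
```

Because revisions are matched by id, the second pull into C brings over only the revision made on B. The revision from A, which B also holds, is recognised as already present, so C's history contains each change exactly once.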

= Specification v1 =