Ticket #1397 (new enhancement) — at Version 5
[super] Resource archiving
Reported by: | rgrp | Owned by: | kindly |
---|---|---|---|
Priority: | major | Milestone: | ckan-v1.7 |
Component: | ckan | Keywords: | |
Cc: | | Repository: | ckan |
Theme: | none | | |
Description (last modified by rgrp)
We want to cache/archive the data associated with a resource so that it remains available if the resource URL disappears (and to support other processing we may wish to do, e.g. webstorer).
Etherpad: http://ckan.okfnpad.org/queue (most relevant parts inlined here)
Preliminaries
- Add a task_status table to store QA/archiver/webstore information that does not need to be versioned. - #1363 (and #1371 for the related logic functions)
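A minimal sketch of what an unversioned task_status store could look like, using an in-memory SQLite database. The column names and the helper functions `record_status` and `read_status` are assumptions for illustration only; the actual schema is whatever #1363 defines.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: column names are assumptions, not the table from #1363.
SCHEMA = """
CREATE TABLE task_status (
    entity_id    TEXT NOT NULL,   -- e.g. a resource id
    task_type    TEXT NOT NULL,   -- 'qa', 'archiver', 'webstore', ...
    key          TEXT NOT NULL,
    value        TEXT,
    last_updated TEXT NOT NULL,
    PRIMARY KEY (entity_id, task_type, key)
)
"""


def record_status(conn, entity_id, task_type, key, value):
    """Overwrite any previous value in place; no revision history is kept."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT OR REPLACE INTO task_status VALUES (?, ?, ?, ?, ?)",
        (entity_id, task_type, key, value, now),
    )


def read_status(conn, entity_id, task_type, key):
    """Return the stored value, or None if nothing was recorded."""
    row = conn.execute(
        "SELECT value FROM task_status "
        "WHERE entity_id = ? AND task_type = ? AND key = ?",
        (entity_id, task_type, key),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
record_status(conn, "res-1", "archiver", "cache_path", "/tmp/a")
record_status(conn, "res-1", "archiver", "cache_path", "/tmp/b")  # replaces the first row
```

The composite primary key means a second write for the same entity, task type and key replaces the earlier row, which is the "does not need to be versioned" behaviour described above.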
Tasks
- Resource change notifications in core - Make an IResourceChange and IResourceUrlChange. [1d] [0.75d] - #1383
- ckanext-archiver implements IResourceUrlChange and sends tasks to celery. [0.25d][0.25d] - ???
- Archiver daemon #891
  - Implement link-check function and task (point 2 from Archiver.update above) [1d] [0.5d]
  - Rewrite archiver to use external storage. (decide how!) [3d] [~2d]
  - Write to resource and task status table. [1d] [0.75d]
- Make archived data available in WUI - #892
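A rough sketch of how the first two tasks could fit together: core notifies implementers of a URL-change interface, and the archiver extension reacts by queueing a task. Everything here is a stand-in. The `IResourceUrlChange` class below is a plain Python class, not the interface from #1383; `ArchiverPlugin` and `update_resource` are hypothetical names; and a Python list takes the place of the celery queue.

```python
class IResourceUrlChange:
    """Hypothetical stand-in for the interface proposed in #1383."""

    def notify(self, resource_id, new_url):
        raise NotImplementedError


class ArchiverPlugin(IResourceUrlChange):
    """Queues an archiver task whenever a resource URL changes."""

    def __init__(self, queue):
        self.queue = queue  # stand-in for a celery queue

    def notify(self, resource_id, new_url):
        self.queue.append(("archiver.update", resource_id))


def update_resource(resource, new_url, observers):
    """Change the URL and notify observers only if it actually differs."""
    if resource["url"] == new_url:
        return False
    resource["url"] = new_url
    for observer in observers:
        observer.notify(resource["id"], new_url)
    return True
```

Keeping the comparison in core means extensions are only woken up for real URL changes, not for every edit to a resource.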
Archiver process
Archiver:
- A resource is added to CKAN
- IResourceCreate event generated
- IF the resource URL points to CKAN storage or meets some other exclusion condition, then END; otherwise continue
- Generate an Archiver.update task with resource.id
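The exclusion step and task generation above could be sketched as follows. The exclusion conditions are not specified in this ticket, so the host list, the http/https restriction, and the names `should_archive` and `on_resource_created` are all assumptions; the queue is again a plain list rather than celery.

```python
from urllib.parse import urlparse

# Assumption: hosts that serve CKAN's own storage. Placeholder value only.
EXCLUDED_HOSTS = {"storage.example.org"}


def should_archive(url, excluded_hosts=EXCLUDED_HOSTS):
    """Return False for URLs that match an (assumed) exclusion condition."""
    parsed = urlparse(url)
    # Assumed extra condition: only http(s) URLs are worth archiving.
    if parsed.scheme not in ("http", "https"):
        return False
    return parsed.hostname not in excluded_hosts


def on_resource_created(resource, queue):
    """Queue an Archiver.update task unless the resource is excluded."""
    if not should_archive(resource["url"]):
        return None  # END: nothing to archive
    task = {"task": "Archiver.update", "resource_id": resource["id"]}
    queue.append(task)
    return task
```

Only the resource id goes into the task, as in the last step above, so the worker would look up the current URL itself when the task runs.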
Archiver.update: see #891
Link checker: same as Archiver.update up to 2.1
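The Archiver.update steps live in #891 and are not reproduced here, so the following is only a generic illustration of a link check, not the procedure that ticket defines. It assumes a HEAD request is enough to tell whether a URL is alive; `check_link` and its `opener` parameter are hypothetical names, with `opener` there so the function can be exercised without network access.

```python
import urllib.error
import urllib.request


def check_link(url, opener=urllib.request.urlopen, timeout=10):
    """Return (ok, detail): detail is the HTTP status or an error message."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return True, response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (404, 500, ...).
        return False, err.code
    except (urllib.error.URLError, OSError, ValueError) as err:
        # DNS failure, refused connection, timeout, or malformed URL.
        return False, str(err)
```

Some servers reject HEAD requests, so a real checker would probably need a GET fallback; that is left out here.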