Ticket #1397 (new enhancement) — at Version 6
[super] Resource archiving
Reported by: | rgrp | Owned by: | kindly |
---|---|---|---|
Priority: | major | Milestone: | ckan-v1.7 |
Component: | ckan | Keywords: | |
Cc: | | Repository: | ckan |
Theme: | none | | |
Description (last modified by rgrp)
We want to cache/archive the data associated with a resource so that it remains available if the resource url disappears (and to support other processing we may wish to do, e.g. webstorer ...)
Etherpad: http://ckan.okfnpad.org/queue (most relevant parts inlined here)
Preliminaries
- Add a task_status table to store qa/archiver/webstore information that does not need to be versioned - #1363 (and #1371 for the related logic functions)
Configuration setup for daemons
- Standard ini file
- Sections are named after daemon / extension. E.g. [my-daemon]
- Values are arbitrary, but we anticipate at least 2 standard values:
- ckan_url e.g. http://thedatahub.org/
- ckan_apikey e.g. xxxxxx
- celery_config
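As a rough illustration of the layout above, a daemon could read its own section with the standard library's configparser. This is only a sketch: the helper name load_daemon_config is hypothetical, the section name and the two values are the examples given in the list, and celery_config is left out because the ticket gives no example value for it.

```python
import configparser

# Sample ini text using the section name and example values from the list above.
SAMPLE_INI = """
[my-daemon]
ckan_url = http://thedatahub.org/
ckan_apikey = xxxxxx
"""


def load_daemon_config(ini_text, section):
    """Return the options of one daemon section as a plain dict.

    Hypothetical helper: it only illustrates the "one section per
    daemon / extension" convention and is not part of CKAN.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(section):
        raise KeyError("no [%s] section in config" % section)
    return dict(parser.items(section))


config = load_daemon_config(SAMPLE_INI, "my-daemon")
print(config["ckan_url"])
```

A real daemon would presumably read the ini file from disk (parser.read(path)) rather than from a string; the string keeps the sketch self-contained.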
celeryd config:
All providers of tasks will add an item to the following entry point:
    [ckan.tasks]
    name = ckanext.{name}.tasks:....
celeryconfig.py
    from pkg_resources import iter_entry_points

    celeryimports = []
    for entry in iter_entry_points('ckan.tasks'):
        celeryimports.append(....)
    CELERY_IMPORTS = celeryimports
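One possible way to fill in the celeryconfig.py outline is sketched below. It assumes that what gets appended is the module part of each entry point (the module_name attribute of a pkg_resources EntryPoint), which the ticket does not state. The helper names and the two extension modules in the example (ckanext.archiver.tasks, ckanext.qa.tasks) are hypothetical.

```python
from collections import namedtuple

# Stand-in for pkg_resources.EntryPoint, which exposes `name` and
# `module_name`; used only so the sketch runs without installed extensions.
FakeEntryPoint = namedtuple("FakeEntryPoint", ["name", "module_name"])


def collect_task_modules(entry_points):
    """Return the module of each entry point, without duplicates, in order."""
    modules = []
    for entry in entry_points:
        if entry.module_name not in modules:
            modules.append(entry.module_name)
    return modules


def discover_task_modules(group="ckan.tasks"):
    """Collect task modules from installed packages (requires setuptools)."""
    # Imported lazily so collect_task_modules can be used without setuptools.
    from pkg_resources import iter_entry_points
    return collect_task_modules(iter_entry_points(group))


# In a real celeryconfig.py this would be:
#     CELERY_IMPORTS = discover_task_modules()
# Here hypothetical entry points are used instead.
example_entries = [
    FakeEntryPoint("archiver", "ckanext.archiver.tasks"),
    FakeEntryPoint("qa", "ckanext.qa.tasks"),
    FakeEntryPoint("archiver-link-check", "ckanext.archiver.tasks"),
]
CELERY_IMPORTS = collect_task_modules(example_entries)
```

Deduplicating keeps a module from being listed twice when one extension registers several tasks from the same module.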
Work Items
- Resource change notifications in core - Make an IResourceChange and IResourceUrlChange. [1d] [0.75d] - #1383
- Generate archiving request on resource url change [0.25d] [0.25d] - #1399
- Archiver daemon #891
- implement link-check function and task (point 2 from Archiver.update above) [1d] [0.5d]
- Rewrite archiver to use external storage (decide how!). [3d] [~2d]
- Write to resource and task status table. [1d] [0.75d]
- [Required?] Make archived data available in WUI - #892
- Documentation - #1400
Archiver process
Generate archiving request on resource url change: #
Archiver.update: see #891
Link checker: same as Archiver.update up to 2.1
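The "generate archiving request on resource url change" step could look roughly like the sketch below. Everything in it is an assumption made for illustration: the function names, the shape of the request dict, the task name "archiver.update" and the plain list standing in for the real task queue. It does not use CKAN's or celery's actual interfaces.

```python
import uuid


def make_archive_request(resource_id, old_url, new_url):
    """Build an archiving request if the url changed, otherwise return None."""
    if old_url == new_url:
        return None
    return {
        "task_id": str(uuid.uuid4()),    # hypothetical unique id for the task
        "task_type": "archiver.update",  # hypothetical task name
        "resource_id": resource_id,
        "url": new_url,
    }


def on_resource_saved(queue, resource_id, old_url, new_url):
    """Queue an archiving request when a resource's url has changed.

    `queue` is any object with an append() method; a list stands in
    for the real task queue here.
    """
    request = make_archive_request(resource_id, old_url, new_url)
    if request is not None:
        queue.append(request)
    return request


queue = []
on_resource_saved(queue, "res-1", "http://example.org/a.csv", "http://example.org/a.csv")
on_resource_saved(queue, "res-1", "http://example.org/a.csv", "http://example.org/b.csv")
print(len(queue))
```

Only the second call queues a request, since the first leaves the url unchanged.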