Ticket #1397 (new enhancement) — at Version 8

Opened 3 years ago

Last modified 2 years ago

[super] Resource archiving

Reported by: rgrp
Owned by: kindly
Priority: major
Milestone: ckan-v1.7
Component: ckan
Keywords:
Cc:
Repository: ckan
Theme: none

Description (last modified by kindly) (diff)

We want to cache/archive the data associated with a resource so that it remains available if the resource URL disappears (and to support other processing we may wish to do, e.g. webstorer ...)

Etherpad: http://ckan.okfnpad.org/queue (most relevant parts inlined here)

Preliminaries

  • Add task_status table to store qa/archiver/webstore information that does not need to be versioned. - #1363 (and #1371 - related logic functions)

Configuration setup for daemons

Pass config through to workers, i.e. site_url, user, api_key. Need to make a site user account. #1408

celeryd config:

All providers of tasks will add an item to the following entry point:

[ckan.tasks]
name = ckanext.{name}.tasks:....

celeryconfig.py

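from pkg_resources import iter_entry_points

# Collect the task modules registered under the [ckan.tasks] entry point group.
celeryimports = []
for entry in iter_entry_points('ckan.tasks'):
    celeryimports.append(....)

CELERY_IMPORTS = celeryimports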
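
A minimal sketch of how the import list might be assembled, assuming that the module behind each [ckan.tasks] entry point is what celeryd should import (the ticket leaves the appended value elided, so this is a guess). FakeEntryPoint, collect_task_modules and the ckanext.archiver / ckanext.qa names are illustrative stand-ins, not part of CKAN; in celeryconfig.py the iterable would presumably come from iter_entry_points('ckan.tasks') instead.

```python
from collections import namedtuple

# Hypothetical stand-in for a pkg_resources entry point; only the
# module_name attribute is used below.
FakeEntryPoint = namedtuple("FakeEntryPoint", ["name", "module_name"])


def collect_task_modules(entry_points):
    """Return the module names behind the entry points, in order, without duplicates."""
    modules = []
    for entry in entry_points:
        if entry.module_name not in modules:
            modules.append(entry.module_name)
    return modules


# Example registrations (assumed names following the ckanext.{name}.tasks pattern).
entries = [
    FakeEntryPoint("archiver", "ckanext.archiver.tasks"),
    FakeEntryPoint("qa", "ckanext.qa.tasks"),
    FakeEntryPoint("archiver_again", "ckanext.archiver.tasks"),
]

CELERY_IMPORTS = collect_task_modules(entries)
```

Dropping duplicates keeps celeryd from being handed the same module twice when two entry points live in one module.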

Work Items

  1. Resource change notifications in core - Make an IResourceChange and IResourceUrlChange. [1d] [0.75d] - #1383
  2. Generate archiving request on resource url change [0.25d] [0.25d] - #1399
  3. Archiver daemon #891
    1. Implement link-check function and task (point 2 from Archiver.update above) [1d] [0.5d]
    2. Rewrite archiver to use external storage. (decide how!) [3d] [~2d]
  4. Write to resource and task status table. [1d] [0.75d]
  5. [Required?] Make archived data available in WUI - #892
  6. Documentation - #1400
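
Work item 2 above could look roughly like the following sketch, which compares a dataset's resources before and after an edit and queues an archiving request for each resource whose url is new or has changed. The function name, the dict layout and the plain list used as a queue are all assumptions made for illustration; the ticket does not specify how requests are represented or delivered to the archiver daemon.

```python
def queue_archive_requests(old_resources, new_resources, queue):
    """Append an archiving request for every resource whose url is new or changed.

    Resources are plain dicts with "id" and "url" keys (an assumed layout).
    Resources without a url are skipped, since there is nothing to archive.
    """
    old_urls = {res["id"]: res.get("url") for res in old_resources}
    for res in new_resources:
        url = res.get("url")
        if url and old_urls.get(res["id"]) != url:
            queue.append({"resource_id": res["id"], "url": url})
    return queue


before = [
    {"id": "a", "url": "http://example.org/1.csv"},
    {"id": "b", "url": "http://example.org/2.csv"},
]
after = [
    {"id": "a", "url": "http://example.org/1.csv"},      # unchanged
    {"id": "b", "url": "http://example.org/2-new.csv"},  # url changed
    {"id": "c", "url": "http://example.org/3.csv"},      # newly added
]

requests = queue_archive_requests(before, after, [])
```

Here only resources "b" and "c" end up in the queue; the unchanged resource "a" generates no work for the archiver.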

Change History

comment:1 Changed 3 years ago by rgrp

  • Description modified (diff)

comment:2 Changed 3 years ago by rgrp

  • Description modified (diff)

comment:3 Changed 3 years ago by rgrp

  • Description modified (diff)

comment:4 Changed 3 years ago by rgrp

  • Milestone changed from ckan-v1.6 to ckan-sprint-2011-10-24

comment:5 Changed 3 years ago by rgrp

  • Description modified (diff)

comment:6 Changed 3 years ago by rgrp

  • Priority changed from awaiting triage to major
  • Description modified (diff)
  • Milestone changed from ckan-sprint-2011-10-24 to ckan-v1.6

comment:7 Changed 3 years ago by rgrp

  • Description modified (diff)

comment:8 Changed 3 years ago by kindly

  • Description modified (diff)