| 28 | | Archiver.update |
| 29 | | |
| 30 | | 1. [Archiver.Update checks queue (automated as part of celery)] |
| 31 | | 2. Open url |
| 32 | | 1. If FAILURE status: update the task_status table (could retry if there have been no more than 3 failures so far) and report the task failure in celery |
| 33 | | 2. Check headers for content-length and content-type ... |
| 34 | | * IF: content-length > max_content_length: EXIT (store outcomes on task_status, and update resource with size and content-type and any other info we get?) |
| 35 | | * ELSE: check content-type. |
| 36 | | * IF: content-type is NOT a data format (e.g. text/html) then EXIT. (store outcomes and info on resource) |
| 37 | | * ELSE: archive it (compute md5 hash etc) |
| 38 | | 3. Archive it: connect to storage system and store it. Bucket: from config, Key: /{timestamp}/{resourceid}/filename.ext |
| 39 | | * Add cache URL and updated date to resource |
| 40 | | * Update task_status |
| 41 | | * Add other relevant info to resource such as md5, content-type etc |
| | 28 | Archiver.update: see #891 |
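
For reference, the header checks and storage key described in the removed Archiver.update steps can be sketched roughly as below. This is a minimal illustration, not the code tracked in #891: the function names, the `MAX_CONTENT_LENGTH` value, and the `NON_DATA_TYPES` set are all assumptions (the notes only give text/html as an example of a non-data type), and no network or storage calls are made.

```python
import hashlib

# All three values are assumptions for illustration; the notes only say
# "max_content_length", "e.g. text/html" and "not more than 3 failures".
MAX_CONTENT_LENGTH = 50_000_000
NON_DATA_TYPES = {"text/html"}
MAX_FAILURES = 3


def may_retry(failures_so_far):
    """Step 2.1: retry is allowed while there are no more than 3 failures."""
    return failures_so_far <= MAX_FAILURES


def check_headers(headers):
    """Step 2.2: decide what to do from content-length and content-type.

    Returns (decision, info); info is what would be stored on the resource.
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    # Drop parameters such as "; charset=utf-8" before comparing.
    content_type = lowered.get("content-type", "").split(";")[0].strip().lower()
    raw_length = lowered.get("content-length")
    # A missing or malformed content-length is treated as unknown size.
    size = int(raw_length) if raw_length and raw_length.isdigit() else None
    info = {"size": size, "content_type": content_type}

    if size is not None and size > MAX_CONTENT_LENGTH:
        return "exit_too_large", info
    if content_type in NON_DATA_TYPES:
        return "exit_not_data", info
    return "archive", info


def md5_hexdigest(data):
    """Hash of the downloaded bytes, stored alongside the resource."""
    return hashlib.md5(data).hexdigest()


def archive_key(timestamp, resource_id, filename):
    """Step 3: key layout /{timestamp}/{resourceid}/filename.ext."""
    return f"/{timestamp}/{resource_id}/{filename}"


if __name__ == "__main__":
    decision, info = check_headers(
        {"Content-Type": "text/csv", "Content-Length": "1024"}
    )
    print(decision, info)
    print(archive_key("20110201T120000", "abc123", "data.csv"))
```

Returning a decision string plus the collected header info keeps the "store outcomes on task_status" and "update resource" steps separate from the check itself, so the same info can be recorded whether the task exits early or goes on to archive.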