id	summary	reporter	owner	description	type	status	priority	milestone	component	resolution	keywords	cc	repo	theme
1037	More Robust Harvesting for DGU	thejimmyg	amercader	"CKAN's harvesting facility is now live on DGU but there are some major improvements that could be made to make it more robust and better fit the generic CKAN harvesting framework proposed in #987.

Some of the key issues:

 * Error reports do not currently contain the ID or title of the document with the error.
 * Job logs currently record only ""added"" and ""error"" entries. We really need a report of ""added"", ""updated"", ""not changed"" and ""errors"", with each item referencing the real metadata document for which harvesting was attempted.
 * We need to be able to delete and edit sources without deleting the harvested documents or packages.
 * We need a harvesting mechanism more robust than a cron job, or we need to handle the case of multiple cron jobs running at once.
 * We need to know the last time a list of documents was scheduled for harvest and the last time each one was fetched.
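
One possible way to handle overlapping cron runs is an exclusive lock file, so that a second run exits early instead of harvesting the same sources twice. This is only a sketch under assumptions: it is Unix-only (it relies on fcntl.flock), and the names single_run_lock, run_harvest and AlreadyRunning are hypothetical and are not part of CKAN or the DGU harvester.

```python
import fcntl
from contextlib import contextmanager


class AlreadyRunning(Exception):
    '''Raised when another harvest run already holds the lock.'''


@contextmanager
def single_run_lock(lock_path):
    # Hypothetical helper: hold an exclusive, non-blocking lock for one run.
    handle = open(lock_path, 'w')
    try:
        try:
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            # Another run holds the lock; do not wait for it.
            raise AlreadyRunning(lock_path)
        yield
    finally:
        # Closing our handle releases our lock, even if the run crashed.
        handle.close()


def run_harvest(lock_path, harvest):
    # Returns False when an earlier run is still in progress.
    try:
        with single_run_lock(lock_path):
            harvest()
            return True
    except AlreadyRunning:
        return False
```

A cron entry would then call run_harvest with a fixed lock path and the real harvesting function. Because the operating system drops the lock when the process exits, a crashed run should not leave a stale lock behind.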
"	defect	closed	major	ckan-v1.4-sprint-6	uklii	fixed			ckan	none
