Custom Query (2152 matches)
Results (211 - 213 of 2152)
Ticket | Resolution | Summary | Owner | Reporter |
---|---|---|---|---|
#1641 | fixed | ckanext-archiver: Content-length header not reliable to check if resource has been modified | amercader | amercader |
Description |
The download task in ckanext-archiver performs a HEAD request on the resource URL and compares the "Content-Type" and "Content-Length" headers with the stored values to decide whether the resource needs to be updated [1]. The "Content-Length" header, although widely used, is not mandatory, and some servers don't provide it, e.g.:

    $ curl -I http://portfolio.theglobalfund.org/en/IATI/Activities?countryCode=AFG
    HTTP/1.1 200 OK
    Cache-Control: private
    Transfer-Encoding: chunked
    Content-Type: text/xml
    Vary: Accept-Encoding
    Server: Microsoft-IIS/7.5
    Set-Cookie: ASP.NET_SessionId=3qhqekddgmre0kmk5cynq0sy; path=/; HttpOnly
    X-AspNetMvc-Version: 3.0
    content-disposition: attachment; filename=AFG_IATI_12012012.xml
    X-AspNet-Version: 4.0.30319
    X-Powered-By: ASP.NET
    Date: Thu, 12 Jan 2012 12:36:43 GMT

It is also worth noting that requests, the Python library that ckanext-archiver uses, sets an "Accept-Encoding: gzip" header by default. Depending on the configuration of the remote web server, this may prevent the "Content-Length" header from being sent, e.g.:

    $ curl -H "Accept-Encoding: gzip" -I http://iatistandard.org/published-temp/adb-activities.xml
    HTTP/1.1 200 OK
    Date: Thu, 12 Jan 2012 12:12:46 GMT
    Server: Apache
    Last-Modified: Mon, 28 Nov 2011 15:55:35 GMT
    Accept-Ranges: bytes
    Vary: Accept-Encoding
    Content-Encoding: gzip
    Content-Type: application/xml

    $ curl -I http://iatistandard.org/published-temp/adb-activities.xml
    HTTP/1.1 200 OK
    Date: Thu, 12 Jan 2012 11:56:23 GMT
    Server: Apache
    Last-Modified: Mon, 28 Nov 2011 15:55:35 GMT
    Accept-Ranges: bytes
    Content-Length: 2686720
    Vary: Accept-Encoding
    Content-Type: application/xml

All this can lead to some resources never getting updated, and to the size property of the resource never being set. As we need to download the resource anyway, it would be better to check whether the real length of the downloaded data has changed (and store it).
[1] https://github.com/okfn/ckanext-archiver/blob/0a189262dca4ab5b286fb6a02b4ab8a201f639f3/ckanext/archiver/tasks.py#L72 |
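The proposal in #1641 (measure the downloaded data rather than trust the HEAD headers) could look roughly like the sketch below. It is not the ckanext-archiver code: the names `Fingerprint`, `fingerprint` and `needs_update` are hypothetical, and comparing a SHA-1 digest in addition to the length is an added assumption, since two different files can have the same size.

```python
import hashlib
from typing import NamedTuple, Optional


class Fingerprint(NamedTuple):
    """What would be stored for a resource after a download (hypothetical)."""
    length: int  # real size in bytes of the downloaded body
    sha1: str    # hex digest of the body (an assumption, not in the ticket)


def fingerprint(data: bytes) -> Fingerprint:
    """Measure the body that was actually downloaded."""
    return Fingerprint(len(data), hashlib.sha1(data).hexdigest())


def needs_update(data: bytes, stored: Optional[Fingerprint]) -> bool:
    """Return True if nothing is stored yet or the downloaded body differs."""
    return stored is None or fingerprint(data) != stored
```

Because the length is taken from the bytes received, the check no longer depends on the server sending "Content-Length", and the same value can be used to fill in the size property of the resource.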
|||
#1207 | fixed | ckanclient.package_entity_get should raise more specific exception | dread | dread |
Description |
When a package does not exist in the CKAN catalogue, ckanclient.package_entity_get should raise a more specific exception, such as CkanNotFoundError, instead of the generic CkanApiError. |
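One way to read this request is as a small exception hierarchy, sketched below. The two class names come from the ticket; the constructor arguments and the `check_status` helper are hypothetical and are not taken from ckanclient itself.

```python
class CkanApiError(Exception):
    """Generic API failure (class name from the ticket, body assumed)."""

    def __init__(self, status, message=""):
        super().__init__(f"CKAN API returned status {status}: {message}")
        self.status = status


class CkanNotFoundError(CkanApiError):
    """The requested entity does not exist (HTTP 404)."""


def check_status(status, message=""):
    """Hypothetical helper: raise the most specific error for a status code."""
    if status == 200:
        return
    if status == 404:
        raise CkanNotFoundError(status, message)
    raise CkanApiError(status, message)
```

Since `CkanNotFoundError` subclasses `CkanApiError`, existing callers that catch the generic error would keep working, while new callers could handle the missing-package case on its own.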
|||
#867 | fixed | ckanclient raises exceptions | dread | dread |
Description |
To be more Pythonic, ckanclient should raise an exception when it receives a status code other than 200.