Custom Query (2152 matches)

Filters
 
Or
 
  
 
Columns

Show under each result:


Results (49 - 51 of 2152)

Ticket Resolution Summary Owner Reporter
#1641 fixed ckanext-archiver: Content-length header not reliable to check if resource has been modified amercader amercader

Reported by amercader, 2 years ago.

Description

The download task in ckanext-archiver performs a HEAD request on the resource URL and checks if the "Content-Type" and "Content-Length" headers differ from the values stored to see if the resource needs to be updated [1].

The "Content-Length" header, although widely used, is not mandatory and some servers don't provide it, e.g.:

$ curl -I http://portfolio.theglobalfund.org/en/IATI/Activities?countryCode=AFG
HTTP/1.1 200 OK
Cache-Control: private
Transfer-Encoding: chunked
Content-Type: text/xml
Vary: Accept-Encoding
Server: Microsoft-IIS/7.5
Set-Cookie: ASP.NET_SessionId=3qhqekddgmre0kmk5cynq0sy; path=/; HttpOnly
X-AspNetMvc-Version: 3.0
content-disposition: attachment; filename=AFG_IATI_12012012.xml
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Date: Thu, 12 Jan 2012 12:36:43 GMT

Also worth noting that requests, the python library that uses ckanext-archiver, sets an "Accept-Encoding: gzip" header by default, which depending on the configuration of the remote web server, may prevent the "Content-Length" server from being sent, e.g.:

$ curl -H "Accept-Encoding: gzip" -I http://iatistandard.org/published-temp/adb-activities.xml
HTTP/1.1 200 OK
Date: Thu, 12 Jan 2012 12:12:46 GMT
Server: Apache
Last-Modified: Mon, 28 Nov 2011 15:55:35 GMT
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Type: application/xml

curl -I http://iatistandard.org/published-temp/adb-activities.xml
HTTP/1.1 200 OK
Date: Thu, 12 Jan 2012 11:56:23 GMT
Server: Apache
Last-Modified: Mon, 28 Nov 2011 15:55:35 GMT
Accept-Ranges: bytes
Content-Length: 2686720
Vary: Accept-Encoding
Content-Type: application/xml

All this can lead to some resources never getting updated, and of course the size property of the resource not being set.

As we need to download the resource anyway, it would be better to check if the real length of the data has been modified (and store it).

[1] https://github.com/okfn/ckanext-archiver/blob/0a189262dca4ab5b286fb6a02b4ab8a201f639f3/ckanext/archiver/tasks.py#L72

#1654 fixed [super] Update Publicdata.eu to the latest CKAN stable version amercader amercader

Reported by amercader, 2 years ago.

Description

Tasks include:

  • #1813 Update ckanext-pdeu (4d)
  • #1814 Update harvesters (2d)
  • #1649 Verify ckanext-rdf works with latest CKAN (3d)
  • #1815 Reenable Sparql endpoint (?)
  • #1816 Update ckanext-apps (2-3d)
#1655 fixed Setup issues on s025 (Publicdata.eu) amercader amercader

Reported by amercader, 2 years ago.

Description

Time estimate: 2d

  • Fix logs (apache, ckan, harvest): rotate, set suitable levels
  • Fix harvesting jobs: supervisord for gather consumer, cron job
  • Fix backups

Also it may be worth setting up a test instance ( on s023 ?)

Note: See TracQuery for help on using queries.