{22} Trac tickets (2647 matches)

Results (501 - 600 of 2647)

Id Type Owner Reporter Milestone Status Resolution Summary Description Posixtime Modifiedtime
#598 story johnbywater ckan-v1.2 closed List remote metadata entities for given publisher 1284213730000000 1286200795000000
#613 task johnbywater johnbywater closed Store result of schema validation check on metadata document object 1284218828000000 1286798466000000
#653 requirement dread ckan-backlog new Trackback links for packages

When people link to a package, a track-back link is auto-created. (Similar system as for blogs).

As suggested by Tim Davies:

Allowing some form of ‘track back’ against datasets: when a non-technical user comes to look at a dataset, it would be really useful for them to be able to see if anyone has already created an interface to it or an interpretation of it.

I found quite a few cases in research of end-users struggling to make sense of a dataset when good interfaces to that data had already been built and blogged about, but without there being any link from the dataset listing to those data uses. Accepting track backs could also make it easier for technical users to find blog posts / shared code etc. relating to a given dataset.

1285062025000000 1339774636000000
#654 bug johnbywater ckan-v1.2 closed Harvest sources and jobs should return 404 when missing (not 500) 1285170717000000 1285254097000000
#655 story johnbywater ckan-v1.2 closed Return status code 404 when harvest source is not found 1285252399000000 1285254088000000
#656 story johnbywater ckan-v1.2 closed Return status code 404 when harvesting job is not found 1285252429000000 1285254084000000
#673 task johnbywater closed Construct and send CSW GetRecordById request 1286214712000000 1286786758000000
#674 task johnbywater closed Extract metadata documents from CSW GetRecordById 1286214780000000 1286787032000000
#688 task johnbywater johnbywater ckan-v1.2 closed Example GeoNetworks service for CSW development 1286786893000000 1286980827000000
#728 requirement amercader johnbywater ckan-backlog assigned CSW Harvesting shall be optimised in respect of reharvesting only records that have changed

Hi Will, this is important again because some CSW servers we use have over 300 documents in. Could you take a look at modifying the filter please?

1287675340000000 1310124784000000
#737 enhancement dread ckan-backlog new Markdown syntax summary page

I suggest we produce a quick Markdown cheat-sheet page, showing the key runes: e.g. create a title and quote some text. This page can link to the full Markdown docs for advanced users.

A user who follows our link to the Markdown docs has to read a couple of pages on Markdown's raison d'être before reaching the syntax. It's also not very easy to read, and being white on black it looks like proper geek stuff.

1287766749000000 1323170239000000
#763 enhancement dread ckan-future assigned Read-only mode - Setup

Admin configures entering read-only mode in one of two places:

  • CKAN config file (e.g. ckan.ini)
  • environment variable from Apache config

Once enabled, no writes can occur to the database (including user ratings and other usage stats).
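
A minimal sketch of how the two configuration sources could be combined. The option name `ckan.read_only`, the environment variable `CKAN_READ_ONLY`, and the rule that the environment variable takes precedence are all assumptions made for illustration; the ticket does not specify them.

```python
import os

TRUTHY = {"1", "true", "yes", "on"}


def is_read_only(config, environ=None):
    """Return True if read-only mode is enabled in the config or environment.

    `ckan.read_only` and `CKAN_READ_ONLY` are hypothetical names.
    """
    environ = os.environ if environ is None else environ
    # Assumption: a variable set by the web server overrides the config file.
    env_value = environ.get("CKAN_READ_ONLY")
    if env_value is not None:
        return env_value.strip().lower() in TRUTHY
    return str(config.get("ckan.read_only", "false")).strip().lower() in TRUTHY
```

With Apache, such a variable could be passed through to the application environment, so an administrator would not need to edit the config file to start maintenance.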

1288091506000000 1338206204000000
#765 enhancement dread ckan-backlog assigned Read-only mode - API usage

All writes to the API are captured and you are returned an error explaining the reason.

Possible errors:

  • 503 temporary maintenance
  • 403 forbidden (if server is permanently read-only)
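
One way to capture API writes is a small WSGI middleware in front of the application. This is a sketch only: the class name, the `permanent` flag and the error messages are hypothetical, and treating POST, PUT, DELETE and PATCH as the write methods is an assumption.

```python
WRITE_METHODS = {"POST", "PUT", "DELETE", "PATCH"}


class ReadOnlyMiddleware:
    """WSGI middleware that rejects write requests while read-only mode is on."""

    def __init__(self, app, permanent=False):
        self.app = app
        self.permanent = permanent  # True: 403, False: 503 (maintenance)

    def __call__(self, environ, start_response):
        method = environ.get("REQUEST_METHOD", "GET").upper()
        if method not in WRITE_METHODS:
            # Reads pass straight through to the wrapped application.
            return self.app(environ, start_response)
        if self.permanent:
            status = "403 Forbidden"
            reason = "This server is permanently read-only."
        else:
            status = "503 Service Unavailable"
            reason = "Temporarily read-only for maintenance."
        body = reason.encode("utf-8")
        start_response(status, [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]
```
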
1288091897000000 1338206123000000
#794 requirement amercader johnbywater ckan-backlog assigned Investigate reconciling UKLP Publisher and Provider with DGU

This needs more analysis, but the GEMINI2 attribute "metadata point of contact" must be reconciled with the registered publisher (or agent).

This might also be used to filter records harvested from a CSW source, but filtering also needs more analysis, as does distinction between agent and provider.

1289227811000000 1311179581000000
#811 defect cygri ckan-backlog new Extra field editing form layout breaks when there are long field names

The layout of the editing section for extra fields breaks when a field name is slightly too long. Field names jump over to the right. See http://ckan.net/package/edit/dbpedia for examples.

1289994812000000 1323170289000000
#812 defect cygri ckan-backlog new Package edit form only allows three extra fields

Rationale

The package edit form is restricted to three extra fields. To enter more than three fields, one has to save the package and hit edit again (or hit preview).

Implementation

A mechanism similar to the one for resources (where you can add lines as you go) would solve this. So, have a button that adds more extra field rows via JS. (Extra fields don't need up/down buttons that the Resource table has)

Nice to have: a blank field is added when you tab from the last filled-in field in the table.

1289995010000000 1311176917000000
#818 requirement cygri ckan-backlog new Rethinking the author and maintainer fields

The semantics of the Author and Maintainer fields are really unclear at the moment. This leads to very inconsistent usage. Also, perhaps Name and Email are not the only fields that are needed for a contact.

Here is a table that shows the current usage of these fields in CKAN: http://richard.cyganiak.de/2010/ckan/ckan-ppl.html

We note several problems:

  • Author and Maintainer are often the same
  • Author and Maintainer are often used interchangeably
  • People really want to specify URLs for the contacts and stick them into random places because there is no field for it
  • Multiple comma-separated names in a single field

I'm not sure what to do about this, but a redesign is necessary in my opinion.

Some ideas:

  • Remove the maintainer field?
  • Make really clear that Author doesn't refer to the metadata on CKAN, but to the original data
  • Add an “author URL” field?
1290003524000000 1339774621000000
#1168 enhancement thejimmyg dread ckan-backlog assigned Test system for deb packaging

Get buildbot to:

  • build the deb packages
  • install them into a fresh virtual machine
  • run smoke tests on the installed ckan
1306441994000000 1330990423000000
#1341 enhancement kindly kindly ckan-backlog reopened Delete spam users from ckan

Spam users were added to thedatahub and we need to clean them up.

1315995034000000 1320141540000000
#1447 defect kindly dread ckan-backlog assigned disk space leakage

Periodically we see some CKAN servers fall over because they run out of disk space. We need to find out if there is a common cause and fix it.

One problem in the past has been file handles running out when creating lots of tiny files in the data directory.

Another problem has been several enormous backups being created every day - pdeu on eu25.

1320666843000000 1340727330000000
#2202 enhancement rgrp ckan-future reopened Display page view count on dataset and resource pages

Just like we display download counts we should display view counts.

1330765455000000 1338204929000000
#2243 enhancement seanh seanh ckan-v1.9 reopened Fix ckanext-example 1332172710000000 1340635768000000
#2331 defect kindly rgrp ckan-sprint-2012-05-29 reopened Search should AND terms not OR terms

It appears that the current default search in CKAN ORs terms rather than ANDing them (i.e. adding more terms increases the number of items found rather than reducing it).

Not sure when this crept in or if it has been there for a long time.
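
The difference can be shown with a toy in-memory search. This is an illustration only and not CKAN's search backend; the `search` function and the sample documents are made up for the example.

```python
def search(documents, query, require_all=True):
    """Return names of documents matching the query terms.

    require_all=True ANDs the terms; require_all=False ORs them.
    """
    terms = query.lower().split()
    combine = all if require_all else any
    return [name for name, text in documents.items()
            if combine(term in text.lower().split() for term in terms)]


docs = {
    "uk-census": "uk census population",
    "uk-roads": "uk road traffic",
    "us-census": "us census population",
}

print(search(docs, "uk census"))                     # AND: one match
print(search(docs, "uk census", require_all=False))  # OR: all three match
```

With AND semantics, each extra term can only narrow the result set, which is the behaviour the ticket asks for.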

1335637485000000 1356474344000000
#2457 enhancement johnmartin aron.carroll demo phase 4 assigned Create demo tags list page

This includes the tag page as well for now.

Discussion:

https://okfn.basecamphq.com/projects/9558659-demo-ckan-front-end/posts/62998445/comments

Implementation:

http://s031.okserver.org:2375/en/tag

1338211735000000 1352658878000000
#26 enhancement somebody johnbywater closed duplicate A registered person creates their own tags for a package 1152551351000000 1152555283000000
#75 enhancement dread rgrp closed duplicate Record and display package "usage" information
  • Number of package page visits on ckan (can we get this straight from google analytics)
  • Number of times url or download url is used - now ticket:937 (Record download stats for resources)

How do we do this?

  • Google analytics will miss a lot of this usage (and how do we get that data out anyway)
  • Could use javascript but again misses usage.
  • One option is to redirect link but that is kind of nasty (but may be only option ...)
1247828785000000 1296341223000000
#87 enhancement rgrp rgrp closed duplicate Multiple download links

Multiple download links, including links to mirrors and multiple formats/versions

1248693302000000 1258470719000000
#97 enhancement rgrp rgrp closed duplicate Do not create a distribution on a path if something already exists there

(2009-03-09) Do not create a distribution at path X if path X already exists and contains material (unless forced via a force option).
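
A sketch of the check, assuming a distribution is a directory. The function name `create_distribution` and its `force` parameter are hypothetical, not an existing API.

```python
import os


def create_distribution(path, force=False):
    """Create a distribution directory at `path`.

    Refuses to reuse a directory that already contains material
    unless `force` is set.
    """
    if os.path.isdir(path) and os.listdir(path) and not force:
        raise FileExistsError(
            "%s already exists and is not empty; use force to override" % path)
    # An existing empty directory is accepted as-is.
    os.makedirs(path, exist_ok=True)
    return path
```
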

Cost: 1h

1249983557000000 1318181317000000
#106 enhancement dread rgrp closed duplicate Regularly convert CKAN data to RDF and put on Talis CC

Sister to ticket:90 (Link to RDF version of CKAN data on Talis Connected Commons).

Talis have already kindly done an initial conversion. We should repeat this process regularly and re-upload the data to Talis CC.

In the long run we may wish to re-convert only packages changed since the last upload. However, given the relatively small size of the full dataset, this optimization is probably not yet required.

Attached is the ruby script used by Talis for conversion

Cost: ? (1d+ depending on e.g. how easy integration with Talis CC is)

1251454474000000 1256140649000000
#125 enhancement dread rgrp v1.0 closed duplicate Edit Generic Package Attributes in WUI

Split out from ticket:43

1253709712000000 1258377621000000
#137 enhancement rgrp dread closed duplicate User has editable home page
  • Generic text box for markdown about the user 'About'

Model's user table reflects these:

  • 'about' attribute
1254741703000000 1254741830000000
#144 enhancement rgrp dread v0.11 closed duplicate Most popular packages listed on homepage

Based on number of views.

Related to ticket:143.

1255010391000000 1265284457000000
#147 enhancement dread dread v0.11 closed duplicate Parser and loader for esw.org data 1255440695000000 1255515162000000
#151 enhancement dread rgrp v0.11 closed duplicate User object should have a created attribute

User object should have a "created" attribute initialized to current datetime.

Requires a db migration but is otherwise very simple.

Cost: 1.5h

1255589694000000 1257414545000000
#153 enhancement dread dread v0.11 closed duplicate Group's packages listed alphabetically

This is so you can easily look up whether a given package is already listed - otherwise, as lists get bigger, it becomes difficult to see what is already there.

Suggested by Jonathan Gray

1255621515000000 1258971895000000
#155 enhancement dread dread v1.0 closed duplicate Adding multiple packages to a group

Ability to add multiple packages to a group in one go (e.g. with 'add' link which makes drop down menu appear - so can add one after another - then submit simultaneously)

Use a bit of javascript to add more dropdowns.

Suggested by Jonathan Gray

1255621779000000 1271760041000000
#168 enhancement rgrp dread closed duplicate Show admins for a group in group view 1256291481000000 1257414795000000
#169 enhancement dread dread closed duplicate Package derivations

A 'Derived' relationship can be applied from one package to another.

e.g. sussex-demography is derived from census-2001

'Derived' relationship is:

  • directional
  • many:many
  • stateful

'derived' table columns:

  • id (primary key)
  • source_package (foreign key)
  • result_package (foreign key)
  • description (markdown text)
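
The table could be sketched as follows using SQLite from the Python standard library. The simplified `package` table, the sample rows and the `derived_from` helper are assumptions for illustration; the 'stateful' aspect is not modelled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE package (id TEXT PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE derived (
    id INTEGER PRIMARY KEY,
    source_package TEXT NOT NULL REFERENCES package(id),
    result_package TEXT NOT NULL REFERENCES package(id),
    description TEXT  -- markdown text
);
""")
conn.executemany("INSERT INTO package VALUES (?, ?)",
                 [("p1", "census-2001"), ("p2", "sussex-demography")])
conn.execute(
    "INSERT INTO derived (source_package, result_package, description) "
    "VALUES (?, ?, ?)", ("p1", "p2", "Subset for Sussex"))


def derived_from(conn, package_name):
    """Names of the packages that `package_name` is derived from."""
    rows = conn.execute(
        "SELECT src.name FROM derived d "
        "JOIN package src ON src.id = d.source_package "
        "JOIN package res ON res.id = d.result_package "
        "WHERE res.name = ?", (package_name,))
    return [name for (name,) in rows]


print(derived_from(conn, "sussex-demography"))
```

Because each row names one source and one result, the relationship stays directional and many:many without any extra join table.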

Further tickets:

  • WUI - package view - shows 'derives from package x' and 'derived package y' with UML-like diagram of x -> this package -> y
  • WUI - package edit form - new option to say it 'derives from' or 'has derivation' and you select the appropriate
  • REST interface - expose reading and writing this property
1256304927000000 1266928708000000
#176 enhancement dread dread closed duplicate Package dependencies

(Related to ticket:169 - Package derivations)

A 'dependency' relationship can be applied from one package to another. It implies that a package requires the download or existence of another package which it 'depends on'. (Analogous to software package dependencies.)

e.g. london-traffic-visualisation depends on road-map

'Dependency' relationship is:

  • directional
  • many:many
  • stateful

'dependency' table columns:

  • id (primary key)
  • dependent (foreign key)
  • dependency (foreign key)

Further tickets:

  • WUI - package view - have list of dependencies (do not need to list packages which depend on this one)
  • WUI - package edit form - new option to say 'depends on' (no need for 'has dependent package')
  • REST api - expose reading and writing 'depends on' property.

Issues

  • How do we deal with dependency at a particular version?
1257162812000000 1266928721000000
#180 enhancement rgrp jwyg v0.11 closed duplicate Tag cloud as way to view CKAN tags

Create big tag cloud with all CKAN tags - perhaps weighting with size and colour...

1257534254000000 1265284374000000
#186 enhancement rgrp rgrp closed duplicate Automated upload to archive.org s3

(Follows on from ticket:107). We want to provide facility for users to automatically upload material.

1257803430000000 1296341182000000
#188 enhancement rgrp rgrp v0.11 closed duplicate Improve package listing views

Propose change to tabular-like format showing these attributes (perhaps should be configurable?)

  • Openness status
  • Title (not sure name is needed)
  • Tags

Cost: 4h

1257870031000000 1265294090000000
#228 enhancement rgrp rgrp closed duplicate Deal with duplicate packages

This needs to be thought out ...

1262085763000000 1290596875000000
#245 enhancement rgrp rgrp closed duplicate Support for composite primary keys

Problem here is that foreign key then becomes "complicated" (composite).

  • Could also deprecate continuity_id field in favour of the basic foreign key on ie
1265882630000000 1297066620000000
#246 enhancement rgrp rgrp closed duplicate Support for primary key not named id

At the moment setting of continuity_id depends on base table pkcol being id. Should not be hard to change this -- and may get for free as part of ticket:245 (composite primary keys)

1265882862000000 1297066757000000
#268 defect rgrp dread closed duplicate Select groups in Package edit form 1268068896000000 1285070682000000
#294 enhancement thejimmyg dread closed duplicate Add/remove extra fields in Package edit form

Currently the package form gives you 3 fields for extras. To get more you have to hit preview. This is obscure. It would be better to have some buttons to add/remove fields, just like with the resources.

1271756591000000 1291830960000000
#296 enhancement johnbywater johnbywater closed duplicate Commit CKAN revisions to changeset system 1272279521000000 1294407032000000
#297 enhancement johnbywater johnbywater closed duplicate Update CKAN repository from changeset system 1272279556000000 1294407051000000
#298 enhancement johnbywater johnbywater closed duplicate Pull changesets from remote CKAN instance 1272279591000000 1294407080000000
#299 enhancement johnbywater johnbywater closed duplicate Merge diverging lines of changesets 1272279698000000 1294407099000000
#306 enhancement rgrp rgrp closed duplicate datapkg build command

Need to be able to build a distribution. Need:

  • new 'build' command
  • specify distribution format. Suggest at the moment a simple zip or tar.gz built in the most straightforward way from the distribution.
1272474212000000 1318181194000000
#308 enhancement rgrp rgrp closed duplicate Autocomplete package names & tags in package search

Extracted from ticket:216.

Dubious of its merit.

1273050549000000 1275302577000000
#321 enhancement thejimmyg johnbywater closed duplicate Delegate authentication to Drupal

When CKAN is included in a Drupal front-end, CKAN edit pages are used in a slave-mode, such that authentication is delegated to the Drupal front-end user model.

The Drupal front-end shall have:

  1. Login page - fixed location, can authenticate users, on successful authentication sets auth cookie and redirects to HTTP_REFERER.
  2. Access control resource - fixed location, can authorise users, on receipt of valid auth cookie returns message listing account details and permitted actions.
  3. Access denied page - fixed location, static resource, gently indicates what has happened, and how to ask for permission.

The CKAN slave edit page shall:

  1. Try to detect a Drupal session key (passed as cookie or as request param).
  2. Redirect to Drupal login page if no session key.
  3. Check authorisation if session key is found.
  4. Redirect to access denied page if session key not authorised.
  5. Present the Package edit page.
  6. Reject unauthenticated or unauthorised edit submissions.
  7. Snag invalid edit submissions from authenticated and authorised users.
  8. Respond to valid edit submissions from authenticated and authorised users, by saving the new package state, and redirecting to Package read page in Drupal front-end.
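
The steps for the slave edit page can be sketched as a single decision function. Everything named here is hypothetical: `authorise` stands in for a call to the Drupal access control resource, and the returned strings are just labels for the outcomes listed above.

```python
def handle_edit_request(session_key, authorise, is_submission=False,
                        is_valid=True):
    """Decide what the CKAN slave edit page does for one request.

    `authorise` is a callable that takes a session key and returns
    True if the Drupal front-end permits editing.
    """
    if not session_key:
        return "redirect:login"
    if not authorise(session_key):
        return "redirect:access-denied"
    if not is_submission:
        return "show:edit-form"
    if not is_valid:
        # Snag invalid submissions by re-showing the form with errors.
        return "show:edit-form-with-errors"
    return "save-and-redirect:package-read"
```

Because the authentication and authorisation checks run before the submission is examined, unauthenticated or unauthorised submissions are rejected by the same two branches that guard the form itself.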
1274705234000000 1291831399000000
#358 enhancement rgrp dread ckan-v1.5 closed duplicate Resources in REST API

(spun out of ticket:336)

Resource added to model API at:

api/rest/resource

Example model request

GET to: /api/2/rest/resource/a3dd8f64-9078-4f04-845c-e3f047125028

returns:

 [{"id": "a3dd8f64-9078-4f04-845c-e3f047125028",
   "package_id": "b8a325c8-af2a-43f3-8245-9db7d73dfbfe",
   "URL": "http://scraperwiki.com/lincolnshire-councillors", 
   "format": "CSV", 
   "Description": "Scrape of www.lincs.gov/councillors.pdf by ScraperWiki.",
   "hash": "", 
   "position": 2
 }]

Authorization

  1. Have it generic (ie. not per resource) and use an action/role on system
  2. Require all resources to attach to packages and inherit their permissions (i.e. read/write etc. if and only if read/write on associated packages)
  3. Introduce Resource in authorization system (requires migration)

Mixed model

Create / Edit:

if resource associated to package:
    check_permissions(package, update)
else:
    check_system_permissions(c.user, model.Action.Resource Create/Update, model.System)
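
A runnable sketch of the mixed model, with plain dictionaries standing in for the authorization tables. The function name, the dictionary layout and the action names `update` and `resource-update` are assumptions for illustration and do not reflect CKAN's actual authorization API.

```python
def can_edit_resource(user, resource, package_permissions,
                      system_permissions):
    """Mixed model: package-level check if attached, system-level otherwise.

    package_permissions: dict mapping (user, package_id) to a set of actions.
    system_permissions: dict mapping user to a set of system-wide actions.
    """
    package_id = resource.get("package_id")
    if package_id:
        # Resource inherits the permissions of its package.
        return "update" in package_permissions.get((user, package_id), set())
    # Unattached resource: fall back to a generic system-level action.
    return "resource-update" in system_permissions.get(user, set())
```
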
1277483282000000 1310128782000000
#394 task johnbywater johnbywater closed duplicate Fix munin on DGU (?) 1280485351000000 1294407189000000
#395 task pudo ckan-v1.3 closed duplicate Set up profiling to analyze performance issues

At the moment, some pages within CKAN tend to load slowly. We should create a profiling setup in which we can measure response times for complete requests and individual method calls.

This could be used to identify bottlenecks and find an appropriate caching or tuning strategy to improve CKAN performance.
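
As a starting point, individual method calls could be timed with a decorator like the one below. This is a sketch: `timed`, `timings`, `slowest` and the sample `render_package_list` function are hypothetical names, not part of CKAN.

```python
import functools
import time
from collections import defaultdict

timings = defaultdict(list)  # function name -> list of durations in seconds


def timed(func):
    """Record the wall-clock duration of every call to `func`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            timings[func.__name__].append(time.perf_counter() - start)
    return wrapper


@timed
def render_package_list(n):
    return ", ".join("package-%d" % i for i in range(n))


def slowest(limit=5):
    """Return (name, total seconds, calls), slowest total first."""
    totals = [(name, sum(ds), len(ds)) for name, ds in timings.items()]
    return sorted(totals, key=lambda item: item[1], reverse=True)[:limit]
```

For whole-request profiling, Python's standard `cProfile` module is an alternative that needs no changes to the profiled functions.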

NB: We should also agree on a maximum request latency.

TODO: Read up on all those QoS tickets to avoid overlapping efforts.

1280824739000000 1294417538000000
#402 task pudo pudo ckan-v1.3 closed duplicate Archiving worker to back up package resources from a CKAN instance

Write a worker that scans all packages in a ckan instance and uploads the data to storage.ckan.net or another suitable storage system.

  • Naming scheme?
    • Bucket: {ckan-instance-id}-{package-name}? {ckan-instance-id}-{package-id}?
      • What happens if names change
    • File: filename? hash?
  • Store hash back on ckan instance?

The caching worker will consume update notifications and fetch packages.

Extra points for:

  • Properly checking for source file modification (Last-modified, Etag)
  • Using PIP VCS Backends for retrieval
  • OFS/S3 Storage
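
Checking for source file modification can be done with a conditional HTTP request. The sketch below only builds the request; the helper name and the idea of storing the previous `ETag` and `Last-Modified` values alongside the archived copy are assumptions.

```python
import urllib.request


def build_conditional_request(url, etag=None, last_modified=None):
    """Build a GET request that asks the server to skip an unchanged file.

    `etag` and `last_modified` are the header values saved from the
    previous successful download, if any.
    """
    request = urllib.request.Request(url)
    if etag:
        request.add_header("If-None-Match", etag)
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    return request
```

If the resource is unchanged, the server is expected to answer 304 Not Modified, which `urllib.request.urlopen` surfaces as an `HTTPError` with code 304, so the worker can skip the upload in that case.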
1281018912000000 1296467635000000
#440 task dread dread closed duplicate Write and pass comprehensive performance tests

Run latest ckan on eu0. Automate some queries and searches. Check load and database connections / processes.

1282226932000000 1294417436000000
#441 requirement dread dread ckan-v1.3 closed duplicate CKAN read-only state

When performing maintenance on CKAN it may be necessary to make CKAN obviously read-only, telling the users and restricting access to 'edit' pages.

Examples of use:

  • Administrator wants to upgrade CKAN or move it to another server. During this time the database is being administered and either edits are lost or can't be done.
  • A CKAN is used just for distributing metadata and so is always read-only. Updates may still arrive through direct db manipulation, e.g.:
    • another (but writable) CKAN instance is connected to the same db
    • restoring database dumps from another CKAN db
  • Should security be breached, all editing could be stopped
1282227314000000 1292586309000000
#444 task dread closed duplicate Discuss package relationships ideas with JF
  • Create test data on visible ckan
  • Discuss with JF
1282299238000000 1294414008000000
#456 story johnbywater dread ckan-v1.2 closed duplicate Daily dump 1282299917000000 1282665858000000
#464 task rgrp dread closed duplicate Request dgu db server access 1282306104000000 1282325194000000
#467 story johnbywater ckan-v1.3 closed duplicate Admin configures CKAN to expect API key in named HTTP header 1282310562000000 1294411681000000
#477 story johnbywater ckan-v1.3 closed duplicate Discover location of the daily database dumps 1282313788000000 1294411761000000
#485 story johnbywater closed duplicate Performance beats QoS criteria 1282425219000000 1294411946000000
#486 requirement johnbywater ckan-v1.3 closed duplicate Catalogue service shall notify and query SOLR service 1282425790000000 1291639321000000
#487 story johnbywater closed duplicate Notify SOLR service of model events 1282425910000000 1291639404000000
#497 story johnbywater johnbywater closed duplicate Respond to CSW "GetRecords" request 1282427334000000 1294407718000000
#500 defect dread ckan-v1.2 closed duplicate Exception from diff

Investigate an exception that has occurred occasionally in the last couple of days on ckan.net:

WebApp Error: <type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'key'

URL: http://ckan.net/revision/diff/dbtune-audioscrobbler?diff=66a47b9e-232a-4838-8674-66fa1a5c76e1&oldid=a99c98be-767a-4e49-9025-2472b2d18b9c
Module weberror.errormiddleware:162 in __call__
<<              __traceback_supplement__ = Supplement, self, environ
                   sr_checker = ResponseStartChecker(start_response)
                   app_iter = self.application(environ, sr_checker)
                   return self.make_catching_iter(app_iter, environ, sr_checker)
               except:
>>  app_iter = self.application(environ, sr_checker)
Module beaker.middleware:73 in __call__
<<                                                     self.cache_manager)
               environ[self.environ_key] = self.cache_manager
               return self.app(environ, start_response)
>>  return self.app(environ, start_response)
Module beaker.middleware:152 in __call__
<<                          headers.append(('Set-cookie', cookie))
                   return start_response(status, headers, exc_info)
               return self.wrap_app(environ, session_start_response)
           
           def _get_session(self):
>>  return self.wrap_app(environ, session_start_response)
Module routes.middleware:130 in __call__
<<                  environ['SCRIPT_NAME'] = environ['SCRIPT_NAME'][:-1]
               
               response = self.app(environ, start_response)
               
               # Wrapped in try as in rare cases the attribute will be gone already
>>  response = self.app(environ, start_response)
Module pylons.wsgiapp:125 in __call__
<<          
               controller = self.resolve(environ, start_response)
               response = self.dispatch(controller, environ, start_response)
               
               if 'paste.testing_variables' in environ and hasattr(response,
>>  response = self.dispatch(controller, environ, start_response)
Module pylons.wsgiapp:324 in dispatch
<<          if log_debug:
                   log.debug("Calling controller class with WSGI interface")
               return controller(environ, start_response)
           
           def load_test_env(self, environ):
>>  return controller(environ, start_response)
Module ckan.lib.base:73 in __call__
<<          # available in environ['pylons.routes_dict']    
               try:
                   return WSGIController.__call__(self, environ, start_response)
               finally:
                   model.Session.remove()
>>  return WSGIController.__call__(self, environ, start_response)
Module pylons.controllers.core:221 in __call__
<<                  return response(environ, self.start_response)
               
               response = self._dispatch_call()
               if not start_response_called:
                   self.start_response = start_response
>>  response = self._dispatch_call()
Module pylons.controllers.core:172 in _dispatch_call
<<              req.environ['pylons.action_method'] = func
                   
                   response = self._inspect_call(func)
               else:
                   if log_debug:
>>  response = self._inspect_call(func)
Module pylons.controllers.core:107 in _inspect_call
<<                        func.__name__, args)
               try:
                   result = self._perform_call(func, args)
               except HTTPException, httpe:
                   if log_debug:
>>  result = self._perform_call(func, args)
Module pylons.controllers.core:60 in _perform_call
<<          """Hide the traceback for everything above this method"""
               __traceback_hide__ = 'before_and_this'
               return func(**args)
           
           def _inspect_call(self, func):
>>  return func(**args)
Module ckan.controllers.revision:119 in diff
<<          c.revision_to = model.Session.query(model.Revision).get(
                   request.params.getone('diff'))
               diff = pkg.diff(c.revision_to, c.revision_from)
               c.diff = diff.items()
               c.diff.sort()
>>  diff = pkg.diff(c.revision_to, c.revision_from)
Module ckan.model.package:340 in diff
<<                              display_id = to_obj_rev.tag.name
                               elif obj_class.__name__ == 'PackageExtra':
                                   display_id = to_obj_rev.key
                               else:
                                   display_id = related_obj_id[:4]
>>  display_id = to_obj_rev.key
AttributeError: 'NoneType' object has no attribute 'key'
CGI Variables
DOCUMENT_ROOT	'/htdocs'
GATEWAY_INTERFACE	'CGI/1.1'
HTTP_ACCEPT	'*/*'
HTTP_ACCEPT_ENCODING	'gzip'
HTTP_ACCEPT_LANGUAGE	'zh-cn,zh-tw'
HTTP_CONNECTION	'close'
HTTP_HOST	'ckan.net'
HTTP_USER_AGENT	'Baiduspider+(+http://www.baidu.com/search/spider.htm)'
PATH	'/usr/local/bin:/usr/bin:/bin'
PATH_INFO	'/revision/diff/dbtune-audioscrobbler'
PATH_TRANSLATED	'/home/okfn/var/srvc/ckan.net/pyenv/bin/ckan.net.py/revision/diff/dbtune-audioscrobbler'
QUERY_STRING	'diff=66a47b9e-232a-4838-8674-66fa1a5c76e1&oldid=a99c98be-767a-4e49-9025-2472b2d18b9c'
REMOTE_ADDR	'123.125.66.32'
REMOTE_PORT	'63767'
REQUEST_METHOD	'GET'
REQUEST_URI	'/revision/diff/dbtune-audioscrobbler?diff=66a47b9e-232a-4838-8674-66fa1a5c76e1&oldid=a99c98be-767a-4e49-9025-2472b2d18b9c'
SCRIPT_FILENAME	'/home/okfn/var/srvc/ckan.net/pyenv/bin/ckan.net.py'
SCRIPT_URI	'http://ckan.net/revision/diff/dbtune-audioscrobbler'
SCRIPT_URL	'/revision/diff/dbtune-audioscrobbler'
SERVER_ADDR	'10.226.226.118'
SERVER_ADMIN	'[no address given]'
SERVER_NAME	'ckan.net'
SERVER_PORT	'80'
SERVER_PROTOCOL	'HTTP/1.1'
SERVER_SIGNATURE	'<address>Apache/2.2.9 (Debian) mod_wsgi/2.5 Python/2.5.2 Server at ckan.net Port 80</address>\n'
SERVER_SOFTWARE	'Apache/2.2.9 (Debian) mod_wsgi/2.5 Python/2.5.2'
WSGI Variables
application	<beaker.middleware.CacheMiddleware object at 0x9f603ec>
beaker.cache	<beaker.cache.CacheManager object at 0x9f6042c>
beaker.get_session	<bound method SessionMiddleware._get_session of <beaker.middleware.SessionMiddleware object at 0x9f602ac>>
beaker.session	{'_accessed_time': 1282385101.4243281, '_creation_time': 1282385101.4243281}
mod_wsgi.application_group	'ckan.net|'
mod_wsgi.callable_object	'application'
mod_wsgi.listener_host	''
mod_wsgi.listener_port	'80'
mod_wsgi.process_group	''
mod_wsgi.reload_mechanism	'0'
mod_wsgi.script_reloading	'1'
mod_wsgi.version	(2, 5)
paste.cookies	(<SimpleCookie: >, '')
paste.parsed_querystring	([('diff', '66a47b9e-232a-4838-8674-66fa1a5c76e1'), ('oldid', 'a99c98be-767a-4e49-9025-2472b2d18b9c')], 'diff=66a47b9e-232a-4838-8674-66fa1a5c76e1&oldid=a99c98be-767a-4e49-9025-2472b2d18b9c')
paste.registry	<paste.registry.Registry object at 0x104552ec>
paste.throw_errors	True
pylons.action_method	<bound method RevisionController.diff of <ckan.controllers.revision.RevisionController object at 0xfb17aec>>
pylons.controller	<ckan.controllers.revision.RevisionController object at 0xfb17aec>
pylons.environ_config	{'session': 'beaker.session', 'cache': 'beaker.cache'}
pylons.pylons	<pylons.util.PylonsContext object at 0x10286d4c>
pylons.routes_dict	{'action': u'diff', 'controller': u'revision', 'id': u'dbtune-audioscrobbler'}
repoze.who.logger	<logging.Logger instance at 0xa16e0cc>
repoze.who.plugins	{'openid': <OpenIdIdentificationPlugin 167584972>, 'auth_tkt': <AuthTktCookiePlugin 169253516>}
routes.route	<routes.route.Route object at 0x9f3690c>
routes.url	<routes.util.URLGenerator object at 0xfd8d7cc>
webob._parsed_query_vars	(GET([('diff', '66a47b9e-232a-4838-8674-66fa1a5c76e1'), ('oldid', 'a99c98be-767a-4e49-9025-2472b2d18b9c')]), 'diff=66a47b9e-232a-4838-8674-66fa1a5c76e1&oldid=a99c98be-767a-4e49-9025-2472b2d18b9c')
webob.adhoc_attrs	{'language': 'en-us'}
wsgi process	'Multi process AND threads (?)'
wsgi.file_wrapper	<built-in method file_wrapper of mod_wsgi.Adapter object at 0x103a5bf0>
wsgiorg.routing_args	(<routes.util.URLGenerator object at 0xfd8d7cc>, {'action': u'diff', 'controller': u'revision', 'id': u'dbtune-audioscrobbler'})
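
The traceback shows `to_obj_rev` being `None` when `.key` is read in `ckan.model.package` `diff`. A defensive guard along the following lines would avoid the `AttributeError`. This is only a sketch: the helper name is hypothetical, the fallback reuses the short-id form seen in the `else` branch of the traceback, and it does not explain why the revision object is missing in the first place.

```python
def extra_display_id(to_obj_rev, related_obj_id):
    """Return the extra's key, or a short id if the revision object is missing."""
    if to_obj_rev is None:
        # Assumption: fall back to the first four characters of the id.
        return related_obj_id[:4]
    return to_obj_rev.key
```
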



1282553033000000 1287747652000000
#501 requirement pudo ckan-v1.2 closed duplicate Read-only maintenance mode

CKAN should have a read-only maintenance mode with a nice little banner on all pages, appropriate REST messages etc. Bonus points if this is triggered via an environment variable and thus can be triggered by the surrounding apache.

1282554617000000 1282724566000000
#512 story dread closed duplicate User creates package via API with incorrect core fields specified 1282754750000000 1294917121000000
#514 defect dread ckan-v1.2 closed duplicate Inconsistent use of 'location' header in API

When you create a package then the 'location' header gets set. This doesn't happen for any other domain objects. I think this should be consistent - either none or all.

I've removed the info about the header in the docs in the meantime.

1282757357000000 1282757391000000
#535 defect dread ckan-v1.2 closed duplicate genshi error when logged into sl.ckan.net

Genshi exception when rendering the page whilst logged in to sl.ckan.net.

1283165774000000 1283167040000000
#537 task wwaites wwaites closed duplicate Caching and Performance improvement

There are several places where performance is unacceptably slow. Even in places where it is not, the system could still be more responsive for read requests.

Introducing caching has to be done carefully and should be done in a standards compliant manner.

General strategy

  • Where possible, cache output within the pylons app (beaker).
  • Facilitate external caching in an end-user's web browser or a caching proxy
  • Slightly stale data is not necessarily much of a problem so allow the output to be cached for a relatively short period (e.g. 5-15 minutes).
  • When cache expiry has been reached, a request will be made to the server. The server should check if its internally cached data is still valid, and serve that, otherwise regenerate the data.

Tasks

These tasks should be broken into sub-tickets:

  • Caching of parts of templates that are expensive to render (package list, tag list, group list).
  • Caching of entire output using Beaker, particularly for API read operations.
  • Check whether the cache should be invalidated by checking whether anything in the output would have changed, i.e. checking timestamps on package modifications. This is a natural place to introduce the ETag, which will help browsers and web caches.
  • Cache infrastructure front end: Varnish, Squid, etc. To do this right, the controllers need to set the cache control headers appropriately (max-age, must-revalidate). This is a good resource: http://www.mnot.net/cache_docs/#CACHE-CONTROL
    • Deploy Varnish on a host dedicated to this purpose for research. This will be useful for other sites as well.
    • Do not configure Varnish to ignore cache control headers or otherwise behave in a non-HTTP/1.1-compliant manner.
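
The ETag and cache control tasks above could be sketched roughly as follows. This is an illustration under assumptions, not CKAN code: `make_etag` and `conditional_response` are hypothetical helpers, request headers are modelled as a plain dict, and the `If-None-Match` comparison is simplified to a single exact match (the header may also carry a list of tags or `*`).

```python
import hashlib


def make_etag(modified_timestamps):
    """Derive an ETag from the modification timestamps of the packages shown."""
    digest = hashlib.sha1()
    for ts in sorted(modified_timestamps):
        digest.update(str(ts).encode("utf-8"))
        digest.update(b"\n")
    # ETag values are quoted strings.
    return '"%s"' % digest.hexdigest()


def conditional_response(request_headers, modified_timestamps, render, max_age=300):
    """Return (status, headers, body); render() is only called when needed."""
    etag = make_etag(modified_timestamps)
    headers = {
        "ETag": etag,
        # Let browsers and proxies reuse the response briefly, then revalidate.
        "Cache-Control": "max-age=%d, must-revalidate" % max_age,
    }
    if request_headers.get("If-None-Match") == etag:
        # Nothing in the output has changed: skip the expensive render.
        return 304, headers, b""
    return 200, headers, render()
```

The default `max_age` of 300 seconds matches the low end of the 5-15 minute window suggested in the general strategy; any modified package changes the ETag, so a revalidating client gets fresh output.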

Future Work

  • Investigate having the ckanclient library maintain a local cache, as a web browser would.
  • Investigate using a CDN like Google Storage or Amazon for serving cached data.
1283184362000000 1311178929000000
#543 task wwaites rgrp closed duplicate Investigate partial page caching and edge-side includes

Edge-side includes or partial page caching are a standard way to deal with caching of pages in which some (usually small) part of the content cannot be cached or should be cached in a different manner (e.g. much more briefly) than the rest of the page.

Edge-side includes have the advantage that they integrate with general 3rd-party caching systems such as Varnish.

Introducing either partial page caching or ESI will require some reworking of how pages are rendered.

1283244784000000 1311178918000000
#544 requirement pudo ckan-v1.3 closed duplicate Backport facet browsing to CKAN 1.2

This is in IATI, would be nice to have in generic CKAN.

1283267292000000 1291638966000000
#561 defect pudo ckan-v1.2 closed duplicate Deleted packages are returned in the API

Anja is reporting this. A severe bug, I think.

1283775578000000 1283775711000000
#563 requirement thejimmyg johnbywater closed duplicate Support a minimal CSW server interface or export to GeoNetwork 1284033576000000 1296592472000000
#570 story johnbywater johnbywater ckan-v1.3 closed duplicate Validate metadata document against UKLP schematron 1284040256000000 1294407974000000
#573 story johnbywater closed duplicate Add metadata entity to harvesting queue 1284045353000000 1284220987000000
#574 story johnbywater closed duplicate Create UKLII package with attributes from remote metadata record 1284045805000000 1284222410000000
#580 story johnbywater closed duplicate Write (create or update) CKAN package for metadata document 1284210730000000 1284223068000000
#612 task johnbywater johnbywater ckan-v1.3 closed duplicate Check given XML schema validates given metadata document 1284218750000000 1294408188000000
#617 task johnbywater johnbywater ckan-v1.3 closed duplicate Check UKLP schematron validates given metadata document 1284219298000000 1294408164000000
#665 requirement johnbywater johnbywater ckan-v1.3 closed duplicate The system shall support withdrawing a harvested dataset or service from publication

Discussion between John and Peter:

Given we can identify a document, does the disappearance of a document from a registered source imply the disappearance of the metadata (such that we delete packages once the documents disappear from the registered source)?

I would expect a more explicit 'delete'. The UKLP Use Case Library describes this as "withdraw a dataset or service from publication" (part of UCD03 Maintain resources).

1285588250000000 1297268097000000
#691 requirement thejimmyg johnbywater ckan-backlog closed duplicate Package Relationships 1286822735000000 1295610145000000
#702 requirement johnbywater johnbywater ckan-v1.2 closed duplicate The system shall support changing package groups when editing a package 1287403778000000 1287403850000000
#730 task rgrp closed duplicate Back up package data from all CKAN packages to storage.ckan.net

Write a worker that scans all packages in a CKAN instance and uploads the data to storage.ckan.net.

  • Naming scheme?
    • Bucket: {ckan-instance-id}-{package-name}? {ckan-instance-id}-{package-id}?
      • What happens if names change?
    • File: filename? hash?
  • Store hash back on ckan instance?
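
The naming questions above are open; one possible answer can be sketched as below. The helpers `bucket_name` and `file_key` are hypothetical, and the choices they encode (package id rather than name for the bucket, a SHA-256 content hash in the file key) are assumptions for illustration, not decisions recorded in this ticket.

```python
import hashlib


def bucket_name(instance_id, package_id):
    """Bucket per package, keyed on the id so that renames do not move data."""
    return "%s-%s" % (instance_id, package_id)


def file_key(data, filename):
    """Key a file by its content hash, keeping the filename for readability."""
    digest = hashlib.sha256(data).hexdigest()
    return "%s/%s" % (digest, filename)
```

Keying on the content hash means unchanged files map to the same key on every run, and the same hash could be stored back on the CKAN instance if that option is chosen.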
1287737109000000 1291139609000000
#740 requirement thejimmyg johnbywater closed duplicate Get copy of harvested metadata for a given package 1287779799000000 1296592889000000
#748 story johnbywater closed duplicate Link new sample package to previous sample package of continuous series 1288013849000000 1294412976000000
#749 story johnbywater closed duplicate Fold up continuous series in search results behind newest sample package 1288014002000000 1294412986000000
#750 enhancement thejimmyg johnbywater closed duplicate Get CSW records modified since given time 1288014402000000 1296592940000000
#751 story johnbywater closed duplicate Get harvested document for a given package 1288014518000000 1288014616000000
#752 task johnbywater johnbywater ckan-v1.3 closed duplicate Change package attribute names used by Gemini harvesting to DGU "v.4" 1288039205000000 1294408472000000
#755 task johnbywater johnbywater ckan-v1.3 closed duplicate Add filter attribute to harvest source entity 1288040506000000 1294408632000000
#756 task johnbywater johnbywater ckan-v1.3 closed duplicate Add filter field to harvest source form 1288040545000000 1294408642000000
#757 task thejimmyg johnbywater ckan-v1.3 closed duplicate Create migration script to add harvest source filter attribute to existing tables 1288040584000000 1296593448000000
#758 task johnbywater johnbywater ckan-v1.3 closed duplicate Change API documentation to indicate harvest source entity has filter attribute 1288040643000000 1294409053000000
#759 story johnbywater johnbywater ckan-v1.3 closed duplicate Construct and send filtered CSW GetRecords request 1288040753000000 1294408652000000