{22} Trac tickets (2647 matches)

Results (1 - 100 of 2647)

1 2 3 4 5 6 7 8 9 10 11
Id Type Owner Reporter Milestone Status Resolution Summary Description Posixtime Modifiedtime
#84 enhancement kindly rgrp ckan-future assigned Revert support on versioned objects

Basic revert in the classic wiki form is already support by purging a Revision. However may wish to support:

  1. Cases where multiple objects changed in a revision but only want to revert 1 (low priority)
  2. Want to revert but have reversion as a new revision of that object.

Seems low priority at present.

Cost: ?

1248339543000000 1340626385000000
#140 enhancement rgrp ckan-backlog new News section on front page

Have a news section (suggest as a sidebar item).

News section will link to latest 3/4 blog posts on CKAN from blog.okfn.org.


  • Suggest pulling via rss or similar.
  • Will want to cache this ...

Cost: 4h?

1254902541000000 1265625159000000
#143 enhancement thejimmyg dread ckan-backlog assigned Most active users listed on homepage

Display league of users' recent activity on homepage.

1255010373000000 1312372381000000
#235 enhancement tobes dread ckan-v1.9 assigned Resource format normalization and detection

Try to gather proper MIME information for all package resources in CKAN. This is a shared ticket with dcat-tools (https://bitbucket.org/pudo/dcat-tools), i.e. opendatasearch.org. This can then also be used by ckanrdf, the CKAN RDF conversion service.


  • Create a Google Spreadsheet with two Worksheets: "MIME-Mappings", i.e. "CSV" -> "text/csv" and "Name mappings", i.e. "text/csv" -> "Comma-Separated Spreadsheet".
  • Collect and map surface forms from all CKANs
  • Access this via Swiss and apply, store as a PackageResource? extra field pending #826 (Resource extras).
  • Add heuristics for format auto-detections:
    • Map well-known file extensions
    • Recognize obvious magic (Zip, Tar)
    • Peek into Zipfile/Tarfiles?
  • Define a convention for generic data types (many CKAN packages have only "Spreadsheet" defined, either detect specific type or set MIME to */tabular-data or similar)
  • See also: #816 (Autocomplete for the resource format field)
1263827604000000 1340627624000000
#250 enhancement icmurray dread ckan-v1.9 assigned RDF link in Atom feed

Add link to RDF representation of a package in our Atom feed.

1266507695000000 1340631430000000
#256 requirement dread ckan-backlog assigned Package relationships - 3. Edit in WUI


  • Editable as part of package or separately? (e.g. like authz)
  • Do we normalize to only one type name of the pair?
  • Do we allow create of relationship from both ends (e.g. only from dependency to dependent or either way?)
1266928561000000 1339774714000000
#277 enhancement zephod dread ckan-backlog assigned Set some config options / settings in WUI (extension)

Use case

As a ckan administrator I want to easily change options about the CKAN install.


Settings to be in DB


## Title of site (using in several places including templates and <title> tag
ckan.site_title = CKAN

## Logo image to use (replaces site_title string on front page if defined)
ckan.site_logo = http://assets.okfn.org/p/ckan/img/ckan_logo_box.png

## Site tagline / description (used on front page)
ckan.site_description = 

## Used in creating some absolute urls (such as rss feeds, css files) and 
## dump filenames
ckan.site_url =

## Favicon (default is the CKAN software favicon)
ckan.favicon = http://assets.okfn.org/p/ckan/img/ckan.ico

## An 'id' for the site (using, for example, when creating entries in a common search index) 
## If not specified derived from the site_url
# ckan.site_id = ckan.net

## API url to use (e.g. in AJAX callbacks)
## Enable if the API is at a different domain
# ckan.api_url = http://www.ckan.net

## html content to be inserted just before </body> tag (e.g. google analytics code)
## NB: can use html e.g. <strong>blah</strong>
## NB: can have multiline strings just indent following lines
# ckan.template_footer_end = 

NB: these will still need to be stored somewhere for loading on initialization. do this in db init function ...

Settings / Options / KeyValues? Table


  • [namespace]: ? only if KeyValues? (for settings this would then always be settings)
  • key
  • label
  • value (json)
  • type (e.g. date and to specify in advance what type should be)
  • description
  • tags: ?? (for grouping ...)

Loading settings from DB

Do this in ckan/config/environment.py


  • /ckan-admin/settings
  • Show label, plus description plus text field


  • Would be part of ckan-admin section and hence build on ticket:833 (Administrative dashboard)
1269274861000000 1318247121000000
#285 enhancement rgrp assigned Paginate list of packages on tag read page

Is this worth doing? On hmg.ckan.net start to have a lot of packages with a given tag ...

1270664606000000 1340631923000000
#331 enhancement rgrp ckan-backlog new Timezone of CKAN timestamps should be configurable

Revisions are timestamped using the server's clock, which may not relate to the expected timezone for the site. e.g. the Norway site has a server on GMT. No timezone info is displayed either.

Would like to set timezone for a CKAN instance to use in rendering revision timestamps. For example, use CET or EST timezone.

1275302440000000 1339774701000000
#338 story johnbywater johnbywater v1.1 closed Reference groups by ID in addition to name, since group names can change 1275901137000000 1280446480000000
#351 enhancement dread ckan-backlog new Homepage: list new, updated and 'hot' packages

Have a simpler list of exciting data, as opposed to the big revision list.

For example:

Hot data

New packages: package1, package2, package3
Updated resources: package1, package2, package3
Popular packages: 
1276595816000000 1339774677000000
#369 enhancement [email protected] ckan-backlog new "Package Listing Key" should appear on Tag results

Currently there's a nice legend titled "Package Listing Key" that appears in right side of "Browse Packages" results. The same key should show on other search results like when searching for a tag.

1279821634000000 1339774666000000
#370 enhancement [email protected] ckan-backlog new Use better email encryption for author_email and maintainer_email

The JavaScript? email encryption used is not very reassuring. Google's MailHide? is a much better solution that is easily implemented.


Check on the Mailhide API where there are even some Python libraries already built.

1279821819000000 1339774649000000
#372 bug johnbywater johnbywater ckan-v1.2 closed Fix system limits on CKAN for DGU

Set limits in /etc/security/limits.conf so that we can always ssh in at least. Requested by DGU.

1279885752000000 1281522535000000
#837 enhancement rgrp ckan-backlog new CKAN integration with freebase gridworks / google refine

Thread: http://lists.okfn.org/pipermail/ckan-discuss/2010-November/000718.html

Scenario 1

  1. User installs Refine and CKAN extension for refine
  2. On booting refine and asked to load data they can choose from any data package on CKAN.net (or any other CKAN instance)
  3. They edit the dataset on Refine
  4. On save (or perhaps as a separate option) they are prompted as to whether they wish to sync the dataset back to CKAN (either as a new package or as a new resource on the existing package)

NB: for the dataset sync back some form of "CKAN" storage would be required (we already have storage.ckan.net running but a closer integration would be required)

Scenario 2

  1. User visits a package on CKAN.net (or another CKAN instance)
  2. There is a button on the page "View and edit this dataset in Google Refine"
  3. Click button -- ask them if they have Google refine installed
    • Yes: instructions for loading dataset into refine
    • No: load dataset in hosted version of google refine (we could run this)
  4. User edits dataset and hits save. As in previous scenario they are prompted to sync the dataset.
1291140609000000 1339774605000000
#895 defect memespring ckan-backlog new Add version number (or simular) to css/js includes query string

Updates to css after a new deploy don't come through without a hard refresh. Adding the version number to the include urls will solve this e.g.


1294343382000000 1339774593000000
#909 enhancement pudo ckan-backlog assigned DCat importer for CKAN

Write an importer that supports most well known variants of DCat in importing the data as CKAN packages. In particular, the following sources should be supported:

  • CKANrdf generated exports
  • opengov.se RDF (not really DCat)
  • Sunlight Nationdatacatalog as harvested by dcat-tools.
1295265958000000 1346669602000000
#924 enhancement dread ckan-backlog new Search box has no search button

The search box at the top-right of CKAN's page doesn't have a 'go' button. I feel that a larger percentage of users expect a 'go' or 'search' button on the right-hand side of the box to press to start searching. Techies tend to know the keyboard shortcut of pressing 'carriage-return' but it might be better to follow standard practise on this.

Examples with 'search' button: Internet Explorer, Firefox, Google, Amazon, trac Examples without: ?

1295867533000000 1323170436000000
#948 enhancement dread ckan-future assigned Highlight (to a sysadmin) which packages are deleted

When a customer logs in as a sysadmin then he/she see all packages, including deleted and pending ones. These are hidden to the average user, but the sysadmin has no idea of this until he clicks on the package and sees at the bottom 'state: deleted'.

It should be more obvious than that on the search view - an icon, message or crossed-out name to packages are deleted.

1296646369000000 1338206280000000
#979 enhancement kindly dread ckan-backlog assigned Edit Resource extras in the API

Follows on from #826. We can now edit resource extras in the WUI (to some extent - see #978 for remaining issues) and we can view resource extras in the API, but we can't yet edit them in the API.

1297429777000000 1315222244000000
#989 enhancement kindly pudo ckan-future new Extending the model from plugins

We need to support extending the model from plugins. This could involve:

  • Adding a plugin hook to extend the mapper
  • Adding an upgrade hook for plugin schema migrations
  • Documenting how this is to be done
  • Find a way to avoid conflicts
1297689724000000 1340034311000000
#1009 enhancement pudo rgrp ckan-backlog new Improvements to user accounts sytem
  • Forgot password (email a new password)
  • Confirm email
  • Do not show register page if you are logged in (redirect to home page)
  • ticket:1010 - listing of users.
    • Do not use /user for general user account home page (either user normal user page /user/{id} or /user/myaccount)
1298635991000000 1310128574000000
#1041 enhancement thejimmyg thejimmyg ckan-backlog assigned Start Using the CKAN Wiki for Tutorial-style documentation

For example, I will document the following:

I'd love if someone else would write:

  • An authorisation tutorial covering the core model, the command line tools and examples of every possible way of using the system
  • A HOWTO guide with screenshots for adding a package
1300284715000000 1312372367000000
#1062 defect johnglover sebbacon ckan-backlog assigned Data preview encoding error

The preview of "Species Misc Turtle Download" at http://ckan.net/package/taxonconcept results in the following error:

Unable to Preview - Had an error from dataproxy: Data Transformation Error (Data transformation failed. Reason: 'utf8' codec can't decode byte 0x8b in position 1: unexpected code byte

1301396143000000 1311773731000000
#1069 enhancement tobes rgrp assigned Stub datasets (request for datasets)

Idea is to have stubs for datasets that someone wants but don't yet exist (or haven't been discovered) in the way one has stub pages on a wiki.

We could do this within the existing model by a slight 'abuse' - create a dataset and mark it with a special tag e.g. todo.does-not-yet-exist or similar ...

(Just as we have datasets listed that exist but aren't available ...)

Alternative would be to have a request for datasets subsystem.

I prefer the stub dataset model because it's simpler, provides a simple workflow (as a dataset is found or comes into existence), and the package page provides a natural space in which to accumulate information about what is wanted and what exists.


  • Agree a new dedicated tag. e.g. todo.does-not-exist
1301666919000000 1340632215000000
#1077 enhancement kindly rgrp ckan-backlog new Move to simpler vdm system

Option 1: 'Changeset' Model

See ticket:1135 for vdm ticket. This would involve a) moving to changeset in vdm b) doing the migration in ckan to support this.

Have developed a new "changeset" based model for revisioning in vdm.


  • The main challenge with this change is schema and data migration

Every revisioned object has a revision_id and revision attribute.

Approximate algorithm:

Revision -> Changeset

for revtype in [PackageRevision, ...]:
    for pkgrev in package_revision:
        changeset = lookupchangeset(package_revision)
        ChangeObject(cset, (table, id), dictize(pkgrev))


  • does pkg include tags attributes or not? or we have to dictize, pkgrev, pkg2tagrev, and tag. Probably the latter.

Option 2: Simplify Revision Object Model

Just use a simpler vdm, see ticket:1136 (move to SessionExtension) and ticket:1137 (remove need for statefulness in vdm).


Advantage of Option 1 versus 2:

  • Easier support for pending state and similar behaviour
  • No need to introduce new tables (and hence migrations) when making something revisioned (or not).


  • Migration is required
  • More difficult to query revision history.
    • Could be addressed by having ChangeObject have separate cols for table name and id but would likely be more difficult.
  • Performance (?)
    • Have one big ChangeObject table to query when looking at changed objects rather than many revision tables.
      • Not sure this is a biggie as even with Revision model biggest revision object tables are probably on the order of the ChangeObject table


Implement Option 2 and leave Option 1 for present.

Option 1 includes Option 2 so it seems that that is required in either case (so we may as well with Option 2).

Option 1 requires significant effort (esp migration) so leave for present and then review the situation at some later date.

1302304464000000 1340034345000000
#1096 defect rufuspollock pudo ckan-future new [super] CKAN Hosted

Many users of CKAN want to have their own instance without much effort. Setting these up in separate places is a maintenance nightmare, we should much rather have some tenant separation in core CKAN. Some ideas:

  • introduce model.Site and c.site
    • site has: custom CSS, extra_template_path, title, languages list, package_form, group_form (all configured via web UI)
  • Subdomain detector to activate sites.
  • use site in Authorizer instead of System, have a NullSite? for global things
  • allow cross-site search
  • packages are in a list of sites, m:n rather than 1:n
    • list of sites is string-based, can contain sites not in site table to express harvested external material which is not editable locally.
1303235062000000 1339774484000000
#1101 enhancement sebbacon ckan-backlog new Integrate googlanalytics into site nav

There's a stats plugin (e.g. at http://trac.ckan.org/ticket/832).

Output from the googleanalytics plugin should append to that page, if the stats plugin is present.

Possibly the stats plugin and the googleanalytics plugin should be merged?

Finally, if the stats plugin is active, then a link to the stats page should be added to the main site footer.

1303393926000000 1339774582000000
#1120 enhancement tsm ckan-backlog new Atom feeds of each tag

Tags could/should have an Atom feed. This would mean that every edit to relevant packages could be easily monitored. See [1].

[1] http://lists.okfn.org/pipermail/ckan-discuss/2011-May/001162.html

1304309285000000 1339774568000000
#1130 enhancement lucychambers assigned First time users

Send users to FAQ first time on CKAN

1304938761000000 1340633514000000
#1134 CREP amercader ckan-backlog new CREP0003: Description and Configuration of Harvesters

Proposer: Adrià Mercader


The new harvester interface allows to create harvesters for different sources, but right now harvesters don't have many ways to describe and configure themselves. We need a way of allowing them to:

  • Expose their type and other details so they can be used internally and on the UI.
  • Define configuration settings for particular harvester instances.

The Problem

Harvester description

The current UI for adding and editing harvest sources is the same used in ckanext-dgu, and thus the 3 harvester types used in DGU to harvest various GEMINI realted sources are hardcoded in the form. The form will be migrated to a DGU-independent one, so we need the harvesters to provide all the necessary data. There is a current get_type method that returns the harvester type, but for make it compatible with the DGU forms, it returns a machine-readable string (e.g. "CSW Server"), making it error prone.

Arbitrary configuration

In the current implementation, when the harvest process is started, ckanext-harvest looks for all the available plugins that implement the IHarvester interface and calls the appropiate methods for the current stage (gather_stage,fetch_stage,import_stage). At these stages, harvesters have no way of applying arbitrary configuration options, so all harvesters of the same type behave on the same way. For instance, the CKAN harvester needs a way to define the API version to use when harvesting remote instances (Right now, the version 2 is hardcoded on the code).


Harvester description

Harvesters will need to provide the following information so the UI form can be built:

  • name: machine-readable name (e.g. "waf"). This will be the value stored in the database, and the one used by ckanext-harvest to call the appropiate harvester.
  • title: human-readable name (e.g. "Web Accessible Folder (WAF)"). This will appear in the form's select box.
  • description: a description of what the harvester does (e.g. "A Web Accessible Folder (WAF) displaying a list of GEMINI 2.1 documents"). This will appear on the form as a guidance to the user.

The way to provide it will be an info method that all harvesters must implement, which will return a dictionary with the previous elements:

        'name': 'csw',
        'title': 'CSW Server',
        'description': 'A server that implements OGC's Catalog Service 
                        for the Web (CSW) standard'

Arbitrary configuration

As different harvesters will have very different needs, we need to provide a way to persist arbitrary configuration flags for each harvest source. The more flexible way given the current architecture in my opinion would be to store the configuration options as a JSON encoded object as a property of the harvest source (There already is an unused DB field called config in the database) (Maybe using JsonType??).

This will mean adding an extra field in the harvest source form to allow entering the configuration. This could be just a simple text field where users enter the JSON encoded object or a more clever mechanism (i.e an "Add a configuration flag" link that adds two new text fields for the key and value for each flag, and a mechanism to later build the JSON object). In any case, this should probably be hidden in an "Advance options" section.

Why do it this way

Harvester description

The info method would provide a single point to get all the information related to the harvester, and future properties could be added to the dictionary returned without having to modify the interface.

Arbitrary configuration

There is an already existing config field in the database, so we won't need to change the model. Harvesters could access the config object at any of the stages. Of course they could provide default values in their implementations so users don't need to enter them everytime.

Implementation plan


Risks and mitigations

The highest risk on the harvesters info method side is that harvester implementation don't offer one of the necessary properties (namely name and title). This could fire a warning when showing the UI form or using the CLI.


Adrià Mercader to do it.


None yet.

1305108868000000 1339774554000000
#1135 enhancement kindly rgrp assigned Changeset model for vdm

Move to Changeset model for vdm.

A changeset model is like an Audit-Log model in which we just record Changesets with Change-Objects rather than have Revision-Objects for each Object that is revisioned.

This change would also incorporate significant simplication of vdm.

1305209986000000 1340632267000000
#1136 enhancement kindly rgrp assigned Move to SessionExtension in vdm

When vdm was created there was no SessionExtension so we use MapperExtension for doing revisioning. Now that SessionExtension? exists we should use it. We can also follow the existing SQLAlchemy recipe: <http://www.sqlalchemy.org/docs/orm/examples.html?highlight=versioning#versioned-objects>

1305210855000000 1340632980000000
#1137 enhancement kindly rgrp assigned Remove need for statefulness in vdm

Statefulness, especially statefulness for relation (esp m2m) is cause of most of the complexity in vdm. It is required because, atm, revision objects have FKs to continuity objects.

This ticket proposes the following changes:

NB: this could be limited just to case of join tables (leaving state stuff on other tables)

  • Remove FKs from revision to continuity (or allow for them to be nullable).
    • We could just limit this to m2m stuff
  • Delete of an object leads to:
    • Deletion of continuity object
    • Adding an entry in revision table with state set to deleted (we retain state on revision table)

If this is done we will no longer need to worry about filtering on state on relationships as join table will only contain "active" relationships.

If we do this on all tables we remove need for any state awareness in client (e.g. no need to filter tables on active state).

The only disadvantage of this change is that undeletion becomes more problematic (we have to recreate some continuity objects).

1305211628000000 1340631974000000
#1144 enhancement timmcnamara ckan-backlog new Support DSPL

DSPL, the Dataset Publishing Language, is being promoted by Google for its "Google Public Data Explorer" system. It is an XML format with metadata.

The format is described on the developer docs ofthe Google Code site.

Google provides a Python script which reads CSV data and generates DSPL

Sample from http://code.google.com/apis/publicdata/docs/dspl_sample.html:

<?xml version="1.0" encoding="UTF-8"?>
<dspl xmlns="http://schemas.google.com/dspl/2010"

  <import namespace="http://www.google.com/publicdata/dataset/google/time"/>
  <import namespace="http://www.google.com/publicdata/dataset/google/quantity"/>
  <import namespace="http://www.google.com/publicdata/dataset/google/entity"/>
  <import namespace="http://www.google.com/publicdata/dataset/google/geo"/>
      <value>My statistics</value>
      <value>Some very interesting statistics about countries</value>

      <value>Bureau of Statistics</value>

    <topic id="geography">
    <topic id="social_indicators">
        <name><value>Social indicators</value></name>
      <topic id="population_indicators">
          <name><value>Population indicators</value></name>
      <topic id="poverty_and_income">
          <name><value>Poverty & income</value></name>
      <topic id="health">

    <!-- As noted in the tutorial, this concept should extend quantity:amount.-->
    <concept id="population">
          <value>Size of the resident population.</value>
      <topic ref="population_indicators"/>
      <type ref="integer"/>

    <!-- This country concept is defined for educational purposes only. A country
    concept exists in the Google geo dataset. See:

    http://code.google.com/apis/publicdata/docs/canonical/geo.html --> 
    <concept id="country" extends="geo:location">
          <value>My list of countries</value>
      <type ref="string"/>
      <property id="name">
          <name><value xml:lang="en">Country name</value></name>
            <value xml:lang="en">The official name of the country</value>
        <type ref="string"/>
      <table ref="countries_table"/>

    <!-- This US state concept is defined for educational purposes only. A US state
      concept exists in the Google geo US dataset. See:

      http://code.google.com/apis/publicdata/docs/canonical/geo.us.html --> 
    <concept id="state" extends="geo:location">
          <value>US states</value>
      <type ref="string"/>
      <property concept="country" isParent="true"/>
      <table ref="states_table"/>

    <concept id="gender" extends="entity:entity">
          <value>Gender, Male or Female</value>
        <totalName><value>Both genders</value></totalName>
      <type ref="string"/>
      <table ref="genders_table"/>

    <concept id="unemployment_rate" extends="quantity:rate">
          <value>unemployment rate</value>
          <value>The percent of the labor force that is unemployed, not seasonally
      <topic ref="social_indicators"/>
      <type ref="float"/>
      <attribute id="is_percentage">
        <type ref="boolean"/>


    <slice id="countries_slice">
      <dimension concept="country"/>
      <dimension concept="time:year"/>
      <metric concept="population"/>
      <table ref="countries_slice_table"/>

    <slice id="states_slice">
      <dimension concept="state"/>
      <dimension concept="time:year"/>
      <metric concept="population"/>
      <metric concept="unemployment_rate"/>
      <table ref="states_slice_table"/>

    <slice id="countries_gender_slice">
      <dimension concept="country"/>
      <dimension concept="gender"/>
      <dimension concept="time:year"/>
      <metric concept="population"/>
      <table ref="countries_gender_slice_table"/>


    <table id="countries_table">
      <column id="country" type="string"/>
      <column id="name" type="string"/>
      <column id="latitude" type="float"/>
      <column id="longitude" type="float"/>
        <file format="csv" encoding="utf-8">countries.csv</file>

    <table id="countries_slice_table">
      <column id="country" type="string"/>
      <column id="year" type="date" format="yyyy"/>
      <column id="population" type="integer"/>
        <file format="csv" encoding="utf-8">country_slice.csv</file>

    <table id="states_table">
      <column id="state" type="string"/>
      <column id="name" type="string"/>
      <column id="country" type="string">
      <column id="latitude" type="float"/>
      <column id="longitude" type="float"/>
        <file format="csv" encoding="utf-8">states.csv</file>

    <table id="states_slice_table">
      <column id="state" type="string"/>
      <column id="year" type="date" format="yyyy"/>
      <column id="population" type="integer"/>
      <column id="unemployment_rate" type="float"/>
        <file format="csv" encoding="utf-8">state_slice.csv</file>

    <table id="genders_table">
      <column id="gender" type="string"/>
      <column id="name" type="string"/>
        <file format="csv" encoding="utf-8">genders.csv</file>

    <table id="countries_gender_slice_table">
      <column id="country" type="string"/>
      <column id="gender" type="string"/>
      <column id="year" type="date" format="yyyy"/>
      <column id="population" type="integer"/>
        <file format="csv" encoding="utf-8">gender_country_slice.csv</file>

1305763609000000 1339774517000000
#1145 enhancement timmcnamara ckan-backlog new Support the Handle System

The Handle System is an initiative to provide persistent references for resources. That is, it's basically a proxy system for preventing link rot.

Its documentation is here: http://www.handle.net/. Servers running CKAN could host a "Local Handle Service", which redirects a hash of a resource to an actual URL.

Some suggested use cases:

  • Researcher would like to cite where data came from
  • Agencies would like to have a way to prevent vendor lock-in from CKAN if they decide to move to another platform
1305764775000000 1339774502000000
#1152 enhancement amercader amercader ckan-backlog new True support for generic CSW servers

The CSW harvesters implemented at the moment were developed with the DGU project in mind, and they assume all remote CSW servers to implement the Gemini 2 specification. Gemini 2 is the profile defined in the UK for INSPIRE complying metadata, so obviously catalogs from other countries or non-INSPIRE complying ones won't be able to be harvested.

The changes needed to support generic CSW servers (i.e. those implementing the ISO 19139 profile) are:

  • Handling the validators (right now are hardcoded in the harvester

code). This probably involves issues discussed in the CREP 3 (ticket #1134)

  • Changes in the model to adapt the specification to ISO 19139
  • Renaming objects and classes which are now Gemini-centric

List of CSW servers tested:


1306141334000000 1313411822000000
#1163 enhancement rgrp rgrp ckan-backlog new Improvements to Storage Extension

Storage is now working but there are

  • Integrate with Resources (e.g. create a resource for each file upload and give option to associate with a package)
    • Should we introduce rule that files *not* associated with a Resource are periodically deleted?
  • Allow setting of a file name/path before upload
  • Allow for file overwriting/deleting etc (how should this work -- do we want to allow this sort of thing)
  • Integrate local file upload stuff in api/auth/*

Different Backend Issues

Local file store is rather different from 'remote' storage in various ways:

  • For remote you don't want to use many buckets as there are bucket limits while for local you want to. Should we there have a single path that users provide which we then partition differently for different backends.
1306408778000000 1310133808000000
#1165 enhancement nils.toedtmann ckan-future new Add multi-site support to ckan

Currently, each ckan site needs its own ckan wsgi process. That eats a lot of resources where many ckan sites are served from one machine (e.g. eu3).

That would dramatically change if a ckan process could behave like multiple ckans (e.g. like Apache's "<VirtualHost?>", or tracd). Depending on the "Host:" header in the HTTP1.1 request, it would choose which local ckan ini file to obey.

I see two ways to constitute the map hostname-to-ini-file map:

  • ckan reads a set of ini files, and each ini file declares which servers names it is responsible for
  • In a global ini file, there are directives mapping servernames to ini files.

In either case there should be a global ckan ini having the default settings for all local ckan sites. Each site ini could be very short then, just having e.g. title, name, database credentials, active plugins etc.

1306413667000000 1339774466000000
#1171 enhancement mark.wainwright dread ckan 2.0 assigned Citation instructions on dataset and resource view pages

Some sort of citation helper. Something small on the dataset and resource page that would show how to cite.

wwaites: Some related thoughts on this from opb: http://homepages.inf.ed.ac.uk/opb/papers/ssdbm2006.pdf

timclicks: I'm looking at Dataverse for the first time[0]. It seems very popular in the social sciences. I noticed that there is a recommended citation for each dataset. For example, [1] is has this one: "Targeted Input Programme (TIP) 2000-01", http://hdl.handle.net/1902.1/SSC-MWI-TIP2000-01-M1 V1 [Version]"


Add a small box at bottom of dataset / resource page (or in sidebar on dataset page) with title "Cite this" with contents like:

%title. %author. Retrieved %date. %site_title.

For resource: %title = %dataset_title. %resource_name.

Could also add export to ref managers (e.g. to bibtex) but that is for later.

1306920799000000 1347358705000000
#1179 enhancement timmcnamara ckan-backlog new Support tag aliases

A small number of tags are near-duplicates of each other.

Perhaps we could support word stemming from NLTK and/or manual tag aliases:

statistics statistik ... survey surveying surveys

1307429221000000 1339774332000000
#1182 defect timmcnamara ckan-backlog new Comments from deleted packages appear in "Recent Comments" feed

When a package has been deleted, say for spam moderation, comments still appear in the recent comments section.

This is a problem because non-admin users will be shown a warning that they're not authorised to view the package if they click on the link.

At CKAN.net currently, this affects the most recent comment.

1307658251000000 1339774319000000
#1184 enhancement timmcnamara ckan-backlog new Support Wuala as CKAN storage option

Most of CKANs storage options are tied to the USA. This brings concerns of data security for some organisations who may wish to adopt the system. Wuala is a distributed file system that stores data in a peer-to-peer manner. The company behind it, LaCie? sells storage for a fee. However, they also enable clients to have 'free' storage space when machines act as a storage node.

In order to be a storage node, a machine needs to be online for more than 14% of the time - roughly 4h per day. Most CKAN servers are likely to have a far greater uptime than this.

Supporting Wuala would go some way to enabling CKAN to be used in a secure manner. That is, CKAN could be promoted for organisational use where there is lots of data to be stored and large geographic distances to be managed. There is a Python client available and a fairly long Google Tech Talk that overviews the system.

1308034751000000 1339774306000000
#1185 defect timmcnamara ckan-backlog new Administrators can't delete packages from web UI

Administrators have "View", "Edit" and "History" tabs. However, I can't see a way to delete a package from the web UI.

Version: CKAN.net as of today

1308111818000000 1339774289000000
#1188 enhancement nickstenning ckan-backlog new Allow diffing against initial (blank) package version

Currently the history page only allows diffing between different versions of a package, but there doesn't appear to be any easy way to see the changes introduced by the first version of a package.

I'm requesting the ability to diff against a "blank slate" initial state of a project, so I can see the content of the first project commit.

Not sure if this is a vdm feature, so I'm putting this ticket in against ckan.

1308153160000000 1339774275000000
#1198 enhancement dread ckan-backlog new Publisher hierarchy

'Publisher' entities in the model. They are hierarchical.

'User-Publisher' connections with one or more roles (e.g. drafter, moderator).

Authorization settings can control who can set what values in a 'published by' type field.

Publishers and User-Publishers available to read in the API.

Future tickets will provide:

  • API to write Publishers and User-Publishers
  • UI to edit Publishers and User-Publishers

(This feature deprecates authorization groups)

1308820592000000 1339774200000000
#1201 enhancement kindly ckan-backlog new seperate out logic in atom feeds to logic layer.

Simplify the logic in the atom feed an make all feeds use logic layer to return lists.

1308928892000000 1310124297000000
#1203 defect johnglover rolf ckan-backlog new Moderated edits: html code shows as "changed" although it is not

I've installed the Moderated Edits extension (ckanext-moderatededits) and am editing a package imported from IATIregistry.org, with an extra field which contains a bit of HTML.

The editor indicates the field has changed, although the content hasn't (see screenshot). All I can find so far is a minor difference: in the field content, there is a code &#8212 and in the rendered table that is an &mdash;

1309274970000000 1313401579000000
#1227 enhancement timmcnamara ckan-backlog new Display packages' tags in search results

In when displaying search results, it would be useful to also display the tags of a package. Sometimes it's difficult to infer the scope of what the package does from the title and the first sentence of the description. Tags are quite concise way to display rich information.

ENV=datacatalos.org, with CKAN 1.4.2a

1311034262000000 1339774147000000
#1232 enhancement thejimmyg ckan-backlog new [super] Interface improvements

Child tickets:

  • #1194 "Welcome back" message for newly registered user
  • #1202 Links to datapkg utility don't lead to info about it
  • #925 Change the search box icon to remove the down arrow
  • #923 Search box doesn't work in leaderboard page in stats extension
  • #1034 Flash message cached
  • #737 Markdown syntax summary page
  • #811 Extra field editing form layout breaks when there are long field names
1311178296000000 1315948536000000
#1233 enhancement thejimmyg ckan-backlog new [super] Improve wiki-style functionality for history

At the moment we have a good revisioning system but a poor history interface. We need to improve this in a number of areas:

  • #191 Searching by modification date
  • #193 Searching by time-related field
  • #301 Package discussion pages
  • #1236 Package history page should provide links to pages at particular revisions, similar to the wikipedia pages
  • #1236 Viewing old revisions or unmoderated changes should have a message at the top of the package page
  • Other improvements as per my word doc.
1311179392000000 1315948668000000
#1235 enhancement thejimmyg ckan-future new [super] Search Improvements

Child tickets:

  • #234 UI Review - Autocomplete package names & tags in search
  • #193 Searching by time-related field
  • #191 Searching by modification date
  • #905 Unable to search with accented characters in package names
  • #906 Ability to search without accents for accented words
  • #924 Search box has no search button

Broadly speaking though we need to choose PostgreSQL, Solr or something else. We don't want to invest our time maintaining two search backends with a limited abstraction layer between the two.

1311182641000000 1311182641000000
#1240 enhancement kindly rgrp ckan-backlog assigned [super] API v4

(Just creating this ticket as somewhere to keep notes)

  • Decide on REST api versus action API
    • Do we want to support both?
  • Tidying
    • Unify on /api/v{version num}/... structure (do we want a default option that points to current default? e.g. /api/default/ ...)
    • extras merged into normal field list in package
    • Get rid of /rest/ so just have api/v1/package
    • Get rid of separation of search api from 'rest' api
      • Propose that GET on REST index is search e.g. /package/?q=...
        • This is also resolves issue whereby GET at root returns whole package set (a *bad* idea) as this would now become the matchall search query (with a default limit on items returned)
  • Resource read/write in API (separate from package)
    • Does this need authorization work?
  • user/account API - read/write
  • Remove autocomplete -- can just use search
    • Do not worry about backwards compat as should only be used in our js (if others using it too bad!)
1311525660000000 1325473312000000
#1244 enhancement dread assigned Notes field carriage-returns converted to CRLF

When you edit a package in the web form, if the notes field had \n as the End Of Line symbol, it gets lost when you preview or save the package, and the notes field is displayed all on one line.

This can be seen when editing annakarenina (as created by 'paster create-test-data'). The diff shows for example:

- Some test notes
+ Some test notes
?                +

but it would more clearly be shown as:

- Some test notes\n
+ Some test notes
?                ++

This is a significant problem with DGU, since a lot comes in via the API.

It's not clear what we should do about it. We could standardise on \n or \r\n when the form submission comes in. Do different browsers on different platforms do different things with EOLs?


Displaying the package: the Markdown processor respects both EOLs when displaying the field, putting each line in a <p> tag.

Creating the package edit form: placed into <textfield>.

Browser displaying package edit form: <textfield> displays \n and \r\n as EOL. But \n\n gets compressed to one EOL. But on submission, both are returned as \r\n.

Receiving the edited package: Somewhere along the line the EOL gets converted to \n\n.

1311689456000000 1340191253000000
#1255 enhancement kindly kindly ckan-backlog assigned Drupal consistancy checks.

Make a robust way to make sure the drupal database is consistent with the ckan data.

1312219968000000 1313400054000000
#1257 enhancement dread ckan-backlog new Anti-Spam tools

We are getting more and more spam on ckan.net and we need to improve our strategy of combating it. It is bad because google ranks who we link to (which we do want for legitimate links), and our front page contains the latest package edits, so spam is immediately very visible.


  • Package creation
  • Package edit
  • User creation
  • Group creation

Systems to consider:

  • Automatic
    • captcha
    • bayesian scoring
  • Sysadmin
    • a tool to quickly analyse and remove packages, edits and users
  • Helpful users
    • button to click indicating spam, leading to any/all of:
      • add TODO item (which emails package admin and sysadmins)
      • quarantine the package/user so it doesn't show up on the front page and a warning is shown when you view it

General thoughts:

  • We should learn from the wikipedia tools, as they are the experts
  • This is a serious problem that is only going to get worse and is a threat to the whole site.
1312283565000000 1339774129000000
#1259 enhancement johnglover pudo ckan-backlog new "Add a row" for Extras on Package form

The default package form offers 4 empty extras fields. Like the resource section, it should have an "add more" button to add another row.

1312302693000000 1312907056000000
#1260 enhancement pudo ckan-backlog new Remove duplicate functions from _util.html

There seems to be both a list view for dictized and non dictized data structures for all entities in _util.html at the moment. Probably in the back of someone's mind already, but cleanup here would be nice.

1312366652000000 1313401499000000
#1261 defect pudo ckan-backlog new Investigate dots in extras search

It seems that searching for extras_foo:value works with solr, but extras_foo.bar:value doesn't. No theory on why.

1312366768000000 1312366768000000
#1262 enhancement pudo ckan-backlog new Enforce "create-user" permission

This does not seem to have any implications at the moment, it should lock down registration and remove all related links.

1312375296000000 1323090112000000
#1273 requirement amercader ckan-backlog new Create docs for API v3 1313412083000000 1313412083000000
#1278 enhancement amercader ckan-backlog new Refactor authorized_query calls

There are some functions that still use the Auhtorizer().authorized_query method:

./ckan/controllers/authorization_group.py:24:        query = ckan.authz.Authorizer().authorized_query(c.user, model.AuthorizationGroup)
./ckan/lib/base.py:237:        groups = ckan.authz.Authorizer.authorized_query(c.user, model.Group, 
./ckan/lib/search/sql.py:55:        q = authz.Authorizer().authorized_query(username, model.Group)
./ckan/lib/search/sql.py:118:        q = authz.Authorizer().authorized_query(self.options.get('username'), model.Package)
./ckan/logic/action/get.py:154:    query = Authorizer().authorized_query(user, model.Group, model.Action.EDIT)

./ckan/tests/test_authz.py:158:        q = self.authorizer.authorized_query(self.notadmin.name, model.Package)
./ckan/tests/test_authz.py:353:        q = self.authorizer.authorized_query(self.notmember.name, model.Package)
./ckan/tests/test_authz.py:357:        q = self.authorizer.authorized_query(self.member.name, model.Package)
./ckan/tests/functional/test_authorization_group.py:44:        group_count = Authorizer.authorized_query(u'russianfan', model.AuthorizationGroup).count()
1313415177000000 1313415177000000
#1285 enhancement dread ckan-future assigned Errors cause emails

Currently a sysadmin gets an email when an exception is not caught. But there are occasions when we DO want to catch an exception so we can fail nicely for the user, but the sysadmin STILL gets an email to know to fix something. e.g. if there is an exception when search indexes a package. You want to catch the exception and still run any other notify calls.

1314116944000000 1338206151000000
#1286 enhancement rgrp ckan-backlog new Remove remaining formalchemy stuff

Stuff I've spotted:

  • forms/*
  • template/group/edit_form.html
  • template/package/edit_form.html

This can go once new DGU form is in.

1314116996000000 1342436420000000
#1287 enhancement thejimmyg dread ckan-backlog assigned NAVL validation errors - Junk fields should be listed explicitly

When you create a package, but specify a key that is not allowed (e.g. 'relationships') then you get error message:

{"__junk": ["The input field __junk was not expected."]}

It should mention the actual key which is not expected. e.g.

{"relationships": ["The input field 'relationships' was not expected."]}

Kindly said that James' version of NAVL was better in this respect, so this might be best solved by moving to that.

1314203102000000 1330990459000000
#1288 defect dread ckan-backlog new Package edit/creation can't include 'relationships' field

When you create or edit a package (via the API), you aren't able to specify the relationships it has. (If you do you get 409 {"__junk": ["The input field __junk was not expected."]} )

The normal way to create relationships is via /api/rest/relationships/ and this works. But when you GET a package, the dictionary lists all relationship details. So this bug creates a problem for editing a package that has relationships - you want to GET it, make any edits and then PUT it back. The work-around is to delete the 'relationships' key from the dict before you PUT it back.


Ideally, CKAN would read the 'relationships' key and make the necessary changes. This is a chunk of work.

Another good option is to allow an unchanged 'relationships' value, but barf it is edited. This is also a chunk of work.

A bad option would be to just ignore the 'relationships' value, since users will get frustrated changing this value and wonder why it never saves, not understanding it is different to all the rest, without error message.

A final option is to get rid of relationships altogether.

1314203695000000 1339774098000000
#1311 enhancement rgrp rgrp ckan-backlog new Modal user register and login form

Subticket of: #1294

Rather than having to visit a dedicated page it would be good if registration and login could be done from a modal form (separate or combined ...).


  • It could be used from dataset creation page in situations where user needs to be registered / logged in to create a dataset (so we could allow someone to start creating a dataset and only get them to login at the end ...)
  • It allows for quicker and easier logging in


  • See Friedrich's work on the datahub
1315297227000000 1315297227000000
#1314 enhancement dread ckan-backlog assigned ckanclient search - generator improvements

Apparently the search generator always makes two requests, even if you don't want to see the search results, which might be slow. Can this be optimised?

Maybe we should also provide a second search function that doesn't use the generator - the original simple search function (that leaves the user to deal with limit & offset).

1315395410000000 1340191233000000
#1317 defect dread ckan-backlog assigned password reset - improve user search

In password reset, it gets confused if you have two similar users. This is because with the string the user provides, it searches several fields, not just name but also fullname and email address, allowing you to search for these. But only name is unique.

Specific problem: Ira searches for "Irina" then it finds both: <User name=irina fullname=Irina [email protected] ] and <User name=shevski fullname=Ira email=> (I think)

Maybe need to choose which field it searches?

1315415539000000 1340191221000000
#1322 enhancement dread assigned Action API improvements

Focusing on improving Action API as the v3 API:

  • have an optional parameter of the data_dict called "options". Options would contain items that would get passed into the context. e.g. {"options" : {"ref_package_by": "id"}}.
  • instead of using API version to change the way packages are referenced, use the ref_package_by.
  • All package_show, group_show etc. to accept an object 'name' as an alternative to object 'id'.
  • Action API is v3 of api, replacement for v1 & v2. Default for most urls is still v1, but if url is /api/action then default to v3.

Next steps:

  • Add search API (package, resource,
  • Add Util API
  • Clarify JSONP still applies
  • Add doc strings, clarifying parameters
1315474749000000 1340191088000000
#1326 enhancement thejimmyg ckan-backlog new Write a set of auth plugin functions to integrate with Druapl

Ticket #787 described join auth between CKAN and Drupal. The authentication part is live and implemented. This ticket is a placeholder for work that will be needed in the new auth system to link authorization functions to Drupal. It is dependent on the groups refactor.

1315821084000000 1315821084000000
#1328 defect [email protected] assigned Unicode & paster commands

A possible bug in CKAN when I tried deleting users using "paster --plugin=ckan user delete" command.

To reproduce the bug do the following:

  1. Create a user with an ID (which in my case was a user's full name)

that contains non-unicode caracters like Norwegian "æ", "ø", or "å".

  1. Make sure that you can see something like the example below:

(pyenv) [email protected]:$ paster --plugin=ckan user Users: name=Rustæm

  1. Then try deleting the user with following command:

(pyenv) [email protected]:$ paster --plugin=ckan user delete "Rustæm"

You should now get a python encoding error. I know that this is quite rare case, but in our case it caused some trouble. Could you guys have a look at this bug?

CKAN ver. 1.3.3.

1315823110000000 1340191065000000
#1336 defect johnglover dread ckan-backlog new License fudge

cset:4b59ab34137d ckan/logic/action/get.py:

-            isopen = model.Package.get_license_register()[license_id].isopen()
-            result_dict['isopen'] = isopen
+            try:
+                isopen = model.Package.get_license_register()[license_id].isopen()
+                result_dict['isopen'] = isopen
+            except KeyError:
+                # TODO: create a log message this error?
+                result_dict['isopen'] = False 

This change hides problems with the license server and returns potentially incorrect values for openness.

This has been noted as 'temporary fix' but seems to be forgotten about, since it has been merged to default and gone into release 1.4.3.

I suggest the licenses are cached (I thought this was already the case when CKAN first requests them after start-up?). I suggest failure would return 503.

1315912057000000 1323173073000000
#1343 enhancement rgrp rgrp ckan-backlog new [super] User related improvements (login, user pages etc)
  • Disallow account creation via openid - #1386
  • Require email field - #1319
    • Require email confirmation to be activated (?)
  • Improvements to user page (e.g. show activity and more info about user) - #1396
  • Modal user login - #1311
1316017098000000 1318528138000000
#1352 enhancement amercader ckan-backlog new Use logic functions instead of as_dict when indexing entities

The current search implementation uses the output of the the as_dict method of the domain Package object to update the index


It also uses package_to_api1 in the SynchronousSearch? plugin:


This prevents extensions from being able to index custom properties (e.g. faceting by custom extras not included in the model).

The search should use the logic function to get the package properties:

1316615397000000 1339774086000000
#1355 defect amercader ckan-backlog new Package extras property does not include the newly created ones

The extras in the package object sent to the extensions after editing (https://bitbucket.org/okfn/ckan/src/01efd5649c10/ckan/logic/action/update.py#cl-226) do not include the newly added.

1317034126000000 1339774056000000
#1358 enhancement zephod rgrp ckan-backlog assigned Generate configuration documentation automatically from the deployment_ini_tmpl file

At the moment documentation of config options is duplicated between source (deployment_ini_tmpl in ckan/config which is used to generate user ini file) and the docs.

Suggest we write a script that automatedly generates reference documentation for the config from the source.

May be obsoleted by #277 (some config in db)

1317076350000000 1318257823000000
#1366 defect dread ckan-future assigned Search inside extra fields

SOLR search doesn't support searching for part of an extra field, but it does for other fields.

i.e. title="One Two Three" matches q=one AND q=title:one and geographic_coverage="England Scotland" matches q=England BUT NOT q=geographic_coverage:England

This problem emerged when we went to SOLR in #1275 (CKAN 1.5a). Tests were skipped.

This is could be a problem for DGU and maybe elsewhere.

1317290992000000 1338206707000000
#1382 defect thejimmyg thejimmyg ckan-backlog new Deleted resources are present for harvested package

Perhaps the importer deletes them before re-importing. We shouldn't have deleted resources, so let's investigate.

1318337889000000 1318337889000000
#1384 task rgrp shevski ckan-backlog new CKAN wiki needs updating to refer to thedatahub.org instead of ckan.net and datasets instead of packages

Most articles still refer and link to ckan.net, wiki.ckan.net and to packages

1318414077000000 1318414077000000
#1393 enhancement johnglover dread ckan-backlog assigned Don't skip search tests

Now we don't use postgres search, all the tests involving search now don't need to be skipped when running on sqlite. Should help coders spot earlier if these tests break.

1318505453000000 1320153590000000
#1403 enhancement zephod zephod ckan-backlog new Refactor groups index page

Groups are listed alphabetically with paging - not an ideal user experience. We would like to list groups in order of 'popularity': The number of datasets they contain.

Following this chain of thought, then, it would be nice to rearrange the groups table by clicking on column headers and having it sort by that column.

Furthermore, then, we'd like to implement a full-fledged groups search feature (if this is at all feasible).

The forthcoming groups refactor will probably have some bearing on this task.

1318847512000000 1318847566000000
#1406 enhancement zephod ckan-backlog new Re-enable RSS subscriptions

RSS 'subscribe' buttons appeared in many places on the site but were not very helpful. They took (confused) users pointed to the raw feed code, and Google Reader could not understand the feed. Safari, however, could interpret it correctly.

Their presentation needs to be clear and consistent; the RSS feed really needs testing in a variety of readers; and we need to decide exactly which items should get a feed. (Package updates? Groups?)

1318861327000000 1320930088000000
#1411 enhancement zephod rgrp ckan-backlog new Force resource format to be lower case (also mimetype)

Format should be lowercase. Automatically lower case (for extra points have a bit of javascript to force lower case when entering).

  • Even more points: do a update on thedatahub repo to make all format lower case (or script this as an update?)
1319319604000000 1319319604000000
#1414 enhancement shevski ckan-backlog new track user log-ins on thedatahub.org

Set up tracking for user logins so that we have stats about how many active users of thedatahub exist want to be able to see who logged in the the last x months

1319454782000000 1319454782000000
#1423 enhancement markbrough ckan-backlog new Edit resources suggestions
  • Description vs Name - Edit Resources view is showing the name of the package rather than the description, and a lot (all?) of the packages before the upgrade don't have names, so might be good to swap this round again, e.g.: http://thedatahub.org/dataset/edit/iati-registry
  • Moving resources - Moving them up or down the list used to be quite useful if you had a lot of resources that you might want to leave on the resources page, but only one or two that were actually current and that you wanted to draw attention to. This doesn't exist any more on CKAN but I think it would be good to add it back in.
1319641906000000 1338203678000000
#1424 enhancement dread ckan-backlog new Openness notice should be clearer

ckan-discuss discussion suggests changes to the 'openness' indicator ( http://lists.okfn.org/pipermail/ckan-discuss/2011-October/001786.html )

Dataset view page:

  • If there is an explicit but non-OKD compliant license, such as CC-BY-NC, then this should be stated explicitly, perhaps: “This dataset is Not Open. License: Creative Commons Attribution Noncommerical. This is not an open license as it does not meet the Open Knowledge Definition.”
  • If the license is marked as “Other::License Not Specified”, then this should be stated explicitly, perhaps: “This dataset is Not Open. It is published without an explicit license, the publisher reserves all rights to the dataset.”
  • 3. If the license field was left empty by the contributor of the Data Hub record, then again this should be stated explicitly, perhaps: “This dataset is Not Open. The license of this dataset is unknown or unspecified. Start an enquiry on IsItOpenData? »
  • There is a bug so that non-open licenses doesn't have an openness notice.
  • If downloadable resources are not available, this should not affect 'openness' - check this has been removed.
1319648089000000 1319648089000000
#1429 enhancement rgrp rgrp ckan-backlog new Provide DOIs for datasets in a CKAN instance

DOI = digital object identifier = http://www.doi.org/

As a Publisher I want a DOI for my dataset so that it can be cited by and linked to by others in a standard and easy way.


  • Probably implement as extension rather than core
1319977305000000 1319977305000000
#1432 enhancement rgrp ckan-backlog new [super] Data processing system for CKAN and Webstore

Super ticket: #1190

A data processing system which utilizes the Webstore. One could get a long way with simple javascript running in the browser for development with this javascript then run offline using something like nodejs. Alternatively one could allow one to specify a url to e.g. a python file which would then be run in a sandbox (with access to some specified set of python modules)

1320142747000000 1339774041000000
#1438 enhancement dread ckan-future new Action API - parameter discovery/checking

Many actions in the Action API require parameters. What params are needed should be listed and checked. Because currently, if you get them wrong you simply get a useless 500 error.

Currently they are listed in the docs, extracted from the code manually.

So you could GET /action/api/package_list to receive not only the help text, but a list of arguments.

And if you send an extra or missing argument then an intelligent error message can be returned.


How about some sort of decorator on the action function:

@logic_params(id, offset, limit)
def get_package_list(context, data_dict):

This would do the param checking, and is there a way to extract these params from the function? Or do a registration of the logic function?

I'd certainly like to keep the list of the list of params for the function with the function, for ease of reading the code.

Another good thing would be to pass in the params named as themselves, rather than having them contained in the data_dict.

1320161890000000 1338205050000000
#1439 enhancement dread ckan-backlog new Action API discoverablility

A good service API needs to be discoverable, so you are not always having to refer to the documentation html.

Maybe /api/action should return a list of actions available? (Currently this returns a 404.)

  • It would be nice to sort these into get/create/update/delete.
  • #1438 Parameters for each of the actions must be discoverable too

/api/action/{action_name} should also return the help text / parameters allowable. (Currently this returns 400 error)

1320161970000000 1325474974000000
#1457 defect jilly.mathews ckan-future new Bug with DataNL instance

"when logging into http://register.data.overheid.nl/ with OpenID, the /user/me page gives a 404. CKAN version 1.3.4."

n the manual it says an API key kan be created via http://test.ckan.net/user/me /. However, when I try the corresponding http://register.data.overheid.nl/user/me, I get a 404 error (not found).

1320925041000000 1320925041000000
#1458 enhancement rgrp rgrp ckan-future assigned Support previewing kml files in data viewer

Super ticket: #1151 (viewing geo data)

E.g. preview of http://thedatahub.org/dataset/louisvillecrime should bring up a map

1320936488000000 1340632932000000
#1459 enhancement rgrp rgrp ckan-backlog new Featured Dataset feature

Provide way to mark a dataset as featured. Featured database show up on the front page.

TODO: detail this more.

1321113012000000 1321113012000000
#1460 enhancement rgrp assigned Improve extensions documentation

Current extensions documentation needs some work: http://docs.ckan.org/en/latest/plugins.html

  • Queue extension section may now be out of date (?)
  • Think about how it integrates with https://github.com/okfn/ckanext-example (especially tutorial and example extension)
  • Document all plugin points (auto-extract from CKAN source??)
1321114523000000 1340191001000000
#1466 enhancement thejimmyg ckan-backlog new Need to support https login for multiple instances as part of the CKAN package install 1321375978000000 1328529062000000
#1489 enhancement dread ckan-backlog assigned Updating example theme/extension

ckanext-example needs updating for CKAN 1.5:

  • theme changes
  • new forms

About: 'ckanext-exampletheme' was created in Spring 2011 as an example CKAN extension that showed how to customise the look & operation of CKAN. This moved to github and renamed 'ckanext-example'.

1322137920000000 1324292384000000
#1507 enhancement zephod rgrp ckan-backlog assigned Minor fixes to dataset add on Group edit form - 0.5d

Group edit dataset add form needs some work

  • Dataset name is not cleared when you add
  • No way to remove item from list of datasets to be added if I make a mistake
  • (2nd Apr 2012) It now seems that option to add multiple datasets at once has disappeared (perhaps during the CSS/HTML refactor ...)
1323088429000000 1338205220000000
#1509 defect dread assigned Mis-dated old revisions of datasets

e.g. http://thedatahub.org/dataset/osm%402011-07-12T12%3A16%3A47.590358 gives:

This is the current revision of this dataset, as edited at 2011-07-12 12:16.

but it should say the date 2011-07-12 (which is the data being displayed).

The problem is looking at PackageRevision?, when it might be a tag or extra.

1323090398000000 1340190814000000
#1534 enhancement rgrp ckan-backlog new Change revisions to record userid rather than username

The use of username is problematic because username's can change.

  • Change all revision creation code to use user id (simplest is to change c.author field in lib/base.py (?))
    • (?) Add a field ipaddr for ip address of anonymous users? (or just keep putting this in author field on Revision and then acception that those won't match when we do a look up against user table)
  • Change user view page to look up against user id rather than name
  • Perform migration on existing Revision objects
    • Match should probably be against both openid and username when searching Revisions' author field (especially true on CKAN where some people have already changed their username from being their openid)
1323278790000000 1338205050000000
1 2 3 4 5 6 7 8 9 10 11
Note: See TracReports for help on using and creating reports.