{22} Trac tickets (2647 matches)

Results (2401 - 2500 of 2647)

Id Type Owner Reporter Milestone Status Resolution Summary Description Posixtime Modifiedtime
#2508 enhancement seanh ckan-backlog new Make it possible to run CKAN tests for each language

Mistakes in translated strings can cause CKAN to crash or otherwise not work, but it's not practical to manually test every page and function of CKAN in every language that we have new translations for before a CKAN release. It'd be great if the tests could automatically be run for each language.

This is probably a big job, we would have to get the tests to respect a language setting in the ini file, check for any individual test cases that specify the language (e.g. in the URL), and also fix test cases that look for specific English words in HTML output, etc.

In the meantime, a good stop-gap solution might be a script that tests for common mistakes in the po files.
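
One such check could look roughly like this: a minimal sketch that compares the format placeholders in a msgid with those in its msgstr, since a renamed or dropped placeholder is a typical cause of crashes at render time. The names PLACEHOLDER_RE and placeholder_mismatch are made up for illustration, and the (msgid, msgstr) pairs are assumed to come from whatever po parser is used.

```python
import re

# Matches Python-style placeholders such as %(name)s, %s, %d and {name}.
PLACEHOLDER_RE = re.compile(r"%\([^)]+\)[a-zA-Z]|%[sdr]|\{[^{}]*\}")


def placeholder_mismatch(msgid, msgstr):
    """Return True if a non-empty translation uses different placeholders."""
    if not msgstr:
        # Untranslated entries fall back to the msgid, so they are harmless.
        return False
    expected = sorted(PLACEHOLDER_RE.findall(msgid))
    actual = sorted(PLACEHOLDER_RE.findall(msgstr))
    return expected != actual
```

Running this over every entry of every po file and printing the offending entries would catch one class of mistake without having to run the full test suite per language.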

1339411335000000 1339770771000000
#2513 enhancement ross ckan-backlog assigned Dataproxy should not default to utf8

Unless explicitly told by the source web server, the dataproxy should not assume that the content it has is encoded as UTF-8. Even though the code points 128 - 255 are shared by Latin-1 and Unicode, an attempt to decode a byte array as UTF-8 will generally fail when it contains a Latin-1 character whose bit pattern has the MSB set.

This will mean that the UTF8Recoder can be more rigid in its acceptance of data, Postel aside.
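
A rough sketch of the idea, not the dataproxy's actual code: honour a declared encoding if there is one, otherwise try a strict UTF-8 decode and fall back to Latin-1. The function name decode_body and its return shape are assumptions made for the example.

```python
def decode_body(data, declared_encoding=None):
    """Decode bytes with the declared encoding, else try UTF-8 then Latin-1."""
    if declared_encoding:
        return data.decode(declared_encoding), declared_encoding
    try:
        # Strict decoding: invalid byte sequences raise instead of being guessed.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this cannot fail.
        return data.decode("latin-1"), "latin-1"
```

Returning the encoding that was actually used lets the caller report it, rather than silently claiming UTF-8.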

1339575820000000 1346669646000000
#2516 enhancement seanh seanh ckan-v1.9 accepted Make 'Assign to:' field on trac.ckan.org into a dropdown list

there's a setting for this

1339578442000000 1341234822000000
#2520 defect seanh seanh ckan-v1.9 assigned Document undocumented config options

There are 21 undocumented config options in CKAN, some of which are not mentioned in the config file template either:

ckan.admin.name ckan.admin.email ckan.default.group_type ckan.page_cache_enabled ckan.cache_enabled ckan.cache_expires ckan.extra_resource_fields ckan.extra_resource_group_fields ckan.storage.key_prefix ckan.storage.max_content_length ckan.feeds.authority_name * ckan.feeds.date * ckan.feeds.author_name * ckan.feeds.author_link * ckan.mail_from ckan.gravatar_default * ckan.plugins ckan.api_url ckan.auth.profile ckan.datastore.enabled ckan.tracking_enabled

There are also some options that are in the default deployment.ini even though they're deprecated:

ckan.async_notifier carrot_messaging_library ckan.build_search_index_synchronously

See email to ckan-dev from David Read: http://lists.okfn.org/pipermail/ckan-dev/2012-June/002447.html

It'd be best if the docs could be automatically pulled from the source into sphinx using autodoc, see #1358

1339588368000000 1340624908000000
#2524 enhancement kindly kindly ckan-ecportal new If there are no translation files for selected language fall back to default lang.

If a user selects a language for which there are no mo files, an error is raised. Revert to the default language instead.
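
For illustration only, this is how the fallback looks with the standard library's gettext module (CKAN's own i18n setup may differ). The function name, the "ckan" domain and the "en" default are assumptions.

```python
import gettext


def load_translations(lang, localedir, default_lang="en", domain="ckan"):
    """Return translations for lang, falling back to default_lang, then to a no-op."""
    return gettext.translation(
        domain,
        localedir=localedir,
        languages=[lang, default_lang],  # tried in order
        fallback=True,  # return NullTranslations instead of raising
    )
```

With fallback=True a missing catalogue yields untranslated strings rather than an exception.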

1339609048000000 1340117608000000
#2529 enhancement rgrp ckan-v1.9 new DataHub (or CKAN) widgets

Simple widgets in pure JS. For example:

  • Count of datasets in a group (could generalise to a query but not sure how useful that is ...)
  • Embeddable list of top X (5) datasets for a given query
  • Embeddable list of *my* datasets

Either these live at: {site}/widgets and we have some kind of generator (form where I choose my group, or my query).

Or: we have this attached to areas of site where relevant.

Can combine the 2 so that the latter links to the former. Think first will be easier to do and possibly more useful long-term (e.g. can just link people to that page).

Cf. http://okfnlabs.org/ckanjs/

1339750049000000 1340624917000000
#2530 enhancement kindly rgrp ckan-v1.9 new DataHub purge fails on some revisions

See http://datahub.io/ckan-admin/trash and try to purge revisions (*not* datasets). It will fail on some of the revisions with errors like:

Problem purging revision 391db9e8-df57-4e0e-8fe6-d4e0c2318344: (IntegrityError) update or delete on table "revision" violates foreign key constraint "group_extra_revision_revision_id_fkey" on table "group_extra_revision" DETAIL: Key (id)=(391db9e8-df57-4e0e-8fe6-d4e0c2318344) is still referenced from table "group_extra_revision". 'DELETE FROM revision WHERE revision.id = %(id)s' {'id': u'391db9e8-df57-4e0e-8fe6-d4e0c2318344'}

1339750498000000 1341268280000000
#2531 enhancement rgrp ckan-backlog new New state option: archived / deprecated

Deleted means things will get purged at some point.

Archived means they stay around but get hidden from search results and a big warning notice gets displayed saying this is archived / deprecated.

@richard cyganiak ...

1339750787000000 1339770649000000
#2535 enhancement rgrp assigned SSL certificate for DataHub + https by default

DataHub is increasingly used and we should ensure it uses SSL as part of general security.

See also #1446 (Need to support https login for multiple instances as part of the CKAN package install)

1339758027000000 1346662082000000
#2537 enhancement seanh seanh ckanbuild accepted Test and document ckanbuild

https://github.com/okfn/ckanbuild

Verify that what's there so far still works, write a README explaining how it works

1339775328000000 1340639830000000
#2538 enhancement seanh seanh ckanbuild accepted Add multiple-instance support to ckanbuild

Probably use ansible to do this. To create an instance, create a dir at /etc/ckan/MYSITE, and put MYSITE.wsgi, MYSITE.ini and who.ini files in it. Also put a MYSITE file in /etc/apache2/sites-available. See the example files already present in ckanbuild. Booting a new site should be a single command.

May not handle the postgres/solr/elastic-search side of things yet, could just require the user to set these up herself first and then pass them as args to the create-instance command.

1339775499000000 1340639836000000
#2539 enhancement seanh seanh ckanbuild accepted Investigate the existing ckan debian package for ckanbuild

Do we want to build on top of the existing debian packaging code? Or throw it away and start fresh?

1339775661000000 1340639845000000
#2540 enhancement seanh seanh ckanbuild accepted Implement a way of upgrading ckan sites using ckanbuild

When there are multiple ckan sites installed on a single server via ckanbuild, there needs to be some way of upgrading them all to a new ckan version at once.

1339775740000000 1340639850000000
#2541 enhancement seanh seanh ckanbuild accepted Add non-core extensions to ckanbuild

We want some extensions from outside of CKAN core to be included in ckanbuild. These would be pip installed into the virtualenv before packaging the debian package. Decide which extensions to include.

1339775826000000 1340639856000000
#2542 enhancement seanh seanh ckanbuild accepted Create jenkins job to run ckanbuild, and run tests

It should run the script to create the debian package, boot a VM, install the debian package on the VM, boot a CKAN instance, then run the tests.

1339775888000000 1340639863000000
#2543 enhancement icmurray ckan-v1.9 new facet.sort is not available in the package_search action

Not all solr facet parameters are available through the package_search action. In particular, facet.sort has been asked for, but this ticket should also check whether there are other parameters that would be easy to add.

See: http://wiki.apache.org/solr/SimpleFacetParameters#facet.sort

1340013335000000 1340633091000000
#2546 requirement ross ckan-backlog assigned ODS Managing homepage content

Requirements

Require the ability for users to control some level of content that is visible on the home page of their ODS installation. This may be through RSS/Atom feeds (see #2234) or another mechanism but should result in admins being able to change blocks of text on their homepage.

This should not be configuration, but accessible through WUI.

Interface

None

User Stories

  • As a system administrator I want to have control over content displayed on the front page beyond featured/popular items.

  • As a system administrator I don't want to manage content through having to write an extension.

Tasks

[ ] Analysis

Estimates

1340016842000000 1346663437000000
#2547 enhancement shevski ross opendatasuite 2 assigned ODS Initial data sets

Requirements

The ODS demo site will need data adding, initially as fixtures but it would also be useful if we started evaluating datasets that we can ship with ODS installations (at least in the UK) from places such as DGU and ONS.

May wish to create a ticket for making sure the datasets within the system are reset every X hours. Perhaps.

Interface

None

User Stories

  • As a new system administrator for an ODS instance, I don't want to have a site devoid of any data. Geographically relevant datasets would be welcomed.
  • As a bizdev person I would like to be able to demonstrate how ODS works with real datasets.

Tasks

[ ] Identify relevant sources for datasets

[ ] Pick datasets

[ ] Set them up for import

Estimates

1340016906000000 1340705614000000
#2548 enhancement kindly ross datahub-july assigned Object ownership for groups/package

Requirements

We need to be able to easily determine who the owner of a dataset or group is. Datasets and Groups should have an Owner, who may change over time but is a specific user within the CKAN instance. It should be easy for CKAN components to determine the user and for the initial version we should ignore the can of worms labelled 'ownership transfer'.

At this point migration is likely to be the biggest issue, and would suggest that it is acceptable that the last user to edit a dataset be set as the current owner.

More tickets should arise as a result of this work where we may be able to optimise some queries to use the new feature.

Interface

None

User Stories

None

Tasks

[ ] Analysis/Clarification

[ ] Tests

[ ] Migration

[ ] Code/Schema changes

[ ] Documentation

Estimates

1340017331000000 1340706539000000
#2550 enhancement icmurray ross datahub-july assigned User types

Requirements

In the data hub plugin we require the ability to differentiate users between those that have paid for a service, and those that haven't. The distinction isn't boolean as there may be levels of service for paid users, so it may be that we need a 'type' of user where there are various grades of 'paid' which are likely to be strings (specific to the data hub).

Required interface

Once changes have been made to the user schema, for a given user we want to be able to:

  • determine if they have a paid or a free account, and
  • get a string name of the type of paid account.

Care should be taken to ensure that the 'paid' status of the user cannot be set through the API and only by the datahub plugin.

User Stories

User stories related to the management, setting and changing of a user's payment level, as well as historical information on payments should be done as part of the work that includes actually allowing purchases. For now it is adequate that we can manually control these things through paster commands.

Payment types should be linear, as I don't believe a pick-and-mix modular model would work well for this type of service. Organizations will inherit the payment level of their owner, so currently there is no requirement for it to affect organizations at all.

  • As a sysadmin I would like to be able to use a paster command to manually set a user's payment level, or remove it entirely.
  • As a sysadmin I would like to be able to run a paster command to view a list of users who have a payment plan, grouped by the plan that they have.
  • As a sysadmin I would like to be able to use the API to change the payment status of a specific user through user_create and user_update. This shouldn't be available to anybody else.
  • As a user, and only if I have one, I'd like to see my current payment level on my user profile page.

Tasks

[x] Tests

[x] Plugin based migration

[x] Code

[x] Model

[x] API

[x] Documentation

Estimates

1340017590000000 1346669497000000
#2552 enhancement ross ckan-future assigned Controlling access to features

Requirements

To provide a freemium service it is necessary to be able to provide differing levels of functionality based on the type of user (see #2550). These levels can be specific to the data hub but may require overriding functionality from core to provide these checks.

Initial implementation should focus on limiting access to datastore disk space.

Interface

These changes are currently only for the data hub and should be kept as much as possible within the data hub extension.

User Stories

  • As a system component I want to find out if the current user has access to a feature (i.e. storage) and if so to what extent (xMb, xGb or unlimited).
  • As a system administrator I don't expect to need to manage the levels of users or the features that this applies to.

Tasks

[ ] Clarification of requirements/analysis

[ ] Tests

[ ] Code

[ ] Model

[ ] API

[ ] UI

[ ] Documentation

Estimates

1340018770000000 1346669544000000
#2554 enhancement ross ckan-backlog assigned Research Virtuoso cartridges

Look into writing a cartridge for importing CKAN data into a Virtuoso quadstore

http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger#How Does It Work?

1340026645000000 1346670433000000
#2555 enhancement toby aron.carroll demo phase 5 new Demo site needs a breadcrumb helper

Something to make building breadcrumbs a bit nicer

1340026983000000 1342618384000000
#2572 enhancement toby toby ckan-v1.9 new clean up stats plugin

attempt to disengage the stats plugin from core as much as possible

1340105054000000 1340899524000000
#2573 enhancement icmurray ckan-future new package_search does not allow solr's per-field facet parameters

Solr allows its facet parameters to be specified on a per-field basis, e.g. facet.limit applies to all facet fields, but solr allows it to be overridden for a specific field, e.g. f.tags.facet.limit.

We don't support this at the moment because we have a whitelist of valid solr query parameters that we accept. See ckan.lib.search.query.VALID_SOLR_PARAMETERS.
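
One possible approach, sketched under assumptions: keep the whitelist, but additionally accept names that match Solr's per-field form f.<field>.facet.<option> for a fixed set of options. The option set and function name below are hypothetical and are not taken from ckan.lib.search.query.

```python
import re

# Hypothetical whitelist of facet options that may be set per field.
PER_FIELD_FACET_OPTIONS = {"limit", "sort", "mincount", "prefix", "offset"}

PER_FIELD_RE = re.compile(r"^f\.([A-Za-z0-9_]+)\.facet\.([a-z]+)$")


def is_valid_per_field_param(name):
    """Accept names like 'f.tags.facet.limit' for whitelisted options only."""
    match = PER_FIELD_RE.match(name)
    return bool(match) and match.group(2) in PER_FIELD_FACET_OPTIONS
```

A parameter would then be accepted if it is in the existing whitelist or passes this check.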

1340112439000000 1340633101000000
#2579 enhancement toby toby ckan-v1.9 new move sort_by functions into lib.helpers

make these more available but keep existing functionality so not to break any users

remove_field()

drill_down_url()

etc

1340281798000000 1340899337000000
#2582 enhancement rgrp ckan-v1.9 new Do not hide notes / readme on dataset pages

Currently we hide most of the readme and then let users reveal it. Stop doing this and, if necessary, add a quick link down to the resources section. (Maybe also rename resources to Data and Resources ...?)

Aside: believe I have mentioned this somewhere a month + ago but could not find the ticket.

1340312340000000 1340625009000000
#2583 enhancement toby toby demo phase 5 new make sure that we implement authentication where needed

In development many auth checks may have been lost. We need to check they are still working, etc.

1340359478000000 1342086274000000
#2585 enhancement seanh ckan-v1.9 new Escape solr control characters in search queries, add advanced search screen

Suggestion from David Read:

We noticed that some search queries produce unexpected search results in CKAN, due to them containing special characters. For example if you were to search for "Spend over £25,000 - NHS Leeds" then it would not come up with the dataset with that exact name. It was excluding datasets with the word "NHS" due to the dash/minus sign. It works fine if you escape the minus sign: "Spend over £25,000 \- NHS Leeds".

So in data.gov.uk I've added escaping of such control characters in our plugin and this useful routine:

http://fragmentsofcode.wordpress.com/2010/03/10/escape-special-characters-for-solrlucene-query/

Perhaps you would consider providing this in CKAN core in future?

I think there is an occasional case when power users would want to use the special characters - brackets, +, -, boolean operators etc. but maybe these could be reserved for an 'advanced search' screen?
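
A minimal sketch of such escaping, assuming the usual set of Lucene/Solr query syntax characters. This is not the data.gov.uk routine; the names are made up for the example.

```python
import re

# Characters with special meaning in the Lucene/Solr query syntax.
SOLR_SPECIAL_RE = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def escape_solr_query(text):
    """Backslash-escape Lucene/Solr control characters in user input."""
    return SOLR_SPECIAL_RE.sub(r"\\\1", text)
```

This escapes single & and | characters as well as the && and || operators, which is harmless for plain-text searches. An 'advanced search' screen would simply skip the call.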

1340360773000000 1340625078000000
#2590 enhancement shevski ross ckan-backlog assigned Publisher dashboard

Need proper user stories but ...

Publisher admins/editors may need a more useful group read page showing things like:

  • The current search
  • Recent activity
  • People within the group
  • Followers
  • Others?
1340618617000000 1346663416000000
#2603 refactor icmurray icmurray ckan-v1.9 new Remove deprecated 'fields' parameter from resource_search

The fields parameter of resource_search was deprecated when fixing #2438. It can be removed in release 1.9, and the action tidied up as a result.

1340730601000000 1340730601000000
#2607 defect seanh ckan-backlog assigned 'Upload a file' appears on resource form when storage not enabled

if the user tries to upload a file they will get "Failed to get credentials for storage upload. Upload cannot proceed"

Maybe add a test for it this time, this bug has appeared and reappeared before

1340803808000000 1346663383000000
#2619 enhancement seanh seanh ckan-v1.9 assigned Omit private datasets from public activity streams

Activities about private datasets should not appear in public activity streams.

I don't think you want to actually purge the activities from the db, because you might still want them to appear in private activity streams.

I do think that when a dataset goes private all its past activity should go private, because I imagine that users are going to want to hide everything about the dataset and not have any past activities 'leaking out'

I don't think you want to consider whether the dataset was private when the activity happened, rather if a dataset is private now then all its past activities are private (and the simplest thing would be to say that if a dataset is public now then all its past activities become public as well, but is that a privacy concern?)

The easiest way to implement this is going to be by modifying the *_activity_list() action functions in get.py, after they pull their activity lists out of the db they should pass them through a function that filters out stuff about private datasets.

An activity about a private dataset is one whose object_type is 'dataset' and whose object_id matches the id of a private dataset. You should also check the object_type and object_id of all of the activity object's activity detail objects, if any of those match a private dataset then mark the whole activity as private.

Currently all activity streams are public so should have all private datasets filtered out from them, except for the dashboard activity stream which is private to the individual user. In this case private datasets that the user has permission to see should not be filtered.
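
The filter described above could be sketched like this. Activities are modelled as plain dicts and the "details" key is an assumption; the real activity and activity detail objects will have their own shape.

```python
def is_about_private_dataset(activity, private_dataset_ids):
    """True if the activity or any of its details refers to a private dataset."""
    objects = [activity] + activity.get("details", [])
    return any(
        obj.get("object_type") == "dataset"
        and obj.get("object_id") in private_dataset_ids
        for obj in objects
    )


def filter_public_activities(activities, private_dataset_ids):
    """Drop activities that mention any private dataset."""
    return [a for a in activities
            if not is_about_private_dataset(a, private_dataset_ids)]
```

For the dashboard stream, the caller would pass only the ids of private datasets that the user is not allowed to see.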

1340884140000000 1351531137000000
#2621 refactor icmurray icmurray ckan-v1.9 new Remove the deprecated 'fields' parameter from tag_search and tag_autocomplete

This was deprecated in 1.8 as it wasn't accessible via GET requests due to being a dict. See #2439

In a future release of CKAN (probably 1.9) it can be removed.

Internal uses of it were removed in #2439, but there are tests that still use it.

1340900569000000 1340900569000000
#2622 defect seanh new Login fails in Opera 12

Try to login to CKAN using Opera 12, get "Login failed. Bad username or password. (Or if using OpenID, it hasn't been associated with a user account.)"

1340902602000000 1340902602000000
#2625 enhancement seanh ckan-v1.9 new Add i18n strings from non-core but supported extensions to ckan.pot file

Have to decide which non-core extensions are going to be supported first.

1341236903000000 1341236903000000
#2635 enhancement dread new Non-destructive SOLR reindex

You can't run the search-index reindex on a live server because it will give us bad results for 2 to 3 hours while it runs. Can there be an option that doesn't delete the entire index at the start?

Instead it could just delete any items that don't exist any more, then delete and regenerate the remaining items one by one, so that the total number of datasets in the index doesn't change much.
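
Roughly, as a sketch: index_one and delete_one are hypothetical callbacks standing in for the real search index calls, and it is assumed that indexing a dataset whose id is already in the index replaces the existing entry.

```python
def reindex_in_place(db_ids, indexed_ids, index_one, delete_one):
    """Refresh the index entry by entry instead of clearing it first."""
    # Remove index entries whose datasets no longer exist.
    for stale_id in sorted(set(indexed_ids) - set(db_ids)):
        delete_one(stale_id)
    # Re-add each dataset; the index stays populated throughout.
    for dataset_id in db_ids:
        index_one(dataset_id)
```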

1341829394000000 1341829394000000
#2641 enhancement johnmartin amercader demo phase 5 assigned Adapt spatial widgets to new theme

Dataset extent map and spatial filter need to be adapted to the new theme, as they are not showing up now

1341846147000000 1352658854000000
#2644 enhancement shevski toby demo phase 5 assigned user dashboard for demo theme

we now have a user dashboard that needs theming not sure if we need sam to look at it

http://localhost:5000/user/dashboard

1341910821000000 1344255836000000
#2654 enhancement ross ckan 2.0 assigned UI support for ordering groups on group_read page

The group_index page has no support in WUI for ordering the groups displayed. Should allow sorting by name

Add support for this for datahub now, and discuss for new 1.9 UI

1341943891000000 1346662156000000
#2656 defect seanh dread new Feed with few results has bad paging link, causing exception

This page http://thedatahub.org/feeds/custom.atom?q=wombat has 0 results and contains a link to http://thedatahub.org/feeds/custom.atom?q=wombat&page=0 where the page=0 causes this exception:

<class 'ckan.lib.search.common.SearchError'>: SOLR returned an error running query
Error: "'start' parameter cannot be negative"
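
A guess at the cause, not confirmed against the feeds controller: if the solr 'start' offset is computed as (page - 1) * limit, then page=0 gives a negative value. Clamping the page number would avoid the exception even if a bad link is followed. The function name is made up.

```python
def solr_start(page, items_per_page):
    """Convert a 1-based page number into a non-negative solr 'start' offset."""
    page = max(int(page), 1)  # treat page=0 or negative pages as page 1
    return (page - 1) * items_per_page
```

The paging links themselves should still be fixed so that page=0 is never generated.
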
1342001112000000 1342001112000000
#2663 enhancement toby toby ckan-v1.9 new h.resource_display_name needs love

This function is shit and needs cleaning up and a doc string

description is markdown and should be treated properly

either we should truncate all or leave it to the templates but work universally

url if no name / desc this is in demo-theme branch

1342017746000000 1342017746000000
#2673 enhancement rgrp new simplify set of options for resources

Far too many resource options. Let's restrict back to data file and API. Visualizations etc. can either get linked in the description or in the Related items.

1342300559000000 1342300559000000
#2674 defect kindly shevski demo phase 5 assigned Data preview not loading on s031

Not loading for all resources as far as I can tell; e.g. http://s031.okserver.org:2375/dataset/afghanistan-election-data/resource/f6331f99-51f6-44d9-95b9-b20f3b74f360

Fine on demo.ckan.org

1342435102000000 1344349324000000
#2679 enhancement icmurray icmurray ckan-v1.9 new Change default behaviour of TemplateController.view to 404.

The current behaviour of TemplateController.view() (which is the fallback controller should all others fail) is to attempt to render (as a genshi template) the requested file.

Although this may be a feature that some instances want, in general it leads to:

  • 500s when attempting to access a normal template (e.g. http://datahub.io/importer/preview)
  • A way of inadvertently serving things you may not want to serve. (Small risk, as it needs to be renderable as a genshi template).

Solution:

  • Change the controller to 404
  • Ensure there's a way for existing ckan instances to override that behaviour should they need it.
1342436133000000 1342436133000000
#2683 enhancement seanh new Add no-cache header to _tracking API call's response to make sure it doesn't get cached

1342446577000000 1342446577000000
#2686 defect shevski ckan-backlog assigned enabling datastore & data API breaks recline

First, I noticed that the gold prices dataset preview was not displaying, and it has the data API enabled. Secondly, I tried enabling datastore for http://datahub.io/dataset/adur_district_spending/resource/281dffa6-ea9b-4446-be41-05dced06591f and after I saved, the preview no longer worked. Unticking the datastore & data API checkbox brought it back.

Is this a known issue?

1342516011000000 1346663300000000
#2688 enhancement ross new Allow ordering of groups in WUI

Currently the group_index page just shows the entire list of groups, forcing the ordering to be by name. It would be better if it could be sortable by name (or reversed) or by package_count (or reversed)

1342520875000000 1342520875000000
#2697 enhancement johnmartin shevski demo phase 5 assigned create dataset validation

Includes: missing fields, existing field checks (i.e. whether a name/dataset already exists with that name) during input (i.e. no need to submit form to check)

1342620035000000 1346235925000000
#2698 enhancement toby shevski demo phase 4 assigned markdown preview

for description / other fields with markdown support

1342620085000000 1344543252000000
#2699 enhancement shevski shevski demo phase 5 assigned workflow for associating datasets with groups

needs review & speccing out e.g. datasets created by a user who belongs to a certain publisher (group) get auto added to this group

1342620176000000 1344507133000000
#2702 enhancement shevski shevski demo phase 5 assigned Future Javascript wishlist for demo

  • tooltip on popular datasets with number of views
  • facets to update automatically
  • creating a dataset without reloading page between steps
  • hover on licences information
  • autocomplete on search terms
  • group filtering
  • social share buttons in lightboxes
  • dataset counts on homepage

1342620475000000 1344255984000000
#2708 enhancement kindly toby ckan-v1.9 new limit extra data for package/group show

context['package_limits'] = {
    'tags': 5,    <- get first 5
    'extras': 0,  <- get all
}

only get what you ask for, have to be explicit

context['group_limits'] = {}  <- only main item

start with datasets/groups expand if we like it

1342622420000000 1342622420000000
#2709 enhancement icmurray markw new Atom feeds are undocumented

There doesn't seem to be any documentation yet for Atom feeds.

1342624310000000 1342626212000000
#2718 enhancement toby shevski demo phase 4 new can't add dataset to more than one group

trying to add a dataset to another group means it's no longer part of the first group

http://demo.ckan.org/dataset/edit/afterfibre

1342780550000000 1344544203000000
#2719 defect dread new Feeds controller does not catch NotAuthorized exception

Results in bad user experience and WebApp errors emailed out. Seen in 1.7.1

1342872863000000 1342872863000000
#2721 defect toby shevski demo phase 4 new deleted groups should not show on 'Add to Groups' dropdown

Groups previously deleted still show up in the add dataset process in step 3 'Additional info'

http://s031.okserver.org:2375/dataset/new_metadata/ff

1342948632000000 1344544214000000
#2725 enhancement toby shevski demo phase 5 new Case sensitivity on tags

My feeling is that 'country-US' and 'country-us' should be the same tag. However currently tags with caps are treated differently

see http://s031.okserver.org:2375/en/dataset/test-dataset

with TEST and test - these also get indexed twice in the search page

1342949667000000 1343030773000000
#2726 enhancement toby shevski demo phase 5 new confusing logic on data preview formats

  1. If a user enters the wrong format on a file that can be previewed, it simply won't be previewed (e.g. a CSV or XML file that is filled in with JSON as the format will just not work; need to check this).
  2. If I incorrectly edit the format to one that data preview will try to preview, it will (sometimes) work even for a format that it doesn't accept, e.g. this PDF file, where I changed the metadata to HTML: http://s031.okserver.org:2375/en/dataset/test-dataset/resource/9d27a9d9-36ec-460e-9edb-6dff7ba4fc28

1342949927000000 1343030906000000
#2728 defect toby shevski demo phase 4 new deleted group shows on search index - for admins

'test-group', which has been deleted, shows up on the main search page under groups - and can be filtered by - see http://s031.okserver.org:2375/dataset?groups=test-group

1342950784000000 1345023944000000
#2729 enhancement kindly shevski ckan-backlog new searching for tags:[tag] works but tag:[tag] doesn't

This is confusing, since you can only search for one tag like this at a time. For example, tags:economics,csv or tags:economics, csv or tags:economics+CSV doesn't work; therefore tag:economics should also work!

http://s031.okserver.org:2375/dataset?q=tags%3Aeconomics&sort=relevance+asc

1342951109000000 1342951176000000
#2731 enhancement markw new Some sites permanently 'down for maintenance'

A large number of XXX.ckan.net sites give the following message:

"This Site is Down for Maintenance We apologize for the inconvenience. ~ The Open Knowledge Foundation sysadmins."

The message is unhelpful and patently false - the sites do not exist. Some of them were supposed to have been redirected to a relevant group at thedatahub.org in this ticket (now closed):

http://trac.okfn.org/ticket/933

However, the redirection only seems to have worked in one case, http://si.ckan.net.

The problem still affects the following sites - the first 4 of which have supposedly been merged:

Please sort this out by redirecting, removing the sites, giving a more helpful (and accurate) failure message, etc, as appropriate.

1343045168000000 1343051608000000
#2732 enhancement ross ckan-backlog assigned New file upload functionality

We should simplify upload and storage of files, initially only to local storage with archiver eventually being fixed to archive data externally. WIP pad is http://ckan.okfnpad.org/uploads

Simplifying uploads

Currently uploads are too painful/difficult/fiddly to use and/or configure. We want to simplify uploads so that they are done directly to the CKAN server, without support for remote services (S3 etc) and/or the dependencies it introduces.

We want to fix:

  • File uploads themselves
  • Storage of uploaded files
  • Notification of the upload to other components

File uploads

Things file upload should do:

  • Allow sysadmin to disable
  • Allow auth'ed users to upload
  • Store whatever they send on disk, and store DB entry linking the file to the person
  • When creating the resource, the user should be able to choose from all of the files they have uploaded but not yet associated with a resource. This will allow for bulk upload and then a delayed association. Whenever a user creates a resource they either upload a file now, or see previously uploaded files.
Can we do the upload asynchronously and then associate the
uploaded key with the resource before the save? What happens
if the user tries to submit before the async upload finishes? Should
we delay them?

The upload workflow should look like...

  1. File upload should be a straightforward file upload with normal auth checks and normal processing of the posted data.
    1. When called via ajax then the ID of the newly created file should be returned,
    2. When called via WUI then it should also be given the url to redirect to after the file upload has been handled - the id will be passed as a query param.
  2. The resource save should check whether it has a file id and in that case updates the file object to point to the resource.

This should enable:

  • Separate file upload into a user's temporary store, either individually or as a batch.
  • Creating resources and simply choosing from previously uploaded, unassigned files
  • Adding files/data to a resource after the fact.

File storage

File storage should be local to the CKAN install, and not a remote service. Any archiving to remote storage providers should be outside of the main request.

File storage should:

  • allow moving data, a sysadmin should be able to move the storage root and change configuration and have the system continue running (i.e. don't store absolute paths).
  • provide maintainability, it should be easy to determine which old files are not associated with resources and thus can be cleaned up.
  • allow for collection of information (i.e. estimate of storage space used)
  • check whether there is enough space and handle the consequences cleanly
  • ensure files are written only underneath its own root folder; checks should be made after any path generation that the file path begins with the location of the file storage.
  • Have a configurable maximum accepted blob size during upload.
  • Should store what meta-data was provided with the upload, such as mimetype.

Somewhere in the DB we should store ...

Column        Notes
id            An identifier
owner         The owning user, who uploaded the file
path          The path (from the 'storage root') to the file
size          The size in bytes of the file on disk
mimetype      The mimetype of the file, as provided by the uploader
upload_date   When the data was uploaded
resource      The ID of the resource it belongs to. A unidirectional relationship.
archived_url  The URL where this file has been archived

Generating paths should try and separate the files, perhaps based on username of the owner, or some other mechanism to avoid a single folder full of files.
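
A minimal sketch of the path rules above, assuming paths are stored relative to the storage root and split into folders by the owner's username. The function names `relative_path` and `resolve` and the two-letter prefix scheme are illustrative assumptions, not an existing CKAN API.

```python
import os


def relative_path(owner, file_id):
    """Build a path relative to the storage root, split by owner name."""
    # Assumed scheme: a two-letter prefix folder, then the owner, then the file.
    return os.path.join(owner[:2], owner, file_id)


def resolve(storage_root, rel_path):
    """Return the absolute path for rel_path, refusing anything outside the root."""
    root = os.path.realpath(storage_root)
    full = os.path.realpath(os.path.join(root, rel_path))
    # Check after path generation that the result is still under the root.
    if os.path.commonpath([root, full]) != root:
        raise ValueError("path escapes the storage root: %r" % rel_path)
    return full
```

Only the relative path would be stored in the database, so moving the storage root is a configuration change and no stored rows need rewriting.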

Notifications

We need to make sure that it is possible to notify other components within the system that an upload has taken place, or at least make it easy for them to be notified. The primary use case for this is to notify the component that will translate/upload certain formats to the data store.

We could do this based on the post-upload update to the file model (i.e. when we record the total received size of the file).

1343058789000000 1346663270000000
#2733 enhancement johnglover johnglover ckan-v1.9 new Datastore logic functions

Where does the data go?

In a postgres database configured by the ckan.datastore_write_url config option which is a sqlalchemy url.

The user should have rights to create tables.

What's the API like?

We will just implement it as logic functions like the rest of CKAN, and it will be part of core. After that we may add some nicer API functions that use these, but that is a secondary concern.

What are the initial logic functions?

  • datastore_create
  • datastore_delete
  • datastore_show

What is the JSON input format for datastore_create?

To begin with it can have the following keys. It is fairly consistent with Max Ogden's gut servers, except that it adds resource_id.

{
resource_id: resource_id # the resource the data is going to be stored against.
fields: a list of dictionaries of fields/columns and their extra metadata.
records: a list of dictionaries of the data, e.g. [{"dob": "2005", "some_stuff": ["a", "b"]}, ..]
}
  • The first row will be used to guess the types of any fields not listed in fields, and the guessed types will be added to the headers permanently. Subsequent rows have to conform to the field definitions.
  • records can be empty so that you can just set the fields.
  • fields are optional, but needed if you want to do type hinting, add extra information for certain columns or explicitly define ordering.

e.g. [{"id": "dob", "type": "timestamp" }, {"id": "some_stuff", "type": "text"}, ...]. A header item's values cannot be changed after it has been defined, nor can the ordering of the items be changed. They can be extended though.

  • Any error results in total failure!! For now pass back the actual error.
  • Should be transactional
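
A rough sketch of the type guessing described above: declared fields are kept as they are and any undeclared key in the first record gets a guessed type. The mapping from Python values to type names in `guess_type` is an assumption for illustration, as are both function names; the ticket does not specify them.

```python
def guess_type(value):
    """Guess a column type name from a single value (assumed mapping)."""
    # bool is tested first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "numeric"
    # Everything else, including lists and dicts, falls back to text.
    return "text"


def complete_fields(fields, records):
    """Return the declared fields plus guessed fields from the first record."""
    declared = {f["id"] for f in fields}
    result = list(fields)
    if records:
        for key, value in records[0].items():
            if key not in declared:
                result.append({"id": key, "type": guess_type(value)})
    return result
```

With `fields` declaring only `dob` as a timestamp, a first record that also carries `some_stuff` would add `{"id": "some_stuff", "type": "text"}` after the declared field.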

What JSON does datastore_delete take?

{
resource_id: resource_id # the resource whose data is going to be deleted.
filters: dictionary of matching conditions to delete
    e.g. {'key1': 'a', 'key2': 'b'}; this will be equivalent to "delete from table where key1 = 'a' and key2 = 'b'".
    If there are no filters (either not present or not defined) then delete the table. If we want truncate then add truncate: true to truncate the table.
}
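
As a sketch of how the filters could be turned into SQL, assuming the table is named after the resource id and that the database driver uses `%s` placeholders (as psycopg2 does). `quote_ident` and `build_delete` are hypothetical helpers, and the truncate option is left out.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_delete(resource_id, filters=None):
    """Return (sql, params) for a delete request with optional filters."""
    table = quote_ident(resource_id)
    if not filters:
        # No filters present or defined: delete the whole table.
        return "DROP TABLE " + table, []
    keys = sorted(filters)  # sorted only to make the output deterministic
    where = " AND ".join(quote_ident(k) + " = %s" for k in keys)
    return "DELETE FROM " + table + " WHERE " + where, [filters[k] for k in keys]
```

Values are passed as parameters and never interpolated into the statement, so a filter value cannot change the shape of the query.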

What JSON does datastore_search take?

{
resource_id: resource_id # the resource whose data is going to be selected.
filters: dictionary of matching conditions to select
    e.g. {'key1': 'a', 'key2': 'b'}; this will be equivalent to "select * from table where key1 = 'a' and key2 = 'b'"
q: full text query
limit: maximum number of rows to return, default 100
offset: number of rows to skip
fields: list of fields to return, in that order; defaults (empty or not present) to all fields in fields order.
sort: comma separated field names with ordering, e.g. "fieldname1, fieldname2 desc"
}

Some free code: https://gist.github.com/3163864

What json does datastore_search return?

{
fields: same type as datastore_create accepts (i.e. with metadata)
offset: The same offset that was supplied in datastore_show
limit: The original limit
filters: The filters that were applied in datastore_show
total: # total matching records, ignoring limit and offset
records: [same as datastore_create] # list of matching results
}

On error will return:

{
__error__: … sql error …
}
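
To illustrate the relationship between total, limit and offset in the response, here is an in-memory sketch that filters a plain list of dicts in place of a real table. The function name `search` is hypothetical, and q, fields and sort are left out for brevity.

```python
def search(records, filters=None, limit=100, offset=0):
    """Filter records by equality and return a search-style response dict."""
    filters = filters or {}
    matches = [r for r in records
               if all(r.get(k) == v for k, v in filters.items())]
    return {
        "offset": offset,
        "limit": limit,
        "filters": filters,
        # total counts every match, regardless of limit and offset
        "total": len(matches),
        "records": matches[offset:offset + limit],
    }
```

A client can page through results by repeating the call with a growing offset until offset reaches total.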

What types are allowed?

Aim to support as many postgres/postgis types that have string representations.

http://www.postgresql.org/docs/9.1/static/datatype.html

http://www.postgresql.org/docs/9.1/static/sql-createdomain.html

IDs

Each row in a table will be given an _id column which has an id generated by us which you can use in queries.

Other Features

Each row will store the _full_text index of all the data in the row. At some later point there will most likely be a way to index fields, add constraints, etc.

1343058886000000 1343656105000000
#2735 enhancement toby shevski demo phase 5 assigned Dataset order on user page

I think the datasets on user pages http://s031.okserver.org:2375/user/me should be ordered by latest updated (with most recent at the top) instead of in alphabetical order.

What do you think?

1343062877000000 1344349245000000
#2745 defect amercader ckan-v1.9 new Password reset returns an exception if the key parameter is missing

Instead of showing a notice, the password reset page throws an exception if the key parameter is missing:

Module ckan.controllers.user:329 in perform_reset
         c.reset_key = request.params.get('key')
               if not mailer.verify_reset_link(user_obj, c.reset_key):
                   h.flash_error(_('Invalid reset key. Please try again.'))
                   abort(403)
 if not mailer.verify_reset_link(user_obj, c.reset_key):
Module ckan.lib.mailer:100 in verify_reset_link
     if not user.reset_key or len(user.reset_key) < 5:
               return False
           return key.strip() == user.reset_key
 return key.strip() == user.reset_key
AttributeError: 'NoneType' object has no attribute 'strip'

Apart from the obvious fix of checking for the 'key' parameter, it seems to be quite common to get these reset urls without the key parameter, so I suspect some email clients might strip the query params when building the links. We could avoid this problem by making the key part of the url instead of a param:

http://thedatahub.org/en/user/reset/3086e91c-fe09-4a98-92e1-19de67a9ac9d/b4c2d03fa8

instead of:

http://thedatahub.org/en/user/reset/3086e91c-fe09-4a98-92e1-19de67a9ac9d?key=b4c2d03fa8
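
A minimal sketch of the 'obvious fix', mirroring the logic visible in the traceback with an added guard for a missing key. The `User` class is a hypothetical stand-in for CKAN's user object, and this is not necessarily how the fix should land in ckan.lib.mailer.

```python
class User(object):
    """Hypothetical stand-in for the user object, holding only the reset key."""

    def __init__(self, reset_key):
        self.reset_key = reset_key


def verify_reset_link(user, key):
    """Return True only when a non-empty key matches the user's reset key."""
    if not key:
        # The 'key' query parameter was missing or empty: treat it as invalid
        # so that None.strip() is never called.
        return False
    if not user.reset_key or len(user.reset_key) < 5:
        return False
    return key.strip() == user.reset_key
```

With this guard the controller's existing "Invalid reset key" branch would handle a missing parameter, so the user sees the flash message in place of an exception.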

1343145931000000 1343145931000000
#2748 enhancement shevski demo phase 5 new add 'add new resource' button to sidebar

When editing a resource you see the current and any other existing resources in sidebar

(e.g. see http://s031.okserver.org:2375/en/dataset/format-error-test/resource_edit/d1eac556-c16f-44af-8148-5e3467b57cf8?inner_span=True)

Would be good to have a pretty 'add new' slightly transparent resource folder/pointer underneath, letting you add resources from the end resource page

1343212878000000 1344503744000000
#2751 enhancement toby toby demo phase 5 new check translations for full demo site

need to check everything gets translated - sean did this before so will have info

1343216443000000 1344243046000000
#2758 enhancement toby toby ckan-v1.9 new file storage gives error if config not available but no useful user information

We get an error that should be improved. The actual problem is the following, but it is not passed on to the user:

KeyError: 'ofs.impl'

Module ckan.controllers.storage:2 in auth_form
Module ckan.lib.jsonp:26 in jsonpify
  data = func(*args, **kwargs)
Module ckan.controllers.storage:407 in auth_form
  authorize(method, bucket, label, c.userobj, self.ofs)
Module ckan.controllers.storage:200 in ofs
  StorageAPIController._ofs_impl = get_ofs()
Module ckan.controllers.storage:71 in get_ofs
  storage_backend = config['ofs.impl']
Module paste.registry:146 in __getitem__
  return self._current_obj()[key]
KeyError: 'ofs.impl'

1343287709000000 1343287709000000
#2761 enhancement seanh ckan-v1.8 new Document all the errors you can get when setting up filestore, and how to fix them

Add it to a 'Troubleshooting' section on the filestore page: http://docs.ckan.org/en/ckan-1.7.1/filestore.html

For the error messages and their solutions, see various threads on ckan-dev

1343302566000000 1343302566000000
#2762 defect seanh ckan-v1.8 new test_related.py crashes

/home/seanh/Projects/ckan/ckan/ckan/tests/functional/test_related.py

ImportError (cannot import name assert_regexp_matches)

1343303753000000 1343303753000000
#2763 defect seanh ckan-v1.8.1 new Multilingual tests failing

test_multilingual_plugin.TestDatasetTermTranslation.test_dataset_index_translation and test_multilingual_plugin.TestDatasetTermTranslation.test_group_read_translation are both failing for me on master

1343303819000000 1350303864000000
#2766 enhancement seanh shevski demo phase 4 assigned prevent draft datasets making it to activity stream

The new ckan creates datasets as part of a 3 phase process. To allow for this, partially created datasets can have a state of 'draft' or 'draft-complete'. These datasets should not be seen as active by the activity stream.

If we click 'add dataset' and then complete the first phase of adding a dataset, an activity stream item is created. When we add a resource in the next phase (add data), another activity stream item, 'xxx added resource to dataset', is created.

We do not want these added. Essentially, if a dataset has a state.startswith('draft') then we want the activity stream to ignore all actions involving it.

Finally, when the state is changed from state.startswith('draft') to state=='active' we want a 'xxx has created dataset ...' item to be added to the stream.
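
The rule above can be sketched as a single decision function. The name `activity_for` and the activity labels are hypothetical; the real activity stream code in CKAN is organised differently, so treat this only as a statement of the intended behaviour.

```python
def activity_for(old_state, new_state, action):
    """Return the activity to record for a dataset change, or None to skip it.

    old_state is None when the dataset is being created.
    """
    if new_state.startswith("draft"):
        # Anything that happens to a draft dataset is ignored.
        return None
    was_draft = old_state is not None and old_state.startswith("draft")
    if was_draft and new_state == "active":
        # Leaving draft is the moment the dataset counts as created.
        return "created dataset"
    return action
```

For example, adding a resource while the state is 'draft-complete' records nothing, and the later switch to 'active' records one 'created dataset' item.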

The best way to do this would be to branch from 2375-demo-theme-stable and get it to work there.

Let me know if you need any help with this ticket or a better explanation of the problem.

1343318795000000 1344543193000000
#2768 enhancement toby shevski demo phase 5 new normalise excel to xls

so that data proxy works

1343319382000000 1344351663000000
#2771 enhancement seanh seanh ckan-v1.8 new Documentation and examples for IDatasetForm and IGroupForm

Add minimal, working IDatasetForm and IGroupForm example extensions to core, with tests.

The IDatasetForm example should use tag vocabularies (two birds with one stone)

The IDatasetForm and IGroupForm docs are not very good (and are somewhat spread around different doc chapters), fix them up, and reference the new working examples.

Tag Vocabularies docs should reference the IDatasetForm example.

When using convert_to/from_extras() you have to remove any free extras from the form or it won't work. This needs to be documented (in the docstring maybe).

There have been recent changes to the schemas that IDatasetForm and IGroupForm use, make sure the docs are up to date.

1343392238000000 1350303564000000
#2773 enhancement markw markw ckan-v1.9 new About page needs improving

The about page for the DataHub (thedatahub.org/about) could be improved. More importantly the default about page for a generic CKAN instance should be completely different - focus more on Open Data rather than the community hub idea (as this is more relevant for most installations) and remove specific references to tdh.

1343646795000000 1343646795000000
#2775 enhancement toby aron.carroll demo phase 4 new Add bin/less to paster serve command

Ideally the ./bin/less command would be run when the server is started.

  • Also it would be good to have a paster command to build the production CSS with ./bin/less --production
  • The command could also detect missing node binaries and redirect to the documentation.
1343685686000000 1344543962000000
#2777 enhancement icmurray new bug: user attributes 1343726363000000 1343726363000000
#2780 enhancement toby shevski demo phase 4 new way for admins to undelete datasets

Since admins can see deleted datasets, there should be a way for them a) to know they are currently deleted and not viewable by normal users (ticket #2779), and b) to undelete such datasets (this ticket).

I suggest replacing the normal delete button on the edit form with a 'deleted dataset, only admins can view' message, with an undelete button next to the message.

1343737248000000 1345023811000000
#2784 defect icmurray icmurray new model dictize sensitive data

The model dictize layer doesn't consistently remove sensitive data from the dictized models. It should use the current context to decide whether to include sensitive data or not.

1343814685000000 1343814685000000
#2785 enhancement johnmartin aron.carroll demo phase 5 assigned Allow resources to be re-ordered

Not sure where this functionality should be added, possibly in one of the sidebar widgets when editing a resource?

Ira, what are your thoughts?

1343816523000000 1346235916000000
#2786 enhancement shevski demo phase 5 new target blank HTML downloads

e.g. if I click on download here: http://s031.okserver.org:2375/dataset/example-dataset/resource/d8797e51-b497-46ca-a274-8675533d110b can it take me to a new tab instead of navigating away from ckan?

1343819814000000 1343819814000000
#2788 enhancement amercader amercader ckan-v1.9 new Speed improvements on creating/updating and indexing

Especially needed when importing large numbers of datasets.

Profiling the import command from the harvesting extension has shown some areas where improvements could be made.

1343832992000000 1343832992000000
#2790 enhancement kindly toby demo phase 4 new logic.action.user_show is slow

This is a very slow call. It would benefit from the sort of speed-ups that package-search received.

For me locally this is taking 6 seconds for rufus using the datahub data I have. I think a lot of this is the dataset retrieval/dictization.

Can we just grab json blobs from solr?

Also, is it possible to specify a sort order/paging?

I've put this as a demo-theme ticket as it is a big issue on the demo: we are at 25 second page loads, which I can get down to about 8.8 secs, so this is the main pain point now.

1343852483000000 1345023734000000
#2795 enhancement toby demo phase 5 new Check validation of HTML, CSS, JS

Ensure that we are being standards compliant

1343903128000000 1343903128000000
#2796 enhancement mark.wainwright ross new Need a datahub one-pager

A one-pager explaining what the datahub is and with howto/examples for new users. This would make it much easier to explain the value in using the datahub for storing data.

1343924916000000 1345129495000000
#2810 enhancement kindly ckan-future new heroku ckan support

Get ckan working on heroku

1344364858000000 1344364858000000
#2813 enhancement toby markw demo phase 5 new Confusing sidebar on demo dataset page

On a dataset page on demo.ckan.org, the left sidebar is confusing.

  • It starts with some random links. Actually they are links to groups which the dataset is in, but this isn't clear.
  • The sidebar elements that are actually part of the dataset are 'Dataset extent' and 'License', so these should be right at the top (if they belong in the sidebar at all). Instead they are right at the bottom in the junk part of the page (i.e. probably lower than the bottom of the main page, and hence lower than anyone will scroll).
1344420206000000 1344445419000000
#2814 enhancement shevski markw assigned Demo: upload file behaves oddly
  1. Uploading a file behaves counter-intuitively (I would suggest wrongly).

When adding a new resource by uploading a file, I select a file called say create-group.png. I expect the following to happen:

  • the pathname of that file is filled in the box;
  • nothing is actually uploaded till I hit 'add' (confirming that I've got the right file etc).

Instead of this,

1344420360000000 1346670381000000
#2815 defect seanh seanh ckan-v1.8.1 new db_to_form_package_schema() strips tracking summary, isopen

If an IDatasetForm plugin with a db_to_form_schema() based on db_to_form_package_schema() (which is in turn based on default_package_schema()) is in use then the 'tracking_summary' dict and the 'isopen' bool get stripped from package dicts during validation, e.g. during package_show(), and these values are then not available to templates.

1344444427000000 1350303821000000
#2818 defect seanh danieljohnlewis demo phase 4 assigned Improve related item schema

Problem: When creating a related item (e.g. a Visualisation), if you don't put in a URL it succeeds, but on the related items and apps pages it renders it as a link to the same page. Expected: Always require a URL and it should only submit if one is added

1344504176000000 1346231718000000
#2820 defect danieljohnlewis demo phase 5 new English Language: Visualization -> Visualisation

Problem: In the English version (which has a UK flag, indicating British English), the word "Visualization" is used. For an example see the "Filter by type" drop down on the /apps page. Expected: This should be "Visualisation" in British English. Any instances of "Visualize" should be changed to "Visualise" too.

1344504455000000 1344504455000000
#2821 enhancement danieljohnlewis demo phase 5 new Featured Items on Filter

Problem: On /apps page in the Filter Results box there is a "Only show featured items" checkbox, on selection it comes up with 0 solutions. Expected: Presumably an admin can create "featured items" so that they can be randomly selected on front page (is this correct)? If there are no "featured items" in the whole database can this check box be hidden? Bug is: no UI or obvious way to create featured items. Also the checkbox looks un-styled

1344504504000000 1344505492000000
#2822 enhancement toby toby demo phase 4 new Resource additional info titles format/i18n

the title for additional info should be translated

capitalised etc

1344504620000000 1344543985000000
#2823 enhancement toby toby demo phase 5 new resource additional info title order

Order the items so that non-user fields are first (from ticket #2707)

1344504773000000 1344504773000000
#2828 enhancement toby shevski demo phase 4 new Draft datasets are confusing - tickets need creating

reported as editing datasets incorrect

e.g. clicking on edit here http://s031.okserver.org:2375/dataset/ff takes you to create dataset page http://s031.okserver.org:2375/dataset/edit/ff

but this is the correct behaviour of a draft dataset

We need to show draft datasets correctly

proper tickets need making for the different issues after review of issues with them - who can see, where, admins and viewing, orgs too etc

1344506178000000 1344547324000000
#2829 enhancement johnglover toby ckan-v1.9 new Archiver fails on 403 http response

Had this issue with the archiver on my local machine. You need to be logged in (I am admin) to see the dataset via the web front end.

$ paster archiver update -c ../ckan/development.ini 


2012-08-09 11:01:37,636 INFO  [ckanext.archiver.commands] Archival of dataset resource data added to celery queue: opencontext-chogha-mish-fauna (1 resources)
2012-08-09 11:01:37,671 INFO  [ckanext.archiver.commands] Getting dataset metadata: south-african-national-gov-budget-2012-13
2012-08-09 11:01:37,900 INFO  [ckan.lib.base]  /api/action/package_show render time 0.043 seconds
Traceback (most recent call last):
  File "/home/toby/okfn/pyenv/bin/paster", line 8, in <module>
    load_entry_point('PasteScript==1.7.5', 'console_scripts', 'paster')()
  File "/home/toby/okfn/pyenv/lib/python2.7/site-packages/paste/script/command.py", line 104, in run
    invoke(command, command_name, options, args[1:])
  File "/home/toby/okfn/pyenv/lib/python2.7/site-packages/paste/script/command.py", line 143, in invoke
    exit_code = runner.run(args)
  File "/home/toby/okfn/pyenv/lib/python2.7/site-packages/paste/script/command.py", line 238, in run
    result = self.command()
  File "/home/toby/okfn/pyenv/src/ckanext-archiver/ckanext/archiver/commands.py", line 98, in command
    response = app.post(api_url + '/package_show', data)
  File "/home/toby/okfn/pyenv/lib/python2.7/site-packages/paste/fixture.py", line 262, in post
    expect_errors=expect_errors)
  File "/home/toby/okfn/pyenv/lib/python2.7/site-packages/paste/fixture.py", line 243, in _gen_request
    return self.do_request(req, status=status)
  File "/home/toby/okfn/pyenv/lib/python2.7/site-packages/paste/fixture.py", line 406, in do_request
    self._check_status(status, res)
  File "/home/toby/okfn/pyenv/lib/python2.7/site-packages/paste/fixture.py", line 439, in _check_status
    res.body))
paste.fixture.AppError: Bad response: 403 Forbidden (not 200 OK or 3xx redirect for /api/action/package_show)
{"help": "Return the metadata of a dataset (package) and its resources.\n\n    :param id: the id or name of the dataset\n    :type id: string\n\n    :rtype: dictionary\n\n    ", "success": false, "error": {"message": "Access denied", "__type": "Authorization Error"}}
1344508484000000 1344508484000000
#2830 enhancement toby toby demo phase 4 new Need method to undelete groups

need controller action and front-end method

1344509408000000 1344547341000000
#2831 enhancement aron.carroll ckan 2.0 new Create a limited subset of markdown that's supported

Allowing people to use the full range of markdown results in extremely messy output across the site. I'd suggest limiting support to only a subset of common use cases.

  • Allow all inline elements, this allows bold, italic, code and links.
  • Allow lists.

Disallow

  • Horizontal Rules
  • Headings
  • Block quote and code (this may turn out to be useful and so could be included)

This way you get Markdown's paragraph handling and a few inline styles without breaking the entire layout of the page.

Here's the full syntax if anyone is interested http://daringfireball.net/projects/markdown/syntax

1344512467000000 1344512467000000
#2833 enhancement aron.carroll demo phase 5 new Load module templates before calling .initialize()

I think this would be a nice feature for remote loading templates if the options.template value ends in ".html".

ckan.module('my-module', {
  options: {
    template: 'my-template.html'
  },
  initialize: function () {
    this.template // This is the loaded template.
  }
});
1344531939000000 1344531939000000