{22} Trac tickets (2647 matches)

Results (2101 - 2200 of 2647)

Id Type Owner Reporter Milestone Status Resolution Summary Description Posixtime Modifiedtime
#2222 enhancement rgrp zephod closed duplicate Tests are broken for some of us: NotAPairTreeStoreException

Since my last pull I'm getting a strange new error which is presumably to do with my local config or assumptions about the storage extension.

Test output here: https://gist.github.com/2007985

1331318783000000 1331318825000000
#2238 enhancement seanh seanh ckan-future closed duplicate Deploy some test vocabs and publishers on test.ckan.net, check tutorial-style instructions for these 1332165311000000 1343392681000000
#2273 enhancement seanh seanh ckan-v1.8 closed duplicate Review publisher organisations code with Ross 1333375723000000 1340635788000000
#2312 enhancement ross ross ckan-future closed duplicate Analysis of how datasets could belong to users instead of Groups

DUPLICATE OF #2548

Currently datasets can only be part of a group but that is quite heavyweight when a single user wants to upload a single dataset. To resolve this it would be great if a dataset could be attached to a user directly - find out how.

1334663770000000 1340188765000000
#2344 enhancement seanh seanh ckan-v1.8 closed duplicate Get jenkins install script into CKAN core

After checking out a commit to test from CKAN's GitHub?, Jenkins runs a script that creates a new virtual environment and installs CKAN and its dependencies into it, and does some other necessary tasks. Jenkins then runs the tests in this virtualenv.

The install script may have to change from one commit to the next as CKAN's install instructions change, so it would be good if the script was shipped in CKAN core. That way Jenkins will run different versions of the script depending on which commit it's testing and if the tests fail because the script is wrong then that's actually a bug that needs to be fixed in CKAN core. Also the CKAN install instructions could be simplified a lot by just having users run this script instead of doing each step by hand.

1335876673000000 1340040449000000
#2380 enhancement ross ross opendatasuite 1 closed duplicate DataGM Upgrade

Provide a new test install of DataGM based on the ODS demo site being completed through June.

1337002144000000 1340627558000000
#2398 defect seanh seanh ckan-sprint-2012-05-29 closed duplicate Ubuntu 10.04 source install instructions not working? 1337160253000000 1337963342000000
#2423 enhancement ross seanh ckan-sprint-2012-05-29 closed duplicate Get rid of CKAN's lxml dependency 1337946234000000 1338193199000000
#2424 enhancement ross seanh ckan-sprint-2012-05-29 closed duplicate Get rid of CKAN's autoneg dependency 1337946279000000 1338193269000000
#2462 enhancement johnglover johnglover ckan-sprint-2012-06-25 closed duplicate Add converter to rename resource_type field to type 1338212065000000 1338309596000000
#2469 enhancement seanh seanh ckanbuild closed duplicate Find a better way to deploy CKAN instances

We want a much quicker and easier way of deploying multiple CKAN instances to the same or multiple servers and managing them (e.g. configuration, themes, extensions, upgrades, and adding new instances). Currently we just deploy and manage each instance separately, which doesn't scale.. We'd also want to have (as far as possible) one single way of deploying CKAN that works across different situations, e.g. deployment server, demo server, jenkins, local deployment for development, these should share the same tools for installing CKAN and its dependencies.

Components that this might include:

Virtualenv bootstrap script https://gist.github.com/2206132

Ansible http://ansible.github.com/

CKAN deb package

1338216064000000 1340639702000000
#2495 enhancement rgrp ckan-v1.8 closed duplicate Stats page has lost styling ...

http://datahub.io/stats

1339056932000000 1340624905000000
#2534 defect amercader seanh ckan-v1.8 closed duplicate ckanext-spatial is broken with CKAN 1.8b

Looks like recent cleanups in CKAN broke an import in the extension:

https://gist.github.com/2935688

1339754762000000 1340631124000000
#2544 enhancement icmurray ckan-future closed duplicate facet.sort is not available in the package_search action

Not all solr facet parameters are available through the pcakage_search action. In particular, facet.sort has been asked for; but this ticket should check to see if there are other parameters that would be easy to add too.

See: http://wiki.apache.org/solr/SimpleFacetParameters#facet.sort

1340013421000000 1340015580000000
#2610 enhancement aron.carroll shevski demo phase 1 closed duplicate Text / link changes

In the footer: Change 'Powered by CityData?' to Powered by DataSuite? (with link to ckan.org/datasuite)

Links in left hand side of footer should change to a link to ckan.org, link to OKFN, link to ckan.org/datasuite/about and link to Pricing Page (on ckan.org .. will be /datasuite/solutions)

In create dataset form, prefilled tag examples should include one example with a space, e.g. "mental health"

1340810738000000 1341501424000000
#2628 enhancement seanh seanh closed duplicate Add docs for upgrading a source install 1341585472000000 1343392717000000
#2629 enhancement seanh seanh ckan-v1.9 closed duplicate Move ckanext-examplevocabs into master and document 1341586555000000 1343392596000000
#2630 enhancement seanh seanh ckan-v1.8 closed duplicate UPdate docs after IDatasetForm schema change 1341586625000000 1343392411000000
#2636 enhancement aron.carroll aron.carroll demo phase 2 closed duplicate Style activity stream in user profile 1341835166000000 1342090614000000
#2681 enhancement aron.carroll shevski demo phase 2 closed duplicate autofill on resource format

This is pretty key for standardization.

Also would like us to change the grey text to be: e.g. CSV, JSON, [more examples... i have no idea what they key ones are tbh]

1342440406000000 1343135874000000
#2764 enhancement rgrp closed duplicate Simplify filestore

Definitely do:

  • Local file filestore: Remove pairtree (and OFS) and do something very simple

Options:

  1. Local file only. Allow for uploaders to GS / S3 in the background
    • Advantages: simpler, if local upload can do progress bars etc etc
    • Disadvantages: strain for web-app (upload a 2GB file what happens)
      • This is probably fixable ...
  2. Stick with how we are

New filestore without pairtree

Each uploaded file is located on disk at:

{uuid}/{filename}

Alternative:

yyyy/mm/dd/{uuid}/{filename}

Need the uuid to avoid collisions.

Metadata: Store no metadata.

1343303928000000 1343320440000000
#2832 enhancement shevski demo phase 4 closed duplicate can't add a dataset to more than one group

add to groups is a drop down menu where you can only choose one needs a new UI & logic allowing user to add new groups & potentially remove from other lists

1344521472000000 1344542984000000
#2946 enhancement dominik closed duplicate Pdf preview does not load in IE

The pdf preview does not load in IE 9.

1349090873000000 1349090992000000
#253 enhancement dread ckan-backlog assigned Package relationships

Overview

Functionality to formally associate packages. We see a need for specific parent-child, inheriting or dependency relations. Not only should this help navigation between packages in the web interface, but it also provides a mechanism to automatically pull dependencies when downloading a data package, in a similar manner as we see in software package management.

Examples

  1. There are 27 packages in data.gov.uk to do with the Data4NR's Health Poverty Index. There is currently no common link between these, unless you search for 'HPI' (which also brings up House Price Index), or look under tag 'health' (which also has 600 other results). There should be a link on each HPI package page to navigate to the other 'sibling' HPI packages, and to a 'root' package that has info about the set. This could be partially achieved using the existing tag or group concepts, but a more explicit/official/obvious marking of their relationship could be beneficial.
  1. In ckan.net is freedict, a collection of translation dictionaries. You could make each dictionary a child package and use this system. But it would probably be better to make each dictionary a different resource in the same package. (There are other ideas to denote a resource as the data making up a 'portion' of package, or a 'whole' of the package, to help people downloading datasets in the software package style.)
  1. OSM has had some Naptan data imported (bus stops), with special permission - i.e. a more liberal license. It would be useful to show this link on both OSM and Naptan packages in CKAN: OSM 'derives from' Naptan with a comment about the license change. I'm not sure this is useful to an automatic download or use of these datasets, but may aid exploration on the CKAN website and understanding the provenance of the bus stop data on it.
  1. IPCC collection of data linked / mirrored. Not sure if there are useful relationships here?
  1. Dracos gets postbox locations from crowd sourcing and OSM. We could say Dracos 'derives from' OSM.

See more examples discussed here: http://trac.ckan.org/ticket/253

Implementation

This is split into four tickets:

No need for write access to be provided API for the moment.

This ticket also encompasses ticket:169 (Package derivations) and ticket:176 (Package dependencies).

1266854721000000 1339774726000000
#301 enhancement rgrp assigned Package discussion pages

A package discussion page is like a wikipedia discussion page: an editable free text page for people to have discussion/post comments about a given package.

It provides a way for people to make suggestions about a package without needing access to main package.

1272301033000000 1340632055000000
#350 enhancement dread ckan-backlog reopened Search engine optimisation

Need to research what can easily be done to improve CKAN packages in the search rankings.

Comments from Glen Barnes:

We've been pretty successful at SEO without even really trying (see http://www.google.co.nz/search?client=safari&rls=en&q=auckland+google+transit+feed&ie=UTF-8&oe=UTF-8&redir_esc=&ei=dsYSTOzJLs2eceuZiI8I as an example). This to me is key. If we are to make data available it has to be findable which is the main reason for a catalogue. There are probably things we should be doing on CKAN like using slugged urls (http://www.ckan.net/package/ascoe -> http://www.ckan.net/package/ascoe/atmospheric-chemistry-studies-in-the-oceanic-environment), setting the H1 tag correctly ("Atmospheric Chemistry Studies in the Oceanic Environment" on the example above). Some basic SEO 101 on page optimisations.

1276594541000000 1339774690000000
#382 story johnbywater johnbywater v1.1 closed Measure quality of service parameters

As a service administrator, I want to measure responsiveness, throughput, and availability of a CKAN service.

1280346974000000 1280854608000000
#384 story johnbywater johnbywater v1.1 closed Send alert when QoS measurements break expectation 1280347841000000 1280496812000000
#471 story johnbywater ckan-v1.2 closed API user sends API key in correctly named header 1282312088000000 1283248729000000
#472 story johnbywater ckan-v1.2 closed API user sends API key in incorrectly named header 1282312108000000 1282932740000000
#473 story johnbywater ckan-v1.2 closed API user discovers correct header for sending API key 1282312203000000 1283248736000000
#491 story johnbywater ckan-v1.2 closed Get form for creating harvest source entity 1282427008000000 1284493173000000
#492 story johnbywater ckan-v1.2 closed Submit harvest source create form response to the API 1282427042000000 1284493145000000
#493 story johnbywater ckan-v1.2 closed Get harvest source entity 1282427083000000 1284493130000000
#494 story johnbywater ckan-v1.2 closed Get form for updating remote metadata entity 1282427106000000 1285198894000000
#495 story johnbywater ckan-v1.2 closed Put form for updating remote metadata entity 1282427150000000 1285198898000000
#545 requirement johnbywater johnbywater ckan-v1.2 closed The system shall support creating packages via the Form API 1283339310000000 1283458648000000
#546 story johnbywater johnbywater ckan-v1.2 closed Get the package create form from the API 1283339414000000 1286896814000000
#547 story johnbywater johnbywater ckan-v1.2 closed Submit package create form response to the API 1283339487000000 1283458645000000
#567 story johnbywater ckan-v1.2 closed Post new harvest job for given harvest source 1284039943000000 1284689374000000
#568 story johnbywater johnbywater ckan-v1.3 closed Pull metadata documents from given harvest source entity 1284040131000000 1288038207000000
#572 story johnbywater johnbywater ckan-v1.3 closed Write CKAN package from metadata document

The attributes we need to read in are being defined here: http://okfnpad.org/uklii

The design advice is to prepare, test, and maintain a dictionary of XPaths for each of the attributes.

Work effort is to be directed to the maintainability of the XPath statements. We need to know which are needed for which documents, and to have a way of preventing cruft.

1284040784000000 1288038237000000
#577 story johnbywater ckan-v1.2 closed Get remote metadata harvest job errors 1284209381000000 1285348049000000
#578 story johnbywater ckan-v1.2 closed Get remote metadata harvest job 1284209407000000 1284493202000000
#579 story johnbywater ckan-v1.2 closed Delete remote metadata harvest job 1284209440000000 1285260356000000
#598 story johnbywater ckan-v1.2 closed List remote metadata entities for given publisher 1284213730000000 1286200795000000
#613 task johnbywater johnbywater closed Store result of schema validation check on metadata document object 1284218828000000 1286798466000000
#653 requirement dread ckan-backlog new Trackback links for packages

When people link to a package, a track-back link is auto-created. (Similar system as for blogs).

As suggested by Tim Davies:

Allowing some form of ‘track back’ against datasets When a non-technical user comes to look at a dataset it would be really useful for them to be able to see if anyone has created an interface interpretation of it already.

I found quite a few cases in research of end-users struggling to make sense of a dataset when good interfaces to that data had already been built and blogged about, but without there being any link from the dataset listing to those data uses. Accepting track backs could also make it easier for technical users to find blog posts / shared code etc. relating to a given dataset.

1285062025000000 1339774636000000
#654 bug johnbywater ckan-v1.2 closed Harvest sources and jobs should return 404 when missing (not 500) 1285170717000000 1285254097000000
#655 story johnbywater ckan-v1.2 closed Return status code 404 when harvest source is not found 1285252399000000 1285254088000000
#656 story johnbywater ckan-v1.2 closed Return status code 404 when harvesting job is not found 1285252429000000 1285254084000000
#673 task johnbywater closed Construct and send CSW GetRecordById request 1286214712000000 1286786758000000
#674 task johnbywater closed Extract metadata documents from CSW GetRecordById 1286214780000000 1286787032000000
#688 task johnbywater johnbywater ckan-v1.2 closed Example GeoNetworks service for CSW development 1286786893000000 1286980827000000
#728 requirement amercader johnbywater ckan-backlog assigned CSW Harvesting shall be optimised in respect of reharvesting only records that have changed

Hi Will, this is important again because some CSW servers we use have over 300 documents in. Could you take a look at modifying the filter please?

1287675340000000 1310124784000000
#737 enhancement dread ckan-backlog new Markdown syntax summary page

I suggest we produce a quick Markdown cheat-sheet page, showing the key runes: e.g. create a title and quote some text. This page can link to the full Markdown docs for advanced users.

A user going to the Markdown docs that we link will have to read a couple of pages of the raison-d'etre of Markdown before he gets to the syntax. And it's not very easy to read, and being white on black it looks like proper geek stuff.

1287766749000000 1323170239000000
#763 enhancement dread ckan-future assigned Read-only mode - Setup

Admin configures entering read-only mode in one of two places:

  • CKAN config file (e.g. ckan.ini)
  • environment variable from Apache config

Once enabled, no writes can occur to the database (including user ratings and other usage stats).

1288091506000000 1338206204000000
#765 enhancement dread ckan-backlog assigned Read-only mode - API usage

All writes to the API are captured and you are returned an error explaining the reason.

Possible errors:

  • 503 temporary maintenance
  • 403 forbidden (if server if permanently read-only)
1288091897000000 1338206123000000
#794 requirement amercader johnbywater ckan-backlog assigned Investigate reconciling UKLP Publisher and Provider with DGU

This needs more analysis, but the GEMINI2 attribute "metadata point of contact" must be reconciled with the registered publisher (or agent).

This might also be used to filter records harvested from a CSW source, but filtering also needs more analysis, as does distinction between agent and provider.

1289227811000000 1311179581000000
#811 defect cygri ckan-backlog new Extra field editing form layout breaks when there are long field names

The layout of the editing section for extra fields breaks when a field name is slightly too long. Field names jump over to the right. See http://ckan.net/package/edit/dbpedia for examples.

1289994812000000 1323170289000000
#812 defect cygri ckan-backlog new Package edit form only allows three extra fields

Rationale

The package edit form is restricted to three extra fields. To enter more than three fields, one has to save the package and hit edit again (or hit preview).

Implementation

A mechanism similar to the one for resources (where you can add lines as you go) would solve this. So, have a button that adds more extra field rows via JS. (Extra fields don't need up/down buttons that the Resource table has)

Nice to have: a blank field is added when you tab from the last filled-in field in the table.

1289995010000000 1311176917000000
#818 requirement cygri ckan-backlog new Rethinking the author and maintainer fields

The semantics of the Author and Maintainer fields are really unclear at the moment. This leads to very inconsistent usage. Also, perhaps Name and Email are not the only fields that are needed for a contact.

Here is a table that shows the current usage of these fields in CKAN: http://richard.cyganiak.de/2010/ckan/ckan-ppl.html

We note several problems:

  • Author and Maintainer are often the same
  • Author and Maintainer are often used interchangeably
  • People really want to specify URLs for the contacts and stick them into random places because there is no field for it
  • Multiple comma-separated names in a single field

I'm not sure what to do about this, but a redesign is necessary in my opinion.

Some ideas:

  • Remove the maintainer field?
  • Make really clear that Author doesn't refer to the metadata on CKAN, but to the original data
  • Add an “author URL” field?
1290003524000000 1339774621000000
#1168 enhancement thejimmyg dread ckan-backlog assigned Test system for deb packaging

Get buildbot to:

  • build the deb packages
  • install them into a fresh virtual machine
  • run smoke tests on the installed ckan
1306441994000000 1330990423000000
#1341 enhancement kindly kindly ckan-backlog reopened Delete spam users from ckan

Spam users where added to thedatahub and we need to clean them.

1315995034000000 1320141540000000
#1447 defect kindly dread ckan-backlog assigned disk space leakage

Periodically we see some CKAN servers fall over because they run out of disk space. We need to find out if there is a common cause and fix it.

One problem in the past has been file handles running out when creating lots of tiny files in the data directory.

Another problem has been several enourmous backups being created every day - pdeu on eu25.

1320666843000000 1340727330000000
#2202 enhancement rgrp ckan-future reopened Display page view count on dataset and resource pages

Just like we display download counts we should display view counts.

1330765455000000 1338204929000000
#2243 enhancement seanh seanh ckan-v1.9 reopened Fix ckanext-example 1332172710000000 1340635768000000
#2331 defect kindly rgrp ckan-sprint-2012-05-29 reopened Search should AND terms not OR terms

Appears current default search in CKAN ORs terms rather than ANDing them (i.e. adding more terms increasing number of items found rather than reducing it).

Not sure when this crept in or if it has been there for a long time.

1335637485000000 1356474344000000
#2457 enhancement johnmartin aron.carroll demo phase 4 assigned Create demo tags list page

This includes the tag page as well for now.

Discussion:

https://okfn.basecamphq.com/projects/9558659-demo-ckan-front-end/posts/62998445/comments

Implementation:

http://s031.okserver.org:2375/en/tag

1338211735000000 1352658878000000
#84 enhancement kindly rgrp ckan-future assigned Revert support on versioned objects

Basic revert in the classic wiki form is already support by purging a Revision. However may wish to support:

  1. Cases where multiple objects changed in a revision but only want to revert 1 (low priority)
  2. Want to revert but have reversion as a new revision of that object.

Seems low priority at present.

Cost: ?

1248339543000000 1340626385000000
#140 enhancement rgrp ckan-backlog new News section on front page

Have a news section (suggest as a sidebar item).

News section will link to latest 3/4 blog posts on CKAN from blog.okfn.org.

Details:

  • Suggest pulling via rss or similar.
  • Will want to cache this ...

Cost: 4h?

1254902541000000 1265625159000000
#143 enhancement thejimmyg dread ckan-backlog assigned Most active users listed on homepage

Display league of users' recent activity on homepage.

1255010373000000 1312372381000000
#235 enhancement tobes dread ckan-v1.9 assigned Resource format normalization and detection

Try to gather proper MIME information for all package resources in CKAN. This is a shared ticket with dcat-tools (https://bitbucket.org/pudo/dcat-tools), i.e. opendatasearch.org. This can then also be used by ckanrdf, the CKAN RDF conversion service.

Sub-tasks:

  • Create a Google Spreadsheet with two Worksheets: "MIME-Mappings", i.e. "CSV" -> "text/csv" and "Name mappings", i.e. "text/csv" -> "Comma-Separated Spreadsheet".
  • Collect and map surface forms from all CKANs
  • Access this via Swiss and apply, store as a PackageResource? extra field pending #826 (Resource extras).
  • Add heuristics for format auto-detections:
    • Map well-known file extensions
    • Recognize obvious magic (Zip, Tar)
    • Peek into Zipfile/Tarfiles?
  • Define a convention for generic data types (many CKAN packages have only "Spreadsheet" defined, either detect specific type or set MIME to */tabular-data or similar)
  • See also: #816 (Autocomplete for the resource format field)
1263827604000000 1340627624000000
#250 enhancement icmurray dread ckan-v1.9 assigned RDF link in Atom feed

Add link to RDF representation of a package in our Atom feed.

1266507695000000 1340631430000000
#256 requirement dread ckan-backlog assigned Package relationships - 3. Edit in WUI

WUI:

  • Editable as part of package or separately? (e.g. like authz)
  • Do we normalize to only one type name of the pair?
  • Do we allow create of relationship from both ends (e.g. only from dependency to dependent or either way?)
1266928561000000 1339774714000000
#277 enhancement zephod dread ckan-backlog assigned Set some config options / settings in WUI (extension)

Use case

As a ckan administrator I want to easily change options about the CKAN install.

Implementation

Settings to be in DB

Suggested:

## Title of site (using in several places including templates and <title> tag
ckan.site_title = CKAN

## Logo image to use (replaces site_title string on front page if defined)
ckan.site_logo = http://assets.okfn.org/p/ckan/img/ckan_logo_box.png

## Site tagline / description (used on front page)
ckan.site_description = 

## Used in creating some absolute urls (such as rss feeds, css files) and 
## dump filenames
ckan.site_url =

## Favicon (default is the CKAN software favicon)
ckan.favicon = http://assets.okfn.org/p/ckan/img/ckan.ico


## An 'id' for the site (using, for example, when creating entries in a common search index) 
## If not specified derived from the site_url
# ckan.site_id = ckan.net

## API url to use (e.g. in AJAX callbacks)
## Enable if the API is at a different domain
# ckan.api_url = http://www.ckan.net

## html content to be inserted just before </body> tag (e.g. google analytics code)
## NB: can use html e.g. <strong>blah</strong>
## NB: can have multiline strings just indent following lines
# ckan.template_footer_end = 

NB: these will still need to be stored somewhere for loading on initialization. do this in db init function ...

Settings / Options / KeyValues? Table

Columns:

  • [namespace]: ? only if KeyValues? (for settings this would then always be settings)
  • key
  • label
  • value (json)
  • type (e.g. date and to specify in advance what type should be)
  • description
  • tags: ?? (for grouping ...)

Loading settings from DB

Do this in ckan/config/environment.py

WUI

  • /ckan-admin/settings
  • Show label, plus description plus text field

Depends

  • Would be part of ckan-admin section and hence build on ticket:833 (Administrative dashboard)
1269274861000000 1318247121000000
#285 enhancement rgrp assigned Paginate list of packages on tag read page

Is this worth doing? On hmg.ckan.net start to have a lot of packages with a given tag ...

1270664606000000 1340631923000000
#331 enhancement rgrp ckan-backlog new Timezone of CKAN timestamps should be configurable

Revisions are timestamped using the server's clock, which may not relate to the expected timezone for the site. e.g. the Norway site has a server on GMT. No timezone info is displayed either.

Would like to set timezone for a CKAN instance to use in rendering revision timestamps. For example, use CET or EST timezone.

1275302440000000 1339774701000000
#338 story johnbywater johnbywater v1.1 closed Reference groups by ID in addition to name, since group names can change 1275901137000000 1280446480000000
#351 enhancement dread ckan-backlog new Homepage: list new, updated and 'hot' packages

Have a simpler list of exciting data, as opposed to the big revision list.

For example:

Hot data
===========

New packages: package1, package2, package3
Updated resources: package1, package2, package3
Popular packages: 
1276595816000000 1339774677000000
#369 enhancement shudson@… ckan-backlog new "Package Listing Key" should appear on Tag results

Currently there's a nice legend titled "Package Listing Key" that appears in right side of "Browse Packages" results. The same key should show on other search results like when searching for a tag.

1279821634000000 1339774666000000
#370 enhancement shudson@… ckan-backlog new Use better email encryption for author_email and maintainer_email

The JavaScript? email encryption used is not very reassuring. Google's MailHide? is a much better solution that is easily implemented.

http://www.google.com/recaptcha/mailhide/

Check on the Mailhide API where there are even some Python libraries already built.

1279821819000000 1339774649000000
#372 bug johnbywater johnbywater ckan-v1.2 closed Fix system limits on CKAN for DGU

Set limits in /etc/security/limits.conf so that we can always ssh in at least. Requested by DGU.

1279885752000000 1281522535000000
#837 enhancement rgrp ckan-backlog new CKAN integration with freebase gridworks / google refine

Thread: http://lists.okfn.org/pipermail/ckan-discuss/2010-November/000718.html

Scenario 1

  1. User installs Refine and CKAN extension for refine
  2. On booting refine and asked to load data they can choose from any data package on CKAN.net (or any other CKAN instance)
  3. They edit the dataset on Refine
  4. On save (or perhaps as a separate option) they are prompted as to whether they wish to sync the dataset back to CKAN (either as a new package or as a new resource on the existing package)

NB: for the dataset sync back some form of "CKAN" storage would be required (we already have storage.ckan.net running but a closer integration would be required)

Scenario 2

  1. User visits a package on CKAN.net (or another CKAN instance)
  2. There is a button on the page "View and edit this dataset in Google Refine"
  3. Click button -- ask them if they have Google refine installed
    • Yes: instructions for loading dataset into refine
    • No: load dataset in hosted version of google refine (we could run this)
  4. User edits dataset and hits save. As in previous scenario they are prompted to sync the dataset.
1291140609000000 1339774605000000
#895 defect memespring ckan-backlog new Add version number (or simular) to css/js includes query string

Updates to css after a new deploy don't come through without a hard refresh. Adding the version number to the include urls will solve this e.g.

mycssfile.css?v=12345678

1294343382000000 1339774593000000
#909 enhancement pudo ckan-backlog assigned DCat importer for CKAN

Write an importer that supports most well known variants of DCat in importing the data as CKAN packages. In particular, the following sources should be supported:

  • CKANrdf generated exports
  • opengov.se RDF (not really DCat)
  • Sunlight Nationdatacatalog as harvested by dcat-tools.
1295265958000000 1346669602000000
#924 enhancement dread ckan-backlog new Search box has no search button

The search box at the top-right of CKAN's page doesn't have a 'go' button. I feel that a larger percentage of users expect a 'go' or 'search' button on the right-hand side of the box to press to start searching. Techies tend to know the keyboard shortcut of pressing 'carriage-return' but it might be better to follow standard practise on this.

Examples with 'search' button: Internet Explorer, Firefox, Google, Amazon, trac Examples without: ?

1295867533000000 1323170436000000
#948 enhancement dread ckan-future assigned Highlight (to a sysadmin) which packages are deleted

When a customer logs in as a sysadmin then he/she see all packages, including deleted and pending ones. These are hidden to the average user, but the sysadmin has no idea of this until he clicks on the package and sees at the bottom 'state: deleted'.

It should be more obvious than that on the search view - an icon, message or crossed-out name to packages are deleted.

1296646369000000 1338206280000000
#979 enhancement kindly dread ckan-backlog assigned Edit Resource extras in the API

Follows on from #826. We can now edit resource extras in the WUI (to some extent - see #978 for remaining issues) and we can view resource extras in the API, but we can't yet edit them in the API.

1297429777000000 1315222244000000
#989 enhancement kindly pudo ckan-future new Extending the model from plugins

We need to support extending the model from plugins. This could involve:

  • Adding a plugin hook to extend the mapper
  • Adding an upgrade hook for plugin schema migrations
  • Documenting how this is to be done
  • Find a way to avoid conflicts
1297689724000000 1340034311000000
#1009 enhancement pudo rgrp ckan-backlog new Improvements to user accounts sytem
  • Forgot password (email a new password)
  • Confirm email
  • Do not show register page if you are logged in (redirect to home page)
  • ticket:1010 - listing of users.
    • Do not use /user for general user account home page (either user normal user page /user/{id} or /user/myaccount)
1298635991000000 1310128574000000
#1041 enhancement thejimmyg thejimmyg ckan-backlog assigned Start Using the CKAN Wiki for Tutorial-style documentation

For example, I will document the following:

I'd love if someone else would write:

  • An authorisation tutorial covering the core model, the command line tools and examples of every possible way of using the system
  • A HOWTO guide with screenshots for adding a package
1300284715000000 1312372367000000
#1062 defect johnglover sebbacon ckan-backlog assigned Data preview encoding error

The preview of "Species Misc Turtle Download" at http://ckan.net/package/taxonconcept results in the following error:

Unable to Preview - Had an error from dataproxy: Data Transformation Error (Data transformation failed. Reason: 'utf8' codec can't decode byte 0x8b in position 1: unexpected code byte

1301396143000000 1311773731000000
#1069 enhancement tobes rgrp assigned Stub datasets (request for datasets)

Idea is to have stubs for datasets that someone wants but don't yet exist (or haven't been discovered) in the way one has stub pages on a wiki.

We could do this within the existing model by a slight 'abuse' - create a dataset and mark it with a special tag e.g. todo.does-not-yet-exist or similar ...

(Just as we have datasets listed that exist but aren't available ...)

Alternative would be to have a request for datasets subsystem.

I prefer the stub dataset model because it's simpler, provides a simple workflow (as a dataset is found or comes into existence), and the package page provides a natural space in which to accumulate information about what is wanted and what exists.

Implementation

  • Agree a new dedicated tag. e.g. todo.does-not-exist
1301666919000000 1340632215000000
#1077 enhancement kindly rgrp ckan-backlog new Move to simpler vdm system

Option 1: 'Changeset' Model

See ticket:1135 for vdm ticket. This would involve a) moving to changeset in vdm b) doing the migration in ckan to support this.

Have developed a new "changeset" based model for revisioning in vdm.

Implementation

  • The main challenge with this change is schema and data migration

Every revisioned object has a revision_id and revision attribute.

Approximate algorithm:

Revision -> Changeset

for revtype in [PackageRevision, ...]:
    for pkgrev in package_revision:
        changeset = lookupchangeset(package_revision)
        ChangeObject(cset, (table, id), dictize(pkgrev))

Question:

  • does pkg include tags attributes or not? or we have to dictize, pkgrev, pkg2tagrev, and tag. Probably the latter.

Option 2: Simplify Revision Object Model

Just use a simpler vdm, see ticket:1136 (move to SessionExtension) and ticket:1137 (remove need for statefulness in vdm).

Discussion

Advantage of Option 1 versus 2:

  • Easier support for pending state and similar behaviour
  • No need to introduce new tables (and hence migrations) when making something revisioned (or not).

Disadvantages

  • Migration is required
  • More difficult to query revision history.
    • Could be addressed by having ChangeObject have separate cols for table name and id but would likely be more difficult.
  • Performance (?)
    • Have one big ChangeObject table to query when looking at changed objects rather than many revision tables.
      • Not sure this is a biggie as even with Revision model biggest revision object tables are probably on the order of the ChangeObject table

Conclusion

Implement Option 2 and leave Option 1 for present.

Option 1 includes Option 2 so it seems that that is required in either case (so we may as well with Option 2).

Option 1 requires significant effort (esp migration) so leave for present and then review the situation at some later date.

1302304464000000 1340034345000000
#1096 defect rufuspollock pudo ckan-future new [super] CKAN Hosted

Many users of CKAN want to have their own instance without much effort. Setting these up in separate places is a maintenance nightmare, we should much rather have some tenant separation in core CKAN. Some ideas:

  • introduce model.Site and c.site
    • site has: custom CSS, extra_template_path, title, languages list, package_form, group_form (all configured via web UI)
  • Subdomain detector to activate sites.
  • use site in Authorizer instead of System, have a NullSite? for global things
  • allow cross-site search
  • packages are in a list of sites, m:n rather than 1:n
    • list of sites is string-based, can contain sites not in site table to express harvested external material which is not editable locally.
1303235062000000 1339774484000000
#1101 enhancement sebbacon ckan-backlog new Integrate googlanalytics into site nav

There's a stats plugin (e.g. at http://trac.ckan.org/ticket/832).

Output from the googleanalytics plugin should append to that page, if the stats plugin is present.

Possibly the stats plugin and the googleanalytics plugin should be merged?

Finally, if the stats plugin is active, then a link to the stats page should be added to the main site footer.

1303393926000000 1339774582000000
#1120 enhancement tsm ckan-backlog new Atom feeds of each tag

Tags could/should have an Atom feed. This would mean that every edit to relevant packages could be easily monitored. See [1].

[1] http://lists.okfn.org/pipermail/ckan-discuss/2011-May/001162.html

1304309285000000 1339774568000000
#1130 enhancement lucychambers assigned First time users

Send users to FAQ first time on CKAN

1304938761000000 1340633514000000
#1134 CREP amercader ckan-backlog new CREP0003: Description and Configuration of Harvesters

Proposer: Adrià Mercader

Abstract

The new harvester interface allows to create harvesters for different sources, but right now harvesters don't have many ways to describe and configure themselves. We need a way of allowing them to:

  • Expose their type and other details so they can be used internally and on the UI.
  • Define configuration settings for particular harvester instances.

The Problem

Harvester description

The current UI for adding and editing harvest sources is the same used in ckanext-dgu, and thus the 3 harvester types used in DGU to harvest various GEMINI realted sources are hardcoded in the form. The form will be migrated to a DGU-independent one, so we need the harvesters to provide all the necessary data. There is a current get_type method that returns the harvester type, but for make it compatible with the DGU forms, it returns a machine-readable string (e.g. "CSW Server"), making it error prone.

Arbitrary configuration

In the current implementation, when the harvest process is started, ckanext-harvest looks for all the available plugins that implement the IHarvester interface and calls the appropiate methods for the current stage (gather_stage,fetch_stage,import_stage). At these stages, harvesters have no way of applying arbitrary configuration options, so all harvesters of the same type behave on the same way. For instance, the CKAN harvester needs a way to define the API version to use when harvesting remote instances (Right now, the version 2 is hardcoded on the code).

Specification

Harvester description

Harvesters will need to provide the following information so the UI form can be built:

  • name: machine-readable name (e.g. "waf"). This will be the value stored in the database, and the one used by ckanext-harvest to call the appropiate harvester.
  • title: human-readable name (e.g. "Web Accessible Folder (WAF)"). This will appear in the form's select box.
  • description: a description of what the harvester does (e.g. "A Web Accessible Folder (WAF) displaying a list of GEMINI 2.1 documents"). This will appear on the form as a guidance to the user.

The way to provide it will be an info method that all harvesters must implement, which will return a dictionary with the previous elements:

    {
        'name': 'csw',
        'title': 'CSW Server',
        'description': 'A server that implements OGC's Catalog Service 
                        for the Web (CSW) standard'
    }

Arbitrary configuration

As different harvesters will have very different needs, we need to provide a way to persist arbitrary configuration flags for each harvest source. The more flexible way given the current architecture in my opinion would be to store the configuration options as a JSON encoded object as a property of the harvest source (There already is an unused DB field called config in the database) (Maybe using JsonType??).

This will mean adding an extra field in the harvest source form to allow entering the configuration. This could be just a simple text field where users enter the JSON encoded object or a more clever mechanism (i.e an "Add a configuration flag" link that adds two new text fields for the key and value for each flag, and a mechanism to later build the JSON object). In any case, this should probably be hidden in an "Advance options" section.

Why do it this way

Harvester description

The info method would provide a single point to get all the information related to the harvester, and future properties could be added to the dictionary returned without having to modify the interface.

Arbitrary configuration

There is an already existing config field in the database, so we won't need to change the model. Harvesters could access the config object at any of the stages. Of course they could provide default values in their implementations so users don't need to enter them everytime.

Implementation plan

Deliverables

Risks and mitigations

The highest risk on the harvesters info method side is that harvester implementation don't offer one of the necessary properties (namely name and title). This could fire a warning when showing the UI form or using the CLI.

Participants

Adrià Mercader to do it.

Progress

None yet.

1305108868000000 1339774554000000
Note: See TracReports for help on using and creating reports.