{22} Trac tickets (2647 matches)

Results (1201 - 1300 of 2647)

Id Type Owner Reporter Milestone Status Resolution Summary Description Posixtime Modifiedtime
#1702 enhancement amercader kindly ckan-sprint-2012-02-06 closed duplicate Normalize character encoding for ckan search.

Make sure accented characters are normalized when indexed and when searched for.
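As a rough illustration of the normalization described above, the sketch below folds accented characters to their base letters with the standard library. The function name `normalize_text` is hypothetical and this is not CKAN's implementation; the same function would have to be applied both at index time and at query time for matches to line up.

```python
import unicodedata


def normalize_text(text):
    """Lowercase text and drop combining accents (hypothetical helper)."""
    # NFKD splits an accented character into a base character
    # followed by one or more combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep only the characters that are not combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()
```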

1327419369000000 1327419922000000
#1646 defect zephod dread ckan-sprint-2012-01-23 closed worksforme Resource navigator options display spuriously

When viewing a dataset, the "Resources" navigation button contained the Resource titles on the Resource navigator button, instead of in a drop-down mouse-hover menu.

http://thedatahub.org/dataset/realtime-birth-data-in-bulgaria/resource/66fc5831-ce01-4954-9beb-e2889ef8a20f

Chrome/Linux?

1326452700000000 1327407044000000
#1640 enhancement amercader dread closed fixed Setup publicdata.eu harvester for Serbian CKAN datasets

Set up publicdata.eu to harvest datasets at rs.ckan.net (Serbian community CKAN).

1326370425000000 1327340939000000
#1696 defect johnglover johnglover ckan-sprint-2012-02-06 closed fixed Maintain backwards compatibility with older way of creating custom forms

To maintain backwards compatibility, the package controller (new/edit) should check to see if the controller has a package_form variable defined, and if so render the form pointed to by this variable before calling the new self._package_form() function.

This behaviour is now deprecated, however.
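A minimal sketch of the fallback described in this ticket, assuming a stand-in controller class. Only `package_form` and `_package_form()` come from the ticket; the class names, the `_get_form_template` helper and the template paths are made up for illustration.

```python
class PackageController:
    """Stand-in for the package controller; not CKAN's actual class."""

    def _package_form(self):
        # New-style hook: subclasses override this method.
        return "package/new_package_form.html"  # hypothetical template path

    def _get_form_template(self):
        # Deprecated path: honour a package_form attribute if one is defined.
        legacy_form = getattr(self, "package_form", None)
        if legacy_form is not None:
            return legacy_form
        # Otherwise fall through to the new method.
        return self._package_form()


class LegacyController(PackageController):
    # Old-style customisation: a plain class attribute naming the form.
    package_form = "package/custom_form.html"  # hypothetical template path
```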

1327326243000000 1327326421000000
#1299 enhancement seanh kindly ckan-sprint-2012-01-23 closed fixed Activity streams table migration

Migrate tables for activity streams

1314696635000000 1327322739000000
#1691 enhancement rgrp rgrp ckan-sprint-2012-01-23 closed invalid paster user create command takes password on command line

Needed to support automated deployment more easily.

Est: 15m

1327077314000000 1327314081000000
#1655 task amercader amercader ckan-sprint-2012-01-23 closed fixed Setup issues on s025 (Publicdata.eu)

Time estimate: 2d

  • Fix logs (apache, ckan, harvest): rotate, set suitable levels
  • Fix harvesting jobs: supervisord for gather consumer, cron job
  • Fix backups

Also it may be worth setting up a test instance (on s023?)

1326710844000000 1327312857000000
#1541 task icmurray icmurray ckan-sprint-2012-01-23 closed fixed Setup server for the DGU form-refactor.

To enable us to show DGU work in progress, for feedback.

1323359484000000 1327311698000000
#1645 enhancement icmurray icmurray ckan-sprint-2012-01-23 closed fixed Update and test existing DGU package form : Apply a simple theme

Theme the DGU form.

Doesn't need to be an exact replica of DGU, but just enough to show it's possible.

1326394622000000 1327311679000000
#1690 enhancement ross ross closed fixed Rename storage settings with the ckan prefix

Missed the ckan prefix on the storage settings names so this needs to be fixed.

1327064844000000 1327066713000000
#1330 enhancement rgrp ckan-backlog closed invalid Deprecate / Remove test_authz.py

test_authz.py appears to test in great detail some very specific additional authz (related to total site lock-down it seems -- introduced I think for hri project).

I think there are simpler ways to get total site lockdown (use external auth!) and this test is slow and delicate (e.g. depends on specific words in templates). Suggest removing. If we don't remove it, we should at least refactor the tests for access to certain pages to use a proper method of testing (e.g. agreed html comments in each page) rather than depending on the presence or absence of specific wording.

1315899129000000 1327060201000000
#1609 enhancement ross ross ckan-sprint-2012-01-23 closed fixed Celery task for ckanext-archiver to write to webstore.

From super Storage changes - #1574 - and http://ckan.okfnpad.org/newstorage we determined that ckanext-archiver should have a celery task for grabbing local file uploads and writing to webstore

Analysis

When I upload a file to CKAN:

  • End up with file in permanent storage
  • If file is of type ... csv, xls, xlsx, sqlite, .sql
    • End up with new db in webstore
      • Where? {username}/{resource-id}/...
        • If a single table: name it after the file name (appropriately slugified)
      • A resource *always* corresponds to a 'database' in webstore ...
      • In Data Explorer have "Sheets" tab ...
  • Resource url = /dataset/{x}/resource/{y}/link -> cached_url ...
1325582253000000 1327057030000000
#1576 enhancement rgrp rgrp ckan-sprint-2012-01-09 closed fixed Move stats extension back into core - 0.5d

Est: 0.5d.

Questions:

  • Why do this?
    • tiny extension with few dependencies - and really nice to have out of the box
    • trial for doing this on larger scale
  • Do we keep as extension (even if in core)?
    • Ans: Yes, keep as extension because:
      • Already set up that way
      • Cleaner
      • Easier to disable / enable
  • How do we integrate with main theme (e.g. have stats link)?
    • Ans: not sure (this is part of more general issue of how we update theme for varying changes elsewhere). Best answer is to have some widgetization in theme.
  • Hide ratings section (at least until we reinstate ratings #1598)
    • Ans: no, let's not bother (and having ratings there encourages us to do #1598 and/or find out whether people are interested in ratings)
1324317313000000 1327056070000000
#1683 defect dread dread ckan-sprint-2012-01-23 closed fixed Dataset search results - last item out of order

On each page of package search results, all the items are neatly sorted apart from the last item of the page. SOLR gets the sorting of the results incorrect.

This is a known issue: https://issues.apache.org/jira/browse/SOLR-1777 affecting SOLR 1.4 only (which comes with Ubuntu 10.04)

It is highlighted in CKAN test ckan/tests/functional/test_pagination.py:TestPaginationPackage of commit 39096ed54bda86d043521b08b2e14fc5e283a0ff which fails most of the time it is run (passes intermittently).

1326971864000000 1326976925000000
#1680 enhancement ross ross closed wontfix Group refactor top level element

The new group refactor allows for a hierarchy of groups where each group has a type - to be able to implement a tree of groups.

It will need a flag within the group to denote that it is a top level group, to circumvent the need to determine whether the group is a child of a parent node (for a specific use case - a publisher representing the department that only contains publishers).

Suggest 'is_top_level'

1326896362000000 1326900832000000
#1623 enhancement dread dread ckan-sprint-2012-01-23 closed fixed Dump to exclude deleted objects

The database dump currently contains all Packages and their associated objects, even those that have been set to state=deleted. We should exclude these from the dump now.

Dump = paster db simple-dump-csv/json

reasoning

The dumps are designed for end-users to be able to run scripts on the mass of data. Since end-users don't see state=deleted packages then they shouldn't need them in the dump. In fact their presence in the dump probably confuses them.

Admins get the full database anyway in the backup pg_dump.

We only included them in the user dump because it was designed before use of state=deleted was established.
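The filtering step could look roughly like the sketch below. It assumes each package is available as a dict with a `state` key; the function name `active_only` is hypothetical and the real dump code may work on model objects instead.

```python
def active_only(packages):
    """Return only packages whose state is not 'deleted' (illustrative helper)."""
    return [pkg for pkg in packages if pkg.get("state") != "deleted"]
```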

Time estimate: 2h

1326118987000000 1326892264000000
#1627 defect dread dread ckan-sprint-2012-01-23 closed fixed favicon broken

On thedatahub.org the favicon doesn't display. i.e. the CKAN logo should appear in the browser's tab.

Original ticket #48

1326207102000000 1326890614000000
#1629 defect dread dread ckan-sprint-2012-01-23 closed fixed permissions changed during upgrade to 1.5.1

This was seen on datacatalogs. When we upgraded it from 1.5 to 1.5.1 we saw some permissions being reset so that it could be spammed. Anonymous and logged in users were given anon_editor and editor permissions. I don't know what the previous config was. Permissions for sysadmins remain unaffected.

1326215162000000 1326823222000000
#1597 enhancement dread ckan-sprint-2012-01-23 closed fixed Tag search - filter by group

I want to browse tags, but filtered for a particular group. Currently our tag API doesn't allow for filtering by group.

This is important for improving groups as communities within a site #1521. It would be easy to do this by adding an option to filter by a group. BUT are there any other use cases that would warrant a more complete faceted tag search?

--

BTW I can currently draw a tag cloud for a group - I can get the top tags used in a group like this:

curl http://thedatahub.org/api/action/package_search -d '{"q":"groups:country-ca", "facet.field":"tags", "rows":"0"}'

but it only contains the top 20 tags.

1324550492000000 1326821156000000
#1613 defect dread dread ckan-sprint-2012-01-23 closed wontfix Post-dataset-edit URL has #section
  1. Viewing a dataset, hit 'Edit'
  2. Click on the "Basic Information" tab (note: URL has suffix #section-basic-information)
  3. Click 'Save'
  4. URL still has suffix #section-basic-information

Affects 1.5, 1.5.1, 1.5.2a

1325685555000000 1326813924000000
#1575 enhancement dread dread ckan-sprint-2012-01-23 closed fixed tag punctuation lost in ca.ckan.net import

Last week I imported ca.ckan.net datasets into thedatahub.org, but the tags seem to have lost their dashes, underscores and dots.

1324316860000000 1326808657000000
#1677 enhancement amercader ckan-v1.6 closed duplicate Make synchronous search the default behaviour

Right now you need to explicitly load the synchronous_search plugin in your ini file, when this is probably the behaviour that all users expect by default. We could keep a config flag to deactivate it, but synchronous search should be the default behaviour.

1326807604000000 1326807655000000
#1668 defect dread ckan-backlog new repoze version discrepancy

There's a discrepancy in repoze.who versions between the source and package installs:

  • repoze.who - package 1.0.18 vs source 1.0.19
  • repoze.who-friendlyform - package 1.0b3 vs source 1.0.8

We get a test failure [1] with the 1.0b3 version (from the ubuntu 10.04 python-repoze.who-plugins package). But we've not noticed any problems on s057 instances (br, no, ie etc) which have the package versions of repoze.who.

The reason the package install uses the earlier packaged versions rather than the ones we'd like is that repoze uses all sorts of horrendous import hacks, making it too difficult to put into our 'ckan-conflict' source package.

James suggests we 'do something horrible like dynamically patch repoze on CKAN import'.

[1] http://buildbot.okfn.org/builders/builder-ckan/builds/1371/steps/shell/logs/stdio ERROR: ckan.tests.functional.test_user.TestUserController.test_user_create_unicode

1326801746000000 1326801746000000
#1659 defect dread dread ckan-sprint-2012-01-23 closed fixed Cannot logout if CKAN mounted at non-root url

If you set WSGIScriptAlias to mount CKAN at a URL other than / then you cannot logout without adjusting the OpenID logged_out_url to match in who.ini config. e.g.

[plugin:openid]
...
logged_out_url = /sub/dir/user/logged_out

Note: all the other URLs in who.ini should not have the /sub/dir/ - it is just this one that doesn't take account of the mounting point.

The solution is to fix-up the repoze.who OpenID plugin to take account of the mounting point.
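One way to take the mounting point into account is to prefix the configured path with the WSGI `SCRIPT_NAME` at request time. The helper below is only a sketch of that idea; `mounted_url` is a hypothetical name and is not part of repoze.who or its OpenID plugin.

```python
def mounted_url(environ, path):
    """Prefix path with the WSGI mount point (SCRIPT_NAME), if there is one."""
    # SCRIPT_NAME is empty when the application is mounted at the root.
    script_name = environ.get("SCRIPT_NAME", "").rstrip("/")
    return script_name + path
```

With this, who.ini could keep `/user/logged_out` like the other URLs, and the prefix would be added per request.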

1326716302000000 1326747205000000
#1637 enhancement seanh seanh ckan-v1.6 closed fixed API call for getting the list of activity detail items for a given activity stream item

(and add test cases for it)

1326304817000000 1326737169000000
#1631 enhancement seanh seanh ckan-sprint-2012-01-23 closed fixed Add activity stream events for new/changed groups

1326304020000000 1326736381000000
#1625 enhancement seanh seanh ckan-sprint-2012-01-23 closed fixed Add activity stream events for new/changed users

This requires adding a logic function for emitting an activity stream event, and then editing the logic functions for creating or updating users and making them call the new emit event function. This same emit event function can later be used to emit activity stream events for other types of object as well.

1326187794000000 1326736328000000
#1662 defect dread closed wontfix OpenID not compatible with mounting CKAN at non-root URL

Mounting CKAN at a non-root URL was made to work properly here: #1659

Unfortunately OpenID doesn't play nicely and would require some work to get working.

1326730366000000 1326730414000000
#1656 enhancement ross closed duplicate Configuration for reverse proxying

Provide configuration for reverse proxying that will correctly handle the mapping of a URL to a sub-folder (using X-SCRIPT-NAME)

  • Analysis of the best solution [1d]
  • Implement reverse proxying in tandem with #1653 [2d]
  • Document and store the configuration files. [1d]
1326711575000000 1326711659000000
#1590 enhancement amercader amercader closed fixed Create customized feeds for the IATI Registry

We need a way to track changes on the registry (datasets edited or updated), globally and on a per country/publisher/etc. basis. RSS and Atom feeds are really popular, and after closing #191 and #1498 creating them from the search results should be fairly easy.

The following URLs are pretty self-explanatory:

http://localhost:5000/feed/registry.rss

http://localhost:5000/feed/country/AF.rss
http://localhost:5000/feed/publisher/worldbank.rss
http://localhost:5000/feed/organisation_type/10.rss

As we need to implement custom wrappers for countries, publishers, etc, we might as well offer a fully customizable feed, e.g.:

http://localhost:5000/feed/custom.rss?q=activity_count:[* TO 100]
http://localhost:5000/feed/custom.rss?publishertype=primary_source

Apart from the actual feeds, there will be a small amount of work at the template level to add the links to the suitable pages (and maybe a generic page showing all available feeds)

1324486965000000 1326711608000000
#1651 enhancement johnglover dread closed fixed Explicit link mapper

In this commit https://github.com/okfn/ckan/commit/1772a5c John Glover set map.explicit=True in ckan/config/routing.py.

The reason this was done was to avoid links collecting parameters. e.g. if you were on page /dataset/{id}/resource/{resource_id} then by default all the links on that page generated by url_for (Routes) would include the id and resource_id parameters as well. To avoid this, you had to go through all the links and add id=None and resource_id=None to the url_for parameters.

When map.explicit was changed to True, the value of the controller, action, id and any parameters were no longer automatically carried over into the generated links for the page. So previously links within the same controller didn't need to specify the controller (for example), but now they did. So when we did this we also had to fix up links that weren't explicit:

John made the config change on 5/11/2011 which was merged to master https://github.com/okfn/ckan/commit/5a01e67 21/11/2011. The related fixes mentioned were made within the same week. This all went into release 1.5.1. The requirement

1326709852000000 1326711005000000
#1647 enhancement shevski ckan-backlog new add links to ckan discuss & dev to thedatahub

In the footer as well as more clearly & directly on the About page

1326673852000000 1326707383000000
#1644 enhancement shevski ckan-backlog new Order default dataset page by most downloaded resources on thedatahub

Instead of alphabetically as we do currently, alternatively by most viewed datasets

for http://thedatahub.org/dataset

1326393542000000 1326393542000000
#1643 enhancement shevski ckan-backlog new Add fixed tags to thedatahub for better browsing

Similar to publicdata.eu, want to have themed areas such as finance, environment, census, etc and country tags

1326393293000000 1326393293000000
#1641 enhancement amercader amercader ckan-sprint-2012-01-23 closed fixed ckanext-archiver: Content-length header not reliable to check if resource has been modified

The download task in ckanext-archiver performs a HEAD request on the resource URL and checks if the "Content-Type" and "Content-Length" headers differ from the values stored to see if the resource needs to be updated [1].

The "Content-Length" header, although widely used, is not mandatory and some servers don't provide it, e.g.:

$ curl -I http://portfolio.theglobalfund.org/en/IATI/Activities?countryCode=AFG
HTTP/1.1 200 OK
Cache-Control: private
Transfer-Encoding: chunked
Content-Type: text/xml
Vary: Accept-Encoding
Server: Microsoft-IIS/7.5
Set-Cookie: ASP.NET_SessionId=3qhqekddgmre0kmk5cynq0sy; path=/; HttpOnly
X-AspNetMvc-Version: 3.0
content-disposition: attachment; filename=AFG_IATI_12012012.xml
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Date: Thu, 12 Jan 2012 12:36:43 GMT

Also worth noting that requests, the Python library that ckanext-archiver uses, sets an "Accept-Encoding: gzip" header by default, which, depending on the configuration of the remote web server, may prevent the "Content-Length" header from being sent, e.g.:

$ curl -H "Accept-Encoding: gzip" -I http://iatistandard.org/published-temp/adb-activities.xml
HTTP/1.1 200 OK
Date: Thu, 12 Jan 2012 12:12:46 GMT
Server: Apache
Last-Modified: Mon, 28 Nov 2011 15:55:35 GMT
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Type: application/xml

$ curl -I http://iatistandard.org/published-temp/adb-activities.xml
HTTP/1.1 200 OK
Date: Thu, 12 Jan 2012 11:56:23 GMT
Server: Apache
Last-Modified: Mon, 28 Nov 2011 15:55:35 GMT
Accept-Ranges: bytes
Content-Length: 2686720
Vary: Accept-Encoding
Content-Type: application/xml

All this can lead to some resources never getting updated, and of course the size property of the resource not being set.

As we need to download the resource anyway, it would be better to check if the real length of the data has been modified (and store it).
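A minimal sketch of checking the downloaded body instead of the headers. The ticket only asks for the real length; comparing a SHA-1 digest as well is an added assumption here, and `content_changed` is a hypothetical helper, not the archiver's code.

```python
import hashlib


def content_changed(data, stored_length, stored_hash):
    """Compare downloaded bytes against the stored length and hash.

    Returns (changed, length, digest) so the caller can store the new values.
    """
    length = len(data)
    digest = hashlib.sha1(data).hexdigest()
    changed = (length != stored_length) or (digest != stored_hash)
    return changed, length, digest
```

The measured length could then be saved as the resource's size property, regardless of whether the server sent "Content-Length".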

[1] https://github.com/okfn/ckanext-archiver/blob/0a189262dca4ab5b286fb6a02b4ab8a201f639f3/ckanext/archiver/tasks.py#L72

1326376420000000 1326376777000000
#1638 enhancement seanh seanh ckan-sprint-2012-01-23 closed wontfix Don't use JsonType in activity streams

Dump and load JSON explicitly instead.

1326304935000000 1326305570000000
#1592 enhancement amercader amercader ckan-sprint-2012-01-09 closed fixed Add metadata_modified and metadata_created to package_dictize output

The dict returned by package_dictize does not include metadata_modified and metadata_created. These are really useful properties, so it's worth having them on the standard package dict representation, which is used in several places, like at the template level.

1324488909000000 1326304321000000
#1446 enhancement rgrp rgrp ckan-sprint-2012-01-09 closed fixed Data Explorer v2

Super ticket: #1602 (Data Previewer v2)

We already have first pass of Data Explorer that was released as part of #1357.

Tickets include (* indicates improvement over current explorer)

Est: 10-15d (should be broken down -- partly is in recline issues)

1320665596000000 1326281658000000
#1385 enhancement dread dread closed fixed Resolve postgres permissions issues

Currently there is a problem because the docs guide us to set the sqlalchemy url to use 'localhost' i.e. loopback, whereas paster commands don't specify '-h localhost' so use unix sockets (you need to do 'sudo -u postgres'). These should be the same.

Also do we need to tell people to add a line to their postgres authentication config /etc/postgresql/8.4/main/pg_hba.conf to help things? Florian suggests:

local   std         std                          md5

/etc/postgresql/8.4/main/pg_hba.conf

1318418537000000 1326218703000000
#1624 defect dread pudo ckan-sprint-2012-01-23 closed fixed Typo in dataset edit mode

Futher Information -> Further Information

1326121197000000 1326216362000000
#1626 enhancement dread dread ckan-sprint-2012-01-23 closed fixed 'About CKAN' page update

thedatahub.org/about contains info that is very general to CKAN and really quite technical. The text should be changed to be both specific to thedatahub.org and provide the context in a non-technical way. It should be easy to customise the About page to be appropriate for, say, new-york.ckan.net - a bit of info about who runs it, plus the general stuff about CKAN powering it and that it was written by OKF to further open data.

1326205236000000 1326215877000000
#1531 enhancement kindly kindly ckan-sprint-2012-01-09 closed fixed Update group create/update so you can add capacities and group types.

The new members table needs a way so you can add arbitrary domain objects against them.

We need to extend the group schema to accept types and, instead of just being able to add packages to groups, add members with their capacities that are associated with different table rows.

4d

1323272500000000 1326155226000000
#1622 enhancement johnglover johnglover ckan-sprint-2012-01-23 closed fixed Deploy QA on DGU UAT test server - 0.5d
  • Update CKAN on DGU UAT to 1.5.1
  • Deploy Celery
  • Deploy QA extension
1326116380000000 1326127702000000
#1467 defect thejimmyg thejimmyg ckan-sprint-2012-01-09 closed worksforme CKAN dumps dgu miss certain publisher information

Pawel knows about this so David Read, Pawel and I need to find time to discuss it.

1321376042000000 1326120319000000
#1582 enhancement johnglover johnglover ckan-sprint-2012-01-09 closed fixed Deploy QA for thedatahub - 0.5d
  • deploy celery
  • deploy QA and archiver tasks
  • write up a blog post announcing QA on thedatahub
1324458494000000 1326110801000000
#1494 enhancement seanh seanh ckan-sprint-2012-01-09 closed wontfix API call for getting a user's public activity stream as rendered text

This could be implemented as a separate API call, or the rendered text versions of the activities could be added into the JSON returned by the existing API call.

This requires setting up templates for rendering activity streams items and detail items as nice, human-readable text.

There are some open questions, e.g.: Do we want the entire activity stream rendered as a block of plain text? As HTML? Or do we want a list of JSON objects, where each object contains its textual and/or HTML representations as fields?

Activity stream items and their related detail items are separate objects that each have their own textual representations.

For a mockup of the kind of text messages we want, see:

http://datahub.pudo.org/pudo

but note that this ticket is just for creating the text snippets themselves, not rendering them in an HTML page or RSS feed. Also the mockup only shows activity items and not their detail items.

1322495447000000 1326109757000000
#1540 defect amercader amercader ckan-sprint-2012-01-09 closed fixed Search API returns an error if empty parameters are provided

Both in 1.5.1b:

http://thedatahub.org/api/search/dataset?groups=lodcloud&title=

and 1.5.2a (current master):

http://test.ckan.net/api/search/dataset?groups=lodcloud&title=

Although the error message in 1.5.2a is more verbose:

"Bad request - Bad search option: HTTP code=400, reason=org.apache.lucene.queryParser.ParseException?: Cannot parse 'groups:lodcloud title:': Encountered \"<EOF>\" at line 1, column 22. Was expecting one of: \"(\" ... \"*\" ... <QUOTED> ... <TERM> ... <PREFIXTERM> ... <WILDTERM> ... \"[\" ... \"{\" ... <NUMBER> ..."

Some parameter validation before sending it to Solr should do the trick
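The validation could be as simple as dropping empty values before the query string is assembled. The sketch below assumes parameters arrive as (field, value) pairs; `build_query` is a hypothetical helper, not the search API's actual code.

```python
def build_query(params):
    """Join non-empty field:value pairs into a Solr-style query string."""
    terms = []
    for field, value in params:
        value = value.strip()
        if not value:
            # Skip empty parameters such as 'title=' so that no
            # dangling 'title:' term reaches the query parser.
            continue
        terms.append("%s:%s" % (field, value))
    return " ".join(terms)
```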

1323359388000000 1326060385000000
#1610 enhancement ross ross closed duplicate Move webstore to Postgres instead of Sqlite

The default backing store for webstore should be Postgres and not sqlite. This was agreed as part of the #1574 storage changes and on http://ckan.okfnpad.org/newstorage

1325587341000000 1325852472000000
#1611 enhancement ross ross closed duplicate Implement auth API calls for webstore/external use

As part of #1574 we decided that it would be better for webstore (and future external services) to be able to authenticate simply with CKAN-Core.

Currently webstore access the CKAN database to obtain the key for the user but it would be better if this connection was not so tightly bound and that webstore used an API as any other external service might.

Need to discuss further with dread

1325590191000000 1325846987000000
#1448 enhancement kindly kindly closed fixed Set up nice way to do celery deployment.

Celery is awkward to deploy; we need to find a way to do it more simply, i.e. using celery-pylons and supervisor. A modified version of celery-pylons may be the best solution. 1d

1320666977000000 1325774155000000
#1614 enhancement kindly kindly ckan-sprint-2012-01-09 closed fixed remove po files from git diff

It's a pain to see the difference between branches as there are normally a lot of po file changes. Make the default show that they have changed without actually showing the diffs themselves.

1325686639000000 1325689136000000
#1612 enhancement kindly kindly ckan-sprint-2012-01-09 closed fixed Group view page slow

Group show, which lists packages, is slow because the pagination does not use a query.

1325633737000000 1325688886000000
#1394 defect dread dread ckan-sprint-2012-01-09 closed fixed Resource validation error messages misleading

(Editing a dataset) If the second resource contains any validation error then it says "Resources: Package resource(s) incomplete" and "Resource 1:".

1318515262000000 1325604784000000
#1298 enhancement kindly kindly ckan-sprint-2012-01-09 closed fixed Generate activities to be put into activities table.

This should be done from the logic layer or automatically from a session extension.

1314696442000000 1325591582000000
#1599 enhancement rgrp rgrp ckan-sprint-2012-01-09 closed fixed [contrib] Simple embeddable dataset count widget (esp for group count)

Simple embeddable widget for use on 3rd party sites showing dataset counts for a given search query. Have a specific version just for groups.

  • Simple group count widget in JS for embedding in wordpress and elsewhere
    • Requested by several people (e.g Guo Xu from Econ working group). Already have something like this in CKAN JS for doing an embeddable search box.
  • All you need to do is do a dataset query over the API e.g. http://thedatahub.org/api/search/dataset?groups=economics and then embed in some html!

Estimate: 30m (for someone who knows their jquery).

1325246358000000 1325555201000000
#1588 enhancement johnglover johnglover ckan-backlog new QA - Give SPARQL endpoints a 4 star rating

Super: #1594

From Richard Cyganiak on the CKAN Discuss list:

Besides considering the media type of resources, it would also make sense to check for the presence of a SPARQL endpoint. SPARQL endpoints are recorded for more than 300 datasets on the Data Hub using the pseudo-type "api/sparql". A few more are recorded with the format "SPARQL". I suggest that datasets with such resources should also be considered for the fourth star.

1324480405000000 1325475178000000
#1589 enhancement johnglover johnglover ckan-backlog new QA - Give 5 star rating to datasets with link metadata

Super: #1594

From Richard Cyganiak on the CKAN Discuss list:

Regarding the fifth star (is the dataset linked to others?). This cannot be automatically determined just by looking at the format. It either requires inspection of the actual data, or information about links in the metadata. As you're probably aware, we've established conventions for recording information on data links in CKAN [1], as part of the work of the lodcloud group on the Data Hub. Link information is captured for hundreds of datasets. I would claim that we have the majority of four-star datasets covered there, and hence you can determine if they should get the fifth star by checking for the presence of a links:xxx field.

1324480600000000 1325475095000000
#1439 enhancement dread ckan-backlog new Action API discoverablility

A good service API needs to be discoverable, so you are not always having to refer to the documentation html.

Maybe /api/action should return a list of actions available? (Currently this returns a 404.)

  • It would be nice to sort these into get/create/update/delete.
  • #1438 Parameters for each of the actions must be discoverable too

/api/action/{action_name} should also return the help text / parameters allowable. (Currently this returns 400 error)

1320161970000000 1325474974000000
#1598 enhancement rgrp ckan-backlog new Reinstate Ratings

Ratings were disabled approximately a year ago because:

  • Unclear purpose and UX. What did ratings tell you? How useful were they?
  • Spamming (esp by bots: you could submit an anonymous rating via a GET request which caused problems)

Both problems are solvable and it would be nice to have this feature reinstated.

  • Purpose: can make this more purposeful by limiting to logged in users (or at least distinguishing logged in from non-logged in users)
    • Even better we could allow ratings to be made public (I'm interested in what someone else I respect finds important)
  • Spamming: limit to logged in users and / or use AJAX over an API to submit ...
1325177524000000 1325474818000000
#1231 requirement kindly thejimmyg ckan-backlog closed wontfix [super] Management Information Reporting

Child tickets:

  • #1101 Integrate stats and googleanalytics into site nav

We have a spreadsheet from UKLP of statistics we'd like to generate

1311173919000000 1325474447000000
#1581 enhancement mark.wainwright@… johnglover ckan-future new Blog post about Google Analytics extension for CKAN

The CKAN Google Analytics extension has been updated to work with the latest version of CKAN, could make for a nice blog post.

Can ping John Glover in January for any details required.

Key link is: http://thedatahub.org/analytics/dataset/top though this should probably move to be under stats (e.g. http://thedatahub.org/stats/usage)

1324402800000000 1325474274000000
#1549 enhancement ross ckan-backlog closed wontfix [super] Short link tool

It would be great to have a CKAN extension that allowed users (or CKAN itself) to generate short links to other URIs (both internal and external). Once created, shortlinks made by CKAN should be changeable. This would allow uploaded content to be moved without the user's link changing at all. The tool itself might also be of use as a general link-shortener to users other than the CKAN system itself.

Another useful feature would be for this to also collect some simple analytics such as the referrer and client IP for future reference. I'm not yet sure what we would do with the analytics other than some sort of popularity metric.

Questions:

  • Core, or Extension, or Self-hosted?
1324036998000000 1325474219000000
#1577 defect rgrp dread ckan-backlog new Can't upload file with foreign chars in filename

Looks like uploading a file with foreign characters fails due to encoding reasons.

URL: http://thedatahub.org/api/storage/auth/form/2011-12-19T124447/Ministerstvo-financ%C3%AD-%C4%8Cesk%C3%A9-republiky-_-P%C5%99%C3%ADprava-rozpo%C4%8Dtu.pdf
Module weberror.errormiddleware:162 in __call__
<<              __traceback_supplement__ = Supplement, self, environ
                   sr_checker = ResponseStartChecker(start_response)
                   app_iter = self.application(environ, sr_checker)
                   return self.make_catching_iter(app_iter, environ, sr_checker)
               except:
>>  app_iter = self.application(environ, sr_checker)
Module beaker.middleware:73 in __call__
<<                                                     self.cache_manager)
               environ[self.environ_key] = self.cache_manager
               return self.app(environ, start_response)
>>  return self.app(environ, start_response)
Module beaker.middleware:152 in __call__
<<                          headers.append(('Set-cookie', cookie))
                   return start_response(status, headers, exc_info)
               return self.wrap_app(environ, session_start_response)
           
           def _get_session(self):
>>  return self.wrap_app(environ, session_start_response)
Module routes.middleware:130 in __call__
<<                  environ['SCRIPT_NAME'] = environ['SCRIPT_NAME'][:-1]
               
               response = self.app(environ, start_response)
               
               # Wrapped in try as in rare cases the attribute will be gone already
>>  response = self.app(environ, start_response)
Module pylons.wsgiapp:125 in __call__
<<          
               controller = self.resolve(environ, start_response)
               response = self.dispatch(controller, environ, start_response)
               
               if 'paste.testing_variables' in environ and hasattr(response,
>>  response = self.dispatch(controller, environ, start_response)
Module pylons.wsgiapp:324 in dispatch
<<          if log_debug:
                   log.debug("Calling controller class with WSGI interface")
               return controller(environ, start_response)
           
           def load_test_env(self, environ):
>>  return controller(environ, start_response)
Module ckan.lib.base:123 in __call__
<<          # available in environ['pylons.routes_dict']    
               try:
                   return WSGIController.__call__(self, environ, start_response)
               finally:
                   model.Session.remove()
>>  return WSGIController.__call__(self, environ, start_response)
Module pylons.controllers.core:221 in __call__
<<                  return response(environ, self.start_response)
               
               response = self._dispatch_call()
               if not start_response_called:
                   self.start_response = start_response
>>  response = self._dispatch_call()
Module pylons.controllers.core:172 in _dispatch_call
<<              req.environ['pylons.action_method'] = func
                   
                   response = self._inspect_call(func)
               else:
                   if log_debug:
>>  response = self._inspect_call(func)
Module pylons.controllers.core:107 in _inspect_call
<<                        func.__name__, args)
               try:
                   result = self._perform_call(func, args)
               except HTTPException, httpe:
                   if log_debug:
>>  result = self._perform_call(func, args)
Module pylons.controllers.core:60 in _perform_call
<<          """Hide the traceback for everything above this method"""
               __traceback_hide__ = 'before_and_this'
               return func(**args)
           
           def _inspect_call(self, func):
>>  return func(**args)
Module ckanext.storage.controller:2 in auth_form
Module ckan.lib.jsonp:26 in jsonpify
<<      Very much modelled after pylons.decorators.jsonify .
           """
           data = func(*args, **kwargs)
           return to_jsonp(data)
>>  data = func(*args, **kwargs)
Module ckanext.storage.controller:301 in auth_form
<<          method = 'POST'
               authorize(method, bucket, label, c.userobj, self.ofs)
               data = self._get_form_data(label)
               return data
>>  authorize(method, bucket, label, c.userobj, self.ofs)
Module ckanext.storage.controller:79 in authorize
<<      if method != 'GET':
               # do not allow overwriting
               if ofs.exists(bucket, key):
                   abort(409)
               # now check user stuff
>>  if ofs.exists(bucket, key):
Module ofs.remote.botostore:53 in exists
<<          if bucket is None: 
                   return False
               return (label is None) or (label in bucket)
           
           def claim_bucket(self, bucket):
>>  return (label is None) or (label in bucket)
Module boto.s3.bucket:87 in __contains__
<<      def __contains__(self, key_name):
              return not (self.get_key(key_name) is None)
       
           def startElement(self, name, attrs, connection):
>>  return not (self.get_key(key_name) is None)
Module boto.s3.bucket:144 in get_key
<<          response = self.connection.make_request('HEAD', self.name, key_name,
                                                       headers=headers,
                                                       query_args=query_args)
               # Allow any success status (2xx) - for example this lets us
               # support Range gets, which return status 206:
>>  query_args=query_args)
Module boto.s3.connection:388 in make_request
<<          if isinstance(key, Key):
                   key = key.name
               path = self.calling_format.build_path_base(bucket, key)
               boto.log.debug('path=%s' % path)
               auth_path = self.calling_format.build_auth_path(bucket, key)
>>  path = self.calling_format.build_path_base(bucket, key)
Module boto.s3.connection:88 in build_path_base
<<      def build_path_base(self, bucket, key=''):
               return '/%s' % urllib.quote(key)
       
       class SubdomainCallingFormat(_CallingFormat):
>>  return '/%s' % urllib.quote(key)
Module urllib:1222 in quote
<<              safe_map[c] = (c in safe) and c or ('%%%02X' % i)
               _safemaps[cachekey] = safe_map
           res = map(safe_map.__getitem__, s)
           return ''.join(res)
>>  res = map(safe_map.__getitem__, s)
KeyError: u'\xed'
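
The KeyError above is raised because Python 2's urllib.quote is handed a unicode key containing non-ASCII characters. A common workaround is to encode the key as UTF-8 before quoting. The sketch below shows the idea in Python 3 syntax, where urllib.parse.quote accepts bytes; quote_key is a hypothetical helper, not a boto or CKAN function, and the label is shortened for the example.

```python
from urllib.parse import quote


def quote_key(key):
    """Percent-encode a storage key, encoding text as UTF-8 first."""
    if isinstance(key, str):
        key = key.encode("utf-8")
    # '/' is kept as-is because it is in quote()'s default safe set.
    return "/" + quote(key)


label = "2011-12-19T124447/Ministerstvo-financ\u00ed.pdf"
print(quote_key(label))
```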
CGI Variables
AUTH_TYPE	'cookie'
CONTENT_TYPE	'; charset=utf-8'
DOCUMENT_ROOT	'/htdocs'
GATEWAY_INTERFACE	'CGI/1.1'
HTTP_ACCEPT	'*/*'
HTTP_ACCEPT_CHARSET	'ISO-8859-1,utf-8;q=0.7,*;q=0.3'
HTTP_ACCEPT_ENCODING	'gzip,deflate,sdch'
HTTP_ACCEPT_LANGUAGE	'en-US,en;q=0.8'
HTTP_CACHE_CONTROL	'max-age=259200'
HTTP_CONNECTION	'keep-alive'
HTTP_COOKIE	'thedatahub_net=27a7f095fcca1ea6b36df996d595e3278b16f4538862bf7f88d49e2000b9246547c8fd0e; auth_tkt="f9c6ab2b0d9fcd71c4c2408bc12fab544eef1c45elenaibp!userid_type:unicode"; auth_tkt="f9c6ab2b0d9fcd71c4c2408bc12fab544eef1c45elenaibp!userid_type:unicode"; ckan_user=elenaibp; ckan_display_name="Elena Mondo"; ckan_apikey=decd48b1-49ee-4250-bff4-98ccca9c02a5; hide_welcome_message=1; __utma=119670349.1809834699.1323782464.1324293066.1324298316.4; __utmb=119670349.3.10.1324298316; __utmc=119670349; __utmz=119670349.1323782464.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)'
HTTP_HOST	'thedatahub.org'
HTTP_REFERER	'http://thedatahub.org/dataset/edit/budget-library-czeck-republic'
HTTP_USER_AGENT	'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7'
HTTP_VIA	'1.1 localhost (squid/3.0.STABLE19)'
HTTP_X_FORWARDED_FOR	'87.114.74.190'
HTTP_X_REQUESTED_WITH	'XMLHttpRequest'
PATH	'/usr/local/bin:/usr/bin:/bin'
PATH_INFO	'/api/storage/auth/form/2011-12-19T124447/Ministerstvo-financ\xc3\xad-\xc4\x8cesk\xc3\xa9-republiky-_-P\xc5\x99\xc3\xadprava-rozpo\xc4\x8dtu.pdf'
PATH_TRANSLATED	'/home/okfn/var/srvc/ckan.net/pyenv/bin/ckan.net.py/api/storage/auth/form/2011-12-19T124447/Ministerstvo-financ\xc3\xad-\xc4\x8cesk\xc3\xa9-republiky-_-P\xc5\x99\xc3\xadprava-rozpo\xc4\x8dtu.pdf'
REMOTE_ADDR	'193.34.146.142'
REMOTE_PORT	'55419'
REMOTE_USER	u'elenaibp'
REMOTE_USER_DATA	'userid_type:unicode'
REMOTE_USER_TOKENS	['']
REQUEST_METHOD	'GET'
REQUEST_URI	'/api/storage/auth/form/2011-12-19T124447/Ministerstvo-financ%C3%AD-%C4%8Cesk%C3%A9-republiky-_-P%C5%99%C3%ADprava-rozpo%C4%8Dtu.pdf'
SCRIPT_FILENAME	'/home/okfn/var/srvc/ckan.net/pyenv/bin/ckan.net.py'
SCRIPT_URI	'http://thedatahub.org/api/storage/auth/form/2011-12-19T124447/Ministerstvo-financ\xc3\xad-\xc4\x8cesk\xc3\xa9-republiky-_-P\xc5\x99\xc3\xadprava-rozpo\xc4\x8dtu.pdf'
SCRIPT_URL	'/api/storage/auth/form/2011-12-19T124447/Ministerstvo-financ\xc3\xad-\xc4\x8cesk\xc3\xa9-republiky-_-P\xc5\x99\xc3\xadprava-rozpo\xc4\x8dtu.pdf'
SERVER_ADDR	'193.34.146.146'
SERVER_ADMIN	'[no address given]'
SERVER_NAME	'thedatahub.org'
SERVER_PORT	'80'
SERVER_PROTOCOL	'HTTP/1.0'
SERVER_SIGNATURE	'<address>Apache/2.2.14 (Ubuntu) Server at thedatahub.org Port 80</address>\n'
SERVER_SOFTWARE	'Apache/2.2.14 (Ubuntu)'
WSGI Variables
application	<beaker.middleware.CacheMiddleware object at 0x7f22601c7dd0>
beaker.cache	<beaker.cache.CacheManager object at 0x7f22601c7b50>
beaker.get_session	<bound method SessionMiddleware._get_session of <beaker.middleware.SessionMiddleware object at 0x7f22601c7a90>>
beaker.session	{'_accessed_time': 1324298703.071357, '_creation_time': 1324293077.4139669}
mod_wsgi.application_group	'ckan.net|'
mod_wsgi.callable_object	'application'
mod_wsgi.listener_host	''
mod_wsgi.listener_port	'80'
mod_wsgi.process_group	'ckan.net'
mod_wsgi.reload_mechanism	'1'
mod_wsgi.script_reloading	'1'
mod_wsgi.version	(2, 8)
paste.cookies	(<SimpleCookie: __utma='119670349.1809834699.1323782464.1324293066.1324298316.4' __utmb='119670349.3.10.1324298316' __utmc='119670349' __utmz='119670349.1323782464.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)' auth_tkt='f9c6ab2b0d9fcd71c4c2408bc12fab544eef1c45elenaibp!userid_type:unicode' ckan_apikey='decd48b1-49ee-4250-bff4-98ccca9c02a5' ckan_display_name='Elena Mondo' ckan_user='elenaibp' hide_welcome_message='1' thedatahub_net='27a7f095fcca1ea6b36df996d595e3278b16f4538862bf7f88d49e2000b9246547c8fd0e'>, 'thedatahub_net=27a7f095fcca1ea6b36df996d595e3278b16f4538862bf7f88d49e2000b9246547c8fd0e; auth_tkt="f9c6ab2b0d9fcd71c4c2408bc12fab544eef1c45elenaibp!userid_type:unicode"; auth_tkt="f9c6ab2b0d9fcd71c4c2408bc12fab544eef1c45elenaibp!userid_type:unicode"; ckan_user=elenaibp; ckan_display_name="Elena Mondo"; ckan_apikey=decd48b1-49ee-4250-bff4-98ccca9c02a5; hide_welcome_message=1; _ _utma=119670349.1809834699.1323782464.1324293066.1324298316.4; __utmb=119670349.3.10...)|utmcmd=(none)')
paste.registry	<paste.registry.Registry object at 0x7f226194df50>
paste.throw_errors	True
pylons.action_method	<bound method StorageAPIController.auth_form of <ckanext.storage.controller.StorageAPIController object at 0x7f2261dad990>>
pylons.controller	<ckanext.storage.controller.StorageAPIController object at 0x7f2261dad990>
pylons.environ_config	{'session': 'beaker.session', 'cache': 'beaker.cache'}
pylons.pylons	<pylons.util.PylonsContext object at 0x7f2261daddd0>
pylons.routes_dict	{'action': u'auth_form', 'controller': u'ckanext.storage.controller:StorageAPIController', 'label': u'2011-12-19T124447/Ministerstvo-financ\xed-\u010cesk\xe9-republiky-_-P\u0159\xedprava-rozpo\u010dtu.pdf'}
repoze.who.identity	<repoze.who identity (hidden, dict-like) at 139785645747120>
repoze.who.logger	<logging.Logger instance at 0x7f225e23c098>
repoze.who.plugins	{'openid': <OpenIdIdentificationPlugin 139785625065680>, 'friendlyform': <FriendlyFormPlugin 139785618095248>, 'ckan.lib.authenticator:UsernamePasswordAuthenticator': <ckan.lib.authenticator.UsernamePasswordAuthenticator object at 0x7f2260874c10>, 'auth_tkt': <AuthTktCookiePlugin 139785625065808>, 'ckan.lib.authenticator:OpenIDAuthenticator': <ckan.lib.authenticator.OpenIDAuthenticator object at 0x7f2260874c90>}
routes.route	<routes.route.Route object at 0x7f22601a1090>
routes.url	<routes.util.URLGenerator object at 0x7f2261dadf50>
webob._parsed_query_vars	(GET([]), '')
webob.adhoc_attrs	{'language': 'en-us'}
wsgi process	'Multiprocess'
wsgi.file_wrapper	<built-in method file_wrapper of mod_wsgi.Adapter object at 0x7f2261da9af8>
wsgiorg.routing_args	(<routes.util.URLGenerator object at 0x7f2261dadf50>, {'action': u'auth_form', 'controller': u'ckanext.storage.controller:StorageAPIController', 'label': u'2011-12-19T124447/Ministerstvo-financ\xed-\u010cesk\xe9-republiky-_-P\u0159\xedprava-rozpo\u010dtu.pdf'})
1324317659000000 1325473564000000
#1240 enhancement kindly rgrp ckan-backlog assigned [super] API v4

(Just creating this ticket as somewhere to keep notes)

  • Decide on REST api versus action API
    • Do we want to support both?
  • Tidying
    • Unify on /api/v{version num}/... structure (do we want a default option that points to current default? e.g. /api/default/ ...)
    • extras merged into normal field list in package
    • Get rid of /rest/ so just have api/v1/package
    • Get rid of separation of search api from 'rest' api
      • Propose that GET on REST index is search e.g. /package/?q=...
        • This also resolves the issue whereby GET at root returns the whole package set (a *bad* idea), as this would now become the match-all search query (with a default limit on items returned)
  • Resource read/write in API (separate from package)
    • Does this need authorization work?
  • user/account API - read/write
  • Remove autocomplete -- can just use search
    • Do not worry about backwards compat as it should only be used in our JS (if others are using it, too bad!)
1311525660000000 1325473312000000
#1578 enhancement rgrp ckan-backlog new [super] Re-enable and refactor ratings 1324322443000000 1325473015000000
#1331 defect dread dread closed worksforme Setting a tag twice causes exception

To reproduce:

  1. Create a package with two tags the same: "bulk bulk"
  2. Click 'save'
  3. 500 ERROR - 'Server Error'
1315905959000000 1325355631000000
#1175 defect dread fccoelho@… closed invalid Stats extension not working

Hi, I get a 500, Internal server error when I enable ckanext-stats. Flavio

1307350823000000 1325355170000000
#1141 CREP johnglover ckan-backlog closed fixed [super] Moderated Edits User Interface

Proposer: John Glover
Seconder: James Gardner

Abstract

We are trying to achieve these goals:

  • To get people involved with making edits to CKAN metadata.
  • To have an ownership model as to who can moderate and validate these changes
  • To not put too huge a burden on these owners.

This feature allows anyone to edit a package and create a new revision, but requires an owner/moderator to approve a revision before it is made "official".

There have been a lot of discussions around the revisioning system side of this ticket (CREP 0002) and I think these are now largely resolved. We now want to discuss the user interface.

The Problem

We require the following functionality:

  • Allow a group of changes to be stored as a new revision.
  • Allow a linear stack of "community" revisions.
  • Provide a way for the editor and moderator to compare previous revisions to the current one.
  • When a moderator approves a change it creates a new revision flagged "moderated" (this is analogous to a merge commit)
  • Provide a way for the editor and moderator to comment on revisions if necessary.
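
The required functionality listed above could be modelled roughly as below: a linear stack of revisions in which approval appends a new revision flagged "moderated". This is a sketch only; `Revision`, `RevisionStack` and their fields are hypothetical names, not the API of the extension or of vdm.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Revision:
    number: int
    author: str
    values: Dict[str, str]
    moderated: bool = False
    log: str = ""  # revision log, analogous to a commit message


@dataclass
class RevisionStack:
    revisions: List[Revision] = field(default_factory=list)

    def commit(self, author, values, log="", moderated=False):
        """Store a group of changes as a new revision on top of the stack."""
        rev = Revision(len(self.revisions) + 1, author, dict(values), moderated, log)
        self.revisions.append(rev)
        return rev

    def approve(self, moderator, number):
        """Approve a revision by appending a copy flagged as moderated."""
        source = self.revisions[number - 1]
        return self.commit(moderator, source.values,
                           log="Approved revision %d" % number, moderated=True)

    def latest_moderated(self) -> Optional[Revision]:
        """Return the most recent moderated revision, if any."""
        for rev in reversed(self.revisions):
            if rev.moderated:
                return rev
        return None


stack = RevisionStack()
stack.commit("alice", {"title": "Census data"}, log="Initial version")
stack.commit("bob", {"title": "Census data 2011"}, log="Add year")
approved = stack.approve("moderator", 2)
print(approved.number, approved.moderated)
```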

Extra features:

  • Need a way to summarise the changes (as part of the preview perhaps)
  • Sysadmin needs to purge a revision completely

Specification

UI/UX

UI Mockup:

Revisions:

  • Revisions are per package rather than per field.
  • Internally CKAN has separate revisions for resources, extras and package metadata. From a user's point of view this could be confusing to expose, so everything that they see on a package form when they hit save is a single revision.

On the Edit page:

  • We have a panel on the right, listing all the revisions with the current moderated one selected. Moderated revisions are highlighted in some way (red and bold?).
  • The values displayed in the form are by default populated from the latest revision (whether community or moderated)
  • Under each field is a "shadow", showing the value of the field in the revision selected in the panel, if it is different from the value in the field. By default the shadow values are populated from the latest moderated revision which is the one selected in the revision panel by default too.
  • When you change the value of a field, a shadow may appear or disappear accordingly. If a shadow disappears, a box saying that the values are the same replaces it.
  • If you want to edit values from a previous revision, you first select that revision to get the shadows populated. There is a button named "Replace fields with values from this revision" under the revision list. You click this, a warning pops up and then you say "Yes". You then select the moderated revision again.
  • We also allow package comments the same way as the todo extension works at the moment. Additionally, we need to be able to differentiate between what the moderator wrote and what a community member wrote, and so we may need to make a small change to the todo extension to facilitate this.
  • In addition to package comments, each revision will have a revision log (analogous to a commit message).
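
The "shadow" rule described above (show the selected revision's value under a field only when it differs from the value in the form) can be sketched as a small function. The name `shadow_values` and the flat dict representation of a package are assumptions made for illustration.

```python
def shadow_values(form_values, selected_revision):
    """Return {field: value in selected revision} for fields that differ from the form."""
    shadows = {}
    for name in sorted(set(form_values) | set(selected_revision)):
        current = form_values.get(name)
        selected = selected_revision.get(name)
        if current != selected:
            shadows[name] = selected
    return shadows


form = {"title": "Census 2011", "license": "cc-by"}
moderated = {"title": "Census", "license": "cc-by"}
print(shadow_values(form, moderated))
```

Fields missing from the result are the ones for which the UI would show the "values are the same" box.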

Technical Details

  • This CREP will result in a new CKAN extension.
  • It depends heavily on the new revisioning system (CREP0002), some of the details of which are yet to be finalised.
  • This CREP therefore requires working closely with David Raznick to come up with an API that the UI AJAX calls can use.
  • We will then use suitable test data to mimic these API calls until CREP0002 is ready.

Why do it this way

This hopefully provides a clear and consistent mechanism allowing both a community member to make new revisions and a moderator to view and approve revisions, with largely the same UI/UX.

Implementation plan

Deliverables

A new CKAN extension, consisting of:

  • Code: Python, HTML, CSS, Javascript
  • Unit tests
  • Localization
  • Documentation

Participants

John Glover to do it.

Progress

John has implemented the bulk of this UI. Just some things to tidy up before it is complete:

  • Genshi stream filters to be updated with CKAN 1.5 / 1.5.1 templates
  • history_ajax / read_ajax to be replaced with calls to Action API (or Util REST API)

I've split these two off into a new ticket #1604.

Related Progress

The Todo extension is written and available at: https://bitbucket.org/johnglover/ckanext-todo.

In the section 'The Problem', under extra features, we mention a need for the sysadmin to be able to purge a revision. This is already done.

See also

#1129 Backend work

1305721003000000 1325352507000000
#1604 enhancement dread ckan-backlog new Get ckanext-moderatededits working with CKAN 1.5+ templates

ckanext-moderatededits requires an old and possibly development version of CKAN. It would be good to update it for later CKAN versions.

According to the README, you need CKAN from branch feature-1141-moderated-edits-ajax but the changelog suggests this branch went into version 1.4.2. So it possibly works with 1.4.2 and 1.4.3(.1). But CKAN 1.5 has revamped templates, so the genshi stream filters definitely don't work.

BTW history_ajax/read_ajax calls have been deprecated in CKAN since 1.5.2a and will need fixing up to use the Action API too as part of this.

1325352429000000 1325352429000000
#1129 CREP kindly ckan-v1.5 closed fixed CREP0002: Moderated Edits

Proposer: David Raznick

Abstract.

We are trying to achieve these goals.

  • To get people involved with making edits to CKAN metadata.
  • To have an ownership model as to who can moderate and validate these changes
  • To not put too huge a burden on these owners.

In order to achieve this, we need a feature which lets anyone edit a package but only lets the moderator/owner accept the edit. The moderator should be able to look at a list of changes and accept the ones that

This CREP is not about 'if' we need such a feature, it is about 'how' we go about implementing it. Another CREP may be needed for the 'if' case.

The Problem

We need the following to be possible.

  • Storing revision of objects that are not the current active one.
  • A way of the user viewing past revisions.
  • Accessing not only the history of a particular object but also of related objects at that time, i.e. if a resource related to a package changes we need a way to see this when looking at the package.
  • A robust way of doing this in the face of database schema changes.
  • Make sure database queries are quick.

Solutions.

  1. Store the whole dictization of the package and all its related objects every time you change anything in its dictized representation and only save to the database proper if accepted.

Pros

  • Easy to implement, we already have a preview which makes the dictized form of a package without actually saving it. This will just need to be persisted in some way.
  • Fast retrieval.
  • Potential to store a branching revision tree of changes.

Cons

  • No easy way to remake the dictized packages historically, or if there is a change in the way we represent packages, i.e. schema changes.
  • Will only work for the particular objects we decide to store these changes for.
  • Stores a lot of repeated information
  2. Write specialized queries for every read of the database looking only at the revision tables.

This method requires there to be a change in the way we use VDM, so that we manage statefulness ourselves. We will need to add other states such as 'waiting for approval'.

Pros

  • No specialized storage required
  • Only need to change queries when schema changes
  • Can be made to work easily for other objects

Cons

  • Slower query time on read, as even looking at the last active package will need to do a fairly complicated query.

Implementation details.

1.

A new table with columns id, user, package_id, timestamp, revision_id, parent_id, dictized_package. revision_id should be null unless it is actually persisted to the database. parent_id is the id that this package_dict was changed from.
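
As a rough illustration of the table described above, the sketch below creates it in an in-memory SQLite database. The column names come from the description; the table name `package_change`, the column types and the helper functions are assumptions.

```python
import json
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE package_change (
        id INTEGER PRIMARY KEY,
        "user" TEXT NOT NULL,
        package_id TEXT NOT NULL,
        "timestamp" TEXT NOT NULL,
        revision_id TEXT,  -- NULL until actually persisted to the database
        parent_id INTEGER REFERENCES package_change(id),
        dictized_package TEXT NOT NULL
    )
""")


def propose_change(conn, user, package_id, package_dict, parent_id=None):
    """Store a proposed package dict; revision_id stays NULL until accepted."""
    cur = conn.execute(
        'INSERT INTO package_change ("user", package_id, "timestamp",'
        " revision_id, parent_id, dictized_package)"
        " VALUES (?, ?, ?, NULL, ?, ?)",
        (user, package_id, datetime.now(timezone.utc).isoformat(),
         parent_id, json.dumps(package_dict, sort_keys=True)),
    )
    return cur.lastrowid


def pending_changes(conn, package_id):
    """Return (id, parent_id, package_dict) for changes not yet persisted."""
    rows = conn.execute(
        "SELECT id, parent_id, dictized_package FROM package_change"
        " WHERE package_id = ? AND revision_id IS NULL ORDER BY id",
        (package_id,),
    )
    return [(cid, parent, json.loads(body)) for cid, parent, body in rows]


first = propose_change(conn, "alice", "pkg-1", {"title": "A"})
second = propose_change(conn, "bob", "pkg-1", {"title": "B"}, parent_id=first)
print(pending_changes(conn, "pkg-1"))
```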

We could store only the diffs of the dictized_package as long as we ensure that everything inside the JSON is stably sorted, but this will make getting the historical data out slower.
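
To illustrate why stable sorting matters when storing diffs, here is a minimal sketch using the standard library. `canonical_json` and `package_diff` are hypothetical helpers; note that `sort_keys` only orders dict keys, so list values (e.g. tags) would need to be normalised separately.

```python
import difflib
import json


def canonical_json(package_dict):
    """Serialise with sorted keys so equal dicts always give identical text."""
    return json.dumps(package_dict, sort_keys=True, indent=1)


def package_diff(old, new):
    """Return a unified diff between two dictized packages as a list of lines."""
    return list(difflib.unified_diff(
        canonical_json(old).splitlines(),
        canonical_json(new).splitlines(),
        lineterm="",
    ))


# Key order alone produces no diff once the JSON is canonical.
print(package_diff({"b": 1, "a": 2}, {"a": 2, "b": 1}))
print("\n".join(package_diff({"title": "A"}, {"title": "B"})))
```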

Getting out the history of the dictized packages is an intensive task, as it will require replaying the whole history of all the changes and creating the dict for each change. This re-caching will need to be redone for every change we make to the dictized representation of a package.

2.

Every normal package read needs to look at the revision table to see the last accepted change in the dictized representation of the package. We also need a way to get what the dictized representation of the package was like at any point of its revision history. This querying is non-trivial in SQL.

Participants

David Raznick to do it.

Progress.

Decided to go with option 2. However we will change the revisioning system to be like the schema attached. This gets rid of difficult querying problems caused by querying the revision tables by adding an end date, meaning you can do range queries.

The better and more normalized version of a revisioning system is outlined at https://docs.google.com/drawings/d/1Y7nMgVsrs081Pame2RdbZHlCAlV33ddTZ8VAsab1j-0/edit?hl=en_GB&authkey=CJfd8vsB. We will be a step closer to that with this change, but we will keep the current vdm more or less intact.
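
The benefit of adding an end date to each revision row can be sketched as follows: the state of an object at any point in time becomes a simple range query. The table layout, the column names `revision_timestamp` and `expired_timestamp`, and the far-future sentinel are assumptions for illustration, not the attached schema.

```python
import sqlite3

OPEN_END = "9999-12-31T00:00:00"  # sentinel end date for the current row

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE package_revision (
        id TEXT NOT NULL,
        title TEXT,
        revision_timestamp TEXT NOT NULL,  -- when this version became valid
        expired_timestamp TEXT NOT NULL    -- when it was superseded
    )
""")
conn.executemany(
    "INSERT INTO package_revision VALUES (?, ?, ?, ?)",
    [
        ("pkg-1", "First title", "2011-01-01T00:00:00", "2011-06-01T00:00:00"),
        ("pkg-1", "Second title", "2011-06-01T00:00:00", OPEN_END),
    ],
)


def title_at(conn, package_id, when):
    """Return the package title as it was at ISO timestamp `when`."""
    row = conn.execute(
        "SELECT title FROM package_revision"
        " WHERE id = ? AND revision_timestamp <= ? AND expired_timestamp > ?",
        (package_id, when, when),
    ).fetchone()
    return row[0] if row else None


print(title_at(conn, "pkg-1", "2011-03-15T12:00:00"))
```

Because ISO 8601 timestamps in the same format sort lexicographically, plain string comparison is enough for this sketch.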

1304851498000000 1325268100000000
#1568 enhancement David Raznik jilly mathews ckan-future closed duplicate Moderated Edits

Can this be released as a standard CKAN feature?

1324293776000000 1325267998000000
#1545 enhancement amercader ckan-sprint-2012-01-09 closed wontfix Remove external asset dependencies

CKAN is pulling a number of resources from external locations. This causes problems when connectivity is limited and you have to work locally. Maybe some of them could be moved to CKAN source to avoid external requests.

Quick search:

./ckan/templates/layout_base.html:            <img src="http://assets.okfn.org/images/logo/okf_logo_white_and_green_tiny.png" id="footer-okf-logo" />
./ckan/templates/layout_base.html:            <a href="http://opendefinition.org/"><img alt="This Content and Data is Open" src="http://assets.okfn.org/images/ok_buttons/od_80x15_blue.png" style="border: none ; margin-bottom: -4px;"/></a>
./ckan/templates/package/resource_read.html:                <img src="http://assets.okfn.org/images/ok_buttons/od_80x15_blue.png" alt="[Open Data]" />
./ckan/templates/package/read.html:          <img src="http://assets.okfn.org/images/ok_buttons/od_80x15_blue.png" alt="[Open Data]" /></a>
./ckan/templates/_util.html:                    <img src="http://assets.okfn.org/images/ok_buttons/od_80x15_blue.png" alt="[Open Data]" />
./ckan/templates/_util.html:                  <img src="http://assets.okfn.org/images/ok_buttons/od_80x15_blue.png" alt="[Open Data]" />
./ckan/public/scripts/vendor/ckanjs/1.0.0/ckanjs.js:      this.$dialog.html('<h2>Loading results...</h2><img src="http://assets.okfn.org/images/icons/ajaxload-circle.gif" />');
./ckan/public/scripts/vendor/ckanjs/1.0.0/ckanjs.js:          self.setMessage('Uploading file ... <img src="http://assets.okfn.org/images/icons/ajaxload-circle.gif" class="spinner" />');
./ckan/public/scripts/vendor/ckanjs/1.0.0/ckanjs.js:      self.setMessage('Checking upload permissions ... <img src="http://assets.okfn.org/images/icons/ajaxload-circle.gif" class="spinner" />');
Binary file ./ckan/lib/app_globals.pyc matches
./ckan/lib/app_globals.py:                                  'http://assets.okfn.org/p/ckan/img/ckan.ico')
./ckan/config/deployment.ini_tmpl:ckan.favicon = http://assets.okfn.org/p/ckan/img/ckan.ico
1323702635000000 1325260051000000
#945 enhancement kindly kindly ckan-v1.6 closed fixed [super] Richer resources - Resource Groups, new fields, improved UI

Super ticket: #1032

This is a meta ticket for changes that are going to happen in resources.

  • New resource group table. #956
  • New kind field in resource. #957
  • UI for new kind field. #958
  • Resources in REST API ticket:358
  • Resources in WUI #1445
  • Make Resources first class entity. #922 (duplicate?)

Background on this change can be found at:

1296475283000000 1325259350000000
#1065 enhancement zephod johnlawrenceaspden ckan-v1.6 closed fixed [super] Change Authorization System

Child tickets

  • #1198 Publisher hierarchy
  • #1050 Authz lib improvement and refactor of ckan/lib/authztool.py
  • #1004 Group creation instructions missing
  • #1099 Strange interactions between two browsers while playing with authz groups
  • #1115 can have two authzgroups with the same name
  • #1133 command line rights manipulation doesn't work
  • #1138 minor navigations behave inconsistently

Old ticket description:

  1. Change name of AuthzGroup to UserGroup to reflect what it is for
  2. Get rid of Roles, and replace them with direct assignment of actions, even though there are many actions, and extensions can add arbitrary ones.
    • Debatable whether we should cut the number of actions to correspond to the three roles defined by the base system.
    • Have a method of finding roles (or, in future, actions) relevant to a given protection object (e.g. FILE-UPLOAD(ER) not relevant to Packages)
  3. Change UserGroups so that they can have a hierarchical structure.

More info on Hierarchy change

e.g. UserGroup NHS contains the User nhsysadmin, as well as the UserGroups SURREY and BERKS, which themselves contain users.

One user in SURREY is Simon the Sysadmin, who has permissions on the whole system. His permissions should not leak out to other users or groups, and user permissions generally should not.

Each Group has permissions over various objects.

A user has permissions in his own right, and also has the permissions of his own group, and of all the groups contained in his group, and so on recursively.

Algorithm:

possible(user, action, package):

if user has permission for action on package

or any of his groups have that permission

or any of his groups' group-children (but not user-children), and so on recursively, have the permission.
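
A minimal sketch of the recursive check described above, using the NHS example. The data structures (sets of permission triples and plain dicts for membership and group children) and all parameter names are assumptions made for illustration, not the CKAN authorization API.

```python
def possible(user, action, package, user_perms, group_perms, memberships, children):
    """Return True if the user may perform the action on the package.

    user_perms / group_perms: sets of (holder, action, package) triples.
    memberships: user -> groups the user belongs to.
    children: group -> child groups (users in child groups are ignored).
    """
    if (user, action, package) in user_perms:
        return True
    seen = set()
    stack = list(memberships.get(user, ()))
    while stack:
        group = stack.pop()
        if group in seen:  # guard against cycles in the hierarchy
            continue
        seen.add(group)
        if (group, action, package) in group_perms:
            return True
        stack.extend(children.get(group, ()))
    return False


user_perms = {("simon", "purge", "pkg")}
group_perms = {("SURREY", "edit", "pkg")}
memberships = {"nhsysadmin": ["NHS"], "simon": ["SURREY"], "bert": ["BERKS"]}
children = {"NHS": ["SURREY", "BERKS"]}

args = (user_perms, group_perms, memberships, children)
print(possible("nhsysadmin", "edit", "pkg", *args))   # inherited from SURREY
print(possible("nhsysadmin", "purge", "pkg", *args))  # simon's rights do not leak
```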

1301508331000000 1324550041000000
#1543 defect johnglover amercader ckan-sprint-2012-01-09 closed fixed Pagination links in the dataset listings don't keep the current filters

E.g. Pagination links on this page don't include groups=lodcloud http://thedatahub.org/dataset?groups=lodcloud

Not sure if related to #1501 (probably not)
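
One way to keep the current filters in pagination links is to rebuild the query string from the existing parameters and only replace the page number. The helper below is a hypothetical sketch, not the CKAN pager code; the `page` parameter name is an assumption.

```python
from urllib.parse import urlencode


def page_url(base, params, page):
    """Build a pagination URL that preserves all current filters."""
    query = [(key, value) for key, value in params if key != "page"]
    query.append(("page", page))
    return base + "?" + urlencode(query)


print(page_url("/dataset", [("groups", "lodcloud")], 2))
```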

1323442623000000 1324483367000000
#1502 defect johnglover johnglover ckan-sprint-2011-12-05 closed fixed Group package list is ordered by revision timestamp instead of alphabetically 1322680312000000 1324480415000000
#1580 enhancement johnglover johnglover ckan-sprint-2012-01-09 closed fixed Documenting TaskStatus table and QA changes - 0.5d 1324399664000000 1324478635000000
#1505 defect dread dread ckan-sprint-2011-12-05 closed fixed SearchError and SearchQueryError cause exception in Action API

This query caused CKAN to raise an unhandled exception because ckan/controllers/api.py doesn't catch SearchError and SearchQueryError:

curl http://localhost:5000/api/action/package_search -d '{"sort": "metadata_modified"}'
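
A rough sketch of the kind of handling that is missing: catch the search exceptions around the action call and return a JSON error instead of letting them propagate. Everything here is illustrative: the exception classes are local stand-ins (their hierarchy is an assumption), and `call_action`, `failing_search`, the status codes and the error layout are hypothetical.

```python
import json


class SearchError(Exception):
    """Stand-in for CKAN's generic search error."""


class SearchQueryError(SearchError):
    """Stand-in for the error raised on an invalid search query."""


def call_action(action, data_dict):
    """Run an action function and turn search errors into JSON error responses."""
    try:
        result = action(data_dict)
    except SearchQueryError as exc:
        status = 400  # assumed status for a bad query
        body = {"success": False,
                "error": {"__type": "Search Query Error", "message": str(exc)}}
    except SearchError as exc:
        status = 500  # assumed status for other search failures
        body = {"success": False,
                "error": {"__type": "Search Error", "message": str(exc)}}
    else:
        status = 200
        body = {"success": True, "result": result}
    return status, json.dumps(body)


def failing_search(data_dict):
    """Hypothetical action that rejects the query."""
    raise SearchQueryError("Invalid sort: %r" % data_dict.get("sort"))


print(call_action(failing_search, {"sort": "metadata_modified"}))
```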
1322758968000000 1324474577000000
#1455 defect johnglover dread ckan-sprint-2011-12-05 closed fixed Search results when 'all_fields' don't include 'extra' fields

When you do a search like this:

http://thedatahub.org/api/search/package?q=tauberer+census&all_fields=1

the "extra" fields (e.g. "triples", "shortname") get missed off the results. The docs say it should be a "full record" and I don't see any reason why this is missed off.

This is a problem because search all_fields is the only way for clients and front-ends to get packages in bulk. They end up (like lodcloud) doing thousands of requests to get packages individually.

The full record is:

http://thedatahub.org/api/rest/dataset/2000-us-census-rdf
{"count": 1, "results": [{"res_description": ["Download", "XML Sitemap", "SPARQL enpdoint", "Example (RDF/XML)"], "name": "2000-us-census-rdf", "license": "Non-OKD Compliant::Creative Commons Non-Commercial (Any)", "author": "Joshua Tauberer", "author_email": "http://razor.occams.info/", "ckan_url": "http://thedatahub.org/dataset/2000-us-census-rdf", "notes": "2000 U.S. Census converted into over a billion RDF triples.\n\nPopulation statistics at various geographic levels, from the U.S. as a whole, down through states, counties, sub-counties (roughly, cities and incorporated towns)\n\nNotes: also found in the of SPARQL Endpoints.\n\nFrom home page:\n\n> * For the detailed Census statistics, you'll have to download the raw Census data files from the Census Bureau, my Perl script and the patch file below and run it yourself because the files are too big for me to offer as a download!\n> \n> * The data and scripts can be reused under Creative Commons Attribution-NonCommercial-ShareAlike.\n", "entity_type": "package", "site_id": "www.ckan.net", "download_url": "http://www.rdfabout.com/demo/census/", "indexed_ts": "2011-11-01T12:52:36.034Z", "url": "http://www.rdfabout.com/demo/census/", "state": "active", "title": "2000 U.S. Census in RDF (rdfabout.com)", "groups": ["lod", "lodcloud"], "res_format": ["", "meta/sitemap", "api/sparql", "example/rdf+xml"], "license_id": "cc-nc", "revision_id": "fcbad0de-79ea-41bd-8e01-eb832a05b732", "res_url": ["http://www.rdfabout.com/demo/census/", "http://www.rdfabout.com/sitemap.xml", "http://www.rdfabout.com/sparql", "http://www.rdfabout.com/rdf/usgov/geo/us/ny"], "id": "551ec435-f198-4d52-9b56-ec0b0be6aec9", "tags": ["census", "data", "demographics", "deref-vocab", "format-dc", "format-geonames", "format-politico", "format-rdf", "geographic", "linkeddata", "lod", "lodcloud.nolinks", "no-license-metadata", "no-provenance-metadata", "no-vocab-mappings", "population", "published-by-third-party", "rdf", "statistics", "us"]}]}
1320858265000000 1324474466000000
#1493 defect dread dread ckan-sprint-2011-12-05 closed fixed 'search-index rebuild/clear' doesn't work if no ckan.site_id

You can't delete things from the SOLR search index if the ckan.site_id and ckan.site_url are blank.

Should assert that at least one of these is set.
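
The suggested check could look something like this. The config keys are the ones named above; the function name, the plain-dict config and the choice of exception are assumptions.

```python
def require_site_identifier(config):
    """Return ckan.site_id or ckan.site_url, failing early if both are blank."""
    site_id = (config.get("ckan.site_id") or "").strip()
    site_url = (config.get("ckan.site_url") or "").strip()
    if not (site_id or site_url):
        raise ValueError(
            "Set ckan.site_id or ckan.site_url before using the search index")
    return site_id or site_url


print(require_site_identifier({"ckan.site_url": "http://thedatahub.org"}))
```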

1322484422000000 1324474360000000
#1487 enhancement kindly ckan-sprint-2011-12-05 closed fixed Fix group ordering on homepage

ordering on homepage by name instead of group count

1322094280000000 1324474147000000
#1470 defect dread amercader ckan-sprint-2011-11-21 closed fixed Check user name in the profile form 1321446143000000 1324473955000000
#1433 enhancement kindly rgrp ckan-sprint-2011-11-21 closed fixed Support SQLAlchemy 0.7

Why: 0.7 is the current stable version of SQLAlchemy. GeoAlchemy requires 0.7, and it is likely that other things will require it soon.

Probably requires work on vdm https://bitbucket.org/okfn/vdm

NB: should have discussion before making 0.7 the default required version in CKAN core.

1320143453000000 1324472583000000
#1456 enhancement amercader amercader ckan-sprint-2011-11-21 closed fixed Use resource description instead of name if both are present

If a resource has both a description and a name, the name is used. Descriptions are generally more, well, descriptive, so let's use those.
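
A minimal sketch of the requested preference, assuming a resource is represented as a dict with optional `description` and `name` keys. The helper name `resource_display_name` is hypothetical and not taken from CKAN.

```python
def resource_display_name(resource):
    """Pick the label to show for a resource.

    Prefers the description; falls back to the name when the
    description is missing or blank. Returns '' if both are blank.
    """
    description = (resource.get("description") or "").strip()
    name = (resource.get("name") or "").strip()
    return description or name
```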

1320862619000000 1324472178000000
#892 enhancement johnglover pudo ckan-sprint-2012-01-09 closed fixed Make stored data available in WUI - 0.5d

Once we have storage, make the data available in the following ways:

  • We now have a cached_url field that can be shown in the frontend ...
  • Add a [<a href="${cached_url}">cached</a>] link to right of real url on resource listing on dataset page.
  • On resource page: will not add it yet.
    • At the moment there is no clear place to put this, given the nice big download button (it could go in the list of items on the left, but that does not seem right; note that it will turn up in the big list of info at the bottom)
  • Add test (?)
  • Deploy
1294053293000000 1324402480000000
#1451 enhancement johnglover johnglover ckan-sprint-2012-01-09 closed fixed Reintegrate download stats on dataset and resource view page - 0.5d
  • css class: resource-url-analytics
  • Assign to the link tag (a) everywhere we want to count (dataset, resource view) - 0.25d
  • Display counts in same place ...

And deploy on http://thedatahub.org/ - 0.25d

Possible: Also move analytics extension into core (decided not to).

1320677859000000 1324401792000000
#1402 enhancement kindly rgrp ckan-v1.6 closed fixed Migrate repository from mercurial to git

Plan to migrate from Mercurial to Git.

Process:

  • Do trial run
  • Announce conversion date / time
  • Require everyone to have pushed all outstanding changes at that time
  • Do conversion
  • Test
  • Announce on list and ckan.org/
1318811651000000 1324334011000000
#1522 enhancement kindly kindly ckan-sprint-2011-12-19 closed fixed Add capacity to member table.

Need to add capacities to member tables.

1323172610000000 1324333827000000
#1529 enhancement dread dread ckan-sprint-2011-12-19 closed fixed Display user name when logged in

Currently when you log in it says "logged-in". Most sites show your user name, which is helpful when you have more than one account or more than one person uses the computer.

1323252086000000 1324318628000000
#1519 enhancement johnglover shevski ckan-sprint-2012-01-09 closed wontfix combine stats and analytics extensions into one in UI as well as deployment

Makes more sense to only have one comprehensive stats/analytics extension, so when people are looking to add a stats extension they won't have to add two which may be confusing (is one an old version of the other? why do I have to add two? what's the difference? etc)

User-wise we need a way to display our stats with google analytics in the same place

1323169033000000 1324317373000000
#1563 enhancement David Raznik jilly mathews ckan-future closed invalid Finish Data Storage

Unsure what needs to happen here. Need to list outstanding tasks and implement.

1324292346000000 1324314806000000
#1570 enhancement David Raznik jilly mathews ckan-future closed invalid Integrated file Storage

Is this ready for release? What needs to be done?

1324294142000000 1324314741000000
#1573 enhancement David Raznik jilly mathews ckan-future new Apps and Ideas

Estimate 2 weeks for someone to finish and test.

1324294593000000 1324294593000000
#1572 enhancement David Raznik jilly mathews ckan-future new Meta data Harvester

Need to write custom harvesters for each client. Is it worth having one for data hub?

1324294509000000 1324294509000000
#1569 enhancement David Raznik jilly mathews ckan-future new Wordpresser

How much effort will this be to be ready to use?

1324294056000000 1324294056000000
#1567 enhancement David Raznik jilly mathews ckan-future new Finish QA extension

Requires change to celeryd. Estimated 4 weeks.

1324293599000000 1324293599000000
#1565 enhancement Rufus Pollock jilly mathews ckan-future new Admin dashboard finished?

Is testing complete and ready for release?

1324293092000000 1324293092000000
#1564 enhancement David Raznik jilly mathews ckan-future new Structured Data (Data API)

A basic webstore exists, but it may not yet be what is described.

CKAN provides a rich API for the data itself, allowing users to query, retrieve and use data instantly from datasets in CKAN without needing to download or process it first.

1324292834000000 1324292834000000
#1489 enhancement dread ckan-backlog assigned Updating example theme/extension

ckanext-example needs updating for CKAN 1.5:

  • theme changes
  • new forms

About: 'ckanext-exampletheme' was created in Spring 2011 as an example CKAN extension that showed how to customise the look & operation of CKAN. It was then moved to GitHub and renamed 'ckanext-example'.

1322137920000000 1324292384000000
#1562 enhancement Adria jilly mathews ckan-future new Finish Geo Spatial

Estimated 4 weeks of Adria's time. I guess this will need to be broken down into more tickets. This feature is being requested by a number of potential customers and we have some ideas of requirements between Rufus and Jilly for this. This is the most popular new feature we talk about to new clients.

1324292193000000 1324292193000000