{22} Trac tickets (2647 matches)

Results (1 - 100 of 2647)

1 2 3 4 5 6 7 8 9 10 11
Id Type Owner Reporter Milestone Status Resolution Summary Description Posixtime Modifiedtime
#350 enhancement dread ckan-backlog reopened Search engine optimisation

Need to research what can easily be done to improve CKAN packages in the search rankings.

Comments from Glen Barnes:

We've been pretty successful at SEO without even really trying (see http://www.google.co.nz/search?client=safari&rls=en&q=auckland+google+transit+feed&ie=UTF-8&oe=UTF-8&redir_esc=&ei=dsYSTOzJLs2eceuZiI8I as an example). This to me is key. If we are to make data available it has to be findable which is the main reason for a catalogue. There are probably things we should be doing on CKAN like using slugged urls (http://www.ckan.net/package/ascoe -> http://www.ckan.net/package/ascoe/atmospheric-chemistry-studies-in-the-oceanic-environment), setting the H1 tag correctly ("Atmospheric Chemistry Studies in the Oceanic Environment" on the example above). Some basic SEO 101 on page optimisations.

1276594541000000 1339774690000000
#1341 enhancement kindly kindly ckan-backlog reopened Delete spam users from ckan

Spam users where added to thedatahub and we need to clean them.

1315995034000000 1320141540000000
#2202 enhancement rgrp ckan-future reopened Display page view count on dataset and resource pages

Just like we display download counts we should display view counts.

1330765455000000 1338204929000000
#2243 enhancement seanh seanh ckan-v1.9 reopened Fix ckanext-example 1332172710000000 1340635768000000
#2331 defect kindly rgrp ckan-sprint-2012-05-29 reopened Search should AND terms not OR terms

Appears current default search in CKAN ORs terms rather than ANDing them (i.e. adding more terms increasing number of items found rather than reducing it).

Not sure when this crept in or if it has been there for a long time.

1335637485000000 1356474344000000
#140 enhancement rgrp ckan-backlog new News section on front page

Have a news section (suggest as a sidebar item).

News section will link to latest 3/4 blog posts on CKAN from blog.okfn.org.

Details:

  • Suggest pulling via rss or similar.
  • Will want to cache this ...

Cost: 4h?

1254902541000000 1265625159000000
#331 enhancement rgrp ckan-backlog new Timezone of CKAN timestamps should be configurable

Revisions are timestamped using the server's clock, which may not relate to the expected timezone for the site. e.g. the Norway site has a server on GMT. No timezone info is displayed either.

Would like to set timezone for a CKAN instance to use in rendering revision timestamps. For example, use CET or EST timezone.

1275302440000000 1339774701000000
#351 enhancement dread ckan-backlog new Homepage: list new, updated and 'hot' packages

Have a simpler list of exciting data, as opposed to the big revision list.

For example:

Hot data
===========

New packages: package1, package2, package3
Updated resources: package1, package2, package3
Popular packages: 
1276595816000000 1339774677000000
#369 enhancement shudson@… ckan-backlog new "Package Listing Key" should appear on Tag results

Currently there's a nice legend titled "Package Listing Key" that appears in right side of "Browse Packages" results. The same key should show on other search results like when searching for a tag.

1279821634000000 1339774666000000
#370 enhancement shudson@… ckan-backlog new Use better email encryption for author_email and maintainer_email

The JavaScript? email encryption used is not very reassuring. Google's MailHide? is a much better solution that is easily implemented.

http://www.google.com/recaptcha/mailhide/

Check on the Mailhide API where there are even some Python libraries already built.

1279821819000000 1339774649000000
#653 requirement dread ckan-backlog new Trackback links for packages

When people link to a package, a track-back link is auto-created. (Similar system as for blogs).

As suggested by Tim Davies:

Allowing some form of ‘track back’ against datasets When a non-technical user comes to look at a dataset it would be really useful for them to be able to see if anyone has created an interface interpretation of it already.

I found quite a few cases in research of end-users struggling to make sense of a dataset when good interfaces to that data had already been built and blogged about, but without there being any link from the dataset listing to those data uses. Accepting track backs could also make it easier for technical users to find blog posts / shared code etc. relating to a given dataset.

1285062025000000 1339774636000000
#737 enhancement dread ckan-backlog new Markdown syntax summary page

I suggest we produce a quick Markdown cheat-sheet page, showing the key runes: e.g. create a title and quote some text. This page can link to the full Markdown docs for advanced users.

A user going to the Markdown docs that we link will have to read a couple of pages of the raison-d'etre of Markdown before he gets to the syntax. And it's not very easy to read, and being white on black it looks like proper geek stuff.

1287766749000000 1323170239000000
#811 defect cygri ckan-backlog new Extra field editing form layout breaks when there are long field names

The layout of the editing section for extra fields breaks when a field name is slightly too long. Field names jump over to the right. See http://ckan.net/package/edit/dbpedia for examples.

1289994812000000 1323170289000000
#812 defect cygri ckan-backlog new Package edit form only allows three extra fields

Rationale

The package edit form is restricted to three extra fields. To enter more than three fields, one has to save the package and hit edit again (or hit preview).

Implementation

A mechanism similar to the one for resources (where you can add lines as you go) would solve this. So, have a button that adds more extra field rows via JS. (Extra fields don't need up/down buttons that the Resource table has)

Nice to have: a blank field is added when you tab from the last filled-in field in the table.

1289995010000000 1311176917000000
#818 requirement cygri ckan-backlog new Rethinking the author and maintainer fields

The semantics of the Author and Maintainer fields are really unclear at the moment. This leads to very inconsistent usage. Also, perhaps Name and Email are not the only fields that are needed for a contact.

Here is a table that shows the current usage of these fields in CKAN: http://richard.cyganiak.de/2010/ckan/ckan-ppl.html

We note several problems:

  • Author and Maintainer are often the same
  • Author and Maintainer are often used interchangeably
  • People really want to specify URLs for the contacts and stick them into random places because there is no field for it
  • Multiple comma-separated names in a single field

I'm not sure what to do about this, but a redesign is necessary in my opinion.

Some ideas:

  • Remove the maintainer field?
  • Make really clear that Author doesn't refer to the metadata on CKAN, but to the original data
  • Add an “author URL” field?
1290003524000000 1339774621000000
#837 enhancement rgrp ckan-backlog new CKAN integration with freebase gridworks / google refine

Thread: http://lists.okfn.org/pipermail/ckan-discuss/2010-November/000718.html

Scenario 1

  1. User installs Refine and CKAN extension for refine
  2. On booting refine and asked to load data they can choose from any data package on CKAN.net (or any other CKAN instance)
  3. They edit the dataset on Refine
  4. On save (or perhaps as a separate option) they are prompted as to whether they wish to sync the dataset back to CKAN (either as a new package or as a new resource on the existing package)

NB: for the dataset sync back some form of "CKAN" storage would be required (we already have storage.ckan.net running but a closer integration would be required)

Scenario 2

  1. User visits a package on CKAN.net (or another CKAN instance)
  2. There is a button on the page "View and edit this dataset in Google Refine"
  3. Click button -- ask them if they have Google refine installed
    • Yes: instructions for loading dataset into refine
    • No: load dataset in hosted version of google refine (we could run this)
  4. User edits dataset and hits save. As in previous scenario they are prompted to sync the dataset.
1291140609000000 1339774605000000
#895 defect memespring ckan-backlog new Add version number (or simular) to css/js includes query string

Updates to css after a new deploy don't come through without a hard refresh. Adding the version number to the include urls will solve this e.g.

mycssfile.css?v=12345678

1294343382000000 1339774593000000
#924 enhancement dread ckan-backlog new Search box has no search button

The search box at the top-right of CKAN's page doesn't have a 'go' button. I feel that a larger percentage of users expect a 'go' or 'search' button on the right-hand side of the box to press to start searching. Techies tend to know the keyboard shortcut of pressing 'carriage-return' but it might be better to follow standard practise on this.

Examples with 'search' button: Internet Explorer, Firefox, Google, Amazon, trac Examples without: ?

1295867533000000 1323170436000000
#989 enhancement kindly pudo ckan-future new Extending the model from plugins

We need to support extending the model from plugins. This could involve:

  • Adding a plugin hook to extend the mapper
  • Adding an upgrade hook for plugin schema migrations
  • Documenting how this is to be done
  • Find a way to avoid conflicts
1297689724000000 1340034311000000
#1009 enhancement pudo rgrp ckan-backlog new Improvements to user accounts sytem
  • Forgot password (email a new password)
  • Confirm email
  • Do not show register page if you are logged in (redirect to home page)
  • ticket:1010 - listing of users.
    • Do not use /user for general user account home page (either user normal user page /user/{id} or /user/myaccount)
1298635991000000 1310128574000000
#1077 enhancement kindly rgrp ckan-backlog new Move to simpler vdm system

Option 1: 'Changeset' Model

See ticket:1135 for vdm ticket. This would involve a) moving to changeset in vdm b) doing the migration in ckan to support this.

Have developed a new "changeset" based model for revisioning in vdm.

Implementation

  • The main challenge with this change is schema and data migration

Every revisioned object has a revision_id and revision attribute.

Approximate algorithm:

Revision -> Changeset

for revtype in [PackageRevision, ...]:
    for pkgrev in package_revision:
        changeset = lookupchangeset(package_revision)
        ChangeObject(cset, (table, id), dictize(pkgrev))

Question:

  • does pkg include tags attributes or not? or we have to dictize, pkgrev, pkg2tagrev, and tag. Probably the latter.

Option 2: Simplify Revision Object Model

Just use a simpler vdm, see ticket:1136 (move to SessionExtension) and ticket:1137 (remove need for statefulness in vdm).

Discussion

Advantage of Option 1 versus 2:

  • Easier support for pending state and similar behaviour
  • No need to introduce new tables (and hence migrations) when making something revisioned (or not).

Disadvantages

  • Migration is required
  • More difficult to query revision history.
    • Could be addressed by having ChangeObject have separate cols for table name and id but would likely be more difficult.
  • Performance (?)
    • Have one big ChangeObject table to query when looking at changed objects rather than many revision tables.
      • Not sure this is a biggie as even with Revision model biggest revision object tables are probably on the order of the ChangeObject table

Conclusion

Implement Option 2 and leave Option 1 for present.

Option 1 includes Option 2 so it seems that that is required in either case (so we may as well with Option 2).

Option 1 requires significant effort (esp migration) so leave for present and then review the situation at some later date.

1302304464000000 1340034345000000
#1096 defect rufuspollock pudo ckan-future new [super] CKAN Hosted

Many users of CKAN want to have their own instance without much effort. Setting these up in separate places is a maintenance nightmare, we should much rather have some tenant separation in core CKAN. Some ideas:

  • introduce model.Site and c.site
    • site has: custom CSS, extra_template_path, title, languages list, package_form, group_form (all configured via web UI)
  • Subdomain detector to activate sites.
  • use site in Authorizer instead of System, have a NullSite? for global things
  • allow cross-site search
  • packages are in a list of sites, m:n rather than 1:n
    • list of sites is string-based, can contain sites not in site table to express harvested external material which is not editable locally.
1303235062000000 1339774484000000
#1101 enhancement sebbacon ckan-backlog new Integrate googlanalytics into site nav

There's a stats plugin (e.g. at http://trac.ckan.org/ticket/832).

Output from the googleanalytics plugin should append to that page, if the stats plugin is present.

Possibly the stats plugin and the googleanalytics plugin should be merged?

Finally, if the stats plugin is active, then a link to the stats page should be added to the main site footer.

1303393926000000 1339774582000000
#1120 enhancement tsm ckan-backlog new Atom feeds of each tag

Tags could/should have an Atom feed. This would mean that every edit to relevant packages could be easily monitored. See [1].

[1] http://lists.okfn.org/pipermail/ckan-discuss/2011-May/001162.html

1304309285000000 1339774568000000
#1134 CREP amercader ckan-backlog new CREP0003: Description and Configuration of Harvesters

Proposer: Adrià Mercader

Abstract

The new harvester interface allows to create harvesters for different sources, but right now harvesters don't have many ways to describe and configure themselves. We need a way of allowing them to:

  • Expose their type and other details so they can be used internally and on the UI.
  • Define configuration settings for particular harvester instances.

The Problem

Harvester description

The current UI for adding and editing harvest sources is the same used in ckanext-dgu, and thus the 3 harvester types used in DGU to harvest various GEMINI realted sources are hardcoded in the form. The form will be migrated to a DGU-independent one, so we need the harvesters to provide all the necessary data. There is a current get_type method that returns the harvester type, but for make it compatible with the DGU forms, it returns a machine-readable string (e.g. "CSW Server"), making it error prone.

Arbitrary configuration

In the current implementation, when the harvest process is started, ckanext-harvest looks for all the available plugins that implement the IHarvester interface and calls the appropiate methods for the current stage (gather_stage,fetch_stage,import_stage). At these stages, harvesters have no way of applying arbitrary configuration options, so all harvesters of the same type behave on the same way. For instance, the CKAN harvester needs a way to define the API version to use when harvesting remote instances (Right now, the version 2 is hardcoded on the code).

Specification

Harvester description

Harvesters will need to provide the following information so the UI form can be built:

  • name: machine-readable name (e.g. "waf"). This will be the value stored in the database, and the one used by ckanext-harvest to call the appropiate harvester.
  • title: human-readable name (e.g. "Web Accessible Folder (WAF)"). This will appear in the form's select box.
  • description: a description of what the harvester does (e.g. "A Web Accessible Folder (WAF) displaying a list of GEMINI 2.1 documents"). This will appear on the form as a guidance to the user.

The way to provide it will be an info method that all harvesters must implement, which will return a dictionary with the previous elements:

    {
        'name': 'csw',
        'title': 'CSW Server',
        'description': 'A server that implements OGC's Catalog Service 
                        for the Web (CSW) standard'
    }

Arbitrary configuration

As different harvesters will have very different needs, we need to provide a way to persist arbitrary configuration flags for each harvest source. The more flexible way given the current architecture in my opinion would be to store the configuration options as a JSON encoded object as a property of the harvest source (There already is an unused DB field called config in the database) (Maybe using JsonType??).

This will mean adding an extra field in the harvest source form to allow entering the configuration. This could be just a simple text field where users enter the JSON encoded object or a more clever mechanism (i.e an "Add a configuration flag" link that adds two new text fields for the key and value for each flag, and a mechanism to later build the JSON object). In any case, this should probably be hidden in an "Advance options" section.

Why do it this way

Harvester description

The info method would provide a single point to get all the information related to the harvester, and future properties could be added to the dictionary returned without having to modify the interface.

Arbitrary configuration

There is an already existing config field in the database, so we won't need to change the model. Harvesters could access the config object at any of the stages. Of course they could provide default values in their implementations so users don't need to enter them everytime.

Implementation plan

Deliverables

Risks and mitigations

The highest risk on the harvesters info method side is that harvester implementation don't offer one of the necessary properties (namely name and title). This could fire a warning when showing the UI form or using the CLI.

Participants

Adrià Mercader to do it.

Progress

None yet.

1305108868000000 1339774554000000
#1144 enhancement timmcnamara ckan-backlog new Support DSPL

DSPL, the Dataset Publishing Language, is being promoted by Google for its "Google Public Data Explorer" system. It is an XML format with metadata.

The format is described on the developer docs ofthe Google Code site.

Google provides a Python script which reads CSV data and generates DSPL

Sample from http://code.google.com/apis/publicdata/docs/dspl_sample.html:

<?xml version="1.0" encoding="UTF-8"?>
<dspl xmlns="http://schemas.google.com/dspl/2010"
    xmlns:geo="http://www.google.com/publicdata/dataset/google/geo"
    xmlns:geo_usa="http://www.google.com/publicdata/dataset/google/geo/us"
    xmlns:time="http://www.google.com/publicdata/dataset/google/time"
    xmlns:quantity="http://www.google.com/publicdata/dataset/google/quantity"
    xmlns:entity="http://www.google.com/publicdata/dataset/google/entity">

  <import namespace="http://www.google.com/publicdata/dataset/google/time"/>
  <import namespace="http://www.google.com/publicdata/dataset/google/quantity"/>
  <import namespace="http://www.google.com/publicdata/dataset/google/entity"/>
  <import namespace="http://www.google.com/publicdata/dataset/google/geo"/>
  
  <info>
    <name>
      <value>My statistics</value>
    </name>
    <description>
      <value>Some very interesting statistics about countries</value>
    </description>
    <url>
      <value>http://www.stats-bureau.com/mystats/info.html</value>
    </url>
  </info>

  <provider>
    <name>
      <value>Bureau of Statistics</value>
    </name>
    <url>
      <value>http://www.stats-bureau.com</value>
    </url>
  </provider>

  <topics>
    <topic id="geography">
      <info>
        <name><value>Geography</value></name>
      </info>
    </topic>
    <topic id="social_indicators">
      <info>
        <name><value>Social indicators</value></name>
      </info>
      <topic id="population_indicators">
        <info>
          <name><value>Population indicators</value></name>
        </info>
      </topic>
      <topic id="poverty_and_income">
        <info>
          <name><value>Poverty & income</value></name>
        </info>
      </topic>
      <topic id="health">
        <info>
          <name><value>Health</value></name>
        </info>
      </topic>
    </topic>
  </topics>

  <concepts>
    <!-- As noted in the tutorial, this concept should extend quantity:amount.-->
    <concept id="population">
      <info>
        <name>
          <value>Population</value>
        </name>
        <description>
          <value>Size of the resident population.</value>
        </description>
      </info>
      <topic ref="population_indicators"/>
      <type ref="integer"/>
    </concept>

    <!-- This country concept is defined for educational purposes only. A country
    concept exists in the Google geo dataset. See:

    http://code.google.com/apis/publicdata/docs/canonical/geo.html --> 
    <concept id="country" extends="geo:location">
      <info>
        <name>
          <value>Country</value>
        </name>
        <description>
          <value>My list of countries</value>
        </description>
      </info>
      <type ref="string"/>
      <property id="name">
        <info>
          <name><value xml:lang="en">Country name</value></name>
          <description>
            <value xml:lang="en">The official name of the country</value>
          </description>
        </info>
        <type ref="string"/>
      </property>
      <table ref="countries_table"/>
    </concept>

    <!-- This US state concept is defined for educational purposes only. A US state
      concept exists in the Google geo US dataset. See:

      http://code.google.com/apis/publicdata/docs/canonical/geo.us.html --> 
    <concept id="state" extends="geo:location">
      <info>
        <name>
          <value>State</value>
        </name>
        <description>
          <value>US states</value>
        </description>
      </info>
      <type ref="string"/>
      <property concept="country" isParent="true"/>
      <table ref="states_table"/>
    </concept>

    <concept id="gender" extends="entity:entity">
      <info>
          <name>
          <value>Gender</value>
        </name>
        <description>
          <value>Gender, Male or Female</value>
        </description>
        <pluralName><value>Genders</value></pluralName>
        <totalName><value>Both genders</value></totalName>
      </info>
      <type ref="string"/>
      <table ref="genders_table"/>
    </concept>

    <concept id="unemployment_rate" extends="quantity:rate">
      <info>
        <name>
          <value>unemployment rate</value>
        </name>
        <description>
          <value>The percent of the labor force that is unemployed, not seasonally
            adjusted.</value>
        </description>
        <url><value>http://www.bls.gov/cps/cps_htgm.htm</value></url>
      </info>
      <topic ref="social_indicators"/>
      <type ref="float"/>
      <attribute id="is_percentage">
        <type ref="boolean"/>
        <value>true</value>
      </attribute>
    </concept>

  </concepts>

  <slices>
    <slice id="countries_slice">
      <dimension concept="country"/>
      <dimension concept="time:year"/>
      <metric concept="population"/>
      <table ref="countries_slice_table"/>
    </slice>

    <slice id="states_slice">
      <dimension concept="state"/>
      <dimension concept="time:year"/>
      <metric concept="population"/>
      <metric concept="unemployment_rate"/>
      <table ref="states_slice_table"/>
    </slice>

    <slice id="countries_gender_slice">
      <dimension concept="country"/>
      <dimension concept="gender"/>
      <dimension concept="time:year"/>
      <metric concept="population"/>
      <table ref="countries_gender_slice_table"/>
    </slice>

  </slices>

  <tables>
    <table id="countries_table">
      <column id="country" type="string"/>
      <column id="name" type="string"/>
      <column id="latitude" type="float"/>
      <column id="longitude" type="float"/>
      <data>
        <file format="csv" encoding="utf-8">countries.csv</file>
      </data>
    </table>

    <table id="countries_slice_table">
      <column id="country" type="string"/>
      <column id="year" type="date" format="yyyy"/>
      <column id="population" type="integer"/>
      <data>
        <file format="csv" encoding="utf-8">country_slice.csv</file>
      </data>
    </table>

    <table id="states_table">
      <column id="state" type="string"/>
      <column id="name" type="string"/>
      <column id="country" type="string">
        <value>US</value>
      </column>
      <column id="latitude" type="float"/>
      <column id="longitude" type="float"/>
      <data>
        <file format="csv" encoding="utf-8">states.csv</file>
      </data>
    </table>

    <table id="states_slice_table">
      <column id="state" type="string"/>
      <column id="year" type="date" format="yyyy"/>
      <column id="population" type="integer"/>
      <column id="unemployment_rate" type="float"/>
      <data>
        <file format="csv" encoding="utf-8">state_slice.csv</file>
      </data>
    </table>

    <table id="genders_table">
      <column id="gender" type="string"/>
      <column id="name" type="string"/>
      <data>
        <file format="csv" encoding="utf-8">genders.csv</file>
      </data>
    </table>

    <table id="countries_gender_slice_table">
      <column id="country" type="string"/>
      <column id="gender" type="string"/>
      <column id="year" type="date" format="yyyy"/>
      <column id="population" type="integer"/>
      <data>
        <file format="csv" encoding="utf-8">gender_country_slice.csv</file>
      </data>
    </table>
  </tables>

</dspl>
1305763609000000 1339774517000000
#1145 enhancement timmcnamara ckan-backlog new Support the Handle System

The Handle System is an initiative to provide persistent references for resources. That is, it's basically a proxy system for preventing link rot.

Its documentation is here: http://www.handle.net/. Servers running CKAN could host a "Local Handle Service", which redirects a hash of a resource to an actual URL.

Some suggested use cases:

  • Researcher would like to cite where data came from
  • Agencies would like to have a way to prevent vendor lock-in from CKAN if they decide to move to another platform
1305764775000000 1339774502000000
#1152 enhancement amercader amercader ckan-backlog new True support for generic CSW servers

The CSW harvesters implemented at the moment were developed with the DGU project in mind, and they assume all remote CSW servers to implement the Gemini 2 specification. Gemini 2 is the profile defined in the UK for INSPIRE complying metadata, so obviously catalogs from other countries or non-INSPIRE complying ones won't be able to be harvested.

The changes needed to support generic CSW servers (i.e. those implementing the ISO 19139 profile) are:

  • Handling the validators (right now are hardcoded in the harvester

code). This probably involves issues discussed in the CREP 3 (ticket #1134)

  • Changes in the model to adapt the specification to ISO 19139
  • Renaming objects and classes which are now Gemini-centric

List of CSW servers tested:

https://spreadsheets.google.com/spreadsheet/ccc?key=0Atp3cZFjuIOAdDBVQWRINnlfN1d0b2lleHVEdjBSb2c&hl=en_US&authkey=CNu4hsEB#gid=0

1306141334000000 1313411822000000
#1163 enhancement rgrp rgrp ckan-backlog new Improvements to Storage Extension

Storage is now working but there are

  • Integrate with Resources (e.g. create a resource for each file upload and give option to associate with a package)
    • Should we introduce rule that files *not* associated with a Resource are periodically deleted?
  • Allow setting of a file name/path before upload
  • Allow for file overwriting/deleting etc (how should this work -- do we want to allow this sort of thing)
  • Integrate local file upload stuff in api/auth/*

Different Backend Issues

Local file store is rather different from 'remote' storage in various ways:

  • For remote you don't want to use many buckets as there are bucket limits while for local you want to. Should we there have a single path that users provide which we then partition differently for different backends.
1306408778000000 1310133808000000
#1165 enhancement nils.toedtmann ckan-future new Add multi-site support to ckan

Currently, each ckan site needs its own ckan wsgi process. That eats a lot of resources where many ckan sites are served from one machine (e.g. eu3).

That would dramatically change if a ckan process could behave like multiple ckans (e.g. like Apache's "<VirtualHost?>", or tracd). Depending on the "Host:" header in the HTTP1.1 request, it would choose which local ckan ini file to obey.

I see two ways to constitute the map hostname-to-ini-file map:

  • ckan reads a set of ini files, and each ini file declares which servers names it is responsible for
  • In a global ini file, there are directives mapping servernames to ini files.

In either case there should be a global ckan ini having the default settings for all local ckan sites. Each site ini could be very short then, just having e.g. title, name, database credentials, active plugins etc.

1306413667000000 1339774466000000
#1179 enhancement timmcnamara ckan-backlog new Support tag aliases

A small number of tags are near-duplicates of each other.

Perhaps we could support word stemming from NLTK and/or manual tag aliases:

statistics statistik ... survey surveying surveys

1307429221000000 1339774332000000
#1182 defect timmcnamara ckan-backlog new Comments from deleted packages appear in "Recent Comments" feed

When a package has been deleted, say for spam moderation, comments still appear in the recent comments section.

This is a problem because non-admin users will be shown a warning that they're not authorised to view the package if they click on the link.

At CKAN.net currently, this affects the most recent comment.

1307658251000000 1339774319000000
#1184 enhancement timmcnamara ckan-backlog new Support Wuala as CKAN storage option

Most of CKANs storage options are tied to the USA. This brings concerns of data security for some organisations who may wish to adopt the system. Wuala is a distributed file system that stores data in a peer-to-peer manner. The company behind it, LaCie? sells storage for a fee. However, they also enable clients to have 'free' storage space when machines act as a storage node.

In order to be a storage node, a machine needs to be online for more than 14% of the time - roughly 4h per day. Most CKAN servers are likely to have a far greater uptime than this.

Supporting Wuala would go some way to enabling CKAN to be used in a secure manner. That is, CKAN could be promoted for organisational use where there is lots of data to be stored and large geographic distances to be managed. There is a Python client available and a fairly long Google Tech Talk that overviews the system.

1308034751000000 1339774306000000
#1185 defect timmcnamara ckan-backlog new Administrators can't delete packages from web UI

Administrators have "View", "Edit" and "History" tabs. However, I can't see a way to delete a package from the web UI.

Version: CKAN.net as of today

1308111818000000 1339774289000000
#1188 enhancement nickstenning ckan-backlog new Allow diffing against initial (blank) package version

Currently the history page only allows diffing between different versions of a package, but there doesn't appear to be any easy way to see the changes introduced by the first version of a package.

I'm requesting the ability to diff against a "blank slate" initial state of a project, so I can see the content of the first project commit.

Not sure if this is a vdm feature, so I'm putting this ticket in against ckan.

1308153160000000 1339774275000000
#1198 enhancement dread ckan-backlog new Publisher hierarchy

'Publisher' entities in the model. They are hierarchical.

'User-Publisher' connections with one or more roles (e.g. drafter, moderator).

Authorization settings can control who can set what values in a 'published by' type field.

Publishers and User-Publishers available to read in the API.

Future tickets will provide:

  • API to write Publishers and User-Publishers
  • UI to edit Publishers and User-Publishers

(This feature deprecates authorization groups)

1308820592000000 1339774200000000
#1201 enhancement kindly ckan-backlog new seperate out logic in atom feeds to logic layer.

Simplify the logic in the atom feed an make all feeds use logic layer to return lists.

1308928892000000 1310124297000000
#1203 defect johnglover rolf ckan-backlog new Moderated edits: html code shows as "changed" although it is not

I've installed the Moderated Edits extension (ckanext-moderatededits) and am editing a package imported from IATIregistry.org, with an extra field which contains a bit of HTML.

The editor indicates the field has changed, although the content hasn't (see screenshot). All I can find so far is a minor difference: in the field content, there is a code &#8212 and in the rendered table that is an &mdash;

1309274970000000 1313401579000000
#1227 enhancement timmcnamara ckan-backlog new Display packages' tags in search results

In when displaying search results, it would be useful to also display the tags of a package. Sometimes it's difficult to infer the scope of what the package does from the title and the first sentence of the description. Tags are quite concise way to display rich information.

ENV=datacatalos.org, with CKAN 1.4.2a

1311034262000000 1339774147000000
#1232 enhancement thejimmyg ckan-backlog new [super] Interface improvements

Child tickets:

  • #1194 "Welcome back" message for newly registered user
  • #1202 Links to datapkg utility don't lead to info about it
  • #925 Change the search box icon to remove the down arrow
  • #923 Search box doesn't work in leaderboard page in stats extension
  • #1034 Flash message cached
  • #737 Markdown syntax summary page
  • #811 Extra field editing form layout breaks when there are long field names
1311178296000000 1315948536000000
#1233 enhancement thejimmyg ckan-backlog new [super] Improve wiki-style functionality for history

At the moment we have a good revisioning system but a poor history interface. We need to improve this in a number of areas:

  • #191 Searching by modification date
  • #193 Searching by time-related field
  • #301 Package discussion pages
  • #1236 Package history page should provide links to pages at particular revisions, similar to the wikipedia pages
  • #1236 Viewing old revisions or unmoderated changes should have a message at the top of the package page
  • Other improvements as per my word doc.
1311179392000000 1315948668000000
#1235 enhancement thejimmyg ckan-future new [super] Search Improvements

Child tickets:

  • #234 UI Review - Autocomplete package names & tags in search
  • #193 Searching by time-related field
  • #191 Searching by modification date
  • #905 Unable to search with accented characters in package names
  • #906 Ability to search without accents for accented words
  • #924 Search box has no search button

Broadly speaking though we need to choose PostgreSQL, Solr or something else. We don't want to invest our time maintaining two search backends with a limited abstraction layer between the two.

1311182641000000 1311182641000000
#1257 enhancement dread ckan-backlog new Anti-Spam tools

We are getting more and more spam on ckan.net and we need to improve our strategy of combating it. It is bad because google ranks who we link to (which we do want for legitimate links), and our front page contains the latest package edits, so spam is immediately very visible.

Spam:

  • Package creation
  • Package edit
  • User creation
  • Group creation

Systems to consider:

  • Automatic
    • captcha
    • bayesian scoring
  • Sysadmin
    • a tool to quickly analyse and remove packages, edits and users
  • Helpful users
    • button to click indicating spam, leading to any/all of:
      • add TODO item (which emails package admin and sysadmins)
      • quarantine the package/user so it doesn't show up on the front page and a warning is shown when you view it

General thoughts:

  • We should learn from the wikipedia tools, as they are the experts
  • This is a serious problem that is only going to get worse and is a threat to the whole site.
1312283565000000 1339774129000000
#1259 enhancement johnglover pudo ckan-backlog new "Add a row" for Extras on Package form

The default package form offers 4 empty extras fields. Like the resource section, it should have an "add more" button to add another row.

1312302693000000 1312907056000000
#1260 enhancement pudo ckan-backlog new Remove duplicate functions from _util.html

There seems to be both a list view for dictized and non dictized data structures for all entities in _util.html at the moment. Probably in the back of someone's mind already, but cleanup here would be nice.

1312366652000000 1313401499000000
#1261 defect pudo ckan-backlog new Investigate dots in extras search

It seems that searching for extras_foo:value works with solr, but extras_foo.bar:value doesn't. No theory on why.

1312366768000000 1312366768000000
#1262 enhancement pudo ckan-backlog new Enforce "create-user" permission

This does not seem to have any implications at the moment, it should lock down registration and remove all related links.

1312375296000000 1323090112000000
#1273 requirement amercader ckan-backlog new Create docs for API v3 1313412083000000 1313412083000000
#1278 enhancement amercader ckan-backlog new Refactor authorized_query calls

There are some functions that still use the Auhtorizer().authorized_query method:

./ckan/controllers/authorization_group.py:24:        query = ckan.authz.Authorizer().authorized_query(c.user, model.AuthorizationGroup)
./ckan/lib/base.py:237:        groups = ckan.authz.Authorizer.authorized_query(c.user, model.Group, 
./ckan/lib/search/sql.py:55:        q = authz.Authorizer().authorized_query(username, model.Group)
./ckan/lib/search/sql.py:118:        q = authz.Authorizer().authorized_query(self.options.get('username'), model.Package)
./ckan/logic/action/get.py:154:    query = Authorizer().authorized_query(user, model.Group, model.Action.EDIT)

./ckan/tests/test_authz.py:158:        q = self.authorizer.authorized_query(self.notadmin.name, model.Package)
./ckan/tests/test_authz.py:353:        q = self.authorizer.authorized_query(self.notmember.name, model.Package)
./ckan/tests/test_authz.py:357:        q = self.authorizer.authorized_query(self.member.name, model.Package)
./ckan/tests/functional/test_authorization_group.py:44:        group_count = Authorizer.authorized_query(u'russianfan', model.AuthorizationGroup).count()
1313415177000000 1313415177000000
#1286 enhancement rgrp ckan-backlog new Remove remaining formalchemy stuff

Stuff I've spotted:

  • forms/*
  • template/group/edit_form.html
  • template/package/edit_form.html

This can go once new DGU form is in.

1314116996000000 1342436420000000
#1288 defect dread ckan-backlog new Package edit/creation can't include 'relationships' field

When you create or edit a package (via the API), you aren't able to specify the relationships it has. (If you do you get 409 {"__junk": ["The input field __junk was not expected."]} )

The normal way to create relationships is via /api/rest/relationships/ and this works. But when you GET a package, the dictionary lists all relationship details. So this bug creates a problem for editing a package that has relationships - you want to GET it, make any edits and then PUT it back. The work-around is to delete the 'relationships' key from the dict before you PUT it back.

Options

Ideally, CKAN would read the 'relationships' key and make the necessary changes. This is a chunk of work.

Another good option is to allow an unchanged 'relationships' value, but barf it is edited. This is also a chunk of work.

A bad option would be to just ignore the 'relationships' value, since users will get frustrated changing this value and wonder why it never saves, not understanding it is different to all the rest, without error message.

A final option is to get rid of relationships altogether.

1314203695000000 1339774098000000
#1311 enhancement rgrp rgrp ckan-backlog new Modal user register and login form

Subticket of: #1294

Rather than having to visit a dedicated page it would be good if registration and login could be done from a modal form (separate or combined ...).

Why

  • It could be used from dataset creation page in situations where user needs to be registered / logged in to create a dataset (so we could allow someone to start creating a dataset and only get them to login at the end ...)
  • It allows for quicker and easier logging in

Implementation

  • See Friedrich's work on the datahub
1315297227000000 1315297227000000
#1326 enhancement thejimmyg ckan-backlog new Write a set of auth plugin functions to integrate with Druapl

Ticket #787 described join auth between CKAN and Drupal. The authentication part is live and implemented. This ticket is a placeholder for work that will be needed in the new auth system to link authorization functions to Drupal. It is dependent on the groups refactor.

1315821084000000 1315821084000000
#1336 defect johnglover dread ckan-backlog new License fudge

cset:4b59ab34137d ckan/logic/action/get.py:

-            isopen = model.Package.get_license_register()[license_id].isopen()
-            result_dict['isopen'] = isopen
+            try:
+                isopen = model.Package.get_license_register()[license_id].isopen()
+                result_dict['isopen'] = isopen
+            except KeyError:
+                # TODO: create a log message this error?
+                result_dict['isopen'] = False 

This change hides problems with the license server and returns potentially incorrect values for openness.

This has been noted as 'temporary fix' but seems to be forgotten about, since it has been merged to default and gone into release 1.4.3.

I suggest the licenses are cached (I thought this was already the case when CKAN first requests them after start-up?). I suggest failure would return 503.

1315912057000000 1323173073000000
#1343 enhancement rgrp rgrp ckan-backlog new [super] User related improvements (login, user pages etc)
  • Disallow account creation via openid - #1386
  • Require email field - #1319
    • Require email confirmation to be activated (?)
  • Improvements to user page (e.g. show activity and more info about user) - #1396
  • Modal user login - #1311
1316017098000000 1318528138000000
#1352 enhancement amercader ckan-backlog new Use logic functions instead of as_dict when indexing entities

The current search implementation uses the output of the the as_dict method of the domain Package object to update the index

https://bitbucket.org/okfn/ckan/src/56c79e3fc44c/ckan/lib/search/index.py#cl-48

It also uses package_to_api1 in the SynchronousSearch? plugin:

https://bitbucket.org/okfn/ckan/src/f9dfb0506594/ckan/lib/search/__init__.py#cl-93

This prevents extensions from being able to index custom properties (e.g. faceting by custom extras not included in the model).

The search should use the logic function to get the package properties:

get_action('package_show')(context,data_dict)
1316615397000000 1339774086000000
#1355 defect amercader ckan-backlog new Package extras property does not include the newly created ones

The extras in the package object sent to the extensions after editing (https://bitbucket.org/okfn/ckan/src/01efd5649c10/ckan/logic/action/update.py#cl-226) do not include the newly added.

1317034126000000 1339774056000000
#1382 defect thejimmyg thejimmyg ckan-backlog new Deleted resources are present for harvested package

Perhaps the importer deletes them before re-importing. We shouldn't have deleted resources, so let's investigate.

1318337889000000 1318337889000000
#1384 task rgrp shevski ckan-backlog new CKAN wiki needs updating to refer to thedatahub.org instead of ckan.net and datasets instead of packages

Most articles still refer and link to ckan.net, wiki.ckan.net and to packages

1318414077000000 1318414077000000
#1403 enhancement zephod zephod ckan-backlog new Refactor groups index page

Groups are listed alphabetically with paging - not an ideal user experience. We would like to list groups in order of 'popularity': The number of datasets they contain.

Following this chain of thought, then, it would be nice to rearrange the groups table by clicking on column headers and having it sort by that column.

Furthermore, then, we'd like to implement a full-fledged groups search feature (if this is at all feasible).

The forthcoming groups refactor will probably have some bearing on this task.

1318847512000000 1318847566000000
#1406 enhancement zephod ckan-backlog new Re-enable RSS subscriptions

RSS 'subscribe' buttons appeared in many places on the site but were not very helpful. They took (confused) users pointed to the raw feed code, and Google Reader could not understand the feed. Safari, however, could interpret it correctly.

Their presentation needs to be clear and consistent; the RSS feed really needs testing in a variety of readers; and we need to decide exactly which items should get a feed. (Package updates? Groups?)

1318861327000000 1320930088000000
#1411 enhancement zephod rgrp ckan-backlog new Force resource format to be lower case (also mimetype)

Format should be lowercase. Automatically lower case (for extra points have a bit of javascript to force lower case when entering).

  • Even more points: do a update on thedatahub repo to make all format lower case (or script this as an update?)
1319319604000000 1319319604000000
#1414 enhancement shevski ckan-backlog new track user log-ins on thedatahub.org

Set up tracking for user logins so that we have stats about how many active users of thedatahub exist want to be able to see who logged in the the last x months

1319454782000000 1319454782000000
#1423 enhancement markbrough ckan-backlog new Edit resources suggestions
  • Description vs Name - Edit Resources view is showing the name of the package rather than the description, and a lot (all?) of the packages before the upgrade don't have names, so might be good to swap this round again, e.g.: http://thedatahub.org/dataset/edit/iati-registry
  • Moving resources - Moving them up or down the list used to be quite useful if you had a lot of resources that you might want to leave on the resources page, but only one or two that were actually current and that you wanted to draw attention to. This doesn't exist any more on CKAN but I think it would be good to add it back in.
1319641906000000 1338203678000000
#1424 enhancement dread ckan-backlog new Openness notice should be clearer

ckan-discuss discussion suggests changes to the 'openness' indicator ( http://lists.okfn.org/pipermail/ckan-discuss/2011-October/001786.html )

Dataset view page:

  • If there is an explicit but non-OKD compliant license, such as CC-BY-NC, then this should be stated explicitly, perhaps: “This dataset is Not Open. License: Creative Commons Attribution Noncommerical. This is not an open license as it does not meet the Open Knowledge Definition.”
  • If the license is marked as “Other::License Not Specified”, then this should be stated explicitly, perhaps: “This dataset is Not Open. It is published without an explicit license, the publisher reserves all rights to the dataset.”
  • 3. If the license field was left empty by the contributor of the Data Hub record, then again this should be stated explicitly, perhaps: “This dataset is Not Open. The license of this dataset is unknown or unspecified. Start an enquiry on IsItOpenData? »
  • There is a bug so that non-open licenses doesn't have an openness notice.
  • If downloadable resources are not available, this should not affect 'openness' - check this has been removed.
1319648089000000 1319648089000000
#1429 enhancement rgrp rgrp ckan-backlog new Provide DOIs for datasets in a CKAN instance

DOI = digital object identifier = http://www.doi.org/

As a Publisher I want a DOI for my dataset so that it can be cited by and linked to by others in a standard and easy way.

Details

  • Probably implement as extension rather than core
1319977305000000 1319977305000000
#1432 enhancement rgrp ckan-backlog new [super] Data processing system for CKAN and Webstore

Super ticket: #1190

A data processing system which utilizes the Webstore. One could get a long way with simple javascript running in the browser for development with this javascript then run offline using something like nodejs. Alternatively one could allow one to specify a url to e.g. a python file which would then be run in a sandbox (with access to some specified set of python modules)

1320142747000000 1339774041000000
#1438 enhancement dread ckan-future new Action API - parameter discovery/checking

Many actions in the Action API require parameters. What params are needed should be listed and checked. Because currently, if you get them wrong you simply get a useless 500 error.

Currently they are listed in the docs, extracted from the code manually.

So you could GET /action/api/package_list to receive not only the help text, but a list of arguments.

And if you send an extra or missing argument then an intelligent error message can be returned.

implementation

How about some sort of decorator on the action function:

@logic_params(id, offset, limit)
def get_package_list(context, data_dict):
    ...

This would do the param checking, and is there a way to extract these params from the function? Or do a registration of the logic function?

I'd certainly like to keep the list of the list of params for the function with the function, for ease of reading the code.

Another good thing would be to pass in the params named as themselves, rather than having them contained in the data_dict.

1320161890000000 1338205050000000
#1439 enhancement dread ckan-backlog new Action API discoverablility

A good service API needs to be discoverable, so you are not always having to refer to the documentation html.

Maybe /api/action should return a list of actions available? (Currently this returns a 404.)

  • It would be nice to sort these into get/create/update/delete.
  • #1438 Parameters for each of the actions must be discoverable too

/api/action/{action_name} should also return the help text / parameters allowable. (Currently this returns 400 error)

1320161970000000 1325474974000000
#1457 defect jilly.mathews ckan-future new Bug with DataNL instance

"when logging into http://register.data.overheid.nl/ with OpenID, the /user/me page gives a 404. CKAN version 1.3.4."

n the manual it says an API key kan be created via http://test.ckan.net/user/me /. However, when I try the corresponding http://register.data.overheid.nl/user/me, I get a 404 error (not found).

1320925041000000 1320925041000000
#1459 enhancement rgrp rgrp ckan-backlog new Featured Dataset feature

Provide way to mark a dataset as featured. Featured database show up on the front page.

TODO: detail this more.

1321113012000000 1321113012000000
#1466 enhancement thejimmyg ckan-backlog new Need to support https login for multiple instances as part of the CKAN package install 1321375978000000 1328529062000000
#1534 enhancement rgrp ckan-backlog new Change revisions to record userid rather than username

The use of username is problematic because username's can change.

  • Change all revision creation code to use user id (simplest is to change c.author field in lib/base.py (?))
    • (?) Add a field ipaddr for ip address of anonymous users? (or just keep putting this in author field on Revision and then acception that those won't match when we do a look up against user table)
  • Change user view page to look up against user id rather than name
  • Perform migration on existing Revision objects
    • Match should probably be against both openid and username when searching Revisions' author field (especially true on CKAN where some people have already changed their username from being their openid)
1323278790000000 1338205050000000
#1535 enhancement dread ckan-backlog new Plump for auth header of: X-CKAN-API-KEY

When using the API, the apikey needs to be supplied in a header called 'Authorization'. Because some proxys / deployments use this header for other things, a configurable header was provided as an alternative, with default "X-CKAN-API-KEY".

Rufus suggests having *one* way for this. a) making this not configurable any more b) making X-CKAN-API-KEY the default

(keep Authorization allowed, but not documented, for backwards compatibility)

1323279082000000 1339774019000000
#1542 enhancement dread ckan-backlog new Buttons to purge spam datasets and groups

A sysadmin should be able to easily examine a suspect group or package, determine if it was created by a spammer (as opposed to being a legitimate object that has been graffitied by a spammer) and purge it.

The existing two-stage revision delete is currently unreliable and perhaps too laborious.

Olav and Richard have needs along this line.

1323364930000000 1339774000000000
#1544 task dread ckan-backlog new delete old git branches

We have 150 odd branches (git branch -a) - most of them old - we should prune them. At very least, branches that have been merged in should be deleted. Look at old branches that haven't been merged in and wonder why.

May be of some use:

git branch --merged
git branch --no-merged
1323702610000000 1323702610000000
#1557 enhancement David Rasnik jilly mathews ckan-future new Complete Webstore Preview Extension

Finish any work out standing on web store preview extension to be able to package and release.

Ref James and I going through existing features and trying to mention any polishing that needed doing to get exiting features ready for release with projects such as CKAN hosted.

1324291253000000 1324291253000000
#1558 enhancement David Raznik jilly mathews ckan-future new Publisher Tools

Summarise final set of requirements for this and finish development and test. Estimated 10 working days.

1324291573000000 1324291573000000
#1560 enhancement David Raznik jilly mathews ckan-future new Follow extension

Estimate 2 days to finish dev and test.

David can you add any info needed here?

1324291879000000 1324291879000000
#1561 enhancement David Raznik jilly mathews ckan-future new To do extension

Can we finish this ready for release on data hub and CKAN Hosted.

1324291972000000 1324291972000000
#1562 enhancement Adria jilly mathews ckan-future new Finish Geo Spatial

Estimated 4 weeks of Adria's time. I guess this will need to be broken down into more tickets. This feature is being requested by a number of potential customers and we have some ideas of requirements between Rufus and Jilly for this. This is the most popular new feature we talk about to new clients.

1324292193000000 1324292193000000
#1564 enhancement David Raznik jilly mathews ckan-future new Structured Data (Data API)

Basic websotre exists but this may be not what is described yet.

CKAN provides a rich API for the data itself, allowing users to query retrieve and use data instantly from datasets in CKAN without needing to download or process it first.

1324292834000000 1324292834000000
#1565 enhancement Rufus Pollock jilly mathews ckan-future new Admin dashboard finished?

Is testing complete and ready for release?

1324293092000000 1324293092000000
#1567 enhancement David Raznik jilly mathews ckan-future new Finish QA extension

Requires change to celeryd. Estimated 4 weeks.

1324293599000000 1324293599000000
#1569 enhancement David Raznik jilly mathews ckan-future new Wordpressser

How much effort will this be to be ready to use?

1324294056000000 1324294056000000
#1572 enhancement David Raznik jilly mathews ckan-future new Meta data Harvester

Need to write custom harvesters for each client. Is it worth having one for data hub?

1324294509000000 1324294509000000
#1573 enhancement David Raznik jilly mathews ckan-future new Apps and Ideas

Estimate 2 weeks for someone to finish and test.

1324294593000000 1324294593000000
#1577 defect rgrp dread ckan-backlog new Can't upload file with foreign chars in filename

Looks like uploading a file with foreign characters fails due to encoding reasons.

URL: http://thedatahub.org/api/storage/auth/form/2011-12-19T124447/Ministerstvo-financ%C3%AD-%C4%8Cesk%C3%A9-republiky-_-P%C5%99%C3%ADprava-rozpo%C4%8Dtu.pdf
Module weberror.errormiddleware:162 in __call__
<<              __traceback_supplement__ = Supplement, self, environ
                   sr_checker = ResponseStartChecker(start_response)
                   app_iter = self.application(environ, sr_checker)
                   return self.make_catching_iter(app_iter, environ, sr_checker)
               except:
>>  app_iter = self.application(environ, sr_checker)
Module beaker.middleware:73 in __call__
<<                                                     self.cache_manager)
               environ[self.environ_key] = self.cache_manager
               return self.app(environ, start_response)
>>  return self.app(environ, start_response)
Module beaker.middleware:152 in __call__
<<                          headers.append(('Set-cookie', cookie))
                   return start_response(status, headers, exc_info)
               return self.wrap_app(environ, session_start_response)
           
           def _get_session(self):
>>  return self.wrap_app(environ, session_start_response)
Module routes.middleware:130 in __call__
<<                  environ['SCRIPT_NAME'] = environ['SCRIPT_NAME'][:-1]
               
               response = self.app(environ, start_response)
               
               # Wrapped in try as in rare cases the attribute will be gone already
>>  response = self.app(environ, start_response)
Module pylons.wsgiapp:125 in __call__
<<          
               controller = self.resolve(environ, start_response)
               response = self.dispatch(controller, environ, start_response)
               
               if 'paste.testing_variables' in environ and hasattr(response,
>>  response = self.dispatch(controller, environ, start_response)
Module pylons.wsgiapp:324 in dispatch
<<          if log_debug:
                   log.debug("Calling controller class with WSGI interface")
               return controller(environ, start_response)
           
           def load_test_env(self, environ):
>>  return controller(environ, start_response)
Module ckan.lib.base:123 in __call__
<<          # available in environ['pylons.routes_dict']    
               try:
                   return WSGIController.__call__(self, environ, start_response)
               finally:
                   model.Session.remove()
>>  return WSGIController.__call__(self, environ, start_response)
Module pylons.controllers.core:221 in __call__
<<                  return response(environ, self.start_response)
               
               response = self._dispatch_call()
               if not start_response_called:
                   self.start_response = start_response
>>  response = self._dispatch_call()
Module pylons.controllers.core:172 in _dispatch_call
<<              req.environ['pylons.action_method'] = func
                   
                   response = self._inspect_call(func)
               else:
                   if log_debug:
>>  response = self._inspect_call(func)
Module pylons.controllers.core:107 in _inspect_call
<<                        func.__name__, args)
               try:
                   result = self._perform_call(func, args)
               except HTTPException, httpe:
                   if log_debug:
>>  result = self._perform_call(func, args)
Module pylons.controllers.core:60 in _perform_call
<<          """Hide the traceback for everything above this method"""
               __traceback_hide__ = 'before_and_this'
               return func(**args)
           
           def _inspect_call(self, func):
>>  return func(**args)
Module ckanext.storage.controller:2 in auth_form
Module ckan.lib.jsonp:26 in jsonpify
<<      Very much modelled after pylons.decorators.jsonify .
           """
           data = func(*args, **kwargs)
           return to_jsonp(data)
>>  data = func(*args, **kwargs)
Module ckanext.storage.controller:301 in auth_form
<<          method = 'POST'
               authorize(method, bucket, label, c.userobj, self.ofs)
               data = self._get_form_data(label)
               return data
>>  authorize(method, bucket, label, c.userobj, self.ofs)
Module ckanext.storage.controller:79 in authorize
<<      if method != 'GET':
               # do not allow overwriting
               if ofs.exists(bucket, key):
                   abort(409)
               # now check user stuff
>>  if ofs.exists(bucket, key):
Module ofs.remote.botostore:53 in exists
<<          if bucket is None: 
                   return False
               return (label is None) or (label in bucket)
           
           def claim_bucket(self, bucket):
>>  return (label is None) or (label in bucket)
Module boto.s3.bucket:87 in __contains__
<<      def __contains__(self, key_name):
              return not (self.get_key(key_name) is None)
       
           def startElement(self, name, attrs, connection):
>>  return not (self.get_key(key_name) is None)
Module boto.s3.bucket:144 in get_key
<<          response = self.connection.make_request('HEAD', self.name, key_name,
                                                       headers=headers,
                                                       query_args=query_args)
               # Allow any success status (2xx) - for example this lets us
               # support Range gets, which return status 206:
>>  query_args=query_args)
Module boto.s3.connection:388 in make_request
<<          if isinstance(key, Key):
                   key = key.name
               path = self.calling_format.build_path_base(bucket, key)
               boto.log.debug('path=%s' % path)
               auth_path = self.calling_format.build_auth_path(bucket, key)
>>  path = self.calling_format.build_path_base(bucket, key)
Module boto.s3.connection:88 in build_path_base
<<      def build_path_base(self, bucket, key=''):
               return '/%s' % urllib.quote(key)
       
       class SubdomainCallingFormat(_CallingFormat):
>>  return '/%s' % urllib.quote(key)
Module urllib:1222 in quote
<<              safe_map[c] = (c in safe) and c or ('%%%02X' % i)
               _safemaps[cachekey] = safe_map
           res = map(safe_map.__getitem__, s)
           return ''.join(res)
>>  res = map(safe_map.__getitem__, s)
KeyError: u'\xed'
CGI Variables
AUTH_TYPE	'cookie'
CONTENT_TYPE	'; charset=utf-8'
DOCUMENT_ROOT	'/htdocs'
GATEWAY_INTERFACE	'CGI/1.1'
HTTP_ACCEPT	'*/*'
HTTP_ACCEPT_CHARSET	'ISO-8859-1,utf-8;q=0.7,*;q=0.3'
HTTP_ACCEPT_ENCODING	'gzip,deflate,sdch'
HTTP_ACCEPT_LANGUAGE	'en-US,en;q=0.8'
HTTP_CACHE_CONTROL	'max-age=259200'
HTTP_CONNECTION	'keep-alive'
HTTP_COOKIE	'thedatahub_net=27a7f095fcca1ea6b36df996d595e3278b16f4538862bf7f88d49e2000b9246547c8fd0e; auth_tkt="f9c6ab2b0d9fcd71c4c2408bc12fab544eef1c45elenaibp!userid_type:unicode"; auth_tkt="f9c6ab2b0d9fcd71c4c2408bc12fab544eef1c45elenaibp!userid_type:unicode"; ckan_user=elenaibp; ckan_display_name="Elena Mondo"; ckan_apikey=decd48b1-49ee-4250-bff4-98ccca9c02a5; hide_welcome_message=1; __utma=119670349.1809834699.1323782464.1324293066.1324298316.4; __utmb=119670349.3.10.1324298316; __utmc=119670349; __utmz=119670349.1323782464.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)'
HTTP_HOST	'thedatahub.org'
HTTP_REFERER	'http://thedatahub.org/dataset/edit/budget-library-czeck-republic'
HTTP_USER_AGENT	'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7'
HTTP_VIA	'1.1 localhost (squid/3.0.STABLE19)'
HTTP_X_FORWARDED_FOR	'87.114.74.190'
HTTP_X_REQUESTED_WITH	'XMLHttpRequest'
PATH	'/usr/local/bin:/usr/bin:/bin'
PATH_INFO	'/api/storage/auth/form/2011-12-19T124447/Ministerstvo-financ\xc3\xad-\xc4\x8cesk\xc3\xa9-republiky-_-P\xc5\x99\xc3\xadprava-rozpo\xc4\x8dtu.pdf'
PATH_TRANSLATED	'/home/okfn/var/srvc/ckan.net/pyenv/bin/ckan.net.py/api/storage/auth/form/2011-12-19T124447/Ministerstvo-financ\xc3\xad-\xc4\x8cesk\xc3\xa9-republiky-_-P\xc5\x99\xc3\xadprava-rozpo\xc4\x8dtu.pdf'
REMOTE_ADDR	'193.34.146.142'
REMOTE_PORT	'55419'
REMOTE_USER	u'elenaibp'
REMOTE_USER_DATA	'userid_type:unicode'
REMOTE_USER_TOKENS	['']
REQUEST_METHOD	'GET'
REQUEST_URI	'/api/storage/auth/form/2011-12-19T124447/Ministerstvo-financ%C3%AD-%C4%8Cesk%C3%A9-republiky-_-P%C5%99%C3%ADprava-rozpo%C4%8Dtu.pdf'
SCRIPT_FILENAME	'/home/okfn/var/srvc/ckan.net/pyenv/bin/ckan.net.py'
SCRIPT_URI	'http://thedatahub.org/api/storage/auth/form/2011-12-19T124447/Ministerstvo-financ\xc3\xad-\xc4\x8cesk\xc3\xa9-republiky-_-P\xc5\x99\xc3\xadprava-rozpo\xc4\x8dtu.pdf'
SCRIPT_URL	'/api/storage/auth/form/2011-12-19T124447/Ministerstvo-financ\xc3\xad-\xc4\x8cesk\xc3\xa9-republiky-_-P\xc5\x99\xc3\xadprava-rozpo\xc4\x8dtu.pdf'
SERVER_ADDR	'193.34.146.146'
SERVER_ADMIN	'[no address given]'
SERVER_NAME	'thedatahub.org'
SERVER_PORT	'80'
SERVER_PROTOCOL	'HTTP/1.0'
SERVER_SIGNATURE	'<address>Apache/2.2.14 (Ubuntu) Server at thedatahub.org Port 80</address>\n'
SERVER_SOFTWARE	'Apache/2.2.14 (Ubuntu)'
WSGI Variables
application	<beaker.middleware.CacheMiddleware object at 0x7f22601c7dd0>
beaker.cache	<beaker.cache.CacheManager object at 0x7f22601c7b50>
beaker.get_session	<bound method SessionMiddleware._get_session of <beaker.middleware.SessionMiddleware object at 0x7f22601c7a90>>
beaker.session	{'_accessed_time': 1324298703.071357, '_creation_time': 1324293077.4139669}
mod_wsgi.application_group	'ckan.net|'
mod_wsgi.callable_object	'application'
mod_wsgi.listener_host	''
mod_wsgi.listener_port	'80'
mod_wsgi.process_group	'ckan.net'
mod_wsgi.reload_mechanism	'1'
mod_wsgi.script_reloading	'1'
mod_wsgi.version	(2, 8)
paste.cookies	(<SimpleCookie: __utma='119670349.1809834699.1323782464.1324293066.1324298316.4' __utmb='119670349.3.10.1324298316' __utmc='119670349' __utmz='119670349.1323782464.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)' auth_tkt='f9c6ab2b0d9fcd71c4c2408bc12fab544eef1c45elenaibp!userid_type:unicode' ckan_apikey='decd48b1-49ee-4250-bff4-98ccca9c02a5' ckan_display_name='Elena Mondo' ckan_user='elenaibp' hide_welcome_message='1' thedatahub_net='27a7f095fcca1ea6b36df996d595e3278b16f4538862bf7f88d49e2000b9246547c8fd0e'>, 'thedatahub_net=27a7f095fcca1ea6b36df996d595e3278b16f4538862bf7f88d49e2000b9246547c8fd0e; auth_tkt="f9c6ab2b0d9fcd71c4c2408bc12fab544eef1c45elenaibp!userid_type:unicode"; auth_tkt="f9c6ab2b0d9fcd71c4c2408bc12fab544eef1c45elenaibp!userid_type:unicode"; ckan_user=elenaibp; ckan_display_name="Elena Mondo"; ckan_apikey=decd48b1-49ee-4250-bff4-98ccca9c02a5; hide_welcome_message=1; _ _utma=119670349.1809834699.1323782464.1324293066.1324298316.4; __utmb=119670349.3.10...)|utmcmd=(none)')
paste.registry	<paste.registry.Registry object at 0x7f226194df50>
paste.throw_errors	True
pylons.action_method	<bound method StorageAPIController.auth_form of <ckanext.storage.controller.StorageAPIController object at 0x7f2261dad990>>
pylons.controller	<ckanext.storage.controller.StorageAPIController object at 0x7f2261dad990>
pylons.environ_config	{'session': 'beaker.session', 'cache': 'beaker.cache'}
pylons.pylons	<pylons.util.PylonsContext object at 0x7f2261daddd0>
pylons.routes_dict	{'action': u'auth_form', 'controller': u'ckanext.storage.controller:StorageAPIController', 'label': u'2011-12-19T124447/Ministerstvo-financ\xed-\u010cesk\xe9-republiky-_-P\u0159\xedprava-rozpo\u010dtu.pdf'}
repoze.who.identity	<repoze.who identity (hidden, dict-like) at 139785645747120>
repoze.who.logger	<logging.Logger instance at 0x7f225e23c098>
repoze.who.plugins	{'openid': <OpenIdIdentificationPlugin 139785625065680>, 'friendlyform': <FriendlyFormPlugin 139785618095248>, 'ckan.lib.authenticator:UsernamePasswordAuthenticator': <ckan.lib.authenticator.UsernamePasswordAuthenticator object at 0x7f2260874c10>, 'auth_tkt': <AuthTktCookiePlugin 139785625065808>, 'ckan.lib.authenticator:OpenIDAuthenticator': <ckan.lib.authenticator.OpenIDAuthenticator object at 0x7f2260874c90>}
routes.route	<routes.route.Route object at 0x7f22601a1090>
routes.url	<routes.util.URLGenerator object at 0x7f2261dadf50>
webob._parsed_query_vars	(GET([]), '')
webob.adhoc_attrs	{'language': 'en-us'}
wsgi process	'Multiprocess'
wsgi.file_wrapper	<built-in method file_wrapper of mod_wsgi.Adapter object at 0x7f2261da9af8>
wsgiorg.routing_args	(<routes.util.URLGenerator object at 0x7f2261dadf50>, {'action': u'auth_form', 'controller': u'ckanext.storage.controller:StorageAPIController', 'label': u'2011-12-19T124447/Ministerstvo-financ\xed-\u010cesk\xe9-republiky-_-P\u0159\xedprava-rozpo\u010dtu.pdf'})
1324317659000000 1325473564000000
#1578 enhancement rgrp ckan-backlog new [super] Re-enable and refactor ratings 1324322443000000 1325473015000000
#1581 enhancement mark.wainwright@… johnglover ckan-future new Blog post about Google Analytics extension for CKAN

The CKAN Google Analytics extension has been updated to work with the latest version of CKAN, could make for a nice blog post.

Can ping John Glover in January for any details required.

Key link is: http://thedatahub.org/analytics/dataset/top though this should probably move to be under stats (e.g. http://thedatahub.org/stats/usage)

1324402800000000 1325474274000000
#1584 enhancement johnglover johnglover ckan-backlog new QA report improvements - 2.5d

Super: #1594

  • qa/{username}
  • qa/{groupname}
  • paginate QA results
  • search / filter QA results
  • list organisation report by default, but can disable via config option (done)
  • UX tidy up of report pages - hide border if no sidebar, etc
1324459433000000 1338981975000000
#1588 enhancement johnglover johnglover ckan-backlog new QA - Give SPARQL endpoints a 4 star rating

Super: #1594

From Richard Cyganiak on the CKAN Discuss list:

Besides considering the media type of resources, it would also make sense to check for the presence of a SPARQL endpoint. SPARQL endpoints are recorded for more than 300 datasets on the Data Hub using the pseudo-type "api/sparql". A few more are recorded with the format "SPARQL". I suggest that datasets with such resources should also be considered for the fourth star.

1324480405000000 1325475178000000
#1589 enhancement johnglover johnglover ckan-backlog new QA - Give 5 star rating to datasets with link metadata

Super: #1594

From Richard Cyganiak on the CKAN Discuss list:

Regarding the fifth star (is the dataset linked to others?). This cannot be automatically determined just by looking at the format. It either requires inspection of the actual data, or information about links in the metadata. As you're probably aware, we've established conventions for recording information on data links in CKAN [1], as part of the work of the lodcloud group on the Data Hub. Link information is captured for hundreds of datasets. I would claim that we have the majority of four-star datasets covered there, and hence you can determine if they should get the fifth star by checking for the presence of a links:xxx field.

1324480600000000 1325475095000000
#1596 enhancement dread ckan-future new Refactor authz roles

Suggestions from rgrp:

  • Get rid of Roles, and replace them with direct assignment of actions, even though there are many actions, and extensions can add arbitrary ones.
  • Debatable whether we should cut the number of actions to correspond to the three roles defined by the base system.
  • Have a method of finding roles (or, in future, actions) relevant to a given protection object (e.g. FILE-UPLOAD(ER) not relevant to Packages)

(This ticket is split off from #1065)

1324549888000000 1338205019000000
#1598 enhancement rgrp ckan-backlog new Reinstate Ratings

Ratings were disabled approximately a year ago because:

  • Unclear purpose and UX. What did ratings tell you? How useful were they?
  • Spamming (esp by bots: you could submit an anonymous rating via a GET request which caused problems)

Both problems are solvable and it would be nice to have this feature reinstated.

  • Purpose: can make this more purposable by limiting to logged in users (or at least distinguishing logged in from non-logged in users)
    • Even better we could allow ratings to be made public (I'm interested in what someone else I respect finds important)
  • Spamming: limit to logged in users and / or use AJAX over an API to submit ...
1325177524000000 1325474818000000
#1604 enhancement dread ckan-backlog new Get ckanext-moderatededits working with CKAN 1.5+ templates

ckanext-moderatededits requires an old and possibly development version of CKAN. It would be good to update it for later CKAN versions.

According to the README, you need CKAN from branch feature-1141-moderated-edits-ajax but the changelog suggests this branch went into version 1.4.2. So it possibly works with 1.4.2 and 1.4.3(.1). But CKAN 1.5 has revamped templates, so the genshi stream filters definitely don't work.

BTW history_ajax/read_ajax calls have been deprecated in CKAN since 1.5.2a and will need fixing up to use the Action API too as part of this.

1325352429000000 1325352429000000
#1606 enhancement dread ckan-backlog new metadata license config option

Add a config option to choose the metadata licence. Set default to Open Database License.

Currently the dataset edit form says "Important: By submitting content, you agree to release your contributions under the Open Database License." This is hard-coded, but not suitable for when DGU uses the CKAN form - they use the OGL.

1325501130000000 1339773981000000
#1635 enhancement seanh seanh ckan-backlog new Email notifications (e.g. for activity streams)

CKAN should be able to send email notifications to users.

Maybe have a notifications table in the db, and a server-side job that runs periodically and consumes rows from this table, mailing them to the users.

One thing that we may want to send users notifications of is activity stream events. So the activity streams code would have to add rows to the notifications table for the mailer job to consume. But remember that email notifications feature is separate from activity streams - we may want to send notifications of other things as well.

Need to implement (at least some of) #1634 before this can be implemented, in order to have something to send notifications about.

Analysis here: http://ckan.okfnpad.org/27

1326304587000000 1355141157000000
#1642 defect pudo ckan-backlog new Extra link generators generate garbled HTML

I had a package descriptions with URLs that contain "group:foo". This produces garbled output as the system tries to generate two sets of links: the outer link and an inner link.

Need to fix the parser.

Text:

Webdienst basierende Bereitstellung von Geobasisdaten der Freien und Hansestadt Hamburg. Folgende Geobasisdaten werden als WebMapTileService? (WMT-S) für die Dauer des Wettbewerbs netzbasiert unter der Creative Commons Lizenz zur Verfügung gestellt: Digitale Orthophotos 40 cm Auflösung (Layer: apps4d_DOP40), Digitale Stadtkarte (Layer: apps4d_DISK), Digitale Regionalkarte (Layer: apps4d_DIRK), Digitale Karte 1:5000 (Layer: apps4d_DK5).

Metadateneinträge zu den Daten im PortalU:

One fix is quoting the URLs

1326382171000000 1339773967000000
#1643 enhancement shevski ckan-backlog new Add fixed tags to thedatahub for better browsing

Similar to publicdata.eu, want to have themed areas such as finance, environment, census, etc and country tags

1326393293000000 1326393293000000
1 2 3 4 5 6 7 8 9 10 11
Note: See TracReports for help on using and creating reports.