{22} Trac tickets (2647 matches)

Results (1501 - 1600 of 2647)

Id Type Owner Reporter Milestone Status Resolution Summary Description Posixtime Modifiedtime
#1156 enhancement pudo pudo pdeu-1 closed fixed Scraping harvesters for Paris and Vienna Catalogues

Import metadata from both sources into PDEU via the Harvesting framework but by scraping their respective catalogue pages.

1306337428000000 1306855111000000
#1155 enhancement pudo pudo pdeu-1 closed fixed Harvester for data.london.gov.uk

Write a harvester for data.london.gov.uk to import catalogue metadata into PDEU. API (or at least documentation) is available at: http://sourceforge.net/projects/londondatastore/files/

1306337318000000 1306773174000000
#1154 enhancement johnglover nils.toedtmann ckan-sprint-2011-10-28 closed fixed Make ckan robust against solr failure

According to pudo, a ckan with activated solr extension throws a 5xx when solr is unreachable. Instead, it should behave more like a ckan without ckanext-solr when this happens.

1306254472000000 1314287519000000
#1153 enhancement lucychambers lucychambers ckan-backlog closed fixed Update CKAN wiki front page

Update CKAN wiki front page - a la OpenSpending?: http://wiki.openspending.org/Main_Page

Sections should relate to different types of people using the site:

Developers, Users etc..

1306155211000000 1306941386000000
#1152 enhancement amercader amercader ckan-backlog new True support for generic CSW servers

The CSW harvesters implemented at the moment were developed with the DGU project in mind, and they assume all remote CSW servers to implement the Gemini 2 specification. Gemini 2 is the profile defined in the UK for INSPIRE complying metadata, so obviously catalogs from other countries or non-INSPIRE complying ones won't be able to be harvested.

The changes needed to support generic CSW servers (i.e. those implementing the ISO 19139 profile) are:

  • Handling the validators (right now are hardcoded in the harvester

code). This probably involves issues discussed in the CREP 3 (ticket #1134)

  • Changes in the model to adapt the specification to ISO 19139
  • Renaming objects and classes which are now Gemini-centric

List of CSW servers tested:

https://spreadsheets.google.com/spreadsheet/ccc?key=0Atp3cZFjuIOAdDBVQWRINnlfN1d0b2lleHVEdjBSb2c&hl=en_US&authkey=CNu4hsEB#gid=0

1306141334000000 1313411822000000
#1151 enhancement rgrp timmcnamara ckan-sprint-2012-03-05 closed wontfix Preview for geographic data should be a map

Data viewer / previewer for a resource that has a KML file, or others, such as GeoRSS and GeoJSON, we should provide a map.

  • Support for KML files: #1458

This is non-trivial for external files as we need a way to jsonify. For files stored locally this is more of a recline issue (and will require a bit of work to either guess columns or allow user to specify them).

1306043217000000 1330908188000000
#1150 defect johnglover timmcnamara ckan-sprint-2011-10-28 closed fixed Non-ASCII chars prevent data preview

Characters outside of ASCII range are not supported within data previews.

Steps to reproduce:

  1. Visit http://ckan.net/package/kivele2010
  2. Click on [preview] for any of the resources
1306019914000000 1311774141000000
#1149 enhancement kindly kindly ckan-v1.5-sprint-1 closed fixed Change domain object modification plugin to use Session extension.

This should make it more efficient as it currently does a lot of repeating work. i.e if you change a package and a resource in the same commit it sends out 2 notifications and should only really send out 1.

1305969863000000 1306090663000000
#1148 refactor kindly kindly ckan-v1.5-sprint-1 closed fixed test speed improvements and cleanup

The tests have been running slower recently and need fixing. They also could do with a bit more consistency to them.

1305969223000000 1305969925000000
#1147 enhancement kindly kindly ckan-v1.5-sprint-1 closed fixed Add expired_id to all revision tables.

Revision tables need expired_id to make querying history AND pending changes more efficient.

This involves making a session extension and a large table migration.

1305839833000000 1307957556000000
#1146 enhancement kindly kindly ckan-v1.5-sprint-1 closed fixed make logic layer control its own state

Logic layer should not use any vdm defined state and should manage it itself.

1305829117000000 1307957527000000
#1145 enhancement timmcnamara ckan-backlog new Support the Handle System

The Handle System is an initiative to provide persistent references for resources. That is, it's basically a proxy system for preventing link rot.

Its documentation is here: http://www.handle.net/. Servers running CKAN could host a "Local Handle Service", which redirects a hash of a resource to an actual URL.

Some suggested use cases:

  • Researcher would like to cite where data came from
  • Agencies would like to have a way to prevent vendor lock-in from CKAN if they decide to move to another platform
1305764775000000 1339774502000000
#1144 enhancement timmcnamara ckan-backlog new Support DSPL

DSPL, the Dataset Publishing Language, is being promoted by Google for its "Google Public Data Explorer" system. It is an XML format with metadata.

The format is described on the developer docs ofthe Google Code site.

Google provides a Python script which reads CSV data and generates DSPL

Sample from http://code.google.com/apis/publicdata/docs/dspl_sample.html:

<?xml version="1.0" encoding="UTF-8"?>
<dspl xmlns="http://schemas.google.com/dspl/2010"
    xmlns:geo="http://www.google.com/publicdata/dataset/google/geo"
    xmlns:geo_usa="http://www.google.com/publicdata/dataset/google/geo/us"
    xmlns:time="http://www.google.com/publicdata/dataset/google/time"
    xmlns:quantity="http://www.google.com/publicdata/dataset/google/quantity"
    xmlns:entity="http://www.google.com/publicdata/dataset/google/entity">

  <import namespace="http://www.google.com/publicdata/dataset/google/time"/>
  <import namespace="http://www.google.com/publicdata/dataset/google/quantity"/>
  <import namespace="http://www.google.com/publicdata/dataset/google/entity"/>
  <import namespace="http://www.google.com/publicdata/dataset/google/geo"/>
  
  <info>
    <name>
      <value>My statistics</value>
    </name>
    <description>
      <value>Some very interesting statistics about countries</value>
    </description>
    <url>
      <value>http://www.stats-bureau.com/mystats/info.html</value>
    </url>
  </info>

  <provider>
    <name>
      <value>Bureau of Statistics</value>
    </name>
    <url>
      <value>http://www.stats-bureau.com</value>
    </url>
  </provider>

  <topics>
    <topic id="geography">
      <info>
        <name><value>Geography</value></name>
      </info>
    </topic>
    <topic id="social_indicators">
      <info>
        <name><value>Social indicators</value></name>
      </info>
      <topic id="population_indicators">
        <info>
          <name><value>Population indicators</value></name>
        </info>
      </topic>
      <topic id="poverty_and_income">
        <info>
          <name><value>Poverty & income</value></name>
        </info>
      </topic>
      <topic id="health">
        <info>
          <name><value>Health</value></name>
        </info>
      </topic>
    </topic>
  </topics>

  <concepts>
    <!-- As noted in the tutorial, this concept should extend quantity:amount.-->
    <concept id="population">
      <info>
        <name>
          <value>Population</value>
        </name>
        <description>
          <value>Size of the resident population.</value>
        </description>
      </info>
      <topic ref="population_indicators"/>
      <type ref="integer"/>
    </concept>

    <!-- This country concept is defined for educational purposes only. A country
    concept exists in the Google geo dataset. See:

    http://code.google.com/apis/publicdata/docs/canonical/geo.html --> 
    <concept id="country" extends="geo:location">
      <info>
        <name>
          <value>Country</value>
        </name>
        <description>
          <value>My list of countries</value>
        </description>
      </info>
      <type ref="string"/>
      <property id="name">
        <info>
          <name><value xml:lang="en">Country name</value></name>
          <description>
            <value xml:lang="en">The official name of the country</value>
          </description>
        </info>
        <type ref="string"/>
      </property>
      <table ref="countries_table"/>
    </concept>

    <!-- This US state concept is defined for educational purposes only. A US state
      concept exists in the Google geo US dataset. See:

      http://code.google.com/apis/publicdata/docs/canonical/geo.us.html --> 
    <concept id="state" extends="geo:location">
      <info>
        <name>
          <value>State</value>
        </name>
        <description>
          <value>US states</value>
        </description>
      </info>
      <type ref="string"/>
      <property concept="country" isParent="true"/>
      <table ref="states_table"/>
    </concept>

    <concept id="gender" extends="entity:entity">
      <info>
          <name>
          <value>Gender</value>
        </name>
        <description>
          <value>Gender, Male or Female</value>
        </description>
        <pluralName><value>Genders</value></pluralName>
        <totalName><value>Both genders</value></totalName>
      </info>
      <type ref="string"/>
      <table ref="genders_table"/>
    </concept>

    <concept id="unemployment_rate" extends="quantity:rate">
      <info>
        <name>
          <value>unemployment rate</value>
        </name>
        <description>
          <value>The percent of the labor force that is unemployed, not seasonally
            adjusted.</value>
        </description>
        <url><value>http://www.bls.gov/cps/cps_htgm.htm</value></url>
      </info>
      <topic ref="social_indicators"/>
      <type ref="float"/>
      <attribute id="is_percentage">
        <type ref="boolean"/>
        <value>true</value>
      </attribute>
    </concept>

  </concepts>

  <slices>
    <slice id="countries_slice">
      <dimension concept="country"/>
      <dimension concept="time:year"/>
      <metric concept="population"/>
      <table ref="countries_slice_table"/>
    </slice>

    <slice id="states_slice">
      <dimension concept="state"/>
      <dimension concept="time:year"/>
      <metric concept="population"/>
      <metric concept="unemployment_rate"/>
      <table ref="states_slice_table"/>
    </slice>

    <slice id="countries_gender_slice">
      <dimension concept="country"/>
      <dimension concept="gender"/>
      <dimension concept="time:year"/>
      <metric concept="population"/>
      <table ref="countries_gender_slice_table"/>
    </slice>

  </slices>

  <tables>
    <table id="countries_table">
      <column id="country" type="string"/>
      <column id="name" type="string"/>
      <column id="latitude" type="float"/>
      <column id="longitude" type="float"/>
      <data>
        <file format="csv" encoding="utf-8">countries.csv</file>
      </data>
    </table>

    <table id="countries_slice_table">
      <column id="country" type="string"/>
      <column id="year" type="date" format="yyyy"/>
      <column id="population" type="integer"/>
      <data>
        <file format="csv" encoding="utf-8">country_slice.csv</file>
      </data>
    </table>

    <table id="states_table">
      <column id="state" type="string"/>
      <column id="name" type="string"/>
      <column id="country" type="string">
        <value>US</value>
      </column>
      <column id="latitude" type="float"/>
      <column id="longitude" type="float"/>
      <data>
        <file format="csv" encoding="utf-8">states.csv</file>
      </data>
    </table>

    <table id="states_slice_table">
      <column id="state" type="string"/>
      <column id="year" type="date" format="yyyy"/>
      <column id="population" type="integer"/>
      <column id="unemployment_rate" type="float"/>
      <data>
        <file format="csv" encoding="utf-8">state_slice.csv</file>
      </data>
    </table>

    <table id="genders_table">
      <column id="gender" type="string"/>
      <column id="name" type="string"/>
      <data>
        <file format="csv" encoding="utf-8">genders.csv</file>
      </data>
    </table>

    <table id="countries_gender_slice_table">
      <column id="country" type="string"/>
      <column id="gender" type="string"/>
      <column id="year" type="date" format="yyyy"/>
      <column id="population" type="integer"/>
      <data>
        <file format="csv" encoding="utf-8">gender_country_slice.csv</file>
      </data>
    </table>
  </tables>

</dspl>
1305763609000000 1339774517000000
#1143 enhancement dread dread ckan-v1.5-sprint-2 closed fixed Improve stats page
  • Ensure we don't include deleted packages in the stats
  • Some visual improvements:
    • Number of packages:
      • fix x axis to start at first revision
      • fix y axis to start at zero
    • Fix problem with legend to 'Revisions to packages' graph.
    • Hide sidebar in the template, so it isn't right-cropped in the DGU theme
  • Add some testing of the stats lib - results of basic stats
1305734836000000 1307024681000000
#1142 enhancement annapowellsmith rgrp ckan-v1.5 closed fixed [super] Major Overhaul and Extension of CKAN Documentation

Child tickets:

  • #1041 Start Using the CKAN Wiki for Tutorial-style documentation
  • #1192 Convert CKAN Sphinx docs into admin/reference manual

Previous super ticket (from 3m ago): http://trac.ckan.org/ticket/927

  • CKAN 1-page overview (for enterprise and for data hackers)
  • Administrator's Guide (including install)
  • Extensions Guide
  • Separate CKAN.net info from Software Documentation (?)

Also now a wiki page with more detail: http://wiki.ckan.net/Documentation_Plans

Minor Items

1305727452000000 1313775665000000
#1141 CREP johnglover ckan-backlog closed fixed [super] Moderated Edits User Interface

Proposer: John Glover
Seconder: James Gardner

Abstract

We are trying to achieve these goals:

  • To get people involved with making edits to CKAN metadata.
  • To have an ownership model as to who can moderate and validate these changes
  • To not put too huge a burden on these owners.

This feature allows anyone to edit a package and create a new revision, but requires an owner/moderator to approve a revision before it is are made "official".

There have been a lot of discussions around the revisioning system side of this ticket (CREP 0002) and I think these are now largely resolved. We now want to discuss the user interface.

The Problem

We require the following functionality:

  • Allow a group of changes to be stored as a new revision.
  • Allow a linear stack of "community" revisions.
  • Provide a way for the editor and moderator to compare previous revisions to the current one.
  • When a moderator approves a change it creates a new revision flagged "moderated" (this is analogous to a merge commit)
  • Provide a way for the editor and moderator comment on revisions if necessary.

Extra features:

  • Need a way to summarise the changes (as part of the preview perhaps)
  • Sysadmin needs to purge a revision completely

Specification

UI/UX

UI Mockup:

Revisions:

  • Revisions are per package rather than per field.
  • Internally CKAN has separate revisions for resources, extras and package metadata. From a user's point of view this could be confusing to expose, so everything that they see on a package form when they hit save is a single revision.

On the Edit page:

  • We have a panel on the right, listing all the revisions with the current moderated one selected. Moderated revisions are highligted in some way (red and bold?).
  • The values displayed in the form are by default populated from the latest revision (whether community or moderated)
  • Under each field is a "shadow", showing the value of the field in the revision selected in the panel, if it is different from the value in the field. By default the shadow values are populated from the latest moderated revision which is the one selected in the revision panel by default too.
  • When you change the value of a field, a shadow may appear or disappear accordingly. If they disappear a box saying that they are the same replaces it
  • If you want to edit values from a previous revision, you first select that revision to get the shadows populated. There is a button named "Replace fields with values from this revision" under the revision list. You click this, a warning pops up and then you say "Yes". You then select the moderated revision again.
  • We also allow package comments the same way as the todo extension works at the moment. Additionally, we need to be able to differentiate between what the moderator wrote and what a community member wrote, and so we may need to make a small change to the todo extension to facilitate this.
  • In addition to package comments, each revision will have a revision log (analogous to a commit message).

Technical Details

  • This CREP will result in a new CKAN extension.
  • It depends heavily on the new revisioning system (CREP0002), some of the details of which are yet to be finalised.
  • This CREP therefore requires working closely with David Raznick to come up with an API that the UI AJAX calls can use.
  • We will then use suitable test data to mimic these API calls until CREP0002 is ready.

Why do it this way

This hopefully provides a clear and consistent mechanism allowing both a community member to make new revisions and a moderator to view and approve revisions, with largely the same UI/UX.

Implementation plan

Deliverables

A new CKAN extension, consisting of:

  • Code: Python, HTML, CSS, Javascript
  • Unit tests
  • Localization
  • Documentation

Participants

John Glover to do it.

Progress

John has implemented the bulk of this UI. Just some things to tidy up before it is complete:

  • Genshi stream filters to be updated with CKAN 1.5 / 1.5.1 templates
  • history_ajax / read_ajax to be replaced with calls to Action API (or Util REST API)

I've split these two off into a new ticket #1604.

Related Progress

The Todo extension is written and available at: https://bitbucket.org/johnglover/ckanext-todo.

In the section 'The Problem', under extra features, we mention a need for the sysadmin to be able to purge a revision already. This is already done.

See also

#1129 Backend work

1305721003000000 1325352507000000
#1140 defect dread ckan-v1.5 closed fixed Adding the package to the group is not search indexed

To reproduce:

  1. paster db init
  2. paster create-test-data
  3. In Web UI: create new group 'tilo', which includes package 'annakarenina'
  4. curl http://localhost:5000/api/search/package?groups=tilo results in 0 results (WRONG)
  5. paster search-index rebuild
  6. curl http://localhost:5000/api/search/package?groups=tilo results in 1 result
1305718290000000 1312537257000000
#1139 enhancement lucychambers lucychambers ckan-backlog closed fixed Create CKAN Theme Gallery

Take screenshots of existing ckan instances esp those mentioned <http://wiki.ckan.net/Theming> and put on flickr in ckan or ckan-theme group so we can create a gallery ... (both to illustrate theming but also to show ckan instances that are around -- could add to http://wiki.ckan.net/Instances)

1305645859000000 1306941356000000
#1138 enhancement johnlawrenceaspden closed invalid minor navigations behave inconsistently

For Authorization Groups, if you have admin privileges you see view, edit and authz tabs, and if you don't have the necessary privileges you only see the view tab.

For Packages, you see all tabs whatever your permissions, so there's a link you can click on which will redirect you to the login page.

1305279888000000 1316965357000000
#1137 enhancement kindly rgrp assigned Remove need for statefulness in vdm

Statefulness, especially statefulness for relation (esp m2m) is cause of most of the complexity in vdm. It is required because, atm, revision objects have FKs to continuity objects.

This ticket proposes the following changes:

NB: this could be limited just to case of join tables (leaving state stuff on other tables)

  • Remove FKs from revision to continuity (or allow for them to be nullable).
    • We could just limit this to m2m stuff
  • Delete of an object leads to:
    • Deletion of continuity object
    • Adding an entry in revision table with state set to deleted (we retain state on revision table)

If this is done we will no longer need to worry about filtering on state on relationships as join table will only contain "active" relationships.

If we do this on all tables we remove need for any state awareness in client (e.g. no need to filter tables on active state).

The only disadvantage of this change is that undeletion becomes more problematic (we have to recreate some continuity objects).

1305211628000000 1340631974000000
#1136 enhancement kindly rgrp assigned Move to SessionExtension in vdm

When vdm was created there was no SessionExtension so we use MapperExtension for doing revisioning. Now that SessionExtension? exists we should use it. We can also follow the existing SQLAlchemy recipe: <http://www.sqlalchemy.org/docs/orm/examples.html?highlight=versioning#versioned-objects>

1305210855000000 1340632980000000
#1135 enhancement kindly rgrp assigned Changeset model for vdm

Move to Changeset model for vdm.

A changeset model is like an Audit-Log model in which we just record Changesets with Change-Objects rather than have Revision-Objects for each Object that is revisioned.

This change would also incorporate significant simplication of vdm.

1305209986000000 1340632267000000
#1134 CREP amercader ckan-backlog new CREP0003: Description and Configuration of Harvesters

Proposer: Adrià Mercader

Abstract

The new harvester interface allows to create harvesters for different sources, but right now harvesters don't have many ways to describe and configure themselves. We need a way of allowing them to:

  • Expose their type and other details so they can be used internally and on the UI.
  • Define configuration settings for particular harvester instances.

The Problem

Harvester description

The current UI for adding and editing harvest sources is the same used in ckanext-dgu, and thus the 3 harvester types used in DGU to harvest various GEMINI realted sources are hardcoded in the form. The form will be migrated to a DGU-independent one, so we need the harvesters to provide all the necessary data. There is a current get_type method that returns the harvester type, but for make it compatible with the DGU forms, it returns a machine-readable string (e.g. "CSW Server"), making it error prone.

Arbitrary configuration

In the current implementation, when the harvest process is started, ckanext-harvest looks for all the available plugins that implement the IHarvester interface and calls the appropiate methods for the current stage (gather_stage,fetch_stage,import_stage). At these stages, harvesters have no way of applying arbitrary configuration options, so all harvesters of the same type behave on the same way. For instance, the CKAN harvester needs a way to define the API version to use when harvesting remote instances (Right now, the version 2 is hardcoded on the code).

Specification

Harvester description

Harvesters will need to provide the following information so the UI form can be built:

  • name: machine-readable name (e.g. "waf"). This will be the value stored in the database, and the one used by ckanext-harvest to call the appropiate harvester.
  • title: human-readable name (e.g. "Web Accessible Folder (WAF)"). This will appear in the form's select box.
  • description: a description of what the harvester does (e.g. "A Web Accessible Folder (WAF) displaying a list of GEMINI 2.1 documents"). This will appear on the form as a guidance to the user.

The way to provide it will be an info method that all harvesters must implement, which will return a dictionary with the previous elements:

    {
        'name': 'csw',
        'title': 'CSW Server',
        'description': 'A server that implements OGC's Catalog Service 
                        for the Web (CSW) standard'
    }

Arbitrary configuration

As different harvesters will have very different needs, we need to provide a way to persist arbitrary configuration flags for each harvest source. The more flexible way given the current architecture in my opinion would be to store the configuration options as a JSON encoded object as a property of the harvest source (There already is an unused DB field called config in the database) (Maybe using JsonType??).

This will mean adding an extra field in the harvest source form to allow entering the configuration. This could be just a simple text field where users enter the JSON encoded object or a more clever mechanism (i.e an "Add a configuration flag" link that adds two new text fields for the key and value for each flag, and a mechanism to later build the JSON object). In any case, this should probably be hidden in an "Advance options" section.

Why do it this way

Harvester description

The info method would provide a single point to get all the information related to the harvester, and future properties could be added to the dictionary returned without having to modify the interface.

Arbitrary configuration

There is an already existing config field in the database, so we won't need to change the model. Harvesters could access the config object at any of the stages. Of course they could provide default values in their implementations so users don't need to enter them everytime.

Implementation plan

Deliverables

Risks and mitigations

The highest risk on the harvesters info method side is that harvester implementation don't offer one of the necessary properties (namely name and title). This could fire a warning when showing the UI form or using the CLI.

Participants

Adrià Mercader to do it.

Progress

None yet.

1305108868000000 1339774554000000
#1133 defect johnlawrenceaspden closed worksforme command line rights manipulation doesn't work

It appears that the command

$ paster rights add russianfan admin warandpeace

has no effect, even though

$ paster rights remove russianfan admin warandpeace

works fine. This may be specific to something I've done, could someone confirm?

If it's the case more generally, then I'm assuming this behaviour is untested? Tests should probably be added.

1305054948000000 1324057072000000
#1132 defect johnlawrenceaspden ckan-backlog closed invalid test_authz doesn't run

Trying to run the tests in test_authz.py with

$ nosetests --ckan ckan/tests/functional/test_authz.py results in no tests being run:


Ran 0 tests in 0.840s

OK (SKIP=3)

1304966923000000 1307352675000000
#1131 defect dread dread ckan-v1.4-sprint-7 closed fixed Search param validation exception not caught

Example request:

http://nl.ckan.net/api/2/search/package?q=delft&order_by=&offset=&limit=&tags=

Gives 500 error:

<type 'exceptions.ValueError'>: invalid literal for int() with base 10: ''
1304942023000000 1305537897000000
#1130 enhancement lucychambers assigned First time users

Send users to FAQ first time on CKAN

1304938761000000 1340633514000000
#1129 CREP kindly ckan-v1.5 closed fixed CREP0002: Moderated Edits

Proposer: David Raznick

Abstract.

We are trying to achieve these goals.

  • To get people involved with making edits to CKAN metadata.
  • To have an ownership model as to who can moderate and validate these changes
  • To not put too huge a burden on these owners.

In order to achieve this, a feature which lets anyone edit a package but only let the moderator/owner accept it. The moderator should be able to look at a list of changes and accept the ones that

This cep is not about 'if' we need such a feature, it is about 'how' we go about implementing it. Another cep may needed for the 'if' case.

The Problem

We need the following to be possible.

  • Storing revision of objects that are not the current active one.
  • A way of the user viewing past revisions.
  • Accessing not only the history of a particular object but also of related objects at that time. i.e If a resource related to a package changes we need a way to see this when looking at the package.
  • A robust way of doing this in the face of database schema changes.
  • Make sure database queries are quick.

Solutions.

  1. Store the whole dictization of the package and all its related objects every time you change anything in its dictized representation and only save to the database proper if accepted.

Pros

  • Easy to implement, we already have a preview which makes the dictized form of a package without actually saving it. This will just need to be persisted in some way.
  • Fast retrieval.
  • Potential to store a branching revision tree of changes.

Cons

  • No easy way to remake the dictized packages historically or if there is an there a change in the way we represent packages, i.e schema changes.
  • Will only work for the particular objects we decide to store these changes for.
  • Stores a lot of repeated information
  1. Write specialized queries for every read of the database looking only at the revision tables.

This method requires there to be a change in the way we use VDM, so that we manage statefulness ourselves. We will need to add other states such as 'waiting for approval'.

Pros

  • No specialized storage required
  • Only need to change queries when schema changes
  • Can be made to work easily for other objects

Cons

  • Slower query time on read, as even looking at the last active package will need to do a fairly complicated query.

Implementation details.

1.

A new table with columns id, user, package_id, timestamp, revision_id, parent_id, dictized_package. revision_id should be null unless it is actually persisted to the database. parent_id is the id that this package_dict was changed from.

We could store only the diffs of the dictized_package as long as we assure that everything inside the json is stably sorted, this will make getting the historical data out slower.

Getting out the history of the dictized packages is an intensive task, as it will require replaying the whole history of all the changes and creating the dict for each change. This re-caching will need to be redone for every change we make to dictized representation of a package.

2.

Every normal packages read needs to look at the revision table to see the last accepted change in the dictized representation of the package. We also need to way to get what the dictized representation of the package was like at any point of its revision history. This querying is non-trivial in sql.

Participants

David Raznick to do it.

Progress.

Decided to go with option 2. However we will change the revisioning system to be like the schema attached. This gets rid of difficult querying problems caused by querying the revision tables by adding an end date, meaning you can do range queries.

The better and more normalized version of a revisioning system is outlined https://docs.google.com/drawings/d/1Y7nMgVsrs081Pame2RdbZHlCAlV33ddTZ8VAsab1j-0/edit?hl=en_GB&authkey=CJfd8vsB. We will be a step closer to that, with this change, but we will keep the current vdm more or less, intact.

1304851498000000 1325268100000000
#1128 task dread dread ckan-sprint-2011-10-28 closed fixed Upload Scotland gov data

Upload to ckan.net: https://sites.google.com/site/scotlandsdata/

1304712293000000 1310568028000000
#1127 CREP sebbacon closed fixed CREP0001: Formalise new feature discussion and definition using CREPs

Proposer: Seb Bacon
Seconder: Rufus Pollock

Abstract

When adding major new features to CKAN, a longer, more formal discussion will improve software design quality and documentation, better engage the wider community, and ensure the core team are up to date with latest developments.

I propose a formal process (CREP -- CKAN Revision and Enhancement Proposal) for making this happen.

The Problem

The current workflow for introducing major new features into CKAN is very informal, typically based around one person's great idea, which they've discussed with one or two other people in the team. The originator of the idea is typically the only person with access to all the input they've had through such discussions. Often, the only location of this information is in that person's head.

However, there is a lot of experience embodied in the CKAN community which should be drawn on before making large design decisions. This will lead to better software. Additionally, building consensus in the community around a proposal before implementation ensures positive community engagement and buy-in to new features, making them more likely to be a success.

We aren't great at documenting new features. Documentation after coding is complete is an unrewarding experience for most programmers. Requiring skeleton documentation before code is written is a good discipline that can form the basis of better documentation in the future (e.g. by a writer rather than a programmer).

Specification

Minor features don't require a CREP, and can just be entered in the issue tracking system as a bug or feature. As a rule of thumb, a feature is major if it will take more than a day to implement, or is likely to involve matters of opinion in its design.

A developer may decide that a CREP is too formal and long-winded. The decision to write a CREP is at at their discretion; however, new features MUST always be proposed via email, even if this is just a couple of sentences.

If a feature requires a CREP, the proposer should find a seconder for their idea. This sanity check step happens before a CREP is written to ensure at least the possibility of consensus on the CREP.

Next the proposer should write a CREP, starting by copying and pasting the template on the wiki into a new Trac ticket. This will be with a status of "new" and Type of "CREP". The proposer should notify the ckan-dev mailing list, and possibly the ckan-discuss list for less technical CREPs.

The draft can be discussed via email, verbally, or via the trac ticket. In any case, it is the proposer's responsibility to keep the CREP updated to reflect the current consensus.

Once consensus has been reached, the ticket should be marked with the "accepted" status and assigned to a CKAN release milestone.

When an accepted CREP has been implemented, it should be resolved as "fixed".

If no consensus can be reached on a draft CREP, or for some reason an accepted CREP doesn't get completed, it should be marked as or "wontfix".

If a completed CREP becomes obsolete, it should be marked as "invalid", with a note pointing to the obsoleting ticket(s)

Why do it this way

Given the distributed nature of the core team plus other volunteers, some kind of written procedure is necessary to ensure a fully documented and discussed proposal.

The idea of "Enhancement Proposals" which can be semi-formally proposed and discussed prior to implementation is common in the Open Source world (PEPs, DEPs, PLIPs, to name three).

Existing historic proposals exist, called CEPs. The proposed system is called CREP (CKAN revision or enhancement proposal) to disambiguate it from the legacy proposals, and from the delicious fungus Boletus Edulis.

Giving a formal structure to the proposal is useful as it gives the community a means to identify a CREP that's not had sufficient thought or discussion. An informal email thread can easily be lost and important questions (such as backwards compatibility) overlooked. The use of the proposed template empowers any community member to ask the proposer to expand on rationale, deliverables, etc.

The structure chosen is somewhere between Debian's and Plone's. It aims to give a structure to the debate, a clear start at documentation, and also prompt some thinking about implementation and timescales.

All this policy about structure should not be construed as mandatory. In particular, the later fields in the CREP template regarding Implementation Plan may be omitted if the author doesn't find them helpful.

Some projects (e.g. Debian) keep their enhancement proposals in a versioning repository; others (e.g. Plone) keep them in an issue tracking system. Trac is proposed for CKAN because we already use it for small feature proposals and for team planning. It seems unlikely that change tracking on an individual CREP will be useful; a CREP that changes sufficiently from its original form should probably be marked "obselete" and a new CREP started. Using an issue tracking system also means we can easily track CREPs by state.

Backwards Compatibility

Some [https://bitbucket.org/okfn/ceps/src/76b274888bcf/cep/ legacy enhancement proposals], called CEPs, have previously been started.

They are currently all marked as "active". Any which require discussion should be altered by the proposer to match the new CREP specification and submitted to trac. The original CEP should be updated with a banner at the top pointing a reader to the new CREP.

Any that are now obselete should be clearly marked as such in a banner at the top, pointing a reader to the trac for new CREPs.

Implementation plan

Deliverables

  • This CREP, agreed
  • Support for proposed statuses in Trac
  • Canned reports for listing CREPs in Trac

Risks and mitigations

  • That this CREP is agreed, but rarely acted on. This risk can be mitigated by nominating a CREP champion in the community or core team, whose job it is to say "where's the CREP for that?" and generally own the quality of CREPS

Participants

Seb Bacon: as current Documentation Czar (May 2011), responsible for ensuring CREPs are up to date.

Progress

This document is the entire proposal.

1304601313000000 1305622850000000
#1126 enhancement dread dread ckan-v1.4-sprint-7 closed fixed Exceptions arising from error page

I'm not completely clear what the use case is for loading the error page in this way, but somehow original_request is None and that creates an unnecessary exception with the logic refactor.

http://ckan.net/error/document?__cache=39020485
...
Module ckan.controllers.error:29 in document
<<          original_response = request.environ.get('pylons.original_response')
               # Bypass error template for API operations.
               if original_request.path.startswith('/api'):
                   return original_response.body
               # Otherwise, decorate original response with error template.
>>  if original_request.path.startswith('/api'):
AttributeError: 'NoneType' object has no attribute 'path'
1304586683000000 1304587078000000
#1125 enhancement dread nils.toedtmann closed fixed Debian package "ckan" should not depend on "postgresql"

The debian package "ckan" with the two scripts "ckan-create-instance" and "ckan-instance-maintenance" depends against "postgresql". But "ckan-create-instance" is quite handy even when the DB is remote: it creates all the data dirs with the correct permissions, and the ckan and apache configs.

Please add a flag "--without-local-db" to "ckan-create-instance" and remove the postgres dependancy from the debain package.

1304538095000000 1310134813000000
#1124 enhancement thejimmyg nils.toedtmann ckan-sprint-2011-12-05 closed fixed push apt package python-ckanext-solr into our debian repository

python-ckanext-solr is already available in http://apt-alpha.ckan.org/datanl-dev, but not yet in http://apt-alpha.ckan.org/debian (that is why we had to [pip-install it for DataGM). Please push into main repo.

1304537793000000 1323168156000000
#1123 requirement dread nils.toedtmann closed fixed Please re-package CKAN packages as "noarch"

... or, if the CKAN packages do contain architecture-specific binary code, build packages for i386 too.

Currently, http://apt-alpha.ckan.org/debian only offers packages for amd64, but e.g. "m1.small" EC2 instances are i386.

We would need this in order to migrate the community instances to a packaged based CKAN.

Rufus, pls prioritise.

1304530050000000 1311863806000000
#1122 enhancement pudo dread closed wontfix JSON Extra data not searchable

It is possible to use the CKAN API to insert JSON format data into package extra values, but this data is not found on searching.

Full text from Pascal:

we encountered a Problem concerning accessing Arrays/Lists.

curl -XGET 'http://ckan.net/api/rest/package/hbz_unioncatalog'

will get you amongst others:

 "extras": {"publishingInstitution":
"[u'http://lobid.org/organisation/DE-605',
u'http://lobid.org/organisation/DE-290',
u'http://lobid.org/organisation/DE-38M',
u'http://lobid.org/organisation/DE-98',
u'http://lobid.org/organisation/DE-38',
u'http://lobid.org/organisation/DE-Kn41',
u'http://lobid.org/organisation/DE-82',
u'http://lobid.org/organisation/DE-107',
u'http://lobid.org/organisation/DE-929',
u'http://lobid.org/organisation/DE-Zw1',
u'http://lobid.org/organisation/DE-832']"}

but if I try to query this:

wget
'http://ckan.net/api/search/package?q=lobid&publishingInstitution="http://lobid.org/organisation/DE-605"'

I get only two packages, among the package "hbz_unioncatalog" is
missing. (These two packages have only one value for
"publishingInstitution").

The "extra/publishingInstitution"-Array was uploaded through a "curl
-XPUT ...
 "extras": {
   "publishingInstitution":[
     "http://lobid.org/organisation/DE-605",
     "http://lobid.org/organisation/DE-290",
     "http://lobid.org/organisation/DE-38M",
     "http://lobid.org/organisation/DE-98",
     "http://lobid.org/organisation/DE-38",
     "http://lobid.org/organisation/DE-Kn41",
     "http://lobid.org/organisation/DE-82",
     "http://lobid.org/organisation/DE-107",
     "http://lobid.org/organisation/DE-929",
     "http://lobid.org/organisation/DE-Zw1",
     "http://lobid.org/organisation/DE-832"
   ]
   },
...
1304367510000000 1306747714000000
#1121 enhancement dread closed wontfix JSON extras appear in package edit form mangled

It is possible to use the CKAN API to insert JSON format data into package extra values, but this data is displayed in the package edit form.

Example: Package http://ckan.net/package/hbz_unioncatalog in the API the extra value as a list:

"extras": {"publishingInstitution": ["http://lobid.org/organisation/DE-605", "http://lobid.org/organisation/DE-290"...

yet when you edit the package it loses all the quotes and brackets: http://ckan.net/package/edit/hbz_unioncatalog {{{http://lobid.org/organisation/DE-605http://lobid.org/organisation/DE-290... }}} so when you save the package, the list is mangled into a bad string.

1304367504000000 1307358426000000
#1120 enhancement tsm ckan-backlog new Atom feeds of each tag

Tags could/should have an Atom feed. This would mean that every edit to relevant packages could be easily monitored. See [1].

[1] http://lists.okfn.org/pipermail/ckan-discuss/2011-May/001162.html

1304309285000000 1339774568000000
#1119 enhancement rgrp rgrp ckan-v1.4-sprint-7 closed fixed Fully functional storage extension with file upload

Previous work in #877 and #879 + #853 (storage API). In this ticket:

  • Improve authorization
  • Establish convention for laying out files on disk
  • Add documentation
  • Fix bugs with #879 (does not currently work -- boto may have changed)
1304094382000000 1305037201000000
#1118 defect johnlawrenceaspden closed invalid tests are testing something other than the behaviour seen in the browser

I'm finding that if I try to take an action with insufficient credentials from a test then I often (but not always) get a 401 error, whereas in the browser I get redirected to the login page.

It's a bit worrying that the program in its test environment doesn't behave like it does in the browser.

1304093017000000 1311174062000000
#1117 defect thejimmyg nils.toedtmann closed invalid Depend deb package "ckan" against ubuntu package "python-pastescript"

... otherwise the scripts fails.

1304089619000000 1304277240000000
#1116 defect dread dread ckan-v1.4-sprint-7 closed fixed api search loses boolean q options

filter_by_openness and filter_by_downloadable options don't work when specified as URI parameters. (They do work in qjson parameters)

1304086823000000 1304358664000000
#1115 defect johnlawrenceaspden closed wontfix can have two authzgroups with the same name

If you've got edit permission on an authzgroup, then you can change its name to be the same as another existing authzgroup.

This causes some strange UI effects at worst, and probably causes worse problems somewhere else.

Is there any reason why changing the names of existing authzgroups should be allowed? And if so, name collisions should presumably be guarded against in both the name-changing and creation functions

1304085120000000 1324054704000000
#1114 enhancement dread dread ckan-v1.4-sprint-7 closed fixed CLI for viewing search index of a package

To see what lexemes are generated for debug purposes.

1304078353000000 1304085484000000
#1113 defect kindly kindly closed fixed lists in extras serialized wrongly on get with the api.

Lists are being converted to unicode and then translated into a json when getting from the api.

1304017353000000 1304024611000000
#1112 enhancement dread dread ckan-v1.4-sprint-7 closed fixed Allow searching for all packages
GET api/search/package?q=

returns all packages. This is so you can filter them e.g. by openness, which is not currently possible.

1304007852000000 1304358643000000
#1111 task lucychambers lucychambers closed fixed FAQ - For CKAN

Write CKAN FAQ (Basis can be: http://wiki.ckan.net/FAQ)

Post preliminary questions on:

http://notebook.okfn.org/

1303906561000000 1305881039000000
#1110 enhancement kindly kindly closed wontfix profile ckan

We need to see what areas of ckan are slow.

1303840041000000 1340034394000000
#1109 defect kindly kindly closed fixed When extras has a value other than a string an integrity error occurs in the api.

This is a regression that happened after refactoring the api.

It was shown by

http://pastebin.com/2v7QasZy

1303839943000000 1305124697000000
#1108 enhancement zephod pudo ckan-sprint-2011-09-12 closed fixed Create a more modern theme for CKAN

CKAN looks a bit aged, it should be styled more modernly and some elements could be re-arranged:

  • Collect user info in top bar
  • re-add the logo to ckan.net
  • Remove tags from main menu, replace with /sitemap.xml

Inpiration:

quora.com, github.com, Google Projects, Google Refine, etc.

CKAN.net or CKAN general theme?

To be decided. Suggest we start with ckan.net specific and then backwards integrate (?). Existing ckan.net theme repo:

https://bitbucket.org/okfn/ckan-ckan.net

1303830790000000 1315140879000000
#1107 refactor tidy-up bitesize rgrp closed fixed Move package autocomplete from package controller and move to API

Currently autocomplete method on package controller. This method should be in API (like other autocomplete methods).

Will need to update client code (just forms atm I think).

1303808480000000 1340632612000000
#1106 defect rgrp rgrp ckan-v1.4-sprint-6 closed fixed Bugs related to routes arising from API refactor + removal of default routes

Various bugs I've been encountering:

Latter issue was masked by existence of 'default' routes:

   map.connect('/{controller}', action='index')
   map.connect('/:controller/{action}')
   map.connect('/{controller}/{action}/{id}')

Having these is, I think, bad practice as it is better to be explicit and we should therefore remove asap.

In addition I think we should be cautious about 'default' routes in core such as:

    map.connect('/api/rest/:register', controller='api', action='list',
        conditions=dict(method=['GET'])
        )

As it makes it harder for extensions to introduce their own APIs (here one could perhaps add something at /api/rest/{my-object} but only by using before_map rather than after_map).

1303747360000000 1303834069000000
#1105 defect nils.toedtmann closed invalid test ticket, please ignore

.

1303508261000000 1303508330000000
#1104 defect dread johnlawrenceaspden ckan-v1.4-sprint-7 closed fixed create-test-data doesn't index the packages it creates

With the default test data created by

paster db clean paster db init paster create-test-data

going to the front page shows two recently changed packages A Wonderful Story A Novel by Tolstoy

But none of those words "Wonderful", etc produce search hits. In fact as far as I can tell, nothing produces any search hits.

That isn't true on ckan.net, where searching seems to work.

1303494635000000 1303920791000000
#1103 defect johnlawrenceaspden closed duplicate searching broken in development setup

With the default test data created by

paster db clean paster db init paster create-test-data

going to the front page shows two recently changed packages A Wonderful Story A Novel by Tolstoy

But none of those words "Wonderful", etc produce search hits. In fact as far as I can tell, nothing produces any search hits.

That isn't true on ckan.net, where searching seems to work.

1303494538000000 1303744575000000
#1102 defect johnlawrenceaspden closed duplicate searching broken in development setup

With the default test data created by

paster db clean paster db init paster create-test-data

going to the front page shows two recently changed packages A Wonderful Story A Novel by Tolstoy

But none of those words "Wonderful", etc produce search hits. In fact as far as I can tell, nothing produces any search hits.

That isn't true on ckan.net, where searching seems to work.

1303491912000000 1303744552000000
#1101 enhancement sebbacon ckan-backlog new Integrate googlanalytics into site nav

There's a stats plugin (e.g. at http://trac.ckan.org/ticket/832).

Output from the googleanalytics plugin should append to that page, if the stats plugin is present.

Possibly the stats plugin and the googleanalytics plugin should be merged?

Finally, if the stats plugin is active, then a link to the stats page should be added to the main site footer.

1303393926000000 1339774582000000
#1100 defect dread ckan-v1.4-sprint-6 closed fixed Get buildbot running on ckan branches

Need some changes to pip-requirements files in release branches.

1303385267000000 1303406103000000
#1099 defect johnlawrenceaspden closed wontfix strange interactions between two browsers while playing with authz groups

While playing with the authorization groups, trying to design tests, I found that it was necessary to log in as two different users with two different browsers. Often actions of one user would cause server errors in the other user's browser.

I don't have a reproducible test case, but it happens fairly often so it shouldn't be too difficult to get one.

1303380824000000 1324057106000000
#1098 task dread dread ckan-v1.4-sprint-6 closed fixed --ckan-migration tests not initialised correctly

Only tests with failing --ckan-migration fail, due to authz not being initialised.

1303377336000000 1303406017000000
#1097 enhancement dread dread ckan-v1.4-sprint-6 closed fixed Sidebar hideable

The web interface has a sidebar (#primary) which should be hidden in some pages. This is for QA extension and useful for package new and edit pages. Must be compatible with DGU theme.

1303293416000000 1303293476000000
#1096 defect rufuspollock pudo ckan-future new [super] CKAN Hosted

Many users of CKAN want to have their own instance without much effort. Setting these up in separate places is a maintenance nightmare, we should much rather have some tenant separation in core CKAN. Some ideas:

  • introduce model.Site and c.site
    • site has: custom CSS, extra_template_path, title, languages list, package_form, group_form (all configured via web UI)
  • Subdomain detector to activate sites.
  • use site in Authorizer instead of System, have a NullSite? for global things
  • allow cross-site search
  • packages are in a list of sites, m:n rather than 1:n
    • list of sites is string-based, can contain sites not in site table to express harvested external material which is not editable locally.
1303235062000000 1339774484000000
#1095 defect kindly kindly closed fixed add way to pass in schema to logic layer.

We need a way to pass in schemas to the logic layer to deal with edge cases.

1303221854000000 1310134959000000
#1094 enhancement thejimmyg thejimmyg ckan-v1.5 closed duplicate [super] Refactor the Auth System

Here are some proposed changes related to CKAN's authorization system - they aren't very big, but should provide for some forthcoming use cases including #787.

Two man reasons for the changes are:

  • We have a completely refactored architecture now which introduces a logic layer. These Auth changes are designed to better support the way we work with that layer.
  • Different CKAN extension apps may need radically different authentication/authorisation so we need to allow whatever we have to be override-able.

The first two changes revolve around the is_authorized method, which is called by the logic layer to ask whether a particular user (e.g. Bob) is allowed to do a certain action (e.g. edit) on a certain object (e.g. Package).

  1. The first thing the is_authorized method is a hook to a plugin

which *overrides* the current call with its own implementation (note: in previous discussions we have considered allowing a chain of plugins, no longer!)

Reason: authorization can be completely delegated to another system (or partially)

  1. is_authorized method currently takes (username, action, object)

but for action=create_package, the object supplied is System, and for action=edit the object supplied is the package. Instead action should always be the string name of a function in the logic layer and object should always be the object passed to that function. This means our auth system is based around the actual actions we are performing (rather than a model them) and with the actual data that forms the action (rather than a related object). You never need a System object in this model.

  1. Rename these two classes to better reflect what they are
  1. Rename the Editor role to PriveledgeUser? since Editors sometimes can't edit.

Although this sounds a bit radical we already have auth extensions.

Read-only CKAN Web UI

(Additional requirement from #764)

Whilse using CKAN web interface, you are not tempted to edit stuff:

  • You know at all times this CKAN is read-only
  • All editing facilities are still seen but greyed-out with an indication why it is.
1303117973000000 1311173649000000
#1093 defect dread dread ckan-v1.4-sprint-6 closed fixed 500 errors on GET to api/rest/licenses

CKAN gets its license list from a license service, which can be a local file, but is often the http://licenses.opendefinition.org/2.0/ckan_original server. This server is currently flakey, but I think we only request the list on start up. The problem is we query it much more often than required. It is queried for every request to api/rest/licenses, and we are returning lots of 500 errors when the license server is timing out.

1302862261000000 1302865470000000
#1092 defect kindly kindly ckan-v1.4-sprint-6 closed fixed refactor logic layer to seperate out api, form logic

The logic layer is a bit too api centric. Make the reusable parts separate in preparation for the wui refactor.

1302777929000000 1305570822000000
#1091 defect johnlawrenceaspden closed wontfix usernames of users logged in using open ids are strange

If I use my gmail openID to log into a CKAN instance, then my username is:

https://www.google.com/accounts/o8/id?id=AItOawnduohQ5RgXdPJKHiq-SIPbvCBqUaERuEQ

This seems a bit odd.

1302701460000000 1323102767000000
#1090 defect dread dread ckan-v1.4-sprint-6 closed fixed Visitor can't create packages on new CKAN install

Default visitor roles in default config is reader, not anon_editor.

Problem caused by changes in #1066 (released in 1.3.3)

New installs will be affected, although simple to just increase permissions when the installer realises a visitor can't create packages.

The solution to the config getting out of sync with the code like this is to not have the default_roles in the config - refer to the code in the configuration instructions.

1302635219000000 1302635699000000
#1089 enhancement dread dread ckan-v1.4-sprint-6 closed fixed Check for "--ckan" when running nosetests

(because if you forget, you get difficult to understand errors, and more than one person has tripped up on this)

1302631189000000 1302631733000000
#1088 defect wwaites ckan-v1.4 closed fixed content-type autonegotiation is wonky

in ckan/controllers/package.py around line 130 it does some strange things...

perhaps replace with https://github.com/wwaites/autoneg

and handle redirection of these content types:

application/rdf+xml
application/turtle
text/plain
text/x-graphviz
1302630261000000 1303035487000000
#1087 enhancement wwaites ckan-sprint-2011-11-21 closed fixed version and contact info api call

a simple api call that returns data like this:

{ "version": ckan_software_version,
  "contact": { "name": "Some Admin", "mbox": "[email protected]" },
  "description": "Site Description",
  "url": "http://canonical.name.ckan.net/"
1302628944000000 1320866159000000
#1086 defect thejimmyg johnlawrenceaspden closed wontfix no way to delete authorization groups from web interface

as title.

1302625333000000 1323346552000000
#1085 defect dread johnlawrenceaspden closed fixed local development copy of ckan depends on existence of ckan.net

ckan.net appears to have either gone down or be running ultra slowly.

this means that ckan copies running locally on my machine run very slowly indeed.

is this behaviour desirable?

This command finds lots of http://~~~ckan.net references in python, html and javascript files:

find ~/pyenv/src \( -name "*.py" -or -name "*.html" -or -name "*.js" \) -print0 | xargs -0 -e grep --color -nH -e "http://.*ckan.net"

output for reference:


/home/okfn/pyenv/src/ckan/ckan/init__.py:5:Network (CKAN) site: http://www.ckan.net. /home/okfn/pyenv/src/ckan/ckan/lib/create_test_data.py:346:<http://ckan.net/> /home/okfn/pyenv/src/ckan/ckan/lib/rdf.py:3:DOMAIN = 'http://ckan.net' /home/okfn/pyenv/src/ckan/ckan/lib/rdf.py:4:CKAN_NAMESPACE = 'http://ckan.net/#' /home/okfn/pyenv/src/ckan/ckan/lib/talis.py:60: 'ckan':'http://ckan.net/ns#', /home/okfn/pyenv/src/ckan/ckan/public/scripts/bookmarklet.js:2: f='http://ckan.net/package/new?url='+encodeURIComponent(window.location.href)+'&title='+encodeURIComponent(document.title); /home/okfn/pyenv/src/ckan/ckan/public/scripts/test_bookmarklet.html:16: addtockan.src='http://ckan.net/scripts/bookmarklet.js'; /home/okfn/pyenv/src/ckan/ckan/public/scripts/test_bookmarklet.html:27: <p><strong>Proper bookmarklet (compressed -- need to escape &amp;):</strong> <a href="javascript:(function(){f='http://ckan.net/package/new?url='+encodeURIComponent(window.location.href)+'&amp;title='+encodeURIComponent(document.title);if((n=document.getElementsByName('description')[0])&amp;&amp;(d=n.content)){f+='&amp;notes='+encodeURIComponent(d);}a=function(){if(!window.open(f)){location.href=f;}};if(/Firefox/.test(navigator.userAgent)){setTimeout(a,0)}else{a()}})()">Add to CKAN</a> /home/okfn/pyenv/src/ckan/ckan/templates/home/license.html:31: For convenience, all material - including all package, tag and revision information - is available in bulk, in the form of a full dump of the CKAN database. This (gzipped) dump file is updated daily and can be downloaded from <a href="http://www.ckan.net/dump/">http://www.ckan.net/dump/</a>. /home/okfn/pyenv/src/ckan/ckan/tests/dictization.py:71: 'notes': u'Some test notes\n\n### A 3rd level heading\n\nSome bolded text.\n\n*Some italicized text.*\n\nForeign characters:\nu with umlaut \xfc\n66-style quote \u201c\nforeign word: th\xfcmb\n \nNeeds escaping:\nleft arrow <\n\n<http://ckan.net/>\n\n', /home/okfn/pyenv/src/ckan/ckan/tests/dictization.py:137: 'notes': u'Some test notes\n\n### A 3rd level heading\n\nSome bolded text.\n\n*Some italicized text.*\n\nForeign characters:\nu with umlaut \xfc\n66-style quote \u201c\nforeign word: th\xfcmb\n \nNeeds escaping:\nleft arrow <\n\n<http://ckan.net/>\n\n', /home/okfn/pyenv/src/ckan/ckan/tests/dictization.py:447: 'notes': u'Some test notes\n\n### A 3rd level heading\n\nSome bolded text.\n\n*Some italicized text.*\n\nForeign characters:\nu with umlaut \xfc\n66-style quote \u201c\nforeign word: th\xfcmb\n \nNeeds escaping:\nleft arrow <\n\n<http://ckan.net/>\n\n', /home/okfn/pyenv/src/ckan/ckan/tests/dictization.py:458: 'notes': u'Some test notes\n\n### A 3rd level heading\n\nSome bolded text.\n\n*Some italicized text.*\n\nForeign characters:\nu with umlaut \xfc\n66-style quote \u201c\nforeign word: th\xfcmb\n \nNeeds escaping:\nleft arrow <\n\n<http://ckan.net/>\n\n', /home/okfn/pyenv/src/ckan/ckan/tests/functional/api/base.py:178: assert '"ckan_url": "http://test.ckan.net/package/annakarenina"' in msg, msg /home/okfn/pyenv/src/ckanclient/ckanclient/init__.py:116: api e.g. http://ckan.net/api rather than http://ckan.net/api/rest) /home/okfn/pyenv/src/ckanclient/ckanclient/init__.py:261: :param base_location: default *http://www.ckan.net/api* /home/okfn/pyenv/src/ckanclient/ckanclient/init__.py:267: base_location = 'http://www.ckan.net/api'


1302620434000000 1302625314000000
#1084 task wwaites wwaites ckan-v1.4-sprint-6 closed fixed ckan.net RDF links changed

need to make some changes for the links to semantic.ckan.net. it should use http://semantic.ckan.net/record/<package_id> now

append .rdf, .ttl, .nt, .dot, .json (even .html for an ugly table) to taste (or just leave off the suffix and let content negotiation take care of it)

the base url is changed, but it now uses id not name.

see for example:

1302616717000000 1304934534000000
#1083 defect johnlawrenceaspden johnlawrenceaspden ckan-v1.5-sprint-1 closed fixed userobjectroles added twice can't be deleted

the add_user_to_role/remove_user_from_role functions are asymmetrical in that the add function is happy to add the same role twice but the remove asserts that it's only in the table once and crashes if that's not true.

an attempt has been made to guard against this, but fails, I think because the add functions rely on the caller committing the change to the db.

same problem affects corresponding authorization_group functions

I'll try to sort this out. Making a note here.

1302550660000000 1305537827000000
#1082 defect johnlawrenceaspden closed fixed language changes behave strangely

Set language to Greek, flash message says 'Language set to: English', but page is now about half in Greek.

Set language back to English causes server error:

AttributeError?: 'NoneType?' object has no attribute 'path'

Module ckan.controllers.error:29 in document view

if original_request.path.startswith('/api'):

However going to a new page reveals that it's back to English

1302541989000000 1315917217000000
#1081 defect johnlawrenceaspden johnlawrenceaspden closed fixed can't remove user from authz group

I've found that if I make an authorization group I sometimes can't remove myself from it. I've no idea why. I can add and remove other users. I'll investigate, just making a note of it here.

1302541056000000 1303489474000000
#1079 enhancement kindly rgrp ckan-v1.4-sprint-5 closed fixed Refactor API to use new logic layer and dictization
  • Convert current api saves to the new standard dict format.
1302509530000000 1302777504000000
#1078 enhancement kindly rgrp ckan-v1.5 closed fixed Refactors WUI controllers and forms to use logic layer
  • Deserialize forms to new dict format.
  • Replace controllers/forms to use dictization.
1302509347000000 1305828973000000
#1077 enhancement kindly rgrp ckan-backlog new Move to simpler vdm system

Option 1: 'Changeset' Model

See ticket:1135 for vdm ticket. This would involve a) moving to changeset in vdm b) doing the migration in ckan to support this.

Have developed a new "changeset" based model for revisioning in vdm.

Implementation

  • The main challenge with this change is schema and data migration

Every revisioned object has a revision_id and revision attribute.

Approximate algorithm:

Revision -> Changeset

for revtype in [PackageRevision, ...]:
    for pkgrev in package_revision:
        changeset = lookupchangeset(package_revision)
        ChangeObject(cset, (table, id), dictize(pkgrev))

Question:

  • does pkg include tags attributes or not? or we have to dictize, pkgrev, pkg2tagrev, and tag. Probably the latter.

Option 2: Simplify Revision Object Model

Just use a simpler vdm, see ticket:1136 (move to SessionExtension) and ticket:1137 (remove need for statefulness in vdm).

Discussion

Advantage of Option 1 versus 2:

  • Easier support for pending state and similar behaviour
  • No need to introduce new tables (and hence migrations) when making something revisioned (or not).

Disadvantages

  • Migration is required
  • More difficult to query revision history.
    • Could be addressed by having ChangeObject have separate cols for table name and id but would likely be more difficult.
  • Performance (?)
    • Have one big ChangeObject table to query when looking at changed objects rather than many revision tables.
      • Not sure this is a biggie as even with Revision model biggest revision object tables are probably on the order of the ChangeObject table

Conclusion

Implement Option 2 and leave Option 1 for present.

Option 1 includes Option 2 so it seems that that is required in either case (so we may as well with Option 2).

Option 1 requires significant effort (esp migration) so leave for present and then review the situation at some later date.

1302304464000000 1340034345000000
#1076 enhancement johnlawrenceaspden rgrp ckan-v1.4 closed fixed Improve revision and package purge system

Purging Revisions

  • Delete button displayed on:
    • /revision/list
    • (/package/history)
      • /package/history is problematic because html does not allow nested forms and we already have form for doing diff/comparison.
    • /revision/{id}
  • Delete button submits to delete action on revision and changes revision state to 'deleted'.
    • undelete button now displayed and revisions are marked as deleted in some way (e.g. greyed out?)
  • Sysadmins then visit /ckan-admin/trash which lists all revisions with deleted state. There is a large button: "Empty trash" (irreversible). Click button purges all revisions with deleted state.

Purging Packages

  • Put into deleted state.
  • Listed on /ckan-admin/trash
  • Separate Empty trash button which deletes all associated revisions.
    • Should be separate from Empty trash for revisions

Current system

  • Single purge link on revision listing if a sysadmin which permanently purges the revision and all associated changes (without confirmation atm!)
1302283442000000 1303236302000000
#1075 enhancement johnlawrenceaspden rgrp ckan-v1.4-sprint-6 closed fixed Administrative dashboard - Edit Authorization related to System object

Roles on System object are important because admin role on system equates to being a 'sysadmin' (i.e. able to do anything).

  • Make users sysadmin (either as separate action or as part of editing roles on system object)
  • /authz subpage for editing roles on system object
    • Add and update user roles
    • Add and update authz group roles
    • List actions associated to roles at top of page (extra points for checkbox table with editability)
  • Document on http://wiki.ckan.net/Authorization what roles on System object 'mean' esp sysadmin role on System

Related Tickets

  • super ticket: #833
  • authz lib improvement and authztool refactor #1050
1302279799000000 1303227982000000
#1074 enhancement rgrp rgrp ckan-v1.5 closed fixed Refactor authz web user interface to have common code and templating

Currently repeat the same template and code across Package Authz, Group Authz, and Authz Group authz.

Having now implemented a new, cleaner setup in ckanext-admin we should port this back into core.

  • Common template code (checkbox template)
  • Logic code (or just common code) for wiring into authz system
  • Look for all places thoroughout the system where usernames, authzgroups or groups need to be typed into boxes, and make sure that they auto-complete appropriately.

Will also deliver a significant improvement in the form of ajax user lookup.

1302271586000000 1314303581000000
#1073 enhancement dread dread ckan-v1.4-sprint-5 closed fixed Search index checker

Tool that checks which packages have not been indexed.

Required for DGU: https://trac.dataco.coi.gov.uk/projects/datagov/ticket/940

1302185444000000 1302185825000000
#1072 enhancement dread dread ckan-v1.4-sprint-5 closed fixed Add filters to authztool

It takes several minutes to print the 'rights' on DGU, which is annoying when you only want to grep for a few lines. Much quicker than grepping is to filter in the query.

1302106311000000 1302106474000000
#1071 defect dread dread ckan-v1.4-sprint-5 closed fixed Package history API moved to /api/rest/package/revisions

api/rest/package_history is not RESTful or follow API naming conventions. Therefore move it to /api/rest/package/revisions

Also, API docs incomplete.

1301937882000000 1301943180000000
#1070 enhancement rgrp rgrp ckan-v1.5 closed fixed Plan a new domain model and layer architecture for CKAN

See http://wiki.ckan.net/Domain_Model especially section on v2.

  • New domain model is planned but not yet finally agreed.
  • Layer architecture is complete and implemented
1301910940000000 1310117129000000
#1069 enhancement tobes rgrp assigned Stub datasets (request for datasets)

Idea is to have stubs for datasets that someone wants but don't yet exist (or haven't been discovered) in the way one has stub pages on a wiki.

We could do this within the existing model by a slight 'abuse' - create a dataset and mark it with a special tag e.g. todo.does-not-yet-exist or similar ...

(Just as we have datasets listed that exist but aren't available ...)

Alternative would be to have a request for datasets subsystem.

I prefer the stub dataset model because it's simpler, provides a simple workflow (as a dataset is found or comes into existence), and the package page provides a natural space in which to accumulate information about what is wanted and what exists.

Implementation

  • Agree a new dedicated tag. e.g. todo.does-not-exist
1301666919000000 1340632215000000
#1068 defect dread dread ckan-v1.4-sprint-5 closed fixed metadata_modified problem

This test has been failing since the clocks changed:

======================================================================
FAIL: ckan.tests.models.test_package.TestPackageRevisions.test_02_metadata_created_and_modified
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/dread/hgroot/pyenv-ckan2/lib/python2.6/site-packages/nose-0.11.3-py2.6.egg/nose/case.py", line 186, in runTest
    self.test(*self.arg)
  File "/home/dread/hgroot/ckan2/ckan/tests/models/test_package.py", line 283, in test_02_metadata_created_and_modified
    assert out == exp, (out, exp)
AssertionError: (datetime.datetime(2011, 4, 1, 10, 45, 50, 875509), datetime.datetime(2011, 4, 1, 9, 45, 50, 875509))

----------------------------------------------------------------------
1301652085000000 1302109505000000
#1067 enhancement dread dread ckan-v1.4-sprint-5 closed fixed CLI for loading/dumping complete databases

Use 'db dump' and 'db load' for 'pg_dump' and 'psql -f' of a database. Use pylons config to find out database options.

1301645463000000 1302186503000000
#1066 enhancement dread dread ckan-v1.4-sprint-5 closed fixed Default reader role too permissive

The definition of the 'reader' role includes creating packages, which is too permissive for some CKAN instances (e.g. DGU). 'Reader' suggests only reading, so I think this role should avoid creating and editing.

All projects so far want all roles to be able to create users, so this stays as a Reader action for now, as a convenience.

Implementation:

  • Action.PACKAGE_CREATE removed from reader's default_role_actions
  • Visitor has a new default role, called 'anon_editor' which can edit packages, but not groups / auth groups - you have to log in for that.
  • Migration script not needed?
  • Code comments written, to make clear the suggested policy
1301645250000000 1301932136000000
#1065 enhancement zephod johnlawrenceaspden ckan-v1.6 closed fixed [super] Change Authorization System

Child tickets

  • #1198 Publisher hierarchy
  • #1050 Authz lib improvement and refactor of ckan/lib/authztool.py
  • #1004 Group creation instructions missing
  • #1099 Strange interactions between two browsers while playing with authz groups
  • #1115 can have two authzgroups with the same name
  • #1133 command line rights manipulation doesn't work
  • #1138 minor navigations behave inconsistently

Old ticket description:

  1. Change name of AuthzGroup? to UserGroup? to reflect what it is for
  1. Get rid of Roles, and replace them with direct assignment of actions, even though there are many actions, and extensions can add arbitrary ones.
    • Debatable whether we should cut the number of actions to correspond to the three roles defined by the base system.
    • Have a method of finding roles (or, in future, actions) relevant to a given protection object (e.g. FILE-UPLOAD(ER) not relevant to Packages)
  1. Change UserGroups? so that they can have a hierarchical structure,

More info on Hierarchy change

e.g. UserGroup? NHS contains the User nhsysadmin, as well as the UserGroups? SURREY and BERKS, which themselves contain users.

One user in SURREY is Simon the Sysadmin, who has permissions on the whole system. His permissions should not leak out to other users or groups, and user permissions generally should not.

Each Group has permissions over various objects.

A user has permissions in his own right, and also has the permissions of his own group, and of all the groups contained in his group, and so on recursively.

Algorithm:

possible(user, action, package):

if user has permission for action on package

or any of have that permission

or any of his groups group-children (but not user-children), and so on recursively have the permission.

1301508331000000 1324550041000000
#1064 defect amercader closed duplicate Remove Workers from ckanext-queue

The current implementation of Workers in ckanext-queue is broken. Basically the various consume / callback functions expect three arguments (routing_key, operation, payload) when they are in fact receiving only two of them (message_data, message). This is fairly easy to fix, but the question is if Workers add an extra complexity to use the messaging library directly.

1301417891000000 1323169787000000
#1063 defect sebbacon sebbacon closed fixed Groups listing widget on package screen shouldn't show group name by default

I've been asked if we can do something about the overflow of the Group name in the right hand column on this page:

http://register.data.overheid.nl/package/europese-aanbestedingen

The reason is that the list display for groups is in the form "group_tltie (group_name)", and of course group_name can't have spaces and so can't wrap nicely.

I was wondering if there's a good reason why we don't only display group_title (if it exists) and group_name only when there's not a title?

1301408459000000 1302514033000000
#1062 defect johnglover sebbacon ckan-backlog assigned Data preview encoding error

The preview of "Species Misc Turtle Download" at http://ckan.net/package/taxonconcept results in the following error:

Unable to Preview - Had an error from dataproxy: Data Transformation Error (Data transformation failed. Reason: 'utf8' codec can't decode byte 0x8b in position 1: unexpected code byte

1301396143000000 1311773731000000
#1061 enhancement dread dread ckan-v1.4-sprint-5 closed invalid Orphaned home/license page

No links to home/license and it contains out of date references to knowledgeforge. Remove it.

1301392968000000 1301922350000000
#1060 defect dread ckan-v1.4-sprint-5 closed fixed Spreadsheet importer tries to import readonly keys

e.g. we just added notes_rendered and that is read in as an extra field. Tests failing in ckanext-importlib

Also related: we are missing lost metadata_created and metadata_modified in the dumps.

1301312210000000 1301312487000000
#1059 defect dread dread ckan-v1.4-sprint-5 closed fixed Loader coping better with poor search indexing

Loader currently checks for same name, but also should check for name_, name etc.

1301310596000000 1301312516000000
#1058 defect dread dread ckan-v1.4-sprint-4 closed fixed Give 400 error (not 500) for invalid locale or package_form

Examples which prompt annoying exception emails:

http://ckan.net/locale?locale=ja
Module ckan.i18n:21 in set_session_locale
           assert locale in _KNOWN_LOCALES

A bot has caused these:

http://ca.ckan.net/package/new?package_form=gov
Module ckan.forms.registry:32 in get_fieldset
               raise ValueError('Could not find package_form name %r in those found: \n%r' % (package_form, [en.name for en in entrypoints]))
ValueError: Could not find package_form name u'gov)' in those found: ['gov', 'standard', 'ca']
1301302303000000 1301303315000000
#1057 defect dread closed fixed JSONP parameter isn't escaped
$ curl "http://127.0.0.1:5000/api/rest/package/annakarenina?callback=<script>jsoncallback"

gives:

<script>jsoncallback({"id": "c10ebd31-5b45-4f6f-885d-dca9b18caec4", "name": "annakarenina", "title": "A Novel By Tolstoy",

which could run script code in the client who made the call.

One idea for filtering: http://tav.espians.com/sanitising-jsonp-callback-identifiers-for-security.html Maybe just better to have a restricted whitelist of characters to be even more sure.

Same as: https://trac.dataco.coi.gov.uk/projects/datagov/ticket/906

1301078389000000 1329150236000000
#1056 defect dread pudo ckan-v1.4-sprint-6 closed fixed User links for OpenID users are broken

Use case:

  • Login using OpenID
  • Click on 'My account' - results in 404

Solutions:

  • User user.id instead of their name
  • Escape the URL properly.
1301060249000000 1302882616000000
Note: See TracReports for help on using and creating reports.