Ticket #2733 (new enhancement)

Opened 22 months ago

Last modified 21 months ago

Datastore logic functions

Reported by: johnglover Owned by: johnglover
Priority: major Milestone: ckan-v1.9
Component: ckan Keywords:
Cc: Repository: ckan
Theme: none

Description (last modified by johnglover) (diff)

Where does the data go?

In a postgres database configured by the ckan.datastore_write_url config option which is a sqlalchemy url.

The user should have rights to create tables.

Whats the api like?

We will just implement it as logic functions like the rest of CKAN and will part of core. After that we may add some nicer api functions that use these but that is a secondary concern.

What are the initial logic functions?

  • datastore_create
  • datastore_delete
  • datastore_show

What is the JSON input format for datastore_create

To begin with it can have the following keys. It is fairly consistent with Max Ogdens' gut servers. Except adds resource_id.

resource_id: resource_id # the data is going to be stored against.
fields: a list of dictionaries of fields/columns and their extra metadata.
records: a list of dictionaries of the data eg  [{"dob": "2005", "some_stuff": ['a', b']}, ..]
  • The first row will be used to guess types not in the fields and the guessed types will be added to the headers permanently. Consecutive rows have to conform to the field definitions.
  • rows: can be empty so that you can just set the fields
  • fields are optional but needed if you want to do type hinting or add extra information for certain columns or to explicitly define ordering.

eg: [{"id": "dob", "type": "timestamp" }, {"id": "some_stuff", "type": "text"}, ...]. A header items values can not be changed after it has been defined nor can the ordering of them be changed. They can be extended though.

  • Any error results in total failure!! For now pass back the actual error.
  • Should be transactional

What json does datastore_delete take?

resource_id: resource_id # the data is going to be deleted.
filters: dictionary of matching conditions to delete
    e.g  {'key1': 'a. 'key2': 'b'}  this will be equivalent to "delete from table where key1 = 'a' and key2 = 'b' ".
    No filters (either not present or not defined) then delete the table. If we want truncate then add truncate: true to truncate the table.

What json does datastore_search take?

resource_id: resource_id # the data is going to be selected.
filters : dictionary of matching conditions to select
    e.g  {'key1': 'a. 'key2': 'b'}  this will be equivalent to "select * from table where key1 = 'a' and key2 = 'b' "
q: full text query
limit: limit the amount of rows to size default 100
offset: offset the amount of rows
fields:  list of fields return in that order, defaults (empty or not present) to all fields in fields order.
sort: comma separated field names with ordering e.g "fieldname1, fieldname2 desc"

Some free code: https://gist.github.com/3163864

What json does datastore_search return?

fields: same type as datastore_create accepts (i.e. with metadata)
offset: The same offset that was supplied in datastore_show
limit: The original limit
filters: The filters that were applied in data_show
total: # total matching records without size or offset
records: [same as data_create] # list of matching results

On error will return:

__error__: … sql error …

What types are allowed?

Aim to support as many postgres/postgis types that have string representations.




Each row in a table will be given an _id column which has an id generated by us which you can use in queries.

Other Features

Each row will store the _full_text index of all the data in the row. At some later point there will most likely be a way to index fields add constraints etc.

Change History

comment:1 Changed 21 months ago by johnglover

  • Description modified (diff)
Note: See TracTickets for help on using tickets.