Ticket #888 (closed enhancement: fixed)
Improvements to the dataproxy and the data API
| Reported by: | rgrp | Owned by: | johnglover |
|---|---|---|---|
| Priority: | major | Milestone: | ckan-sprint-2011-10-28 |
| Component: | ckan | Keywords: | |
| Cc: | | Repository: | ckan |
| Theme: | none | | |
Description
The first version of the dataproxy and data API is working (ticket:698), but we have identified a variety of important improvements. (These should be split into sub-tickets ...):
For dataproxy:
- Testing for dataproxy
- Can start by using known good remote urls (moving forward could switch to providing/mocking these locally)
- Remove the content-length requirement for csv: just read the first x rows (up to some configurable maximum)
- Google docs style row/column selections
- Use the swiss library - https://bitbucket.org/okfn/swiss
- Support google docs spreadsheets (format = service/gdocs/ccc or gdocs/ccc or gdocs/spreadsheet)
- Handle redirects for content-length?
- Ignore the resource type if it is not recognized and fall back to trying to identify it from the extension (or mime-type?)
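Two of the items above (reading only the first x rows, and falling back from the declared resource type to the extension or mime-type) could be sketched roughly as follows. This is only an illustration in Python 3, not the dataproxy code: the names `read_first_rows`, `resolve_format`, `KNOWN_FORMATS`, `MIME_TO_FORMAT` and the default cap of 1000 rows are all hypothetical.

```python
import csv
import itertools
import posixpath
from urllib.parse import urlparse

# All names and values below are assumptions made for illustration.
KNOWN_FORMATS = {"csv", "xls"}
MIME_TO_FORMAT = {"text/csv": "csv", "application/vnd.ms-excel": "xls"}
DEFAULT_MAX_ROWS = 1000  # stands in for the "configurable maximum"


def read_first_rows(lines, max_rows=DEFAULT_MAX_ROWS):
    """Parse at most max_rows CSV rows from an iterable of text lines.

    csv.reader pulls lines lazily, so only as many lines as the requested
    rows need are consumed and the total size never has to be known.
    """
    return list(itertools.islice(csv.reader(lines), max_rows))


def resolve_format(declared, url, mime_type=None):
    """Pick a format: declared type first, then URL extension, then mime-type.

    Returns None when nothing recognizable is found.
    """
    if declared and declared.lower() in KNOWN_FORMATS:
        return declared.lower()
    # Use only the path part of the URL so query strings do not interfere.
    ext = posixpath.splitext(urlparse(url).path)[1].lstrip(".").lower()
    if ext in KNOWN_FORMATS:
        return ext
    if mime_type:
        # Drop parameters such as "; charset=utf-8" before the lookup.
        return MIME_TO_FORMAT.get(mime_type.split(";")[0].strip().lower())
    return None
```

In a proxy, `lines` would typically be a streamed response body decoded line by line, so a missing Content-Length header would no longer matter for the preview.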
For dataapi:
- Ensure we pass on the resource format as part of the redirect, i.e. /api/data/{id} -> {dataproxy}?url={resource-url}&type={resource-type}
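As a rough sketch of the redirect target described above, the data API could build the dataproxy URL like this. The function name `dataproxy_redirect_url` and the example hosts are hypothetical; only the `url` and `type` query parameters come from the ticket.

```python
from urllib.parse import urlencode


def dataproxy_redirect_url(dataproxy_base, resource_url, resource_type=None):
    """Build {dataproxy}?url={resource-url}&type={resource-type}.

    The type parameter is omitted when the resource has no format set.
    urlencode percent-escapes the resource URL so it survives as one value.
    """
    params = {"url": resource_url}
    if resource_type:
        params["type"] = resource_type
    return dataproxy_base + "?" + urlencode(params)


if __name__ == "__main__":
    print(dataproxy_redirect_url(
        "http://proxy.example.org/",      # assumed dataproxy location
        "http://example.org/data.csv",    # assumed resource URL
        "csv",
    ))
```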
Change History
comment:2 Changed 3 years ago by thejimmyg
- Owner changed from Stiivi to thejimmyg
- Repository set to ckan
- Theme set to none
- Status changed from new to assigned
I don't think any progress has been made on this for a bit so I'm assigning it to me.
comment:3 Changed 3 years ago by shevski
- Owner changed from thejimmyg to johnglover
- Milestone changed from ckan-v1.5 to ckan-current-sprint
comment:4 Changed 3 years ago by johnglover
- Status changed from assigned to closed
- Resolution set to fixed
Dataproxy / Dataapi now deprecated in favour of the combination of new QA archive / process commands and the webstore.
Changes in relation to Dataproxy / Dataapi:
- Currently only CSV files are supported, but there are plans to add support for Excel and Google Docs spreadsheets soon.
- Uses David Raznick's CSV parser instead of Brewery for parsing, which handles messy CSV data better.
Changes in relation to old QA functionality:
- decoupled archiving (downloading) and QA process
- added a new 'process' command which parses downloaded files and adds them to a local webstore
Closing for now, any improvements/feature requests should be in tickets relating to either the QA functionality or the webstore.
Changes to Data Proxy:
Changes: https://bitbucket.org/Stiivi/dataproxy/changeset/fccbdd275be5
Data information: http://databrewery.org/doc/data_quality.html#brewery.dq.FieldStatistics