id,summary,reporter,owner,description,type,status,priority,milestone,component,resolution,keywords,cc,repo,theme
235,Resource format normalization and detection,dread,tobes,"Gather accurate MIME type information for all package resources in CKAN. This ticket is shared with dcat-tools (https://bitbucket.org/pudo/dcat-tools), i.e. opendatasearch.org. The results can also be used by ckanrdf, the CKAN RDF conversion service.

Sub-tasks: 

 * Create a Google Spreadsheet with two worksheets: ""MIME-Mappings"" (e.g. ""CSV"" -> ""text/csv"") and ""Name mappings"" (e.g. ""text/csv"" -> ""Comma-Separated Spreadsheet"").
 * Collect the surface forms of the format field from all CKAN instances and map them.
 * Access the spreadsheet via Swiss, apply the mappings, and store the result as a PackageResource extra field, pending #826 (Resource extras).
 * Add heuristics for format auto-detection:
  * Map well-known file extensions
  * Recognize obvious magic numbers (Zip, Tar)
  * Peek into Zip and Tar archives
 * Define a convention for generic data types: many CKAN packages have only ""Spreadsheet"" defined, so either detect the specific type or set the MIME type to */tabular-data or similar.
 * See also: #816 (Autocomplete for the resource format field)",enhancement,assigned,awaiting triage,ckan-v1.9,ckan,,,,ckan,none
