Ticket #187 (closed enhancement: fixed)

Opened 4 years ago

Last modified 4 years ago

Full-text search

Reported by: rgrp
Owned by: rgrp
Priority: critical
Milestone: v0.11
Component: ckan
Keywords:
Cc:
Repository:
Theme:

Description (last modified by dread)

Standard search should search the notes field in addition to name, title and tags (discussed in ticket:108 but not done). For this to work we need proper text search, since otherwise we get poor ordering and lots of bad results.

If we do this we need:

  1. To weight across fields in a sensible way
  2. We can also use proper text search on title or ...

The easiest way to do this is to use the existing facilities in the db, e.g. postgres has had full text support since 8.3: http://www.postgresql.org/docs/8.3/static/textsearch.html

Using this with sqlalchemy: http://lowmanio.co.uk/blog/entries/postgresql-full-text-search-and-sqlalchemy/

Issues with fulltext search:

  • tags are not indexed, so we would need to 'or' in a separate search of tags. This would cause problems with the order_by of the query, since the tag matches wouldn't have a ranking.
  • if tags are indexed then perhaps we don't want them converted into lexemes? Exact match could well be better.
  • can we split the name on dash or underscore before being indexed?
  • natural language search doesn't do partial words, so search for 'gov' doesn't bring up 'government'.
  • do we keep the existing search system usable via a config file switch, in case we install on a db other than postgres?
  • we want to weight name and title higher than other fields - achievable with custom trigger.
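
A rough sketch of the weighting idea, for illustration only and not the code from any changeset. The table name (package), the column names, the weight letters chosen per field, and the helper names are all assumptions; setweight, to_tsvector, plainto_tsquery and ts_rank are postgres built-ins. The helpers only build the SQL text, with the search terms left as a bound parameter.

```python
# Hypothetical field-to-weight mapping: 'A' ranks highest, 'D' lowest.
FIELD_WEIGHTS = [("name", "A"), ("title", "A"), ("tags", "B"), ("notes", "D")]


def weighted_vector_sql(field_weights, config="english"):
    """Build a SQL expression for one tsvector with per-column weights."""
    parts = [
        "setweight(to_tsvector('%s', coalesce(%s, '')), '%s')" % (config, col, weight)
        for col, weight in field_weights
    ]
    # '||' concatenates tsvectors in postgres.
    return " || ".join(parts)


def ranked_search_sql(table, field_weights, config="english"):
    """Build a ranked full-text query; the search terms stay a bound parameter."""
    vector = weighted_vector_sql(field_weights, config)
    return (
        "SELECT {t}.*, ts_rank({v}, query) AS rank "
        "FROM {t}, plainto_tsquery('{c}', %(terms)s) AS query "
        "WHERE {v} @@ query ORDER BY rank DESC"
    ).format(v=vector, t=table, c=config)


if __name__ == "__main__":
    print(ranked_search_sql("package", FIELD_WEIGHTS))
```

Computing the vector inside the query, as here, is the simplest form. Storing it in a column kept up to date by a trigger, as the last point suggests, would avoid recomputing it on every search.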

Change History

comment:1 Changed 4 years ago by dread

  • Description modified

comment:2 Changed 4 years ago by dread

Done in cset:af3cbf266750 and e70af291455e.

Issues addressed:

  • tags ARE indexed
  • tags are converted into lexemes, but we also search on exact match.
  • name is split on dash when indexed by postgres.
  • weight name and title higher than other fields.
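
For reference, the splitting and exact-match behaviour can be pictured in plain Python. This is only an illustration of the idea, not how postgres or the changesets do it: the function names are made up, and splitting on underscore as well as dash is an assumption taken from the question in the description.

```python
import re


def split_name(name):
    """Split a name like 'uk-gov_spending' into separate searchable terms."""
    return [part for part in re.split(r"[-_]+", name.lower()) if part]


def tag_exact_match(tags, term):
    """Exact (case-insensitive) tag match, independent of any stemming."""
    wanted = term.strip().lower()
    return any(tag.lower() == wanted for tag in tags)


if __name__ == "__main__":
    print(split_name("uk-gov_spending"))
    print(tag_exact_match(["Transport", "gov"], "gov"))
```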

Remaining issues:

  • natural language search doesn't do partial words, so search for 'gov' doesn't bring up 'government'.
  • previous search system not yet usable via a config file switch (in case we install on a db other than postgres)
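
One possible direction for the partial-word issue, offered as a sketch rather than something that was done here: to_tsquery accepts a ':*' suffix for prefix matching, but only from postgres 8.4 onwards, so it would not work on 8.3. The helper name below is hypothetical; it only turns user input into a prefix query string to be passed to to_tsquery as a bound parameter.

```python
import re


def prefix_tsquery(terms):
    """Turn free text into a to_tsquery string where every word is a prefix match.

    Returns an empty string when there are no usable words, in which case
    the caller should skip the full-text clause altogether.
    """
    # Keep letters and digits only, so tsquery operators in user input are dropped.
    words = re.findall(r"[^\W_]+", terms.lower())
    return " & ".join("%s:*" % word for word in words)


if __name__ == "__main__":
    print(prefix_tsquery("gov"))
    print(prefix_tsquery("open-gov data"))
```

With this, a search for 'gov' would be sent as 'gov:*', which should match the lexeme that 'government' is reduced to.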

comment:3 Changed 4 years ago by dread

  • Status changed from new to closed
  • Resolution set to fixed

comment:4 Changed 4 years ago by dread

Cost: 16h

comment:5 Changed 4 years ago by rgrp

  • Milestone changed from v1.0 to v0.11