Ticket #868 (closed enhancement: fixed)

Opened 3 years ago

Last modified 3 years ago

Test improvements

Reported by: thejimmyg Owned by: sebbacon
Priority: critical Milestone:
Component: ckan Keywords:
Cc: Repository:
Theme:

Description

The tests currently take 41 mins to run on most laptops. This slows down development and discourages a test-driven approach.

We'd like to see the tests pass in 5 mins or less (but anything would be an improvement!)

Some suggestions for achieving this include:

  • Upgrading the entire codebase to SQLAlchemy 0.6 so that tests could run against an in-memory SQLite database
  • Not setting up and tearing down the database so frequently.

Attachments

ckan test speed times after 0.5 upgrade.csv (77.9 KB) - added by kindly@… 3 years ago.
speed per test after 0.57 upgrade
no_autoflush_deletes.diff (3.3 KB) - added by kindly@… 3 years ago.
This is a patch so that we do not need to monkeypatch sqlalchemy dialects
vdm_purge_no_autoflush.diff (557 bytes) - added by kindly@… 3 years ago.
This is the corrisponding patch for the vdm, so we dont need to monkeypatch sqlalchemy
postgres_speed.diff (1.9 KB) - added by kindly@… 3 years ago.
Speed up postgres by making sure postgres does not drop and reacreate each time.

Change History

comment:1 Changed 3 years ago by kindly@…

Below is a patch to make the tests run at least 2.5 times faster (about 15 mins on my old laptop). Instead of dropping the tables each time, it just deletes everything in them, using a low level connection. All the tests pass this way. It's a surprisingly clean patch. Here are a few points concerning it.

  • I tested truncating the tables but it's slower. If there are any big tables in the tests this way is the fastest (faster than drop).
  • The sequences (id columns) will start from where they left off.
  • I also investigated making postgres template database and cloning it, but the complication was not worth it.
  • sqlalchemy iterates the tables in reverse dependency order, which make this possible.
  • I targeted rebuild_db as that what most of the tests I saw where using, however I have not checked all tests to see if they all are.
  • There is a slight hack on the repo object to make sure it knows that "clean_db" is coming from the tests.
  • I refactored init_db for code reuse.
  • I have not done a version check. sqlalchemy >= 0.5 do this in a different way as outlined in the comments.
diff -r 7f2239b0f743 ckan/model/__init__.py
--- a/ckan/model/__init__.py	Fri Dec 17 10:34:47 2010 +0000
+++ b/ckan/model/__init__.py	Mon Dec 20 23:25:04 2010 +0000
@@ -41,6 +41,9 @@
 
     def init_db(self):
         super(Repository, self).init_db()
+        self.add_initial_data()
+
+    def add_initial_data(self):
         # assume if this exists everything else does too
         if not User.by_name(PSEUDO_USER__VISITOR):
             visitor = User(name=PSEUDO_USER__VISITOR)
@@ -69,6 +72,26 @@
         import migrate.versioning.api as mig
         version = mig.version(self.migrate_repository)
         return version
+    
+    def clean_db(self):
+        # delete only added for tests
+        if hasattr(self, "delete_only") and self.delete_only:
+            self.delete_all()
+        else:
+            super(Repository, self).clean_db()
+    
+    def delete_all(self):
+        
+        self.session.remove()
+        ## use raw connection for performance
+        connection = self.session.connection()
+        ## sqla sorts in reverse dependancy order.
+        ## in >= 0.5 use reversed(metadata.sorted_tables()) instead of table_iterator
+        for table in self.metadata.table_iterator():
+            connection.execute('delete from "%s"' % table.name)
+        self.session.commit()
+
+        self.add_initial_data()
 
     def setup_migration_version_control(self, version=None):
         import migrate.versioning.exceptions
diff -r 7f2239b0f743 ckan/tests/__init__.py
--- a/ckan/tests/__init__.py	Fri Dec 17 10:34:47 2010 +0000
+++ b/ckan/tests/__init__.py	Mon Dec 20 23:25:04 2010 +0000
@@ -55,6 +55,7 @@
 
 import ckan.model as model
 model.repo.rebuild_db()
+model.repo.delete_only = True
 
 class BaseCase(object):

comment:2 Changed 3 years ago by sebbacon

  • Owner changed from thejimmyg to sebbacon

See also #867

Thanks for the patch.

comment:3 Changed 3 years ago by sebbacon

I mean #876, of course.

Changed 3 years ago by kindly@…

speed per test after 0.57 upgrade

comment:4 Changed 3 years ago by anonymous

Attached are the timings I have for the tests after I upgraded to 0.57 and after a few simple test tweaks. They do not include setup and teardown time at the class level as they are not assignable to individual tests.

Changed 3 years ago by kindly@…

This is a patch so that we do not need to monkeypatch sqlalchemy dialects

Changed 3 years ago by kindly@…

This is the corrisponding patch for the vdm, so we dont need to monkeypatch sqlalchemy

Changed 3 years ago by kindly@…

Speed up postgres by making sure postgres does not drop and reacreate each time.

comment:5 Changed 3 years ago by dread

  • Status changed from new to closed
  • Resolution set to fixed

I've merged in David Raznick's patches:

  • no_autoflush_deletes.diff cset:2b9591172182
  • postgres_speed.diff cset:fa1b7e3a4e0f
  • vdm_purge_no_autoflush.diff vdm cset 8accdd0b9b7f

I've also merged in Seb's fork: cset:68d63fda4814 which closes this ticket, achieving test speeds of under 3 minutes!

Note: See TracTickets for help on using tickets.