Uploaded image for project: 'Service Packs and Hot Fixes'
  1. Service Packs and Hot Fixes
  2. MNT-11619

mysql upgrade script NodeDeleted.sql algorithm not efficient enough

    Details

    • Type: Service Pack Request
    • Status: Closed
    • Resolution: Fixed
    • Affects Version/s: 4.2.1
    • Fix Version/s: 4.2.3
    • Component/s: Upgrades
    • Labels:
      None
    • Environment:
      any 4.0 to 4.2 upgrade with mysql

      Description

      How to reproduce?
      ================

      1) During an upgrade from 4.0.2.26 to 4.2.1 the script
      classes/alfresco/dbscripts/upgrade/4.1/org.hibernate.dialect.MySQLInnoDBDialect/NodeDeleted.sql
      is run.

      The aim of the script is to remove a column node_deleted in the 4.0 alf_node table to code instead the information in the qname column.

      To do this, the current (simplified) algorithm used is:

      a) create a new table
      b) select all the columns but node_deleted of the old table into the new table altering the qname column if the node is deleted
      c) rebuild all the constraints and indexes

      2) new proposed alogirithm:

      a) do an in-place update of qname column only for the deleted nodes
      b) drop the node_deleted column.

      3) measurements:

      To measure the algorithms speeds I prefered to create a test mysql db rather than a full alfresco system.

      a) create data files to inject 50000000 rows into mysql (as you did not answer my question about how many deleted nodes you have I made the test with 10% of deleted nodes):

      To do this I ran the python script:


      #!/usr/bin/python3.2

      filenb=50
      perfile=1000000
      id=0
      for j in range(0,filenb):
      print("file",j)
      f=open('/tmp/file'str(j)'.sql','w')

      for i in range(0,perfile):
      id=id+1
      if i%10 == 0:
      deleted='1'
      else:
      deleted='0'
      f.write(""str(id)"\t"deleted"\t1\t'some text'\n")
      f.close()


      (I had to split into 50 files as mysql seems to have problems injecting larger files)

      b) create a test table in a test database:

      create table mytable (id int, isdeleted boolean, qname int, other varchar(255));

      c) inject the data into mysql running a series of:

      LOAD DATA INFILE '/tmp/file0.sql' INTO TABLE mytable;
      ...
      LOAD DATA INFILE '/tmp/file49.sql' INTO TABLE mytable;

      d) create indexes:

      create index idx_id on mytable (id);
      create index idx_isdeleted on mytable (isdeleted);

      e) then test the speed of the algorithms:

      current algorithm:
      -----------------
      create table mytable2 (id int, qname int,other varchar(255));
      insert into mytable2 (id,qname,other) (select id,(case when isdeleted then 2 else qname end),other from mytable);

      Query OK, 50000000 rows affected (15 min 50.77 sec)

      You will then need to create indexes on the new table taking approx. 5 minutes for each column.

      proposed algorithm (in place update):
      ---------------------------------------
      update mytable set qname=2 where isdeleted;

      Query OK, 5000000 rows affected (5 min 9.92 sec)

      Result:
      -------
      New algorithm 4 times faster (at least) than old algorithm

      Notes:
      ======
      1) customer has 55+ millions nodes and upgrade time is longer than the maximum downtime time allowed for their application. Dividing the time by 4 or 5 would make it acceptable.
      2) see also MNT-10402
      3) customer raised a ticket with Mysql Support who proposed another way to do in-place update:

      "...in short dropping or adding an FK constraint is an on-line inplace operation, but you need to disable foreign_key_checks in order to get that benefit, see: http://dev.mysql.com/doc/refman/5.6/en/server-system-variables.html#sysvar_foreign_key_checks

      I recommend that you alter their SQL script and disable foreign_key_checks for the duration of the script, then re-enable them at the end, this will reduce the number of table copies and should greatly reduce the time it takes to run your upgrade script."

        Attachments

          Issue Links

            Structure

              Activity

                People

                • Assignee:
                  closedbugs Closed Bugs (Inactive)
                  Reporter:
                  amadon Alex Madon [X] (Inactive)
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  4 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved:

                    Time Tracking

                    Estimated:
                    Original Estimate - Not Specified
                    Not Specified
                    Remaining:
                    Remaining Estimate - 0 minutes
                    0m
                    Logged:
                    Time Spent - 2 days, 4 hours
                    2d 4h

                      Structure Helper Panel