  1. Service Packs and Hot Fixes
  2. MNT-3619

Lowering of txnCeiling is possibly too enthusiastic

    Details

      Description

      Following on from the changes made in ALF-5141 and ALF-5249 to limit the number of incoming webscript connections at times of high load...

      The ceiling is lowered and raised using:
      txnCeiling = Math.max(1, txnCountWhenStarted - 1);
      and
      txnCeiling = txnCountWhenStarted + 1;

      By definition, a lowering of the ceiling will only happen a long time after the Txn that is doing the lowering was started, and conversely, a raising of the ceiling will only happen very soon after the txn started.

      This means that a lowering of the ceiling will always ignore any transactions that have completed successfully since the slow transaction started.
      I think this can lead to over-enthusiastic lowering of the ceiling in response to a single slow transaction (especially if it is very slow). I can't think of any way around it though....

      E.g.
      Say you have ten transactions where only the third one takes a long time (a very long time) and the others are all dealt with within the specified limit.
      Let's say before the ten transactions, the txnCeiling is at X
      When the third txn starts, the ceiling will be at X+2
      By the time the third txn finishes, the next 7 txns will also be finished within the specified time, so the txnCeiling will be at X+9
      Because the third txn took so long, it will lower the ceiling again, but instead of lowering it to X+8, it will lower it to X+1 - i.e. one less than when it started.
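
The ten-transaction example above can be replayed as a small simulation. This is only a sketch under the same simplification the example uses: the value captured when a transaction starts (`levelWhenStarted`, a hypothetical stand-in for `txnCountWhenStarted`) is treated as the ceiling level at that moment, and the fast transactions run one after another while the slow one is still in flight. The class and method names are invented for illustration and are not taken from the Alfresco code.

```java
import java.util.ArrayList;
import java.util.List;

class CeilingSimulation {
    /**
     * Hypothetical model of the raise/lower rules from the description.
     * Returns the ceiling after each completion, in completion order.
     */
    static List<Integer> simulate(int initialCeiling, int txnTotal, int slowTxn) {
        int ceiling = initialCeiling;
        int slowStartLevel = -1; // level captured by the slow txn when it started
        List<Integer> trace = new ArrayList<>();
        for (int txn = 1; txn <= txnTotal; txn++) {
            int levelWhenStarted = ceiling;
            if (txn == slowTxn) {
                // The slow txn keeps running while the later txns complete.
                slowStartLevel = levelWhenStarted;
                continue;
            }
            // Fast txn: raise rule, txnCeiling = txnCountWhenStarted + 1
            ceiling = levelWhenStarted + 1;
            trace.add(ceiling);
        }
        if (slowStartLevel >= 0) {
            // Slow txn finishes last: lower rule, based on the stale start value
            ceiling = Math.max(1, slowStartLevel - 1);
            trace.add(ceiling);
        }
        return trace;
    }

    public static void main(String[] args) {
        // X = 10, ten transactions, the third one is slow
        System.out.println(simulate(10, 10, 3));
    }
}
```

With X = 10, the ceiling climbs to 19 (X+9) while the slow transaction is running and then drops straight to 11 (X+1) when it completes, which is the deep drop described above.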

      I'm not actually sure if this is a problem, but graphs of the txnCeiling at the points where connections are rejected due to too many txns (from the customer's data under load testing) show sawtooth traces, with very deep drops and small (+1) upward increments (even though these traces only show the txnCeiling at the point of rejection of connections).

            Activity

            mrogers Mark Rogers [X] (Inactive) added a comment -

            I've got some big transactions, in particular transfer and WCM deployment. Do we need to consider marking them in some way to prevent the txnCeiling being changed when these big, slow transactions do their stuff?

            dward Dave Ward [X] (Inactive) added a comment -

            Mark: no, because you're not using the txnHelper associated with the webscript container.

            dward Dave Ward [X] (Inactive) added a comment -

            Now there is no ceiling and we just monitor all transaction start times. If a request comes in and the oldest incomplete transaction is older than the threshold then the request is rejected.
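
A minimal sketch of the check described in this comment, assuming start times are kept in a sorted map so the oldest incomplete transaction is always the first key. `OldestTxnMonitor`, `tryStart` and `finish` are hypothetical names for illustration; this is not the actual Alfresco implementation, and timestamps are passed in explicitly to keep the example deterministic.

```java
import java.util.TreeMap;

class OldestTxnMonitor {
    private final long thresholdMs;
    // start time -> number of incomplete txns that started at that time
    private final TreeMap<Long, Integer> startTimes = new TreeMap<>();

    OldestTxnMonitor(long thresholdMs) {
        this.thresholdMs = thresholdMs;
    }

    /** Admits a request unless the oldest incomplete txn is older than the threshold. */
    synchronized boolean tryStart(long nowMs) {
        if (!startTimes.isEmpty() && nowMs - startTimes.firstKey() > thresholdMs) {
            return false; // reject: something has been running too long
        }
        startTimes.merge(nowMs, 1, Integer::sum);
        return true;
    }

    /** Marks a txn that started at startedAtMs as complete. */
    synchronized void finish(long startedAtMs) {
        // Returning null from the remapping function removes the entry.
        startTimes.computeIfPresent(startedAtMs, (k, n) -> n > 1 ? n - 1 : null);
    }
}
```

Because admission depends only on the age of the oldest incomplete transaction, a single slow transaction blocks new requests only while it is still running; once it completes there is no lowered ceiling to climb back from.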

            dward Dave Ward [X] (Inactive) added a comment -

            For retest on V3.3.4 build 222

            alfrescoqa Alfresco QA Team added a comment -

            Validated against 3.3.4.222 and 3.4.0.3262


              People

              • Assignee:
                closedbugs Closed Bugs
                Reporter:
                ahunt Andrew Hunt