Service Packs and Hot Fixes
  MNT-3619

Lowering of txnCeiling is possibly too enthusiastic

    Details

      Description

      Following on from the changes made in ALF-5141 and ALF-5249 to limit the number of incoming webscript connections at times of high load...

      The ceiling is lowered and raised using:
      txnCeiling = Math.max(1, txnCountWhenStarted - 1);
      and
      txnCeiling = txnCountWhenStarted + 1;

      By definition, the ceiling is only lowered a long time after the txn doing the lowering started, and conversely it is only raised very soon after the txn doing the raising started.

      This means that lowering the ceiling always ignores any transactions that have completed successfully since the slow transaction started.
      I think this can lead to over-enthusiastic lowering of the ceiling in response to a single slow transaction (especially if it is very slow). I can't think of any way around it though...

      E.g.
      Say you have ten transactions where only the third one takes a long time (a very long time) and the others are all dealt with within the specified limit.
      Let's say that before the ten transactions, the txnCeiling is at X.
      When the third txn starts, the ceiling will be at X+2.
      By the time the third txn finishes, the next 7 txns will also have finished within the specified time, so the txnCeiling will be at X+9.
      Because the third txn took so long, it will lower the ceiling again, but instead of lowering it to X+8, it will lower it to X+1, i.e. one less than the ceiling when it started.
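
The walk-through above can be replayed with a minimal sketch. It reuses the two formulas from the description, but the class and its methods are hypothetical, and it assumes (as the example implies) that the value a txn captures at start equals the ceiling at that moment and that the fast txns run one after another.

```java
// Hypothetical stand-in for the real helper: it models only the ceiling arithmetic.
class TxnCeilingWalkthrough {

    private int txnCeiling;

    TxnCeilingWalkthrough(int initialCeiling) {
        this.txnCeiling = initialCeiling;
    }

    // Assumption: the value captured at txn start equals the current ceiling.
    int start() {
        return txnCeiling;
    }

    // A txn that finished within the limit raises the ceiling.
    void finishFast(int txnCountWhenStarted) {
        txnCeiling = txnCountWhenStarted + 1;
    }

    // A txn that overran the limit lowers the ceiling.
    void finishSlow(int txnCountWhenStarted) {
        txnCeiling = Math.max(1, txnCountWhenStarted - 1);
    }

    int ceiling() {
        return txnCeiling;
    }

    // Returns {ceiling when txn 3 started, ceiling just before it finished, ceiling after}.
    static int[] run(int x) {
        TxnCeilingWalkthrough c = new TxnCeilingWalkthrough(x);
        c.finishFast(c.start());          // txn 1: X -> X+1
        c.finishFast(c.start());          // txn 2: X+1 -> X+2
        int slowStart = c.start();        // txn 3 starts at X+2 and runs for a long time
        for (int i = 4; i <= 10; i++) {
            c.finishFast(c.start());      // txns 4..10 each raise the ceiling by 1
        }
        int before = c.ceiling();         // X+9
        c.finishSlow(slowStart);          // txn 3 finally finishes and lowers the ceiling
        return new int[] { slowStart, before, c.ceiling() };
    }

    public static void main(String[] args) {
        int[] r = run(10);
        System.out.println("at start of txn 3: " + r[0]);
        System.out.println("before txn 3 ends: " + r[1]);
        System.out.println("after txn 3 ends:  " + r[2]);
    }
}
```

With X = 10 the ceiling climbs to 19 and then falls straight back to 11, which discards the seven successful transactions in one step.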

      I'm not actually sure if this is a problem, but graphs of the txnCeiling at the points where connections are rejected due to too many txns (from the customer's data under load testing) show sawtooth traces, with very deep drops and small upward increments of 1, even though these traces only show the txnCeiling at the point of rejection of connections.

          Activity

          Mark Rogers added a comment -

          I've got some big transactions, in particular transfer and WCM deployment. Do we need to consider marking them in some way to prevent the txnCeiling being changed when these big slow transactions do their stuff?

          dward added a comment -

          Mark: no, because you're not using the txnHelper associated with the webscript container.

          dward added a comment -

          Now there is no ceiling and we just monitor all transaction start times. If a request comes in and the oldest incomplete transaction is older than the threshold then the request is rejected.
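
The check described in this comment might look roughly like the sketch below. It is not the actual fix: `TxnAgeGate`, its method names and the explicit `nowMillis` parameter are all hypothetical, chosen so the behaviour can be tried in isolation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: tracks start times of incomplete txns instead of a ceiling.
class TxnAgeGate {

    private final long thresholdMillis;                       // assumed configurable limit
    private final Map<Long, Long> startTimes = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    TxnAgeGate(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    // Records the start of a txn and returns a handle to pass to end().
    long begin(long nowMillis) {
        long id = nextId.incrementAndGet();
        startTimes.put(id, nowMillis);
        return id;
    }

    // Forgets a txn once it has completed.
    void end(long id) {
        startTimes.remove(id);
    }

    // Rejects when any incomplete txn is older than the threshold, which is
    // the same as the oldest incomplete txn being older than the threshold.
    boolean shouldReject(long nowMillis) {
        for (long started : startTimes.values()) {
            if (nowMillis - started > thresholdMillis) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        TxnAgeGate gate = new TxnAgeGate(1000);
        long slow = gate.begin(0);
        long fast = gate.begin(100);
        gate.end(fast);
        System.out.println(gate.shouldReject(500));   // slow txn is still within the limit
        System.out.println(gate.shouldReject(1500));  // slow txn has now exceeded the limit
        gate.end(slow);
        System.out.println(gate.shouldReject(1500));  // nothing incomplete remains
    }
}
```

Because the decision depends only on txns that are still running, fast txns that completed in the meantime no longer affect it, and a slow txn stops causing rejections as soon as it finishes.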

          dward added a comment -

          For retest on V3.3.4 build 222

          Alfresco QA Team added a comment -

          Validated against 3.3.4.222 and 3.4.0.3262


            People

            • Assignee: Closed Bugs
            • Reporter: Andrew Hunt
