The ceiling is lowered and raised using:
txnCeiling = Math.max(1, txnCountWhenStarted - 1);
txnCeiling = txnCountWhenStarted + 1;
By definition, the ceiling will only be lowered a long time after the txn that is doing the lowering was started and, conversely, it will only be raised very soon after the txn that is doing the raising was started.
This means that lowering the ceiling will always ignore any transactions that have completed successfully since the slow transaction started.
I think this can lead to over-enthusiastic lowering of the ceiling in response to a single slow transaction (especially if it is very slow). I can't think of any way around it though....
Say you have ten transactions where only the third one takes a long time (a very long time) and the others are all dealt with within the specified limit.
Let's say that before the ten transactions, the txnCeiling is at X.
When the third txn starts, the ceiling will be at X+2.
By the time the third txn finishes, the next 7 txns will also have finished within the specified time, so the txnCeiling will be at X+9.
Because the third txn took so long, it will lower the ceiling again, but instead of lowering it to X+8, it will lower it to X+1 - i.e. one less than when it started.
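To make the arithmetic above concrete, here is a rough sketch of the ten-transaction example. It is not the real implementation: the class and method names (CeilingSawtooth, raise, lower, simulate) are made up for illustration. It also assumes the system is saturated, so that txnCountWhenStarted for each txn equals the ceiling at the moment it started. Only the two assignment rules are taken from the code quoted at the top.

```java
public class CeilingSawtooth {

    // Raise rule quoted above: txnCeiling = txnCountWhenStarted + 1
    static int raise(int txnCountWhenStarted) {
        return txnCountWhenStarted + 1;
    }

    // Lower rule quoted above: txnCeiling = Math.max(1, txnCountWhenStarted - 1)
    static int lower(int txnCountWhenStarted) {
        return Math.max(1, txnCountWhenStarted - 1);
    }

    // Replays the example: txns 1 and 2 are fast, txn 3 is very slow,
    // txns 4 to 10 are fast and all finish before txn 3 does.
    // ASSUMPTION: each txn's txnCountWhenStarted equals the ceiling at its start.
    // Returns { ceiling when txn 3 started, ceiling just before txn 3 finished,
    //           ceiling after txn 3 lowered it }.
    static int[] simulate(int x) {
        int txnCeiling = x;

        // Txns 1 and 2 finish quickly, each raising the ceiling by one.
        for (int i = 0; i < 2; i++) {
            txnCeiling = raise(txnCeiling);
        }

        // Txn 3 starts here and remembers the level it saw (X+2).
        int slowTxnCountWhenStarted = txnCeiling;

        // Txns 4 to 10 finish quickly while txn 3 is still running.
        for (int i = 0; i < 7; i++) {
            txnCeiling = raise(txnCeiling);
        }
        int peak = txnCeiling; // X+9

        // Txn 3 finally finishes and lowers the ceiling from its stale snapshot.
        txnCeiling = lower(slowTxnCountWhenStarted); // X+1, not X+8

        return new int[] { slowTxnCountWhenStarted, peak, txnCeiling };
    }

    public static void main(String[] args) {
        int[] r = simulate(10);
        System.out.println("ceiling when txn 3 started:  " + r[0]);
        System.out.println("ceiling before txn 3 ended:  " + r[1]);
        System.out.println("ceiling after txn 3 lowered: " + r[2]);
    }
}
```

With X = 10 this gives 12, 19 and 11: seven single-step increments are undone by one slow txn, leaving a drop of 8 from the peak.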
I'm not actually sure if this is a problem, but in graphs of the txnCeiling when connections are rejected due to too many txns (from the customer's data under load testing), we see sawtooth traces, with very deep drops and small (1) increments upwards (even though these traces only show the txnCeiling at the point where connections are rejected).