[ALF-559] Concurrent CIFS uploads lead to errors on MSSQL Created: 08-Dec-09  Updated: 01-Jul-10  Resolved: 27-Apr-10

Status: Closed
Project: Alfresco
Component/s: JLAN
Affects Version/s: 3.2 Enterprise
Fix Version/s: 3.3g Community, 3.3 Enterprise

Type: Bug Priority: Critical
Reporter: Alfresco QA Team (Inactive) Assignee: Closed Bugs (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

MSSQL


Attachments: Text File log.txt    
Issue Links:
Related
is related to by MNT-1133 CLONE for Hotfix -Data loss and error... Closed
is related to by MNT-6745 Data loss and errors on FTP upload Closed
Date of First Response:

 Description   

Concurrent CIFS uploads lead to errors on MSSQL.
Snapshot isolation is enabled.
Please see the attached error log.
There are no such errors on MySQL.

AntonRy



 Comments   
Comment by Alfresco QA Team (Inactive) [ 08-Dec-09 ]

Found in Alfresco 3.2 EE build 283 using Windows 2003 SP2, Tomcat 6.0.18, MSSQL 2005, JDK 6u16

Comment by Dave Ward [X] (Inactive) [ 16-Mar-10 ]

Derek

Would it be possible for us to wrap the 'public' NodeService as one of those retrying transactional things so that we don't have to revisit all the file server code to make it retrying? I assume it would have no effect if it's already nested in a retrying call.
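
A minimal sketch of the wrapping idea, assuming a dynamic proxy is acceptable. Every name here (`RetryingProxySketch`, `PropertyService`, `RetryableException`, `wrap`) is hypothetical and stands in for the real NodeService and transaction helper, which are not reproduced. A thread-local flag makes a nested call pass straight through, so only the outermost wrapper retries.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical names throughout; this is not the Alfresco NodeService or its helper.
class RetryingProxySketch {

    /** Stand-in for a small slice of a repository service interface. */
    interface PropertyService {
        void setProperty(String node, String name, String value);
    }

    /** Marks a failure that is safe to retry, e.g. a database update conflict. */
    static class RetryableException extends RuntimeException {
        RetryableException(String message) { super(message); }
    }

    // True while the current thread is already inside a retrying call.
    private static final ThreadLocal<Boolean> IN_RETRY = ThreadLocal.withInitial(() -> false);

    /** Returns a proxy that retries each interface call up to maxAttempts times. */
    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> iface, T target, int maxAttempts) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (IN_RETRY.get()) {
                // Nested call: leave retrying to the outermost wrapper.
                return invoke(method, target, args);
            }
            IN_RETRY.set(true);
            try {
                for (int attempt = 1; ; attempt++) {
                    try {
                        return invoke(method, target, args);
                    } catch (RetryableException e) {
                        if (attempt >= maxAttempts) throw e; // give up, rethrow the last failure
                    }
                }
            } finally {
                IN_RETRY.set(false);
            }
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    // Unwraps the reflection wrapper so callers see the original exception.
    private static Object invoke(Method m, Object target, Object[] args) throws Throwable {
        try {
            return m.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause();
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        PropertyService flaky = (node, name, value) -> {
            if (++calls[0] < 3) throw new RetryableException("update conflict");
        };
        PropertyService wrapped = wrap(PropertyService.class, flaky, 5);
        wrapped.setProperty("node-1", "title", "report");
        System.out.println("calls made: " + calls[0]);
    }
}
```

The sketch retries only the single wrapped call. It does not address whether the surrounding file server operation can tolerate part of its work being repeated.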

Comment by Dave Ward [X] (Inactive) [ 17-Mar-10 ]

We need retrying transaction behaviour here, but unfortunately most of the file server operations are not 'retryable'.

Suggestion:

org.alfresco.filesys.repo.ContentDiskDriver.closeFile(SrvSession, TreeConnection, NetworkFile)

closeFile should separate its file server actions and its repo actions into two transactions. The file server actions should remain in the existing manual transaction, delimited by beginWriteTransaction and endTransaction. The repository actions (nodeService.setProperty, fileFolderService.exists, nodeService.hasAspect, fileFolderService.delete) should be moved into a consecutive transaction wrapped using the retrying transaction helper.
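
The shape of the split suggested above could be sketched roughly as follows. This is a self-contained illustration only: `CloseFileSketch`, `UpdateConflictException`, `retry` and the logged strings are hypothetical stand-ins, not the ContentDiskDriver code or the real retrying transaction helper. The file server phase runs exactly once; the repository phase runs in a second, consecutive step that can be repeated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: none of these types are the Alfresco or JLAN classes.
class CloseFileSketch {

    /** Stand-in for a database update conflict that is safe to retry. */
    static class UpdateConflictException extends RuntimeException {
        UpdateConflictException(String message) { super(message); }
    }

    /** Minimal retry loop standing in for a retrying transaction helper. */
    static <T> T retry(Supplier<T> work, int maxAttempts) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        for (int attempt = 1; ; attempt++) {
            try {
                return work.get();
            } catch (UpdateConflictException e) {
                if (attempt == maxAttempts) throw e; // out of attempts
            }
        }
    }

    final List<String> log = new ArrayList<>();
    int conflictsToSimulate; // how many repo attempts should fail before one succeeds

    void closeFile(String path) {
        // Phase 1: file server actions, run once inside the manual transaction.
        log.add("beginWriteTransaction");
        log.add("flush " + path);
        log.add("endTransaction");

        // Phase 2: repository actions in a consecutive, retryable transaction.
        retry(() -> {
            log.add("repo attempt");
            if (conflictsToSimulate-- > 0) {
                throw new UpdateConflictException(
                    "Snapshot isolation transaction aborted due to update conflict");
            }
            log.add("setProperty " + path);
            return null;
        }, 3);
    }

    public static void main(String[] args) {
        CloseFileSketch sketch = new CloseFileSketch();
        sketch.conflictsToSimulate = 2;
        sketch.closeFile("upload.bin");
        sketch.log.forEach(System.out::println);
    }
}
```

Because the second phase may run more than once, whatever goes into it has to be safe to repeat, which is why only the repository calls are moved there.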

Comment by Gary Spencer [X] (Inactive) [ 24-Mar-10 ]

My concern is that the database lock timeout is usually much longer than a CIFS request timeout (usually around 15 seconds), and in the case of an NFS request the timeout could be much lower, at 0.5 to 2 seconds. The filesystem driver layer, ContentDiskDriver, is accessed by any protocol implementation, currently CIFS/NFS/FTP.

If changes are made to the transaction logic of the ContentDiskDriver I'm concerned about the performance impact this will have. The repo calls are already very slow in many cases as it is.

Looking at the log file, it looks more like a low-level database issue. Is there any timing information on how long the update error takes to trigger? That should indicate whether it's a lock or logic problem.

Comment by Dave Ward [X] (Inactive) [ 25-Mar-10 ]

"Caused by: java.sql.SQLException: Snapshot isolation transaction aborted due to update conflict" is a classic example of a 'transaction collision' on a SQL Server database. I've seen it a lot in the multi-user tests. It's normally retried silently by the transaction helper. We need retrying behaviour, at least to get the repo stuff through.
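
As a rough illustration of how such a collision might be recognised before deciding to retry, the sketch below walks an exception's cause chain looking for a `java.sql.SQLException` carrying the message quoted above. `ConflictDetectionSketch` and `isUpdateConflict` are hypothetical names, and matching on message text is an assumption made for the example; it is not how the real transaction helper classifies exceptions.

```java
import java.sql.SQLException;

// Hypothetical helper; the real transaction helper's classification is not shown here.
class ConflictDetectionSketch {

    // Message text taken from the stack trace quoted in this issue.
    static final String CONFLICT_TEXT =
        "Snapshot isolation transaction aborted due to update conflict";

    /** Returns true if any cause in the chain is a SQLException with the conflict message. */
    static boolean isUpdateConflict(Throwable t) {
        // Bound the walk in case of a cyclic cause chain.
        for (int depth = 0; t != null && depth < 32; depth++, t = t.getCause()) {
            if (t instanceof SQLException
                    && t.getMessage() != null
                    && t.getMessage().contains(CONFLICT_TEXT)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Throwable failure = new RuntimeException("Transaction failed",
            new SQLException(CONFLICT_TEXT));
        System.out.println(isUpdateConflict(failure));
    }
}
```

Only failures that match would be retried. Anything else, such as a lock timeout or a logic error, would propagate unchanged.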

Comment by Dave Ward [X] (Inactive) [ 27-Apr-10 ]

Please retest. Should be fixed by ALF-558.

Comment by Steve Rigby [X] (Inactive) [ 04-May-10 ]

For retest in 3.3g b2826

Comment by Alfresco QA Team (Inactive) [ 05-May-10 ]

The Community version doesn't support MSSQL.
Waiting for 3.3 Enterprise to validate.

Comment by Alfresco QA Team (Inactive) [ 18-May-10 ]

Successfully validated in Alfresco 3.3 EE b 34 using Windows 2008 SP1 x64, Tomcat 6.0.26, MSSQL 2008, JDK 6u16 x64.

Generated at Sun Mar 07 17:50:28 GMT 2021 using Jira 7.13.15#713015-sha1:7c5ddd2c3e1709974ae9c48c17df8edd3919fe2c.