IMPLEMENTATION OF 2-PHASE COMMIT BY INTEGRATING THE DATABASE & THE FILE SYSTEM
V. Raghavendra Prasad
Assoc. Professor & HOD, Dept of IT, SVIST, Angallu, Madanapalle, Andhra Pradesh 517325, INDIA.
Subhakar. M,
Asst. Professor, Dept of IT, SVIST, Angallu, Madanapalle, Andhra Pradesh 517325, INDIA.
Abstract:
A transaction is a series of data manipulation statements that must either fully complete or fully fail, leaving the system in a consistent state. Transactions are the key to reliable software applications. In J2EE, business-layer components access transactional resource managers such as an RDBMS or a messaging provider. For a database, the resource manager coordinates with the transaction manager to perform the work, and this coordination is transparent to the developer; however, the same is not possible for another important resource, the file system. Moreover, a DBMS can commit or roll back a transaction, but it does so independently of the file system. In this paper, we integrate the two-phase commit protocol of the RDBMS with the file system using Java. Because Java I/O does not provide transactional support, developers must implement it manually in their applications; this paper therefore aims to develop a transaction-aware resource manager for manipulating files in Java.
Keywords: 2-Phase Commit, DBMS, File manager resource, Java, XAResource.
1. INTRODUCTION
1.1. File Systems and Transactions
Traditionally, file systems are not transactional: they do not support atomic operations, and they do not support XA transactions.
Processing files within a transaction is a very common requirement wherever a “do all or nothing” scenario has to be fulfilled.
A commonly occurring scenario is one where a large number of files are received from internal or external systems. The files have to be processed, their data has to be written to databases, and finally the files have to be modified or archived to another location. All of these activities are supposed to be performed within a transaction boundary, so that if any error occurs while processing the files or updating the database, all the operations are rolled back.
This paper examines the options available for manipulating files within transactions and develops a transaction-aware resource manager for file I/O in Java, so that application programmers can access the file system in a manner that supports ACID semantics.
1.2. The Need for Transaction Management in Files
2. WAYS OF TRANSACTION PROCESS
2.1. Transaction updates in a single file
A single file contains a large number of records (business data) that have to be read and processed. After each record is processed, it has to be written to the database or sent as a message via JMS, and the record has to be marked as "PROCESSED". Each entry should be processed once and only once.
There could be a large number of records, so processing may take a long time. If the system crashes partway through, data integrity must be maintained.
Solution Approach:
1. Start an XA transaction
2. Read a record from the file and mark it as done
3. Put the message in the queue or update the database
4. Commit the transaction
If processing fails or throws an exception for any reason, the two-phase XA transaction ensures that steps 2 and 3 are performed as a single atomic operation.
2.2. Transactional updates on multiple files
Multiple files are received from another application in a source directory. These files have to be read, processed and transformed, an entry has to be made in the database, and the result finally has to be moved to a destination directory. The source file may or may not be deleted.
Solution Approach:
1. Start XA transaction
2. Read the input/source file
3. Process the source file, apply the transformation, create the target file and update the database
4. Delete the source file
5. Commit the transaction
If processing fails or throws an exception for any reason, the two-phase XA transaction ensures that steps 2, 3 and 4 are performed as a single atomic operation, i.e. the source file is never deleted unless the target file has been created and the database has been updated.
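The file side of the steps above can be sketched in plain Java. This is only a minimal illustration under stated assumptions, not the XA mechanism itself: the class name StagedFileTx and its methods are hypothetical, everything runs in a single process, and there is no crash recovery. Target files are written to a staging directory and moved into place only on commit, and the deletion of a source file is deferred until commit, so a rollback leaves the sources untouched.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: stages file changes and applies them only on commit. */
final class StagedFileTx {
    private final Path stagingDir;
    private final Map<Path, Path> stagedWrites = new LinkedHashMap<>(); // staged copy -> final target
    private final List<Path> pendingDeletes = new ArrayList<>();

    StagedFileTx(Path stagingDir) throws IOException {
        this.stagingDir = Files.createDirectories(stagingDir);
    }

    /** Step 3: write the transformed content to a staged copy, not to the target. */
    void write(Path target, String content) throws IOException {
        Path staged = Files.createTempFile(stagingDir, "tx-", ".tmp");
        Files.write(staged, content.getBytes(StandardCharsets.UTF_8));
        stagedWrites.put(staged, target);
    }

    /** Step 4: remember the delete; the source stays on disk until commit. */
    void delete(Path source) {
        pendingDeletes.add(source);
    }

    /** Step 5: move staged files into place, then remove the sources. */
    void commit() throws IOException {
        for (Map.Entry<Path, Path> e : stagedWrites.entrySet()) {
            Files.move(e.getKey(), e.getValue(), StandardCopyOption.REPLACE_EXISTING);
        }
        for (Path source : pendingDeletes) {
            Files.deleteIfExists(source);
        }
        stagedWrites.clear();
        pendingDeletes.clear();
    }

    /** Discard staged files; targets and sources are left as they were. */
    void rollback() throws IOException {
        for (Path staged : stagedWrites.keySet()) {
            Files.deleteIfExists(staged);
        }
        stagedWrites.clear();
        pendingDeletes.clear();
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("staged-tx-demo");
        Path source = Files.write(dir.resolve("source.txt"), "raw".getBytes(StandardCharsets.UTF_8));
        StagedFileTx tx = new StagedFileTx(dir.resolve("staging"));
        tx.write(dir.resolve("target.txt"), "transformed");
        tx.delete(source);
        tx.rollback(); // simulate a failed database update
        System.out.println("source kept: " + Files.exists(source));
    }
}
```

Note that the commit step in this sketch is not itself atomic across several files; a crash in the middle of commit would need a recovery log, which is the kind of work a transactional resource manager takes on.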
2.3. Software installation and upgrade
In a fresh installation, a large number of files have to be moved to the destination directory; if any file fails to be created, the entire operation has to be rolled back.
In a software upgrade, additional files are created and existing files are deleted or edited; if any error occurs, the entire operation should be rolled back so that the previous installation can be restored.
Solution Approach:
1. Start XA transaction
2. Perform file operations
3. Commit transaction
Various options
There are several options for achieving transactional file access; some of them are the following:
1. Write custom code
3. IMPLEMENTATION
The Java file I/O API does not support transactions, so programmers have to build transactional support into their programs by hand; such programs are complex and require significant effort.
3.1. Use Apache Commons Transaction
Another option is to use the Apache Commons Transaction library, which provides transactional access to file systems for read, write, move, copy and delete operations. Other Java-based solutions may be available, but the Apache library is considered reliable.
Commons Transaction is a Java-based solution; it provides transactional access to file systems independently of the file system provider or implementation. This is achieved through a Java library whose API offers ACID transactions on the file system using a pessimistic locking scheme.
The core of the Apache Commons Transaction library is the class FileResourceManager, which provides facilities to read, write, move, copy and delete files within transactional boundaries.
The following activity diagram shows the API usage of Commons Transaction in managing file resources.
The following sample program shows the use of the Commons Transaction library to manage file I/O and resources within transaction boundaries.
Fig. 2. Program segment for the I/O operations within transaction boundaries.
The program performs the following I/O operations within transaction boundaries:
1. Create 2 file resources named file1.txt and file2.txt
2. Write contents to file resource file1.txt
3. Move and rename file resource source.txt to destination.txt
If any error or exception occurs while the program executes, all 3 activities above are rolled back.
import java.io.OutputStream;
import java.io.PrintWriter;
import org.apache.commons.transaction.file.FileResourceManager;
import org.apache.commons.transaction.file.ResourceManagerException;
import org.apache.commons.transaction.util.Log4jLogger;

// "logger" is assumed to be an org.apache.log4j.Logger field of the enclosing class.
public void commonTxDemo() throws ResourceManagerException {
    FileResourceManager fileResourceManager = null;
    String txId = null;
    String storeDir = "C:/STORE_DIR";
    String workDir = "C:/WORK_DIR";
    try {
        /*
         * storeDir is the directory where files are stored after the transaction is over.
         * workDir is the directory where ongoing transactions store temporary data/files.
         * The boolean parameter states whether URL encoding is required for the file/dir path.
         */
        fileResourceManager = new FileResourceManager(storeDir, workDir, false,
                new Log4jLogger(logger));
        fileResourceManager.start();
        // Generate a unique transaction ID to be used in transaction management.
        txId = fileResourceManager.generatedUniqueTxId();
        fileResourceManager.startTransaction(txId);
        // The following 2 lines create 2 files within the transaction context.
        fileResourceManager.createResource(txId, "file1.txt");
        fileResourceManager.createResource(txId, "file2.txt");
        // The next 5 lines write content to a file within the transaction context.
        OutputStream outputStream = fileResourceManager.writeResource(txId, "file1.txt");
        PrintWriter writer = new PrintWriter(outputStream);
        writer.print("This content is for file1.txt");
        writer.flush();
        writer.close();
        // The following line moves/renames source.txt to destination.txt in the transaction context.
        fileResourceManager.moveResource(txId, "source.txt", "destination.txt", true);
        // The following line commits the transaction identified by the passed transaction ID.
        fileResourceManager.commitTransaction(txId);
        fileResourceManager.stop(FileResourceManager.SHUTDOWN_MODE_NORMAL);
    } catch (ResourceManagerException e) {
        // Roll back the transaction in case of any exception. ResourceManagerSystemException
        // is a subclass of ResourceManagerException, so a single handler covers both.
        e.printStackTrace();
        if (txId != null) {
            fileResourceManager.rollbackTransaction(txId);
        }
    }
}
3.2. Distributed Transaction Processing: XA
3.2.1 Figure showing interaction between Application Program, Resource Managers and Transaction Manager
Fig. 3. Distributed transaction processing - XA
3.2.2 Figure showing various phases of 2 phase commit protocol
Fig. 4. Various phases of 2 phase commit protocol
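The two phases can be sketched in a few lines of Java. This is a minimal in-memory illustration, not an XA implementation: the names TwoPhaseCommitSketch, Participant and RecordingParticipant are hypothetical, there is no logging or failure recovery, and a participant that votes "no" is assumed to have undone its own work already.

```java
import java.util.Arrays;
import java.util.List;

/** Minimal two-phase commit sketch; all names here are illustrative. */
final class TwoPhaseCommitSketch {

    /** A resource manager taking part in the global transaction. */
    interface Participant {
        boolean prepare();  // phase 1: vote true (ready to commit) or false (abort)
        void commit();      // phase 2 when every participant voted true
        void rollback();    // phase 2 when any participant voted false
    }

    /** In-memory participant that records its state so the outcome can be inspected. */
    static final class RecordingParticipant implements Participant {
        final boolean vote;
        String state = "ACTIVE";

        RecordingParticipant(boolean vote) { this.vote = vote; }

        @Override public boolean prepare() {
            state = vote ? "PREPARED" : "ROLLED_BACK"; // a "no" voter undoes its own work
            return vote;
        }
        @Override public void commit() { state = "COMMITTED"; }
        @Override public void rollback() { state = "ROLLED_BACK"; }
    }

    /** Runs both phases and returns true if the transaction committed. */
    static boolean run(List<? extends Participant> participants) {
        // Phase 1: collect votes, stopping at the first "no".
        Participant refused = null;
        for (Participant p : participants) {
            if (!p.prepare()) { refused = p; break; }
        }
        // Phase 2: commit everywhere, or roll back everyone except the refuser.
        for (Participant p : participants) {
            if (refused == null) p.commit();
            else if (p != refused) p.rollback();
        }
        return refused == null;
    }

    public static void main(String[] args) {
        RecordingParticipant database = new RecordingParticipant(true);
        RecordingParticipant files = new RecordingParticipant(false);
        boolean committed = run(Arrays.asList(database, files));
        System.out.println("committed=" + committed
                + ", database=" + database.state + ", files=" + files.state);
    }
}
```

In the demo, the file participant refuses in phase 1, so the database participant, although already prepared, is rolled back as well; neither resource keeps a partial result.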
3.3. Writing an XA-enabled Resource Manager to work with files
FileResourceManager (Commons Transaction) supports transactional file access, but it does not support XA transactions. This means it cannot participate in distributed transactions spanning multiple resources, and hence it cannot satisfy the requirements of the scenarios in Sections 2.1 and 2.2 above. FileResourceManager can be made to participate in XA transactions by XA-enabling it, which is done by implementing the XAResource interface (javax.transaction.xa.XAResource).
[Fig. 3 labels: Application Program, Resource Managers, Transaction Manager (TM); interfaces: transaction demarcation (TX), XA, resource access through the Resource Manager API]
3.3.1 XAResource Interface
To write an XA-enabled resource manager for any kind of resource, the XAResource interface has to be implemented; this interface has 10 methods that need to be implemented.
Fig. 5. XAResource Interface
3.3.2 XAResource interface Specification
The XAResource interface is a Java mapping of the industry-standard XA interface based on the X/Open CAE Specification (Distributed Transaction Processing: The XA Specification). It defines the contract between a resource manager and a transaction manager. A JDBC driver or JMS provider has to implement this interface so that it can participate in a global transaction.
The following is the specification of the 10 methods of the XAResource interface that have to be implemented.
Fig. 6. XAResource interface Specification
3.3.3 XAResource interface Specification (flags)
Fig. 7. XAResource Interface Specification with flags
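As a small aid to reading the flags, the sketch below names the constants that javax.transaction.xa.XAResource defines for start, end and the prepare vote. The constants themselves come from the standard interface; the helper class XaFlagNames and its method names are hypothetical, and the descriptions are paraphrases rather than specification text.

```java
import javax.transaction.xa.XAResource;

/** Hypothetical helper that names the flag values a transaction manager may pass. */
final class XaFlagNames {

    /** Flags accepted by XAResource.start. */
    static String forStart(int flags) {
        switch (flags) {
            case XAResource.TMNOFLAGS: return "TMNOFLAGS: begin a new transaction branch";
            case XAResource.TMJOIN:    return "TMJOIN: join an existing transaction branch";
            case XAResource.TMRESUME:  return "TMRESUME: resume a suspended transaction branch";
            default: throw new IllegalArgumentException("not a start flag: 0x" + Integer.toHexString(flags));
        }
    }

    /** Flags accepted by XAResource.end. */
    static String forEnd(int flags) {
        switch (flags) {
            case XAResource.TMSUCCESS: return "TMSUCCESS: work on the branch completed";
            case XAResource.TMFAIL:    return "TMFAIL: work failed, branch may be marked rollback-only";
            case XAResource.TMSUSPEND: return "TMSUSPEND: branch suspended, may be resumed later";
            default: throw new IllegalArgumentException("not an end flag: 0x" + Integer.toHexString(flags));
        }
    }

    /** Return values of XAResource.prepare; a "no" vote is signalled by an XAException. */
    static String forPrepareVote(int vote) {
        switch (vote) {
            case XAResource.XA_OK:     return "XA_OK: prepared, ready to commit";
            case XAResource.XA_RDONLY: return "XA_RDONLY: read-only branch, nothing to commit";
            default: throw new IllegalArgumentException("not a prepare vote: " + vote);
        }
    }

    public static void main(String[] args) {
        System.out.println(forStart(XAResource.TMNOFLAGS));
        System.out.println(forEnd(XAResource.TMSUCCESS));
        System.out.println(forPrepareVote(XAResource.XA_OK));
    }
}
```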
3.3.4 XAResource Implementation (Proof of concept)
The following is the XAResource implementation for file resources; the implementation class is named FileXAResource, and it acts as a resource manager to enable participation in two-phase transactions.
Fig. 8. XAResource implementation for the file resources
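To give a feel for what such a class involves, here is a minimal sketch that implements all 10 methods of javax.transaction.xa.XAResource on top of java.nio.file. It is not the FileXAResource of this paper and does not use Commons Transaction: the class SketchFileXAResource, its write method and the nested SimpleXid are hypothetical, and the sketch assumes a single thread, keeps no recovery log (so recover returns nothing) and does no locking. Writes are staged in a work directory and moved to the store directory on commit.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Illustrative sketch only: no recovery log, no locking, not thread-safe. */
final class SketchFileXAResource implements XAResource {
    private final Path storeDir; // committed files
    private final Path workDir;  // staged files of in-flight branches
    private final Map<String, Map<String, Path>> branches = new HashMap<>(); // branch -> (name -> staged file)

    SketchFileXAResource(Path storeDir, Path workDir) throws IOException {
        this.storeDir = Files.createDirectories(storeDir);
        this.workDir = Files.createDirectories(workDir);
    }

    private static String key(Xid xid) {
        return xid.getFormatId() + Arrays.toString(xid.getGlobalTransactionId())
                + Arrays.toString(xid.getBranchQualifier());
    }

    private Map<String, Path> branch(Xid xid) throws XAException {
        Map<String, Path> staged = branches.get(key(xid));
        if (staged == null) throw new XAException(XAException.XAER_NOTA); // unknown Xid
        return staged;
    }

    /** Application-facing call (hypothetical): stage a file write inside a started branch. */
    void write(Xid xid, String fileName, String content) throws XAException, IOException {
        Map<String, Path> staged = branch(xid);
        Path tmp = Files.createTempFile(workDir, "xa-", ".tmp");
        Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
        Path previous = staged.put(fileName, tmp);
        if (previous != null) Files.deleteIfExists(previous);
    }

    @Override public void start(Xid xid, int flags) throws XAException {
        if (flags == TMNOFLAGS) branches.put(key(xid), new LinkedHashMap<>());
        else branch(xid); // TMJOIN or TMRESUME: the branch must already exist
    }
    @Override public void end(Xid xid, int flags) throws XAException { branch(xid); }
    @Override public int prepare(Xid xid) throws XAException {
        if (!branch(xid).isEmpty()) return XA_OK; // staged files are already on disk: vote yes
        branches.remove(key(xid));                // nothing to commit: read-only branch
        return XA_RDONLY;
    }
    @Override public void commit(Xid xid, boolean onePhase) throws XAException {
        try {
            for (Map.Entry<String, Path> e : branch(xid).entrySet()) {
                Files.move(e.getValue(), storeDir.resolve(e.getKey()), StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException ex) { throw new XAException(XAException.XAER_RMERR); }
        branches.remove(key(xid));
    }
    @Override public void rollback(Xid xid) throws XAException {
        try {
            for (Path tmp : branch(xid).values()) Files.deleteIfExists(tmp);
        } catch (IOException ex) { throw new XAException(XAException.XAER_RMERR); }
        branches.remove(key(xid));
    }
    @Override public void forget(Xid xid) { branches.remove(key(xid)); }
    @Override public Xid[] recover(int flag) { return new Xid[0]; } // nothing is logged in this sketch
    @Override public boolean isSameRM(XAResource other) { return other == this; }
    @Override public int getTransactionTimeout() { return 0; }
    @Override public boolean setTransactionTimeout(int seconds) { return false; } // timeouts unsupported

    /** Minimal Xid for the demo; a real transaction manager supplies its own. */
    static final class SimpleXid implements Xid {
        private final byte[] gtrid, bqual;
        SimpleXid(String gtrid, String bqual) {
            this.gtrid = gtrid.getBytes(StandardCharsets.UTF_8);
            this.bqual = bqual.getBytes(StandardCharsets.UTF_8);
        }
        @Override public int getFormatId() { return 1; }
        @Override public byte[] getGlobalTransactionId() { return gtrid.clone(); }
        @Override public byte[] getBranchQualifier() { return bqual.clone(); }
    }

    public static void main(String[] args) throws Exception {
        Path base = Files.createTempDirectory("file-xa-demo");
        SketchFileXAResource rm = new SketchFileXAResource(base.resolve("STORE_DIR"), base.resolve("WORK_DIR"));
        Xid xid = new SimpleXid("demo", "files");
        rm.start(xid, TMNOFLAGS);
        rm.write(xid, "file1.txt", "This content is for file1.txt");
        rm.end(xid, TMSUCCESS);
        if (rm.prepare(xid) == XA_OK) rm.commit(xid, false); // the two phases, driven by hand
        System.out.println("committed: " + Files.exists(base.resolve("STORE_DIR").resolve("file1.txt")));
    }
}
```

In a real deployment the start, end, prepare and commit calls would be issued by the transaction manager rather than by application code; the main method drives them by hand only to show the order of calls.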
4. THE IMPLEMENTATION OF TRANSACTIONAL PROCESS
Fig. 9. Transaction Management across Resources
The application demonstrates the ability of FileXAResource to participate in a distributed transaction. It encloses the database interaction and the file I/O manipulation within a transaction as a single unit of work: if an error occurs during the database interaction, the file I/O operations are also rolled back, and if an error occurs during the file I/O manipulation, the record inserted into the database is also rolled back.
4.1. Sample Application Class diagram
The TxClient class is the starting point; it contains the static void main method.
4.2. Application setup
The resource manager implementation is attached in TwoPhase.zip; it is organized as an Eclipse project and can be imported into any Eclipse-based IDE.
Create 2 folders named STORE_DIR and WORK_DIR.
STORE_DIR is the directory where files are stored after the transaction is over, whereas WORK_DIR is the directory where ongoing transactions store temporary data and files.
Update this information in the file TxClient.java.
Create the table named Stock as per the DDL.sql given in the setup folder.
4.3. Running the Application
To run the application, execute TxClient.java; it performs the following steps:
1. Begin XA transaction
2. Create 3 files: file1.txt, file2.txt and file3.txt
3. Write some text to file2.txt and file3.txt
4. Insert a record into the table named Stock using JDBC
5. Commit the transaction
If any error or exception occurs while executing any of the steps above, all of the steps are rolled back, i.e. the creation of the text files is rolled back and the record inserted into the database table is also rolled back.
The application uses FileXAResource for file-related operations and JDBC for database-related operations.
6. RESULTS
TABLE 1 Output of SYSA & SYSB
$ kixrcvdmp -f file.dta
============ file summary ============
Total # blocks read to logical EOF is 1
There is 1 active UOW
There is 1 in-doubt UOW
Total # records ignored is 0
Total # before images pending is 5
First record timestamp Thu Feb 17 15:53:15 2005
Final record timestamp Thu Feb 17 15:53:15 2005
The output shows that SYSA had no local VSAM updates outstanding and region SYSB had several updated VSAM resources. Region SYSB also has an in-doubt transaction active. More analysis must take place to determine the details of the transactions.
TABLE 2 Transactions with Remote Manager
$ kixrcvdmp -f file.dta -i
============ file summary ============
Total # blocks read to logical EOF is 2
There is 1 active UOW
There is 1 in-doubt UOW
unikixtran7 has TM information
- TX state is PENDING
GTRID is 1|XA01|7|7035|169539
- it is not in-doubt with a superior
- total RM associations is 3
(1) VSAM RM
(2) SYSB CRM
(3) Third Party RM
Total # records ignored is 0
Total # before images pending is 0
The output shows that the transaction involves the third-party RM and a remote region identified as SYSB. Make a note of the GTRID, which uniquely identifies the transaction in effect at this region. For example,
1|XA01|7|7035|169539
■ Sequence number for this transaction processor.
■ Transaction name.
■ Transaction processor executing the transaction.
■ Key identifying this region. In this case, the key was configured in the XA*Syskey property of the region’s unikixrc.cfg file.
■ Unique time stamp of the first XA transaction at this processor.
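The field layout described above can be illustrated with a short parsing sketch. It assumes that the five descriptions list the '|'-separated fields in left-to-right order; the class Gtrid and its field names are hypothetical and are not part of any tool shown in the output.

```java
/** Hypothetical value class for the five '|'-separated GTRID fields. */
final class Gtrid {
    final int sequenceNumber;       // sequence number for this transaction processor
    final String transactionName;   // transaction name
    final int transactionProcessor; // processor executing the transaction
    final String regionKey;         // key identifying the region
    final long timestamp;           // time stamp of the first XA transaction at this processor

    private Gtrid(int sequenceNumber, String transactionName, int transactionProcessor,
                  String regionKey, long timestamp) {
        this.sequenceNumber = sequenceNumber;
        this.transactionName = transactionName;
        this.transactionProcessor = transactionProcessor;
        this.regionKey = regionKey;
        this.timestamp = timestamp;
    }

    /** Splits the text on '|' and converts the numeric fields. */
    static Gtrid parse(String text) {
        String[] parts = text.split("\\|");
        if (parts.length != 5) {
            throw new IllegalArgumentException("expected 5 fields: " + text);
        }
        return new Gtrid(Integer.parseInt(parts[0]), parts[1],
                Integer.parseInt(parts[2]), parts[3], Long.parseLong(parts[4]));
    }

    public static void main(String[] args) {
        Gtrid gtrid = parse("1|XA01|7|7035|169539");
        System.out.println("transaction " + gtrid.transactionName
                + " on processor " + gtrid.transactionProcessor
                + " in region " + gtrid.regionKey);
    }
}
```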
TABLE 3 Status of the XA transactions at region SYSB.
$ kixxa -s
XA Configuration
---
Debug mode is: ALL
Resync timeout is: 600 seconds
RM open failure is: FATAL
This region has 1 user configured RMs
- SYSA CRM
This region can accept in-bound XA requests
Active XA Status
---
This region does not have any active protected sessions
unikixtran7 is in-doubt with a superior TM
- remote GTRID is: 1|XA01|7|7035|185830
- currently awaiting XA resynchronization
- currently performing XA recovery
RM VSAM RM , flags = Open,Enlisted,Associated
RM IBTCP , flags = Open,Enlisted,Superior
RM SYSA CRM , flags = Open
There are 1 in-doubt transactions at this time
TABLE 4 Status of the XA transactions at region SYSA.
$ kixxa -s
XA Configuration
---
Debug mode is: ALL
Resync timeout is: 0 seconds
RM open failure is: FATAL
This region has 2 user configured RMs
- SYSB CRM
- Third Party RM
This region can accept in-bound XA requests
Active XA Status
---
This region does not have any active protected sessions
unikixtran7 is in-doubt with its local TM
- local TM state is COMMIT
- local GTRID is: 1|XA01|7|7035|951856
- currently performing XA recovery
RM VSAM RM , flags = Open,Enlisted,Associated
RM IBTCP , flags = Open,Superior
RM SYSB CRM , flags = Open,Enlisted,Associated
*** connecting to partner region @ neptune :8045
*** This CRM is currently executing XA recovery
*** Please verify that the target region is active
RM Third Party RM , flags = Open,Enlisted,Associated
There are 1 in-doubt transactions at this time
7. CONCLUSION
Two-phase commit relies on rollback in the prepare-to-commit phase of the process. The systems are based on modern resource managers that support 2-PC at the functional level of abstraction. Thus, 2-PC will frequently not be the optimal solution for business integration scenarios, and other alternatives should be evaluated carefully. The DataDirect Connect for JDBC drivers provide this support. In combination with the other components of the distributed transaction process, DataDirect drivers enhance the capability, speed, and efficiency of the modern enterprise.
In order to meet the failure/recovery requirements of XA, the Transaction Manager and all XAResource managers have to record transaction information "durably". In practical terms, this means they have to save the data to disk in some sort of transaction log. The XA protocol defines exactly when such transaction log disk-forces have to happen. This gives guarantees to the various components and allows them to make certain assumptions.
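A disk-force of this kind can be sketched with java.nio. The example below is only an illustration under stated assumptions: the class ForcedTxLog and its one-line record format are invented here, and the sketch does not decide when records must be written, which is the part the XA protocol defines. It shows just the mechanism, namely that a record is appended and FileChannel.force is called before the method returns.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical append-only transaction log; the record format is invented for illustration. */
final class ForcedTxLog implements AutoCloseable {
    private final FileChannel channel;

    ForcedTxLog(Path file) throws IOException {
        channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    }

    /** Appends one record and returns only after it has been forced to the storage device. */
    void record(String txId, String state) throws IOException {
        ByteBuffer line = ByteBuffer.wrap((txId + " " + state + "\n").getBytes(StandardCharsets.UTF_8));
        while (line.hasRemaining()) {
            channel.write(line);
        }
        channel.force(true); // the disk-force: flush file content and metadata
    }

    @Override public void close() throws IOException {
        channel.close();
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("tx-demo", ".log");
        try (ForcedTxLog txLog = new ForcedTxLog(log)) {
            txLog.record("tx-1", "PREPARED");  // logged before voting yes
            txLog.record("tx-1", "COMMITTED"); // logged once the outcome is known
        }
        System.out.println(Files.readAllLines(log));
    }
}
```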
An XAResource implementation whose interaction protocol uses mobile agents can offer higher performance than equivalent static solutions, with a smaller software footprint and less network traffic. The simplest interaction protocol can be employed by configuring the agent according to the thick/thin agent concept, where the thin agent is mobile.