Sunday, October 20, 2013

Distributed Database - Fragmentation continued


Fragmentation in Distributed Database System / Horizontal Fragmentation in Distributed Database / Derived Horizontal Fragmentation Example / Derived Horizontal Fragmentation Explained



Derived Horizontal Fragmentation


The process of creating horizontal fragments of a table in question based on the already created horizontal fragments of another relation (for example, base table) is called Derived Horizontal Fragmentation.

In the previous post, we have seen about Primary Horizontal Fragmentation. We use the primary horizontal technique when we would like to horizontally fragment a table which is not dependent on any other table, or without considering any other table. That is, a table fragmented based on set of conditions where all the conditional attributes are part of that table only. This type of fragmentation is simple and straight forward. But in most of the cases, we need to fragment a database as a whole. For example, consider a relation which is connected with another relation using foreign key concept. That is, whenever a record is inserted into the child table, the foreign key column value of the inserted record must be verified for its availability in its parent table. In such condition, we cannot fragment the parent table (Table with primary key) and the child table (table with foreign key). If we fragment the tables separately, then for every insertion of records the table must verify the existence of one such value in the parent table. Hence, for this case the Primary Horizontal Fragmentation would not work.

Consider an example, where an organization maintains the information about its customers.They store information about the customer in CUSTOMER table and the customer addresses in C_ADDRESS table as follows;
CUSTOMER(CId, CName, Prod_Purchased, Shop_Location)
C_ADDRESS(CId, C_Address)
The table CUSTOMER stores information about the customer, the product purchased from their shop, and the shop location where the product is purchased. C_Address stores information about permanent and present addresses of the customer. Here, CUSTOMER is the owner relation and C_ADDRESS is the member relation.


Figure 1: CUSTOMER table
CID
CNAME
PROD_PURCHASED
SHOP_LOCATION
C001
Ram
Air Conditioner
Mumbai
C002
Guru
Television
Chennai
C010
Murugan
Television
Coimbatore
C003
Yuvraj
DVD Player
Pune
C004
Gopinath
Washing machine
Coimbatore

Figure 2: C_ADDRESS table
CID
C_ADDRESS
C001
Bandra, Mumbai
C001
XYZ, Pune
C002
T.Nagar, Chennai
C002
Kovil street, Madurai
C003
ABX, Pune
C004
Gandhipuram, Ooty
C004
North street, Erode
C010
Peelamedu, Coimbatore

If the organization would go for fragmenting the relation CUSTOMER on the shop_location attribute, it needs to create 4 fragments using horizontal fragmentation technique as given in Figure 3 below.

Figure 3: Horizontal fragments of Figure 1 on Shop_Location attribute
CUSTOMER1
CID
CNAME
PROD_PURCHASED
SHOP_LOCATION
C001
Ram
Air Conditioner
Mumbai

CUSTOMER2
CID
CNAME
PROD_PURCHASED
SHOP_LOCATION
C002
Guru
Television
Chennai

CUSTOMER3
CID
CNAME
PROD_PURCHASED
SHOP_LOCATION
C010
Murugan
Television
Coimbatore
C004
Gopinath
Washing machine
Coimbatore

CUSTOMER4
CID
CNAME
PROD_PURCHASED
SHOP_LOCATION
C003
Yuvraj
DVD Player
Pune

Now, it is necessary to fragment the second relation C_ADDRESS based on the fragment created on CUSTOMER relation. Because, in any other way, if we fragment the relation C_ADDRESS, then it may end in different location for different data. For example, if C_ADDRESS is fragmented on the last digit of the CID attribute, it will end up with more number of fragments and the data may not be stored in the same location where customer information are stored. That is, customer ‘Ram’ information is stored in Mumbai and his address information might be stored somewhere else. To avoid such confusion, the table C_ADDRESS which is actually a member table of CUSTOMER, must be fragmented into four fragments and based on the CUSTOMER table fragments given in Figure 3. This type of fragmentation based on owner relation is called Derived Horizontal Fragmentation. This will work for relations where an equi-join is required for joining two relations. Because, an equi-join can be represented as set of semi-joins.

The fragmentation of C_ADDRESS is done as follow as set of semi-joins as follows.
C_ADDRESS1 = C_ADDRESS CUSTOMER1
C_ADDRESS2 = C_ADDRESS CUSTOMER2
C_ADDRESS3 = C_ADDRESS CUSTOMER3
C_ADDRESS4 = C_ADDRESS CUSTOMER4

This will result in four fragments of C_ADDRESS where the customer address of all customers of fragment CUSTOMER1 will go into C_ADDRESS1, and the customer address of all customers of fragment CUSTOMER2 will go into C_ADDRESS2, and so on. The resultant fragment of C_ADDRESS will be the following.

Figure 4: Derived Horizontal fragments of Figure 2 as a member relation of the owner relation’s fragments from Figure 3
C_ADDRESS1
CID
C_ADDRESS
C001
Bandra, Mumbai
C001
XYZ, Pune

C_ADDRESS2
CID
C_ADDRESS
C002
T.Nagar, Chennai
C002
Kovil street, Madurai

C_ADDRESS3
CID
C_ADDRESS
C004
Gandhipuram, Ooty
C004
North street, Erode
C010
Peelamedu, Coimbatore

C_ADDRESS4
CID
C_ADDRESS
C003
ABX, Pune

 

Checking for correctness

Completeness: The completeness of a derived horizontal fragmentation is more difficult than primary horizontal fragmentation. Because, the predicates used are determining the fragmentation of two relations. Formally, for fragmentation of two relations R and S, such as {R1, R2, …, R3} and {S1, S2, …, S3}, there should be one common attribute such as A. Then, for each tuple t of Ri, there should be a tuple Si which have a common value for A. This is known as referential integrity.
The derived fragmentation of C_ADDRESS is complete. Because, the value of the common attributes CID for the fragments CUSTOMERi and C_ADDRESSi are the same. For example, the value present in CID of CUSTOMER1 is also and only present in C_ADDRESS1,  etc.

Reconstruction: Reconstruction of a relation from its fragments is performed by the union operator in both the primary and the derived horizontal fragmentation.

Disjointness: If the minterm predicates are mutually exclusive then the disjointness rule is satisfied for Primary Horizontal Fragmentation. For derived horizontal fragmentation, we state that the fragments are disjoint if the fragments were created using the mutually exclusive simple predicates of the base relation. Hence, in our example, as the simple predicates Shop_Location=’Mumbai’, etc are mutually exclusive, the derived fragments are also disjoint.

*************

Go to Distributed Database page

Go to Distributed Database - Fragmentation page

Go to Primary Horizontal Fragmentation page



Tuesday, October 8, 2013

Distributed Database - Fragmentation


Fragmentation in Distributed Database System / Horizontal Fragmentation in Distributed Database / Primary Horizontal Fragmentation Example / Primary Horizontal Fragmentation Explained

Fragmentation



Fragmentation involves breaking a relation (table) into two or more pieces either horizontally (Horizontal Fragmentation) or vertically (Vertical Fragmentation) or both (Hybrid), mainly to improve the availability of data to the end user and end user programs. 

Let us start this section with an example. Consider XYZ bank, which is currently having around 1000 branches all over the country. Assume that it maintains its database at single location, say New Delhi (Head office - Central Site). Now the problem is that, all the requests generated from any part of the country can only be handled at the central site (New Delhi). The requests might be generated for withdrawal of money, balance inquiry, PIN change request, transfer of funds, POS purchase, etc., through ATM, Net Banking, POS terminals. Think about the number of transactions could be generated and the network traffic created if thousands of the bank customer uses the above said mode for daily transactions, including direct bank transactions at the bank counters. 

One possible solution for handling such a huge number of transactions is to have distributed database. But, we have set of questions in front of us. They are;
  • How are we going to fragment a table?
  • How many fragments to be created?
  • Which strategy of fragmentation would help improving the performance?
  • Should one need to fragment all the tables in a database or only a few tables?
  • Where do we keep the fragments after fragmentation? (Allocation problem)
Answer to these questions would help us in understanding, fragmenting, and improving the overall system.

Types of Fragmentation:

 The first question 'How are we going to fragment a table?' can be answered here. We have the following types of fragmentation.

1.     Horizontal Fragmentation




2.     Vertical Fragmentation

3.     Hybrid Fragmentation

We shall discuss one by one in detail. 

********************


Go back to Distributed database page





Featured Content

Multiple choice questions in Natural Language Processing Home

MCQ in Natural Language Processing, Quiz questions with answers in NLP, Top interview questions in NLP with answers Multiple Choice Que...

All time most popular contents

data recovery