Principles Of Distributed Database Systems Exercise Solutions May 2026

The flickering neon sign of "The Partitioned Plate," a diner known for its chaotic yet surprisingly efficient service, hummed with a low-frequency buzz. Inside, Elara, a database architect with a penchant for solving unsolvable puzzles, sat hunched over a worn copy of "Principles of Distributed Database Systems."

She wasn't just reading; she was wrestling with a phantom. A phantom named "The Inconsistent State."

For weeks, her team's distributed transaction system had been plagued by phantom reads and lost updates. Every time they thought they had the concurrency control figured out, a new anomaly would ripple through the nodes like a digital seismic wave.

"Trouble with the exercise sets again, Elara?" a voice rasped from across the counter. It was Silas, the diner's owner, a man whose wisdom was as deep as his coffee was black.

Elara sighed, pushing the book toward him. "Exercise 12.4. Reliability and Fault Tolerance. I can't seem to find the right balance between replication and performance. Every time I increase the replication factor to handle node failures, the write latency skyrockets."

Silas leaned in, his eyes twinkling. "Think of this diner, Elara. We've got three kitchens, right? All serving the same menu. If one kitchen goes down, the others pick up the slack. But if we try to make sure every single chef in every kitchen knows exactly what every customer ordered the second they order it, nothing would ever get cooked."

Elara frowned. "But we need consistency, Silas. We can't have one customer getting their pancakes while another is told they're out of stock when they're not."

"Exactly," Silas said, tapping the book. "The key isn't perfect synchronization. It's about eventual consistency. You don't need every node to be identical every millisecond. You just need them to agree on the final state before the bill is paid."

He pointed to a specific diagram in the exercise set—a complex web of message exchanges and heartbeat protocols. "Look at the quorum-based protocols. They don't require everyone to agree, just a majority. It's like my staff. If three out of five servers say we're out of blueberry muffins, we're out of blueberry muffins. We don't need to wait for the other two to check the pantry."

Elara's eyes widened. She began to see the logic. The exercise wasn't about finding a single, perfect solution; it was about understanding the trade-offs. The "answer" wasn't a formula, but a strategy.

She spent the rest of the night scribbling notes, mapping out quorum systems and failure-aware commit protocols. The solutions weren't just lines of code; they were a blueprint for a resilient, distributed world.

As the sun began to peek over the horizon, Elara finally closed the book. The phantom of inconsistency hadn't vanished, but it was no longer a threat. She had the principles. She had the solutions. And most importantly, she had a fresh perspective, courtesy of a diner owner and a very challenging exercise set.

She left a generous tip, not just for the coffee, but for the clarity. The "Principles of Distributed Database Systems" were no longer just abstract concepts; they were the tools she would use to build something truly robust. And as she stepped out into the crisp morning air, she knew that even in a world of distributed systems and inevitable failures, consistency, eventually, would always prevail.

Introduction

Distributed database systems are designed to store and manage large amounts of data across multiple sites or nodes. The data is typically replicated or partitioned across multiple nodes to improve performance, reliability, and scalability. In this write-up, we will discuss the principles of distributed database systems and provide solutions to common exercises.

Principles of Distributed Database Systems

  1. Distribution: The data is divided into smaller fragments and stored across multiple nodes.
  2. Autonomy: Each node operates independently and makes its own decisions about data management.
  3. Heterogeneity: Nodes may have different hardware, software, and data models.
  4. Transparency: The distribution of data is transparent to users, who can access data without knowing its location.

Types of Distributed Database Systems

  1. Client-Server Systems: A centralized server manages data and clients access data through queries.
  2. Peer-to-Peer Systems: All nodes are equal and can act as both clients and servers.
  3. Federated Systems: Multiple autonomous databases are integrated to provide a unified view.

Exercise Solutions

Exercise 1: Design a Distributed Database Schema

Suppose we have a distributed database system for a university with three nodes: Node A (New York), Node B (Chicago), and Node C (Los Angeles). The database has two relations: Students and Courses.

Solution

We can place one relation at each node, adding an Enrollments relation to record which students take which courses:

  • Node A (New York): Students relation with attributes Student_ID, Name, Age
  • Node B (Chicago): Courses relation with attributes Course_ID, Course_Name, Credits
  • Node C (Los Angeles): Enrollments relation with attributes Student_ID, Course_ID, Grade

Exercise 2: Fragmentation and Allocation

Suppose we have a relation Orders with attributes Order_ID, Customer_ID, Order_Date, and Total. We want to fragment this relation into two fragments: Orders_1 and Orders_2. We also want to allocate these fragments to two nodes: Node A and Node B.

Solution

We can fragment the Orders relation based on the Order_Date attribute:

  • Orders_1: Orders with Order_Date between 2020 and 2022
  • Orders_2: Orders with Order_Date between 2023 and 2025

We can allocate these fragments to nodes as follows:

  • Node A: Orders_1
  • Node B: Orders_2
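The fragmentation and allocation above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the sample rows and the helper names `fragment` and `nodes_for_range` are assumptions made for the example.

```python
from datetime import date

# Hypothetical sample rows: (Order_ID, Customer_ID, Order_Date, Total)
ORDERS = [
    (1, 101, date(2020, 3, 14), 250.0),
    (2, 102, date(2022, 11, 2), 80.5),
    (3, 101, date(2023, 1, 9), 42.0),
    (4, 103, date(2025, 6, 30), 310.0),
]

# Fragment definitions: name -> (first year, last year), both inclusive.
FRAGMENTS = {"Orders_1": (2020, 2022), "Orders_2": (2023, 2025)}

# Allocation: fragment name -> node that stores it.
ALLOCATION = {"Orders_1": "Node A", "Orders_2": "Node B"}


def fragment(rows):
    """Split rows into horizontal fragments by the year of Order_Date."""
    result = {name: [] for name in FRAGMENTS}
    for row in rows:
        year = row[2].year
        for name, (lo, hi) in FRAGMENTS.items():
            if lo <= year <= hi:
                result[name].append(row)
    return result


def nodes_for_range(first_year, last_year):
    """Return the nodes whose fragments overlap the requested year range."""
    return sorted(
        ALLOCATION[name]
        for name, (lo, hi) in FRAGMENTS.items()
        if lo <= last_year and first_year <= hi
    )


fragments = fragment(ORDERS)
print({name: [r[0] for r in rows] for name, rows in fragments.items()})
print(nodes_for_range(2021, 2021))  # only one node needs to be contacted
print(nodes_for_range(2022, 2023))  # the range spans both fragments
```

Because the two year ranges do not overlap, a query restricted to a single year only has to contact one node.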

Exercise 3: Distributed Query Processing

Suppose we have a query to retrieve the names of students who are enrolled in a course with a specific course ID.

Solution

We can process this query in a distributed manner as follows:

  1. Node A (New York) receives the query and sends a subquery to Node C (Los Angeles) to retrieve the Student_IDs of students enrolled in the course.
  2. Node C (Los Angeles) executes the subquery and sends the Student_IDs back to Node A.
  3. Node A (New York) receives the Student_IDs and runs a local query against its own Students relation to retrieve the names of students with those Student_IDs.
  4. Node A (New York) sends the names of the students back to the user.
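The steps above can be simulated with plain Python data structures. This is only a sketch under stated assumptions: the sample students and enrollments are invented, and the function names are hypothetical stand-ins for the subqueries each node would run.

```python
# Hypothetical contents of the two sites involved in the query.
NODE_A_STUDENTS = {  # Student_ID -> (Name, Age)
    1: ("Ana", 20),
    2: ("Ben", 22),
    3: ("Chi", 21),
}
NODE_C_ENROLLMENTS = [  # (Student_ID, Course_ID, Grade)
    (1, "CS101", "A"),
    (2, "CS101", "B"),
    (3, "MA201", "A"),
]


def node_c_student_ids(course_id):
    """Subquery run at Node C: IDs of students enrolled in course_id."""
    return {sid for sid, cid, _grade in NODE_C_ENROLLMENTS if cid == course_id}


def node_a_names(student_ids):
    """Local query at Node A: names for the given Student_IDs."""
    return sorted(
        NODE_A_STUDENTS[sid][0] for sid in student_ids if sid in NODE_A_STUDENTS
    )


def students_in_course(course_id):
    """Coordinator logic at Node A: remote subquery, then local lookup."""
    ids = node_c_student_ids(course_id)  # only Student_IDs cross the network
    return node_a_names(ids)             # names are resolved locally


print(students_in_course("CS101"))
```

Only the Student_IDs travel between Los Angeles and New York; the Students relation never leaves Node A.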

Conclusion

Distributed database systems are complex systems that require careful design, implementation, and management. Understanding the principles of distributed database systems, including distribution, autonomy, heterogeneity, and transparency, is crucial for designing and implementing efficient and scalable systems. The exercise solutions provided in this write-up demonstrate how to apply these principles to real-world problems.

References:

  • [1] M. T. Özsu and P. Valduriez, "Principles of Distributed Database Systems", 3rd ed., Springer, 2011.
  • [2] S. C. B. Tan, "Distributed Database Systems: A Tutorial", Prentice Hall, 2001.

Mastering the Core: Principles of Distributed Database Systems Exercise Solutions

Distributed database systems (DDBS) are the backbone of modern, globalized computing. From social media feeds to international banking, the ability to manage data across multiple physical locations is essential. However, the complexity of these systems—covering fragmentation, replication, query optimization, and transaction management—can be daunting.

Working through exercise solutions is often the most effective way to bridge the gap between abstract theory and technical implementation. This article explores the fundamental principles of DDBS through the lens of common problem sets and their solutions.

1. Data Fragmentation and Allocation

One of the first challenges in a distributed environment is deciding how to split data (fragmentation) and where to put it (allocation).

Horizontal vs. Vertical Fragmentation

Horizontal Fragmentation: Dividing a relation into subsets of tuples (rows). Solutions usually involve defining selection predicates (e.g., WHERE City = 'New York').

Vertical Fragmentation: Dividing a relation into subsets of attributes (columns). Solutions focus on grouping attributes frequently accessed together, often using an Attribute Affinity Matrix.

Common Exercise Scenario:

Problem: Given a global schema and specific site queries, determine the optimal fragments.

Solution Tip: Use Minterm Predicates. By taking every conjunction of the applications' simple predicates and their negations, you create non-overlapping fragments that satisfy the "completeness" and "disjointness" rules.

2. Distributed Query Processing

In a distributed system, the cost of moving data over a network often outweighs the cost of local disk I/O.

Localization and Optimization

Query processing solutions typically follow a four-step process:

Query Decomposition: Rewriting the calculus query into an algebraic one.

Data Localization: Replacing global relations with their fragments.

Global Optimization: Finding the best join order and communication strategy.

Local Optimization: Selecting the best local access paths.

Common Exercise Scenario:

Problem: Calculate the cost of a join between two tables located at different sites using a Semi-join.

Solution Tip: Remember that a semi-join reduces the size of the operand before it is sent across the network. The semi-join is more efficient when the cost of shipping the join column plus the cost of shipping the reduced relation is lower than the cost of shipping the original relation whole.

3. Distributed Concurrency Control

Ensuring consistency when multiple users access data across sites requires sophisticated locking and ordering mechanisms.

Locking and Timestamping

Distributed 2-Phase Locking (2PL): Managing "lock" and "unlock" phases across multiple nodes. Solutions often deal with Global Deadlock Detection, where a cycle exists in the Wait-For-Graph across different sites.

Timestamp Ordering: Assigning unique timestamps to transactions to ensure serializability without explicit locking.

4. Reliability and the Two-Phase Commit (2PC)

How do we ensure that a transaction either commits at every site or aborts at every site?

The 2PC Protocol

Voting Phase: The coordinator asks participants if they are ready to commit.

Decision Phase: Based on the votes, the coordinator sends a "Global Commit" or "Global Abort" message.

Common Exercise Scenario:

Problem: What happens if the coordinator fails after sending a "Prepare" message but before receiving all votes?

Solution Tip: Participants that have already voted "Yes" are left in a "blocked" state. They cannot decide on their own because they don't know the global outcome, which highlights a major weakness of basic 2PC and motivates 3PC and recovery protocols.

5. Parallel Database Systems

While distributed systems focus on geographic separation, parallel systems focus on performance via multiple processors and disks.

Architectures

Shared Memory: Fast, but with limited scalability.

Shared Disk: Good for clusters but suffers from communication overhead.

Shared Nothing: The gold standard for massive scalability (e.g., MapReduce, Hadoop).

Conclusion: How to Approach Exercise Solutions

When studying "Principles of Distributed Database Systems," don't just look for the answer. Focus on the correctness rules:

Completeness: No data is lost during fragmentation.

Reconstruction: You can rebuild the original relation from fragments.

Disjointness: Data isn't unnecessarily duplicated (unless specifically replicated for availability).
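The three rules can be checked mechanically for horizontal fragments. The function below is a rough sketch, assuming the relation has no duplicate tuples; the name `check_horizontal` and the sample relation are illustrative only.

```python
def check_horizontal(relation, fragments):
    """Check the three correctness rules for horizontal fragments.

    relation  -- list of hashable tuples (the global relation, no duplicates)
    fragments -- list of lists of tuples (one list per fragment)
    """
    original = set(relation)
    placed = [t for frag in fragments for t in frag]
    union = set(placed)
    return {
        # Completeness: every original tuple appears in some fragment.
        "complete": original <= union,
        # Reconstruction: the union of the fragments is exactly the relation.
        "reconstructible": union == original,
        # Disjointness: no tuple appears in more than one fragment.
        "disjoint": len(placed) == len(union),
    }


# Hypothetical three-tuple relation used to exercise the checker.
R = [(1, "John", 25), (2, "Jane", 30), (3, "Joe", 35)]

good = check_horizontal(R, [[R[0]], [R[1], R[2]]])
overlapping = check_horizontal(R, [[R[0], R[1]], [R[1], R[2]]])
lossy = check_horizontal(R, [[R[0]]])
print(good, overlapping, lossy, sep="\n")
```

An overlapping design still passes completeness and reconstruction but fails disjointness, which is acceptable only when the overlap is deliberate replication.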

By mastering these mathematical and logical foundations, you move beyond rote memorization and toward designing resilient, high-performance distributed architectures.

Principles of Distributed Database Systems

A distributed database system is a collection of multiple databases that are connected through a network, allowing users to access and share data across different locations. The main goals of a distributed database system are:

  1. Improved data availability: Data is available at multiple sites, reducing the risk of data loss or unavailability.
  2. Increased scalability: Distributed databases can handle large amounts of data and support a large number of users.
  3. Enhanced performance: Data can be accessed from multiple sites, reducing the load on individual databases.

Key Concepts

  1. Fragmentation: Breaking a large database into smaller fragments, each stored at a different site.
  2. Replication: Maintaining multiple copies of data at different sites to improve availability and performance.
  3. Distribution: Storing data across multiple sites, each with its own database management system.

Types of Distributed Database Systems

  1. Client-Server Systems: A central server manages data, and clients access data through a network.
  2. Peer-to-Peer Systems: All sites are equal, and each site can act as both a client and a server.

Exercise Solutions

Exercise 1: What are the main advantages of a distributed database system?

Solution: The main advantages of a distributed database system are:

  • Improved data availability
  • Increased scalability
  • Enhanced performance

Exercise 2: What is fragmentation in a distributed database system?

Solution: Fragmentation is the process of breaking a large database into smaller fragments, each stored at a different site.

Exercise 3: What is replication in a distributed database system?

Solution: Replication is the process of maintaining multiple copies of data at different sites to improve availability and performance.

Exercise 4: Consider a distributed database system with three sites: A, B, and C. Each site stores part of a relation R. The relation R has the following tuples:

| ID | Name | Age |
| --- | --- | --- |
| 1 | John | 25 |
| 2 | Jane | 30 |
| 3 | Joe | 35 |

Site A has the following fragment of R:

| ID | Name | Age |
| --- | --- | --- |
| 1 | John | 25 |
| 2 | Jane | 30 |

Site B has the following fragment of R:

| ID | Name | Age |
| --- | --- | --- |
| 2 | Jane | 30 |
| 3 | Joe | 35 |

Site C has the following fragment of R:

| ID | Name | Age |
| --- | --- | --- |
| 1 | John | 25 |
| 3 | Joe | 35 |

a. What is the fragmentation of R?

b. What is the replication factor of R?

Solution:

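a. R is horizontally fragmented into three overlapping fragments:

R = R1 ∪ R2 ∪ R3

where R1, R2, and R3 are the fragments of R at sites A, B, and C, respectively. The fragmentation is complete but not disjoint, because every tuple appears in two fragments.

b. The replication factor of R is 2: no site holds all of R, but each tuple is stored at exactly two of the three sites.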
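Both parts can be checked by counting how many sites hold each tuple. The snippet below is a small sketch that uses only the fragments listed in the exercise; the variable names are illustrative.

```python
from collections import Counter

# Fragments as listed in the exercise, keyed by site.
SITES = {
    "A": [(1, "John", 25), (2, "Jane", 30)],
    "B": [(2, "Jane", 30), (3, "Joe", 35)],
    "C": [(1, "John", 25), (3, "Joe", 35)],
}

# Part (a): the union of the fragments reconstructs R.
R = sorted(set(t for frag in SITES.values() for t in frag))

# Part (b): number of sites holding each tuple.
copies = Counter(t for frag in SITES.values() for t in frag)

print(R)
print({t[0]: n for t, n in copies.items()})
```

Every tuple is counted twice, so the loss of any single site leaves a full copy of R available across the remaining two.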

Exercise 5: Consider a distributed database system with two sites: A and B. Site A has a relation R1, and site B has a relation R2. The relations R1 and R2 have the following tuples:

R1:

| ID | Name | Age |
| --- | --- | --- |
| 1 | John | 25 |
| 2 | Jane | 30 |

R2:

| ID | Name | Age |
| --- | --- | --- |
| 3 | Joe | 35 |
| 4 | Sarah | 20 |

Design a distributed query to retrieve all tuples from R1 and R2.

Solution:

The distributed query can be written as:

SELECT * FROM R1 UNION SELECT * FROM R2

This query retrieves all tuples from R1 at site A and R2 at site B, and combines them into a single result set.
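To see the idea run end to end, the two sites can be imitated with two separate in-memory SQLite databases. This is an illustrative sketch, not how a real DDBMS ships subqueries: the helper `make_site` is hypothetical, and the coordinator's merge step is done in Python rather than by a distributed query engine.

```python
import sqlite3


def make_site(table, rows):
    """Create an in-memory SQLite database standing in for one site."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} (ID INTEGER, Name TEXT, Age INTEGER)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", rows)
    return conn


site_a = make_site("R1", [(1, "John", 25), (2, "Jane", 30)])
site_b = make_site("R2", [(3, "Joe", 35), (4, "Sarah", 20)])

# Each site evaluates its own subquery; the coordinator merges the results.
part_a = site_a.execute("SELECT * FROM R1").fetchall()
part_b = site_b.execute("SELECT * FROM R2").fetchall()
result = sorted(set(part_a) | set(part_b))  # set union mirrors SQL UNION
print(result)
```

Since R1 and R2 share no tuples here, UNION ALL would return the same rows without the cost of duplicate elimination.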

This essay explores the core principles of distributed database systems (DDBS) by analyzing common architectural challenges and their standard exercise solutions. Distributed databases manage data across multiple physical locations while appearing as a single logical unit to the user, necessitating complex solutions for transparency, consistency, and reliability.

The Principle of Distribution Transparency

A primary goal of a DDBS is to hide the complexities of data distribution from the user. Exercise solutions in this area typically focus on Location Transparency and Fragmentation Transparency.

Problem: How can a user query a table without knowing it is split across servers in New York and London?

Solution: Systems use a Global Conceptual Schema (GCS) that maps logical tables to physical fragments. Solutions often involve "Transparent Mapping," where the query optimizer automatically decomposes a global query into sub-queries targeted at specific nodes. This ensures that the user's SQL remains identical regardless of where the data resides.

Data Fragmentation and Allocation

Efficiency in a distributed system depends on how data is divided. Exercises often ask for the best way to fragment a database based on access patterns.

Horizontal Fragmentation: Dividing a relation into subsets of tuples (rows). Solutions usually involve using selection predicates (e.g., WHERE City = 'Chicago') to keep data close to its most frequent users.

Vertical Fragmentation: Dividing a relation into subsets of attributes (columns). Solutions focus on grouping attributes that are frequently accessed together to reduce unnecessary I/O across the network.

Allocation: Deciding which site stores each fragment. Exercise solutions typically apply the "Locality of Reference" principle: place data where it is most frequently accessed to minimize communication costs.

Distributed Query Processing

Querying across multiple nodes introduces the "Join" problem. Since moving large tables across a network is expensive, solutions prioritize minimizing data transfer.

Semijoin Optimization: A classic exercise solution to reduce communication cost. Instead of sending an entire Table A to Table B's site for a join, the system sends only the joining column of A. Table B filters its rows against this column and sends back only the matching records. This drastically reduces the volume of data crossing the network.

Concurrency Control and Consistency

Maintaining data integrity across sites is perhaps the most difficult aspect of DDBS. Exercises often center on the CAP Theorem (Consistency, Availability, Partition Tolerance) and the Two-Phase Commit (2PC) protocol.

Two-Phase Commit (2PC): To ensure atomicity (all or nothing), solutions follow a "Prepare" phase and a "Commit" phase. A coordinator asks all participants if they are ready; if even one node fails or votes "No," the entire transaction is rolled back.

Deadlock Detection: In distributed systems, deadlocks can occur across sites. Solutions often involve a "Global Wait-For Graph" (GWFG) to detect cycles, or timestamp-based prevention schemes such as "Wait-Die" and "Wound-Wait," which rule out circular dependencies between remote transactions.

Reliability and Replication

Replication ensures that if one node fails, the system remains operational. However, keeping replicas synchronized is a major hurdle.

Exercise Solution: Solutions often utilize a Primary Copy or Voting algorithm. In a Primary Copy setup, all updates go to one master node first. In Voting, a transaction must write to a "quorum" (majority) of replicas to be considered successful, balancing the trade-off between high availability and strict consistency.

Conclusion

The study of distributed database system exercises reveals a consistent theme: the trade-off between performance and transparency. Solutions to these problems—ranging from semijoins for query optimization to two-phase commits for integrity—demonstrate the necessity of rigorous protocols to manage the inherent "noise" and latency of networked environments. Understanding these principles is essential for building scalable, resilient modern applications.

Official exercise solutions for Principles of Distributed Database Systems by M. Tamer Özsu and Patrick Valduriez (3rd and 4th editions) are primarily restricted to instructors. However, students can access several alternative resources for practice.

1. Official Companion Sites (Instructor Restricted)

The authors provide companion websites for the 3rd and 4th editions, hosted at the University of Waterloo. While these sites host presentation slides and errata for public download, full exercise solutions require instructor registration and evidence of course adoption.

2. Available Public Study Resources

If you are looking for specific problem breakdowns, several academic and community platforms host partial solutions:

  • Chapter-Specific Solutions: Some document-sharing platforms host documents covering specific topics, such as Chapter 3: Distributed Database Design (Horizontal/Vertical Fragmentation).
  • University Course Documents: Some university portals host solution manuals or PDFs uploaded by students for study purposes, such as the "Principles Of Distributed Database Systems Solution Manual," which covers key concepts like the CAP theorem and ACID properties.
  • GitHub Tech Notes: Developers and students often post personal notes and summaries of textbook exercises. For example, "tech-notes" provides structured summaries of the principles discussed in the text.

3. Alternative Practice Resources

If you are using the book for self-study and cannot access the restricted solutions, consider these similar resources that provide open-access practice problems:

  • Database System Concepts: This textbook (Silberschatz, Korth, Sudarshan) provides a public "Solutions to Practice Exercises" page, which includes a dedicated section on distributed databases.
  • Distributed Systems: Principles and Paradigms: The authors of this related text provide a comprehensive open PDF of solutions for concepts like distribution transparency and failure recovery.

Finding formal exercise solutions for the authoritative textbook Principles of Distributed Database Systems (4th Edition, 2020) by M. Tamer Özsu and Patrick Valduriez can be challenging because the authors primarily restrict full solution manuals to instructors.

However, you can access specific helpful resources and sample solutions through the following official and academic channels:

1. Official Textbook Resources

The authors maintain a dedicated site at the University of Waterloo for the 4th edition. While the full manual is restricted, this site is the most reliable source for:

  • Solutions to Selected Exercises: Links to specific PDFs containing verified answers for core chapters.
  • Presentation Slides: These often contain "in-class" examples and solved problems that mirror the exercises in the book.
  • Errata: Crucial for ensuring you aren't trying to solve an exercise with a typo.

2. Verified Solutions for Key Concepts

Common exercises in this field often focus on specific algorithmic problems. You can find solved examples for these topics on academic platforms:

  • Data Fragmentation & Allocation: Step-by-step solutions for vertical and horizontal fragmentation.
  • Distributed Query Optimization: Look for solutions regarding join ordering and semijoin programs, which are frequently used in distributed systems homework.
  • Concurrency Control: Solutions involving Two-Phase Commit (2PC) and Paxos consensus algorithms are often provided in university course repositories.

3. Alternative Peer-to-Peer Learning

If official solutions are unavailable for a specific problem, these platforms host student-uploaded solution sets:

  • Course Hero: Hosts various versions of the "Principles of Distributed Database Systems Exercise Solutions" uploaded by students from institutions like GITAM University and BITS Pilani.
  • Database System Concepts (Practice Site): While for a different book, the Practice Exercises by Silberschatz et al. provide publicly available solutions for overlapping topics like distributed transactions and deadlock.

If you are looking for resources related to the textbook "Principles of Distributed Database Systems" (by M. Tamer Özsu and Patrick Valduriez), the sources below are a reasonable starting point.

Official & Academic Resources

Official Author Site: The authors often provide slide decks and supplementary materials. Check the official book website for potential sample solutions or instructor resources.

GitHub Repositories: Many students and researchers post their own implementations of the book's concepts (like join algorithms or deadlock detection). Searching GitHub for "Principles of Distributed Database Systems Solutions" often yields community-driven answer keys.

University Course Pages: Many universities use this as a standard text. Searching for site:.edu "Principles of Distributed Database Systems" assignment solutions can lead to public course archives from past semesters.

Common Topics in Exercises

Exercises in this field typically focus on:

Data Fragmentation: Defining horizontal and vertical fragments for a given schema.

Distributed Query Optimization: Calculating the cost of distributed joins and semi-joins.

Transaction Management: Solving problems related to 2-Phase Commit (2PC) and distributed deadlock detection.

Reliability Protocols: Analyzing how systems handle site or network failures.

Peer-Shared Solutions

For specific step-by-step answers to the textbook's problems, platforms like Course Hero and Scribd have user-uploaded PDFs. Note that these often require a subscription to view in full.


Principles of Distributed Database Systems: Exercise Solutions & Key Concepts

Mastering distributed database systems (DDBS) requires more than just reading theory; it demands a hands-on approach to solving complex architectural puzzles. Whether you are studying for an exam or designing a scalable system, working through exercise solutions is the best way to internalize how data moves across a network.

This guide explores the core principles of DDBS through the lens of common exercise problems and their practical solutions.

1. Data Fragmentation and Allocation

One of the first hurdles in any DDBS course is determining how to split a global relation into pieces (fragmentation) and where to store them (allocation).

Exercise Scenario:

You have a global relation Employee (EmpID, Name, Dept, Salary, Location). You need to fragment this based on the query: "Find employees working in New York or London."

Solution Approach:

Horizontal Fragmentation: This involves using a SELECT operation. You define fragments based on the Location attribute.

Vertical Fragmentation: If a query only needs Name and Salary, you would use a PROJECT operation to split columns rather than rows.

The Correctness Rules: Ensure your solution meets three criteria: Completeness (no data lost), Reconstruction (can join/union back to the original), and Disjointness (no unnecessary duplication).

2. Distributed Query Optimization

Querying a distributed system is expensive because of "communication costs." Exercises often ask you to calculate the cost of a Join operation across two different sites.

Key Concept: Semijoins

A common solution to reduce data transfer is the Semijoin. Instead of sending an entire table across the network, you send only the joining column, filter the remote table, and send the smaller result back.
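A quick way to compare the two strategies is to count the bytes each one moves. The sketch below uses invented table sizes and a deliberately simple cost model (bytes transferred only, no per-message overhead); the function names are illustrative.

```python
def full_join_cost(r_rows, r_row_bytes):
    """Bytes moved when the whole of R is shipped to the other site."""
    return r_rows * r_row_bytes


def semijoin_cost(s_join_values, key_bytes, r_matching_rows, r_row_bytes):
    """Bytes moved by a semijoin: join column of S out, matching R rows back."""
    return s_join_values * key_bytes + r_matching_rows * r_row_bytes


# Hypothetical numbers: R has 10,000 rows of 200 bytes; S has 500 distinct
# join values of 8 bytes; 1,000 rows of R match them.
ship_all = full_join_cost(10_000, 200)
ship_reduced = semijoin_cost(500, 8, 1_000, 200)
print(ship_all, ship_reduced, ship_reduced < ship_all)
```

With these numbers the semijoin moves roughly a tenth of the data. If nearly every row of R matched, the extra trip for the join column would make it the worse choice.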

Exercise Tip: When asked to find the "optimal execution plan," always compare the total bytes transferred in a standard Join versus a Semijoin. The semijoin wins when the bytes for the projected join column plus the bytes for the reduced relation come to less than the bytes for the whole relation.

3. Distributed Concurrency Control

How do you maintain consistency when multiple users edit the same data on different continents?

Solution: Two-Phase Locking (2PL)

In distributed exercises, you'll often encounter the Centralized 2PL vs. Distributed 2PL debate.

Centralized: One site manages all locks. Simple, but a single point of failure.

Distributed: Each site manages locks for its own data. More resilient, but harder to detect Global Deadlocks.
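The difficulty with global deadlocks is that no single site sees the whole cycle. A minimal sketch of the usual remedy, merging the local wait-for graphs and searching the result for a cycle, is shown here; the graph encoding and function names are assumptions made for the example.

```python
def merge_wait_for_graphs(local_graphs):
    """Union the local wait-for edges reported by each site."""
    merged = {}
    for graph in local_graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged


def has_cycle(graph):
    """Depth-first search for a cycle in a directed graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY or (state == WHITE and visit(nxt)):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(graph))


# Hypothetical edges: "T1 waits for T2" is written {"T1": {"T2"}}.
site_1 = {"T1": {"T2"}}
site_2 = {"T2": {"T1"}}
print(has_cycle(site_1), has_cycle(site_2))
print(has_cycle(merge_wait_for_graphs([site_1, site_2])))
```

Neither local graph contains a cycle, yet the merged graph does, which is exactly the case a purely local detector would miss.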

Wait-Die vs. Wound-Wait: These are common algorithmic solutions for deadlock prevention.

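Wait-Die: An older transaction waits for a younger one that holds the lock; a younger transaction that requests a lock held by an older one aborts ("dies").

Wound-Wait: An older transaction "wounds" (preempts) a younger one that holds the lock; a younger transaction waits for an older one.

4. Reliability and the Two-Phase Commit (2PC)

Reliability exercises often focus on what happens when a site or a link fails during a transaction.

The 2PC Protocol Steps: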

Voting Phase: The coordinator asks all participants if they are ready to commit.

Decision Phase: If all vote "Yes," the coordinator sends a "Global Commit." If any vote "No" or timeout, it sends a "Global Abort."
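The two phases can be sketched as a single coordinator function. This is a simplified illustration that ignores logging, retries, and coordinator failure; the participant interface (a callable that returns a vote or raises `TimeoutError`) is an assumption made for the example.

```python
def two_phase_commit(participants):
    """Run one round of 2PC.

    participants -- dict mapping a name to a vote function that returns
                    True ("Yes"), False ("No"), or raises TimeoutError.
    Returns the global decision and the message sent to each participant.
    """
    # Phase 1 (voting): collect a vote from every participant.
    votes = {}
    for name, vote in participants.items():
        try:
            votes[name] = bool(vote())
        except TimeoutError:
            votes[name] = False  # a missing vote counts as "No"

    # Phase 2 (decision): commit only if every vote was "Yes".
    decision = "GLOBAL_COMMIT" if all(votes.values()) else "GLOBAL_ABORT"
    return decision, {name: decision for name in participants}


def timed_out():
    """Hypothetical participant that never answers the prepare request."""
    raise TimeoutError("no reply to PREPARE")


all_yes = two_phase_commit({"A": lambda: True, "B": lambda: True})[0]
one_no = two_phase_commit({"A": lambda: True, "B": lambda: False})[0]
one_timeout = two_phase_commit({"A": lambda: True, "B": timed_out})[0]
print(all_yes, one_no, one_timeout)
```

A single "No" or timeout is enough to abort everywhere, which is what gives the protocol its all-or-nothing behaviour.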

Common Problem: What happens if the coordinator fails after the voting phase?

Solution: This is the "blocking problem" of 2PC. Participants may be left in an uncertain state, holding locks indefinitely until the coordinator recovers. This is why modern systems often look toward Three-Phase Commit (3PC) or Paxos/Raft consensus algorithms.

5. Parallelism and Data Replication

Modern exercises often touch on the CAP Theorem (Consistency, Availability, Partition Tolerance).

Exercise Question: "Can a system be CA (Consistent and Available) during a network partition?"

Solution: No. During a partition (P), you must choose between Consistency (refusing the update to keep data uniform) and Availability (allowing the update even if other sites don't see it yet).

Summary Checklist for Students

When looking for or writing solutions to distributed database problems, always check for:

Minimization of data transfer: Is there a way to do this with fewer bytes?

Transparency: Does the user feel like they are using a single database?

Site Autonomy: Can a single site function if the others go offline?

By applying these principles to your exercises, you move from theoretical knowledge to architectural expertise.


1. Data Fragmentation: Horizontal, Vertical, and Hybrid

One of the first exercises students encounter involves designing correct and complete fragmentation schemas.

Part 3: Distributed Concurrency Control

4. Use formal tools where useful

  • Two-phase locking (2PL): show locking sequences and check for deadlocks.
  • Timestamp ordering: show timestamps, compare read/write rules, and show conflicts.
  • Conflict/serialization graphs: draw nodes = transactions, directed edges = conflicting ops.
  • Vector clocks: show vectors at events to prove causality or detect concurrent updates.
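As an example of the last tool in the list, two vector clocks can be compared component by component to decide whether one event happened before the other or whether they are concurrent. This is a minimal sketch; representing a clock as a dictionary from site name to counter is an assumption made for the example.

```python
def compare(a, b):
    """Compare two vector clocks given as dicts of site -> counter.

    Returns "before", "after", "equal", or "concurrent".
    """
    sites = set(a) | set(b)
    a_le_b = all(a.get(s, 0) <= b.get(s, 0) for s in sites)
    b_le_a = all(b.get(s, 0) <= a.get(s, 0) for s in sites)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


print(compare({"A": 1, "B": 0}, {"A": 2, "B": 0}))
print(compare({"A": 2, "B": 0}, {"A": 1, "B": 1}))
```

A "concurrent" result means neither update saw the other, so the system has to detect and resolve the conflict rather than simply keep the later write.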

The Problem Type

Given read and write operations from transactions T1, T2, T3 on data items X, Y, Z stored at different sites, determine whether the schedule is conflict-serializable and whether the protocol would allow it.
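The serializability half of this problem type can be checked mechanically: build the precedence graph and test it for cycles. The sketch below assumes a schedule is given as a list of (transaction, operation, item) triples; that encoding and the function names are illustrative.

```python
def precedence_graph(schedule):
    """Build edges Ti -> Tj for each pair of conflicting operations.

    schedule -- list of (transaction, op, item) with op in {"r", "w"},
                in execution order.
    """
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            conflict = item_i == item_j and "w" in (op_i, op_j)
            if ti != tj and conflict:
                edges.add((ti, tj))
    return edges


def is_conflict_serializable(schedule):
    """A schedule is conflict-serializable iff its precedence graph is acyclic."""
    edges = precedence_graph(schedule)
    nodes = {t for t, _, _ in schedule}
    # Repeatedly remove transactions with no incoming edge (topological sort).
    while nodes:
        free = {n for n in nodes if not any(dst == n for _, dst in edges)}
        if not free:
            return False  # every remaining node has a predecessor: a cycle
        nodes -= free
        edges = {(s, d) for s, d in edges if s not in free}
    return True


# Hypothetical schedules over items X and Y.
acyclic = [("T1", "r", "X"), ("T1", "w", "X"), ("T2", "r", "X"),
           ("T2", "w", "Y"), ("T3", "r", "Y")]
cyclic = [("T1", "r", "X"), ("T2", "w", "X"),
          ("T2", "r", "Y"), ("T1", "w", "Y")]
print(is_conflict_serializable(acyclic), is_conflict_serializable(cyclic))
```

In the second schedule T1 must precede T2 because of X and T2 must precede T1 because of Y, so no serial order is equivalent to it.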

Sample Exercise & Solution

Exercise: Given PROJECT(Pno, Pname, Budget, Location). Applications:

  1. Query on Budget and Location for projects with Budget > 100000.
  2. Query on Pno, Pname only for reporting.

Solution:

  • Vertical Fragmentation: F1 = π_{Pno, Pname}(PROJECT) and F2 = π_{Pno, Budget, Location}(PROJECT). The key Pno is present in both.
  • Horizontal Fragmentation on F2: F2a = σ_{Budget > 100000}(F2), F2b = σ_{Budget ≤ 100000}(F2).

Solution Strategy: Always start by identifying the primary key. For vertical, check that every attribute appears at least once. For horizontal, ensure predicates are complete and mutually exclusive.
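The hybrid design in this solution can be tried out on a few rows to confirm that it is lossless. The tuples below are invented for illustration; the fragment names follow the solution.

```python
# Hypothetical tuples of PROJECT(Pno, Pname, Budget, Location).
PROJECT = [
    ("P1", "Alpha", 150000, "Montreal"),
    ("P2", "Beta", 135000, "New York"),
    ("P3", "Gamma", 90000, "New York"),
]

# Vertical fragmentation: the key Pno is kept in both fragments.
F1 = [(pno, pname) for pno, pname, _b, _l in PROJECT]
F2 = [(pno, budget, loc) for pno, _n, budget, loc in PROJECT]

# Horizontal fragmentation of F2 on Budget.
F2a = [t for t in F2 if t[1] > 100000]
F2b = [t for t in F2 if t[1] <= 100000]

# Reconstruction: union the horizontal fragments, then join on Pno.
f2_by_key = {pno: (budget, loc) for pno, budget, loc in F2a + F2b}
rebuilt = [(pno, pname, *f2_by_key[pno]) for pno, pname in F1]

print(sorted(rebuilt) == sorted(PROJECT))
```

The two predicates on Budget cover every value and cannot both hold, so F2a and F2b are complete and disjoint, and the shared key Pno makes the final join lossless.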